Protecting and Classifying Your Data using Azure Information Protection

The Azure Information Protection (AIP) client is a much-welcomed improvement over the previous Azure RMS sharing application.  The AIP client can be downloaded for free and is supported on Windows 7 and later and macOS 10.8 and later.  The AIP app also supports mobile devices running iOS or Android, and it replaces the RMS sharing app on both platforms. 

The AIP client provides enhanced usability for the everyday user to protect and classify files in a simple and straightforward manner.  The AIP client can protect most file types out of the box, and users can easily protect other file types such as images, PDFs, music, and videos through the client as well.  The user can also use the AIP client to protect sensitive emails.  In this article, I am going to explain how users can protect and classify files by using the AIP client within Microsoft Office Word, Excel, and PowerPoint 2016.  We will then touch on configuring Azure Information Protection labels and policies within the Azure portal.

Azure Information Protection Requirements

Let’s use a real-world business use case as the foundation for this walkthrough.  This will provide a real example that can be replicated throughout your own organization if desired.  Here is a bulleted breakdown of the requirements:

  • All Office files and emails created by the Finance Management group must be automatically classified as confidential
  • The AIP policy should be scoped to the Azure AD group BR Management Team and should not affect all users in the organization
  • When a user that belongs to the BR Management Team group creates a new email the email should be automatically classified as confidential and protected
  • Emails that are classified as confidential cannot be forwarded
  • Users can override the recommended label classification but should be warned when doing so
  • A watermark should be applied in the footer of all files and emails classified as confidential
  • Protected data should be accessible offline

Now that we have gone through the requirements for the use case, let's jump into how we can accommodate all of them in our final solution.  It is worth mentioning that there are some prerequisites for using the AIP client that I will not be covering in this article.  Please find that information in the getting started with AIP article found here.

Let’s begin with what the user sees within Office 2016 when AIP has been activated and installed.  As you can see in the screenshot below from Word, the AIP client is an add-on to Office 2016.  Once installed, you will see the Protect button in the ribbon.

aip-1.png

If you click on the Show Bar option, you will notice the sensitivity settings bar as shown in the screenshot below.  Sensitivity labels can be set manually by an end user, or automatically based on the file or email content.  Labels belong to a default AIP global policy which includes all users within your organization's Azure AD.  The different default sensitivity labels are also shown in the screenshot below.  These labels can be customized and new labels can be created through the Azure Information Protection resource in the Azure portal.

aip-2.png

Additionally, AIP administrators have the ability in the Azure portal to create scoped policies.  These scoped policies can be created for specific groups of users and for edge cases where customized labels and protection are required.  For example, all users in a specific department such as finance management may require a stricter set of standards for labeling and classification because of the sensitivity of the files and emails they deal with daily.

Configuring AIP Policies

Below I have created a new scoped policy called Finance Management Confidential and selected the appropriate management team group.  This is important to note because this is the group of users who will receive the Finance Management Confidential AIP policy.  When we customize this policy, we are customizing what the selected group of users will see in their sensitivity bars throughout the Office 2016 applications.  Additional labels and sub-labels can be created specifically for the selected group of users.

aip-3.png

As you can see in the image above I have created a new sub-label under the Confidential label.  Sub-labels provide a further level of classification that can be scoped to a subset of users within your organization. 

In the sub-label configuration image below, I have configured the footer to show the text “confidential”.  This is also where you can set up Azure protection for the specific AIP label that you are creating.

aip-4.png

Once you have selected Azure RMS under the protection heading, you can begin to configure the different Azure RMS permissions.  Here we will make sure that data classified with this sub-label cannot be printed or forwarded.  Now that we have configured the protection, we can save the sub-label.  It is officially configured with AIP, and all files classified with it will be automatically protected with the permissions that were set up in the previous step.  Once you have saved the sub-label to the policy, make sure that you publish your scoped policy. 

aip-5.png

Using AIP in Office 2016

Once the policy has been published, it will be pushed to the users targeted by the policy.  Users who belong to this policy will see that all files they create or open will have the recommended sub-label that was created in the previous steps.  If the user hovers over the recommended label, a tooltip description will pop up, providing valuable information to users when they are deciding the classification of the document.  It’s important to be concise and spend some extra time on the descriptions of your organizational labels.  These will help guide users in making the right decision when classifying new files. 

aip-6.png

Of course, you can always force the classification and labeling of files and emails instead of recommending a label.  This is useful when using conditions with your policy.  You can force the label of a document or email if, for example, the condition detects sensitive data such as social security numbers or credit card numbers.  However, forcing could erroneously label a file, causing additional administrative overhead.  In most cases, it is better to provide a recommendation and specify in the policy that the user be warned when reclassifying a file to less restrictive protection, such as reclassifying a file recommended as confidential to public.  This requires an auditable action confirming that the user did in fact acknowledge the reclassification.
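To make the condition-based recommendation concrete, here is a rough, hypothetical sketch of that kind of content inspection in Python. This is not AIP's actual detection engine, and the patterns are deliberately simplified; the `recommend_label` helper and label names are assumptions for illustration only.

```python
import re

# Simplified patterns -- AIP's built-in conditions are far more sophisticated.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,16}\b")

def luhn_valid(number: str) -> bool:
    """Luhn checksum, used to reduce false positives on card-like digit runs."""
    digits = [int(d) for d in re.sub(r"\D", "", number)]
    checksum = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        checksum += d
    return checksum % 10 == 0

def recommend_label(text: str) -> str:
    """Return a recommended sensitivity label based on document content."""
    if SSN_PATTERN.search(text):
        return "Confidential"
    for match in CARD_PATTERN.finditer(text):
        if luhn_valid(match.group()):
            return "Confidential"
    return "General"
```

The real policy engine applies the recommendation (or forces the label) inside the Office clients; the sketch only shows why a checksum-style validation matters, since raw digit-run matching alone would misclassify many ordinary documents.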

Once the file is labeled it will inherit all the classification and protection rules that were applied while editing the policy in the Azure portal.  This includes any protection that was setup for the labels by administrators.  The image below shows a Word document that has been classified by the sub-label Finance Management that was created earlier in this article.  Notice the classification in the left-hand corner of the image below and the footer text which was automatically applied after selecting the recommended label.

aip-7.png

Using the AIP client, the user can decide to downgrade a classification if needed.  Users will be prompted with the dialog shown below to set a lower classification label.  This will deter users from simply declassifying files that may be sensitive.  The user acknowledgement is an auditable action.

aip-8.png

Users can manually set up custom Azure RMS permissions if needed by selecting the AIP Protect button in the ribbon within their favorite Office 2016 application. 

aip-9.png

The one disadvantage of this method is that users will only be able to configure permissions for one level of rights.  To clarify, if you want to provide two groups of users with two different levels of permissions, for example read only and edit, you will need to use the Protect Document button within Office 2016.  To do this, select File, then Info, then the Protect button as shown in the image below.  You will notice that the custom confidential AIP sub-label that we configured is also showing up in the Restricted Access context menu. 

aip-10.png

A user could easily select a label from here if they wanted to.  To get around the issue with applying multi-level custom permissions, users can select the Restricted Access menu item.  Using the permissions dialog box that pops up, users can assign multiple levels of permissions to users and groups.

aip-11.png

Now let’s open Outlook as a user who belongs to the finance management group.  As you can see in the image below, the label is automatically recommended on all new emails.  Classification behavior in the Outlook 2016 client is similar to the rest of the AIP-supported Office applications (Word/Excel/PowerPoint).  Once the label is selected, all policies are applied to that email.

aip-12.png

Conclusion

The Azure Information Protection client provides the easiest way to classify and protect files and emails when creating or editing them from within the Office desktop applications.  The client is just one piece of the entire puzzle that is AIP.  The real key is in the planning and creation of meaningful labels and classification policies for your users.  This helps drive users to begin using these classification policies with ease.  I must say from past experience, the less the users have to think about, the better.  If the classification labels are clear and help guide the user, then users are more likely to engage.  Additionally, forcing users to classify files and emails isn’t always the answer except in specific highly sensitive scenarios.  The AIP client is constantly being improved and added to.  In fact, a new version with new capabilities was pushed out just this week and can be downloaded here.

 
calltoaction-paas.png

B&R can help you leverage Azure Information Protection

Information Management Policies and Complex Workflows

One of the key Enterprise Content Management (ECM) features provided by Microsoft in SharePoint Server is the Information Management Policy feature.  These policies can be used to establish multi-stage retention policies, but the scheduled nature of this feature opens it up for so much more. 

Note:  If you are not familiar with the information management policies and would like a general overview, see Plan for information management policy in SharePoint Server 2013.

For our purposes, I chose to leverage the multi-stage retention capability, which supports scheduling each stage, defining an action to execute, and an optional recurrence schedule.  Using these retention schedules, we can execute scheduled activities to support business processes such as content retention, disposition, or a contract management solution.

Scheduling a Stage Date

The ability to schedule the start of a stage is both simple and powerful.  You simply select one of the document’s date fields, which can be either a system field such as Created or Modified, or a custom field as in the example below.

20170913-1.png

Action

Next, we focus on the action to take within this stage.  The configuration comes with the following actions:

  • Move to Recycle Bin:  Moves to the recycle bin for orderly removal
  • Permanent Delete:  Bypasses recycle bin and is immediately deleted
  • Transfer to another location:  Move to another site such as an archival or records site
  • Start a workflow:  Start a workflow that is associated with the content type
  • Skip to next stage:  Move directly to the next stage
  • Declare record:  Declare the item as a record in the system (in place records management)
  • Delete previous drafts:  Clean up previous draft (minor version) copies
  • Delete all previous versions:  Clean up all previous versions

While these actions can be helpful, this is where most people start hitting the brakes.  If you have an important legal agreement or contract, you probably don’t want to just delete it or move it to a recycle bin when it is scheduled to expire.  You probably want somebody to review it and make sure it is actually no longer needed or does not need to be renewed.  For those that are familiar with the power of workflows the “Start a Workflow” action sounds great until you click that list and see an empty list of available workflows.  This is the single biggest hurdle for most people, and the point where many turn back.  Do not worry, we will come back to this shortly. 

Retention

The recurrence settings are also straightforward, allowing you to repeat a stage based on a number of days, months, or years, as the image below illustrates. 
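To make the scheduling arithmetic concrete, here is a minimal sketch of how a stage activation or recurrence date could be derived from a document date field plus an offset in days, months, or years. The `add_offset` helper is a hypothetical illustration, not SharePoint's internal calculation:

```python
import calendar
from datetime import date, timedelta

def add_offset(start: date, amount: int, unit: str) -> date:
    """Compute a stage or recurrence date: start date plus an offset in
    'days', 'months', or 'years', matching the options in the stage form."""
    if unit == "days":
        return start + timedelta(days=amount)
    if unit == "months":
        total = start.month - 1 + amount
        year, month = start.year + total // 12, total % 12 + 1
        # Clamp the day for shorter months (e.g. Jan 31 + 1 month -> Feb 28).
        day = min(start.day, calendar.monthrange(year, month)[1])
        return date(year, month, day)
    if unit == "years":
        return add_offset(start, amount * 12, "months")
    raise ValueError(f"unknown unit: {unit!r}")
```

The month-end clamping is the subtle part of any such schedule; a contract dated January 31 with a one-month review offset has to land on a real calendar day.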

20170913-2.png

Complex Workflow!

As I mentioned earlier, the “Start a workflow” action list is blank by default.  This is where our ability to implement complex workflows comes to the rescue.  These workflows can be developed using SharePoint Designer, Visual Studio, or our preferred tool Nintex Workflow.  The trick is that whatever path we choose, we need to be able to associate the workflow with the specific content type(s) for it to be available in the list of workflows within the “Start a workflow” action. 

To create a workflow that can be associated with a content type in SharePoint Server, navigate through Site Actions menu, select Nintex Workflow (2013/2016), and then Create Reusable Workflow Template as illustrated below.

20170913-3.png

We then define our workflow name, description, and associate it with a content type.

20170913-4.png

Here is an example of a Contract Review workflow we created for demo purposes.

20170913-5.png

Once our workflow is saved, we can now visit the Site Content Type Information page (Site Settings-> Site content types -> select our content type) and click the Workflow settings action under settings.

20170913-6.png

Next, we can select our workflow template and provide a unique name for the process.  For workflows that are triggered by information management policies, you can set the start options to enable “Allow this workflow to be manually started” and disable the new and edit options. 

20170913-7.png

Now that the workflow is associated with the content type, we can configure our Retention Policy.  From the Site Content Type Information page, select the Information management policy settings action. 

20170913-8.png

Select the “Enable Retention” option to enable the retention options and then click the “Add a retention stage” action to load the stage configuration form. 

20170913-9.png

The retention stage configuration form options were explained previously.  Define an appropriate stage schedule based on a date comparison with a date field.  The comparison can be based on days, months, or years. 

20170913-10.png

Next, select the “Start a workflow” option from the Action list and select the workflow you previously configured for the given content type. 

If applicable, configure an appropriate recurrence schedule. 

Then, click the Ok button to save your changes and continue. 

If needed, you can configure multiple stages.  In this example, for the given contract content type there is an initial stage for review.  After the document progresses through the review stage, a second stage starts a contract disposition workflow one year after expiration if the contract was not renewed, as illustrated in the image below. 

20170913-11.png

Once the changes are fully saved, the document will be reviewed based on the internal process schedule and the workflow initiated. 

Single versus Multiple Stages (Multiple Workflows)

While it is possible to design and implement a single workflow that handles the logic for all of the stages, there are advantages to breaking the logic into an individual workflow for each stage.  It certainly makes each workflow easier to manage within the designer, but it also gives you more granular execution tracking, leading to clearer insights and reporting without having to build extra actions into the workflow to break out and report on the individual stages.  Ultimately, the requirements can be fulfilled either way, but we find it easier to maintain and support individual workflows for each stage.

 
calltoaction-general.png

Need assistance with retention and disposition workflows?

Keys to Designing and Managing Large Repositories

The B&R team has some deep experience managing large structured document repositories within SharePoint. In some cases, those repositories were established and grew within SharePoint organically over time, while in others there were large sets of content that were migrated in from networked file shares or other ECM solutions like Documentum, OpenText, or FileNet.

Throughout SharePoint’s long history there has often been confusion between software limitations and best practices. To make matters worse in many cases there is no global agreement on what the best practices are. In our experience, many of the best practices are really guidelines that are context specific. There is no generic answer, but rather a starting point from which the requirements can then shape the most appropriate solution within the right context. In our experience, some of those key decision points are:

Structured vs. Unstructured Content

paperclips-unorganized-sm.png

Loosely categorized unstructured content stored in disparate locations across the organization.

paperclips-organized-md.png

Centrally managed structured content that has been properly categorized.

While SharePoint can be used in many ways, the first contextual decision point is structured versus unstructured content. In this post, we will be specifically focused on structured content storage repositories with content types and meta-data defined, and not unstructured repositories. This is an important differentiator for us, since the organization and usability of content in unstructured systems is radically different.

Software Boundaries and IT Capabilities

Understanding the limits of your platforms and systems is vitally important.

When thinking of the actual software and storage boundaries, SharePoint as a platform is very flexible, and continues to increase limits as modern storage and database limits increase. Here are references to the software boundaries for 2013, 2016, and Office 365.

One of the common misconceptions is that the infamous “List View Threshold” implemented in SharePoint 2010 is a boundary limiting the number of items in a list or library to 5,000. This is not a limit to the number of items in a library, but rather the number of records that can be read as part of a database query. This specific topic will be addressed as part of the System User Experience and Content Findability section.

For on-premises versions of SharePoint Server including 2010, 2013, and 2016 our focus has been on establishing system sizes that can be reliably maintained including backup and recovery procedures. This is a pretty important point because in my experience the capabilities and expectations of my clients vary widely. In some cases, they have deep experience with multi-terabyte databases and have plenty of room to work with to both backup and restore databases as needed. In other cases some customers struggle with backing up and restoring databases that are just a few hundred gigabytes due to backup technologies or lack of available working storage space. With this in mind, our initial guiding points are to look at how to isolate the structured repositories into dedicated site collections, each with a dedicated content database. The number and size of those site collections vary depending on the customer’s requirements and backup and recovery capabilities. We frequently start by advising on smaller database sizes of around 100 GB and then adjust based on their comfort levels and capabilities, but they should never exceed the Sys Admin’s ability to capture and restore a database backup.

For Office 365, Microsoft has taken ownership of the system maintenance and regular backup and recovery operations. Within the service, they have also extended the software boundaries, which can make it much easier to support systems with larger repositories and fewer site collections, pushing much of the decision to the next two points relating to system usability and content findability.

System User Experience and Content Findability

The user experience of the repository is essential to its long-term success.

We will focus on looking at the processes to initially add or author content and then how it will be later discovered. Patterns and techniques that work fine in other sites or repositories can completely fail with large repositories.

While SharePoint as a platform is typically thought of in terms of collaboration and collaborative content, the scenarios for structured content in large repositories are often different. In some scenarios, the content may be comprised of scanned documents or images that are sent directly to SharePoint, while in others they could be bulk loaded electronic documents.

Unlike the collaborative scenarios, you very rarely want to add content to the root of a SharePoint library, but rather organize the content across libraries and/or sub-folders. To better handle this scenario, we will often incorporate the Content Organizer feature that Microsoft made available with SharePoint Server 2010, which offers a temporary drop-off library and rules to selectively route content to another site collection, library, or folder. This rules-based approach provides great automation capabilities that help keep things properly organized, while making it much easier to add content to the system. While the Content Organizer covers most of our common scenarios, we are able to support even more advanced automation scenarios by leveraging a workflow tool or customization when needed.
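To illustrate the rules-based routing idea, here is a small sketch of how Content Organizer-style rules evaluate a document's content type and metadata to pick a destination. The rules, field names, and destination paths below are hypothetical; the real feature is configured through the SharePoint UI, not code:

```python
# Hypothetical routing rules in the spirit of the Content Organizer:
# each rule matches a content type plus a metadata condition and maps
# the document to a destination library or folder. Rules are evaluated
# in order, and the first match wins.
RULES = [
    {"content_type": "Contract",
     "condition": lambda m: m.get("Department") == "Legal",
     "destination": "Contracts/Legal"},
    {"content_type": "Contract",
     "condition": lambda m: True,
     "destination": "Contracts/General"},
    {"content_type": "Invoice",
     "condition": lambda m: True,
     "destination": "Finance/Invoices"},
]

def route(content_type: str, metadata: dict, drop_off: str = "DropOffLibrary") -> str:
    """Return the destination for a document; unmatched items remain in
    the temporary drop-off library for manual handling."""
    for rule in RULES:
        if rule["content_type"] == content_type and rule["condition"](metadata):
            return rule["destination"]
    return drop_off
```

The fall-through to the drop-off library mirrors the actual feature's behavior: content that no rule claims stays put until someone (or another rule) deals with it.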

Previously, the List View Threshold feature was mentioned. While it is often discussed as a boundary or limitation, it is actually a feature intended to help maintain system performance. For SharePoint Server 2010, 2013, and 2016, it is a system setting that can be set at the web application level. The intention of this feature is to provide protection against unoptimized queries being run against the back-end SQL Server. The default value of 5,000 was chosen because that is the point at which queries are processed differently by the database's query engine and you will start to see performance-related problems. While it is safe to make small changes beyond the default limit, you will quickly experience the performance impacts the feature was designed to avoid.

The important thing to remember is that the threshold is for a given query, so the key task is to plan and implement your views to be optimized. We do this by thinking about a few key things:

Configure the Default View:

By default, SharePoint uses the All Items view as the default view. Ideally, no view will be without a viable filter, but the All Items view absolutely should not be the default view in these libraries.

Column Indexes:

Key columns used to drive views or as the primary filter within your list can be indexed to improve performance. Additional information can be found here.

View Filters:

Ideally, all views will be sufficiently filtered to be below the List View Threshold of 5,000 items. This will keep view load time low.

Lookup Fields:

Avoid the use of lookup fields, as they require inefficient queries that perform table scans to return content. Even smaller repositories of just a few hundred items can exceed the List View Threshold because of the query formatting.

Avoid Group By, Collapsed Option:

While the ability to group by your meta-data can be powerful, we typically instruct our clients to avoid using the option to collapse the Group By selections. The collapse option has some unexpected behavior that will result in additional database queries for each of the group by levels and values and disregard the item limits and paging configuration. It is possible to limit a view to say 30 items, but if you configure it to group by a value and collapse it by default, the first category could have 1,000 items and the system will query and load the full list, ignoring the 30-item limit. This can have severe performance implications, and is typically the primary culprit when we are asked to help troubleshoot page load performance in a specific repository.

While the ability to easily and effectively locate content has a big impact on the user experience of the system, I would argue that it is the most critical factor, and therefore one that needs to be thought through carefully when working within the SharePoint platform, so I have broken the topic out into its own section.

If you think about SharePoint sites on a scale or continuum, from small team sites with a few libraries containing a handful of documents up to large enterprise repositories with millions of documents, it should be clear that how you find and interact with content on the two opposite ends of the spectrum needs to evolve. As systems grow well beyond the List View Threshold levels, the system needs to become more sophisticated and move away from manually browsing to content or unstructured keyword queries toward a more intelligent search-driven experience.

While most systems include a properly configured search service, a much smaller percentage have it optimized in a way that it can be leveraged to provide structured searches or dynamically surface relevant content. This optimization takes place at two levels: first within the search service itself, and then with customizations available in the system.

Within the search service, we will work to identify key metadata fields that should be established as managed properties for property-specific searching, and determine which fields need to be leveraged within a results screen for sorting and refinement. These changes allow us to execute more precise search queries and optimize the search results for our specific needs.
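As a rough illustration of what those managed properties enable, a structured search page ultimately issues a Keyword Query Language (KQL) query that targets properties directly instead of relying on free-text matching. The property names below (Department, ContractType) are hypothetical mappings, and this helper is just a sketch of the query-string construction:

```python
def build_kql(filters: dict, keywords: str = "") -> str:
    """Build a KQL query string combining free-text keywords with
    managed-property restrictions (sorted for a deterministic result)."""
    parts = [keywords] if keywords else []
    parts += [f'{prop}:"{value}"' for prop, value in sorted(filters.items())]
    return " AND ".join(parts)
```

A structured search form simply collects the user's selections into the `filters` dictionary, which is why the initial result set is so much smaller than a raw keyword search over the whole repository.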

Within the site, we will then look to define the scenarios in which people need to find content, and define structured search forms and results pages optimized for those scenarios. In some cases, they are generic to the content in the repository, while in others the scenarios are specific to a given task or role, helping to simplify things for specific business processes. By leveraging structured search pages, we can provide an improved user experience that dramatically reduces the time it takes to locate the relevant content, as the initial search results are smaller and then easily pared down through relevant search refiners. In addition, on common landing pages we will leverage the additional search-driven web parts to highlight relevant, related, or new content as needed to support the usage scenarios.

Our Approach to Designing Record Center

As we set out to design and implement our Record Center product, we knew that it must scale to tens of millions of records both with regards to technical performance and from a user experience perspective. To accomplish this, we automated the setup and configuration process in ways to help optimize the solution for our specific purpose and use case.

While doing a product feature overview is outside the scope of this post, we are happy to report that our approach and techniques have been successfully adopted by our clients and that today the average repository size is in the hundreds of thousands of documents while still meeting performance, usability, and system maintenance goals.

Next Steps

I hope that this post provided a good overview of how to plan and maintain large repositories. It is a big topic with lots of nuances and techniques that are learned over time in the trenches. If your group is struggling with designing and managing large repositories and needs help, reach out and set up a consultation. We can either assist your team with advisement services, or help with the implementation of a robust system.

Can We Help?

Contact us today for a free consultation!

Not All Customizations Are Bad

As I meet with technology and business executives, one of the topics that frequently comes up is whether to customize SharePoint or SharePoint Online in Office 365.  There continues to be a lot of misunderstanding around what can be done safely and what is going to cause long-term stability or maintenance problems. 

In the Beginning

The first thing to understand is that not all customizations are the same, or have the same level of risk or impact within the system.  In the early days of SharePoint, the platform was completely open for customizations, and in some cases developers had free rein to do whatever they wanted or needed.  In some cases, poor decisions were made or bad code was written.  Generally, the mistakes fall into a few categories: the developers were inexperienced with the platform and didn’t know any better, or they were not forthcoming with information about the impact of maintaining the solution or of upgrading SharePoint to the next version. 

Over time, SharePoint started to get a bit of a bad rap as being difficult to upgrade if there were any customizations or commercial add-ons within the system.  The people that know the system, of course, know how to mitigate this risk, and address the upgrade challenges -- but again, that assumes knowledge and competencies that only a small percentage of people have. 

Evolution of Customizations

Over time, Microsoft and the community-at-large learned some valuable lessons and responded with better guidance, as well as new options for how to interact with and customize SharePoint. The focus shifted more toward client-side development and interacting through standardized web services. 

  • Full Trust Solutions:  Server-side code that runs within SharePoint; this is the traditional SharePoint Server customization, typically deployed with a .WSP file. 
  • Sandbox Solutions:  Microsoft’s first attempt at providing a system for customizations that runs in a safe, isolated space to ensure customizations have little to no negative impact on SharePoint sites.  Unfortunately, this did not prove to be a powerful enough solution and so it was deprecated.
  • Add-In (initially called App) Model:  Isolated applications that can interact with SharePoint through published APIs in a safe manner.
  • SharePoint Framework (SPFx):  A new, lighter method of developing SharePoint interface customizations through client script, without a full Add-In package. 

As Microsoft has evolved, the public APIs for SharePoint have also evolved.  The client-side APIs for interacting with SharePoint have become very robust, and while you cannot do everything that you used to be able to do with the server-side APIs, it could be argued that they provide a good set of safety rails for the average SharePoint developer, ensuring that the risk to long-term stability is minimized. 

Management Tools

What spurred the idea for this blog post was a customer conversation around a request for a tracking system that needed to be set up and then reset for each new calendar year.  Using the published APIs, it is simple to create a process that can safely automate the setup in a repeatable manner.  The customer in question was concerned with implementing a customization that could have a negative impact on the system in the future, and stated that they have a "no customization" policy.  It proved to be an interesting conversation as I uncovered the root of the concerns and perceptions.  In the case of the solution we were proposing, there would be no customizations deployed to SharePoint, and therefore no artifacts left behind that could impact SharePoint in any way.  Since our tools would run locally and interact with SharePoint through the REST services, our solution would be safe and effective.  This is the same approach taken by most SharePoint management tool vendors, like our partners at Sharegate and Metalogix. 
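To show what "no artifacts left behind" looks like in practice, here is a sketch that prepares (but does not send) a REST request to create a list through SharePoint's documented `/_api/web/lists` endpoint. The site URL and list details are hypothetical, and authentication is omitted for brevity:

```python
import json

def build_create_list_request(site_url: str, title: str) -> dict:
    """Prepare a request to create a SharePoint list via the REST API.
    The endpoint and payload shape follow the /_api/web/lists contract;
    sending the request (and authenticating) is left to the caller."""
    return {
        "method": "POST",
        "url": f"{site_url.rstrip('/')}/_api/web/lists",
        "headers": {
            "Accept": "application/json;odata=verbose",
            "Content-Type": "application/json;odata=verbose",
        },
        "body": json.dumps({
            "__metadata": {"type": "SP.List"},
            "Title": title,
            "BaseTemplate": 100,  # 100 = generic custom list template
        }),
    }
```

Because everything here runs on the client and talks to SharePoint only through its published services, resetting the tracking system each year deploys nothing to the farm itself, which is exactly what satisfied the customer's "no customization" policy.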

Our Approach

At B&R, our approach is always to understand the goals of what the client is trying to accomplish and then figure out the most appropriate way to accomplish those goals.  In some cases, we still get requests that are best addressed with the traditional Full-Trust code model.  In those cases, we have to have an open and honest conversation about what that means, and what the ongoing costs will be for the customer.  At times, that is the only way to address the requirements (for on-premises) customers, while in other cases it may be a more cost-effective and attractive fit than adopting a provider hosted add-in. Whatever the path forward, we make sure that the pros and cons are fully understood. 

While we do still find ourselves building some Full-Trust solutions, it is difficult to argue against client-side code being the future, and many of the modern SharePoint development techniques find their way into our solutions one way or another.   

Can We Help You on Your Journey?

Are you experiencing issues upgrading a farm with customizations, or are you looking for assistance in getting a solid development plan in place for the foreseeable future?  We would like to help you make the most of the platform and leverage all the options available to you.  If you would like to talk through your goals and challenges, please reach out and set up a consultation.