Book Review-Information Governance: Concepts, Strategies and Best Practices

Unlike most of my reviews, I should start with a small disclaimer. Robert Smallwood, the primary author, reached out to me over a year ago and asked if I’d consider helping him with some SharePoint and Office 365 content in his Information Governance: Concepts, Strategies, and Best Practices book. I’ve received a brief mention in the book and a small section of the content is something I wrote. With the disclaimer out of the way, let’s dive in.

Everything and the Kitchen Sink

One of the challenges to information governance is that it covers so many topics, many of which are full-time disciplines themselves. It’s a monumental challenge to pull together a resource with this kind of breadth, and one that Smallwood has executed faithfully. I won’t say that all the content is perfect, because it’s not. However, I will say that it is a great overview of several topics that are important to implementing an effective information governance plan.

Information governance has a singular goal to maximize the value of information including how it’s used and the mitigation of risk. While the goal is simple, the implementation is far from it. The number of different considerations from different disciplines is numerous and potentially overwhelming. If you need a quick summary of information governance, see my post, Explaining What Information Governance Is.

Dark Data

Astronomers estimate that dark matter represents about 85% of the matter in the universe and about a quarter of the overall density. Its presence is implied by observations, including gravitational effects, but because it doesn’t appear to interact with the electromagnetic spectrum, it’s difficult to detect directly. This gives rise to the term “dark data,” which refers to information that isn’t categorized properly and therefore is difficult to find and use.

The degree to which dark data is a problem is debated, but estimates on the amount of data that is “dark” are somewhere around 50% of all the data collected and stored by an organization. When you add in other categories, such as redundant data, about 69% of information has no business, legal, or regulatory value (according to the Compliance, Governance, and Oversight Council (CGOC)). In short, much of the data that we’re storing is data that we shouldn’t be storing, and it’s only getting worse.

Exponential Growth

The largest problem facing information governance is the velocity with which the volume of data is growing. It’s estimated that 90% of the existing worldwide data was created in the past two years. How can you keep up if every time you get a handle on things, the entire scope of the problem changes? The answer cannot be found in control alone, though it may be found in a combination of guiding and control.

It may be that the only way to manage the onslaught of data is to control some aspects – such as the retention of data from IoT (the Internet of Things) sources – and suggest standards for how people manage the information they work with directly.

Information Value

One of Smallwood’s key tenets to information governance is the subject of the Infonomics book. That is, that information should be treated like an asset. The only way to extract value out of something is for that something to be an asset. If information isn’t valued like an asset, then it will be impossible to extract value from it.

There’s an awareness that we continue to create more data, information, and knowledge than at any point in human history, and we spend immense sums to store and manage this information. There is relatively less awareness of the value that could be derived. We’re holding most of the data “just in case.”

Going Phishing

Holding on to information is particularly challenging because of the risk that its value will be unwittingly discovered by a nefarious third party. The most voluminous approach to infiltrating the corporate infrastructure comes in the form of phishing attempts. Every day, attackers are trying to trick users into using their credentials to authenticate to a fake website. They’re trying to convince users to open documents sent to them, which have been intentionally crafted to exploit vulnerabilities, in the hope that your organization hasn’t yet patched the vulnerability.

The propensity for employees to trust email and to do the attackers’ bidding causes it to be the most common attack vector and the one which is the hardest to address. In Transformational Security Awareness, Perry Carpenter explains why this is the case and what to do about it.

Containing the Leaks (DLP)

Sometimes, the reason that corporate information escapes the walls of the organization isn’t because of nefarious individuals trying to hack into the corporate treasure trove. Instead, employees are subverting the security controls by sending copies of the files they work with to their personal email accounts and uploading them to their personal file sharing repositories. It’s also employees sharing information with third parties in a careless manner.

Some types of information, particularly personally identifiable information (PII), should not leave the bounds of the organization’s network without clear rules and agreements, yet it happens every day. Social security numbers are transmitted in clear text via email and subjected to unauthorized observation.

Solutions for addressing these problems are called data loss prevention (DLP) – though I believe that they would more accurately be described as data leakage prevention solutions, since the information isn’t lost, it’s leaked.
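
To make the idea concrete, here is a minimal sketch of the kind of pattern check a DLP tool might run against outgoing message text. It’s my own illustration, not something from the book or from any particular product. The names `SSN_LIKE`, `find_ssn_like`, and `should_block` are hypothetical, and the single regular expression is an assumption; real products layer many patterns, validation rules, and context checks on top of something like this.

```python
import re

# Hypothetical pattern: three digits, two digits, four digits, hyphen-separated.
# This only matches the common written form; it does not validate the number.
SSN_LIKE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def find_ssn_like(text: str) -> list[str]:
    """Return every substring that is shaped like a US Social Security number."""
    return SSN_LIKE.findall(text)


def should_block(text: str) -> bool:
    """Flag a message for review if it contains anything SSN-shaped."""
    return bool(find_ssn_like(text))
```

A check like this would likely sit in front of outbound email or file uploads, holding a message for review rather than silently dropping it, so that people still have a sanctioned way to get their work done.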

Long Term Digital Preservation

Sometimes, the loss isn’t the direct result of a failure of hardware. Sometimes, an inability to recover important information happens because of the frailty of media. In these situations, creating multiple copies and periodically refreshing the media can help. However, another more challenging problem exists as data is locked away inside of file formats that can no longer be decoded.

Consider video recordings that were encoded in Adobe’s Flash SWF format. Most of the modern video players will not play video that was encoded in this format. If you have videos that you must maintain for a long period of time, the file format you choose matters. The MP4 H.264 AVC format is a stable format that’s likely to be supported for some time – but that means converting the file into this format for long-term preservation.

Luckily, long-term preservation of images and documents can be accomplished using the PDF/A standard format that is likely to be supported by most file viewers and operating systems for the foreseeable future. Other file formats must be managed so that their file format can be read in the future.

Too Many Records to Manage

The truth is that the volume of information we’re producing now greatly exceeds the capacity of users to properly manage and classify the information. We know that users will not invest the time to properly tag and provide metadata for files in a general case. Whether they see this as not important or not their job doesn’t matter – what matters is that a substantial amount of the information being captured today is difficult to find, because it’s not been properly tagged.

The idea that employees are willfully not complying with requests for metadata assumes at some level that they’ve been informed of what is expected, that they’ve been made aware of the consequences of failure both personally and to the organization, and that they’ve been given the tools to accomplish the appropriate tagging.

“The tools” means more than just the software. It means a guide for identifying what metadata needs to be supplied to which files and how to properly identify records that must be protected and preserved. Most people in the organization are focused on getting their job done, and rarely is the preservation of records considered to be a part of that process.

Developing Business Objectives

Ultimately, the most important aspect of an information governance program is the identification of the business goals for the program. What specific value does the organization believe they can get through better management of the information, including how they can extract the value of the information and how to mitigate risk? An information governance program is doomed if the only selling point to the program is reduced risk.

Organizations deal in risk, and it’s impossible to cover every risk. As a result, the organization often must decide which risks it must accept and move on. You don’t want your information governance program to be killed because the risks it mitigates for the organization aren’t important enough when stacked up against the other competing priorities and risks.

In the end, Information Governance can put you on the right path towards extracting better value out of the information that your organization already has.

Information Governance and Water: The Results of Control

Some information governance programs are focused on command and control. Thou shalt do this or that. Thou shalt not do something else. And while, on the surface, these tactics seem to work, they drive behaviors underground, expose information to more risks, and simultaneously reduce productivity. Control looks like a good solution, but it typically fails in the end.

Pressure

Pressure vessels are a marvel of the modern world. We can compress a gas and keep it contained. The release of pressure is what drove the industrial age through steam engines that increased in pressure and drove us forward. The problem is that pressure vessels fail. Early in the Industrial Age, there were numerous deaths due to the spontaneous destruction of a pressure vessel. Steam engine boiler tanks burst and killed people.

Every pressure vessel that we create has a point at which it can no longer contain the pressure and fails. The problem is not so much that there is a point of failure. The problem is that, when there is a failure, it’s unpredictable and so destructive. That’s why applying too much pressure in your information governance program in the form of control can lead to some disastrous consequences.

Information Governance Pressure

How, one might ask, does information governance apply pressure to an organization? The answer lies in the pressure that is exerted between the normal and desired behaviors and the behaviors that the governance plan tries to enforce. Like a dam holding back water, there is pressure against the policies to allow the individual and the organization to do their normal work. Like a dam, these policies hold back the normal flow, which may cause useful reservoirs, but those dams have limits.

We’ve seen stunning examples of dam failures. One moment, everything seems fine. The next moment, there’s a wall of water flowing down. In most failures, the problems start well before the final moment of failure. There’s some erosion in an earthen dam. Water slowly seeps through and erodes the base that the dam needs until it fails, and that failure causes the remainder – or a substantial portion – of the dam to fail.

Information governance programs do need to shape the flow of the information in an organization, but to do so without recognizing the limits is inviting people to subvert the official processes and do something less secure.

Escaping Pressure

Like water finding its way to and through weak spots in a dam, so, too, will users find ways to do their jobs even when the information governance program prevents such activities directly.

Consider password rules. NIST (the National Institute of Standards and Technology) has changed its guidance on passwords, because the degree of complexity in managing passwords necessitated that people start writing them down and storing them in places that made them less secure than simply having a single password that never changes – or a password that never changes plus a second authentication factor. There’s the tacit acknowledgement that the password complexity rules and change frequency forced people into behaviors that actually reduced rather than increased security.

What about sharing rules? Organizations want their workers to collaborate with external partners and consultants, but when users are prohibited from sharing the documents directly, they place corporate information in personal cloud storage and share with third parties from those locations. Not only does this break the intent of the guidance but it also removes corporate information from the boundaries of the corporation, so it’s not available for others to search and may be lost when the person leaves the organization.

Some organizations have approached these problems with more aggressive controls that block access to private file sites – which only causes users to start saving copies of their files in their emails, making it more difficult to manage the information.

Much like water will always find a way to get lower eventually, even the craftiest of strategies to block users from doing bad behaviors will fail if you don’t design in a way for them to get their work done.

Explaining What Information Governance Is

Information governance is a set of words thrown around as if they have some deep, profound, and universal meaning. However, the truth is that there isn’t a clean and globally accepted definition to what information governance is – and is not. As a result, professionals in many disciplines are confused as to what it is, why they care about it, and how to implement it.

The Principle

Behind information governance is a single principle. It’s a single guiding strategy that drives why everyone should care. The principle is to maximize the value of information in an organization. While this is a seemingly short and straightforward principle, the multiple ways it’s interpreted create confusion about what information governance is. As a principle, it defines what – not how nor why. The why is self-explanatory. Getting value both personally and organizationally is what we all want.

What is Information?

The tricky part about information governance – and its relationship to both data governance and knowledge management – is the line between what we call data, what we call information, and what we’d call knowledge.

Definitions of the distinction between data and information vary, and most turn on the degree of added meaning. However, as this is context dependent, it makes little sense to entertain this as a part of the definition. Most folks have settled for an imprecise but acceptable answer that data exists in rows and columns. It’s structured. Unstructured data is called information. This division is reasonable and allows us to confine our efforts to improve utilization to things outside of the rows and columns of the core transactional system.

The other side of the continuum is knowledge and wisdom. While knowledge has some useful criteria, they’re not perfect either. Knowledge is often divided by the differences between explicit and tacit information. Explicit information can be written down and is contextless. Tacit information, on the other hand, resists being translated into explicit knowledge, which is written down and captured. The conversion between tacit and explicit is fraught with challenges, and some insist that tacit knowledge can never be removed from its context and made fully explicit.

While knowledge managers struggle with the difference between connecting people to get to knowledge versus connecting people with the explicit knowledge, we can constrain the principles of information governance to that knowledge which knowledge management would consider explicit. That doesn’t mean it’s easy to work with – it’s just easier than tacit.

To Steer

The other aspect to address is the idea of governance. In many organizations, governance committees control and prescribe instead of collaborate and suggest. Instead of helping people make the best decisions, they respond by stopping projects and blocking things that maybe should have never been started in the first place. However, the problem is that this isn’t what governance is supposed to be about. Governance is supposed to be steering – like the rudder of a ship.

Governance implemented as a back-end enforcement process – which is the way many governance programs are implemented – diverges from the original intent of the word and the best use of governance. To be fair, information governance includes regulatory compliance and some things where the guidance is quite rigid and does need to be enforced. This should be the last resort, reserved for things that are unable to be addressed with guidance.

The Conflict

There is, in information governance, a fundamental conflict. The conflict is between retaining information too long and thereby exposing the organization to additional risk of disclosure and discarding information too soon, thereby depriving the organization of the value of the information. The key to good information governance lies in identifying which information must be retained and which information should be disposed of.

Regulations often require that information be retained for a minimum amount of time, thereby forming the lower bounds of how long the information must be maintained. Other regulations related to individual privacy focus on the maximum timeframe that information may be held by an organization. In between these two boundaries, organizations are free to decide what their retention schedule is.
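
The two boundaries can be sketched as a simple check. This is only an illustration under my own assumptions: the `RetentionRule` class, its field names, and the idea of measuring everything in days are hypothetical and don’t come from any regulation or product.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class RetentionRule:
    """Hypothetical retention rule for one category of information."""

    min_days: int            # regulatory floor: keep at least this long
    max_days: Optional[int]  # privacy ceiling: keep no longer (None = no ceiling)
    chosen_days: int         # the retention period the organization decided on

    def is_valid(self) -> bool:
        """True when the chosen period sits between the floor and the ceiling."""
        if self.chosen_days < self.min_days:
            return False
        return self.max_days is None or self.chosen_days <= self.max_days

    def dispose_on(self, created: date) -> date:
        """Date on which an item created on `created` becomes due for disposal."""
        return created + timedelta(days=self.chosen_days)
```

Applying one rule per category, the same way every time, is also part of what makes a schedule easier to defend later.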

The limitation is that organizations are required to maintain retention policies that are defensible. This means they’re well understood and consistently applied. While this seems like an obvious thing, courts don’t want to see a large amount of information disposed of immediately before they order the disclosure of that information to opposing counsel in a case.

The Sorting

Retention of important information is only the tip of the iceberg, but it surfaces the need to be able to meaningfully classify information. Different kinds of information have different value over time. To be able to keep and dispose of information properly, it must be categorized appropriately. The problem is substantially harder than it may at first appear. It starts with the ability to identify the major areas of information that you have in the organization.

This process is easy enough for transactional type data but gets more complicated when you wander into the world of things that are done rarely – or even once. It’s not feasible to capture every one-off thing that the organization has ever done, yet to build proper categories, this is exactly the kind of information that’s needed.

Humans are hard-wired to categorize things. That’s why we have such a hard time avoiding stereotypes – it’s how we’re made. Even with the innate ability to categorize and group things, we don’t always pick the right groups at first. We create too many categories in one area and not enough in others. Even the famed Dewey Decimal System reserved nine of ten spots for variants of Christianity, leaving only one category for all other religions. Dewey’s beliefs and experience caused him to bias his system toward Christianity at the expense of other religions.

The categorization process often reflects our own quirky experiences rather than an absolute best way to do things. As a result, the best categorization schemes are ones that reflect the unique way the organization views the world.

Finally, assigning items to categories can be challenging, as there may be no clear option that relates to the item the person is trying to store – or retrieve. It seems odd, but all too often, users store things in places in the system that were never intended for that kind of information.

The Value

To get value from information, it must have been retained, and it must be findable. Findability can be accomplished through navigation to the valuable information or through using the search system. Both of these, however, require that the categories are set up correctly and, in the case of search, correct metadata has been applied. Getting users to enter the correct and complete metadata about the things they’re doing is hard. This is in part due to the barriers of reminding them to provide the information and in part due to the additional burden it places on them.

To get to the value of information governance, we must find ways to motivate all users to file things properly and enter the metadata that will make the item findable again when it’s needed. That’s why information architecture is a keystone skill for information governance and it’s one that few people are taught.