Stop Data Hoarding: Why You Need an Information Governance Strategy


INDUSTRY PERSPECTIVE


Today’s rapid pace of data growth provides agencies with unprecedented insight into the needs of the citizens they serve. With more data, however, come new challenges around organizing, storing and protecting it. Information governance provides agencies with the analytical tools needed to move beyond a “store everything” mentality to a strategic data management plan, helping you keep your data organized and usable. In this industry perspective, GovLoop and Veritas have partnered to share why your agency needs an information governance strategy and best practices for data management.



Gaining a Picture of Your Data

Information is the cornerstone of any government agency. The need to collect it, analyze it and store it has existed essentially since the days of the Founding Fathers. Improvements in technology in recent decades have made the influx of data even greater, yet most offices still follow the same “store it all just in case” policies they used when information came in at a relative trickle. A data inventory survey conducted at the 2012 Compliance, Governance & Oversight Council showed that, on average, 1 percent of an agency’s data was on litigation hold, 5 percent was in records and 25 percent had actual business value. This implied that, on average, 69 percent of an agency’s data had no value to the business.

[Chart: agency data by category: on litigation hold (1%), in records (5%), has business value (25%), has no value (69%)]

This hasn’t changed much in the last few years. The Association for Information and Image Management found in 2014 that 80 percent of an organization’s electronically stored information is redundant, outdated or trivial. In 2015, Forrester surveyed more than 1,800 technology decision-makers and found that 78 percent believed that less than half of their unstructured data had any value or was mined for content in any way.

To confirm this trend, in 2016, Veritas, a leader in public-sector information governance, released the findings of the Data Genomics Project, a consortium of data scientists, industry experts and thought leaders studying enterprise data environments. The first thing they discovered is that data is growing at an average rate of 39 percent per year. “But the thing that’s surprising from a cost perspective is, despite how fast data is growing, purchase of storage devices is growing 9 percent faster,” said Bill Duffy, Archiving and eDiscovery Specialist at Veritas. “This means that government agencies will have spent twice the amount of money on storage and associated costs within six years.”

Further, they discovered that, on average, 41 percent of an agency’s total environment has not been modified in the past three years, and, within that, 12 percent of the total hasn’t been used in the past seven years. “So, it’s safe to say there is some low-hanging fruit to go after in terms of information governance,” Duffy said. This 41 percent figure presents an even bigger problem when you consider data recovery and continuity of operations sites, and the number of backups and snapshots that have been taken of data that hasn’t been touched in three years or more.

On average, 69% of an agency’s data has no value to the business.
Data is growing at an average rate of 39% per year.
Purchase of storage devices is growing at a rate of 48%.
On average, 41% of an agency’s total environment hasn’t been modified in the past 3 years.

A recent Gartner study puts the annual total cost of ownership for one petabyte of Tier 1 storage at $5 million. This works out to roughly 39 cents per GB per month, meaning organizations could save a little over $2 million a year per petabyte of data by doing something else with their stale data.
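The arithmetic behind these storage cost figures can be reproduced in a few lines. The sketch below is one plausible reading rather than the study's own method: it assumes binary units (1 PB = 1,048,576 GB), which gives about 39.7 cents per GB per month, and it assumes the savings estimate comes from applying the 41 percent stale-data share to the $5 million annual cost. The function names are illustrative.

```python
# Figures quoted in the text.
ANNUAL_TCO_PER_PB_USD = 5_000_000  # annual Tier 1 cost per petabyte
STALE_FRACTION = 0.41              # share not modified in three years

# Assumption: binary units, so 1 PB = 1024 * 1024 GB.
GB_PER_PB = 1024 * 1024


def cost_per_gb_month(annual_tco_usd: float, gb_per_pb: int = GB_PER_PB) -> float:
    """Spread an annual per-petabyte cost over gigabytes and months."""
    return annual_tco_usd / gb_per_pb / 12


def annual_stale_cost(annual_tco_usd: float, stale_fraction: float) -> float:
    """Annual Tier 1 spend attributable to stale data, per petabyte."""
    return annual_tco_usd * stale_fraction


if __name__ == "__main__":
    print(f"Cost per GB per month: ${cost_per_gb_month(ANNUAL_TCO_PER_PB_USD):.3f}")
    print(f"Annual cost of stale data per PB: "
          f"${annual_stale_cost(ANNUAL_TCO_PER_PB_USD, STALE_FRACTION):,.0f}")
```

With decimal units (1 PB = 1,000,000 GB) the monthly figure would be closer to 42 cents, so the unit convention matters when comparing against vendor quotes.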



Rapid Pace of Data Growth

More data and information are being generated than ever. New data and formats are being created at an unprecedented rate. Also, when new information is created, the metadata generated alongside it means that space requirements are increasing dramatically. Not only is the sheer amount of data we collect increasing, but developments such as the Internet of Things, the increased use of smart, wearable and mobile devices, and big data technologies have already added, or soon will add, significantly to the amount of information an agency needs to deal with. The interconnectivity of federal agencies also means that redundancies in collecting and distributing data are likely to occur.

While new technologies have clearly benefited the public, agencies face many challenges brought about by this growth of data and information. Managing information and data at scale, the lack of standardization in how information is handled across organizations, and the difficulty of maintaining awareness of who and what is on your network all make it necessary to examine current data policies and consider new tactics such as strong information governance strategies.

Challenges Caused by Data Growth

Managing information and data at scale
Lack of standardization
Maintaining network awareness

Why an Information Governance Strategy?

Information governance incorporates the policies, controls and information lifecycle management processes that organizations and government agencies use to control cost and risk. With the explosive growth of data storage and retention requirements, every department and agency is affected.

Key Principles of Information Governance

An information governance strategy may vary in the details from agency to agency, as no two sets of needs are exactly the same. Some common qualities of any well-thought-out plan will emerge, however, regardless of the specific organization.

First, the quality of the data must be consistent. This can be achieved through well-documented and repeatable processes and accurate data collection automation. Next, the information must conform to certain standards, possibly via automated data tagging and/or schemas. Lastly, the volume and complexity of the data must be easily manageable. This is usually done with automated data processing for analytics, and visualizations such as dashboard interfaces.

Information as a Strategic Asset

An information governance strategy underscores an organization’s commitment to managing its data and information strategically. It establishes important policies concerning an agency’s data, which give an administrator the increased ability to prioritize investments. It also creates trust through protection of information, information assets and privacy, as well as providing accountability for managing information.

Such strategies also promote standardization, not only within an agency but also between organizations. An agency’s processes become more efficient, repeatable and independent of bias, and it becomes easier to put controls on the protection of information.
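Standardization of this kind, where information conforms to a shared schema and is tagged automatically, can be checked in code. The following is a minimal sketch under stated assumptions: the schema, the category vocabulary and the function names are all hypothetical and stand in for whatever standard an agency actually adopts.

```python
from datetime import date

# Hypothetical agency-wide schema: field name -> required Python type.
RECORD_SCHEMA = {
    "record_id": str,
    "owner": str,
    "created": date,
    "category": str,
}

# Hypothetical controlled vocabulary for the "category" field.
ALLOWED_CATEGORIES = {"email", "contract", "personnel", "public_notice"}


def validate_record(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    for name, expected_type in RECORD_SCHEMA.items():
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], expected_type):
            problems.append(f"wrong type for {name}")
    category = record.get("category")
    if isinstance(category, str) and category not in ALLOWED_CATEGORIES:
        problems.append(f"unknown category: {category}")
    return problems


def tag_record(record: dict) -> set[str]:
    """Derive simple tags from a record's own fields (automated tagging)."""
    tags = set()
    if isinstance(record.get("category"), str):
        tags.add(f"category:{record['category']}")
    if isinstance(record.get("created"), date):
        tags.add(f"year:{record['created'].year}")
    return tags
```

Because the checks are plain functions over plain dictionaries, the same rules can run at collection time and again during periodic audits, which is what makes the process repeatable.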


Implementing a Strategy

To put an information governance strategy to work, it is necessary to rethink how your data is being handled. This may involve changing long-standing practices in your organization, along with the core beliefs behind those processes.

The Capstone Requirement

The National Archives and Records Administration (NARA) has developed the Capstone approach for email records management to improve transparency in government and to help resolve the massive information governance problems at agencies that have recently been in the news. These agencies have found that locating emails in the world of unstructured data is like searching for needles in a haystack. This becomes even more difficult when the whole world is watching and waiting for your results.

Capstone requires federal agencies to improve their management of electronic records, beginning with email. Adopting this approach simplifies management of email retention, implements controls on email deletion, and provides a more practical method for managing email accounts.

Compliance and Security

Adopting an information governance strategy can also make it easier to maintain compliance across the network, as well as focus on the security needs of your more business-critical data.

Risk is a significant concern when organizations store large volumes of unstructured data. This data may contain personally identifiable information (PII) or other sensitive content that could put organizations out of compliance with regulatory requirements and expose them to unauthorized access. To mitigate this risk, a governance policy must also include the option of automatic classification. Auto-classification should bring with it custom retention and access policies for the documents.

The best way to minimize risk is to expose it. And the best way to do that is to understand an agency’s information structure by determining the age of information, its location and its ownership. Understanding the risk profile of information allows agencies to shift from the “store everything” mentality to a value-focused perspective. Once agencies gain visibility into their information footprint, they must take action. Ultimately, the choice is between retention, protection and deletion. By leveraging critical insights into the value of their information, agencies can assign classifications, deploy policies and initiate cleanup. With so much of a network’s information having no legal, business or regulatory value, it is important for agencies to reduce and organize their data to keep it usable and to reduce storage and ownership costs.



How to Handle Data Management

Information governance doesn’t occur overnight. It happens when agencies bring together the right people, processes and technology. Administrators must develop sustainable policies that outlast a single project. Technologies that integrate and automate will drastically reduce the manual effort required to manage the information governance workflow and improve an agency’s ability to mitigate information risk.

Gaining Some Insight

The sheer size of today’s data stores has made direct human management practically impossible. Analytics software is essential to help people make decisions about their data.

Through the Data Genomics Project, Veritas developed Information Map, which is just such an analytics program. With it, a network admin can break data down geographically, by user or by cost. It also lets you look at orphaned or stale data that may be plaguing your system. Figure 1 shows one possible dashboard that provides all of this information with just a click. It also tracks any access or compliance risks that your files might have, which is more important than ever in today’s open-share method of collaboration. The social network map (figure 2) may also allow you to spot inappropriate or unnecessary access by employees. For instance, looking at that graph, one would immediately ask, “Who is the user in the middle and why do they have access to so many groups?”

figure 1

figure 2

Once you have a grasp of your data’s attributes, you can make the hard decisions on what to delete, what to move to permanent archive and more. This is more important than ever: the Association for Information and Image Management found in 2014 that 80 percent of an organization’s electronically stored information is redundant, outdated or trivial (ROT). Once data has aged 10 to 15 days, its probability of ever being looked at again approaches one percent.

Integrating the Two Sides

When you look at information governance, there are two aspects to consider. One is the metadata side: who is accessing which files, how much they are using them, where the files are located, and so on. The second and more crucial part is the actual content of the data that users are accessing. The real trick is determining who should have rights to what data, including which content.

“The metadata side can be set up using an infomap program like Data Insight,” said Duffy. “The content side needs to be set up through a data classification engine like in our archiving solution, Enterprise Vault. That will link content with metadata and really identify whether Norm should have access to all this personally identifiable information in the HR database. So, solving each problem is important, but to solve both you really do need an integrated solution.”
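The idea of linking the metadata side with the content side can be shown with a simple join. The sketch below does not represent how Data Insight or Enterprise Vault work; it assumes three hypothetical inputs (an access log, a table of content classifications, and a list of users authorized for each classification) and flags access that the authorization list does not cover.

```python
from collections import defaultdict

# Hypothetical metadata-side export: (user, file path) access events.
ACCESS_LOG = [
    ("norm", "/hr/payroll.csv"),
    ("norm", "/public/newsletter.txt"),
    ("hr_admin", "/hr/payroll.csv"),
    ("norm", "/hr/benefits.xlsx"),
]

# Hypothetical content-side output: file path -> classification labels.
CONTENT_LABELS = {
    "/hr/payroll.csv": {"pii"},
    "/hr/benefits.xlsx": {"pii"},
    "/public/newsletter.txt": set(),
}

# Hypothetical entitlements: label -> users allowed to access such content.
AUTHORIZED = {"pii": {"hr_admin"}}


def flag_unauthorized_access(access_log, content_labels, authorized):
    """Join access events with content labels and report uncovered access."""
    findings = defaultdict(set)
    for user, path in access_log:
        for label in content_labels.get(path, set()):
            if user not in authorized.get(label, set()):
                findings[user].add(path)
    return {user: sorted(paths) for user, paths in findings.items()}


if __name__ == "__main__":
    for user, paths in flag_unauthorized_access(ACCESS_LOG, CONTENT_LABELS, AUTHORIZED).items():
        print(user, "accessed sensitive files:", ", ".join(paths))
```

Neither input alone answers the question: the access log does not know which files are sensitive, and the classifications do not know who opened them. Only the join surfaces the finding.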


Conclusion

Information governance is ultimately about the discovery of authoritative data: data with known quality parameters and clearly stated purposes that fulfills well-understood data needs. Beyond providing all the protections required by law, it is also about providing greater visibility and accessibility of data through enterprise-level services.

While CIOs have the perception that they are responsible for information technology, in reality they are becoming more and more responsible for empowering their organizations to use information. So while information does not equate to IT, it does by and large support it. Information governance practices not only bring transparency and cost savings to federal agencies, but also enable compliance with Capstone. The result is knowledge that allows your organization to make better decisions about your information. Realizing the power in information and then properly governing that information will drive future investments and lower costs in storage, data centers, cloud and even staffing.

About Veritas

Veritas Technologies enables agencies to harness the power of their information to drive mission success, with solutions designed to serve the world’s most complex, heterogeneous environments. Veritas works with Federal agencies to help them improve their data availability and unlock insights to make them more successful in support of the overall mission.

From traditional data centers to private, public, and hybrid clouds, Veritas helps agencies protect, identify, and manage data regardless of the environment through a comprehensive product strategy and roadmap focused on Federal agency needs. Veritas’ products help automate information management and reduce manual efforts.

About GovLoop

GovLoop’s mission is to “connect government to improve government.” We aim to inspire public-sector professionals by serving as the knowledge network for government. GovLoop connects more than 250,000 members, fostering cross-government collaboration, solving common problems and advancing government careers. GovLoop is headquartered in Washington, D.C., with a team of dedicated professionals who share a commitment to connect and improve government.

For more information about this report, please reach out to info@govloop.com.



1152 15th Street NW, Suite 800 Washington, DC 20005 Phone: (202) 407-7421 | Fax: (202) 407-7501 www.govloop.com @GovLoop
