Avoiding ‘partial information loss’ in ECM

Enterprise content management (ECM) systems may require data–protection solutions that go beyond traditional backup–and–recovery software.

By Mark Ferelli

Data loss is a part of every business executive’s recurring nightmares. Losing data undermines business continuity, the 24x7 task of keeping a business running smoothly. Although considered by some to be a politically correct way to say disaster recovery, business continuity is not the same thing. Business continuity is the overall strategy and tactics for ensuring operations and transactions continue despite disruption, be it natural or man–made. Disaster recovery is a subset of business continuity that focuses on bringing a business’ IT infrastructure back online after a full system failure or disaster.

When some people think of disaster recovery, they picture fires, floods, and hurricanes; others think of hackers, disgruntled employees, and the various kinds of malware that can halt the IT infrastructure. But there is another kind of data loss, sometimes referred to as partial information loss, that may not crash the entire IT infrastructure but still bleeds a business of resources and revenue.

Partial information loss is the loss of one, several, or thousands of key files, folders, e–mails, images, and/or documents. This can be particularly devastating in enterprise content management (ECM) systems. These systems are designed to manage the convergence of unstructured data with structured data, so that a business can effectively meet business goals, streamline operations, serve its customers, protect itself against lawsuits and non–compliance with regulatory requirements, and generally manage an extremely large and rapidly growing amount of information online.

The information stored in ECM systems comprises content and metadata. Metadata is “data about data”—audit trails, annotations, digital signatures, renditions, and other important information associated with content. Content, metadata, and the relationships between them must be protected, as the loss or corruption of just one piece of critical information can cause non–compliance, halt worker productivity, and in worst–case scenarios, bring operations to a standstill.

Partial information loss accounts for the majority (more than 80%) of all ECM system information loss and must therefore be treated as a serious threat. The incidents that plague ECM systems have a variety of common causes, including customized workflows and lifecycles that aren’t fully debugged, accidental deletions, application failures, network power drops, incorrect date ranges used during ECM maintenance, malicious tampering, and viruses. To avoid their potentially devastating impact, it is critical to protect your organization before these incidents occur, not after they are discovered.

The cost of lost data

In many business environments a small amount of information can impact millions of dollars in revenues or liabilities. For example, in the pharmaceutical industry, potential drug sales of well over $1 million are lost for every extra day spent in bringing a drug to market. If a pharmaceutical company is missing just one piece of data within the electronic–submission forms that it submits to the FDA to get its drug approved, those forms will be rejected, causing time–to–market delays that will cost the company dearly.

If it’s a small amount of information that’s lost, why not just re–create it? According to a “Cost of Downtime” survey from Deloitte, re–creating lost data will cost $2,000 to $8,000 per megabyte. And most likely you won’t be able to re–create the critical metadata associated with the content.
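To put that per–megabyte range in perspective, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name and the 5 MB loss are hypothetical, and the only figures taken from the survey are the $2,000 and $8,000 bounds.

```python
# Per-megabyte re-creation cost range quoted above (US dollars).
LOW_COST_PER_MB = 2_000
HIGH_COST_PER_MB = 8_000


def recreation_cost_range(megabytes):
    """Return the (low, high) estimated cost of re-creating `megabytes` of lost data."""
    if megabytes < 0:
        raise ValueError("megabytes must be non-negative")
    return megabytes * LOW_COST_PER_MB, megabytes * HIGH_COST_PER_MB


# Hypothetical example: a 5 MB loss, chosen only for illustration.
low, high = recreation_cost_range(5)
print(f"Estimated re-creation cost: ${low:,} to ${high:,}")
# prints "Estimated re-creation cost: $10,000 to $40,000"
```

Even a loss of a few megabytes lands in five figures, and that estimate covers only the content, not the metadata that usually cannot be re–created.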

The costs of partial data loss are not limited to dollars and cents. Partial data loss can cripple customer service functions, resulting in customer dissatisfaction, and undermine future business opportunities. Loss of market share and negative brand exposure can also be collateral damage. In addition, companies that cannot recover from partial information loss incidents can incur significant compliance risk, the effects of which may not fully manifest themselves until someone tries to access the corrupted information for an audit or to comply with e–discovery requirements. Partial information loss can wreak havoc on all types of companies, large and small, in all types of industries. Crippling costs are evident in real–world events, and companies need to actively prepare for when partial information loss hits home because it’s not a matter of if; it’s a matter of how much and how often.

One folder = $100,000+

Take the case of a Fortune 200 pharmaceutical firm. In January 2007, an IT administrator accidentally removed a folder containing critical drug manufacturing information that was linked to 15,000 other users’ files, resulting in the loss of all documents, metadata corruption of more than 100,000 links, and interrupted workflows. This caused an immediate manufacturing shutdown, halting the production of drugs worth millions of dollars in revenue and putting the company at compliance risk due to missing Standard Operating Procedure (SOP) documents that were necessary for meeting Current Good Manufacturing Practice (CGMP) guidelines.

Business unit leaders, IT, and a vice president had to take immediate action to address the loss. Since the files were imperative to continuing operations and subject to retention requirements, the decision was made to recover the files.

To recover the lost content, the company had to use its traditional backup–and–recovery solution to roll the entire ECM system back to the last known good state using a 14–hour–old backup. The rollback took productive employees offline and discarded all additions and changes made to the repository since the backup was created. Furthermore, the metadata was permanently lost. Extensive ECM system downtime was incurred, and more than 1,100 employee hours were spent retrieving content from tape, reloading the content into the ECM system, and attempting the manual re–creation of metadata.

Emergency costs associated with this incident were $12,000, and estimated costs for recovery procedures were approximately $111,000. Further investigation showed that the company had suffered 46 partial information loss events over a three–year period, resulting in more than $270,000 of lost productivity and manufacturing shutdowns. If you also consider the financial and legal liabilities associated with unrecoverable information, the total costs of these incidents are staggering.

Protect against partial data loss

Traditional backup–and–recovery solutions aren’t suitable for recovering metadata, or content for that matter, that is lost or corrupted in a partial information loss incident. They are designed only for full system failures or disasters and don’t provide the granular recovery capabilities needed to recover from partial information loss. Companies that rely on them must either suffer the repercussions of the partial loss or recover from the last full backup, which takes employees offline and causes additional data loss.

Recovering from a partial data loss event requires software that can recover content and metadata in its original state, at a granular level, while the ECM system remains online.
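The difference between a full rollback and a granular recovery can be sketched with a toy model in Python. This is a minimal illustration under stated assumptions, not the API of any real ECM or backup product: the `Record` class, the in–memory dictionary standing in for a repository, and the function names are all hypothetical.

```python
from copy import deepcopy
from dataclasses import dataclass, field


@dataclass
class Record:
    """One repository object: its content plus the metadata tied to it."""
    content: str
    metadata: dict = field(default_factory=dict)


def take_snapshot(repo):
    """Copy every record so later edits to the live repository cannot alter the snapshot."""
    return deepcopy(repo)


def full_rollback(repo, snapshot):
    """Traditional approach: replace everything, discarding all work since the snapshot."""
    repo.clear()
    repo.update(deepcopy(snapshot))


def granular_restore(repo, snapshot, object_ids):
    """Restore only the named objects (content and metadata); leave the rest untouched."""
    restored = []
    for object_id in object_ids:
        if object_id in snapshot:
            repo[object_id] = deepcopy(snapshot[object_id])
            restored.append(object_id)
    return restored


# Hypothetical repository with two objects.
repo = {
    "sop-001": Record("Mixing procedure v3", {"signed_by": "qa-lead"}),
    "memo-17": Record("Draft memo", {"status": "draft"}),
}
snapshot = take_snapshot(repo)

# Work continues after the snapshot: one legitimate edit, one accidental deletion.
repo["memo-17"].content = "Final memo"
del repo["sop-001"]

# Bring back only the deleted object; the edit to memo-17 survives.
restored = granular_restore(repo, snapshot, ["sop-001"])
```

In this sketch the granular restore returns the deleted record together with its metadata while the later edit to the memo is preserved; calling `full_rollback` instead would also revert the memo to its draft state, which mirrors the loss of recent changes described in the pharmaceutical example.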

A hidden threat

If partial data loss is such a devastating and common occurrence, why haven’t you heard about it before?

  • Many executives haven’t experienced the effects of partial information loss firsthand, and therefore assume these incidents have not and will not impact their organizations. However, it’s almost certain that their organizations have already experienced partial information loss and will experience it again. They may discover it, for instance, when they’re asked to produce documents during a trial only to find out that those documents are corrupted and unrecoverable in their original state.
  • Many incidents are not reported because the IT department, stretched thin meeting a variety of needs and shrinking backup windows, cannot justify conducting a full–system restore for a granular amount of information, forcing users to spend time re–creating content, usually unbeknownst to management (and the crucial metadata usually cannot be re–created).
  • The literature surrounding data loss tends to focus on massive incidents where the complete infrastructure shuts down, giving many companies the impression that the main thing they have to worry about is a full system failure. As previously stated, however, more than 80% of ECM system information loss is caused by partial loss incidents.

Your bottom line

Next to personnel, a company’s most important resource is information. Anything that blocks the flow of information is a threat to business continuity.

Without a granular recovery solution, the only way to recover lost ECM repository information in its original state is to do a full system rollback, which incurs system downtime, introduces potential inconsistencies, and blocks access to information. However, 24x7 access to information is increasingly critical in today’s global business environment, and the costs associated with the ECM system being down for even one hour can be significant (see table).

ECM protection

Partial data loss can be as great a financial and operational burden as complete infrastructure unavailability. The tangible and intangible costs associated with partial information loss will only continue to rise over time, particularly as the regulatory environment is expected to grow more burdensome and demanding. Any company that cares enough about information availability to install and operate an ECM solution needs to go the extra mile and secure the information within it to ensure business continuity, optimize productivity, and minimize compliance risk.

Text documents, financial records, contact records, tax filings, contracts, e–mail messages, and anything else you have stored in your ECM system may disappear forever unless you take immediate action to prevent it. Recovering information lost or corrupted due to partial information loss incidents can take a lot of time, money, and resources that most companies can ill afford.

Bare–bones survival depends on the prompt protection and recovery of your ECM files and records. However, too many companies simply neglect to take backup precautions. They leave their data files unprotected and thus expose their businesses to the danger of bankruptcy, fines, and other penalties.

Why do managers do it? Mainly because they don’t understand the threat of partial information loss and/or they’re under the delusion that it will never happen to them.

Take action now to deploy an ECM–focused, granular recovery solution so that your files are fully protected and your business is insured against partial data loss.

Mark Ferelli is a freelance writer.

This article was originally published on May 01, 2008