By Heidi Biggar
One of the biggest problems users run into when trying to meet regulatory requirements for unstructured data types (e.g., e-mail and files) is determining what data needs to be saved for compliance purposes and what data doesn’t.
Saving the wrong data, or too much data, can have significant and potentially costly implications for organizations in audit situations, and it can lead to unnecessary storage consumption.
With unstructured data, unlike structured data (e.g., databases), IT administrators generally know very little about the content beyond attributes such as file name and access history. The problem is heightened as data is distributed around the organization and structured data becomes involved (see figure).
StoredIQ 3.0 limits “leakage” without disrupting the existing workflow process, allowing users to better meet regulatory compliance guidelines or other internal business processes.
“It’s a huge issue,” says Bob Fernander, CEO of StoredIQ (formerly Deepfile Corp.). “We know all about structured data, but comparatively little about unstructured data. We know when the data was last accessed, but we don’t really know how to manage it based on what is in the file.”
Deepfile recently changed its name to StoredIQ to reflect its new focus on regulatory compliance; Deepfile previously focused more generally on file management, although compliance had become an increasing part of the company’s strategy over its three-year history. The new StoredIQ 3.0 content-driven compliance platform is based largely on Deepfile’s Sentinel, Auditor, and Enforcer applications. StoredIQ’s first Solutions Pack is targeted at HIPAA compliance.
“Deepfile’s roots were in file and metadata management, so the new company’s positioning isn’t far from that,” says Peter Gerr, senior analyst for data management and data-protection solutions and emerging technologies at the Enterprise Strategy Group (ESG) consulting firm. “In fact, StoredIQ is leveraging the same technologies and products Deepfile had developed but is focusing them on a specific set of requirements, [i.e., compliance].”
According to Gerr, the company’s repositioning was an acknowledgment both that its general approach wasn’t working and that it needed to do more than develop technology to differentiate itself in the increasingly crowded compliance market. “They had to, and appear to be on their way to doing so, develop intellectual property around their base technology,” says Gerr. “Focusing on one set of regulations [i.e., HIPAA] should eventually enable StoredIQ to become subject-matter experts in this area, which will help further differentiate them from competitors.”
While there is some overlap between the StoredIQ 3.0 platform and content management products from vendors such as Documentum, FileNet, and Vignette, as well as e-mail archival applications from vendors such as KVS (which is now part of Veritas) and iLumin, StoredIQ’s Fernander says these types of technologies do not have the extensive discovery/indexing capabilities that StoredIQ 3.0 does. As such, StoredIQ 3.0 complements the other products more than it competes with them, according to Fernander.
Though StoredIQ’s initial focus is HIPAA compliance, the company says that the “lexicons” (templates) it developed for its HIPAA Solutions Pack can be easily modified for other regulations, such as SEC 17a-4 and Sarbanes-Oxley (see figure on p. 20). Users can also develop their own lexicons for specific regulatory requirements not covered by StoredIQ or for other business processes such as legal discovery.
The StoredIQ architecture addresses privacy, records retention, and information security issues.
StoredIQ’s HIPAA lexicon is designed to find (or discover) electronic protected health information (ePHI) that meets a variety of criteria, including entities (people, places, things, etc.), proximity types (date of birth, relative name, etc.), patterns (ICD-9, HCPCS, Social Security number, etc.), inclusion lists (medical terms, drug names, etc.), and exclusion lists (employee names, institution names, etc.).
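To make the lexicon idea concrete, the following is a minimal sketch of how patterns, an inclusion list, and an exclusion list might combine to flag a document. It is not StoredIQ’s implementation: the `Lexicon` class, its field names, the `min_signals` threshold, and the simplified regular expressions (a dashed Social Security number and a numeric-only ICD-9-style code) are all assumptions made for illustration, and entity and proximity matching are left out.

```python
import re
from dataclasses import dataclass


@dataclass
class Lexicon:
    """Hypothetical lexicon: names and scoring are illustrative assumptions."""
    patterns: dict           # label -> compiled regex
    inclusion_terms: set     # lower-case single words that suggest ePHI
    exclusion_phrases: list  # lower-case phrases to ignore (e.g., institution names)
    min_signals: int = 2     # assumed threshold for flagging a document

    def scan(self, text):
        # Blank out excluded phrases first so they cannot trigger other matches.
        cleaned = text.lower()
        for phrase in self.exclusion_phrases:
            cleaned = cleaned.replace(phrase, " ")

        # Record which patterns and which inclusion terms appear at least once.
        pattern_hits = sorted(
            label for label, rx in self.patterns.items() if rx.search(cleaned)
        )
        words = set(re.findall(r"[a-z]+", cleaned))
        term_hits = sorted(words & self.inclusion_terms)

        signals = len(pattern_hits) + len(term_hits)
        return {
            "patterns": pattern_hits,
            "terms": term_hits,
            "flagged": signals >= self.min_signals,
        }


lexicon = Lexicon(
    patterns={
        # Simplified: dashed Social Security number only.
        "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
        # Simplified: numeric ICD-9-style codes only (no V or E codes).
        "icd9": re.compile(r"\b\d{3}\.\d{1,2}\b"),
    },
    inclusion_terms={"diabetes", "diagnosis", "insulin"},
    exclusion_phrases=["mercy diabetes center"],  # made-up institution name
)

record = "Patient SSN 123-45-6789, diagnosis 250.00, prescribed insulin."
memo = "Mercy Diabetes Center staff meeting moved to Friday."

print(lexicon.scan(record))
print(lexicon.scan(memo))
```

In this sketch the made-up patient record is flagged because it triggers both patterns and two inclusion terms, while the memo is not: the institution name on the exclusion list is removed before matching, so the word “diabetes” inside it never counts as a medical term.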
The StoredIQ 3.0 software finds all the files that match specific regulatory requirements, protects these files via encryption, and then moves the files to the appropriate back-end storage. Support for EMC and Network Appliance hardware is in the works.
The software works only with files at rest; it runs on an appliance, is agent-less, and currently supports NFS and CIFS, as well as other standard network protocols. WAN support is planned. Until then, users must install an appliance (within the firewall) at each physical location.
The HIPAA Solutions Pack is pre-configured to meet HIPAA requirements and can reportedly support more than 200 file types (e.g., Word, Excel, and PDF).
Pricing for the appliance depends on the configuration and the amount of structured data being filtered. The platform with the HIPAA Solutions Pack is priced from $135,000 for 1TB of managed data; a similarly configured appliance without the pre-configured lexicon lists for $100,000. A three-year subscription rate is also available.