Active Archive
Active Archive is a method of tiered storage that gives the user access to data across a virtualized file system, which migrates data between multiple storage systems and media types including solid-state drive/flash, hard disk drives, magnetic tape, optical disk, and cloud. The result of an active archive implementation is that data can be stored on the media type best suited to its retention and restoration requirements.[1] This allows less time-sensitive or infrequently accessed data to be stored on less expensive media, and eliminates the need for an administrator to manually migrate data between storage systems. Additionally, because storage systems such as tape libraries have very low power consumption, the operational expense of storing data in an active archive is reduced.[2]
Active archives provide organizations with a persistent view of the data in their archives and make it easy to access files whenever needed. Active archives use metadata to track where primary, secondary, and sometimes tertiary copies of data reside within the system, so that any file in the file system remains accessible online or near-online regardless of the storage medium in use.[3] The impetus for active archive applications, the software involved in an active archive, was the growing amount of unstructured data in the typical data center and the need to manage and store that data efficiently.[4] As a result, active archive applications tend to focus on file systems and unstructured data rather than on all of an organization's data; however, many have features and functions that address traditional backup needs as well.[5]
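The metadata tracking described above can be pictured as a catalog that maps each file path to the copies held on different media. The following is a minimal sketch under stated assumptions, not the design of any particular product: the `Copy` and `Catalog` names, the tier labels, and the locator strings are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Copy:
    role: str      # "primary", "secondary", or "tertiary"
    tier: str      # e.g. "flash", "disk", "tape", "cloud"
    location: str  # hypothetical device-specific locator


@dataclass
class Catalog:
    # Maps a file path in the virtualized file system to its known copies.
    entries: Dict[str, List[Copy]] = field(default_factory=dict)

    def record(self, path: str, copy: Copy) -> None:
        """Register one more copy of the file at `path`."""
        self.entries.setdefault(path, []).append(copy)

    def locate(self, path: str) -> List[Copy]:
        """Return every known copy of a file, primary first."""
        order = {"primary": 0, "secondary": 1, "tertiary": 2}
        return sorted(self.entries.get(path, []), key=lambda c: order[c.role])


catalog = Catalog()
catalog.record("/projects/scan.tif", Copy("secondary", "tape", "LTO:0042"))
catalog.record("/projects/scan.tif", Copy("primary", "disk", "vol3/obj/91"))
```

Because the user only ever sees the path, the catalog is what lets a file stay visible in the file system while its copies move between media.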
Active archives provide online access to long-term data, along with search and retrieval, and can scale to accommodate future growth. In addition, active archives enhance the business value of the data by enabling users to access it directly online, search it, and use it for their business purposes.[6]
Performance
Because an active archive is built around a cost-performance trade-off, performance varies significantly from one implementation to another.[5] Within an active archive, the quantities and types of media used are determined by the retention and access requirements of the different types of data. This gives a company the flexibility to set its own tolerance for how long any given type of data may take to access. In general, however, active archive systems can recall data to a user in anywhere from milliseconds to 2 minutes, depending on the type of media on which the data resides.[7]
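The cost-performance trade-off can be illustrated by choosing, for each class of data, the least expensive medium whose recall time still falls within the tolerated access time. The sketch below is illustrative only: the `TIERS` table, its recall times, and its relative costs are assumed figures chosen to span the milliseconds-to-2-minutes range mentioned above, and `cheapest_tier` is a hypothetical helper.

```python
# Assumed, illustrative figures; none of these numbers come from a real system.
TIERS = [
    # (name, assumed typical recall time in seconds, assumed relative cost per TB)
    ("flash", 0.001, 10.0),
    ("disk", 0.02, 4.0),
    ("cloud", 5.0, 2.0),
    ("tape", 120.0, 1.0),
]


def cheapest_tier(max_recall_seconds: float) -> str:
    """Pick the lowest-cost tier whose recall time fits the access tolerance."""
    candidates = [t for t in TIERS if t[1] <= max_recall_seconds]
    if not candidates:
        raise ValueError("no tier meets the requested recall time")
    return min(candidates, key=lambda t: t[2])[0]
```

With these assumed figures, data that can wait two minutes lands on tape, while data needed within a second lands on disk.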
Important aspects
Because an active archive stores primary, secondary, and tertiary copies of data, its implementation depends on more than the ability to move and access data: data integrity, media monitoring, energy efficiency, and interoperability are all important components. Many active archive components include features such as self-healing data within the software, versioning, encryption, and media health monitoring. Conversely, because an active archive also serves as an archive, features such as automatic migration between storage devices and technologies, vendor-neutral formatting, and information lifecycle management (ILM) are important as well. Many of these requirements are driven by industry-specific compliance regimes such as HIPAA, SOX, and PCI DSS.[8]
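One common way to approach self-healing data is to store a checksum with each file, verify every copy against it, and overwrite any copy that fails verification with one that passes. The sketch below assumes that approach; it is not drawn from any specific active archive product, and the `heal` function and the in-memory `copies` dictionary are hypothetical stand-ins for real storage devices.

```python
import hashlib
from typing import Dict, List


def sha256(data: bytes) -> str:
    """Hex digest used as the stored fingerprint of a file's contents."""
    return hashlib.sha256(data).hexdigest()


def heal(copies: Dict[str, bytes], expected_digest: str) -> List[str]:
    """Overwrite corrupted copies with a verified one; return the repaired tiers."""
    good = next((d for d in copies.values() if sha256(d) == expected_digest), None)
    if good is None:
        raise ValueError("no intact copy available")
    repaired = []
    for tier, data in list(copies.items()):
        if sha256(data) != expected_digest:
            copies[tier] = good  # replace the damaged copy in place
            repaired.append(tier)
    return repaired


original = b"archived record"
copies = {"disk": original, "tape": b"archived rec0rd"}  # tape copy is damaged
repaired_tiers = heal(copies, sha256(original))
```

Repair is only possible while at least one copy still matches the stored digest, which is one reason secondary and tertiary copies matter.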
Difference between Active Archive and Hierarchical Storage Management (HSM)
While active archiving is often compared to HSM, the two methods are implemented very differently. Unlike in an HSM, data in an active archive remains online regardless of its age or usage. The access pattern in an active archive also differs from that of a traditional HSM: when data is requested, it is not automatically restored to the "higher tier" storage system but is instead accessed directly from the storage device on which it rests. This makes every storage device in an active archive both primary storage and archival storage.[1]
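The difference in access pattern can be reduced to a toy model in which a dictionary records the tier each file currently rests on. This is a deliberately simplified sketch: the function names, the `placement` dictionary, and the default `top_tier` value are hypothetical, and real systems involve staging, caching, and policy details that are omitted here.

```python
from typing import Dict


def hsm_read(placement: Dict[str, str], path: str, top_tier: str = "disk") -> str:
    """HSM-style access: recall the file to the higher tier, then serve it from there."""
    placement[path] = top_tier
    return placement[path]


def active_archive_read(placement: Dict[str, str], path: str) -> str:
    """Active-archive-style access: serve the file from wherever it currently rests."""
    return placement[path]


placement = {"/reports/2009.pdf": "tape"}
served_from_archive = active_archive_read(placement, "/reports/2009.pdf")
tier_after_archive_read = placement["/reports/2009.pdf"]
served_from_hsm = hsm_read(placement, "/reports/2009.pdf")
```

In this model the active archive read leaves the file's placement untouched, while the HSM read changes it as a side effect of access.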
An active archive is an archive in the sense that it manages the data within the active archive throughout the lifecycle of that data according to each company's particular Information Lifecycle Management (ILM) policies and procedures.[9] This means that while the active archive serves as the primary storage pool, it is also the final storage location for a file at the same time.[1]
Active Archive Alliance
The Active Archive Alliance is a collaborative industry alliance dedicated to promoting active archives for simplified, online access to all archived data. Launched in early 2010 by founding technology partners Dell, FileTek, QStar Technologies Inc, Silicon Graphics International (SGI) and Spectra Logic, the Active Archive Alliance is a vendor-neutral organization open to leading providers of active archive technologies, including file systems, active archive applications, cloud storage, and high-density tape and disk storage, as well as to individuals and end users. Active Archive Alliance members provide active archive solutions, best practices, and industry testimonials so that organizations can achieve fast, online access to all their data in the most cost-effective manner. Active Archive Alliance members include Cleversafe, Crossroads Systems, Dell, DataDirect Networks, Fujifilm, GRAU DATA AG, Hewlett-Packard, Imation, QStar Technologies Inc, Quantum Corporation, Scality, Seven10, Silicon Graphics International, Spectra Logic and XenData.[10]
References
- [1] "What is Active Archive?".
- [2] "Factoring Power and Cooling Costs Into Tiered Storage Decisions (Diogenes Lab)".
- [3] "Building an Active Forever Archive - Metadata Evolution and the Problem of What to Keep".
- [4] "Active Archive Webinar (Jon Toigo)".
- [5] "Time Value of Data (Floyd Christofferson, SGI)".
- [6] "Active Archive Overview Brochure".
- [7] "Active Archive Alliance".
- [8] "A Tutorial on Self-Healing Data Storage Systems".
- [9] "SNIA Dictionary: Archive".
- [10] "About the Active Archive Alliance".