Metadata holds key to future of storage
Published: 21 Oct 2003 14:55 BST
As you can imagine, reliability is essential in a Storage Tank-like architecture. An organisation could be paralysed if its metadata somehow became inaccessible. To guarantee availability, Storage Tank relies on a cluster of Intel/Linux metadata servers (the minimum configuration consists of two servers and costs $90,000). As with most clustering technologies, redundancy of the metadata database is part of the pitch for Storage Tank's cluster, which can grow to as many as eight systems.
In fact, databases play a key role in metadata-driven systems. To find what it's looking for quickly, any storage infrastructure that depends on more than some basic information will require a robust, super-fast, secure and fault-tolerant database technology. If not done correctly, a layer of metadata (and the database that goes with it) can do more harm than good. Recall that, for the most part, metadata-driven virtualisation of storage replaces scenarios where users, applications and operating systems were more hardwired to storage -- scenarios where few compromises in performance and availability are made. But once layers of abstraction -- essentially translation layers -- are inserted into those scenarios, the potential for things to go awry increases. This embedding of database technology into the storage infrastructure is where the rocket scientists get involved, and the degree to which those rocket scientists succeed at dealing with the compromises introduced by additional layers of technology is what will separate the true winners from the rest of the pack.
For example, Microsoft has already made it clear that its forthcoming Yukon relational database technology is a key building block of WinFS. Although we most often hear about WinFS during discussions of Longhorn, Microsoft's ambitions for WinFS won't stop at the desktop. When you consider any of these storage virtualisation technologies and the voluminous amounts of metadata they will have to keep track of, it's not surprising that Microsoft's WinFS is waiting for Yukon, given the advertised advancements of its next-generation database technology (high availability, additional backup and restore capabilities, replication enhancements, and being secure by default). Likewise, neither EMC's content addressable storage nor IBM's Storage Tank would be able to step onto the field if their metadata databases weren't deeply embedded into the infrastructure and didn't share some of the design goals that Microsoft has for Yukon.
Embedding databases into the storage food chain isn't new. Much of what's being done today in the area of storage virtualisation resembles the architecture of IBM's 1970s-class System 38. The System 38 technology eventually evolved into the AS/400, which in turn evolved into IBM's current iSeries midrange systems. According to IBM iSeries senior technical staff member Amit Dave, "Storage virtualisation was a critical design criteria of the System 38, and integrating a database directly into the operating system played a pivotal role in achieving that objective." The benefits, according to Dave, were clear: "The idea was to eliminate all notions of a disk drive and to relieve the users of any concerns about data placement or storage management. Instead, users only had to concern themselves with creating a data template and the system would take care of the rest. The System 38 automatically pooled disk drives so that they appeared to the application as one virtual memory store. Applications, and therefore users, had no knowledge of what or how many disk drives a system had, so they didn't have to know how to store the data, how to allocate space, or how to spread that data across volumes."
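To make the pooling idea concrete, here is a minimal Python sketch of a metadata layer that hides individual drives behind a single store. It is an illustration only, not a description of how the System 38 or any of the products named above is actually built: the class names (Disk, PooledStore), the four-byte block size and the "most free space first" placement rule are all assumptions invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Disk:
    """One physical drive in the pool (hypothetical model)."""
    name: str
    capacity: int                                # number of fixed-size blocks
    blocks: dict = field(default_factory=dict)   # block index -> bytes

    def free_blocks(self) -> int:
        return self.capacity - len(self.blocks)


class PooledStore:
    """Presents several disks as one store; callers never name a disk."""

    BLOCK_SIZE = 4  # deliberately tiny so the spreading is easy to see

    def __init__(self, disks):
        self.disks = disks
        # The metadata layer: object name -> ordered (disk index, block index) pairs.
        self.metadata = {}

    def put(self, name: str, data: bytes) -> None:
        if name in self.metadata:
            raise ValueError(f"{name!r} already exists")
        chunks = [data[i:i + self.BLOCK_SIZE]
                  for i in range(0, len(data), self.BLOCK_SIZE)]
        if sum(d.free_blocks() for d in self.disks) < len(chunks):
            raise IOError("pool is full")
        placement = []
        for chunk in chunks:
            # Assumed placement policy: write to the disk with the most free blocks.
            disk_idx = max(range(len(self.disks)),
                           key=lambda i: self.disks[i].free_blocks())
            disk = self.disks[disk_idx]
            block_idx = len(disk.blocks)         # next unused slot on that disk
            disk.blocks[block_idx] = chunk
            placement.append((disk_idx, block_idx))
        self.metadata[name] = placement

    def get(self, name: str) -> bytes:
        # Reassemble the object purely from the metadata map.
        return b"".join(self.disks[d].blocks[b] for d, b in self.metadata[name])


pool = PooledStore([Disk("a", 2), Disk("b", 3)])
pool.put("orders", b"0123456789")
print(pool.get("orders"))
print(pool.metadata["orders"])
```

The caller only ever deals in object names; which drive holds which block is recorded in the metadata map and nowhere else. That is also why the map is a single point of failure: lose it and the blocks on the disks cannot be reassembled.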
Compared to the more horizontally focused applications of today's storage virtualisation technologies, the System 38 focused primarily on one type of application (databases). But the design goals, and the ultimate benefits to the end user, are nearly identical. Without the metadata layer, according to Dave, much of it wouldn't have been possible. Says Dave, "Over the long haul, the concept of metadata can lead storage in any number of directions. The opportunities are enormous and endless."





