Metadata holds key to future of storage
Published: 21 Oct 2003 14:55 BST
Microsoft may not be using the word "metadata", but it's evident from the information Microsoft is sharing that metadata will play a role in WinFS. The dead giveaway is one of WinFS' lynchpins: the next version of Microsoft's SQL Server relational database (code-named Yukon). Microsoft has already said that the querying capabilities of SQL Server are a key building block of WinFS. Microsoft senior vice president Bob Muglia recently told CNET's News.com that WinFS also would incorporate the data labelling capabilities of Extensible Markup Language (XML). Said Muglia in that interview: "Think of WinFS as pulling together relational database technology, XML database technology, and file streaming that a file system has. It's a [storage] format that is agnostic, that is independent of the application."
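Muglia's description of relational querying over stored items can be pictured with a small sketch. This is not WinFS or Yukon code: it uses Python's built-in SQLite module as a stand-in for a relational engine, and the table name, column names and sample rows are all hypothetical, chosen only to show files being found by their descriptive attributes rather than by folder path.

```python
import sqlite3

# In-memory relational store standing in for a metadata-aware file system.
# The "items" table and its columns are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (name TEXT, kind TEXT, author TEXT, modified TEXT)"
)
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?, ?)",
    [
        ("plan.doc", "document", "alice", "2003-10-01"),
        ("budget.xls", "document", "alice", "2003-10-14"),
        ("logo.png", "image", "bob", "2003-09-30"),
    ],
)

# Ask for items by what they are and who wrote them, not by where they live.
rows = conn.execute(
    "SELECT name FROM items WHERE author = ? AND kind = ? ORDER BY name",
    ("alice", "document"),
).fetchall()
names = [row[0] for row in rows]
print(names)
```

The point of the sketch is only that once descriptive attributes are held in queryable form, the application asking the question needs no knowledge of how or where the items are stored.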
Perhaps demonstrating the versatility that a metadata layer can bring to the storage food chain, IBM's recently introduced TotalStorage SAN File System (also known as Storage Tank) takes metadata in a completely different direction. The idea behind Storage Tank is to virtualise storage, but not in the way you might think. Applications, operating systems and end users are indeed divorced from storage specifics. But the perspective is one of storage management rather than content management (although it's still capable of traditional content management).
IBM claims that the Storage Tank project will reach its shining moment when enterprises can take most or all of their current storage area networks (regardless of vendor or location) and merge them into a cloud (or "tank") of storage with a uniform interface that services all users, applications, and operating systems in utility-like fashion. What makes Storage Tank tick? Metadata.
According to a recent IBM press release, "IBM Research-designed software keeps track of descriptive information -- 'metadata' such as physical locations, file sizes or access permissions -- that accompanies the actual content within the files. Where most storage systems include this metadata in the storage system itself, Storage Tank spreads the information across servers on the network -- with the IBM software precisely monitoring the location of the metadata." In other words, to enable its distributed nature and its eventual ability (not yet delivered) to assemble a cloud from heterogeneous parts, Storage Tank depends on system-level (as opposed to content-level) metadata. The resulting infrastructure, says IBM, bears the on-demand characteristics of utility computing: no entity will run out of capacity, and total cost of ownership is kept to a minimum because IT managers have a single point of management and don't have to overbuild silos of storage to accommodate each entity's individual growth.
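The separation IBM describes, with system-level metadata held apart from file content, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not IBM's design: the class names, the `read_file` helper, the node names and the sample file are all hypothetical, and a real SAN file system would add locking, caching and fault tolerance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileMetadata:
    node: str                 # which storage node holds the content
    size_bytes: int           # file size, tracked apart from the data
    allowed_users: frozenset  # simple stand-in for access permissions


class MetadataServer:
    """Holds descriptive information only, never file content."""

    def __init__(self):
        self._catalog = {}

    def register(self, path, meta):
        self._catalog[path] = meta

    def lookup(self, path, user):
        meta = self._catalog[path]
        if user not in meta.allowed_users:
            raise PermissionError(f"{user} may not access {path}")
        return meta


def read_file(path, user, metadata_server, nodes):
    # Step 1: ask the metadata server where the content lives.
    meta = metadata_server.lookup(path, user)
    # Step 2: fetch the content directly from that storage node.
    return nodes[meta.node][path]


# Two hypothetical storage nodes, modelled as plain dictionaries.
content = b"quarterly figures"
nodes = {"node-a": {"/reports/q3.txt": content}, "node-b": {}}

server = MetadataServer()
server.register(
    "/reports/q3.txt",
    FileMetadata("node-a", len(content), frozenset({"alice"})),
)

data = read_file("/reports/q3.txt", "alice", server, nodes)
print(data)
```

Because the caller only ever names a path, content could in principle move from one node to another with nothing changing except the catalogue entry, which is the kind of indirection a single point of management relies on.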