I met someone knowledgeable about storage for large systems, and he spoke about 'object storage' as opposed to the 'normal storage' we currently use, based on filesystems we know (HFS, NTFS, etc.).
I'm trying to wrap my head around this but I can't seem to fathom how (or why) it will be better than what we currently use. Does anyone have any clue on this? So is this the 'next big thing' in storage? Or is it one of those things - like multimedia, 3D, holography etc.
FCP Editor, Edit systems consultant
It's an attempt to standardize and virtualize how OSes and applications deal with storage, making it more intelligent and universal across multiple types of devices. Kind of like what SCSI and SATA did for block storage, and TCP/IP did for networking, except going even further. It doesn't seem to be here yet, as this Register article illustrates, and certainly not yet for desktop or SMB storage.
The key word in your question is "big."
The world is still defining the problems that object storage will solve. However, most object storage vendors are positioning the technology as a solution for certain kinds of big data and long-term archives. When I say BIG, I'm talking about environments of half a petabyte minimum -- extreme scale -- places where a single, "traditional" file system like those you mention falls short of such large requirements. New scale brings new problems that require new solutions.
I think this technology will not be relevant for the *typical* video editing needs and workflows that are discussed here on the Cow. (There may be some developments that make object storage applicable to our world, but I think it won't happen for at least a couple years.)
Studio Network Solutions (SNS)
Think of a hard drive or a chassis of hard drives as simply a place to put data. No RAID or filesystem. Files are grouped together as "objects," which can be any group of complete files. These objects are then chopped up into pieces and spread across storage systems for redundancy. These storage systems can be anywhere.
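As a toy sketch of that chop-and-spread idea (the node names and chunk sizes here are made up for illustration, not any vendor's actual scheme), splitting an object into pieces and putting copies of each piece on different nodes might look like:

```python
def chunk_object(data: bytes, chunk_size: int):
    """Split an object's bytes into fixed-size pieces."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

def spread_chunks(chunks, nodes, copies=2):
    """Place each chunk on `copies` different nodes for redundancy."""
    layout = {}
    for i, _ in enumerate(chunks):
        layout[i] = [nodes[(i + r) % len(nodes)] for r in range(copies)]
    return layout

pieces = chunk_object(b"some video essence...", chunk_size=8)
layout = spread_chunks(pieces, nodes=["nodeA", "nodeB", "nodeC"], copies=2)
# Every piece now lives on two different nodes, so losing any one node loses no data.
```

Note there's no RAID controller anywhere in this picture: redundancy comes purely from keeping whole copies of each piece on independent machines.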
Because it can use any storage system, it can grow very large, but performance is not great because there is no RAID striping in place to speed things up. It's big, redundant, and generally slow, so it's a good fit for clouds.
Don't think that in itself, "object storage" has anything to do with performance. Its implementations - sure. It's entirely plausible for high end (and very high performance) boxes from Isilon and EMC to already have "objectified" storage to some extent.
Aren't Storage Spaces, ZFS's virtual devices, Drive Pool - all attempts at "objectifying" storage? Sure, some have atrocious performance penalties (and some don't). Yet performance penalties aren't inherent to object storage, only its implementations - which is something you could say about any storage.
It's like saying that virtualization is inherently bad for performance. Sure, it was initially. But now that the IT world is mostly virtualized, desktop OSes have very efficient built-in hypervisors, and Adobe and NVIDIA have jumped on the bandwagon, we know that virtualization is actually a boon to high performance because of its efficiency and scalability.
[Alex Gerulaitis] " It's entirely plausible for high end (and very high performance) boxes from Isilon and EMC to already have "objectified" storage to some extent."
Well, a 36-drive Isilon X400 is only capable of sustaining 750MB/sec. Many 16-drive RAID chassis from a number of manufacturers claim in excess of 1000MB/sec. No matter how many X400s you strap together, it will always fail to deliver more than 750MB/sec for a single request. Most RAID boxes will scale up for single requests.
[Neil Sadwelkar] "I'm trying to wrap my head around this but I can't seem to fathom how (or why) it will be better than what we currently use. Does anyone have any clue on this? So is this the 'next big thing' in storage? Or is it one of those things - like multimedia, 3D, holography etc."
Object storage is definitely the next big thing for storing a lot of data, just not necessarily your data for the way you use it.
Think of object storage as the next layer of abstraction for commodity storage. Back in the day, formatting a hard drive involved thinking about cylinders and sectors, manually aligning them, and flipping DIP switches on the hardware to set them up. Now SAS and SATA firmware handles that low-level noise and the OS only has to think about the filesystem. Object storage goes a level further, aggregating a large number of individual filesystem-formatted physical drives into a single large pool that is typically accessed natively via a RESTful (HTTP-based) interface. Amazon S3 is object storage, and as you may be familiar, it is not accessed the same way as your more ordinary single-filesystem NAS.
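To make that contrast concrete, here is a minimal in-memory sketch of the flat key/value model object stores expose (this is not the S3 API itself, just the shape of the idea): keys are opaque strings, and "folders" are nothing more than a naming convention filtered by prefix.

```python
class ObjectStore:
    """Toy flat-namespace object store: every object is key -> (bytes, metadata)."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data, metadata=None):
        self._objects[key] = (bytes(data), dict(metadata or {}))

    def get(self, key):
        return self._objects[key][0]

    def list_keys(self, prefix=""):
        # No real directory tree exists; listing just filters on key prefix.
        return sorted(k for k in self._objects if k.startswith(prefix))

store = ObjectStore()
store.put("projects/demo/clip01.mov", b"\x00\x01", {"codec": "ProRes"})
store.put("projects/demo/clip02.mov", b"\x02\x03")
store.put("archive/old.mov", b"\x04")
store.list_keys("projects/demo/")  # the two "demo" keys -- no directories involved
```

The slashes in the keys look like paths, but the store never treats them as one; that flat namespace is part of what lets these systems scale out so easily.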
Object storage seeks to create a storage environment that can scale both in size and geographically, and is usually measured in petabytes. You could build one in the tens of terabytes, but you would not realize the advantages compared to conventional RAID+filesystem storage. Object storage's sweet spot is at multi-petabyte scale.
To give a couple of high-level examples of how it works: Ceph and OpenStack's Swift both aggregate a collection of individual physical drives, usually cheap commodity SATA HDDs, residing in a bunch of Linux servers, each drive formatted with XFS or ext4 (or, eventually, Btrfs). Since a Ceph or Swift system is a cluster of many servers, each with many drives inside, the cluster distributes the data across typically three or more locations in the cluster for durability and fault tolerance. These clusters are typically assembled with 10Gb Ethernet as the network "glue" and can employ SSD cache at each node, and thus in aggregate can provide very good storage performance for the types of applications that demand that scale and style of storage: large web applications, Hadoop clusters, and other Big Data analytics.
The cool thing about how Ceph and Swift scale is that neither uses a database to keep track of the objects they are storing. Placement is computed algorithmically from the object's name and the cluster layout (Ceph's CRUSH algorithm, Swift's ring). This makes it much more efficient for them to scale into the billions of objects without requiring a complex and expensive database to keep track of it all.
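A drastically simplified sketch of that idea (real systems like Ceph's CRUSH or Swift's ring are far more sophisticated, weighting devices and spreading replicas across failure domains): hash the object's name to compute its locations, so any client that knows the node list finds the object without ever consulting a central database.

```python
import hashlib

def locate(object_name, nodes, copies=3):
    """Derive an object's storage nodes from its name alone -- no lookup table."""
    digest = int(hashlib.md5(object_name.encode()).hexdigest(), 16)
    start = digest % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(copies)]

nodes = ["osd-1", "osd-2", "osd-3", "osd-4", "osd-5"]
locate("bucket/shot042.exr", nodes)  # same answer on every client, every time
```

Because the mapping is a pure function of the name, there is nothing to query and nothing that grows as the object count grows -- which is exactly why this approach scales to billions of objects.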
Are they useful for production and post? Maybe as a warm archive, but at the scale where they start to look competitive performance-wise, they are probably much too large and expensive to make any financial sense compared to more conventional RAID-based NAS and SAN solutions. The other catch with using object storage for production and post is that you would need a NAS gateway to share these RESTful, HTTP-based storage systems with clients that want to connect via conventional NAS protocols like SMB, AFP, NFS, etc.
Best write-up on object storage I've seen - thanks Andy.