High-throughput, feature-rich, searchable S3 object storage for hot/warm content, combined with the economics of tape
Many in the storage world still have a skewed view of object storage. This isn’t a surprise, given that three concepts need to be defined when talking about object storage, and storage vendors often define them differently. The three concepts are interface/protocol, object structure/system and storage media. In this blog, I am going to focus primarily on interface and protocol.
Object Storage Interfaces and Protocols
Before Amazon S3, object storage solutions each had their own proprietary RESTful interface enabling distributed access over HTTP. In fact, the now de facto standard Amazon S3 API is itself a proprietary interface that Amazon has opened up to the broader market, and it is now supported by all major object storage solutions. Most vendors still provide their own proprietary API for feature sets beyond what the S3 API allows. Adding to the complexity, the S3 interface is also available on devices that are not object storage, such as filers, NAS devices and now tape. If you would like a high-level overview of storage interfaces and protocols, check out this blog.
S3 on NAS and Filers
For many traditional workflows, S3 access on an existing NAS or filer may be an adequate “quick fix.” It really depends on scale. As capacity and the underlying NAS and filer infrastructure grow, so do the traditional NAS and filer issues of cost, recovery times, expansion and management. Additionally, as distributed workflows become the norm and file counts grow, it becomes more important to provide easy-to-manage distributed access to an increasing user base and to enable metadata customization and search.
I would also caution that just because your NAS performs well for traditional file-system-based workflows, it will not necessarily perform as well for RESTful use cases. When you run an S3 interface on top of a file system, there is continual translation between file system operations and S3 requests. Ultimately, depending on load, your system will reach its limit and you will need to expand. This is where both cost and management become an issue when you use NAS/filers with an S3 interface instead of implementing a pure object storage solution.
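To give a feel for that translation work, here is a minimal Python sketch of an S3-style layer sitting on a directory tree. It is a toy built on my own assumptions (the function names `put_object`, `list_objects` and `demo` are hypothetical) and does not describe how any particular NAS or filer implements its S3 interface.

```python
import os
import tempfile


def put_object(root, key, data):
    """Store bytes under an S3-style key by translating it to a file path."""
    path = os.path.join(root, *key.split("/"))
    # The slashes in a key have to be turned into real directories first.
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(data)


def list_objects(root, prefix=""):
    """Emulate an S3 prefix listing by walking the directory tree."""
    keys = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            key = rel.replace(os.sep, "/")  # translate the path back into a key
            if key.startswith(prefix):
                keys.append(key)
    return sorted(keys)


def demo():
    # Hypothetical keys, used only to exercise the two functions above.
    with tempfile.TemporaryDirectory() as root:
        put_object(root, "video/2020/clip.mp4", b"abc")
        put_object(root, "video/2021/clip.mp4", b"def")
        put_object(root, "audio/track.wav", b"ghi")
        return list_objects(root, prefix="video/")
```

In this sketch every request pays for a key-to-path conversion, and a prefix listing walks the whole tree before filtering, which is the kind of overhead that grows with file count.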
S3 on Tape
For many years, tape and object storage competed with each other. This was a direct result of object storage being relegated to the “cheap-and-deep” category of storage. It should come as no surprise that tape is, and for the foreseeable future will most likely remain, the cheapest way to archive content. That said, using pure object storage as a hot/warm tier in front of a cold, tape-enabled object storage tier has increased in popularity. This is how Amazon has architected the S3 service, offering disk-based object storage with millisecond response times alongside S3 Glacier Deep Archive, where retrieval takes hours.
The market has followed suit. Earlier this year, FUJIFILM Corporation announced their Object Archive software, which uses their open-source file format, OTFormat, that they developed specifically for object data. Caringo Swarm certification was just announced earlier this week, but we have been working with FUJIFILM for some time on integration.
Benefits of the Caringo + FUJIFILM Solution
The Caringo Swarm Intelligent Data Management Platform + FUJIFILM Object Archive solution brings numerous benefits to organizations by combining high-throughput, feature-rich, searchable object storage for hot/warm content with the economics of tape. This solution is ideal for organizations looking to provide a complete S3-like service internally or externally. Amazon has optimized availability, performance and cost by mixing the performance characteristics of HDD and tape under the S3 API, and the combination of Caringo Swarm and FUJIFILM Object Archive delivers the same result.
The Caringo + FUJIFILM solution is also ideal for highly secure deployments with petabytes of data, as the solution can enable RESTful-interface-based workflows while still benefiting from the cost-effectiveness of putting infrequently accessed data on a cold archive enabled by tape. Even better, integrating the storage products is straightforward, as demonstrated in this video.
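To make the hot/warm versus cold split concrete, here is a minimal Python sketch of an age-based tiering rule. The 90-day threshold, the `ObjectRecord` type and the function names are my own illustrative assumptions, not settings or APIs from Caringo Swarm, FUJIFILM Object Archive or Amazon S3.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ObjectRecord:
    key: str
    last_access: datetime


def choose_tier(record, now, cold_after=timedelta(days=90)):
    """Return "cold" for objects untouched for at least cold_after, else "hot"."""
    return "cold" if now - record.last_access >= cold_after else "hot"


def plan_migration(records, now, cold_after=timedelta(days=90)):
    """List the keys that this rule would move to the cold (tape) tier."""
    return [r.key for r in records if choose_tier(r, now, cold_after) == "cold"]
```

A real policy would likely weigh more than access age (object size, retention rules, retrieval cost), but the shape is the same: recently used content stays on disk and the rest moves to tape.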
S3 on a Pure Object Structure/System
When you get down to it, what defines object storage is the underlying structure and system that directs incoming data to storage media and infrastructure. All object storage systems use some form of key/value addressing in a flat address space, meaning data is “checked in” and an ID for that specific piece of data is returned. All you need is that ID to retrieve it. This is the “key” to object storage (pun intended) and what makes object storage ideal for enabling distributed workflows.
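The check-in idea can be sketched in a few lines of Python. This is an in-memory toy under my own assumptions (the class name `FlatObjectStore` and the use of random UUIDs as IDs are hypothetical), not the addressing scheme of any specific product.

```python
import uuid


class FlatObjectStore:
    """A toy flat address space: every object lives under one opaque ID."""

    def __init__(self):
        self._objects = {}  # ID -> bytes, with no directories or server names

    def put(self, data: bytes) -> str:
        """Check data in and hand back the ID needed to retrieve it."""
        object_id = uuid.uuid4().hex
        self._objects[object_id] = data
        return object_id

    def get(self, object_id: str) -> bytes:
        """Retrieve data using nothing but its ID."""
        return self._objects[object_id]
```

For example, `oid = store.put(b"hello")` followed by `store.get(oid)` returns the same bytes, and the caller never has to know where the data physically lives.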
You don’t need to keep track of server names, directory structure and file names. In addition, you have the ability to manage multiple tenants, customize metadata, search, and deliver content directly from the storage layer over HTTP (with range reads). As distributed workflows become the standard, storage devices optimized for distributed access will become more critical for all types of organizations.
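Range reads are worth a quick illustration, since they are what let a player or application fetch just part of a large object. Below is a minimal Python sketch of how a single HTTP `Range: bytes=...` value maps onto a slice of an object's bytes. The function name `read_range` is hypothetical and the sketch handles only one range per request.

```python
def read_range(data: bytes, range_header: str) -> bytes:
    """Return the bytes selected by a single HTTP byte-range value."""
    unit, _, spec = range_header.partition("=")
    spec = spec.strip()
    if unit.strip() != "bytes" or "," in spec or "-" not in spec:
        raise ValueError("only a single bytes range is supported in this sketch")
    start_text, _, end_text = spec.partition("-")
    if start_text == "":
        # Suffix form "bytes=-N" asks for the last N bytes of the object.
        length = int(end_text)
        return data[-length:] if length > 0 else b""
    start = int(start_text)
    # Open-ended form "bytes=N-" runs to the end of the object.
    end = int(end_text) if end_text else len(data) - 1
    # HTTP byte ranges are inclusive at both ends, hence end + 1.
    return data[start:end + 1]
```

A real server would also answer with status 206 and a Content-Range header, and with 416 when the range cannot be satisfied. This sketch only shows the byte selection.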
An Ending Note
With all that said, this should not be an “either-or” discussion. You should take a tiered approach employing different storage devices and solutions based upon your evolving requirements.
I invite you to register for episode 13 of our Brews & Bytes webinar, where I will host a panel to discuss the future of data archive. My guests will be Eric Dey, Caringo Director of Product; Rich Gadomski, Fujifilm Head of Tape Evangelism; and Nami Matsumoto, Fujifilm Director of Data Management Solutions. We will discuss:
- The functional role of the archive in today’s workflows
- Leveraging tape, HDD and object storage to optimize accessibility and cost
- Where archive technologies are headed, and a look at active archiving
As always, we are here to help. Let us know if you want to set up a consultation with one of our experts to discuss your needs.