Questions and Answers: Storage as a Service with OpenStack Cloud
For those of you who missed our recent webcast on 'Storage as a Service with OpenStack Cloud' a week or two back, I've listed below the questions I was asked there, for your reference. If you want to download the slides or view the recorded webcast, you can sign up for that here.
- “You mentioned a VSA. What is a Virtual Storage Appliance, and can you give some examples?”
A: VSA is an acronym for Virtual Storage Appliance. It is a way to present some of the functionality normally provided by a traditional RAID controller or a NAS controller in a virtualized context, such as a guest running on top of a hypervisor. Some notable examples include Nexenta’s virtual ZFS appliance, Zadara’s VSA, and the NetApp virtual NAS.
- “We use mainly Symmetrix arrays, pretty high-end storage. What parameters should we consider in evaluating if or how to move some of our storage to OpenStack and/or other open source storage?”
A: Symmetrix is a great, very robust platform for high-performance storage. And honestly, a generic Linux box loaded with 7200 RPM drives is not going to compete with a wall full of flash drives, or 15K RPM drives, in your Fibre Channel array. So if you need very, very low latency, very, very high performance storage, go ahead and keep using the Symmetrix array.
But typically people buy Symmetrix for a few well-defined use cases; then they fill it up with other storage that is a lot less performance sensitive. My recommendation is to take a good look at your tier 2-4 storage and consider this: does it have a critical latency requirement? Does it have any other significant SLAs? If it doesn’t, consider moving that storage to a far less costly and easier-to-manage storage tier built on OpenStack.
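To make the "cheaper tier" idea concrete, here is a rough sketch using python-cinderclient. The backend name 'lvm_tier2', the credentials, and the endpoint are placeholders of my own, and it assumes cinder.conf already defines a multi-backend setup with a matching volume_backend_name.

```python
# Hypothetical sketch: expose a cheaper backend as a "tier2" volume type
# using python-cinderclient. Backend name, credentials, and endpoint are
# placeholders, not values from a real deployment.
from cinderclient.v1 import client

cinder = client.Client('admin', 'secret', 'admin',
                       'http://keystone.example.com:5000/v2.0')

# Create a volume type and pin it to a backend defined in cinder.conf
# (assumes a multi-backend setup with volume_backend_name=lvm_tier2).
vtype = cinder.volume_types.create('tier2')
vtype.set_keys({'volume_backend_name': 'lvm_tier2'})

# Non-latency-critical volumes can now be placed on the cheaper tier.
vol = cinder.volumes.create(100, display_name='archive-data',
                            volume_type='tier2')
print(vol.id, vol.status)
```

The point is simply that tiering can be exposed to users as volume types, rather than as separate storage silos they have to know about.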
- “What are the pros/cons of adding NAS services/APIs to Cinder? Is this going to happen for Grizzly?”
A: Mirantis is in the business of deploying a lot of OpenStack environments, and we see a great deal of demand for this type of feature. So we are completely in favor of extending Cinder to support both block- and NAS-based storage. Having a single endpoint managing both will make it possible for storage platforms such as NetApp and Nexenta to offer both storage types from a single array, providing customers significant cost savings.
Whether this will happen for Grizzly is an interesting question. We certainly support this effort going into Grizzly, but there is still a bit of community debate over whether the right architectural direction is to extend Cinder or to provide a completely separate “NAS as a Service”.
- “The discussion on Gluster/Lustre/Ceph/Swift talked about the differences in the "logical" implementations. What about "physical"? How would one characterize the different physical platforms required for each?”
A: Actually, at the base level, you can implement all of them on the same cheapest-possible storage tier loaded full of drives. That said, there are in fact some optimizations available. For example, Swift object servers are typically built out with 7200 RPM drives, while Ceph supports a separate journaling tier and can benefit from SSD-based acceleration in its object storage tier.
Gluster is a shared-nothing file system, and it requires significantly more network bandwidth and CPU in order to deliver good performance. Lustre has the widest range of possible implementations of all of these: it can make good use of SSDs, and in fact it can use whatever servers you throw at it. The faster the storage you put behind Lustre, the faster the performance becomes; it has no performance bottlenecks, at least in the read/write I/O path.
- “Is it better to think of Boot Volumes from Cinder as Ephemeral storage, or just as more block storage? In other words, should we think of Cinder as supporting both ephemeral and block storage, or just block storage?”
A: The introduction of ‘boot from volume’ effectively brought an ephemeral-style boot cycle to Cinder, so right now Cinder is an effective choice for both; it really depends on your application requirements. For example, if you care about the long-term life cycle of the data, you are better off presenting the storage as a persistent ‘boot from volume’ exported from Cinder.
However, if you want to emulate the life cycle of traditional ephemeral storage, because you have lots of customers creating lots of VMs and managing all of that storage becomes difficult, you can simply turn on the flag that treats boot-from-volume disks the same way as ephemeral storage, throwing them away as the VMs expire.
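In practice this boils down to the delete-on-termination field of the block device mapping. Here is a rough sketch using python-novaclient; the credentials, endpoint, volume ID, and flavor name are placeholders, and the exact boot-from-volume syntax varies somewhat between releases.

```python
# Hypothetical sketch: boot an instance from a Cinder volume and mark the
# volume for deletion when the instance is terminated, so it behaves like
# ephemeral storage. IDs, credentials, and endpoints are placeholders.
from novaclient.v1_1 import client

nova = client.Client('admin', 'secret', 'admin',
                     'http://keystone.example.com:5000/v2.0')

volume_id = 'VOLUME-UUID'  # a bootable Cinder volume

# block_device_mapping value format: <volume-id>:<type>:<size>:<delete-on-terminate>
# The trailing "1" asks Nova to delete the volume when the VM is destroyed;
# use "0" (or leave it empty) to keep the volume around as persistent storage.
server = nova.servers.create(
    name='throwaway-vm',
    image=None,
    flavor=nova.flavors.find(name='m1.small'),
    block_device_mapping={'vda': '%s:::1' % volume_id},
)
print(server.id)
```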
- “Did I understand correctly that you mentioned geo-replication as a feature of Swift? It's only on the roadmap, as far as I know?”
A: Well, that’s not quite true. Container-based replication has been around for quite a while. You are correct that geo-replication of the entire Swift ring is not in default Swift, although that’s not to say it hasn’t been done by people like us. But by default, Swift comes with the ability to replicate a specific, designated container to remote storage. The replication is done asynchronously and is not in the primary data path. Obviously, the SLA for geo-replication is significantly looser than the typical replication SLA on a block storage array, but it does work and has been tested by us.
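To make the container-level replication more concrete, here is a rough sketch using python-swiftclient and Swift's container-sync headers. The URLs, credentials, and sync key are placeholders, and container sync has to be enabled by the operators of both clusters.

```python
# Hypothetical sketch: designate a container for asynchronous replication to
# a container in a remote Swift cluster using Swift's container-sync feature.
# URLs, credentials, and the shared sync key are placeholders.
import swiftclient

conn = swiftclient.client.Connection(
    authurl='http://keystone.example.com:5000/v2.0',
    user='tenant:user', key='secret', auth_version='2.0')

conn.post_container('documents', headers={
    # Destination container in the remote cluster
    'X-Container-Sync-To': 'https://swift-remote.example.com/v1/AUTH_tenant/documents',
    # Shared secret that both containers must agree on
    'X-Container-Sync-Key': 'replication-secret',
})
```

Because the sync happens out of band, objects show up in the remote container some time after they are written locally, which is exactly the looser SLA described above.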
- “Is it possible to have user quotas in Swift?”
A: I believe it is possible. I am more of a block storage guy, but I believe that Swift supports quotas, and it is going to be integrated with Ceilometer for billing and metering relatively soon.
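For reference, Swift's quota support (where available) is expressed as account- and container-level limits set through metadata headers rather than true per-user quotas. Here is a rough sketch of a container quota using python-swiftclient; it assumes the container_quotas middleware is enabled in the proxy pipeline, and the names and credentials are placeholders.

```python
# Hypothetical sketch: cap a container at 10 GB and 100,000 objects using
# the headers understood by Swift's container_quotas middleware (which must
# be enabled in the proxy pipeline). Credentials and names are placeholders.
import swiftclient

conn = swiftclient.client.Connection(
    authurl='http://keystone.example.com:5000/v2.0',
    user='tenant:user', key='secret', auth_version='2.0')

conn.post_container('user-data', headers={
    'X-Container-Meta-Quota-Bytes': str(10 * 1024 ** 3),  # 10 GB
    'X-Container-Meta-Quota-Count': '100000',             # max object count
})
```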
- “What factors should I consider when choosing between Swift and Ceph when deploying object storage? What is your sense of community adoption of Ceph and Swift in large production deployments, and is Ceph adoption growing at a faster pace?”
A: Questions about Ceph adoption are better left to Inktank, the company that supports Ceph. Our sense is that Ceph has significant and growing adoption, but we are not currently tracking its adoption relative to Swift. Swift is used in very, very large production environments; some of the more notable ones include Rackspace, which has over six petabytes of storage, and folks like WebEx. At the recent Grizzly OpenStack Summit, WebEx, Mirantis, and Cisco presented how we are using Swift to store all of the user documents and video files in a very large, geographically distributed Swift cluster.
Now, Swift and Ceph are designed to solve somewhat different problems. Ceph is probably not the store I would choose right now if you wanted to deploy a multi-petabyte system; while DreamHost is deploying Ceph in that kind of environment, most typical Ceph deployments today are quite a bit smaller.
However, Ceph does offer significantly higher performance, and via its ‘CRUSH’ algorithm it can provide better placement controls and significantly higher fault tolerance. You can designate failure zones and nested failure zones, as well as very sophisticated placement rules that take power, switches, and other failure domains into account.
In Swift, you can only deal with zones, and zones are assumed to be geographically proximate, although that’s often not true.
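To illustrate what zones look like in practice, here is a minimal sketch that builds a small object ring with Swift's RingBuilder class. Normally you would drive this through the swift-ring-builder command instead; the IPs and device names are placeholders, and the exact set of required device keys varies a bit between Swift releases.

```python
# Hypothetical sketch: build a tiny 3-replica object ring with devices spread
# across three zones. Swift will then try to place each replica of a partition
# in a different zone. IPs, ports, and device names are placeholders.
from swift.common.ring import RingBuilder

builder = RingBuilder(part_power=10, replicas=3, min_part_hours=1)

for dev_id, (zone, ip) in enumerate([(1, '10.0.1.10'),
                                     (2, '10.0.2.10'),
                                     (3, '10.0.3.10')]):
    builder.add_dev({'id': dev_id, 'region': 1, 'zone': zone,
                     'ip': ip, 'port': 6000, 'device': 'sdb',
                     'weight': 100.0, 'meta': ''})

builder.rebalance()
builder.save('object.builder')
```

Compare this flat zone model with CRUSH, where the placement hierarchy and rules can be arbitrarily nested and customized.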
- “What are the solutions for providing shared ephemeral storage to the instances? It looks like Ceph is not production-ready for this use case, Lustre doesn't have an OpenStack connector, NFS doesn't scale, and the only remaining option would be Gluster in a distributed-replicated setup. Is this correct?”
A: At this point I would not use the distributed file system tier of Ceph in production. However, its block storage tier is relatively robust, and there is no reason not to use it. Lustre does not have an OpenStack connector.
Lustre is actually not an ideal platform for this particular use case. You could use Ceph, and you could actually use NFS; remember that NFS does not need to scale in order to be used with OpenStack. You can go ahead and set up many different NFS arrays and export storage from each one to a specific set of hosts; you don't have to have one huge setup of, say, ten full racks managed as a single cluster. Once it's set up and configured statically for Cinder, it does not need extensive management from you. So you can use Ceph, NFS, or Gluster in this kind of distributed, replicated setup.
So to recap, I would first try Ceph and Gluster. They have the advantage of coming with built-in network fault tolerance: if one storage frame were to fail, the other would transparently resume traffic. With NFS, that's not a default feature; while you can obviously buy it from the enterprise NFS vendors, it comes at a significantly higher cost.