OpenStack Project Technical Lead Interview Series #8: Michael Basnight, OpenStack Trove Project
This post is the 8th of a continuing series of interviews with OpenStack Project Technical Leads on our Mirantis blog. Our goal is to educate the broader tech community and help people understand how they can contribute to and benefit from OpenStack. Naturally, these are the opinions of the interviewee, not of Mirantis.
This interview is with Michael Basnight, OpenStack Trove Project Technical Lead.
Mirantis: Can you please introduce yourself?
Michael Basnight: I'm a Principal Engineer at Rackspace, working on OpenStack 24/7. I’ve been at Rackspace for over 7 years now. I’ve worked on data analytics, cloud website hosting, large-scale provisioning systems and now Trove. I am able to focus my energy 100% on the open source codebase now, thanks to Rackspace. I’m based in Berkeley, CA, with an office that is either on my front porch with a cup of coffee, or next to my wife and 10-month-old son, Alexander.
Q: What is your history with OpenStack? Why do you engage?
A: When we were creating Trove within Rackspace, the other developers and I weren't thinking about OpenStack. We started finagling with a full Java stack, OSGi, ZooKeeper, and the like… standard Java stuff. During that time, Rackspace decided to go 'all in' with OpenStack. Seeing that OpenStack was the way of our future, I decided (being the dev lead and manager at the time) to scrap the POC we had started and jump in using Nova as our provisioning engine. We had some hiccups along the way, because OpenStack was very young then. But I've been able to submit changes to most every OpenStack project while they were maturing, and now we have a solid project built on top of OpenStack, one that can be bolted on to any existing OpenStack install.
Q: What are your responsibilities as the Trove Project Technical Lead?
A: Heh, that's a good question. It's almost a recipe. I start with a base of product ownership and management, in order to keep the vision of the project moving in such a way that it works the best for the community. I sprinkle in a dash of project management, managing blueprints and keeping track of who's working on what. I stir, and slowly add some people management, to make sure that people are delivering what they have promised. Add a generous helping of code review with a pinch of -1, and if I’m LUCKY, when it's all done, I ice it with a bit of development on my own.
Q: Can you explain Trove’s role within OpenStack? Why does Trove matter?
A: Trove matters because storing data matters. You cannot name a single project that doesn't need data storage or caching. Trove is a one-stop shop for data storage, whether it's Redis to store some highly available real-time data, MySQL to house all your pet store information, or MongoDB to store your Marconi messages. The vision of Trove is to provision these, get the clusters configured and online, and keep them running. Trove also offers backups and restores for the services it implements, and things like automatic backups / point-in-time recovery are in the works.
Q: What is genuinely unique and disruptive about Trove?
A: Trove is full of data storage system experts. We have engaged key persons in the community for multiple projects and service implementations. We have developers who have been administering MySQL since version 3.23. We are onboarding people who have written cluster implementations in MySQL. We are consulting Redis experts when implementing it. We are both developer and data experts. If we don’t know how to do it, we know who does.
Q: Tell us about the Trove community--who is contributing?
A: We started with some excellent developers from Rackspace, and then HP Cloud Services caught the bug. They really helped get the momentum rolling. Since we've gone into incubation, Mirantis has started developing, UnitedStack is participating, and some developers from New Zealand have joined. We also have some people who are interested in the finer aspects of clustering, and I've had some excellent conversations with people who are very interested in Galera Cluster and Tungsten Replicator. We've gone from 2 time zones to worldwide in a matter of months!
Q: What has the Trove community accomplished so far?
A: We have a service that's been running in production at Rackspace for a while now. We have database provisioning down to a science! We have the nuts and bolts of a MySQL service running. Beyond that, we've defined ourselves as being the RDB and NRDB provisioning engine for the OpenStack community. We've done a good bit of relationship building with companies and evangelizing the product. I’m excited to see what Trove is in 6 months, for the Icehouse release. I expect to see clustering, configuration editing, and a Redis implementation in Icehouse.
Q: Which capabilities will Trove provide in the OpenStack Havana release?
A: Trove knows MySQL. It does instance provisioning, security groups, and user/database creation and modification. It allows resizes, and we make sure that the config and service are sane upon resizing. It has backups and restores. Basically all the things you would want from a single instance managed MySQL service.
We have spent a good deal of time making Trove more configurable this iteration, and have been focused on doing what we need to graduate to integration. We have cleaned up a good deal of technical debt that had crept in, including adding RHEL integration and adding a templated config file for creation. Last but not least we have added Heat integration, which will be optional in Havana.
Q: What do you wish people knew about this project?
A: I wish people knew we were more than just MySQL as a service. In Icehouse we plan on full clustering support for Cassandra and MongoDB, as well as MySQL clustering. This includes master/slave and Galera/Tungsten multi-master replication and clustering.
Q: Are there any common misconceptions about Trove?
A: Again, I’ll harp on the MySQL as a service, or SQL as a service. We are more than that. We want to provide provisioning and maintenance for SQL and NoSQL data stores. We don't want to be labeled as a "one trick pony".
Q: You mentioned Trove’s vision - can you tell us more about that?
A: Clusters. Clusters. Clusters. I see Trove's sweet spot as being a datastore provisioning / clustering API. People don't want to run their own clusters. They just want their product to work. Trove wants to provide the ease of installation, configuration, and maintenance for these clusters.
Q: Which use cases are you targeting?
A: I bet you can guess my answer… people who want clusters! Data redundancy is the way of the present, not just the future. Clustering is the same. Customers want their data, and they want a guarantee it'll be around when they need it. In steps Trove's clustering support.
Q: What are (will be) the necessary prerequisites to get Trove up and running properly?
A: The biggest change in the next 6 months is Heat. We are relying on Heat to support cluster installation. This may mean integration with the Heat team, and they are a bunch of really good people, so that will be easy! Other than that, the standard OpenStack requirements are all you will need. Once we are better integrated into DevStack, which will be very soon, it will be as simple as spinning up DevStack and using Trove.
Q: Whom would you like to see contributing to Trove?
A: There are two types of people I would love to see contribute. The first is people with a passion for data: someone who knows Galera Cluster, or Cassandra, or some other data store technology. The second, and even more important, is people who want to operate the system. Operators have the best view of the overall system, because they need to know all the running parts. I’ve had a ton of intelligent conversations with our engineers and operators, and made Trove a better product because of it.
Q: Which functionalities need to be enhanced and tested?
A: I think that Heat support, while new, needs enhancing. I can see Trove doing a lot with Heat in the future. I’d like to see more people testing our RHEL/CentOS support. The Mirantis team has been a huge help in getting RHEL/CentOS fully working, and I'm grateful for that. I would love to see other companies take that and run with it, and make sure we stay honest with our RHEL/CentOS support.
Q: How specifically can people get started?
A: Funny thing. We are in the middle of making the Trove ramp-up easier. We have some nonstandard ways of doing things, and one of my jobs as PTL is to standardize. We just finished work to get DevStack support. It's going through the gates right now! This, along with a few other items landing soon in DevStack, will make Trove usable without any extra setup. But currently, you need to pull down diskimage-builder, build an image, and upload that to Glance. After that it's as simple as using our Trove client to provision an instance. Additionally, we have some helper scripts in the Trove-integration repository, and I will be updating them to work with the new DevStack merge in the next few days. In a few weeks, when the dust has settled, it will be ridiculously simple to get started with Trove. And if anyone has questions, we have a legion of smart humans, and a robot or two, in #openstack-trove.
Q: Thank you very much, Michael.
A: You’re welcome!