Using Virtual Machines and Containers for Community Development

Open source, when viewed as a software development model, is all about the community. Code is, of course, critical to the success of a project, but it is the community that writes the code. Therefore a focus on community development is, in many ways, even more important than writing code. It is with this in mind that I recently presented a session on “Using Virtual Machines and Containers for Community Development” at ApacheCon, the official conference of the Apache Software Foundation (an organization that explicitly has a focus on “community over code”).
Open source communities consist of, broadly speaking, three overlapping types of people: users, contributors and committers.

  • Users are interested in trying out the code, testing it in their environment and possibly deploying it.
  • Contributors are usually users who provide bug reports, feature requests and patches.
  • Committers are usually contributors who are committed to the long term success of the project.

Different projects might have different names for each of these groups and some may sub-divide one or more groups, but for the purposes of this blog post we’ll keep it simple.

In a healthy community-led open source project the goal is to move as many users as possible into the contributor role. We also want to move contributors into the role of committer. Doing so enhances the sustainability of the project, since we need a constant flow of new people to replace those who move away from the project for whatever reason. The entry point for these new people is almost always as a potential user, since it is unusual for people to work on a project they are not using themselves, whether as part of a day job, as an educational exercise or for a personal project.

Community Development: From User to Committer

There are a number of things we can do to attract and retain users. For example, evangelism and marketing are important, as is providing first-level support within the project community. We need to ensure that the startup documentation is clear, and we need to make it possible for newcomers to get started with our project as quickly as possible. We also need to ensure that users are able to move from the try phase (high level examination of features and UI) to the test phase (configuring for a specific use case) with minimal overhead. Furthermore, by making the initial test environment portable and reproducible we make it easy for users to move from testing to deployment, at which stage they are well on their way to becoming a contributor.

This is where Virtual Machines can come into play. By providing the project as a virtual machine image, we let potential users fire up a test environment, on their own accounts, in very little time. They can then work with that machine in any way they desire, including adding private data, which is not to be encouraged in a shared test installation. As the testing phase progresses, some light prototyping of integration with other systems is possible. Ultimately, since this is the user's own machine, they can do anything they want with it.

Should the user want to move into the deploy phase then they can do so by simply migrating their test virtual machine, or they can start again from a base image. Either way the overhead in moving to deployment is significantly reduced. Furthermore, by including scripts and tools to update the software on the virtual machine we can make it easier for users to stay current with the latest developments in the software.

An additional advantage of providing tooling for staying current with the latest development version of the project code is that it also makes it as easy as possible for users to package any changes they make and submit them back to the parent project. For example, the virtual machine might include version controlled source code and project documentation with links to issue trackers, along with convenient scripts to create and submit patch files.
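
As a rough illustration of the kind of convenience script mentioned above, the Python sketch below writes a contributor's local changes to a patch file. It assumes the source code on the virtual machine is tracked with Git and that `origin/main` is the upstream branch; the function names, the default branch and the output file name are all hypothetical. Submitting the patch is left out because that step depends on the project's issue tracker.

```python
import subprocess
from pathlib import Path


def build_patch_command(base_ref="origin/main"):
    """Return the git command that diffs the working tree against base_ref.

    Assumption: the project uses Git; "origin/main" is a placeholder branch.
    """
    return ["git", "diff", base_ref]


def create_patch(repo_dir, out_file="my-change.patch", base_ref="origin/main"):
    """Write the local changes in repo_dir to out_file and return its path."""
    result = subprocess.run(
        build_patch_command(base_ref),
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if git reports a failure
    )
    patch_path = Path(repo_dir) / out_file
    patch_path.write_text(result.stdout)
    return patch_path
```

A contributor could then call something like `create_patch("/path/to/project")` on their virtual machine and attach the resulting file to an issue.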

Finally, a virtual machine can also provide configurations for building and testing the application. This puts an end to cries of “works on my machine”. By having all contributors working on the same configuration we can simplify communications between team members. However, this does increase the need to test more completely on other configurations. Thankfully, the majority of configuration scripts can be reused in the testing phases too. This is especially true if a configuration management tool, such as Chef, is used.

What about Containers?

In the above text, I have focused on Virtual Machines, but today it seems Containers are all the rage. It is reasonable for people to ask “What about containers?” Indeed most of the advantages of Virtual Machine Images are realized with containers too.

Containers have their advantages and disadvantages over virtual machines. This post is not the place to discuss these. Instead, I will say that if you feel your application should be deployed as a container rather than a virtual machine then, by all means, provide containers rather than VMs.

How to Create Virtual Machines and Containers

There are a number of technologies you could use to describe your VM or container. Docker is currently the most popular way of defining a container, while Vagrant is perhaps the most popular tool for managing virtual machine images. Both of these tools provide a way to manage the configuration of the VM or container, and both will also work alongside configuration management tools such as Chef, Salt or Puppet.

With each of these tools you can use the same configuration files to build and deploy virtual machines or containers on your local machine as well as to deploy them to cloud platforms. For example, Vagrant supports Hyper-V out of the box while the Vagrant-Azure plugin allows Vagrant VMs to be deployed to Azure. If you prefer Docker then you are equally well supported.
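
To make this a little more concrete, here is a small Python sketch of a wrapper that starts the same Vagrant configuration on different providers. It assumes Vagrant is installed, that a Vagrantfile exists in the project directory, and that the chosen provider is available (the `azure` provider needs the Vagrant-Azure plugin and its settings in the Vagrantfile). The function names are hypothetical.

```python
import subprocess


def vagrant_up_command(provider):
    """Build the `vagrant up` command for the named provider."""
    return ["vagrant", "up", f"--provider={provider}"]


def start_vm(project_dir, provider="virtualbox"):
    """Run `vagrant up` in project_dir, which must contain a Vagrantfile."""
    # check=True raises CalledProcessError if Vagrant exits with an error.
    subprocess.run(vagrant_up_command(provider), cwd=project_dir, check=True)
```

With this in place, `start_vm(".", "hyperv")` would bring the machine up locally on Hyper-V, while `start_vm(".", "azure")` would target Azure from the same Vagrantfile.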

Making your VM Available

Each of these tools focuses on enabling configuration in code. This means that it is sufficient to simply share the code, via your project's source code repository, and invite users to install the required tooling on their machine. However, we can make it even easier for those initial users by providing a virtual machine image in VM Depot.

VM Depot is a community-managed repository of virtual machine images for Microsoft Azure. Storage of these images in VM Depot is free of charge, and users can quickly deploy them to their Azure subscription (they can get a free trial too) using VM Depot Easy Deploy, our cross-platform CLI tools or the Microsoft Azure Management Portal.