This post is from the CollabNet VersionOne blog and has not been updated since the original publish date.
Continuous Integration – Basic Overview and Best Practices
Guest blog post by Adam Leggett, Chief Architect at Mike CI – Hosted Continuous Integration.
If you are involved in software development in any capacity, you have probably heard the term ‘Continuous Integration’ (often shortened to ‘CI’). But what is it, and how is it done?
The basic premise of CI is pretty straightforward – your team needs a reliable, repeatable way to build the software under development, and that build needs to happen on a continual basis. Why so? Well, if it’s not already obvious, this gives the team frequent feedback about software quality early in the development cycle, helping to avoid downstream issues and defects. Also, if your team size is greater than one, integrating each developer’s work in a regular, repeatable fashion (perhaps hourly) will help significantly in highlighting any code integration issues.
Recipe for CI
So how does CI help to create this build and allow us to perform continuous quality checks? Let’s list the essential ingredients that we need:
1. Source Code Control – this provides the foundation for the entire process. In a typical software project, developers work to transform functional requirements into source code, in whatever programming language(s) the project is using. Once their work is at an appropriate level of completeness, they check in or commit it to the source code (a.k.a. version) control system. Arguably, the most popular version control system in use with CI is Subversion, and you will find that it is natively supported by almost all tools that occupy the CI space. From the perspective of the version control system, CI merely behaves like an additional ‘client’, usually with read-only access to the project repository. When configuring your CI workflow, you can either check out a fresh working copy of the source code before each build is run, or perform an incremental update of an existing working copy.
2. Build Tool – if the source code needs to be compiled (e.g. Java or C++) then we will need tooling to support that. Modern Integrated Development Environments (IDEs), such as Eclipse or Visual Studio, are able to perform this task as developers save source code files. But if we want to build the software independently of an IDE in an automated fashion, say in a server environment, we need an additional tool to do this. Examples of this type of tool are Ant, Maven, Rake and Make. These tools can also package a binary output from the build. For example, with Java projects this might be a JAR or WAR file – the deployable unit that represents the application being developed.
3. Test Tools – as part of the build process, in addition to compilation and the creation of binary outputs, we should also verify that (at minimum) the unit tests pass. For example, in Java these are often written using the JUnit automated unit testing framework. The tools in (2) often natively support the running of such tests, so they should always be executed during a build. In addition to unit testing, there are numerous other quality checks we can perform and status reports CI can produce. For example, you can wire in tooling that performs static analysis on the source code, to ensure that it meets your team or organisational coding standards.
4. Schedule or Trigger – we might want to create our build according to a schedule (e.g. ‘every afternoon’) or when there is a change in the state of the project source code. In the latter case we can set up a simple rule that triggers a build whenever a developer changes the state of the source code by committing their changes, as outlined in (1). This has the effect of ensuring that your team’s work is continuously integrated to produce a stable build, and, as you may have guessed, is where this practice gets its name from.
5. Notifications – the team needs to know when a build fails, so it can respond and fix the issue. There are lots of ways to notify a team these days – instant messaging, Twitter etc., but the most common by far is still email.
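To make the trigger rule in (4) a little more concrete, here is a minimal sketch in Python of how a build might be deemed ‘due’. It is only an illustration: the names TriggerState and should_build, and the revision strings such as "r42", are hypothetical and not taken from any particular CI tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TriggerState:
    """What we remember about the most recent build (hypothetical structure)."""
    last_built_revision: Optional[str] = None  # revision id of the last build
    last_build_time: float = 0.0               # seconds since the epoch


def should_build(state: TriggerState,
                 current_revision: str,
                 now: float,
                 interval_seconds: Optional[float] = None) -> Optional[str]:
    """Return the reason a build is due ("commit" or "schedule"), or None."""
    # State-change trigger: the repository has moved on since the last build.
    if current_revision != state.last_built_revision:
        return "commit"
    # Schedule trigger: enough time has passed, even with no new commits.
    if interval_seconds is not None and now - state.last_build_time >= interval_seconds:
        return "schedule"
    return None
```

In practice the current revision would come from polling the version control system (or from a commit hook), and the CI server would update the stored state after each build.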
Continuous Integration Server
The tool that wires these five elements together is a CI Server. It interacts with the source control system to obtain the latest revision of the code, launches the build tool (which also runs the unit tests) and notifies us of any failures. And it does this according to a schedule or a trigger based on a change in the state of the source code. A CI server often also provides a web-based interface that allows a team to review the status, metrics and data associated with each build.
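As a rough illustration of that wiring, the sketch below runs a list of build steps in order and sends a notification on the first failure. It is a simplification under stated assumptions rather than how any real CI server is implemented: run_build and the notify callback are hypothetical names, and each step is just an external command.

```python
import subprocess
from typing import Callable, List, Sequence, Tuple

Step = Tuple[str, Sequence[str]]  # (human-readable name, command to execute)


def run_build(revision: str,
              steps: List[Step],
              notify: Callable[[str], None]) -> bool:
    """Run each step in order; on the first failure, notify and stop."""
    for name, command in steps:
        completed = subprocess.run(command, capture_output=True, text=True)
        if completed.returncode != 0:
            # A non-zero exit code is treated as a broken build.
            notify(f"Build of {revision} failed at step '{name}':\n"
                   f"{completed.stderr}")
            return False
    return True
```

For a Subversion and Ant project, the steps might look like [("update", ["svn", "update"]), ("build and test", ["ant", "test"])], where the Ant target name "test" is an assumption about the project’s build file, and notify could be a function that sends an email to the team.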
The choice of available tools in this space is pretty overwhelming, and for the beginner somewhat confusing. Some are open source, some proprietary. Unfortunately, I don’t have space to go into all the available options here. However, there is a handy feature comparison matrix available here. There are also some who offer CI as a hosted web-based service, such as ourselves (Mike CI).
One thing to bear in mind is that if you choose an on-premises installation of a CI server, you will need to ensure that sufficient computing resources are allocated to it. Unlike some other elements of your development infrastructure, your ‘build box’ will need a reasonable amount of headroom with respect to RAM, CPU, disk and network I/O. This is why I firmly believe that a hosted or cloud computing-based solution offers immediate tangible benefits for those who wish to get started with Continuous Integration without all the hassle.