Research into open source software quality

My first decade in software development was spent in software testing and test management, and despite moving away from that discipline it remains a special interest. As an open source practitioner I maintain a keen interest in where the worlds of software quality and open source meet. It goes without saying that quality is managed VERY differently in open source projects than it is in traditional closed-source projects. How differently, and how project attributes affect quality, has been the focus of an increasing amount of academic research, especially now that the open source phenomenon is mainstream.

Open source research

Open source projects lend themselves well to academic research due to the openness and transparency that comes with publicly hosted source code, messaging channels and bug tracking tools. In 2007 I undertook an extensive review of the existing body of research around open source software quality. It was the first review of its kind at that point and was published in the journal IEEE Software. A limited but growing body of research papers on how quality in OSS was achieved highlighted the following conclusions:

High-quality OSS relies on having a large, sustainable community; this results in rapid code development, effective debugging, and new features.
Code modularity, good documentation, tutorials, development tools, and a reward and recognition culture facilitate the creation of a sustainable community.
The system and the community must co-evolve to achieve sustainable development and high-quality software.
High modularity and many bug finders and fixers result in low defect density.
Rapid release cycles keep code reviewers and developers interested and motivated, quickly resulting in new features and high quality.
Code review by people outside the project team leads to independent, objective reviewing.
The project team’s environment and culture are as important as system design when creating high-quality software. This can be handled in several ways, but success depends on a highly organized approach, with sophisticated tool support for collaboration, debugging, and code submission.

Recent research into open source quality

A lot has happened in the intervening five years, including two more children and two job changes, so I took my eye off the open source quality ball for a while. I'm excited to now be reviewing the research published in this area since 2007.

In particular, a 2009 paper by Conley and Sproull caught my eye recently: Easier Said than Done: An Empirical Investigation of Software Design and Quality in Open Source Software Development.

The paper explores the conventional wisdom that software modularity improves quality. The authors investigated over 200 releases from 46 open source projects and surprisingly found that while high modularity reduced software complexity, it actually increased the numbers of certain types of bugs.

The fact that modularity reduces software complexity chimes with earlier research in this area, which concluded that high modularity was a factor in building a sustainable open source community (see my 2007 paper for references). This stands to reason. Take Moodle for example (Modular Object Oriented Dynamic Learning Environment), which has a highly modular architecture, making it easy for developers to work on a single part of the system without impacting the core system or other modules. As a result, a very large development community has become established.

Prior research into the relationship between code modularity and quality was based on anecdotal evidence. Conley and Sproull went looking for empirical evidence of a link, but found it lacking. They suggested that one reason was the difficulty of measuring the degree of modularity. Prior work had been based on examining function calls among single classes in source code, but this measurement does not account for the fact that modules are often defined as ‘packages’ of files and classes, and that communication also occurs via interfaces, not just function calls. They therefore proposed a new measurement that takes these factors into account, and used that as the basis for their comparisons. Download the paper if you’d like to see the measurement; it’s far too complex to go into here!

Package instability

Among a number of complex and technical metrics was one that really caught my eye: package instability. A package with a value of zero indicates complete stability whereby changes made to other packages will never affect this one; however a package with a value of 1 indicates complete instability whereby changes made to other packages will likely directly affect this one.
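To make the idea concrete, here is a rough sketch in Python. I'm assuming the metric follows Robert C. Martin's well-known definition of instability, I = Ce / (Ca + Ce), where Ce is the number of outgoing dependencies a package has and Ca is the number of incoming ones; the paper's exact formulation may differ. For simplicity the sketch counts dependencies at package level, and the dependency map and package names are invented purely for illustration.

```python
from collections import defaultdict


def instability(dependencies):
    """Return {package: I}, assuming Martin's I = Ce / (Ca + Ce).

    `dependencies` maps each package to the packages it depends on.
    Ce (efferent) = packages this one depends on.
    Ca (afferent) = packages that depend on this one.
    """
    # Collect every package mentioned, as a source or as a target.
    packages = set(dependencies)
    for targets in dependencies.values():
        packages.update(targets)

    # Outgoing dependencies, ignoring duplicates and self-references.
    efferent = {
        p: len(set(dependencies.get(p, ())) - {p}) for p in packages
    }

    # Incoming dependencies, counted once per depending package.
    afferent = defaultdict(int)
    for source, targets in dependencies.items():
        for target in set(targets) - {source}:
            afferent[target] += 1

    scores = {}
    for p in packages:
        total = efferent[p] + afferent[p]
        # A package with no dependencies either way is treated as stable.
        scores[p] = efferent[p] / total if total else 0.0
    return scores


# Hypothetical dependency map, not taken from any real project.
deps = {
    "core": [],
    "auth": ["core"],
    "forum": ["core", "auth"],
    "reports": ["core", "auth", "forum"],
}

scores = instability(deps)

# Rank packages from most to least unstable.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:8s} {score:.2f}")
```

In this made-up example "reports" scores 1.0 because it depends on everything and nothing depends on it, while "core" scores 0.0 because it depends on nothing. A ranked list like this is the raw material a heat map would be drawn from.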

This got me thinking about what a great tool it would be that could show package instability in the form of a system heat map, for example. If I were putting time estimates together for some custom work on an open source system that involved working on a module with a known high degree of instability, I could take the extra testing and bug fixing effort into account as part of my estimate rather than be hit by surprises once work gets underway. Another use would be for software testers, who could see which packages have the highest instability and the closest links to other packages, and thus target their testing efforts at the parts of the user interface which use those packages.

Package instability seems to me to be an incredibly useful metric and one I’d like to explore further in future, and see some tool support for too.

Defect density

The authors also had some neat ideas about bug tracking. They looked at the numbers of bugs reported in the project bug trackers, and introduced a method I’ve not seen before that allowed direct comparisons of quality despite differences in the size of developer and user communities. Small projects won’t have as many bugs reported, for example, but that obviously doesn’t mean they’re better quality! To correct for this, they weighted bug tracker metrics by the number of downloads for each release.
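As a quick illustration of the idea, here is a minimal Python sketch. I'm assuming the simplest possible weighting, bugs reported per thousand downloads of a release; the paper's precise weighting may be different. The project names and figures are invented for the example and are not real data.

```python
def bugs_per_thousand_downloads(bug_count, downloads):
    """Normalise a release's reported bug count by its download count."""
    if downloads <= 0:
        raise ValueError("downloads must be positive")
    return 1000 * bug_count / downloads


# Hypothetical releases: (name, bugs reported, downloads). Not real data.
releases = [
    ("small-project 1.0", 40, 2_000),
    ("large-project 3.2", 900, 150_000),
]

rates = {
    name: bugs_per_thousand_downloads(bugs, downloads)
    for name, bugs, downloads in releases
}

for name, rate in rates.items():
    print(f"{name}: {rate:.1f} bugs per 1,000 downloads")
```

With these invented numbers the small project has far fewer bugs in absolute terms (40 against 900), yet once downloads are taken into account it comes out worse, at 20 bugs per thousand downloads against 6.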

Such a simple and easily executed idea and so obvious! It had me heading straight for the Moodle Stats page and onto Sourceforge to get release download data and into the Moodle JIRA Tracker to get bug metrics per release. Alas, this data was not readily available. The data is all there but it would take some custom JIRA queries and Sourceforge API hacking to expose it, neither of which I had time to do. One for a rainy day I think, unless anybody wants to have a crack at extracting the data themselves!

Such a metric would allow you to track product quality across many releases of an open source project or even at module level depending on how the bug tracker collects data, which would make for a fascinating study.

Modularity equals higher numbers of static bugs

The other big finding by Conley and Sproull was that as the degree of modularity increases, the number of static bugs increases too. These are bugs uncovered by source code analysis tools, not by end users. There was high variance across projects though, and the authors suggested that other project or release characteristics may be coming into play, such as project governance mechanisms and the skill levels of contributing programmers.

As earlier research has shown, a project team’s environment and culture are as important as system design when creating high-quality software. Open source project organisation and tool support for collaboration, debugging, and code submission are all contributing factors. I look forward to understanding those areas in more detail as I continue to review recent research via this blog.

Easier Said than Done: An Empirical Investigation of Software Design and Quality in Open Source Software Development
Conley, Caryn A., and Sproull, Lee
2009 42nd Hawaii International Conference on System Sciences (HICSS 2009)
http://flosshub.org/content/easier-said-done-empirical-investigation-software-design-and-quality-open-source-software-de
