Monday 16 July 2012


HERESY AND SOFTWARE DEVELOPMENT METRICS #1

This is a series of blog posts that respond to some important and popular blog or forum postings on the web that relate to software development metrics. In classifying postings as “important” and “popular” I am taking quite an objective view of things, using the almighty Google page rankings as my guide to the influence that these postings have. Above all I have looked for postings that come from authors who are seen as the great and good in the world of software development. This is why I have titled this series “Heresy and software development metrics”: these postings seem to have become the conventional wisdom on the topic, and I intend to inject a little heresy in the face of that conventional wisdom. It is my hope that I can unsettle some of this conventional thinking where, in my view, it has become damaging to the improvement of the great endeavour that is software development.


------------------------------

CANNOT MEASURE PRODUCTIVITY

First in this series of postings is Martin Fowler’s “Cannot Measure Productivity” blog posting. Martin Fowler is one of those commentators on software development who always seems to manage to strike a blow for common sense. He is an authoritative voice and an evangelist for ThoughtWorks across many of the domains in which they have products. I have found his writings on Patterns to be particularly extensive and required reading on the subject. It is with this knowledge of Martin’s normally excellent musings that I was dismayed at exactly how wrong he has got it when it comes to measuring productivity in software development. The posting is uncharacteristically defeatist on the topic, in places wide of the mark on points of fact, and dangerously misguided in opinion. I note that the blog entry was posted in 2003, so it is somewhat old by now, and perhaps Martin’s thinking has moved on since then. Regardless of its age, this posting remains a popular touchstone on the web when it comes to measuring software development productivity.

Let me summarise this blog post in a few sentences. Martin commences his posting with a categorical capitulation on the ability of software development teams to measure their productivity, clearly stating that in his view there is “no way of reasonably measuring productivity”. He then says that this is, more specifically, because we don’t have a reasonable measure of output. He explains why Lines of Code is a poor measure of output and why Function Points are similarly flawed. He finally posits that the business value delivered by software products is the only measure of productivity.

[Productivity]


THE DANGER INHERENT IN CAPITULATION

The single most dangerous assertion of this posting is the outright capitulation in the quest for workable measures of software development productivity. I say dangerous because any discipline that fails to robustly account for itself in terms of productivity is destined to remain at the fringes of any corporate or organizational power structure. As long as software development teams make excuses about why they can’t objectively and concretely demonstrate their productivity, software development will fail to achieve the strategic prominence it deserves in the corporate landscape.

Martin makes a rather feeble defence that “a company's lawyers, it's marketing department, an educational institution” all make no reasonable account of their productivity. On a point of fact this claim is patently untrue. In the age of digital marketing, those responsible for marketing budgets are constantly held to account for their productivity. Perhaps this was less the case in 2003 when the blog posting was written; the world has since moved on, and marketers are, through the vast array of marketing analytics products and services now available, able to adjust their approaches and strategies to optimise their performance. Perhaps the lesson here is that people carrying out their job are always looking for objective ways to evaluate their productivity; nobody likes turning up to work and wasting their time and effort. If the technology arises that makes it possible for a discipline to understand its productivity, it will be widely embraced.

Educational institutions, at least the well run ones, have extensive and widely used productivity evaluation practices. Such practices include elaborate counting and grading of publications produced by academics, grades achieved by students, the consistent rating of teaching delivery, and the ongoing evaluation of demand for courses based on changes in the entry grades of students accepted into those courses. The reality is that objective measures that give insight into productivity are essential to the healthy functioning of any discipline within an organization; to suggest that we as software developers should make excuses about why we think it’s impossible to understand our productivity is a grave mistake.

The reason that objective productivity measures exist for these, according to Martin, impossible-to-measure functions within organizations (i.e. marketers and educators) is the reality that if a function fails to demonstrate productivity in a meaningful and objective way it will forever remain on the periphery of the strategy and outlook of the organization within which it contests for resources and leadership. There are any number of IT professionals out there who will complain and moan that their wider organization fails to recognise or value the strategic importance of IT generally, and software development specifically. They complain that software development capabilities are still considered a support function to be tolerated rather than a strategic partner with a seat at the leadership table. This continues to occur despite the increasing strategic importance of software development capabilities. Sadly, this is too often the case, and it is in no small part due to the fact that software development investments are sometimes (more often than one might think) a black hole that organizations tip money into, where teams consisting of employees on relatively high salaries provide very little in the way of objective data about the productivity of the expensive resources that they employ. All of that occurs even before any debate is to be had about the business value delivered by a software development initiative – more on that later.


MEASURING OUTPUT: CHALLENGING, BUT POSSIBLE

Before we go on, let’s look more closely at what is meant by productivity. According to Wikipedia:

“Productivity is a ratio of production output to what is required to produce it (inputs). The measure of productivity is defined as a total output per one unit of a total input.”

We see from this definition, and indeed intuitively, that we must have good measures of outputs and inputs if we are to end up with a satisfactory measure of productivity.
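To make the ratio concrete, here is a minimal sketch in Python with entirely hypothetical numbers and units; the calculation itself is trivial, and the difficulty discussed below lies in choosing what counts as output and input:

    # Minimal illustration of the productivity ratio: total output per unit of input.
    # The numbers and the units (user stories, person-weeks) are hypothetical examples.
    def productivity(output_units, input_units):
        return output_units / input_units

    # e.g. 12 user stories delivered by a release that consumed 30 person-weeks
    print(productivity(12, 30))  # 0.4 stories per person-week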

Martin devotes the entire blog to measures of software development output. This is where his well-targeted criticism of Function Points rings true. I would add to that and suggest that going to your business with output measures and some account of productivity expressed in terms of abstract function points is a mistake; quite frankly, they won’t care what you claim about productivity because they won’t understand what function points mean in terms of their business, and nor should they.

Martin launches, with some vitriol, at the measurement of Lines of Code in his blog posting; he admonishes the software industry at large for one of his “biggest irritations” and says that we all “should know better”. We all, no doubt, feel suitably chastised and perhaps more than just a little patronised. He proposes that it is impossible to measure the productivity of a software team because raw LoC is a terrible measure of the output of a software development team. This is like saying that democracy, in all its forms, should be abandoned because a candidate that he didn’t like got voted into office. Just because some software development teams and practitioners have applied a static source code metric like Lines of Code in an overly simplistic and misguided way is no reason to abandon all attempts to measure productivity.

Martin goes off onto a very wide arc of argument, suggesting that the only meaningful measure of output is the resultant business value of features delivered. Presumably he means business value in monetary terms. Software development teams are in no position to put a monetary value on the business value delivered by a feature. Doing so involves making some thoroughly unreasonable and rather arrogant assumptions about the value delivered by other functions within the business in order for functional change to come to fruition (e.g. how much value do Operations bring to that equation? What about the Sales teams?). The whole process of arriving at these business valuations is so subjective and arbitrary that the end result is pure opinion and therefore not a useful measure of output.

I suggest a measure of output that is somewhat more pragmatic: the only output that matters is the key features/requirements/user stories/user epics that the business has asked for. Any account of productivity must be expressed directly in terms of the functionality that is being delivered by software development teams. If the business have collectively asked for these features then they must value the capability that they bring. Okay, so this measure of output is not universally comparable across platforms with different stakeholders within an organization, or indeed across organizations. However, it is something that is meaningful to the stakeholders that are consuming (directly or indirectly) the functionality being delivered by the software development teams. It is most often those stakeholders who are paying for the software development capability, and therefore it is they who will need to be convinced regarding the worth of the outputs, the inputs invested in producing them, and the resultant productivity.
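As an illustration only, the sketch below counts output in exactly those terms: user stories marked as delivered per release, taken from a hypothetical tracker export (the field names and statuses are assumptions, not a prescription for any particular tool):

    # Count output in the business's own terms: user stories delivered per release.
    # The records below stand in for an export from whatever tracker is in use.
    from collections import Counter

    work_items = [
        {"id": "US-101", "release": "2012.3", "status": "Done"},
        {"id": "US-102", "release": "2012.3", "status": "Done"},
        {"id": "US-103", "release": "2012.4", "status": "In Progress"},
    ]

    output_by_release = Counter(
        item["release"] for item in work_items if item["status"] == "Done"
    )
    print(output_by_release)  # Counter({'2012.3': 2})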


MEASURING INPUT: SURPRISINGLY CHALLENGING, BUT ALSO POSSIBLE

Martin’s blog post focuses exclusively on the challenges of measuring the outputs of software development teams; it makes no comment about the similarly thorny issue of how the inputs of software development teams should be measured.

Input could be measured purely in terms of monetary cost, but this has shortcomings. For example, it will be skewed by cost factors not under the control of the development team (e.g. an offshore location versus an onshore one), and therefore the overall productivity calculation fails to express important differences in what the developers themselves are contributing as input to the process. Without this transparency, areas of high productivity cannot be identified and their best practice propagated throughout an organisation.

Measures of time invested might also be used. However, these have their own shortcomings: it is remarkably difficult to obtain consistent and reliable manually tracked measures of time input to software development at any useful level of task granularity; the measurement is also surprisingly taxing on development teams, and discipline in these manual tracking activities wanes over time. If not sufficiently granular, time tracking fails to recognise that developers often get dragged off onto tasks unrelated to a software release (e.g. supporting production).

Artefact-based measures of input (i.e. physical manifestations of the effort contributed by software development teams, such as source code changes, configuration file changes, etc.) are another potential source of input measures. These measures are appealing in that they are objective and involve no additional reporting burden on development teams (i.e. they are incidental to the development process), but they are challenging to measure in ways that capture the sophisticated nature of what software developers must invest to create those artefacts. While automated measurement of artefacts is achievable, it requires that a great many aspects of the artefacts be evaluated (e.g. volume, complexity, interrelatedness, contextual difficulty, etc.) and aggregated into a single measure of effort, which is technically challenging.
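To illustrate the aggregation problem rather than solve it, here is a sketch of how a few artefact measures might be normalised and combined into a single effort score; the measures chosen and the weights are assumptions and would need calibration against a team’s own history:

    # Illustrative aggregation of artefact-based input measures into one effort score.
    # Weights and normalisation constants are assumptions, not calibrated values.
    WEIGHTS = {"lines_changed": 0.3, "files_touched": 0.2, "complexity_delta": 0.5}

    def effort_score(change):
        normalised = {
            "lines_changed": change["lines_changed"] / 100.0,  # per 100 lines changed
            "files_touched": change["files_touched"] / 10.0,   # per 10 files touched
            "complexity_delta": change["complexity_delta"],    # change in complexity
        }
        return sum(WEIGHTS[key] * value for key, value in normalised.items())

    print(effort_score({"lines_changed": 250, "files_touched": 6, "complexity_delta": 3.5}))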

Incidentally, if you have reliable and granular measures of input into delivering functionality to your stakeholders, as proposed above as a measure of output, you not only have the ability to demonstrate productivity; you also have the ability to help the business prioritise tasks for future releases or Sprints based on the input invested in similar tasks previously.
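A sketch of that idea, assuming you have historical input (in person-days) recorded against completed stories grouped into rough categories; the categories, the figures and the use of a median are illustrative only:

    # Use historical input per completed story to inform planning of future work.
    # Categories and person-day figures are invented for the sake of the example.
    from statistics import median

    history = {
        "reporting": [4.0, 6.5, 5.0],   # person-days spent on past reporting stories
        "integration": [9.0, 12.0],     # person-days spent on past integration stories
    }

    def estimate(category):
        return median(history[category])  # a simple, robust central estimate

    planned = ["reporting", "integration", "reporting"]
    print(sum(estimate(c) for c in planned))  # rough input estimate for the next Sprint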

It is very important to note that ANY and ALL measures of software development team productivity must be communicated alongside commensurate measures of release quality. Even in the more readily measured manufacturing engineering domains, productivity measures need to be supplemented with measures of quality.
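By way of a simple example of what “alongside” means in practice, a release report might always pair the productivity figure with a quality figure; defects per delivered story is used here purely as a placeholder for whatever quality measure a team already trusts:

    # Report productivity and quality together so neither is read in isolation.
    # Defects-per-story is just one example of a commensurate quality measure.
    def release_report(stories_delivered, person_weeks, defects_raised):
        return {
            "productivity": stories_delivered / person_weeks,  # stories per person-week
            "quality": defects_raised / stories_delivered,     # defects per story
        }

    print(release_report(stories_delivered=12, person_weeks=30, defects_raised=4))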


CONCLUSION

In summary, it is professional suicide for developers to excuse themselves from providing reasonable and, ideally, objective measures of productivity. Without credible measures of productivity, software development capabilities will remain tertiary support functions in many businesses where they could be primary drivers of strategic direction.

Nobody has said that measuring productivity in software development is easy; it isn’t, but it is necessary. Measuring output is difficult but possible. Measuring input is a less well appreciated challenge, but it too is far from impossible. Software development divisions of organisations just need to choose the measurement approaches that are appropriate for their environment.

Mr Fowler should speak for himself when he suggests that we “admit to our ignorance” when it comes to productivity in software development. This ignorance is his and should remain his alone.