Code quality’s singular metric
Several excellent articles and posts have appeared around the internet recently in response to a question asked on LinkedIn about metrics and code quality. Specifically, the question asked was:
What are the useful Metrics for Code Quality?
The user goes on to state that
The quality of any software application will depend mostly on its code base and it’s important to know what might be the key metrics that help us to evaluate the stability and quality of the code base.
Many of the answers bring up excellent points, including different ways of viewing the question, such as taking the time to understand what attributes of code are related back to quality. For example, Michael Bolton aptly suggests you ask if the code is:
- testable
- supportable
- maintainable
- portable
- localizable
What’s more, further answers suggest coverage, code size, and other metrics as discrete measurements that can help gauge quality. These are all excellent answers; however, the answer turns out to be quite simple.
We have found time and time again that there is one metric that most appropriately relates to code quality: Cyclomatic complexity. If your code base has highly localized pockets (i.e. methods) of Cyclomatic complexity (or CC), your code will eventually have issues (undoubtedly affecting quality, however you define it).
In fact, CC affects arguably every attribute listed above (testable through localizable). Think about it for a minute: a method that has 27 different paths is next to impossible to adequately test, which means you’re going to have a doozy of a time supporting it because it isn’t easy to maintain. Code that is littered with high CC is a blast to port as well (hopefully you’ve got deep pockets and customers that absolutely love you). Good luck localizing it too.

It turns out that the other metrics mentioned (such as code size) tend to correlate with each other; in fact, it seems that all complexity-like metrics point back to CC. Classes that have a lot of dependencies are usually big, big classes usually have big methods, and big methods usually have lots of conditionals. Lots of conditionals mean a high CC value (CC counts the independent paths through a method, such as those created by an if/else chain).
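To make the "lots of conditionals mean a high CC value" point concrete, here is a minimal sketch that approximates CC as one plus the number of decision points, using Python's standard `ast` module. The set of node types counted and the `shipping_cost` sample are assumptions made for illustration; real CC tools differ in exactly what they count, so treat the number as an estimate rather than a definitive measurement.

```python
import ast

# Node types treated as decision points in this sketch (an assumption;
# real tools differ in exactly what they count).
DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(source: str) -> int:
    """Approximate the CC of a snippet as 1 + number of decision points."""
    tree = ast.parse(source)
    decisions = 0
    for node in ast.walk(tree):
        if isinstance(node, DECISION_NODES):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds one short-circuit branch per extra operand.
            decisions += len(node.values) - 1
    return 1 + decisions


# Hypothetical sample method: an if/elif chain followed by a compound test.
SHIPPING = '''
def shipping_cost(weight, express):
    if weight <= 0:
        raise ValueError("weight must be positive")
    elif weight < 1:
        cost = 5
    elif weight < 10:
        cost = 12
    else:
        cost = 30
    if express and weight < 10:
        cost *= 2
    return cost
'''

print(cyclomatic_complexity(SHIPPING))  # 4 `if`/`elif` tests + 1 `and` + 1 = 6
```

Every extra `elif` or boolean operator pushes the number up by one, which is why long conditional chains and high CC tend to travel together.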
Code coverage is an excellent metric for ascertaining what code isn’t touched by tests, and it happens to relate directly to CC: in order to reach 100% branch coverage you’d have to have a one-to-one relationship with CC (i.e. if a method has 27 different paths, you’d need 27 tests to reach full coverage). Plus, coverage can be misleading and can provide a false sense of security.
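A small sketch of the relationship between CC and tests, using a hypothetical `classify` function that is not from any real code base. It has two decision points, so its CC is 3 under the usual "decisions plus one" count, and it takes three checks to exercise every branch. For a flat if/else chain like this one the two numbers coincide; when decisions follow one another instead of forming a chain, fewer tests can cover every branch, so CC is best read as an upper bound on the tests needed for branch coverage.

```python
def classify(n: int) -> str:
    # Two decision points, so CC = 3 (decisions + 1).
    if n < 0:
        return "negative"
    elif n == 0:
        return "zero"
    else:
        return "positive"


# One check per path; dropping any one of them leaves a branch unexercised.
CASES = [(-5, "negative"), (0, "zero"), (7, "positive")]

for value, expected in CASES:
    assert classify(value) == expected
```

Scale the same idea up to a method with a CC of 27 and the size of the test burden becomes obvious.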
The beauty of CC is that it’s one metric. One number is all you need to understand risk. You can then apply it in many ways. For example, we provide development teams with ratios related to CC (because CC precisely delineates complex methods it’s often helpful to relate it to other normalized metrics) that enable them to gauge quickly the overall health of a code base. When the ratios grow, things are getting worse and when they decrease, happiness ensues.
The definition of quality (and its associated attributes) as it relates to software has traditionally been quite hard to nail down (regardless of whether you are a customer or a developer); however, one thing is certain: complex code is a house of cards that will eventually collapse (via attrition, bankruptcy, ossification, etc.). Finding complexity and proactively reducing it will lead to software that is more testable, maintainable, and supportable. And by the way, that happens to be the kind of software customers like.

September 27th, 2007 at 2:51 am
Good article. One minor thing:
“(i.e. if a method has 27 different paths, you’d need 27 tests to reach full coverage)”
I don’t know what you call tests, but I guess you at least need 27 asserts (even if you have a 1:1 relationship between tests and methods). But CC = number of tests is a good rule of thumb.
One needs to be aware of the pitfalls of adding CC though. Sometimes people add method CC to create class or package CC. It worked for us to calculate class CC with class_cc=sum(cc_of_methods) - number_of_methods (there is a name for this somewhere). This takes into account that some classes are not very complex but contain lots of small methods. You can go with avg(method_cc) too.
Peace
-stephan
–
Stephan Schmidt :: stephan@reposita.org
Reposita Open Source - Monitor your software development
http://www.reposita.org
Blog at http://stephan.reposita.org - No signal. No noise.
September 27th, 2007 at 9:44 am
Right on, Stephan; good point w/asserts. I couldn’t agree more with you regarding the dangers of aggregating CC. Check out “Aggregate Cyclomatic complexity is meaningless”, as too often people do exactly what you mention.
September 29th, 2007 at 2:53 am
I agree with you in general regarding the need for lower CC stats. But I would also caution against going extra miles trying to reduce CC. At one of my employers, we had a rule that classes that have high CC (compared to arbitrary thresholds) are considered risky and therefore need to be tested, whereas other classes are deemed simple and therefore don’t have to be tested (I know that this flies in the face of Test Driven Design, but …).
So can you guess what the developers used to do when they ran into a method that had high CC? They simply refactored it into other methods, until the CC measure was back in the “non-complex” area. Needless to say, this was a waste of time, and the resulting code was difficult to understand and maintain (because it was written to avoid high CC, and not to promote simplicity, testability, understandability, …)
October 2nd, 2007 at 10:59 pm
[…] Fowler discusses Design Stamina Health in making the compelling case that software design is a worthwhile activity. I doubt many of us disagree with his conclusion, assuming that the resulting design realizes the intended goal - that it can evolve. While good software designs are able to evolve based on the known factors today, the unforeseen factors of tomorrow wreak havoc on design. I doubt we can ever entirely defeat these forces, but many techniques have been discovered that allow us to craft more adaptable software designs. Object-oriented development, design patterns, software code quality metrics, design quality principles, emergent design techniques such as Test-Driven Development, and Service Oriented Architecture all represent a positive step. Yet for large enterprise software systems, there is still a key ingredient missing in delaying design rot. On the horizon looms a disruptive technology, codename OSGi, that stands to redefine how we think about designing enterprise software on the Java platform. […]
October 8th, 2007 at 7:25 am
[…] New for Visual Studio 2008 - Code Metrics- I like the “Maintainability Index” metric, but as I’ve mused before, man, there is only one metric that matters, baby. […]
October 17th, 2007 at 7:23 pm
I think that there is never one singular metric for code quality, but rather, a bunch of smaller metrics that can show an overall picture.
Much like a connect-the-dot picture, no single dot can give you a clear view into reality, but many smaller dots (CC, code coverage %, # tests, LOC, eLOC, etc…) may give you a general idea.
October 23rd, 2007 at 7:40 am
Let me thank you for bringing it up here & explaining it in great detail.
It’s good to see that you are close to my views on CC.
Regards,
Venkat.
October 23rd, 2007 at 1:29 pm
[…] Andrew Glover calls it as Code Quality’s singular metric […]
October 29th, 2007 at 4:56 pm
[…] There are myriad code metrics available to measure attributes of code, such as complexity, coupling, and length, but few are arguably useful. In fact, as I’ve stated before, Cyclomatic complexity is the most applicable metric out there for accurately determining risk. […]
February 14th, 2008 at 8:39 am
[…] I wonder if Andy has seen this. […]