Capturing and examining code metrics can be useful if the resultant data is applied correctly. Attempting to measure productivity by studying code, for example, is an approach that has largely failed [ref]. Code metrics, however, can be helpful in objectively spotting complexity. Since complexity has a high correlation to the occurrence of defects, doesn’t it make sense to measure and track source code complexity?
When I see code that violates the DRY principle or contains a load of String literals, I tend to immediately jump into refactoring mode. Yet, as Paul recently pointed out to me, when we (as developers) construct and maintain build files, all those code perfectionist ideals seem to disappear.
For example, I was recently working with an Ant build file that had a maven-esque mechanism for managing dependencies. The build script downloaded each required binary dependency at build time via Ant’s get task like so:
This particular project has over 20 dependencies; therefore, there were over 20 get tasks like the one above. This build script was in danger of becoming a maintenance nightmare if someone decided to upgrade a few different libraries!
Note how commons-lang-1.0.1.jar appears twice, meaning that someone would have to update the version in two places during an upgrade. Also note the base URL: ibiblio was the main download source for 80% of the binaries.
These get tasks can be refactored to follow the DRY principle by mimicking the Replace Magic Number with Symbolic Constant technique. By breaking the String values out into properties, the build file becomes more manageable and, therefore, more maintainable.
For example, the above Ant code can be refactored into
Now, when a library’s version requires updating, one only has to edit one String; moreover, if the file’s location changes, or the team decides to host all binaries locally, one property requires changing, not 20 plus.
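The same idea carries over to ordinary code. As a rough Java analogy (not the Ant build file itself), the sketch below keeps the base URL and each version in exactly one place. The class name, the `repo.example.com` URL and the path layout are all placeholders invented for illustration.

```java
// Hypothetical sketch: Replace Magic Number with Symbolic Constant,
// applied to dependency download locations.
class DependencyUrls {
    // Change this one value if the team starts hosting binaries locally.
    // (Placeholder URL, not the real download source.)
    static final String BASE_URL = "http://repo.example.com/maven";

    // Change this one value when commons-lang is upgraded.
    static final String COMMONS_LANG_VERSION = "1.0.1";

    // Builds "<artifact>-<version>.jar" so the file name is never typed twice.
    static String jarName(String artifact, String version) {
        return artifact + "-" + version + ".jar";
    }

    // Builds the full download URL from the shared constants.
    // The "<group>/jars/" layout is an assumption for this example.
    static String jarUrl(String group, String artifact, String version) {
        return BASE_URL + "/" + group + "/jars/" + jarName(artifact, version);
    }
}
```

With this in place, `DependencyUrls.jarUrl("commons-lang", "commons-lang", DependencyUrls.COMMONS_LANG_VERSION)` yields a URL whose version and host each come from a single definition.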
The next time you happen to be in a refactoring mood, have a look at your build file. Just remember to verify it works before and after you improve it.
I am often surprised/concerned to hear where many organizations are along their “software automation continuum”. Developers spend most of their time automating processes for users, yet don’t always see ways to automate their own development processes.
Sometimes developers/organizations are under the impression that they have automated an activity because they’ve written a few scripts to eliminate some steps in the development process. It’s not about throwing a couple of scripts together and running them all the time. In order for automation to be effective for the entire team, you need to share your automation script(s) through the SCM repository, incorporate them into the build system and, finally, make the process continuous (or at least scheduled). The figure below demonstrates the steps along the automation continuum.
1. Identify: the activity you seek to automate. It may be compile, test, inspection, deployment, database integration, and so on. An easy way to think of this is in terms of the major software disciplines/activities that most projects engage in, no matter which methodology they are using.
2. Automate: the activity so that it is repeatable. This will involve writing some scripts.
3. Share: using an SCM tool such as Subversion, so that others may use the scripts/programs you have written. If no one else can use your script, you are not leveraging the capabilities of automation.
4. Build: a process that you have automated but that lives on its own is of minimal value. In many cases, you’d like one process to precede another. For example, you may want to remove certain files before you perform a compile. You can do this by incorporating your automation into a comprehensive build script, written with a tool like NAnt.
5. Make it Continuous: so that humans do not need to run the automated process and so that risks are mitigated by ensuring all is well whenever a change is applied to the software. You may do this using a continuous integration tool such as CruiseControl.
Where is your organization along this automation continuum? Use the above heuristic to determine how automated your development activities are on your projects. Note that you may be performing steps 2 and 4 together (for example, if you are using a build script). The more you can make these activities automatic, the more time you will have to deliver value to your users.
Ever find yourself in need of a simple way to validate the structure and even the contents of generated XML? Here is one way of doing it:
public void testToXML() {
    BatchDependencyXMLReport report =
        new BatchDependencyXMLReport(new Date(9000000), this.getFilters());
    report.addTargetAndDependencies("com.vanward.test.MyTest",
        this.getDependencies());
    report.addTargetAndDependencies("com.xom.xml.Test",
        this.getDependencies());
    // The expected document, built by hand; inner quotes must be escaped
    String valid =
        "<DependencyReport date=\"Wed Dec 31 21:30:00 EST 1969\">" +
        "<FiltersApplied><Filter pattern=\"java|org\" />" +
        "</FiltersApplied><Class name=\"com.vanward.test.MyTest\">" +
        "<Dependency name=\"com.vanward.xml.Element\" /></Class>" +
        "<Class name=\"com.xom.xml.Test\">" +
        "<Dependency name=\"com.vanward.xml.Element\" />" +
        "</Class></DependencyReport>";
    assertEquals("report didn't match xml", valid, report.toXML());
}
This is as brute force as it gets. The test case works, but if the document structure changes, someone’s got to update that nasty String.
Thankfully, there is a better way to do this. XMLUnit is a JUnit extension (there is also a .NET equivalent) that provides a nifty API for validating the structure of XML documents and their contents. Via its Diff class, the validation is also more flexible than that of straight String comparisons.
One can use XMLUnit via extending its XMLTestCase or via composition. Either way, however, you must properly configure it. This is easily done by creating a fixture as shown below:
Now we can rewrite the previous test case and utilize the Diff class.
public void testToXML() throws Exception {
    BatchDependencyXMLReport report =
        new BatchDependencyXMLReport(new Date(9000000), this.getFilters());
    report.addTargetAndDependencies(
        "com.vanward.test.MyTest", this.getDependencies());
    report.addTargetAndDependencies(
        "com.xom.xml.Test", this.getDependencies());
    Diff diff = new Diff(new FileReader(
        new File("./test/conf/report-control.xml")),
        new StringReader(report.toXML()));
    assertTrue("XML was not identical", diff.identical());
}
Note, however, that by using XMLUnit, our test cases can end up being a bit more dependent on aspects out of their control. In this case, the test case depends on the file system: the report-control.xml file is read in and used for comparison purposes. Because of this outside dependency, XMLUnit tests are usually not true unit tests, but component tests.
To validate the structure of XML (meaning attribute values are ignored), XMLUnit offers the ability to use various listeners within the Diff class. The aptly named IgnoreTextAndAttributeValuesDifferenceListener effectively ignores the data in an XML document and simply validates the structure. Using this listener is quite simple:
public void testToXMLFormatOnly() throws Exception {
    BatchDependencyXMLReport report =
        new BatchDependencyXMLReport(new Date(), this.getFilters());
    report.addTargetAndDependencies(
        "com.vanward.test.MyTest", this.getDependencies());
    report.addTargetAndDependencies(
        "com.xom.xml.Test", this.getDependencies());
    Diff diff = new Diff(new FileReader(
        new File("./test/conf/report-control.xml")),
        new StringReader(report.toXML()));
    // setting this difference listener will ignore ALL attribute values
    diff.overrideDifferenceListener(
        new IgnoreTextAndAttributeValuesDifferenceListener());
    assertTrue("XML was not similar", diff.similar());
}
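To make “structure only” concrete, here is a minimal sketch of that kind of comparison using just the JDK’s DOM parser. It is not how XMLUnit implements its listener, and XMLUnit’s notion of “similar” may differ in detail. The class and method names are hypothetical, and the rule it applies (same element names, same attribute names, same child order, with text and attribute values ignored) is an assumption made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper: compares two XML documents by shape, not by data.
class StructureOnlyXmlCompare {

    static boolean sameStructure(String controlXml, String testXml) {
        return sameStructure(rootOf(controlXml), rootOf(testXml));
    }

    // Parses a String into a DOM tree and returns its root element.
    private static Element rootOf(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Element names, attribute names and child order must match;
    // attribute values and text content are never inspected.
    private static boolean sameStructure(Element control, Element test) {
        if (!control.getTagName().equals(test.getTagName())) return false;
        if (!attributeNames(control).equals(attributeNames(test))) return false;
        List<Element> controlChildren = childElements(control);
        List<Element> testChildren = childElements(test);
        if (controlChildren.size() != testChildren.size()) return false;
        for (int i = 0; i < controlChildren.size(); i++) {
            if (!sameStructure(controlChildren.get(i), testChildren.get(i))) return false;
        }
        return true;
    }

    private static TreeSet<String> attributeNames(Element element) {
        TreeSet<String> names = new TreeSet<>();
        NamedNodeMap attributes = element.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            names.add(attributes.item(i).getNodeName());
        }
        return names;
    }

    private static List<Element> childElements(Element parent) {
        List<Element> children = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) children.add((Element) n);
        }
        return children;
    }
}
```

Two reports that differ only in their date attribute or class names would compare as structurally equal here, while a renamed or missing element would not.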
The next time the task of validating an application’s generated XML comes up, consider using XMLUnit.
For years, the impact of compliance on information technology has been projected. In early 2005, as the Sarbox Section 404 deadlines began to kick in, Qualcomm reported spending 67,000 man hours and over $7 million to achieve compliance.
Bernie Donnelly, vice-president of quality assurance and control at the Philadelphia Stock Exchange explains how execution of financial compliance is a function of information technology: “When [Sarbanes-Oxley] first came out, everybody was thinking about finances and the accuracy of year-end reports. But it starts to take on a life of its own. Because when you ask that one question-’Is this number accurate?’ - then you have to ensure its accuracy. On the IT side, all these other things have to happen to answer that one question.”
In the case of Sarbanes-Oxley, cost assessment metrics are emerging. According to the InformationWeek article “Sarbox Isn’t Just for the Big Guys”, the accepted rule of thumb for Sarbanes-Oxley compliance is $1 million for every $1 billion in revenue.
Smaller companies face these costs, too. Often, as a supplier to a public company bound to offer transparency about dealings with vendors, smaller businesses find themselves in the ripple effect of regulatory compliance.
A PricewaterhouseCoopers survey of CEOs reports the bad news that most companies consider the benefits of compliance efforts unlikely to match these costs. Given how many smaller, private companies are taking on regulatory compliance concerns and the known lack of ROI on these efforts, seasoned technology executives are looking to early software quality as an opportunity to achieve their corporate financial goals.
In the Stelligent whitepaper “The Business Case for Engineered Software Quality” the benefits of improved quality — especially early in the development lifecycle — are documented. Countering the explicit costs of regulatory compliance with the intrinsic benefits of improved quality yields a compelling financial argument for technology executives to contemplate.
The information technology component of compliance is ultimately about making sure the processes that collect, manipulate and maintain data all work as expected. There is great overlap between this mission and the mission of early software quality.
Early software quality has its own established business case. Regulatory compliance fails to meet ROI evaluations, but is required by government entities or strategic business pressures.
By creating a strategy that couples the two initiatives, savvy technology executives can leverage the huge savings of early quality techniques to absorb the costs of required compliance.
I once had the opportunity to speak with a project manager at a large company whose development team had fashioned a fairly rigorous automated developer testing regimen. They had a large number of tests at all levels and had built a fairly robust auto-deployment process via their build. From time to time, however, the team would deploy their application into company-wide production with glaring UI-specific issues, such as pages with broken tables and missing images. It was particularly painful for this manager, as he would inevitably find out about the issues from various groups within the company who depended on this application. It turned out that this team was so heavily focused on automation that they neglected to use common sense and had never actually manually checked the state of a deployment. Once a manual sanity check was put into practice, UI ugliness issues largely disappeared.
Best practice: be sure to run a manual sanity check after a deployment before your users accuse you of insanity.
Coding standards facilitate a common understanding of a code base among a diverse group of developers. The car maintenance market has been largely standardized: one can buy a new headlight for a Toyota from Toyota or from any number of third-party vendors, and at stores not even affiliated with Toyota, which makes replacement or enhancement rapid. In the same way, a code base’s “structure” can become standardized, which permits various individuals to quickly assess its behavior and modify it as needed.
While both human code reviews and peer programming can be effective in monitoring coding standards, they do not scale as well as automated tools. Not only do tools contain hundreds of rules (that are usually customizable), they can be run frequently without intervention.
In a Continuous Integration environment, a code analysis tool can be run anytime the repository changes. The tool can analyze an individual file (such as the one modified) or the tool can analyze the entire code base. What’s more, due to the Continuous Integration infrastructure, interested parties can be notified of coding standard violations instantly.
For instance, a popular code analysis tool for the Java platform is PMD. PMD has over 180 customizable rules in categories ranging from braces placement (e.g. for conditionals) and naming conventions to design conventions (like simplifying conditionals) and even unused code.
In Java, if a conditional only has one statement following it, braces are optional. The following code, for example, is completely legal in Java:
if(status)
    commit();
Some organizations, however, find this code dangerous, due to subtle behavioral effects that may occur if someone forgets to add braces when adding additional statements.
The following code is completely legal; however, there is a subtle defect that could ensnare an unsuspecting developer. Do you see it?
if(status)
    log.debug("committing db");
    commit();
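For anyone who missed it: without braces, only the first statement belongs to the conditional, so the commit runs regardless of status. The small, self-contained sketch below records which actions run instead of touching a database; the class and method names are made up for the demonstration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo of the missing-braces defect.
class MissingBracesDemo {

    // Mirrors the defective snippet: the indentation suggests both
    // statements are guarded, but only the first one is.
    static List<String> withoutBraces(boolean status) {
        List<String> actions = new ArrayList<>();
        if (status)
            actions.add("log");
            actions.add("commit"); // always executes, whatever status is
        return actions;
    }

    // The intended behavior: both statements are guarded by the condition.
    static List<String> withBraces(boolean status) {
        List<String> actions = new ArrayList<>();
        if (status) {
            actions.add("log");
            actions.add("commit");
        }
        return actions;
    }
}
```

Calling `withoutBraces(false)` still records a commit, while `withBraces(false)` records nothing.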
PMD, however, with its handy dandy rule set, will find code that has the potential to cause these errors and flag it in a report.
Naming conventions are usually the first coding standards a team defines, because terse, undescriptive variable and method names can be difficult to comprehend (especially if the original author no longer works for the company!). For example, the following method could use a better name, and the variables ‘s’ and ‘t’ are also unhelpful in the larger context: one can work out their types by examining the top of the method, but with more descriptive names no one would need to look back there.
public void cw(IWord wrd) throws CreateException {
    Session s = null;
    Transaction t = null;
    try{
        s = WordDAOImpl.sessFactory.getHibernateSession();
        t = s.beginTransaction();
        s.saveOrUpdateCopy(wrd);
        t.commit();
        s.flush();
        s.close();
    }catch(Throwable thr){
        thr.printStackTrace();
        try{s.close();}catch(Exception e){}
        try{t.rollback();}catch(Exception e){}
        throw new CreateException(thr.getMessage());
    }
}
Once again, PMD to the rescue! Running PMD against this code would yield multiple rule violations for both the method name and those one-character variable names. By default, PMD’s scanning lengths are set to 3; however, teams can modify these values to require longer names if desired.
PMD can also help simplify code. For example, the following method, while syntactically correct, is rather verbose.
Once this method is flagged by PMD, it can be made more straightforward like so.
public boolean validateAddress(){
    return (this.getToAddress() != null);
}
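As a rough illustration of the kind of before-and-after involved, the sketch below pairs an assumed verbose form (an if/else that returns true or false, the shape a rule such as PMD’s SimplifyBooleanReturns is meant to catch) with the simplified one. The verbose body and the surrounding `AddressForm` class are guesses made for the example, not the original method.

```java
// Hypothetical container class so both variants can be compared side by side.
class AddressForm {
    private final String toAddress;

    AddressForm(String toAddress) {
        this.toAddress = toAddress;
    }

    String getToAddress() {
        return toAddress;
    }

    // Assumed "before" shape: the condition is already a boolean,
    // so the if/else adds nothing.
    boolean validateAddressVerbose() {
        if (this.getToAddress() != null) {
            return true;
        } else {
            return false;
        }
    }

    // Simplified form: return the condition directly.
    boolean validateAddress() {
        return this.getToAddress() != null;
    }
}
```

Both variants return the same result for every input, which is what makes this a safe refactoring.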
PMD can be run via Ant or Maven and, like almost every other inspection tool on the market, PMD produces an XML report, which can be transformed into HTML. For example, the following report displays the violations for a series of .java files in the XDoclet code base.
PMD also contains a series of rules which can report complexity metrics like Cyclomatic Complexity, long methods and long classes. What’s more, PMD isn’t the only code audit tool available to Java developers. CheckStyle is another open source tool with extensive documentation and Ant and Maven runners capable of producing HTML reports.
FxCop is a similar tool for the .NET platform with a myriad of rules and reporting capabilities; additionally, PyLint is available for Python.
By continuously monitoring and auditing code, corrective action can be taken early and often, thus avoiding long term maintenance issues and reducing the chances of introducing defects.
Several months ago, while I was speaking with my co-worker, Chuck, one of our project’s builds failed. I knew this because I received an SMS text message on my mobile phone, my Office Space “we’ve got sorta a problem here” sound was played through my computer speakers, I received an email, and the Orb on my desk changed to a red hue. I briefly interrupted my conversation with Chuck, picked up the phone, and rang the tech lead for this project. Before I got one word out, she said “I just got it and I’m on it”. I hung up the phone and continued my conversation without missing a beat. This is the Detector pattern in action.
The Detector pattern is about getting the right information, to the right people, at the right time and in the right way. If I was away from my email, the SMS text message would have notified me. If I was in my office, but without my email up, the Orb or the sound would have been sufficient in signaling the problem. This pattern is about pushing information to people that need to take action to resolve a problem. A failed compile or a failed test is an example of such a problem.
It is important that this information is not just sent to all project members every time. This information should only be sent once a certain threshold has been met and to people that need to take action on the problem.
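A minimal sketch of that rule might look like the following. Everything in it is hypothetical: the `Channel` interface stands in for SMS, email, sound or the Orb, and “threshold” is interpreted here as a number of consecutive failed builds, which is only one of several reasonable readings.

```java
import java.util.List;

// Hypothetical Detector: notifies only the people who must act,
// and only once failures reach a threshold.
class BuildFailureDetector {

    // Stand-in for a delivery mechanism (SMS, email, sound, Orb, ...).
    interface Channel {
        void send(String recipient, String message);
    }

    private final int failureThreshold;
    private final List<String> responders;
    private final List<Channel> channels;
    private int consecutiveFailures = 0;

    BuildFailureDetector(int failureThreshold, List<String> responders, List<Channel> channels) {
        this.failureThreshold = failureThreshold;
        this.responders = responders;
        this.channels = channels;
    }

    // Records one build result and returns how many notifications were sent.
    int recordBuild(boolean succeeded) {
        if (succeeded) {
            consecutiveFailures = 0; // a good build resets the count
            return 0;
        }
        consecutiveFailures++;
        if (consecutiveFailures < failureThreshold) {
            return 0; // below the threshold: stay quiet
        }
        int sent = 0;
        String message = "Build failed " + consecutiveFailures + " time(s) in a row";
        for (String responder : responders) {
            for (Channel channel : channels) {
                channel.send(responder, message);
                sent++;
            }
        }
        return sent;
    }
}
```

Sending through every channel to each responder mirrors the story above, where the same failure arrived by SMS, sound, email and Orb; a real setup would likely let each person choose their channels.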
Ever attempted to deploy your software in your user’s environment only to discover that you were using a different version of the operating system, database, or application server? The Scorched Earth pattern is about reducing assumptions in your build and testing environments. A “scorch” means removing and reapplying software, scripts, and configuration values to ensure the environment is operating as expected.
When you are building your software, you want to be sure that there are no leftover files or configuration settings that may make the software fail (or produce a false positive). A full “scorch” means starting with nothing on the computer and applying one “layer” at a time until the complete system is in place. This is typically performed on a testing or staging machine. A Scorched Earth implementation that uses a Robot (automated process) to apply each layer makes this process more efficient. For example, remove everything from a machine and then apply the following layers to it:
Operating System
Operating System Configuration (network connectivity, users, firewall)
Server components for the software (e.g. application server, database server, messaging server)
Server configuration
Third-party tools (such as web frameworks, ORM, etc.)
Custom software (software you are writing for your users)
You may apply a “scorch” to only one layer, such as the custom software components only. In this case, you remove all associated files and build and test the software in this environment. Which layer you scorch to will depend upon your level of risk. If your software relies upon various operating system files, then you may choose to scorch the entire system more often. In any case, a full scorch is recommended a few times before releasing the software to your users.
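One way to picture scorching to a given layer is as an ordered plan: tear down from the top of the stack to the chosen layer, then reapply upward. The sketch below only produces that plan as a list of steps. The enum, the method name and the assumption that scorching a layer also redoes every layer above it are illustrative choices, not part of the pattern’s definition.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical planner for a Scorched Earth run.
class ScorchedEarth {

    // The layers listed above, in the order they are applied.
    enum Layer {
        OPERATING_SYSTEM,
        OS_CONFIGURATION,
        SERVER_COMPONENTS,
        SERVER_CONFIGURATION,
        THIRD_PARTY_TOOLS,
        CUSTOM_SOFTWARE
    }

    // Returns the steps for scorching down to the given layer:
    // removals from the top layer downward, then applications back up.
    static List<String> plan(Layer scorchTo) {
        Layer[] layers = Layer.values();
        List<String> steps = new ArrayList<>();
        for (int i = layers.length - 1; i >= scorchTo.ordinal(); i--) {
            steps.add("remove " + layers[i]);
        }
        for (int i = scorchTo.ordinal(); i < layers.length; i++) {
            steps.add("apply " + layers[i]);
        }
        return steps;
    }
}
```

`ScorchedEarth.plan(Layer.CUSTOM_SOFTWARE)` gives the cheap, frequent scorch (two steps), while `plan(Layer.OPERATING_SYSTEM)` gives the full scorch recommended before a release.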