Monday, May 28, 2007

Comments--How many should you have?

While there is considerable conversation about how many unit tests to write (I have two recent posts on the topic--here and here--based on conversations with vendors), few people have much to say about how many comments there should be. Once the usual themes (write more comments, keep comments up to date, avoid pointless comments) have been stated, the conversation ends. Everyone understands what relevant and up-to-date comments mean, but few will hazard a guess as to how many of them are necessary.

Interestingly, the comment ratio is a key factor in one of the most useful metrics around, the maintainability index (MI). It is also a ratio that is rewarded by Ohloh.net, the emerging site for tracking details of open-source projects. Ohloh gives projects a kudo for a high percentage of comments. The question is how high is high enough to earn the kudo. According to Ohloh, the average open-source project runs around 35% comments. Projects in the top third overall get the kudo. I don't know the cut-off for this top third, but I do know that Apache's FOP, with a 46% comment ratio, definitely qualifies.
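If you want a rough sense of where your own files stand, the ratio is easy to approximate. Here's a minimal Groovy sketch--my own naive take, not Ohloh's actual algorithm--that counts comment lines against non-blank lines (the file path is just an example):

```groovy
// Naive comment-ratio estimate for a Java/Groovy source file.
// Handles // and /* ... */ comments, but ignores tricky cases such
// as comment markers inside string literals.
def commentRatio(File src) {
    int total = 0
    int comments = 0
    boolean inBlock = false
    src.eachLine { line ->
        String t = line.trim()
        if (!t) return                  // skip blank lines
        total++
        if (inBlock) {
            comments++
            if (t.contains('*/')) inBlock = false
        } else if (t.startsWith('//')) {
            comments++
        } else if (t.startsWith('/*')) {
            comments++
            inBlock = !t.contains('*/')
        }
    }
    total ? (comments * 100 / total) : 0
}

// Hypothetical path, just for illustration:
println "${commentRatio(new File('src/org/example/Foo.java'))}% comments"
```

Run something like this over a source tree and you have a number to hold up against Ohloh's 35% average.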

Comment count is a metric that is particularly easy to spoof. You could do what some OSS projects do and list all the license terms at the top of each file. I've always disliked scrolling through this pointless legalese. In my file headers, I simply point to the URL of the license, which is sufficient. But all the license boilerplate inflates comment counts amazingly. So does commenting out code and leaving it in the codebase (gag!) or writing Javadoc for simple getters and setters (also gag).

But where legitimate comments are concerned, 35% is probably a good working number to shoot for. I find that many OSS projects at this ratio have quite readable code. I'll be working to bring my own codebase up to this level.

Friday, May 18, 2007

Groovy Gaining Traction


Java developers suddenly have a wealth of choices when it comes to dynamic languages that run on the JVM. There's JavaFX, which Sun announced at JavaOne this year; JRuby, which Sun expects to complete sometime this year; and then, of course, there's my favorite: Groovy. Groovy makes writing Java programs far easier. It essentially takes Java and removes the syntactical cruft, leaving a neat language that makes you terrifically productive.
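If you've never seen Groovy, here's a small, hypothetical taste of what that cruft removal looks like. The snippet below is a complete, runnable script; the Java equivalent would need a public class, a main() method, explicit types, and getter/setter boilerplate:

```groovy
// A complete Groovy script: no class declaration, no main(), no semicolons.
def langs = ['JavaFX', 'JRuby', 'Groovy']
langs.findAll { it.startsWith('G') }
     .each { println "$it runs on the JVM" }

// Properties replace getter/setter boilerplate entirely:
class Book { String title; int rank }
def bible = new Book(title: 'Groovy in Action', rank: 5)
println "$bible.title came in at number $bible.rank"
```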

Because Groovy took a long time getting out of the gate, it's taken some licks in the press. However, it's clear that Java developers are catching on to its benefits. The JavaOne bookstore published its daily top-10 sales during the show. The picture on this post shows the Day 2 list with two Groovy titles in the top 10 (at places 5 and 8). Overall, the Groovy bible, Groovy in Action, came in at number 5 for the show. Interest is definitely growing.

If you haven't tried Groovy yourself, it's definitely worth a look. Here are a couple of good overviews:

Wednesday, May 16, 2007

Unit Testing Private Variables and Functions

How do you write unit tests to exercise private functions and check on private variables? For my projects, I have relied on a technique of adding special testing-only methods to my classes. These methods all have names that begin with FTO_ (for testing only). My regular code may not call these functions. Eventually, I'll write a rule that code-checkers can enforce to make sure that these violations of data hiding don't accidentally appear in non-test code.
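To make the pattern concrete, here's a hypothetical example (the class and its logic are invented for illustration; only the FTO_ convention is the point):

```groovy
class RateCalculator {
    private BigDecimal discount = 0.10

    BigDecimal priceFor(BigDecimal base) {
        base - (base * discount)
    }

    // For testing only: production code must never call these.
    // The FTO_ prefix makes violations easy for a code-checker to flag.
    BigDecimal FTO_getDiscount() { discount }
    void FTO_setDiscount(BigDecimal d) { discount = d }
}

// In a unit test, the private field becomes reachable:
def calc = new RateCalculator()
calc.FTO_setDiscount(0.25)
assert calc.priceFor(100.0) == 75.0
```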

However, for a long time I've wanted to know if there is a better way to do this. So, I did what most good programmers do--I asked someone who knows testing better than I do. Which meant talking to the ever-kind Jeff Frederick, who is the main committer of the popular CI server CruiseControl (and the head of product development at Agitar).

Jeff contended that the problem is really one of code design. If all methods are short and specific, then it should be possible to test a private variable by probing the method that uses it. Or said another way: if you can't get at the variable to test it, chances are it's buried in too much code. (Extract Method, I have long believed, is the most important refactoring.)

Likewise private methods. Make 'em small, have them do only one thing, and call them from accessible methods.
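Here's a hypothetical sketch of that advice in action. The surcharge field is private, but because each method is small and single-purpose, a test can verify it entirely through the public total() method:

```groovy
class Invoice {
    private List<BigDecimal> items = []
    private BigDecimal surcharge = 10.0   // private state we want to verify

    void addItem(BigDecimal amount) { items << amount }

    // One accessible method fronts the small private helpers below.
    BigDecimal total() {
        subtotal() + appliedSurcharge()
    }

    private BigDecimal subtotal() {
        items.sum() ?: 0.0
    }

    // Waive the surcharge on orders over 100.
    private BigDecimal appliedSurcharge() {
        subtotal() > 100.0 ? 0.0 : surcharge
    }
}

// The test probes the private surcharge indirectly:
def inv = new Invoice()
inv.addItem(50.0)
assert inv.total() == 60.0    // surcharge applied
inv.addItem(75.0)
assert inv.total() == 125.0   // subtotal over 100, surcharge waived
```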

I've spent a week noodling around with this sound advice. It appeals to me because almost invariably when I refactor code to make it more testable, I find that I've improved it. So far, Jeff is mostly right. I can eliminate most situations by cleaning up code. However, there are a few routines that look intractable. While I work at finding a better way to refactor them (a constant quest of mine, actually), I am curious to know how you solve this problem.

Thursday, May 10, 2007

Reusing IDE and SATA Drives: A Solution


Because I review lots and lots of tools, I find myself going through PCs pretty quickly. It's not fair to gauge the performance of a product on old hardware, so each year I buy new PCs. Over the years, I've accumulated lots of IDE drives from the PCs I've discarded. I rarely use them, but every once in a while I would like to know what's on them and whether I can reuse one of them. Unfortunately, this is a time-consuming task, especially hooking up the drive to a PC that can access it.


I recently came across an elegant solution to this problem: the USB 2.0 Universal Drive Adapter from NewerTech. This device comes with a power supply for the HDD and a separate cable that plugs into an ATA-IDE drive, a notebook IDE drive, or a SATA drive. The other end of the cable is a USB plug. So, you attach the cable, power up the drive, and plug the USB end into your PC--and the drive magically pops up as a USB drive on your system, with full read and write capabilities.

I have cleaned up a bunch of IDE drives during the last week using this adapter. In the process, I've discovered it has some limitations. It did not work well on older drives. Some would not power up (though they did start up when I swapped them into a PC), and others did not handle reads and writes well (Windows generated write errors), although it's hard to know whether the errors came from the drive or the adapter. But for most drives from the last few years, the product worked without a hitch. Neat solution and, at $24.95 retail, a no-brainer purchase.

Further note: I increasingly use virtualization for my testbed infrastructure. When I'm done with a review, I archive the VMs. This keeps my disks mostly pristine and reusable, so I am not discarding disks with old PCs nearly as much as before. The fact that drives today are > 500GB also helps. ;-)