Whether you realize it or not, you need a tool that finds duplicate source code in your applications. In fact, if you’ve never used one before, you probably don’t realize how much you need an automated solution to this problem. It’s nearly impossible to manually locate the types of duplicate code that such a tool can easily bring to the surface. Even if you think you’re intimately aware of an application’s code base, every line of code you write contains the potential to awaken the duplicate code dragon.
To combat the problem, we have Atomiq, which I consider to be the best solution for finding duplicate or similar code in C#, VB.NET, ASPX, Ruby, Python, Java, C, C++, ActionScript, and XAML.
UPDATE 11/9/2010 10:46:26 AM by AD: The promotional code that used to appear on this post has been removed.
How can I make such bold claims? Well, for one, because I know I always write awesome code and yet it’s astonishing how frequently Atomiq says, “No, Alex, you do not always write awesome code.”
Oh, and it happens to be the only code similarity finder I could find that was easy to use for my purposes. You might say nothing else duplicates the Atomiq experience! *SNORT*
Atomiq doesn’t really need this much of an introduction. If you haven’t done so already, it’s an easy, safe, and incredibly eye-opening experience to run Atomiq against any of your projects. Here’s how to get some immediate gratification:
- Download Atomiq from http://getatomiq.com (There’s no real need to spend much time on their website if you just want to get some immediate gratification)
- Get some immediate gratification.
That’s pretty much it as long as we ignore the part where Atomiq points and laughs at all the duplicate code it found!
I was able to quickly gain an extensive understanding of where my duplicate code existed with minimal knowledge of how to use Atomiq. After I finished wiping up the tears, I was able to begin the gratifying process of fixing things.
A personal example
If you’ve read my posts about using delegates to eliminate duplicate code and using IDisposable responsibly, then this example is going to look familiar to you. To be honest, it was Atomiq that led me to the delegate-based design in those posts!
Below you will find two methods that do two different things with an OdbcDataReader. Don’t spend a lot of time trying to figure out what they do.
The first bit of code returns a list of first names from a database.
The second bit of code sends an email notification to a list of email addresses in the same database.
It’s pretty typical code you might find in any application that connects to a database. You can tell the author had good intentions, but it’s not hard to think of a few simple things we could do to make the code better.
But it’s probably not a good idea to just dive in and start refactoring!!!
If you looked at the code above, you might have noticed that the two methods followed very similar patterns. Though the methods do two different things, they’re not that different from each other. In other words, there’s a lot of duplicate code. There might be a hundred instances of that pattern in your application! It would be very difficult to find those instances manually.
Before you attempt to refactor, you might want to use a tool like Atomiq to help you find all of the duplicate code patterns in your code. Finding and eliminating those patterns will help you make better refactoring decisions.
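To make "finding duplicate code patterns" a little more concrete, here is a toy sketch of the general idea. It is an assumption-heavy illustration, not a description of how Atomiq works internally: it slides a window over the non-blank lines of some source text and reports any run of lines that shows up more than once. The function name `find_duplicate_blocks`, the `window` parameter, and the sample text are all made up for this example.

```python
from collections import defaultdict


def find_duplicate_blocks(source: str, window: int = 3) -> dict:
    """Return {block_text: [starting line numbers]} for repeated blocks.

    Toy approach: compare every run of `window` consecutive non-blank
    lines (with surrounding whitespace stripped) by exact text match.
    """
    stripped = [line.strip() for line in source.splitlines()]
    # Remember the original 1-based line number of each non-blank line.
    numbered = [(i + 1, text) for i, text in enumerate(stripped) if text]

    seen = defaultdict(list)
    for start in range(len(numbered) - window + 1):
        chunk = numbered[start:start + window]
        key = "\n".join(text for _, text in chunk)
        seen[key].append(chunk[0][0])

    # Only blocks that occur at least twice count as duplicates.
    return {block: starts for block, starts in seen.items() if len(starts) > 1}


# Hypothetical sample: the same two-line setup appears twice.
SAMPLE = "\n".join([
    "conn = connect()",
    "cur = conn.cursor()",
    "cur.execute(q1)",
    "names = cur.fetchall()",
    "",
    "conn = connect()",
    "cur = conn.cursor()",
    "cur.execute(q2)",
])

duplicates = find_duplicate_blocks(SAMPLE, window=2)
for block, starts in duplicates.items():
    print(f"Repeated at lines {starts}:\n{block}")
```

Even something this naive makes the point: a machine can check every window in a code base without getting bored, and a person can't.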
For example, if we look at the NotifyPeople method from above in Atomiq, we can see from the red lines that there are two other places that have the same pattern as lines 103-115 and 120-131.
Closer inspection shows that, indeed, one of the places in our code that duplicates that pattern is the GetFirstNames method from above. Again, my two previous blog posts, using delegates to eliminate duplicate code and using IDisposable responsibly, go into detail explaining how I chose to solve this particular problem in one of my projects.
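For readers who haven't seen those posts, here is a rough sketch of the general shape of that kind of fix. This is not the code from my project: it is written in Python rather than C#, it uses an in-memory SQLite database in place of ODBC, and a plain callable stands in for a delegate. The `people` table, its columns, and the function names `for_each_row`, `get_first_names`, and `notify_people` are assumptions made up for illustration.

```python
import sqlite3


def for_each_row(conn, sql, action):
    """Run `sql` and pass each row to `action`.

    The connect/execute/loop/cleanup scaffolding lives here once,
    instead of being repeated in every method that reads rows.
    """
    cur = conn.cursor()
    try:
        cur.execute(sql)
        for row in cur:
            action(row)
    finally:
        cur.close()  # always release the cursor, even if action() raises


def get_first_names(conn):
    names = []
    for_each_row(conn, "SELECT first_name FROM people ORDER BY first_name",
                 lambda row: names.append(row[0]))
    return names


def notify_people(conn, send):
    # `send` is whatever delivers the notification (hypothetical here).
    for_each_row(conn, "SELECT email FROM people ORDER BY email",
                 lambda row: send(row[0]))


# Usage with throwaway data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (first_name TEXT, email TEXT)")
conn.executemany("INSERT INTO people VALUES (?, ?)",
                 [("Sam", "sam@example.com"), ("Alex", "alex@example.com")])

first_names = get_first_names(conn)
sent = []
notify_people(conn, sent.append)  # collect addresses instead of emailing
```

The two methods now contain only the part that actually differs between them, which is the whole point of hunting down the shared pattern first.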
I’m not going to go into great detail on how to use Atomiq, as its smart developers have already graciously done this on getatomiq.com. It’s not massively complicated software, which makes getting to know its full extent a quick exercise. I do suggest watching the introductory videos on the homepage if you’re sitting on the toilet with nothing else to do.
The entire Atomiq user interface isn’t exactly what I would call “typical”, so spending some time on their website will help you get the most out of the tool. And because of that unique interface, Atomiq has a few features that you might not discover without the aid of the website.
This wouldn’t be a proper review if I didn’t throw my opinion in the air and wave it like I just don’t care, now would it?
Here’s my bullet list of “other” notes I took during my review of Atomiq.
- I’m not sure why they don’t include the option to download and run a standard installer for Atomiq, but I don’t necessarily find this to be a problem. I chose to xcopy the Atomiq exe out to c:\Utilities\Atomiq.
- If you don’t configure the analyzer’s settings appropriately, the Atomiq user interface might not show anything useful. When this happens, it’s easy to assume you didn’t do something right.
- Minor detail, but I like to look at change logs. At the time of this writing, there isn’t one on the website or included with the Atomiq application.
- The entire user interface isn’t exactly what I would call “standard”. That doesn’t mean it’s a bad/unusable interface, but I’m typically the kind of guy who likes things to look/work the way they do by default. I mean, I never even changed my MySpace theme from the default skin for crying out loud! It just catches me off guard when I can’t ALT+F, N to start a new project, for example.
- When you create a new project in Atomiq, the first thing you are required to do is pick a directory that contains your source code. The Pick button shows a pretty standard directory picker, but I really wish I could locate the directory I want more quickly by pasting the directory from my clipboard. I’m quite ninja-like when it comes to wrangling a computer, so it’s sometimes easier for me to get the directory I want in the clipboard from another application than it is for me to locate the directory with this user interface. But now that I think about it, I don’t think wrangling is something ninjas do.
- It would be great if Atomiq had an MRU list to help me open the files I use often.
- It doesn’t appear that Atomiq is able to find duplicate code patterns that vary only by magic numbers, variable/class names, or syntactic variations. Using a tool like Atomiq most effectively sometimes requires gently massaging your code beforehand. For example, the screenshot below shows that Atomiq doesn’t find any similarities between NotifyPeople above and NotifyPeople2. Yet they are, for all practical purposes, identical.
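To show why renamed copies can slip past an exact-match comparison, here is a toy sketch. It says nothing about Atomiq's internals, and I'm only guessing that a normalization step like this is the sort of thing that would help. It replaces string literals, numbers, and identifiers with placeholders before comparing two lines. The tiny `KEYWORDS` set, the `normalize` function, and the two sample lines are hypothetical.

```python
import re

# A deliberately tiny keyword list; a real tool would know the full grammar.
KEYWORDS = {"for", "in", "if", "else", "while", "return", "def"}
PLACEHOLDERS = {"STR", "NUM", "ID"}

STRING_RE = re.compile(r"\"[^\"]*\"|'[^']*'")
NUMBER_RE = re.compile(r"\b\d+(?:\.\d+)?\b")
WORD_RE = re.compile(r"\b[A-Za-z_]\w*\b")


def normalize(line: str) -> str:
    """Collapse literals and names so only the code's shape remains."""
    line = STRING_RE.sub("STR", line)   # "hello" -> STR
    line = NUMBER_RE.sub("NUM", line)   # 42 -> NUM

    def replace_word(match):
        word = match.group(0)
        if word in KEYWORDS or word in PLACEHOLDERS:
            return word
        return "ID"                     # any other name -> ID

    line = WORD_RE.sub(replace_word, line)
    return re.sub(r"\s+", " ", line).strip()


# Two lines that differ only by names and a magic number.
original = "for person in people: send(person.email, 3)"
renamed = "for p in recipients: notify(p.address, 5)"

print(original == renamed)                        # exact match fails
print(normalize(original) == normalize(renamed))  # shapes match
print(normalize(original))
```

Compared as raw text, the two lines have nothing in common. Compared by shape, they are the same line.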
If everyone on your team uses roughly the same coding and naming guidelines, you might not have to worry about that problem much.
Here’s another example that I think Atomiq might be able to shed some light on one day. There’s clearly a very important similarity between lines 166 and 177 below.
Maybe the developers of Atomiq could provide some kind of clue that there’s an opportunity to perform the following “extract method” refactoring:
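The original listing isn't reproduced here, so as a stand-in, this is a minimal sketch of what an "extract method" refactoring looks like in general. It is written in Python, and every name in it (`greeting_before`, `label_before`, `display_name`, and so on) is invented for illustration: the same expression appears in two functions, so it gets pulled out into one helper that both call.

```python
# Before: the name-formatting expression is repeated in two places.
def greeting_before(first, last):
    return "Hello, " + last.strip().upper() + ", " + first.strip().title()


def label_before(first, last):
    return "Ship to: " + last.strip().upper() + ", " + first.strip().title()


# After: the shared expression is extracted into its own function.
def display_name(first, last):
    """Format a name as 'LAST, First' with stray whitespace removed."""
    return last.strip().upper() + ", " + first.strip().title()


def greeting(first, last):
    return "Hello, " + display_name(first, last)


def label(first, last):
    return "Ship to: " + display_name(first, last)


print(greeting(" alex ", "smith"))
print(label(" alex ", "smith"))
```

The behavior is unchanged, but there is now exactly one place to fix if the formatting rule ever changes.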
Still a winner
Prior to finding Atomiq, I’d never used a code similarity finder. These days, it’s something I use often and can’t imagine living without! It’s such a simple, useful tool, I don’t know why anyone wouldn’t want to use it on their own projects.
Most of the time, you’ll likely use Atomiq as more of a detective tool. It usually finds little pieces of a pattern that, upon closer inspection, are much bigger, more important patterns. So even though Atomiq can’t perform miracles, the tremendous satisfaction that comes along with deleting tons of duplicate code from your application is worth many times its $30 price tag.