Recently in flay Category

I said the following in my previous post about flay:

"Soon I will write up why flay kicks towelee [sic], PMD, and everyone else's tool in the ass... But I think the above is a damn good start."

There are going to be two classes of tools for this type of work: string-based tools and AST-based tools.

Towelie

Towelie is Giles' entry into the fray of ruby developer tools. Towelie is an AST-based tool that uses ParseTree to detect duplicate code at the method level.

There aren't nearly as many such tools available to us rubyists, so it is worth a peek... But, after closer inspection, I just can't compare the two. Yes, towelie attempts to detect duplicated (but not similar) code, but that is where the similarities end, so the comparison doesn't seem fair.

Consider it an exercise for the reader.

PMD's CPD, Simian, Same, etc.

On the CPD page it says that the current version "was rewritten by Steve Hawkins to use the Karp-Rabin string matching algorithm". In other words, it is a string-based tool. This family of tools has completely different objectives than flay. They're normalizing whitespace, stripping comments, and looking for duplicate code. That's great... it actually finds lots and lots of good stuff and I used to use Same when starting with new clients.
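For the curious, here is a minimal sketch of the rolling-hash idea behind Karp-Rabin matching. It is not CPD's code; the method name rabin_karp and the base: and mod: parameters are my own assumptions, picked only for illustration.

```ruby
# Karp-Rabin substring search (illustrative sketch, not CPD's implementation).
# Returns the byte index of the first occurrence of pattern in text, or nil.
def rabin_karp(text, pattern, base: 256, mod: 1_000_003)
  t   = text.bytes
  pat = pattern.bytes
  n   = t.size
  m   = pat.size
  return 0 if m.zero?
  return nil if m > n

  # base**(m - 1) % mod, used to remove the leading byte from the window hash
  high = 1
  (m - 1).times { high = (high * base) % mod }

  # Hash the pattern and the first window of the text
  pat_hash = 0
  win_hash = 0
  m.times do |i|
    pat_hash = (pat_hash * base + pat[i]) % mod
    win_hash = (win_hash * base + t[i]) % mod
  end

  (0..n - m).each do |i|
    # Hashes can collide, so confirm with a real comparison
    return i if pat_hash == win_hash && t[i, m] == pat
    next if i == n - m

    # Slide the window: drop t[i], append t[i + m]
    win_hash = ((win_hash - t[i] * high) * base + t[i + m]) % mod
  end

  nil
end
```

The point is that the whole thing operates on bytes of text. It has no idea what a method or a block is.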

But... (there is always a but...)

These tools would (could!) NEVER point this out:

Matches found in :defn (mass = 32)
  A: ../../drawr/dev/lib/drawr.rb:38
  B: ../../png/dev/lib/png.rb:181

A: def write(file)
B: def save(path)
A:   File.open(file, "wb") { |f| f.write(to_s) }
B:   File.open(path, "wb") { |f| f.write(to_blob) }
   end

especially considering the code is actually written like this:

def write(file); File.open(file, 'wb') { |f| f.write to_s }; end

vs:

def save(path)
  File.open path, 'wb' do |f|
    f.write to_blob
  end
end

This is something that a string-based duplicate code scanner just can't do. Even the simplest change like {} vs do/end or changing your line wrapping on long conditionals will be missed... lost... ignored.
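A rough sketch of why: below is roughly the kind of normalization a string-based scanner might do (the normalize helper is hypothetical and deliberately naive, not taken from any of the tools above). Even after I rename everything by hand so the two methods use the same names, the normalized text still differs.

```ruby
# Hypothetical, naive normalizer: strip comments and collapse whitespace.
# (The comment strip would also mangle "#{...}" inside strings; fine for a sketch.)
def normalize(source)
  source.lines
        .map { |line| line.sub(/#.*/, "") }
        .join(" ")
        .gsub(/\s+/, " ")
        .strip
end

one_liner = "def write(file); File.open(file, 'wb') { |f| f.write to_s }; end"

multi_line = <<~RUBY
  def save(path)
    File.open path, 'wb' do |f|
      f.write to_blob
    end
  end
RUBY

# Give the scanner a head start: make the names identical by hand.
renamed = multi_line.gsub("save", "write").gsub("path", "file").gsub("to_blob", "to_s")

puts normalize(one_liner)
puts normalize(renamed)
puts normalize(one_liner) == normalize(renamed)
```

The last line prints false: the semicolons, parens, and {} vs do/end are still in the text, so a comparison of strings never sees these as the same method.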

So, flay has the ability to go beyond simple copy/pasted code and detect real candidates for refactoring. That is something that the java folks don't seem to have (yet) for some reason. Certainly the foundation set by PMD means it is within reach, but it isn't there yet.

Flay is blowing my mind


No, really. Flay is blowing my mind. It is just getting cooler and cooler. The next version of flay is going to have a verbose mode that will try to show an N-way diff:

Matches found in :when (mass = 84)
  A: unit/itemconfig.rb:182
  B: unit/itemconfig.rb:192
  C: unit/itemconfig.rb:207

A: when /^(#{__item_numstrval_optkeys(tagid(tagOrId)).join("|")})$/ then
B: when /^(#{__item_listval_optkeys(tagid(tagOrId)).join("|")})$/ then
C: when /^(#{__item_strval_optkeys(tagid(tagOrId)).join("|")})$/ then
A:   num_or_str(tk_call_without_enc(*(__item_cget_cmd(tagid(tagOrId)) << "-#{option}")))
B:   simplelist(tk_call_without_enc(*(__item_cget_cmd(tagid(tagOrId)) << "-#{option}")))
C:   _fromUTF8(tk_call_without_enc(*(__item_cget_cmd(tagid(tagOrId)) << "-#{option}")))

And Patrick Ritchie recently submitted additional similarity reporting that I'm going to work on folding in soon. His version allows extra fuzzier matching of copy-pasted code that has been edited!

Soon I will write up why flay kicks towelee, PMD, and everyone else's tool in the ass... But I think the above is a damn good start.

Flay analyzes ruby code for structural similarities. Differences in literal values, variable, class, and method names, whitespace, programming style, braces vs do/end, etc. are all ignored, making this totally rad.
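To make "structural similarity" concrete, here is a toy sketch. The s-expressions are written by hand and only loosely modeled on ParseTree-style output, and the shape and mass helpers are my own illustration, not flay's actual implementation.

```ruby
# Hand-written s-expressions for the two methods (assumed shapes, not real parser output).
write_sexp =
  [:defn, :write, [:args, :file],
   [:iter,
    [:call, [:const, :File], :open, [:lvar, :file], [:str, "wb"]],
    [:args, :f],
    [:call, [:lvar, :f], :write, [:call, nil, :to_s]]]]

save_sexp =
  [:defn, :save, [:args, :path],
   [:iter,
    [:call, [:const, :File], :open, [:lvar, :path], [:str, "wb"]],
    [:args, :f],
    [:call, [:lvar, :f], :write, [:call, nil, :to_blob]]]]

# A structurally different method, for contrast.
noop_sexp = [:defn, :noop, [:args], [:nil]]

# Keep only node types and nesting; drop names and literal values.
def shape(sexp)
  [sexp.first, *sexp.grep(Array).map { |child| shape(child) }]
end

# Count the nodes in a tree, a rough stand-in for the "mass" in the report.
def mass(sexp)
  1 + sexp.grep(Array).sum { |child| mass(child) }
end

puts shape(write_sexp) == shape(save_sexp)  # same structure despite different names
puts shape(write_sexp) == shape(noop_sexp)  # different structure
puts mass(write_sexp)
```

Because the comparison only looks at node types and nesting, the names write vs save, file vs path, and to_s vs to_blob never enter into it.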

Changes:

1.0.0 / 2008-11-06