ParseTree: February 2005 Archives
drbrain (9:42): my machine just did factorial of 5!
zenspider (9:42): yay!
drbrain (9:43): it takes just under 1 second to pull the ParseTree of demo/factorial.rb all the way out to a result
zenspider (9:44): are you shitting me???
ONE SECOND?
drbrain (9:45): $ time ruby run.rb demo/factorial.rb
Machine Stack: [120]
Call Stack: #<Stack:0x31c6ec @sp=0, @fp=0, @stack=[:stack_empty, 3, 120, :stack_empty, :stack_empty, :stack_empty, :stack_empty, 79, 120, :stack_empty, :stack_empty, :stack_empty, :stack_empty, 22, 120, :stack_empty, :stack_empty, :stack_empty, :stack_empty]>
Returned: 120
real 0m1.053s
user 0m0.810s
sys 0m0.040s
drbrain (9:45): now I need to get it simplified down enough to run through ruby2c
So... drbrain comes up to me in an IM and says flgr is saying it'd be really cool if you could ask a method for its source. I know what he is doing, baiting me like that, but I play along anyway to see where it leads. drbrain and I talked about it and thought it'd be really cool if our ruby2c system added a to_c method to the Method class. That isn't hard at all really, so we added:
class Method
  # with_class_and_method_name is a silly method.
  # Implementation is an exercise for the reader.
  def to_sexp
    with_class_and_method_name do |klass, method|
      ParseTree.new.parse_tree_for_method(klass, method)
    end
  end
  def to_c
    with_class_and_method_name do |klass, method|
      RubyToC.translate(klass, method)
    end
  end
end
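One way the helper left as an exercise could look is sketched below. It is not the implementation used here: it assumes a Ruby recent enough to provide Method#owner and Method#name, and the Greeter class is a hypothetical stand-in so the snippet runs without ParseTree or ruby2c.

```ruby
# Sketch only: assumes Method#owner and Method#name are available.
class Method
  # Yields the module that defines the method and the method's name,
  # which are the two arguments the to_sexp and to_c bodies pass along.
  def with_class_and_method_name
    yield owner, name
  end
end

# Hypothetical class used only to exercise the helper.
class Greeter
  def greet(name)
    "Hello, #{name}"
  end
end

Greeter.new.method(:greet).with_class_and_method_name do |klass, method|
  puts "#{klass}##{method}"
end
```

Because the helper returns whatever the block returns, to_sexp and to_c can hand their results straight back to the caller.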
But the question came up... can we do this to display ruby code? The answer is yes, and it only took me about 30 minutes to get the proof of concept up and running. First, the example code:
class Example
  def example(arg1)
    return "Blah: " + arg1.to_s
  end
end
e = Example.new
puts "sexp:"
p e.method(:example).to_sexp
puts "C:"
puts e.method(:example).to_c
puts "Ruby:"
puts e.method(:example).to_ruby
and now the output:
sexp:
[:defn, :example, [:scope, [:block, [:args, :arg1], [:return, [:call, [:str, "Blah: "], :+, [:array, [:call, [:lvar, :arg1], :to_s]]]]]]]
C:
str
example(long arg1) {
  return strcat("Blah: ", to_s(arg1));
}
Ruby:
def example(arg1)
  return "Blah: " + arg1.to_s
end
Cool, huh? I can now translate any method to C or get the ruby code for it (sans comments, unfortunately) simply by calling to_c or to_ruby on the method itself!
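To give a feel for how a sexp can be turned back into ruby source, here is a miniature walker. It is a hypothetical sketch, not the to_ruby code: the sexp_to_ruby name is made up, and it only knows the handful of node types that appear in the example sexp above.

```ruby
# Hypothetical miniature of a sexp-to-Ruby walker. It handles only the node
# types found in the example sexp and raises on anything else.
def sexp_to_ruby(node)
  type, *rest = node
  case type
  when :defn
    name, scope = rest
    block = scope[1]                 # [:scope, [:block, ...]]
    args  = block[1][1..-1]          # [:args, :arg1] -> [:arg1]
    body  = block[2..-1].map { |stmt| "  " + sexp_to_ruby(stmt) }
    (["def #{name}(#{args.join(', ')})"] + body + ["end"]).join("\n")
  when :return then "return #{sexp_to_ruby(rest[0])}"
  when :str    then rest[0].inspect
  when :lvar   then rest[0].to_s
  when :call
    receiver, message, arglist = rest
    args = arglist ? arglist[1..-1].map { |a| sexp_to_ruby(a) } : []
    if message.to_s =~ /\A\W+\z/     # operator such as :+ or :==
      "#{sexp_to_ruby(receiver)} #{message} #{args.join(', ')}"
    elsif args.empty?
      "#{sexp_to_ruby(receiver)}.#{message}"
    else
      "#{sexp_to_ruby(receiver)}.#{message}(#{args.join(', ')})"
    end
  else
    raise ArgumentError, "unhandled node type: #{type.inspect}"
  end
end

sexp = [:defn, :example, [:scope, [:block, [:args, :arg1],
         [:return, [:call, [:str, "Blah: "], :+,
           [:array, [:call, [:lvar, :arg1], :to_s]]]]]]]
puts sexp_to_ruby(sexp)
```

Each node type gets one small branch, so covering more of the language is mostly a matter of adding more cases.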
Refax, the Automatic Refactoring Engine, is a very cool proof of concept that uses ParseTree to discover redundant code and suggest a refactoring. The only problem with it is that it outputs raw sexp as the suggested refactoring:
Suggest refactoring weirdfunc in HastilyWritten: [:block, [:args], [:while, [:vcall, :keepgoing], [:block, [:fcall, :puts, [:array, [:str, "This is a weird loop"]]], [:fcall, :doSomethingWeird]]]]
Not very comprehensible. I spent some time with it and plugged in RubyToRuby (more on that coming soon) so that it now outputs:
Suggest refactoring HastilyWritten#weirdfunc from:
def weirdfunc()
  puts("This is a weird loop")
  doSomethingWeird()
  begin
    puts("This is a weird loop")
    doSomethingWeird()
  end while keepgoing
end
to:
def weirdfunc()
  begin
    puts("This is a weird loop")
    doSomethingWeird()
  end while keepgoing
end
Reworked Refax
require 'parse_tree'
require 'ruby_to_ruby'

class Refax
  # Is p[ind] a while loop whose body we know how to inspect?
  def couldPossiblyRefactor?(p, ind)
    return false unless p[ind].is_a?(Array)
    return false unless p[ind].first == :while
    return false if p[ind][-1] == :post
    return true unless p[ind][2].is_a?(Array)
    p[ind][2].first == :block
  end

  # Number of statements in the loop body.
  def howManyInsn(p)
    fail "Must be a while, not a #{p}" unless p.first == :while
    if p[2].is_a?(Array)
      fail unless p[2].first == :block
      p[2].size - 1
    else
      1
    end
  end

  # The loop body as an array of statements.
  def grabInsnArray(p)
    fail "Must be a while, not a #{p}" unless p[0] == :while
    if p[2].is_a?(Array)
      p[2][1..-1]
    else
      [p[2]]
    end
  end

  def isEquiv(a, b)
    a.to_s == b.to_s
  end

  # Copy of p without the statements that duplicate the loop body.
  def fixcode(p, ind)
    loopsize = howManyInsn(p[ind])
    goodcode = p.clone
    goodcode.slice!(ind-loopsize..ind-1)
    goodcode # todo : make correcter
  end

  def recurseOn(p)
    if p.is_a?(Array)
      @lastclass = p[1] if p.first == :class
      @lastfunc = p[1] if p.first == :defn
      p.each { |i| recurseOn(i) }
      p.each_index do |ind|
        if couldPossiblyRefactor?(p,ind)
          loopsize = howManyInsn(p[ind])
          if loopsize < ind
            # Do the statements just before the loop repeat its body?
            if isEquiv(p[ind-loopsize,loopsize], grabInsnArray(p[ind]))
              goodstuff = fixcode(p, ind)
              puts "Suggest refactoring #{@lastclass}##{@lastfunc} from:"
              puts
              puts RubyToRuby.translate(eval(@lastclass.to_s), @lastfunc)
              print "\nto:\n\n"
              puts RubyToRuby.new.process(s(:defn, @lastfunc, s(:scope, goodstuff)))
            end
          end
        end
      end
    end
  end

  def refactor(c)
    fail "Must have class or module" unless c.is_a?(Module)
    p = ParseTree.new.parse_tree(c)
    recurseOn(p)
  end
end

r = Refax.new
ObjectSpace.each_object(Module) { |c|
  r.refactor(c)
}
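The core check can be tried on plain arrays without ParseTree or RubyToRuby. The sketch below is a simplified stand-in, not the Refax code: loop_body and hoist_duplicates are hypothetical names, the input sexp is hand-written to resemble the weirdfunc example, and statements are compared with == rather than via to_s.

```ruby
# Statements of a while node's body, whether it is a :block or a single node.
def loop_body(while_node)
  body = while_node[2]
  body.is_a?(Array) && body.first == :block ? body[1..-1] : [body]
end

# Returns a copy of +block+ with statements removed when they duplicate the
# body of the while node that immediately follows them; nil if none do.
def hoist_duplicates(block)
  block.each_index do |i|
    node = block[i]
    next unless node.is_a?(Array) && node.first == :while
    body = loop_body(node)
    next unless body.size < i                       # index 0 is the :block tag
    next unless block[i - body.size, body.size] == body
    cleaned = block.dup
    cleaned.slice!(i - body.size, body.size)
    return cleaned
  end
  nil
end

# Hand-written input (an assumption): two statements followed by a loop
# whose body repeats them.
original = [:block, [:args],
  [:fcall, :puts, [:array, [:str, "This is a weird loop"]]],
  [:fcall, :doSomethingWeird],
  [:while, [:vcall, :keepgoing],
    [:block,
      [:fcall, :puts, [:array, [:str, "This is a weird loop"]]],
      [:fcall, :doSomethingWeird]]]]

p hoist_duplicates(original)
```

The result keeps only the args node and the loop, which is the shape of the raw suggestion shown earlier.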
I am releasing ParseTree 1.3.3 today in preparation for our ruby2c release (also today). Changes in ParseTree are minor, but necessary for ruby2c.
ParseTree is a C extension (using RubyInline) that extracts the parse tree for an entire class or a specific method and returns it as an s-expression (aka sexp) using Ruby's arrays, strings, symbols, and integers.
As an example:
def conditional1(arg1)
  if arg1 == 0 then
    return 1
  end
  return 0
end
becomes:
[:defn,
 :conditional1,
 [:scope,
  [:block,
   [:args, :arg1],
   [:if,
    [:call, [:lvar, :arg1], :==, [:array, [:lit, 0]]],
    [:return, [:lit, 1]],
    nil],
   [:return, [:lit, 0]]]]]
Features/Problems:
- Uses RubyInline, so it just drops in.
- Includes SexpProcessor and CompositeSexpProcessor.
  - Allows you to write very clean filters.
- Includes show.rb, which lets you quickly snoop code.
- Includes abc.rb, which lets you get abc metrics on code.
  - abc metrics = numbers of assignments, branches, and calls.
  - whitespace independent metric for method complexity.
- Only works on methods in classes/modules, not arbitrary code.
- Does not work on the core classes, as they are not ruby (yet).
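As a rough illustration of the abc idea, the sketch below walks a plain array sexp and tallies node types. It is not abc.rb: the abc_counts name and the node types placed in each bucket are assumptions made for the example.

```ruby
# Which node types land in which bucket is an assumption for illustration,
# not the list abc.rb actually uses.
ASSIGNMENTS = [:lasgn, :iasgn, :gasgn, :cvasgn]
BRANCHES    = [:if, :while, :until, :case]
CALLS       = [:call, :fcall, :vcall]

# Recursively counts assignment, branch, and call nodes in a sexp.
def abc_counts(sexp, counts = { assignments: 0, branches: 0, calls: 0 })
  return counts unless sexp.is_a?(Array)   # symbols, literals, and nil are leaves
  head = sexp.first
  counts[:assignments] += 1 if ASSIGNMENTS.include?(head)
  counts[:branches]    += 1 if BRANCHES.include?(head)
  counts[:calls]       += 1 if CALLS.include?(head)
  sexp.each { |child| abc_counts(child, counts) }
  counts
end

conditional1 = [:defn, :conditional1,
  [:scope,
    [:block,
      [:args, :arg1],
      [:if,
        [:call, [:lvar, :arg1], :==, [:array, [:lit, 0]]],
        [:return, [:lit, 1]],
        nil],
      [:return, [:lit, 0]]]]]

p abc_counts(conditional1)
```

For the conditional1 sexp this tallies no assignments, one branch (the if), and one call (the ==). Because it counts nodes rather than lines, reformatting the source does not change the numbers.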
Changes:
- 3 minor enhancements
  - Cleaned up parse_tree_abc output
  - Patched up null class names (delegate classes are weird!)
  - Added UnknownNodeError and switched SyntaxError over to it.
- 2 bug fixes
  - Fixed BEGIN node handling to recurse instead of going flat.
  - FINALLY fixed the weird compiler errors seen on some versions of gcc 3.4.x related to type-punned pointers.
