Worst Regexp I've Ever Written

This was part of a Parse::RecDescent grammar (or it might have been ANTLR by then) for a QA scripting tool. The rule matched most Perl 5 regexps. Note that nearly everything is double-backslashed for the benefit of the parser generator; the doubling is not part of the regexp itself.

  regex           : m/(
                       \\/(\\\\\\/|[^\\/])+?\\/
                  |   m\\((\\\\\\)|[^\\)])+?\\)
                  |   m\\[(\\\\\\]|[^\\]])+?\\]
                  | m\\\{(\\\\\\\}|[^\\\}])+?\\\}
                  |   m(.)(\\\\\\6|[^\\6])+?\\6)/x
                  | <error>

I'm mainly blogging this because I keep losing it. I think this is the correct unbackslashification:

m/  (\/(\\\/|[^\/])+?\/
 |  m\((\\\)|[^\)])+?\)
 |  m\[(\\\]|[^\]])+?\]
 |  m\{(\\\}|[^\}])+?\}
 | m(.)(\\\6|[^\6])+?\6)/x
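A quick way to double-check the unbackslashification mechanically, sketched in Python. This assumes the parser generator simply treats a backslash as escaping whatever character follows it (that is an assumption, not something checked against the Parse::RecDescent or ANTLR docs), and `unbackslash` is a made-up helper name for illustration:

```python
import re

def unbackslash(s: str) -> str:
    # Hypothetical helper: strip one level of escaping, so each
    # backslash-plus-character pair collapses to just that character.
    # Assumes the parser generator's escaping works this way.
    return re.sub(r"\\(.)", r"\1", s)

# The five alternatives from the grammar rule, copied as raw strings.
doubled = [
    r"\\/(\\\\\\/|[^\\/])+?\\/",
    r"m\\((\\\\\\)|[^\\)])+?\\)",
    r"m\\[(\\\\\\]|[^\\]])+?\\]",
    r"m\\\{(\\\\\\\}|[^\\\}])+?\\\}",
    r"m(.)(\\\\\\6|[^\\6])+?\\6",
]

for line in doubled:
    print(unbackslash(line))
```

Under that assumption, the printed lines should agree with the hand-unescaped alternatives above, including the brace line where the generator needed a third backslash.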

Still gross as hell.

2 Comments

That's some lovely ASCII art.

And people complain about Elisp regex verbosity...
