Posted by Seamus on Tuesday, August 21, 2012.

How to parse quotes in Ragel (and Ruby)

The key to parsing quotes in Ragel is ([^'\\] | /\\./)* as found in the rlscan example. Think of it as ( not_quote_or_escape | escaped_something )*.
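
If you want to get a feel for the pattern before touching Ragel, here is a rough Ruby Regexp analogue. It is only a sketch: the variable names are made up for this illustration, and the m flag is there on the assumption that Ragel's dot matches any character, newlines included.

```ruby
# Rough Ruby Regexp analogue of ' ( [^'\\] | /\\./ )* '
# single_quoted is a hypothetical name used only for this illustration.
single_quoted = /'((?:[^'\\]|\\.)*)'/m

# %q{} keeps the backslash, so the input really contains \' in the middle.
input = %q{say 'it\'s here' now}

match = single_quoted.match(input)
body  = match[1]   # everything between the opening and closing quote

puts body
```

The first alternative consumes ordinary characters, and the second consumes a backslash together with whatever follows it, so an escaped quote can never be mistaken for the closing one.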

Making it work with single and double quotes

Here’s the heart of a working example that covers both single and double quotes:

%%{
  machine not_scanner;

  action Start {
    s = p
  }
  action Stop {
    quoted_text = data[s...p].pack('c*')
    # do something with the quoted text!
  }

  squote = "'";
  dquote = '"';
  not_squote_or_escape = [^'\\];
  not_dquote_or_escape = [^"\\];
  escaped_something = /\\./;
  ss = space* squote ( not_squote_or_escape | escaped_something )* >Start %Stop squote;
  dd = space* dquote ( not_dquote_or_escape | escaped_something )* >Start %Stop dquote;

  main := (ss | dd)*;
}%%
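
The pack('c*') call in the Stop action suggests that data holds byte values rather than a String. Here is a minimal sketch of just that slicing step, assuming the host program unpacked the input first. The values of s and p are written in by hand to stand in for what Start and Stop would record.

```ruby
# Assumption: the host program turned the input String into signed byte values.
input = %q{'ab' "c\"d"}
data  = input.unpack('c*')

# Hand-picked positions standing in for what the Start and Stop actions would see.
s = 1    # first character after the opening single quote
p = 3    # position of the closing single quote
first = data[s...p].pack('c*')

s = 6    # first character after the opening double quote
p = 10   # position of the closing double quote
second = data[s...p].pack('c*')

puts first
puts second
```

The three-dot range excludes p, so the closing quote itself never ends up in the extracted text. The escape backslash does stay in the text, so you still have to unescape it yourself if you need the raw value.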

Why does it work?

Use this example string:

"a\"bc"

Follow it on the graph. Notice the symmetry: the “top” half processes double quotes and the “bottom” half processes single quotes.

[Figure: graph of the state machine]

… tl;dr …

   "      a     \     "     b      c      "
➇  →  ➁  →  ➂  →  ➃  →  ➂  →  ➂  →  ➂  →  ➇
                     BAM!

State ➃ eats the escaped double quote, so that quote never gets a chance to close the string and stop the machine. That’s the key!
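
The same walk can be sketched by hand in plain Ruby. This is a simplification, not what Ragel generates: the state names are hypothetical, they do not map one-to-one onto the numbered states in the graph, and unlike the real machine this version simply ignores stray characters outside quotes.

```ruby
# Hand-written walk through the double-quote path.
# State names are hypothetical and are not the numbered states from the graph.
def trace_double_quoted(input)
  state = :outside
  trace = [state]
  input.each_char do |ch|
    state =
      case state
      when :outside
        ch == '"' ? :body : :outside
      when :body
        if ch == '\\' then :escaped      # a backslash starts an escape
        elsif ch == '"' then :outside    # an unescaped quote closes the string
        else :body
        end
      when :escaped
        :body                            # swallow whatever follows the backslash
      end
    trace << state
  end
  trace
end

trace = trace_double_quoted(%q{"a\"bc"})
puts trace.join(' -> ')
```

The :escaped state has exactly one way out, back into the body, no matter which character arrives. That is why the quote after the backslash cannot end the string.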

You can also do it with a scanner

Here’s what you would do in a scanner:

%%{
  machine scanner;

  action GotOne {
    quoted_text = data[(ts+1)...(te-1)].pack('c*')
    # do something with quoted text!
  }

  squote = "'";
  dquote = '"';
  not_squote_or_escape = [^'\\];
  not_dquote_or_escape = [^"\\];
  escaped_something = /\\./;

  main := |*
    squote ( not_squote_or_escape | escaped_something )* squote => GotOne;
    dquote ( not_dquote_or_escape | escaped_something )* dquote => GotOne;
    any;
  *|;
}%%
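
For comparison, here is roughly the same scanning loop written with Ruby's StringScanner from the standard library. It is a sketch under a few assumptions: extract_quoted is a made-up name, and trying the quoted pattern before falling back to a single character is only an approximation of how a Ragel scanner picks the longest match.

```ruby
require 'strscan'

# extract_quoted is a hypothetical helper for this sketch.
def extract_quoted(input)
  # Single-quoted or double-quoted string, each allowing backslash escapes.
  quoted  = /'(?:[^'\\]|\\.)*'|"(?:[^"\\]|\\.)*"/m
  scanner = StringScanner.new(input)
  found   = []
  until scanner.eos?
    if (token = scanner.scan(quoted))
      found << token[1...-1]   # drop the surrounding quotes, like (ts+1)...(te-1)
    else
      scanner.getch            # plays the role of the `any` fallback
    end
  end
  found
end

results = extract_quoted(%q{x = 'it\'s' and "say \"hi\""})
results.each { |text| puts text }
```

As in the Ragel version, everything that is not part of a quoted string is skipped one character at a time.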

What blog is this?

Safety in Numbers is Brighter Planet's blog about climate science, Ruby, Rails, data, transparency, and, well, us.
