Safety in Numbers
Brighter Planet's blog
How to parse quotes in Ragel (and Ruby)
The key to parsing quotes in Ragel is ([^'\\] | /\\./)* as found in the rlscan
example. Think of it as ( not_quote_or_escape | escaped_something )*.
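For readers more at home with Ruby regular expressions, the same idea can be sketched as a plain Regexp. This is only an analogue of the Ragel pattern, not something Ragel generates; the constant and method names are made up for illustration, and the /m flag is an assumption so that an escaped newline is accepted too.

```ruby
# Regexp analogue of ( not_quote_or_escape | escaped_something )* for single quotes.
# SINGLE_QUOTED_SKETCH and first_single_quoted are hypothetical names.
SINGLE_QUOTED_SKETCH = /'((?:[^'\\]|\\.)*)'/m

# Returns the text between the first pair of single quotes, or nil if there is none.
# Escape sequences are left untouched (the backslash stays in the result).
def first_single_quoted(text)
  match = SINGLE_QUOTED_SKETCH.match(text)
  match && match[1]
end
```

Because the first alternative excludes the backslash, a backslash can only be consumed together with the character after it, so an escaped quote never ends the match.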
Making it work with single and double quotes
Here’s the heart of a working example that covers both single and double quotes:
%%{
  machine not_scanner;

  action Start {
    s = p
  }
  action Stop {
    quoted_text = data[s...p].pack('c*')
    # do something with the quoted text!
  }

  squote = "'";
  dquote = '"';
  not_squote_or_escape = [^'\\];
  not_dquote_or_escape = [^"\\];
  escaped_something = /\\./;

  ss = space* squote ( not_squote_or_escape | escaped_something )* >Start %Stop squote;
  dd = space* dquote ( not_dquote_or_escape | escaped_something )* >Start %Stop dquote;

  main := (ss | dd)*;
}%%
Why does it work?
Use this example string:
"a\"bc"
Follow it on the graph: (notice the symmetry… the “top” processes double quotes and the “bottom” processes single quotes)
… tl;dr …
" a \ " b c "
➇ → ➁ → ➂ → ➃ → ➂ → ➂ → ➂ → ➇
BAM!
State ➃ is eating the escaped double quote, so the machine never mistakes it for the closing quote and stops early. That's the key!
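To watch the same walk happen in plain Ruby, here is a hand-rolled sketch of the double-quote half of the machine. It is not what Ragel emits: the method name and the state names (:outside, :inside, :escaped) are invented for illustration and do not map one-to-one onto the numbered states in the graph.

```ruby
# Hand-written sketch of the double-quote path; all names here are hypothetical.
# Returns [quoted_strings, trace], where trace lists the state after each character.
def scan_double_quoted(text)
  state = :outside
  current = nil
  found = []
  trace = [state]

  text.each_char do |ch|
    case state
    when :outside
      if ch == '"'
        current = String.new   # start collecting a new quoted string
        state = :inside
      elsif ch !~ /\s/
        raise ArgumentError, "unexpected #{ch.inspect} outside quotes"
      end
    when :inside
      if ch == "\\"
        current << ch          # keep the backslash, then consume one more character
        state = :escaped
      elsif ch == '"'
        found << current       # an unescaped quote closes the string
        state = :outside
      else
        current << ch
      end
    when :escaped
      current << ch            # whatever follows the backslash is just content
      state = :inside
    end
    trace << state
  end

  [found, trace]
end
```

Feeding it the example string, the only visit to :escaped is on the backslash, and the quote right after it sends the machine back to :inside instead of ending the string.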
You can also do it with a scanner
Here’s what you would do in a scanner:
%%{
  machine scanner;

  action GotOne {
    quoted_text = data[(ts+1)...(te-1)].pack('c*')
    # do something with quoted text!
  }

  squote = "'";
  dquote = '"';
  not_squote_or_escape = [^'\\];
  not_dquote_or_escape = [^"\\];
  escaped_something = /\\./;

  main := |*
    squote ( not_squote_or_escape | escaped_something )* squote => GotOne;
    dquote ( not_dquote_or_escape | escaped_something )* dquote => GotOne;
    any;
  *|;
}%%
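If you just want to experiment with the scanner's behavior without running Ragel, a rough stand-in can be built on Ruby's standard StringScanner. This is a sketch under assumptions, not a translation of the generated code: the constant and method names are hypothetical, and the `getch` branch plays the role of the `any;` rule by skipping one character when no quoted string starts at the current position.

```ruby
require "strscan"

# Either a single-quoted or a double-quoted string, with backslash escapes.
# QUOTED_SKETCH and each_quoted are hypothetical names.
QUOTED_SKETCH = /'((?:[^'\\]|\\.)*)'|"((?:[^"\\]|\\.)*)"/m

# Collects the contents of every quoted string in text, without the surrounding quotes.
def each_quoted(text)
  results = []
  scanner = StringScanner.new(text)
  until scanner.eos?
    if scanner.scan(QUOTED_SKETCH)
      # Only one of the two groups participates in any given match.
      results << (scanner[1] || scanner[2])
    else
      scanner.getch   # like the scanner's `any;` rule: skip one character
    end
  end
  results
end
```

As in the Ragel version, the quotes themselves are dropped from the result, and an opening quote with no closing partner is simply skipped over one character at a time.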