1 year ago

#35607


iUknwn

Opening and closing elements in LPEG for Pandoc Reader

I am working on a simple Pandoc reader that can process some of the basic HTML-like syntax used in forums (such as [b]bold[/b] and [h1]Header[/h1]).

I managed to get a basic reader working with LPEG (as described in the pandoc documentation), but the solution I landed on feels clunky. Is there a better way to define the grammar around start and end tags (using things like priorities, negative lookahead, or LPEG groupings)?

Here is what I was able to get working:

local P, S, R, Cf, Cc, Ct, V, Cs, Cg, Cb, B, C, Cmt =
  lpeg.P, lpeg.S, lpeg.R, lpeg.Cf, lpeg.Cc, lpeg.Ct, lpeg.V,
  lpeg.Cs, lpeg.Cg, lpeg.Cb, lpeg.B, lpeg.C, lpeg.Cmt

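-- Character classes and line structure
local whitespacechar = S(" \t\r\n")
local wordchar = (1 - whitespacechar)
local spacechar = S(" \t")
local newline = P"\r"^-1 * P"\n"
-- A newline followed by one or more whitespace-only lines
local blanklines = newline * (spacechar^0 * newline)^1
-- A line break that does not begin a run of blank lines
local endline = newline - blanklines

-- Opening and closing tags
local emph_start = P"[i]"
local emph_end = P"[/i]"
local strong_start = P"[b]"
local strong_end = P"[/b]"
-- The heading level digit is captured and converted to a number
local header_start = P"[h" * (R"17" / tonumber) * "]"
-- The closing level is not checked against the opening one
local header_end = P"[/h" * R"17" * "]"
local tag_start = emph_start + strong_start + header_start
local tag_end = emph_end + strong_end + header_end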

-- Grammar
local G = P{ "Pandoc",
  Pandoc = Ct(V"Block"^0) / pandoc.Pandoc;
  Block = blanklines^0 * (V"Header" + V"Para") ;
  Para = Ct(V"Inline"^1) / pandoc.Para;
  Inline = V"Emph" + V"Strong" + V"Str" + V"Space" + V"SoftBreak" ;
  Str = (1 - (whitespacechar + tag_end + tag_start))^1 / pandoc.Str;
  Space = spacechar^1 / pandoc.Space;
  SoftBreak = endline / pandoc.SoftBreak;
  Emph = emph_start * Ct(V"Inline"^1) * emph_end / pandoc.Emph;
  Strong = strong_start * Ct(V"Inline"^1) * strong_end / pandoc.Strong;
  Header = header_start * Ct(V"Inline"^1) * header_end / pandoc.Header;
}

function Reader(input)
  return lpeg.match(G, input)
end

And here's the kind of text I'd like to transform:

[h1]A Test[/h1]
The [i]quick[/i] dog jumped over the lazy stream!
Tags should be able to be applied [b]mid[/b]word.
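
One direction that might reduce the repetition, sketched here rather than tested inside pandoc: build each start/end pair from a small helper so that an inline rule becomes one line, and use a single pattern for "any tag" as the negative lookahead in Str. The names tagged, anytag, node, and parse_inlines are made up for illustration. node returns plain Lua tables as stand-ins for pandoc.Emph, pandoc.Strong, and so on, so the snippet runs in a standalone Lua interpreter with LPeg installed. Headers and block structure are left out.

```lua
local lpeg = require "lpeg"
local P, S, Ct, V = lpeg.P, lpeg.S, lpeg.Ct, lpeg.V

-- Stand-in for the pandoc constructors (assumption): returns a plain table.
local function node(tag)
  return function(content) return { t = tag, c = content } end
end

local whitespacechar = S(" \t\r\n")
local spacechar = S(" \t")

-- Any opening or closing tag known to this sketch: [i], [/i], [b], [/b].
local anytag = P"[" * P"/"^-1 * S"ib" * P"]"

-- Hypothetical helper: matches [name] Inline+ [/name] and hands the
-- collected inlines to `constructor`.
local function tagged(name, constructor)
  return P("[" .. name .. "]") * Ct(V"Inline"^1) * P("[/" .. name .. "]") / constructor
end

local G = P{ "Inlines",
  Inlines = Ct(V"Inline"^0) * -1;  -- -1 requires the whole input to match
  Inline = V"Emph" + V"Strong" + V"Str" + V"Space";
  Emph = tagged("i", node "Emph");
  Strong = tagged("b", node "Strong");
  -- Text runs stop at whitespace or at the start of any known tag.
  Str = (1 - (whitespacechar + anytag))^1 / node "Str";
  Space = spacechar^1 / node "Space";
}

-- Returns a list of inline nodes, or nil if the input does not match.
function parse_inlines(input)
  return G:match(input)
end
```

With these definitions, parse_inlines("[b]mid[/b]word") yields a Strong node followed by a Str node, and an unclosed tag such as [i]oops makes the match fail instead of being read as text. In a real reader the node stand-ins would be replaced by the pandoc constructors.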

lua

grammar

pandoc

peg

lpeg

0 Answers
