1 year ago
#208980
Spart
Lark simple SQL grammar
I'm trying to parse simple SQL with this grammar:
grammar = ```
program : stmnt*
stmnt : select_stmnt | drop_stmnt
select_stmnt : select_clause from_clause? group_by_clause? having_clause? order_by_clause? limit_clause? SEMICOLON
select_clause : "select"i selectables
selectables : column_name ("," column_name)*
from_clause : "from"i source where_clause?
where_clause : "where"i condition
group_by_clause : "group"i "by"i column_name ("," column_name)*
having_clause : "having"i condition
order_by_clause : "order"i "by"i (column_name ("asc"i|"desc"i)?)*
limit_clause : "limit"i INTEGER_NUMBER ("offset"i INTEGER_NUMBER)?
// NOTE: there should be no on-clause on a cross join; this will have to be enforced post-parse
source : joining? table_name table_alias?
joining : source join_modifier? JOIN source ON condition
//source : table_name table_alias? joined_source?
//joined_source : join_modifier? JOIN table_name table_alias? ON condition
join_modifier : "inner" | ("left" "outer"?) | ("right" "outer"?) | ("full" "outer"?) | "cross"
condition : or_clause+
or_clause : and_clause ("or" and_clause)*
and_clause : predicate ("and" predicate)*
// NOTE: operators should be ordered longest token first
predicate : comparison ( ( EQUAL | NOT_EQUAL ) comparison )*
comparison : term ( ( LESS_EQUAL | GREATER_EQUAL | LESS | GREATER ) term )*
term : factor ( ( "-" | "+" ) factor )*
factor : unary ( ( "/" | "*" ) unary )*
unary : ( "!" | "-" ) unary
| primary
primary : INTEGER_NUMBER | FLOAT_NUMBER | STRING | "true" | "false" | "null"
| IDENTIFIER
drop_stmnt : "drop" "table" table_name
FLOAT_NUMBER : INTEGER_NUMBER "." ("0".."9")*
column_name : IDENTIFIER
table_name : IDENTIFIER
table_alias : IDENTIFIER
// keywords
// define keywords as they have higher priority
SELECT.5 : "select"i
FROM.5 : "from"i
WHERE.5 : "where"i
JOIN.5 : "join"i
ON.5 : "on"i
// operators
STAR : "*"
LEFT_PAREN : "("
RIGHT_PAREN : ")"
LEFT_BRACKET : "["
RIGHT_BRACKET : "]"
DOT : "."
EQUAL : "="
LESS : "<"
GREATER : ">"
COMMA : ","
// 2-char ops
LESS_EQUAL : "<="
GREATER_EQUAL : ">="
NOT_EQUAL : ("<>" | "!=")
SEMICOLON : ";"
IDENTIFIER.9 : ("_" | ("a".."z") | ("A".."Z"))* ("_" | ("a".."z") | ("A".."Z") | ("0".."9"))+
%import common.ESCAPED_STRING -> STRING
%import common.SIGNED_NUMBER -> INTEGER_NUMBER
%import common.WS
%ignore WS
```
However, when I call the parser with the text
"""select cola, colb from foo left outer join bar b on x = 1 join jar j on jb > xw where cola <> colb and colx > coly""",
it parses the second `join` as a `term`, i.e. as part of the first `join`'s condition. Any thoughts on how to do this correctly?
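One thing that may be worth checking is whether the keywords in the query can also be read as identifiers. The sketch below uses plain `re` rather than Lark itself. The regex is a hand translation of the `IDENTIFIER` terminal from the grammar, so it is an assumption about equivalence, not Lark's internal pattern, and the helper name `keywords_matching_identifier` is made up for illustration. It only shows which keywords from the query are also full matches for `IDENTIFIER`; it does not confirm how Lark resolves the overlap.

```python
import re

# Hand-translated from the grammar's IDENTIFIER terminal (assumption: this is
# only meant to be equivalent; Lark builds its own pattern internally).
IDENTIFIER = re.compile(r"[_a-zA-Z]*[_a-zA-Z0-9]+")

# Keywords the grammar uses, either as named terminals or anonymous strings.
KEYWORDS = {"select", "from", "where", "join", "on", "left", "outer", "and"}

query = ("select cola, colb from foo left outer join bar b on x = 1 "
         "join jar j on jb > xw where cola <> colb and colx > coly")


def keywords_matching_identifier(text):
    """Return the keywords in `text` that IDENTIFIER also matches in full."""
    words = re.findall(r"\w+", text)
    return sorted({w for w in words
                   if w.lower() in KEYWORDS and IDENTIFIER.fullmatch(w)})


overlap = keywords_matching_identifier(query)
print(overlap)

# The first group of the pattern is optional, so an all-digit string matches too.
print(IDENTIFIER.fullmatch("1") is not None)
```

Every keyword in the query, including `join` and `on`, is also a valid `IDENTIFIER`, and the grammar gives `IDENTIFIER` priority 9 while `JOIN` and `ON` have priority 5. Since `condition : or_clause+` accepts several expressions in a row, a reading in which `join jar j on` is a run of identifiers inside the first condition is at least grammatically possible. Whether that is what Lark actually picks here is not confirmed by this check.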
lark-parser
0 Answers