Features

Main Features

  • Earley parser, capable of parsing any context-free grammar

    • Implements SPPF, for efficient parsing and storing of ambiguous grammars.

  • LALR(1) parser, limited in power of expression, but very efficient in space and performance (O(n)).

    • Implements a parse-aware lexer that provides a better power of expression than traditional LALR implementations (such as ply).

  • EBNF-inspired grammar, with extra features (See: Grammar Reference)

  • Builds a parse-tree (AST) automagically based on the grammar

  • Stand-alone parser generator - create a small independent parser to embed in your project.

  • Flexible error handling by using a “puppet parser” mechanism (LALR only)

  • Automatic line & column tracking (for both tokens and matched rules)

  • Automatic terminal collision resolution

  • Standard library of terminals (strings, numbers, names, etc.)

  • Unicode fully supported

  • Extensive test suite

  • MyPy support using type stubs

  • Python 2 & Python 3 compatible

  • Pure-Python implementation

Read more about the parsers

Extra features

  • Import rules and tokens from other Lark grammars, for code reuse and modularity.

  • Support for external regex module (see here)

  • Import grammars from Nearley.js (read more)

  • CYK parser

  • Visualize your parse trees as dot or png files (see_example)

Experimental features

  • Automatic reconstruction of input from parse-tree (see examples)

Planned features (not implemented yet)

  • Generate code in other languages than Python

  • Grammar composition

  • LALR(k) parser

  • Full regexp-collision support using NFAs