Building Parsers: Lexers and Yacc | Schema Programming Part 32

Md Nasim Sheikh

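We learned to use #lang with syntax/module-reader in Part 15. That works well for Lisp-like languages, because Racket's reader already understands s-expressions. But what if you need to parse a C-style language? Then you need two pieces: a lexer that splits the source text into tokens, and a parser that assembles those tokens according to a grammar.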

Lexing (Tokenizing)

Use parser-tools/lex. A lexer reads characters from an input port and returns one token per call.

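(require parser-tools/lex
         (prefix-in : parser-tools/lex-sre))

;; Token constructors: NUM and ID carry a value, PLUS and EOF do not.
(define-tokens value-tokens (NUM ID))
(define-empty-tokens op-tokens (PLUS EOF))

(define-lex-abbrevs
  [digit (:/ "0" "9")]
  [identifier (:: alphabetic (:* (:or alphabetic digit)))])

(define my-lexer
  (lexer
   [(:+ digit) (token-NUM (string->number lexeme))] ; one or more digits
   [identifier (token-ID lexeme)]
   ["+" (token-PLUS)]
   [whitespace (my-lexer input-port)]               ; skip it and keep lexing
   [(eof) (token-EOF)]))                            ; the parser's end marker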
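
To see what a lexer of this shape actually returns, here is a minimal, self-contained sketch that tokenizes a string. The names demo-lexer and tokenize are illustrative only and do not come from the article; the token set (NUM, ID, PLUS, EOF) is an assumption made for the example.

```racket
#lang racket
(require parser-tools/lex
         (prefix-in : parser-tools/lex-sre))

;; Assumed token set for this sketch.
(define-tokens value-tokens (NUM ID))
(define-empty-tokens op-tokens (PLUS EOF))

(define demo-lexer
  (lexer
   [(:+ (:/ "0" "9")) (token-NUM (string->number lexeme))]
   [(:: alphabetic (:* (:or alphabetic (:/ "0" "9")))) (token-ID lexeme)]
   ["+" (token-PLUS)]
   [whitespace (demo-lexer input-port)]   ; skip whitespace
   [(eof) (token-EOF)]))

;; Hypothetical helper: call the lexer repeatedly and collect
;; (name value) pairs until the EOF token appears.
(define (tokenize str)
  (define in (open-input-string str))
  (let loop ([acc '()])
    (define tok (demo-lexer in))
    (if (eq? (token-name tok) 'EOF)
        (reverse acc)
        (loop (cons (list (token-name tok) (token-value tok)) acc)))))

(tokenize "x1 + 42")
```

Each call to the lexer consumes just enough input for one token. Tokens made with define-empty-tokens carry no value, so token-value returns #f for PLUS.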

Parsing (Grammar)

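Use parser-tools/yacc. The parser form compiles a grammar into a function that consumes tokens and runs the action attached to each rule it matches.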

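(require parser-tools/yacc)

;; Assumes the token groups value-tokens (NUM, ID) and op-tokens (PLUS, EOF)
;; were created with define-tokens and define-empty-tokens.
(define my-parser
  (parser
   (start exp)
   (end EOF)
   (tokens value-tokens op-tokens)
   ;; Called with: was the token valid?, the token name, the token value.
   (error (lambda (tok-ok? tok-name tok-value)
            (error 'my-parser "unexpected token ~a" tok-name)))
   ;; Left-associative +, which resolves the ambiguity in exp PLUS exp.
   (precs (left PLUS))
   (grammar
    (exp [(NUM) $1]
         [(exp PLUS exp) (+ $1 $3)]))))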
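
The sketch below wires a lexer and a parser together end to end, which is the step that usually trips people up: the generated parser expects a zero-argument function that returns the next token on each call. The names calc-lexer, calc-parser, and evaluate are hypothetical, and the grammar is reduced to numbers and + for brevity.

```racket
#lang racket
(require parser-tools/lex
         (prefix-in : parser-tools/lex-sre)
         parser-tools/yacc)

;; Assumed token set for this sketch.
(define-tokens value-tokens (NUM))
(define-empty-tokens op-tokens (PLUS EOF))

(define calc-lexer
  (lexer
   [(:+ (:/ "0" "9")) (token-NUM (string->number lexeme))]
   ["+" (token-PLUS)]
   [whitespace (calc-lexer input-port)]   ; skip whitespace
   [(eof) (token-EOF)]))

(define calc-parser
  (parser
   (start exp)
   (end EOF)
   (tokens value-tokens op-tokens)
   (error (lambda (tok-ok? tok-name tok-value)
            (error 'calc-parser "unexpected token ~a" tok-name)))
   (precs (left PLUS))
   (grammar
    (exp [(NUM) $1]
         [(exp PLUS exp) (+ $1 $3)]))))

;; Hypothetical helper: the parser pulls tokens by calling the thunk.
(define (evaluate str)
  (define in (open-input-string str))
  (calc-parser (lambda () (calc-lexer in))))

(evaluate "1 + 2 + 3") ; evaluates to 6
```

Because the rule actions compute values directly, this parser doubles as an evaluator. A real language would more likely have the actions build a syntax tree instead.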

Why use this over Regex?

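Regular expressions are fine for simple scraping, but they cannot track arbitrarily deep nesting. Recursive languages (like JSON, XML, or C) need a Context-Free Grammar (CFG), where a rule can refer to itself. A parser generator like Yacc builds a parser from such a grammar and reports conflicts at compile time; hand-written recursive-descent parsers and parser combinators are the other common options.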
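
As a small illustration of the nesting point, the sketch below uses a recursive grammar rule to accept balanced parentheses and report how deeply they nest. The names paren-lexer, paren-parser, and nesting-depth are made up for this example; the grammar is deliberately tiny and only assumes the parser-tools forms already used above.

```racket
#lang racket
(require parser-tools/lex
         parser-tools/yacc)

;; Assumed token set for this sketch: no token carries a value.
(define-empty-tokens paren-tokens (LPAREN RPAREN EOF))

(define paren-lexer
  (lexer
   ["(" (token-LPAREN)]
   [")" (token-RPAREN)]
   [whitespace (paren-lexer input-port)]   ; skip whitespace
   [(eof) (token-EOF)]))

(define paren-parser
  (parser
   (start nest)
   (end EOF)
   (tokens paren-tokens)
   (error (lambda (tok-ok? tok-name tok-value)
            (error 'paren-parser "unexpected token ~a" tok-name)))
   (grammar
    ;; nest refers to itself: this recursion is what a CFG adds.
    (nest [() 0]                               ; empty input, depth 0
          [(LPAREN nest RPAREN) (add1 $2)])))) ; one more level

;; Hypothetical helper: parse a string and return its nesting depth.
(define (nesting-depth str)
  (define in (open-input-string str))
  (paren-parser (lambda () (paren-lexer in))))

(nesting-depth "((()))") ; evaluates to 3
```

An unbalanced input such as "(()" reaches the error handler, which raises an exception here. The grammar does the counting that a regular expression cannot do for unbounded depth.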

Summary

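Racket's parser-tools library provides versions of the classic Unix tools lex and yacc as ordinary macros. They can form the front end of a Racket language whose syntax is not s-expression based: the lexer and parser turn source text into a tree that the rest of the language processes.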

Written by Md Nasim Sheikh, Software Developer at softexForge