Building Parsers: Lexers and Yacc | Schema Programming Part 32
Md Nasim Sheikh
We learned to use #lang with syntax/module-reader in Part 15. That works great for Lisp-like languages. But what if you need to parse a C-style language? You need a Lexer and a Parser.
Lexing (Tokenizing)
Use parser-tools/lex to turn a stream of characters into a stream of tokens.
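(require parser-tools/lex
         (prefix-in : parser-tools/lex-sre))

;; Tokens that carry a value, and tokens that are only markers.
(define-tokens value-tokens (NUM ID))
(define-empty-tokens op-tokens (PLUS EOF))

(define-lex-abbrevs
  [digit (:/ "0" "9")]
  [identifier (:: alphabetic (:* (:or alphabetic digit)))])

(define my-lexer
  (lexer
   [(:+ digit) (token-NUM (string->number lexeme))] ; one or more digits
   [identifier (token-ID lexeme)]
   ["+" (token-PLUS)]
   [whitespace (my-lexer input-port)]               ; skip it and keep lexing
   [(eof) (token-EOF)]))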
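To see what a lexer actually produces, it helps to run one over a short string and collect the tokens. The following is a minimal sketch, not the only way to do it: `demo-lexer` and the `tokenize` helper are hypothetical names introduced for illustration, while `token-name` and `token-value` come from parser-tools/lex.

```racket
#lang racket
(require parser-tools/lex
         (prefix-in : parser-tools/lex-sre))

;; NUM and ID carry a value; PLUS and EOF are plain markers.
(define-tokens value-tokens (NUM ID))
(define-empty-tokens op-tokens (PLUS EOF))

(define demo-lexer
  (lexer
   [(:+ (:/ "0" "9")) (token-NUM (string->number lexeme))]
   [(:: alphabetic (:* (:or alphabetic (:/ "0" "9")))) (token-ID lexeme)]
   ["+" (token-PLUS)]
   [whitespace (demo-lexer input-port)] ; skip whitespace
   [(eof) (token-EOF)]))

;; Hypothetical helper: call the lexer until EOF and collect
;; (name . value) pairs. Empty tokens have the value #f.
(define (tokenize str)
  (define in (open-input-string str))
  (let loop ([acc '()])
    (define tok (demo-lexer in))
    (define name (token-name tok))
    (if (eq? name 'EOF)
        (reverse acc)
        (loop (cons (cons name (token-value tok)) acc)))))

(tokenize "x1 + 42")
```

Each call to the lexer consumes just enough of the input port to produce one token, which is why the helper has to call it in a loop.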
Parsing (Grammar)
Use parser-tools/yacc to describe the grammar and what each rule evaluates to.
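(require parser-tools/yacc)

;; value-tokens and op-tokens are token groups created with
;; define-tokens and define-empty-tokens alongside the lexer.
(define my-parser
  (parser
   (start exp)
   (end EOF)
   (tokens value-tokens op-tokens)
   (error (lambda (tok-ok? tok-name tok-value)
            (error 'my-parser "unexpected token: ~a" tok-name)))
   (precs (left PLUS)) ; + is left-associative; avoids a shift/reduce conflict
   (grammar
    (exp [(NUM) $1]
         [(exp PLUS exp) (+ $1 $3)]))))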
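A generated parser does not read text directly. It takes a thunk and calls it every time it needs the next token. Here is a small end-to-end sketch that wires a lexer to a parser for the addition grammar; the names `calc-lexer`, `calc-parser`, and `evaluate` are made up for this example.

```racket
#lang racket
(require parser-tools/lex
         (prefix-in : parser-tools/lex-sre)
         parser-tools/yacc)

(define-tokens value-tokens (NUM))
(define-empty-tokens op-tokens (PLUS EOF))

(define calc-lexer
  (lexer
   [(:+ (:/ "0" "9")) (token-NUM (string->number lexeme))]
   ["+" (token-PLUS)]
   [whitespace (calc-lexer input-port)] ; skip whitespace
   [(eof) (token-EOF)]))

(define calc-parser
  (parser
   (start exp)
   (end EOF)
   (tokens value-tokens op-tokens)
   (error (lambda (tok-ok? tok-name tok-value)
            (error 'calc-parser "unexpected token: ~a" tok-name)))
   (precs (left PLUS)) ; treat + as left-associative
   (grammar
    (exp [(NUM) $1]
         [(exp PLUS exp) (+ $1 $3)]))))

;; Hypothetical helper: parse and evaluate a string in one step.
;; The parser pulls tokens by calling the thunk repeatedly.
(define (evaluate str)
  (define in (open-input-string str))
  (calc-parser (lambda () (calc-lexer in))))

(evaluate "1 + 2 + 3") ; evaluates to 6
```

Because the grammar actions compute the sum directly, this parser doubles as an interpreter. Returning a list such as `(list 'add $1 $3)` from the action instead would build a syntax tree.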
Why use this over Regex?
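Regular expressions are fine for simple scraping, but they cannot match arbitrarily nested structure. Recursive languages (like JSON, XML, or C) call for a Context-Free Grammar (CFG). A parser generator like Yacc is a reliable way to turn such a grammar into a working parser, with a hand-written recursive-descent parser being the usual alternative.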
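Nesting is where the difference shows. The sketch below parses JSON-style nested lists of numbers, where the `value` rule refers to itself through `items`. The token names, `list-lexer`, `list-parser`, and `parse-nested` are all assumptions made up for this illustration.

```racket
#lang racket
(require parser-tools/lex
         (prefix-in : parser-tools/lex-sre)
         parser-tools/yacc)

(define-tokens value-tokens (NUM))
(define-empty-tokens punct-tokens (LBRACKET RBRACKET COMMA EOF))

(define list-lexer
  (lexer
   [(:+ (:/ "0" "9")) (token-NUM (string->number lexeme))]
   ["[" (token-LBRACKET)]
   ["]" (token-RBRACKET)]
   ["," (token-COMMA)]
   [whitespace (list-lexer input-port)] ; skip whitespace
   [(eof) (token-EOF)]))

(define list-parser
  (parser
   (start value)
   (end EOF)
   (tokens value-tokens punct-tokens)
   (error (lambda (tok-ok? tok-name tok-value)
            (error 'list-parser "unexpected token: ~a" tok-name)))
   (grammar
    ;; A value is a number or a bracketed list of values.
    (value [(NUM) $1]
           [(LBRACKET RBRACKET) '()]
           [(LBRACKET items RBRACKET) $2])
    ;; items contains values, and values can contain items again.
    (items [(value) (list $1)]
           [(value COMMA items) (cons $1 $3)]))))

;; Hypothetical helper: turn a string into a nested Racket list.
(define (parse-nested str)
  (define in (open-input-string str))
  (list-parser (lambda () (list-lexer in))))

(parse-nested "[1, [2, [3, []]], 4]") ; evaluates to '(1 (2 (3 ())) 4)
```

The grammar never mentions a maximum depth. The recursion between `value` and `items` handles any level of nesting, and unbalanced brackets are reported through the error handler.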
Summary
Racket provides its own versions of the classic Unix tools lex and yacc in parser-tools/lex and parser-tools/yacc. Together they can serve as the front end of a Racket language whose syntax is not Lisp-like.
Quick Quiz
What is the role of a Lexer?
Written by
Md Nasim Sheikh
Software Developer at softexForge