Lab2: Lexer
Introduction
This assignment requires you to write a lexical analyzer for the C programming language.
Requirement
You should download this package and complete "c.lex" and "tokens.sml". We have provided a test case, "test.c".
Input
Your input files are plain C programs like "test.c".
Output
The output should contain one token per line, in this form:
token AT POS:position
Here, position is the start position of the token in the input file. Positions are counted from 0.
Output format of each token:
- symbols
  + - * / ( ) [ ] { } ; > < >= <= == != -> .
- keywords
  int char void struct while for if else return
- intconst
  INT(intconst), e.g. 1 => INT(1)
- stringconst
  STRING(stringconst), e.g. "haha" => STRING("haha")
- identifier
  ID(identifier), e.g. main => ID(main)
For example, the output of "test.c" should be:
int AT POS:9
ID(main) AT POS:13
( AT POS:17
) AT POS:18
{ AT POS:20
int AT POS:24
ID(a) AT POS:28
= AT POS:29
INT(1) AT POS:30
; AT POS:31
while AT POS:35
( AT POS:40
ID(a) AT POS:41
> AT POS:42
INT(1) AT POS:43
) AT POS:44
{ AT POS:50
ID(a) AT POS:58
= AT POS:59
ID(a) AT POS:60
+ AT POS:61
INT(1) AT POS:62
; AT POS:63
} AT POS:69
return AT POS:73
ID(a) AT POS:80
; AT POS:81
} AT POS:83
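The token formats described in this section can be sketched in Standard ML as follows. This is only a minimal illustration under stated assumptions: the constructor names (SYM, INT, STRING, ID) and the helper formatLine are hypothetical and are not taken from the provided "tokens.sml", which may organize its datatype differently (for example, with one constructor per keyword or symbol).

```sml
(* Hypothetical sketch: constructor names and formatLine are assumptions,
   not the required design of "tokens.sml". *)
datatype token
  = SYM of string      (* a symbol or keyword, printed as written *)
  | INT of int         (* integer constant *)
  | STRING of string   (* string constant, stored without its quotes *)
  | ID of string       (* identifier *)

(* Render a token in the output format described above. *)
fun toString (SYM s)    = s
  | toString (INT n)    = "INT(" ^ Int.toString n ^ ")"
  | toString (STRING s) = "STRING(\"" ^ s ^ "\")"
  | toString (ID x)     = "ID(" ^ x ^ ")"

(* Build one output line from a token and its 0-based start position. *)
fun formatLine (tok, pos) =
  toString tok ^ " AT POS:" ^ Int.toString pos
```

With these definitions, formatLine (ID "main", 13) evaluates to "ID(main) AT POS:13", which matches the corresponding line of the sample output.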
Hints
- See the ML-Lex manual
- See the specification of our C language
- Finish the datatype token and the function toString in "tokens.sml"
- Finish the rules in "c.lex"
- Before compiling, you need to manually run "<path to your mlton bin>/mllex <your lex file>". Note that we use the mllex from MLton, not the one from SML/NJ.