Sonoma State University
Department of Computer Science
CS-460: Programming Languages
Programming Assignment 2: Tokenization
Objective:
Write a program in C or C++ that identifies and removes comments from an input test file using a deterministic finite-state automaton (DFA), then uses a DFA to convert the input file into a series of tokens. Finally, your program should display the tokens as output if no syntax errors occurred, or an error message otherwise.
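The comment-removal step can be sketched as a hand-coded DFA in C++. This is a minimal sketch, not the required implementation: it assumes the C-like language uses C-style `//` and `/* ... */` comments plus double-quoted strings and single-quoted characters with backslash escapes, and the names `State` and `stripComments` are hypothetical. Each comment character is replaced by a space and every newline is kept, so line numbers in the result match the input.

```cpp
#include <string>

// Hypothetical state names for this sketch; the assignment does not prescribe them.
enum class State {
    Code, Slash, LineComment, BlockComment, BlockStar,
    String, StringEscape, Char, CharEscape
};

// Returns a copy of src with every comment character replaced by a space.
// Newlines are kept, so line numbers in the result match the input.
// Assumption: C-style // and /* */ comments, "..." strings, '...' characters.
std::string stripComments(const std::string& src) {
    std::string out;
    out.reserve(src.size());
    State s = State::Code;
    for (char c : src) {
        switch (s) {
        case State::Code:
            if (c == '/') { s = State::Slash; break; }  // hold it: may start a comment
            if (c == '"') s = State::String;
            else if (c == '\'') s = State::Char;
            out += c;
            break;
        case State::Slash:
            if (c == '/') { out += "  "; s = State::LineComment; }
            else if (c == '*') { out += "  "; s = State::BlockComment; }
            else {
                out += '/';  // the held slash was an ordinary character
                out += c;
                s = (c == '"') ? State::String
                  : (c == '\'') ? State::Char : State::Code;
            }
            break;
        case State::LineComment:
            if (c == '\n') { out += '\n'; s = State::Code; }
            else out += ' ';
            break;
        case State::BlockComment:
            out += (c == '\n') ? '\n' : ' ';
            if (c == '*') s = State::BlockStar;
            break;
        case State::BlockStar:
            out += (c == '\n') ? '\n' : ' ';
            if (c == '/') s = State::Code;
            else if (c != '*') s = State::BlockComment;
            break;
        case State::String:
            out += c;
            if (c == '\\') s = State::StringEscape;
            else if (c == '"') s = State::Code;
            break;
        case State::StringEscape:
            out += c;
            s = State::String;
            break;
        case State::Char:
            out += c;
            if (c == '\\') s = State::CharEscape;
            else if (c == '\'') s = State::Code;
            break;
        case State::CharEscape:
            out += c;
            s = State::Char;
            break;
        }
    }
    if (s == State::Slash) out += '/';  // input ended on a lone slash
    return out;
}
```

The sketch does not report an unterminated block comment; a fuller solution would likely check whether the final state is `BlockComment` or `BlockStar` and report an error in that case.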
Please note
The input test files conform to our C-like programming language defined in Backus-Naur Form (BNF) with two exceptions: programming_assignment_2-test_file_5.c and programming_assignment_2-test_file_6.c each contain invalid integers. Therefore, I expect your program to output an appropriate error message along with the correct line number where the syntax error occurred. For example, "Syntax error on line 3: invalid integer".
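One possible way to detect such an error while tracking line numbers is sketched below in C++17. The assignment text does not define what makes an integer invalid, so this sketch assumes that a run of digits immediately followed by a letter or underscore (such as `12x`) is invalid; the BNF is the authority on the real rule. The names `Scan`, `firstInvalidIntegerLine`, and `syntaxErrorMessage` are hypothetical, and the input is assumed to have had its comments removed already.

```cpp
#include <cctype>
#include <optional>
#include <string>

// Hypothetical scanner states for this sketch.
enum class Scan { Start, Ident, Integer, String, StringEscape };

// Returns the 1-based line of the first invalid integer, or nullopt if none.
// Assumption: digits directly followed by a letter or '_' form an invalid integer.
std::optional<int> firstInvalidIntegerLine(const std::string& src) {
    Scan s = Scan::Start;
    int line = 1;
    for (char ch : src) {
        unsigned char c = static_cast<unsigned char>(ch);
        switch (s) {
        case Scan::Start:
            if (std::isdigit(c)) s = Scan::Integer;
            else if (std::isalpha(c) || c == '_') s = Scan::Ident;
            else if (c == '"') s = Scan::String;
            break;
        case Scan::Ident:  // digits inside identifiers such as x1 are fine
            if (!(std::isalnum(c) || c == '_'))
                s = (c == '"') ? Scan::String : Scan::Start;
            break;
        case Scan::Integer:
            if (std::isalpha(c) || c == '_') return line;
            if (!std::isdigit(c))
                s = (c == '"') ? Scan::String : Scan::Start;
            break;
        case Scan::String:  // digits inside string literals are ignored
            if (c == '\\') s = Scan::StringEscape;
            else if (c == '"') s = Scan::Start;
            break;
        case Scan::StringEscape:
            s = Scan::String;
            break;
        }
        if (c == '\n') ++line;  // count the newline after it has been processed
    }
    return std::nullopt;
}

// Formats the message in the style quoted in the assignment text.
std::string syntaxErrorMessage(int line) {
    return "Syntax error on line " + std::to_string(line) + ": invalid integer";
}
```

A caller would print `syntaxErrorMessage(*line)` and skip the token list whenever `firstInvalidIntegerLine` returns a value.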
Input test files to determine the effectiveness of your programming assignment two solution:
programming_assignment_2-test_file_1.c
programming_assignment_2-test_file_2.c
programming_assignment_2-test_file_3.c
programming_assignment_2-test_file_4.c
programming_assignment_2-test_file_5.c
programming_assignment_2-test_file_6.c
The output from your programming assignment two solution should look like the following text files:
For input file programming_assignment_2-test_file_1.c, the output should look like output-programming_assignment_2-test_file_1.txt
For input file programming_assignment_2-test_file_2.c, the output should look like output-programming_assignment_2-test_file_2.txt
For input file programming_assignment_2-test_file_3.c, the output should look like output-programming_assignment_2-test_file_3.txt
For input file programming_assignment_2-test_file_4.c, the output should look like output-programming_assignment_2-test_file_4.txt
For input file programming_assignment_2-test_file_5.c, the output should look like output-programming_assignment_2-test_file_5.txt
For input file programming_assignment_2-test_file_6.c, the output should look like output-programming_assignment_2-test_file_6.txt
Reflection
Since tokenization is the first step in parsing, it can be difficult to detect syntax errors at this point (except in the most rudimentary tokens). The difficulty lies in not knowing which grammatical rules, as defined by the C-like programming language in BNF, should be applied on a token-by-token basis. Regardless of difficulty, tokenizing the entire input file without loss of a single byte (i.e., every byte is accounted for from input file to tokenization) is vitally important, as you will be relying on tokens exclusively from here on in parsing and lexical analysis.
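One way to check the "every byte is accounted for" property is a round-trip test: concatenate the token lexemes and compare the result with the source stripped of whitespace. The C++ sketch below illustrates the idea with a deliberately simplified scanner. The `Token` struct, the kind names `IDENTIFIER`, `INTEGER`, and `SYMBOL`, and both function names are hypothetical; the real token names come from the expected output files, and the input is assumed to have had its comments removed already.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token record; kind names are placeholders for this sketch.
struct Token {
    std::string kind;
    std::string lexeme;
    int line;
};

// Simplified scanner: identifiers, digit runs, and single-character symbols.
// String literals and multi-character operators are not handled here.
std::vector<Token> tokenize(const std::string& src) {
    std::vector<Token> tokens;
    int line = 1;
    std::size_t i = 0;
    auto at = [&](std::size_t k) { return static_cast<unsigned char>(src[k]); };
    while (i < src.size()) {
        unsigned char c = at(i);
        if (c == '\n') { ++line; ++i; continue; }
        if (std::isspace(c)) { ++i; continue; }
        std::size_t start = i;
        std::string kind;
        if (std::isalpha(c) || c == '_') {
            while (i < src.size() && (std::isalnum(at(i)) || src[i] == '_')) ++i;
            kind = "IDENTIFIER";
        } else if (std::isdigit(c)) {
            while (i < src.size() && std::isdigit(at(i))) ++i;
            kind = "INTEGER";
        } else {
            ++i;
            kind = "SYMBOL";
        }
        tokens.push_back({kind, src.substr(start, i - start), line});
    }
    return tokens;
}

// True when the lexemes, joined in order, equal the source minus whitespace.
bool accountsForEveryByte(const std::string& src, const std::vector<Token>& tokens) {
    std::string fromTokens, fromSource;
    for (const Token& t : tokens) fromTokens += t.lexeme;
    for (char ch : src)
        if (!std::isspace(static_cast<unsigned char>(ch))) fromSource += ch;
    return fromTokens == fromSource;
}
```

Running such a check against each valid test file is a cheap way to catch a scanner that silently drops or duplicates characters.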
Uploading your programming project files
Please upload your source code and a Makefile as a zip or gzipped-tar file.
Programming Assignment 2 Rubric
CRITERIA
RATINGS
POINTS
Compilation:
Will the program compile with GNU compiler?
Proficient
2 points
Makefile provided. Student's assignment two program is written in C or C++ and compiles with gcc (GNU C compiler) or g++ (GNU C++ compiler) without syntax errors on GNU/Linux. No external libraries (besides standard built-in C/C++ libraries) are required to build the project.
Satisfactory
1 point
Student's assignment two program is written in C or C++ and compiles on GNU/Linux with gcc or g++. A Makefile is not included. Extra external library dependencies, beyond the standard built-in C/C++ libraries, may be required to compile and run student's assignment two program.
Below Expectation
0 points
Makefile not included. Student's assignment two program fails to compile with gcc or g++ on GNU/Linux.
2 points
Parsing Implementation:
How was the parsing technique implemented?
Proficient
2 points
Program features a procedurally-driven deterministic finite-state automaton (DFA) to identify and parse comments.
Below Expectation
0 points
Student's assignment two program implements table-driven DFAs; OR student's assignment two program implements a combination of procedurally-driven and table-driven DFAs; OR student's assignment two program fails to use DFAs (table or procedural).
2 points
Test program 1:
The first benchmark test.
Proficient
1 point
Student's assignment two program removes all comments for test program one without impacting the line numbering of statements. Student's assignment two program identifies then displays the list of tokens as output. The token list output from student's assignment two program matches the expected token list output as noted above in the assignment guidelines.
Satisfactory
0.75 points
Between one and three tokens do not match the expected output token list as noted in the assignment guidelines above.
Below Expectation
0 points
Four or more tokens do not match the expected output token list as noted in the assignment guidelines above; OR the output token list is missing.
1 point
Test program 2:
The second benchmark test.
Proficient
1 point
Student's assignment two program removes all comments for test program two without impacting the line numbering of statements. Student's assignment two program identifies then displays the list of tokens as output. The token list output from student's assignment two program matches the expected token list output as noted above in the assignment guidelines.
Satisfactory
0.75 points
Between one and three tokens do not match the expected output token list as noted in the assignment guidelines above.
Below Expectation
0 points
Four or more tokens do not match the expected output token list as noted in the assignment guidelines above; OR the output token list is missing.
1 point
Test program 3:
The third benchmark test.
Proficient
1 point
Student's assignment two program removes all comments for test program three without impacting the line numbering of statements. Student's assignment two program identifies then displays the list of tokens as output. The token list output from student's assignment two program matches the expected token list output as noted above in the assignment guidelines.
Satisfactory
0.75 points
Between one and three tokens do not match the expected output token list as noted in the assignment guidelines above.
Below Expectation
0 points
Four or more tokens do not match the expected output token list as noted in the assignment guidelines above; OR the output token list is missing.
1 point
Test program 4:
The fourth benchmark test.
Proficient
1 point
Student's assignment two program removes all comments for test program four without impacting the line numbering of statements. Student's assignment two program identifies then displays the list of tokens as output. The token list output from student's assignment two program matches the expected token list output as noted above in the assignment guidelines.
Satisfactory
0.75 points
Between one and three tokens do not match the expected output token list as noted in the assignment guidelines above.
Below Expectation
0 points
Four or more tokens do not match the expected output token list as noted in the assignment guidelines above; OR the output token list is missing.
1 point
Test program 5:
The fifth benchmark test.
Proficient
1 point
Student's assignment two program detects an invalid integer on line eight of test program five then outputs the error message "Syntax error on line 8: invalid integer". No token list is displayed since a syntax error occurred.
Satisfactory
0.75 points
An error message is displayed but the line number where the error occurred is incorrect or missing; OR a token list is displayed with the error message.
Below Expectation
0 points
No error message is displayed. A token list may be displayed instead.
1 point
Test program 6:
The sixth benchmark test.
Proficient
1 point
Student's assignment two program detects an invalid integer on line eight of test program six then outputs the error message "Syntax error on line 8: invalid integer". No token list is displayed since a syntax error occurred.
Satisfactory
0.75 points
An error message is displayed but the line number where the error occurred is incorrect or missing; OR a token list is displayed with the error message.
Below Expectation
0 points
No error message is displayed. A token list may be displayed instead.
1 point
Total points: 10