Compilation

The compilation process translates high-level source code into machine code.

It consists of four main stages, which often run almost concurrently.

Lexical Analysis

The lexical analysis process groups characters into tokens and rejects any sequence of characters that does not form a valid token.

• Tokens are the vocabulary of the source language.
• The lexical analyser also determines the type of each token.
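
As a rough illustration of grouping characters into typed tokens, here is a minimal sketch in Python. The token type names (NUMBER, VARIABLE, OPERATOR, ASSIGN) and their patterns are assumptions chosen for a tiny expression language, not part of any particular compiler.

```python
import re

# Hypothetical token types for a tiny expression language (assumed, not from the notes).
TOKEN_SPEC = [
    ("NUMBER",   r"\d+"),
    ("VARIABLE", r"[A-Za-z_]\w*"),
    ("OPERATOR", r"[+\-*/]"),
    ("ASSIGN",   r"="),
    ("SKIP",     r"\s+"),   # whitespace separates tokens but is not a token itself
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Group the characters of source into (type, text) tokens."""
    tokens = []
    position = 0
    while position < len(source):
        match = TOKEN_RE.match(source, position)
        if match is None:
            # No token pattern matches here, so the input is lexically invalid.
            raise ValueError(f"unexpected character {source[position]!r} at position {position}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        position = match.end()
    return tokens


if __name__ == "__main__":
    for token in tokenize("a = x + y"):
        print(token)
```

Each token carries both its type and the characters that formed it, which is the information the later stages work from. A character that matches no pattern, such as `$`, raises an error in this sketch.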

Parse Trees

Parsing combines symbols step by step into the goal symbol to check whether the code is syntactically valid.

A parse tree is a visual representation of this process.

• A production is a grammar rule that says how one or more symbols combine into another symbol.
• A terminal symbol is one that is not abstracted and appears literally in the string.
• A non-terminal symbol is an abstraction that stands for one or more other symbols.
• The grammar is the list of productions that the symbols are matched against.
• The goal symbol is the final symbol, which combines the whole statement into one.

$$\begin{array}{c} \text{Build a parse tree from the statement }a=x+y\text{ using}\\ \text{the following grammar:}\\\\ \begin{array}{r c l} \text{<variable>}&\text{::=}&\text{<symbol>}\\ \text{<operator>}&\text{::=}&\text{+|-|*|/}\\ \text{<term>}&\text{::=}&\text{<number>|<variable>}\\ \text{<expression>}&\text{::=}&\text{<term>|<expression><operator><expression>}\\ \text{<assignment>}&\text{::=}&\text{<variable>=<expression>}\\ \end{array}\\\\ \hline\\ \begin{array}{c}\begin{array}{c c l} a&\rightarrow&\text{<variable>}\\ =\\ x&\rightarrow&\text{<variable>}\\ +&\rightarrow&\text{<operator>}\\ y&\rightarrow&\text{<variable>}\\\\ \end{array}\\ \text{We can therefore break the statement into the following:}\\ \text{<variable>=<variable><operator><variable>}\\ \text{<variable>=<term><operator><term>}\\ \text{<variable>=<expression><operator><expression>}\\ \text{<variable>=<expression>}\\ \text{<assignment>} \end{array}\\\\ \text{As we can reach the goal symbol, the statement}\\\text{is syntactically valid.} \end{array}$$

• A production always forms a non-terminal symbol.
• Not all productions generate machine code.
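
The reduction steps in the worked example above can be sketched in Python. This is only an illustration that hard-codes the productions of that small grammar. The function name `reduce_to_goal` and the idea of passing symbols as a list of strings are assumptions made for the example, and real parsers are usually more general than this.

```python
def reduce_to_goal(symbols):
    """Return every intermediate form from symbols to <assignment>,
    or None if the goal symbol cannot be reached."""
    # An assignment must start with <variable> followed by "=".
    if len(symbols) < 3 or symbols[0] != "<variable>" or symbols[1] != "=":
        return None
    head, rhs = list(symbols[:2]), list(symbols[2:])
    steps = [head + rhs]

    # <term> ::= <number> | <variable>
    rhs = ["<term>" if s in ("<number>", "<variable>") else s for s in rhs]
    steps.append(head + rhs)

    # <expression> ::= <term>
    rhs = ["<expression>" if s == "<term>" else s for s in rhs]
    steps.append(head + rhs)

    # <expression> ::= <expression><operator><expression>
    while rhs[:3] == ["<expression>", "<operator>", "<expression>"]:
        rhs = ["<expression>"] + rhs[3:]
        steps.append(head + rhs)

    # <assignment> ::= <variable>=<expression>
    if rhs != ["<expression>"]:
        return None
    steps.append(["<assignment>"])
    return steps


if __name__ == "__main__":
    # The statement a=x+y after each character has been classified.
    statement = ["<variable>", "=", "<variable>", "<operator>", "<variable>"]
    for step in reduce_to_goal(statement):
        print("".join(step))
```

For a=x+y the printed lines match the five forms in the worked example, ending in the goal symbol. An incomplete statement such as a=x+ returns None, because the right-hand side never reduces to a single expression.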