2007-04-21

Writing a parser: introduction to ADL

This post will introduce ADL (Acronymic Demonstrational Language), a language I designed to demonstrate how to write a parser.

Statements

ADL has 5 different types of statements:
  • if-statement
  • while-loop
  • for-loop
  • assignment
  • function call
An if-statement looks like this:
if expression then
    statements
end if
An if-statement can also have an else-clause:
if expression then
    statements
else
    statements
end if
A while-loop looks like this:
while expression do
    statements
end while
A for-loop looks like this:
for variable := expression to expression do
    statements
end for
A for-loop can also have a by-clause that defines the number to add to the variable on each iteration:
for variable := expression to expression by expression do
    statements
end for
An assignment looks like this:
variable := expression
A function call looks like this:
function()
It can also have one or more arguments separated by commas:
function(expression, expression, expression)
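One way a parser could represent these five statement types in memory is as a small set of node classes. The sketch below is only an illustration: the class and field names are hypothetical (nothing in ADL prescribes them), and expressions are kept as plain strings to keep the focus on statements.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union

# Assumption: expressions are stored as raw text in this sketch.
Expr = str


@dataclass
class Assignment:
    variable: str
    value: Expr


@dataclass
class FunctionCall:
    name: str
    arguments: List[Expr] = field(default_factory=list)  # zero or more arguments


@dataclass
class If:
    condition: Expr
    then_body: List["Statement"]
    else_body: Optional[List["Statement"]] = None  # the optional else-clause


@dataclass
class While:
    condition: Expr
    body: List["Statement"]


@dataclass
class For:
    variable: str
    start: Expr
    stop: Expr
    body: List["Statement"]
    step: Optional[Expr] = None  # the optional by-clause


Statement = Union[Assignment, FunctionCall, If, While, For]

# for i := 1 to 10 by 2 do
#     print(i)
# end for
loop = For("i", "1", "10", [FunctionCall("print", ["i"])], step="2")
```

The optional parts of the syntax (the else-clause and the by-clause) map to fields that default to None.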

Expressions

Expressions in ADL consist of variables, constants, function calls, unary expressions, binary expressions and groups.

Variable names start with a letter, followed by zero or more letters or digits.

ADL has 2 types of constants: strings and integer numbers. A string is a sequence of characters, enclosed between 2 quotes. An integer is a sequence of 1 or more digits.
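These lexical rules can be sketched as regular expressions. The following is a minimal illustration and makes a few assumptions the text above does not settle: letters are ASCII, strings use double quotes, and strings have no escape sequences. The names TOKEN_RE and tokenize are made up for this example.

```python
import re

# One named group per token kind; whitespace is matched and then skipped.
TOKEN_RE = re.compile(r'''
    (?P<INTEGER>\d+)                  # 1 or more digits
  | (?P<NAME>[A-Za-z][A-Za-z0-9]*)    # a letter, then letters or digits
  | (?P<STRING>"[^"]*")               # assumed: double quotes, no escapes
  | (?P<SKIP>\s+)
''', re.VERBOSE)


def tokenize(text):
    """Return a list of (kind, text) pairs for variables and constants."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

For example, tokenize('x1 42 "hi there"') returns [('NAME', 'x1'), ('INTEGER', '42'), ('STRING', '"hi there"')].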

Unary expressions are expressions with an operator that has only 1 operand:
  • positive operator: +expression
  • negative operator: -expression
  • logical NOT-operator: not expression
Binary expressions are expressions with an operator that has 2 operands:
  • logical AND-operator: expression and expression
  • logical OR-operator: expression or expression
  • comparison: expression comparator expression (comparator is one of the following: ==, <>, <, >, <=, >=)
  • addition: expression + expression
  • subtraction: expression - expression
  • multiplication: expression * expression
  • division: expression / expression
A group is an expression between parentheses: ( expression )

Precedence

When an expression is built from multiple operators, the operators are applied in the following order, from highest to lowest precedence:
  • Grouping: ( )
  • Unary expression: + - not
  • Multiplicative expression: * /
  • Additive expression: + -
  • Comparison: == <> < > <= >=
  • Logical OR: or
  • Logical AND: and
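
To make the effect of this ordering concrete, here is a small precedence-climbing sketch. It is not the parser from this series, just an illustration under some assumptions: binary operators are left-associative, string constants and function calls are left out, and the names BINARY, UNARY, tokenize, parse and parse_operand are invented for the example. Expressions are returned as nested tuples.

```python
import re

# Binding power per binary operator, following the list above:
# a larger number binds tighter. Note that "or" binds tighter than "and".
BINARY = {
    "and": 1,
    "or": 2,
    "==": 3, "<>": 3, "<": 3, ">": 3, "<=": 3, ">=": 3,
    "+": 4, "-": 4,
    "*": 5, "/": 5,
}
UNARY = {"+", "-", "not"}

# Two-character operators come first so "<=" is not read as "<" then "=".
TOKEN_RE = re.compile(r"\s*(==|<>|<=|>=|[-+*/<>()]|\d+|[A-Za-z][A-Za-z0-9]*)")


def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected input at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(tokens, min_power=1):
    """Parse tokens (consumed from the front) into a nested tuple."""
    left = parse_operand(tokens)
    while tokens and BINARY.get(tokens[0], 0) >= min_power:
        op = tokens.pop(0)
        # Assumed left-associative: the right side must bind tighter than op.
        right = parse(tokens, BINARY[op] + 1)
        left = (op, left, right)
    return left


def parse_operand(tokens):
    if not tokens:
        raise SyntaxError("unexpected end of expression")
    token = tokens.pop(0)
    if token in UNARY:
        # Unary operators bind tighter than any binary operator.
        return (token, parse_operand(tokens))
    if token == "(":
        inner = parse(tokens)
        if not tokens or tokens.pop(0) != ")":
            raise SyntaxError("expected ')'")
        return inner
    if token == ")" or token in BINARY:
        raise SyntaxError(f"unexpected token {token!r}")
    return token  # a variable name or an integer constant
```

With this table, parse(tokenize("a or b and c")) gives ('and', ('or', 'a', 'b'), 'c'), and parse(tokenize("1 + 2 * 3")) gives ('+', '1', ('*', '2', '3')). Swapping the binding powers of "and" and "or" in BINARY is all it would take to get the C-style behaviour instead.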

2 comments:

Eiríkur Fannar Torfason said...

Shouldn't the logical AND operator have higher precedence than the logical OR operator? Not that it matters much in this context. This is still an excellent series on parsers.

Tommy Carlier said...

I just checked, and in other languages (C, C++, C#, ...) AND does indeed have higher precedence than OR.

The truth is that, for me, it doesn't matter. I can never remember the precedence of AND and OR in a language, so I never mix them without using parentheses.

In fact, StyleCop (a static analysis tool from Microsoft) will give you a warning if you mix operators that have a different precedence and don't use parentheses to explicitly define the precedence.