I would like to parse boolean expressions in PHP. As in:

```
A and B or C and (D or F or not G)
```

The terms can be considered simple identifiers. They will have a little structure, but the parser doesn't need to worry about that. It should just recognize the keywords `and or not ( )`

. Everything else is a term.

I remember we wrote simple arithmetic expression evaluators at school, but I don't remember how it was done anymore. Nor do I know what keywords to look for in Google/SO.

A ready made library would be nice, but as I remember the algorithm was pretty simple so it might be fun and educational to re-implement it myself.

Recursive descent parsers are **fun to write** and **easy to read**. The first step is to write your grammar out.

Maybe this is the grammar you want.

```
expr = and_expr ('or' and_expr)*
and_expr = not_expr ('and' not_expr)*
not_expr = simple_expr | 'not' not_expr
simple_expr = term | '(' expr ')'
```

Turning this into a recursive descent parser is super easy. Just write one function per nonterminal.

```
def expr():
x = and_expr()
while peek() == 'or':
consume('or')
y = and_expr()
x = OR(x, y)
return x
def and_expr():
x = not_expr()
while peek() == 'and':
consume('and')
y = not_expr()
x = AND(x, y)
return x
def not_expr():
if peek() == 'not':
consume('not')
x = not_expr()
return NOT(x)
else:
return simple_expr()
def simple_expr():
t = peek()
if t == '(':
consume('(')
result = expr()
consume(')')
return result
elif is_term(t):
consume(t)
return TERM(t)
else:
raise SyntaxError("expected term or (")
```

This isn't complete. You have to provide a little more code:

**Input functions.**`consume`

,`peek`

, and`is_term`

are functions you provide. They'll be easy to implement using regular expressions.`consume(s)`

reads the next token of input and throws an error if it doesn't match`s`

.`peek()`

simply returns a peek at the next token without consuming it.`is_term(s)`

returns true if`s`

is a term.**Output functions.**`OR`

,`AND`

,`NOT`

, and`TERM`

are called each time a piece of the expression is successfully parsed. They can do whatever you want.**Wrapper function.**Instead of just calling`expr`

directly, you'll want to write a little wrapper function that initializes the variables used by`consume`

and`peek`

, then calls`expr`

, and finally checks to make sure there's no leftover input that didn't get consumed.

Even with all this, it's still a tiny amount of code. In Python, the complete program is 84 lines, and that includes a few tests.

Licensed under: CC-BY-SA with attribution

Not affiliated with Stack Overflow