Language1
Constructing Nest
I made this as part of my teaching materials for CS 381 : Summer 2011 at OSU where I was the instructor. It shows how to create a very small interpretor for a toy stack language.
Example operations
We are starting with an extremely simple language, given by simple operations on a stack. for e.g
1 2 + => 3
2 2 * => 4
2 3 dup => 2 3 3
2 1 + dup * => 9
1 2 3 swap => 1 3 2
0 1 2 3 pop => 0 1 2
Can you guess what these evaluate to?
2 3 4 + *
2 3 swap 4 swap + *
1 dup 2 + + 3 swap dup + swap pop
Concatenative languages are simple Turing-complete languages that have very minimal syntax. They provide a set of primitives to operate on the program stack, and little else.
Try writing the BNF rules for this language. Remember, the rules are extremely easy. Do not spend too much time thinking about it.
Parsing
module Main where
So, here is our starting point. It starts by looking at a provided string, and parses into a nested
structure given by the data type Nest
. Our nesting is provided by square brackets.
import Text.ParserCombinators.Parsec
data Nest = W String
| I Int
| Nested [Nest]
deriving (Show)
Read a line and Parse it, returning the Nest data structure.
readLine :: String -> IO Nest
readLine input = case parse parseExpr "nest" input of
Left err -> error (show err)
Right q -> return q
Parse a number of words or nested structures.
parseExpr = do
x <- many parseSingle
return $ Nested x
Parse either a single word or a pair of nested brackets []
parseSingle :: Parser Nest
parseSingle = do
spaces
x <- (try parseInt) <|> (try parseWord) <|> (try parseNest)
spaces
return x
Parse a nested structure starting with [
and ending with ]
parseNest :: Parser Nest
parseNest = do
char '['
e <- parseExpr
char ']'
return e
parseInt :: Parser Nest
parseInt = do
i <- many1 digit
return $ I (read i)
Parse a simple word without any spaces or nesting between them.
parseWord :: Parser Nest
parseWord = do
w <- many1 (noneOf " \n\r\t[]")
return $ W w
Run these commands and see what they print out.
readLine ""
readLine "1 2 3"
readLine "1 2 + 3 dup"
readLine "[1 [2 +]] dup"
Are there any surprises?
Our language has the following literals,
- numbers such as 1 2 3 …
- combinators (functions)
- pop, swap, dup, + -
Can you write the BNF notation for this language?
Remember, a BNF notation for a simple expression language looks like this
<digit> ::= 1 | 2 | 3 | ... | 0
<num> ::= <digit> | <digit> <num>
<expr> ::= <expr> + <expr>
| <expr> - <expr>
| <num>
Do we need anything representing <expr>
in our BNF?