Constructing Nest

I made this as part of my teaching materials for CS 381 : Summer 2011 at OSU where I was the instructor. It shows how to create a very small interpretor for a toy stack language.

Example operations

We are starting with an extremely simple language, given by simple operations on a stack. for e.g

1 2 +   => 3
2 2 *   => 4
2 3 dup => 2 3 3
2 1 + dup *  => 9
1 2 3 swap => 1 3 2
0 1 2 3 pop => 0 1 2

Can you guess what these evaluate to?

  • 2 3 4 + *
  • 2 3 swap 4 swap + *
  • 1 dup 2 + + 3 swap dup + swap pop

Concatenative languages are simple Turing-complete languages that have very minimal syntax. They provide a set of primitives to operate on the program stack, and little else.

Try writing the BNF rules for this language. Remember, the rules are extremely easy. Do not spend too much time thinking about it.

Parsing

module Main where

So, here is our starting point. It starts by looking at a provided string, and parses into a nested structure given by the data type Nest. Our nesting is provided by square brackets.

import Text.ParserCombinators.Parsec

data Nest = W String
          | I Int
          | Nested [Nest]
  deriving (Show)

Read a line and Parse it, returning the Nest data structure.

readLine :: String -> IO Nest
readLine input = case parse parseExpr "nest" input of
    Left err -> error (show err)
    Right q -> return q

Parse a number of words or nested structures.

parseExpr = do
  x <- many parseSingle
  return $ Nested x

Parse either a single word or a pair of nested brackets []

parseSingle :: Parser Nest
parseSingle = do
  spaces
  x <- (try parseInt) <|> (try parseWord) <|> (try parseNest)
  spaces
  return x

Parse a nested structure starting with [ and ending with ]

parseNest :: Parser Nest
parseNest = do
  char '['
  e <- parseExpr
  char ']'
  return e

parseInt :: Parser Nest
parseInt = do
  i <- many1 digit
  return $ I (read i)

Parse a simple word without any spaces or nesting between them.

parseWord :: Parser Nest
parseWord = do
  w <- many1 (noneOf " \n\r\t[]")
  return $ W w

Run these commands and see what they print out.

readLine ""
readLine "1 2 3"
readLine "1 2 + 3 dup"
readLine "[1 [2 +]] dup"

Are there any surprises?

Our language has the following literals,

  • numbers such as 1 2 3 …
  • combinators (functions)
  • pop, swap, dup, + -
Can you write the BNF notation for this language?

Remember, a BNF notation for a simple expression language looks like this

<digit> ::= 1 | 2 | 3 | ... | 0
<num> ::= <digit> | <digit> <num>
<expr> ::= <expr> + <expr>
         | <expr> - <expr>
         | <num>

Do we need anything representing <expr> in our BNF?