Implementing a Programming Language in 7 Lines of Code
Implementing a programming language is often viewed as a daunting task reserved for compiler engineers. However, at its core, the process is an exercise in understanding the fundamental nature of computation. By stripping away the syntactic sugar and complex features of modern languages, we can uncover the elegant machinery that allows a few simple rules to create a Turing-complete system.
This guide explores the implementation of a minimalist, higher-order functional language based on the lambda calculus. We will move from a 7-line interpreter to a more robust implementation, demonstrating how the eval/apply design pattern scales to create full-featured languages.
The Foundation: Lambda Calculus
Developed by Alonzo Church in the early 20th century, the lambda calculus is a mathematical notation for reasoning about functions. It is the theoretical bedrock of all major functional languages like Haskell, Scheme, and ML, and it influences the design of languages like JavaScript and Python.
Remarkably, the lambda calculus requires only three types of expressions:
- Variable References: A name referring to a value.
- Anonymous Functions: Written as
(λ v . e), wherevis the argument andeis the expression returned. - Function Calls: Applying a function to an expression, written as
(f e).
Despite lacking built-in numbers, booleans, or loops, the lambda calculus is Turing-equivalent. This means any function computable by a Turing machine can be expressed using only these three constructs. This is achieved through advanced techniques such as Church encodings (representing data as functions) and the Y combinator (enabling recursion).
The 7-Line Interpreter
Using R5RS Scheme, we can implement a denotational interpreter for the lambda calculus in just seven lines of code. This implementation leverages Scheme's read function, which handles the lexing and parsing of s-expressions into a tree structure.
; eval takes an expression and an environment to a value
(define (eval e env) (cond
((symbol? e) (cadr (assq e env)))
((eq? (car e) 'λ) (cons e env))
(else (apply (eval (car e) env) (eval (cadr e) env)))))
; apply takes a function and an argument to a value
(define (apply f x)
(eval (cddr (car f)) (cons (list (cadr (car f)) x) (cdr f))))
; read and parse stdin, then evaluate:
(display (eval (read) '())) (newline)
How it Works: Eval and Apply
The core of this interpreter is the eval/apply cycle. This pattern is a scalable architecture found in many professional interpreters, most notably described in Structure and Interpretation of Computer Programs (SICP).
eval: Takes an expression and an environment and reduces it to a value. If the expression is a variable, it looks it up in the environment. If it is a lambda, it creates a closure. If it is a function call, it evaluates the function and the argument, then passes them toapply.apply: Takes a function (a closure) and an argument, then evaluates the function's body within an environment extended with the argument's value.
An environment is simply a map from variables to values. A closure is a pair consisting of the lambda expression and the environment that existed when the function was defined, which "closes" the open term.
Scaling the Architecture
While the 7-line version is a proof of concept, the eval/apply pattern scales effectively. By expanding the eval function to handle more expression types, we can build a more capable language.
For example, a 100-line implementation in Racket can introduce:
- Primitive Operations: Addition, subtraction, and boolean logic.
- Conditionals:
ifstatements for branching logic. - let and letrec: For local variable bindings and explicit recursion.
- Mutation: The
set!operator for changing variable values. - Sequencing:
beginblocks for executing multiple expressions in order.
By separating the syntactic design (the parser) from the semantic design (the evaluator), developers can quickly prototype new language features. If you want a different syntax, you only need to build a parser that outputs s-expressions, leaving the core evaluation logic untouched.
Critical Perspectives and Insights
Implementing a language in a language that already supports functional primitives (like Scheme or Racket) can feel like "implementing Lisp in Lisp." However, the value of this exercise is not in the utility of the resulting language, but in the understanding it is provides.
As one community member noted, building these systems—whether it is a language interpreter or an ALU with logic gates—demystifies the "magic" of computing:
People should have a sense that yes, it's complex but it's also NOT magic.
Furthermore, the discussion highlights that while the lambda calculus is a minimalist theoretical tool, it can be extended into the Binary Lambda Calculus (BLC), where conventions for representing bits and bytes allow for the creation of incredibly compact self-interpreters, some as small as 170 bits.
Conclusion
From a few lines of Scheme to a full-featured functional language, the journey of implementing an interpreter reveals the fundamental building blocks of computation. By mastering the eval/apply pattern and understanding the environment-closure relationship, programmers can move from being users of languages to architects of the tools they use every day.