Implementing brainfuck in syntax-rules

This started as an exercise for me to stress-test niënor's macro expansion capabilities, so the first version of this code was written in slightly bastardized syntax-rules - ones that niënor could understand.

This post describes how the code works step-by-step. Scroll down for regular scheme syntax-rules, beware you'll miss commentary.

Environment

Niënor's macroexpander is basically trimmed syntax-rules with some quirks. It requires () - null - to be matched explicitly, doesn't allow for ... - ellipsis, and requires a separate form for each syntax case.

These limitations allow the implementation to be "dead simple" - the entire library implementing them (with tests!) is around 170 lines of owl lisp. It also allows me to insert it form-by-form in this article to comment on them separately.

A quick example is as follows:

(define-macro-rule (right now)
  (false! a right now)
  (set! a #f))

;; slack off for now
(define-macro-rule (later)
  (false! a later)
  (begin))

roughly equals to

(define-syntax false!
  (syntax-rules (right now later)
    ((_ a right now)
     (set! a #f))
    ((_ a later)
     (begin))))

Supported features

I'm going to support a lispy flavour of brainfuck. It will require minor changes to make it more usable in a lispy environment (so that the lisp reader is able to consume it easily).

This flavour changes [ to (, ] to ), . to p, and requires one to put whitespace between all non-list tokens. So +++[-]. becomes + + + ( - ) p. Note that I've skipped input support. I'll implement it in "real scheme" version, but for simplicity's sake it's not available in niënor syntax-rules.

Prior work

Though I haven't googled it before implementation, someone (Hirofumi Yoshikawa) had already implemented this idea previously1. And I might have duplicated their work. Their code written for Gauche Scheme is available on github.

Implementation

We'll need to keep track of the tape, and the code we still need to execute. To easily access the tape, I'm storing it in two bindings: left and right.

1 2 [3] 4 5
\_/ \^____/ right
 |   |
 |   current pointer
 left

I'll be able to access the current pointer with a simple (_ value . rest) destructure, where value is the current pointer. I'm using _ as a "hook" in lists to simplify bindings because of upper mentioned limitations of niënor's macroexpander. It makes null () accumulators impossible (there will always be at least one _ element, so an additional case for () is not needed.

Initialization

This is the code required to bootstrap and finish evaluation of the macro.

;; init
(define-macro-rule (_)
  (bf . code)
;;     / left
;;     |   / right
  (_bf (_) (_) code))
;;     ^^^ ^^^
;;   start with empty stacks for both left and right

;; end
(define-macro-rule (())
  (_bf l r ())
;;     ^^^  ignore stacks, match only () for code. if not for the _ hook, if l or r were to be empty (),
;;          I would have to match them explicitly in another form.
  (puts "\neot\n"))

+ and -

For arithmetic, I use the unary system, as it's pattern matchable and immediate. So, 0 = (_), 1 = (_ I), 2 = (_ I I), ...

;; when a value v is available at current pointer, grab it and add 1 (I'm using I for 1) to it,
;; consume token from code and recurse
(define-macro-rule (+ _ I)
  (_bf (_ . left) (_ (_ . v) . right)   (+ . code))
  (_bf (_ . left) (_ (_ I . v) . right) code))

(define-macro-rule (+ _ I)
  (_bf (_ . left) (_)       (+ . code))
  (_bf (_ . left) (_ (_ I)) code))

Since a lot of programs seem to depend on integer overflow, I've added additional case for over and underflow. This is quite ugly, but gets the job done. When trying to + a number that is exactly 255, rewrite it to 0.

(define-macro-rule (+ _ I)
  (_bf (_ . left)
       (_ (_ I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
           I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
           I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
           I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
           I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
           I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
           I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
           I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I)
          . right)
       (+ . code))
  (_bf (_ . left) (_ (_) . right) code))

Subtraction is implemented in a similar manner, with a bit more destructuring.

(define-macro-rule (- _)
;;                      / this v just gets dropped
  (_bf (_ . left) (_ (_ v . rest) . right) (- . code))
  (_bf (_ . left) (_ (_ . rest) . right)   code))

;; if there is no value on current ptr yet, rewrite it to 0 subtract again
(define-macro-rule (- _)
  (_bf (_ . left) (_)     (- . code))
  (_bf (_ . left) (_ (_)) (- . code)))

And underflow, using the same trick as overflow.

(define-macro-rule (- _)
;;                   / this 0 gets rewritten to a 255
  (_bf (_ . left) (_ (_) . right) (- . code))
  (_bf (_ . left) (_ (_ I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
                      I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
                      I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
                      I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
                      I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
                      I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
                      I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
                      I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I) . right) code))

< and >

Moving the current pointer right is as simple as destructuring right to (ptr . rest), and moving ptr to the beginning of left.

(define-macro-rule (> _)
  (_bf (_ . left)   (_ v . right) (> . code))
;;                     ^
;;          __________/
;;         /  whoosh!
;;        v
  (_bf (_ v . left) (_ . right)   code))

(define-macro-rule (> _)
  (_bf (_ . left)     (_)  (> . code))
  (_bf (_ (_) . left) (_) code))

Vice versa for <.

(define-macro-rule (< _)
  (_bf (_ v . left) (_ . right)   (< . code))
;;        ^
;;         \__________
;;           !hsoohw  \
;;                     v
  (_bf (_ . left)   (_ v . right) code))

;; fail hard on < at ptr=0
(define-macro-rule (< _)
  (_bf (_) (_ . right) (< . code))
  (compiler-error "underflow"))

. / p

As noted before, . was renamed to p.

The simplest way to do output in this context is to just generate putchar instructions. In a real lisp/scheme one could accumulate all output into a list and write it all at once in the end. Since niënor is not a real lisp, I do this.

(define-macro-rule (p _)
  (_bf (_ . left) (_ (_ . c) . right) (p . code))
  (begin
    (putchar (_cnt . c))
    (_bf (_ . left) (_ (_ . c) . right) code)))

;; imply 0 on empty current ptr
(define-macro-rule (p _)
  (_bf (_ . left) (_)     (p . code))
  (_bf (_ . left) (_ (_)) (p . code)))

Unary -> decimal translation is handled by a dead simple recursive macro that just counts its arguments. Since niënor has a constant folding mechanism, the compiled code does not do any additions and literally calls putchar for every p.

(define-macro-rule ()
  (_cnt _ . vs)
  (+ 1 (_cnt . vs)))

(define-macro-rule ()
  (_cnt)
  0)

A simple "Hello World!" program I stole from Polish Wikipedia compiles to the following IR.

(_defun main ()
  ((_begin
    ((putchar 72)
     (_begin
      ((putchar 101)
       (_begin
        ((putchar 108)
         (_begin
          ((putchar 108)
           (_begin
            ((putchar 111)
             (_begin
              ((putchar 32)
               (_begin
                ((putchar 87)
                 (_begin
                  ((putchar 111)
                   (_begin
                    ((putchar 114)
                     (_begin
                      ((putchar 108)
                       (_begin
                        ((putchar 100)
                         (_begin
                          ((putchar 33)
                           (_begin
                            ((putchar 10)
                             (puts-static "\neot\n" _putchar)))))))))))))))))))))))))))
   (_tailcall! exit (128) 0)))

Looping

Since loops are defined by lisp lists, they're easily pattern-matchable in macro rulesets. Because of limitations imposed by niënor, loops need to have at least one expression in them. In regular scheme loop code would be matched with an ellipsis. Here, it's (x . xs) which requires x to exist.

Since numbers are pattern-matchable, it's possible to decide whether a condition is met or not (is ptr equal to 0 or not) by pattern matching against our 0 - (_) (it even looks quite round... like a zero...).

;; again, an empty ptr case
(define-macro-rule (_)
  (_bf (_ . left) (_)     ((x . xs) . code))
  (_bf (_ . left) (_ (_)) ((x . xs) . code)))

;; v is whatever, execute loop and jump back
(define-macro-rule (_)
  (_bf (_ . left) (_ v . right) ((x . xs) . code))
  (_bf (_ . left) (_ v . right) (flapp
                                 (x . xs)   ; "execute loop"
                                 ((x . xs)) ; "jump back"
                                 code)))    ; rest of the program

;; v is zero, skip loop
(define-macro-rule (_)
  (_bf (_ . left) (_ (_) . right) ((x . xs) . code))
  (_bf (_ . left) (_ (_) . right) code))

flapp is a "flat append" macro. Its definition is not important here. Applied enough times, it will return what append would return in regular scheme. It's needed here because of syntax limitations. In a perfect world (x . xs) . ((x . xs)) . code would be nice, but that's unfortunately not handled by owl's lisp reader (or any lisp reader to my knowledge).

Usage

Since all evaluation happens at compile time (quite slowly, but happens), all brainfuck programs compiled to uxn roms execute immediately.

(define (main)
  (bf
    + + + + + + + +
    ( > + > + < < - ) > + + > > + <
    ( - ( > > + < < - ) + > > ) > +
    ( - < < < ( - > ( + ( - ) + > + + > > > - < < ) <
      ( < ) > > + + + + + + ( < < + + + + + > > - )
      + < < + + p ( - ) < < ) > p > + ( > > ) > + ))
  (exit!))
$ time ol -r nienor.scm t/test-brainfuck.scm 
Assembled a.out in 2.20KB (3.44% used), 77 labels

real 0m56,381s
user 0m56,312s
sys  0m0,060s
$ ./uxncli a.out
       *
      * *
     *   *
    * * * *
   *       *
  * *     * *
 *   *   *   *
* * * * * * * *

eot

Issues

Regular syntax-rules version

This is a sparsely commented version that can run in owl lisp. I haven't checked if it works in other scheme implementations, though I believe it would be simple enough to make it work in other implementations if it were to fail. The things that make me worry are () - empty, unquoted lists which are fine in owl lisp, but a syntax error in other scheme implementations.

Input (, renamed to r) is available, though it has to be declared as unary expressions.

Code:

(define-syntax bf
  (syntax-rules (42 I + - > < p r () _count input =>)
    ((_ _count)
     0)
    ((_ _count x . xs)
     (+ 1 (_ _count . xs)))
    ((_ 42 i l rt ())
     '("eot" i l rt))
    ;; +
    ((_ 42 in left ((I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I ; overflow
                     I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
                     I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
                     I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
                     I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
                     I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
                     I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
                     I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I) . right)
        (+ . code))
     (_ 42 in left (() . right) code))
    ((_ 42 in left (v . right) (+ . code))
     (_ 42 in left ((I . v) . right) code))
    ((_ 42 in left () (+ . code))
     (_ 42 in left ((I)) code))
    ;; -
    ((_ 42 in left ((I . v) . right) (- . code))
     (_ 42 in left (v . right) code))
    ((_ 42 in left (() . v) (- . code))
     (_ 42 in left ((I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I ; underflow
                     I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
                     I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
                     I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
                     I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
                     I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
                     I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
                     I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I) . v)
        code))
    ((_ 42 in left ()   (- . code))
     (_ 42 in left (()) (- . code)))
    ;; >
    ((_ 42 in left (v . right) (> . code))
     (_ 42 in (v . left) right code))
    ((_ 42 in left () (> . code))
     (_ 42 in (() . left) () code))
    ;; <
    ((_ 42 in (v . left) right (< . code))
     (_ 42 in left (v . right) code))
    ((_ 42 in () right (< . code))
     (syntax-error "underflow " '(() right code)))
    ;; . -- renamed to p
    ((_ 42 in left (v . right) (p . code))
     (begin
       (write-bytes stdout (list (_ _count . v)))
       (_ 42 in left (v . right) code)))
    ;; , -- renamed to r
    ((_ 42 (v . in) left (_drop . right) (r . code))
     (_ 42 in left (v . right) code))
    ((_ 42 (v . in) left () (r . code))
     (_ 42 in left (v) code))
    ((_ 42 () left (_drop . right) (r . code))
     (_ 42 () left (() . right) (r . code)))
    ((_ 42 () left () (r . code))
     (_ 42 () left (()) (r . code)))
    ;; looping
    ((_ 42 in left () (p . code))
     (_ 42 in left (()) (p . code)))
    ((_ 42 in left ()   ((x ...) . code))
     (_ 42 in left (()) ((x ...) . code)))
    ((_ 42 in left (() . right) ((x ...) . code))
     (_ 42 in left (() . right) code))
    ((_ 42 in left (v . right) ((x ...) . code))
     (_ 42 in left (v . right) (x ... (x ...) . code)))
    ;; init
    ((_ input => (in ...) . code)
     (_ 42 (in ...) () () code))
    ((_  . code)
     (_ 42 () () () code))
  ))

Usage:

(bf input => ((I I I I I I I I I I ; 97
               I I I I I I I I I I
               I I I I I I I I I I
               I I I I I I I I I I
               I I I I I I I I I I
               I I I I I I I I I I
               I I I I I I I I I I
               I I I I I I I I I I
               I I I I I I I I I I
               I I I I I I I))
    r
    > + + + + + + + + + + + + + + + + + + + + + + + + + +
    ( < p + > - )
    )
$ time ol temp.scm
abcdefghijklmnopqrstuvwxyz

real 0m0,951s
user 0m0,938s
sys  0m0,013s

Future work

No.

Epilogue

This was fun. If you're interested in niënor, you can follow its development on github. Full brainfuck interpreter code is available in niënor repo's test suite at t/test-brainfuck.scm.


  1. https://youtube.com/watch?v=QNVyguoyUyA↩︎