Oh, there are so many languages whose syntax I like so much, but I will start with FORTH. It has the "purest" and simplest syntax: it's just a stream of WORDs. Whitespace delineates WORDs, and a WORD is any sequence of characters excluding whitespace. ^.^
A WORD you can think of as a function in FORTH, although whether a given one is a built-in or not can be debatable. You can rebind WORDs, and WORDs are only run one at a time, so a single WORD will run to completion before the interpreter even looks at the next. You can write a basic FORTH interpreter in just a few hundred lines of code in a verbose language (a few dozen or so in Python). FORTH runs on satellites even to this day, and on more things than people would expect; it is fast, short, and easy to update.
The basic concept of FORTH is that there is a STACK, which you can push to and pop from, though in most implementations there are also WORDs that allow you to do things like assign variables.
For example, DUP will duplicate what is on top of the stack, so if the number 1 was on it, now there will be two 1's. To put a 1 on the stack you can just use the WORD 1, which defaults to a built-in function that just parses that integer and puts it onto the stack. All such integers are "defaulted" in this way, though in some implementations you can override that if you so wish. + is a built-in function (again, it can be overridden by the user, but ew in this case?); it just pops the top two items off the stack and pushes the sum of the two things it popped.
So if you run:
1 2 +
Then you will end up with 3 on the stack, as it first pushes a 1 when it runs 1, then pushes a 2 when it runs 2, then pops the 2 and then the 1 and pushes a 3 when it runs the + WORD.
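To make the stack model concrete, here is a minimal sketch in Python of the evaluation just described. It is a toy, not a real FORTH: the `run` function and its two-entry dictionary are names made up for illustration, and the only behavior it models is "look the WORD up, otherwise treat it as an integer".

```python
def run(source, stack=None):
    """Run a whitespace-separated stream of WORDs against a stack (toy model)."""
    stack = [] if stack is None else stack
    # Hypothetical mini-dictionary: each WORD is a function acting on the stack.
    words = {
        "DUP": lambda s: s.append(s[-1]),
        "+": lambda s: s.append(s.pop() + s.pop()),
    }
    for word in source.split():          # whitespace delineates WORDs
        action = words.get(word.upper())
        if action is not None:
            action(stack)                # a known WORD runs to completion
        else:
            stack.append(int(word))      # fallback: parse as an integer literal
    return stack


print(run("1 2 +"))   # [3]
print(run("1 DUP"))   # [1, 1]
```

Because the dictionary is consulted before the integer fallback, a user-defined 1 or + would take priority here, which mirrors the "can be overridden" remark above.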
To define a function you use the : WORD, which takes the next WORD from the input stream as the name, then consumes WORDs until it hits a ; WORD and stores them into the named WORD. So a function like
: DOUBLE 2 * ;
would define a DOUBLE word that pushes a 2 then runs *, which will pop the top two stack values, multiply them together, and push the result, so doing 3 DOUBLE would then leave 6 on the stack. You can optionally have a comment after the name in the function definition inside of ( ... ), and by convention the stack input, then --, then the stack output is generally put there. So for the DOUBLE WORD you'd document it like
: DOUBLE ( a -- b ) 2 * ;
to state that it pops one input and pushes one output, and of course you can write descriptive text there as well. Note that ( is itself a WORD, so it needs whitespace after it.
For note, even things like ; are just functions too, though for ; it can be defined along the lines of
: ; POSTPONE EXIT REVEAL POSTPONE [ ; IMMEDIATE
where the WORD POSTPONE means to take the next WORD and, instead of acting on it right now, compile it into the definition being built, a bit like taking the pointer to a function in most languages. EXIT returns from the WORD being defined, REVEAL will reveal the new WORD so it can be used, [ switches from compilation mode back to interpretation mode, etc... FORTH itself needs very, very few built-ins defined (though there are usually a lot more for speed reasons), and even the act of defining a function is actually taking the function pointers of WORDs and stringing them together, often directly as machine code (sometimes with some simple optimizations to inline things), hence why FORTH is usually quite FAST. ^.^
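As a rough illustration of how : and ; collect WORDs into a new definition, here is a small Python sketch. Everything in it (`make_forth`, the inner `run`, the tiny dictionary) is hypothetical, and it cheats in one important way: it stores the body as text and re-reads it on every call, whereas a real FORTH compiles references to the WORDs at definition time.

```python
def make_forth():
    """Return a run(source) function with its own stack and dictionary (toy model)."""
    stack = []
    words = {
        "DUP": lambda: stack.append(stack[-1]),
        "*": lambda: stack.append(stack.pop() * stack.pop()),
    }

    def run(source):
        tokens = iter(source.split())
        for tok in tokens:
            if tok == ":":
                name = next(tokens).upper()      # the WORD after : is the new name
                body = []
                for t in tokens:                 # collect WORDs up to ;
                    if t == ";":
                        break
                    if t == "(":                 # skip a ( ... ) stack comment
                        for c in tokens:
                            if c.endswith(")"):
                                break
                        continue
                    body.append(t)
                # Simplification: keep the body as text and re-run it on each call.
                words[name] = lambda text=" ".join(body): run(text)
            elif tok.upper() in words:
                words[tok.upper()]()
            else:
                stack.append(int(tok))           # fallback: integer literal
        return stack

    return run


forth = make_forth()
forth(": DOUBLE ( a -- b ) 2 * ;")
print(forth("3 DOUBLE"))   # [6]
```

Note that the sketch only recognizes ( when it stands alone as a WORD, which is the same reason the stack comment needs a space after the opening parenthesis.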
For note, FORTH is usually case insensitive, but full caps is most often used by convention, except in strings (the " WORD).
Let's see a simple, popular tutorial set of code:
: STAR [CHAR] * EMIT ;
: STARS 0 DO STAR LOOP CR ;
: SQUARE DUP 0 DO DUP STARS LOOP DROP ;
: TRIANGLE 1 DO I STARS LOOP ;
: TOWER ( n -- ) DUP TRIANGLE SQUARE ;
This allows you to call, say, 4 TOWER and it will print out:
*
**
***
****
****
****
****
For each function:
-
: STAR [CHAR] * EMIT ;
This defines a WORD named STAR. The [CHAR] word takes the next WORD and compiles the character code of its first character, so the code for * in this case (42), and then EMIT prints the character with that code to the screen. So calling STAR will just print a *. For note, [CHAR] comes from ANS Forth and is missing from some older FORTH systems; to run on those, just change [CHAR] * to 42. ^.^
-
: STARS 0 DO STAR LOOP CR ;
This defines a WORD named STARS. It pushes a 0 onto the stack, then calls the WORD DO, which will pop the top two numbers off the stack (the limit N that was passed in, and the starting 0) and consume WORDs up until LOOP is encountered, then run them repeatedly, counting from 0 up to but not including N, so N times. Lastly, CR prints a line break (\n in other words). So calling 3 STARS will print ***\n.
-
: SQUARE DUP 0 DO DUP STARS LOOP DROP ;
This defines a WORD named SQUARE, which first DUPlicates what's on top of the stack, then pushes a 0, then loops that number of times, each time duplicating the input again and calling STARS on the copy. Finally it DROPs the number that's left on the stack, so calling 2 SQUARE would print **\n**\n.
-
: TRIANGLE 1 DO I STARS LOOP ;
This loops from 1 up to (but not including) the passed-in number and calls STARS with each value of I (the loop index that DO maintains for its loop), so calling 4 TRIANGLE would print *\n**\n***\n, a triangle one size smaller than the number passed in.
-
: TOWER ( n -- ) DUP TRIANGLE SQUARE ;
This just duplicates the input, then calls TRIANGLE (which pops one of those copies off), then calls SQUARE (which pops the original off), and that's it.
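For readers more at home in a conventional language, the five WORDs above can be loosely mirrored in Python. This is only a sketch of the loop bounds, not a translation of the stack handling: the function names are made up, the stack is replaced by ordinary arguments, and the output is returned as a string instead of being EMITted.

```python
def star():
    return "*"                                   # [CHAR] * EMIT


def stars(n):
    # n 0 DO STAR LOOP CR : the index counts 0 .. n-1, then a line break
    return "".join(star() for _ in range(0, n)) + "\n"


def square(n):
    # DUP 0 DO DUP STARS LOOP DROP : n rows of n stars
    return "".join(stars(n) for _ in range(0, n))


def triangle(n):
    # n 1 DO I STARS LOOP : I counts 1 .. n-1
    return "".join(stars(i) for i in range(1, n))


def tower(n):
    # DUP TRIANGLE SQUARE
    return triangle(n) + square(n)


print(tower(4), end="")
```

One caveat on the analogy: Python's range(0, 0) is empty, whereas a plain DO always runs its body at least once, so the two only line up for inputs of 1 or more in stars and square, and 2 or more in triangle.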
But FORTH is pure function: most programs in FORTH will read a lot like English sentences, you can define any kind of DSEL you want, etc... FORTH and LISP have very similar ideas, and both can create DSELs with impunity, but where I'd say FORTH is a purity of function, LISP is a purity of form; they do things in very different ways.
FORTH being so short and easy to implement, even in raw machine code, makes it a really common choice for bootstrapping micro-projects that need to be reprogrammable, plus it's very fun to program in as it is so different from essentially any other language. ^.^