Today's friday programming language insanity is a tad different. I'm going to look at another twisted stack-based language. I've got a peculiar fondness for these buggers, because back in the day, I was a serious Forth addict. One of the ideas that's actually come up in serious programming languages in the last few years is creating a sort of cross between functional languages and stack-based languages, producing what are known as concatenative languages. An excellent example of an extremely powerful and useful member of this family is called Factor, by Slava Pestov.
But serious useful languages aren't the realm of my regular friday pathology. So I'm going to tell you about a not-really-serious version of a concatenative language, called False. Semantically, False is actually not a horrible language. In fact, if it weren't for the bogglingly awful syntax, it's something I could imagine using for tiny file-filtering utilities. But the syntax is designed to be truly horrible, and when you blend the natural potential for confusion that you get from doing everything backwards on a stack with a syntax that looks like line-noise, you get something that can really sprain your brain.
The idea, loosely speaking, behind programming in False is that there are really two kinds of things in a false program. There are values - which are pushed on the stack whenever they're encountered in a program - and there are functions which take a stack as input, and produce a new stack as output. So everything is ultimately totally oriented around the stack.
There are, naturally, a bunch of built-in functions for you to work with:
- For basic arithmetic, there's binary (+, -, *, /), unary minus (_, the underscore character).
- For comparing values, there are "=" (equals) and binary ">" (greater than).
- For logic, there are binary "&" (and) and "|" (or) and unary "~" (not).
- For stack manipulation, there are "$" (duplicate the top of the stack),
"%" (discard the top of the stack), "\" (swap the top two stack values),
"@" (rotate the 3rd item in to the stack to the top), and ∅ (take the top value N off the stack, and then copy the Nth stack value and put it in top of the stack). - There's simple string IO - and literal string surrounded by double quotes
is treated as a function that prints out the string.
Values are where things start to get interesting. The simple value types are integers and characters. Integers are written pretty normally, except that an integer literal can't be negative. The only way to write a negative is to write the positive literal, followed by unary minus. So -3 would be written "3_". Character literals are written as characters preceeded by a single quote: `a
.
The interesting value type is the function. A function is really just a value like any other. You write a function by surrounding it in square brackets. So, for example, a function to compute n2+7 where n is the value on top of the stack would be written: [$*7+]
.
If there is a function value on the stack, you can invoke that function using "!". So, to invoke the function we just defined on the value 3, we'd write: "3[$*7+]!
".
Once we have function values, we can talk about control flow. There are only two control flow operators: "?" and "#":
- "?" is the "if/then" control construct. It assumes that the value on top of the stack is a function. It pops the function off the stack and saves it, and then pops the next value on the stack. If the value it popped is not zero, then it executes the saved function; otherwise, it just discards it. So, to print "true" if the top of the stack is non-zero, we could write "
[true]?
". To print true if it the stack top is true, and false if it's false, we could use: "$["true"]?~["false"]
". (To understand that second one: it says "duplicate the value on top of the stack - so there are two copies of it. Then push the function that prints true onto the stack. Then, if the value was non-zero, run the print-true function. After that, we're left with just the copy of the former stack-top. So we unary-not it, and then do another conditional.) - "#" is a while-loop. It takes two functions as parameters, and runs the second one as long as the first one returns true.
One more thing, and we can look at a real example, which will make it all make more sense (or at least as much sense as it will ever make.) False also has a set of global variables, which we can use for naming things. A variable is a single letter. For each variable, there are two functions: "var:", which sets the value of the variable to the value on top of the stack, and "var;", which retrieves the value of the variable and pushes it onto the stack. So, for example, to set the variable "a" to 3, you'd write "3a:".
So, for example, here's the factorial function:
[$0=$[\%1\]?~[$1-f;!*]?]f:
Let's break that down a bit. Remember that the basic factorial function is written recursively as something like "fact(n) = if (n==1) then 1 else n*fact(n-1)".
- The entire thing is a function enclosed in "[]", which gets assigned to the name "f".
- Inside the function, the first thing it does is set up a comparison with "$0=$". It wants to compare the top of the stack to 0, and also keep a copy of it, for multiplying later. So it duplicates it, and compares it to 0. Then it duplicates the result of the comparison (because the only way to do an "else") involves duplicating a comparison result. So, if this was called with, say "3", the stack would be (3 0 0).
- Next, it pushes the function "
[\%1\]
" onto the stack, and then does a conditional. So if the top of the stack is not 0, then it will execute the function it just pushed. So, following our example, the the top of the stack is 0, so the comparison fails. So it will continue on with the stack (3 0). - Next it does a boolean not to the top of the stack, and then does another function with a conditional. So when it gets to the second "?", in our example, the stack would now be "3 -1", and it would invoke the function "
[$1-f;!*]
" on the stack (3) - "
[$1-f;!*]
" is pretty simple: it duplicates the top of the stack;
subtracts one from it, and computes the factorial of that, and then multiplies. So
with our example, it would duplicate (3 3), subtract one (3 2), take the factorial of 2 (3 2), and then multiply (6).
Here's a more interesting example. This is a function which generates all of the prime numbers smaller than whatever is on top of the stack:
[9[1-$][\$@$@$@$@\/*=[1-$$[%\1-$@]?0=[\$.' ,\]?]?]#]p:
So, to use that to compute all of the primes less than 1000, you'd do:
1000p;!
. If you just pick it apart, and look at the pieces, it's really
not that hard to understand - but it looks horrendous. That's one of the
fun things about False - it's actually not all that pathological. In fact, it's
actually a rather elegant minimal language. But the syntax of it is just horrific - designed to be as ugly and incomprehensible as possible. So it ends up looking almost as
bad as TECO code.
If this kind of stack-based semi-functional language appeals to you, I highly recommend taking a look at Factor. Factor is a very serious
language from the concatenative family, which is being actively developed and supported, and has an extensive set of libraries including everything from GUIs to webservers.
- Log in to post comments
Here's a chunk o bad math for you:
Engineers write defence-against-aliens manual
The "TECO" link just points back to this page.
I really love reading about these esoteric languages. And the other pages of course, but these are more along the lines of things I can understand.
On a mildly related note: After you posting about SNUSP, I started writing a little SNUSP interpreter and I would like people to at some point be able to get a copy and have a go with it. Do you, or someone else, have any suggestions as to useful and free file hosting for this sort of purpose?
Thanks.
That's actually kind of similar to dc.
The syntax of dc is a little nicer, but there are no loops or functions -- only conditionals and macros that can be recursive. (It's also officially called a calculator rather than a programming language.)
Sargeist: I use gna.org. It's a development site rather than just plain file serving, but the instructions are easy and the tools you don't need mostly stay out of your way. They do require the code to be under a Free license.
Just to be clear about dc, it's not a concatenative language. The similarity is that it's stack-based with cryptic syntax.
I wrote a factorial macro in dc and discovered that, unlike False, the programs aren't conceptually simple. It has no boolean type and handles conditionals weirdly. You have to compare (and consume) the top two stack values, conditionally triggering a macro.
Anyway, here's factorial in dc. You can see that there's a lot more bookkeeping:
[
d1-
d_1
Oops. That's not it. Let's try escaping.
[
d1-
d_1
Okay, never mind. You can't see that it's longer because these forums escape differently on the preview than the actual posting.
Anyway, it's about four times as long. Mostly because dc makes it hard to do a significant self-contained marcro (one that leaves all the registers in the original state).
Seems to me that the bit about the conditional construct has two typographical errors:
1. Quotes are missing inside the function printing "true".
2. A final question mark is missing in the if-else construction.
I've posted the dc factorial macro, if anyone wants to see it.