Today's dose of pathology is another masterpiece from the mangled mind of Chris Pressey. It's called "[Version](http://catseye.mine.nu:8080/projects/version/)", for no particularly good reason.
It's a wonderfully simple language: there's only *one* actual kind of statement, the labelled assignment. Every statement is of the form "*label*: *var* = *value*". Like last week's monstrosity, Smith, Version is a language that doesn't really have any flow control. But instead of copying instructions the Smith way, in Version the program is toroidal: it executes all statements in sequence, and when it reaches the end, it goes back to the beginning.
The way that you manage to control your program is that one of the variables you can assign values to is special: it's called IGNORE, and its value is the *ignorance space*. The value of the ignorance space is an *irregular* expression (basically, a weak wild-card subset of regexps); if a statement's label matches the ignorance space, then the statement is ignored rather than executed. The program keeps executing as long as there are any statements whose labels fall outside the ignorance space.
Doing anything interesting in Version involves using some special magic variables like IGNORE. So before we look at any examples, we need to go through the list:
* OUTPUT: assigning a value to OUTPUT prints the value to standard out.
* IGNORE: as mentioned above, this is the value of the ignorance space. Assigning a new value changes the ignorance space.
* CAT: assigning a value to CAT actually does string concatenation. Whatever variable was the real target of the previous assignment to a non-special variable is also the real target of an assignment to CAT; the value assigned to CAT is concatenated onto the end of the value of that real target.
* PUT: this is a rather twisted one. It takes the *name* of the last real assignment target, and concatenates the assigned value to it. *Then* it copies the value of the previous variable to the new one. (So, for example, if you had just assigned "X=3", and you did "PUT=B", what would happen is that it would create a variable named XB, and then do "XB=X". It's effectively a way to do a twisted kind of arrays.)
* GET: the inverse of PUT. It does the same name-concatenation trick, but instead of assigning the value of the previous variable to the new name, it assigns the value of the new name to the previous variable. (So "X=3" followed by "GET=B" is actually an assignment "X=XB".)
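To make CAT, PUT, and GET a bit more concrete, here is a rough Python model of the variable store. It is only a sketch of the behaviour described in the list above, not Pressey's implementation: the `Store` class, its `assign` method, and the choice to read unset variables as empty strings are all assumptions made for illustration.

```python
class Store:
    """Toy model of Version's variables (hypothetical; follows the prose above)."""

    def __init__(self):
        self.vars = {}           # variable name -> string value
        self.last_target = None  # name of the last ordinary assignment target

    def assign(self, name, value):
        prev = self.last_target
        if name == "CAT":
            # Append the value to the previous real target.
            self.vars[prev] = self.vars.get(prev, "") + value
        elif name == "PUT":
            # "X=3" then "PUT=B" behaves like "XB=X".
            self.vars[prev + value] = self.vars.get(prev, "")
        elif name == "GET":
            # "X=3" then "GET=B" behaves like "X=XB".
            # Assumption: an unset variable reads as the empty string.
            self.vars[prev] = self.vars.get(prev + value, "")
        else:
            # Ordinary assignment: remember the target for later CAT/PUT/GET.
            self.vars[name] = value
            self.last_target = name


s = Store()
s.assign("X", "3")
s.assign("PUT", "B")   # creates XB and copies X into it
print(s.vars["XB"])    # prints 3
```

Note that the special variables never become the "last real target" themselves, which is what lets a CAT follow an ordinary assignment and still hit the right variable.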
There are also a bunch of special variables that can not be used as the targets of assignments, but only as parts of an expression on the right hand side:
* INPUT: reads a line from the standard input, and returns that value.
* EOF: returns "TRUE" if you've reached the end of the standard input, and "FALSE" otherwise.
* EOL: returns a string containing the end of line character.
Finally, there are a couple of functions:
* PRED n: returns n-1.
* SUCC n: returns n+1.
* CHOP s: returns the string s with the last character removed.
* POP s: returns the string s with the first character removed.
* LEN s: returns the length of the string s.
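For reference, the five functions are easy to mimic in Python. This sketch assumes that every Version value is a string and that numbers are decimal digits stored in strings; the post doesn't spell out the value model, so treat that as a guess. The lowercase function names are hypothetical stand-ins for the Version keywords.

```python
def pred(n: str) -> str:
    """PRED n: n - 1, with the number held as a decimal string (assumption)."""
    return str(int(n) - 1)


def succ(n: str) -> str:
    """SUCC n: n + 1."""
    return str(int(n) + 1)


def chop(s: str) -> str:
    """CHOP s: drop the last character."""
    return s[:-1]


def pop(s: str) -> str:
    """POP s: drop the first character."""
    return s[1:]


def length(s: str) -> str:
    """LEN s: length of s, returned as a string like every other value."""
    return str(len(s))


print(pred("99"), succ("9"), chop("beer"), pop("beer"), length("beer"))
# prints: 98 10 bee eer 4
```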
That's everything that's built in; anything else, you get the pleasure of writing for yourself.
The key to the language is the ignorance space idea. You can create any kind of control flow you want using ignorance spaces. The value of an ignorance space is a sort of very weak regular expression: "\*" means "any sequence of characters", "?" means "any single character", and "|" separates alternatives. So, for example, "s\*z|d?rk|b?m" would match "schnozz" (s\*z), "dork" (d?rk), or "bum" (b?m), but not "schnozzs", "alphabet", or "doork".
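One way to pin down those matching rules is to translate an irregular expression into an ordinary regular expression. The Python sketch below does that under the assumption, taken from the examples above, that a pattern has to match the *whole* label. The helper names `ignorance_regex` and `is_ignored` are made up for this illustration.

```python
import re


def ignorance_regex(space: str):
    """Compile an ignorance space into a regular expression.

    "*" matches any sequence of characters, "?" matches any single
    character, and "|" separates alternatives; everything else is literal.
    """
    alternatives = []
    for alt in space.split("|"):
        parts = []
        for ch in alt:
            if ch == "*":
                parts.append(".*")
            elif ch == "?":
                parts.append(".")
            else:
                parts.append(re.escape(ch))
        alternatives.append("".join(parts))
    return re.compile("(?:" + "|".join(alternatives) + ")")


def is_ignored(label: str, space: str) -> bool:
    """True if the whole label matches one alternative of the space."""
    return ignorance_regex(space).fullmatch(label) is not None


space = "s*z|d?rk|b?m"
for label in ["schnozz", "dork", "bum", "schnozzs", "alphabet", "doork"]:
    print(label, is_ignored(label, space))
# The first three print True, the last three print False.
```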
So, as usual, we start with a "Hello world" program:
```
HELLO: OUTPUT="Hello, world!"
HELLO: OUTPUT=EOL
HELLO: IGNORE="*"
```
So the first statement prints "Hello world"; the second statement prints an end-of-line, and the third statement tells it to ignore everything, which causes the program to stop.
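The toroidal execution model can be sketched in a few lines of Python. This is a toy under stated assumptions, not the real interpreter: it handles only literal assignments to OUTPUT and IGNORE, it assumes nothing is ignored until IGNORE is first assigned, and it decides to halt when a full pass over the program executes no statement. The names `run`, `is_ignored`, and `HELLO` are hypothetical.

```python
import re


def is_ignored(label, space):
    """Whole-label match against an ignorance space ('*', '?', '|')."""
    pattern = "|".join(
        "".join(".*" if c == "*" else "." if c == "?" else re.escape(c) for c in alt)
        for alt in space.split("|")
    )
    return re.fullmatch("(?:" + pattern + ")", label) is not None


def run(program, max_passes=1000):
    """Run (label, target, value) statements round and round; return the output."""
    out = []
    ignore = None  # assumption: no ignorance space before the first IGNORE
    for _ in range(max_passes):
        ran_any = False
        for label, target, value in program:
            if ignore is not None and is_ignored(label, ignore):
                continue  # statement is inside the ignorance space
            ran_any = True
            if target == "OUTPUT":
                out.append(value)
            elif target == "IGNORE":
                ignore = value
        if not ran_any:
            break  # every statement is ignored, so the program is done
    return "".join(out)


HELLO = [
    ("HELLO", "OUTPUT", "Hello, world!"),
    ("HELLO", "OUTPUT", "\n"),  # stands in for EOL
    ("HELLO", "IGNORE", "*"),
]
print(run(HELLO), end="")  # prints Hello, world! followed by a newline
```

The `max_passes` cap is only there so that a program that never ignores itself can't hang the sketch.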
Another example is the good old "99 bottles of beer" song. I'm going to put a line number in parens before each line to make it easy to refer to things; the stuff in parens isn't part of the program.
```
(1) I: BEER = "99"
(2) 0: IGNORE = "I"
    0: OUTPUT = BEER
    0: OUTPUT = " bottles of beer on the wall,"
    0: OUTPUT = EOL
    0: OUTPUT = BEER
    0: OUTPUT = " bottles of beer,"
    0: OUTPUT = EOL
    0: OUTPUT = "Take one down, pass it around,"
    0: OUTPUT = EOL
(3) 0: BEER = PRED BEER
    0: OUTPUT = BEER
    0: OUTPUT = " bottles of beer on the wall."
    0: OUTPUT = EOL
    0: OUTPUT = EOL
(4) 0: FOO = BEER
(5) 0: CAT = "|I"
(6) 0: IGNORE = FOO
```
* We start off in statement 1 by assigning the value 99 to "BEER".
* We want to build a descending loop, so in statement 2 we put "I", the label of statement 1 (the one that sets "BEER" to 99), into "IGNORE", so that the initializer won't ever execute again.
* Then we print out the song, filling in the number of bottles of beer.
* When we get to line 3, that's where a bottle of beer gets passed around, so we decrement "BEER", and then continue printing.
* Line 4 is where things start to get interesting; this is how we're going to terminate the loop. Everything except the initializer in line 1 has the label "0". So we want to do something that will make the ignorance space be "0|I" when we run out of beer. So we store the value of BEER into a temporary variable, "FOO".
* In line 5, we add "|I" to the value of "FOO" using the concatenation special variable. So now FOO contains "#|I", where "#" is the current value of BEER.
* In line 6, we set the ignorance space to the value we built in "FOO".
So, at the beginning of the first iteration, BEER=99; when it gets to statement 4, BEER=98, so FOO ends up as "98|I", and that becomes the ignorance space; only the initializer in line 1 matches it. So we repeat, 99 times, until finally we get to the point where at line 4, BEER=0, and so the ignorance space ends up being "0|I", which ignores the whole program, and it stops.
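If you want to check the counting, here is a small Python trace of just the bookkeeping in statements (3) through (6), with all of the printing left out. It is a simplification rather than an interpreter: the function name `bottle_trace` is made up, and since the ignorance spaces here contain no wildcards, matching is reduced to comparing the label "0" against each "|"-separated alternative.

```python
def bottle_trace(start=99):
    """Return the ignorance space set by statement (6) on each pass."""
    beer = str(start)  # statement (1), which runs only once
    history = []
    while True:
        beer = str(int(beer) - 1)  # (3) BEER = PRED BEER
        foo = beer                 # (4) FOO = BEER
        foo += "|I"                # (5) CAT = "|I"
        ignore = foo               # (6) IGNORE = FOO
        history.append(ignore)
        # The label "0" is ignored only when some alternative is exactly "0".
        if "0" in ignore.split("|"):
            break
    return history


trace = bottle_trace()
print(len(trace), trace[0], trace[-1])  # prints: 99 98|I 0|I
```

The whole-label matching matters here: an ignorance space like "10|I" contains the character "0", but it does not match the label "0", so the loop keeps going until BEER is exactly 0.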
One more cute example: the Unix "cat" program. cat copies its input to its standard output, and stops when it gets to the end. In Version, it would be written:
```
TRUE: OUTPUT=INPUT
TRUE: IGNORE=EOF
```
So: it reads a line from the input, and prints it to the output. Then it sets the ignorance space to the value of the special EOF variable, which returns the value "FALSE" if there's more input, or "TRUE" if the program has reached the end of the file. So while there's more input, the value of the ignorance space is "FALSE", which doesn't match any of the program statements. Then when it reaches the end of the file, the ignorance space value becomes "TRUE", which matches both statements, and so the program ends.
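The same two statements can be simulated in Python to watch the ignorance space flip from "FALSE" to "TRUE". This is a sketch with assumptions: the name `version_cat` is hypothetical, input is modelled as a list of lines, and EOF is taken to become "TRUE" as soon as no unread lines remain, which is one plausible reading of the description above.

```python
def version_cat(lines):
    """Simulate the two-statement cat program on a list of input lines."""
    remaining = list(lines)
    out = []
    ignore = None  # nothing is ignored before the first IGNORE assignment
    # Both statements carry the label "TRUE", so they run until the
    # ignorance space is exactly "TRUE".
    while ignore != "TRUE":
        # TRUE: OUTPUT=INPUT
        out.append(remaining.pop(0) if remaining else "")
        # TRUE: IGNORE=EOF  (assumption: EOF is "TRUE" once input is used up)
        ignore = "FALSE" if remaining else "TRUE"
    return out


print(version_cat(["first line", "second line"]))
# prints ['first line', 'second line']
```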
The special function variables seem a bit like cheating -- the minimalist language design didn't actually work, so some functions were created and hidden as assignments. This isn't like a compiler macro -- these are complete gifts to the language, not shortcuts for actual code.
The idea is nifty but Smith is cooler.
BMurray:
I agree that the special variables are a cheat, but my attitude towards it is that the point of the language is the goofy "ignorance space" idea, which is cool in a really silly sort of way. The special variables are just a hack so that he didn't need to write a real parser.
But yeah, definitely not as cool as Smith.
These are getting quite obscure - not that that's a bad thing! By comparison, Brainfuck and Befunge were pretty well known. Are you familiar with The Esolang Wiki?
Anyhow, love the blog, especially the weekly "Pathological Programming" section.
I think you could maybe get around this by formalizing things such that the operation isn't assignment (=) but streaming (<<), and allowing streams to be chained. The "cheating" functions (PRED CHOP etc) could be replaced with user-configurable external programs which the interpreter invokes, using the <<s to assign standard input and output.
This way of doing things would also make the "special" INPUT and OUTPUT variables more natural, as well as having the advantage that the labels could be removed from the language. The interpreter could, as an optimization, simply discard anything written into a variable which it sees is never used as a right-hand variable, and trivial assignments could be used instead of labels:
DUMMY << OUTPUT << "HELLO WORLD"
DUMMY << IGNORE << "DUMMY"
Of course this would mean that the resulting language would not be Turing complete without the cooperation of external programs, but that itself is a somewhat pathological idea.