Chris Pressey: List of Unfinished Interesting Esolangs (LoUIE)
Here are some designs for esoteric programming languages that I've worked on, but which have turned out to be a bit resistant to joining the ranks of the completed. You could describe them as "open problems" in esolang design — but that would suggest that the completed designs would be "solutions" to something. (Ha!)
I've disseminated what little I have for these designs below, because it looks unlikely that I'll have the time and resources to complete them in the near future, and I am a gracious individual who wouldn't want to hoard these potentially beautiful things. If you would like to continue the work on any of these designs, feel free. Of course, if you do manage to complete something based on one or more of them, I would appreciate being given some credit (since you are a gracious individual as well, right?)
Unfinished Interesting Esolangs (10)
- Boo-yah!
(1998-ish)
"overlapping instructions in 2 dimensions"
This thing's terminally undesigned. I started thinking about it in the late nineties — possibly as early as 1998 — in Winnipeg.
The challenge is to design a 2-dimensional language similar to Befunge-93, except where each instruction is is composed, not of a single symbol, but of a 2×2 square of symbols. As in Befunge, the IP should move at a rate of one cell per tick, and that means that each executed instruction must in general share two symbols with the previously executed instruction. In other words, in Boo-yah!, instructions overlap.
Symbols are drawn from the alphabet {
/,\}. This two-symbol alphabet gives us 24 = 16 possible instructions. 4 of these instructions would be used to direct the flow of the IP like Befunge's><^v:\\ // /\ \/ // \\ /\ \/
That leaves 12 instructions to manipulate the state.
One of the reasons that this has gone undesigned for so long is the difficulty in selecting a sufficiently "sweet" combination of instructions: one which will allow the instructions to be overlapped, while also having the instructions manipulate the state of the program in useful and interesting and non-trivial ways.
It would be very tempting to just riff on Brainfuck and have a tape, instructions that increment and decrement the current cell on that tape, and instructions that move back and forth on the tape. But I think that's a bit boring. Ideally, each instruction would affect several different bits of state in the runtime model simultaneously, and it should be possible to come up with a series of instructions in which most, but not all, of those state changes have been cancelled out, leaving only the desired change in state. Since this cancellation approach would naturally dovetail with overlapping, it would be a good fit for the language.
- Kig
(2006-ish)
"five points collapse and expand"
Kig was an idea I had while living in Vancouver for an automaton based on a 2-dimensional, orthogonal, integral, Cartesian grid containing only a set of five points. The set of points would shrink until it could shrink no more, at which point it would explode, and begin to shrink again. The execution would be considered terminated once the pattern reached an obvious fixed point, i.e. when the points would cluster in the exact same way repeatedly before exploding.
The shrinking occured in the following manner. On each tick, pick the northwesternmost midpoint of two of the points, and call it the target. (There is a cute pigeonhole proof that, for 5 points with integral coordinates, such a midpoint with integral coordinates will always exist.) If the target is occupied by one other of the five points, stop — it's time to explode. Otherwise, move the furthest, northwesternmost point to the target, and repeat. Since there are still 5 points with integral coordinates, the midpoint property is preserved and the operation can be repeated as necessary. We can also prove, fairly straightforwardly, that this shrinking procedure always terminates, i.e. always reaches the "it's time to explode" state. (The "furthest, northwesternmost" condition is just to disambiguate, in the case that there are more than two points that have a midpoint.)
Unfortunately, although I'm sure I had something in mind while I was riding the 99 B-Line home once, I can't for the life of me remember now how the explode phase was supposed to work. Whatever it was, it would need to position the points far apart, based on their last shrink-phase configuration, in order to be interesting. Extremely far apart, like exponentially so, would be best. Even then, it's not clear if it would be possible to make this system Turing-complete. With 5 points, you do have 10 unbounded counters at your disposal — although your only real operation in the shrink-phase is something akin to division by two...
I recently found some of my old notes on Kig. Watch this space for more details.
- Belabour
(2006-ish)
"relative entropy meets Prolog: a crossword logic game"
Also while in Vancouver, I came across a copy of that classic book on Information Theory, The Mathematical Theory of Communication (Claude Shannon and Warren Weaver, University of Illinois Press, Urbana IL, 1969), and found this quoted on page 14:
"...[I]t is interesting to note that a language must have at least 50 per cent of real freedom (or relative entropy) in the choice of letters if one is to be able to construct satisfactory crossword puzzles. If it has complete freedom, then every array of letters is a crossword puzzle. If it has only 20 per cent of freedom, then it would be impossible to construct crossword puzzles in such complexity and number as would make the game popular. Shannon has estimated that if the English language had only about 30 per cent redundancy, then it would be possible to construct three-dimensional crossword puzzles."
— Warren Weaver, Recent Contributions to the Mathematical Theory of Communication
So let us consider another puzzle game based on another language. The language is not so strange; take Prolog's semantics and cast them in Forth's syntax (Reverse Polish Notation — RPN.) That includes identifiers — to construct the atom
catyou might sayca.t., or equally well,tca... To express a Horn clause likegp(X,Y) :- p(X,Z), p(Z,Y), you might sayXY,gp.*ZY,p*XZ,p*&:, or any of a number of other valid permutations.Now, let us say that that clause need not only be read left to right, but, like a crossword puzzle entry, could be read top to bottom. And that a set of such clauses shall be cast into a grid, with the rows and columns both being clauses in the logic program, even though they are overlapping. And, just to keep it feasible, we provide in the language a symbol
#which does nothing except waste space.So, here is the game: I give you a logic puzzle, hopefully of some moderate "interestingness", like that old chestnut about the farmer with his goose and his fox who's trying to smuggle them onto a boat into Eurozone, or get them to market, or something. You give me a square n×n grid of symbols which contains a Belabour program which solves this puzzle. You get points for making n small; you have points deducted for using
#.What would this game tell us about the percent of relative entropy in logic? If nothing so profound could learned about "pure logic", per se, then what about Horn-clause-based logic programming — and is that not still an interesting question?
- Potro
(2007-ish)
"programs form a ring"
Since finishing the first design for Burro, I've been interested in languages where the set of all programs in the language forms an abstract algebraic structure (over a semantic equivalence relation) under some non-contrived binary operations.
So, for instance, Burro programs form a group under concatenation, which means that, for every Burro program, you can find some other Burro program that annihilates it (concatenating it to the first program yields a null program.)
A ring is a somewhat stricter algebraic structure: it is a set (of programs, in this case) with two binary operations (called the additive operation and the multiplicative operation). The operations could be any transformation that can be applied to any two programs to produce another program. They must meet certain constraints, however — algebraic properties which must hold for all members of the set. Rings build on groups (Abelian groups, actually) and extend them with a couple of further properties, including distributivity, which should look familiar from even high school algebra: (a + b) * c = (a * c) + (b * c).
I set out looking for a language whose programs form a ring under some appropriate operations, but I have yet to find one.
I tried using sequential composition as the multiplicative operation and parallel execution as the additive operation in Cabra, but the result was a slightly different algebraic structure, an idempotent semiring (also called a dioid.) Another, somewhat more experimental (and unpublished) design I toyed with treats a program as a map from a set of labels to basic blocks, with Cartesian product of label sets as the multiplicative operation, and set-union of label sets as the additive operation. This too is only a dioid.
So what semantics, specifically what choice of operations, would lead to an actual ring?
Obviously it'd be best if the operations weren't completely contrived. It's probably possible to have rather degenerate operations where the additive operation is based directly on addition of numbers, while the multiplicative operation is based on numerical multiplication. It's possible this could be done by extending addition and multiplication to unbounded arrays of numbers — that is, tapes — and have programs manipulate the tapes. But I don't think it's a very interesting route to explore.
The most natural (in some sense) choice for the mulitplicative operator is sequential composition (which corresponds to concatenation of program texts in many simple languages.) The problem is that when you sequentially compose two programs, the first program might never halt — so the second program doesn't ever run, and it doesn't matter what that program is. In algebraic terms, the first program absorbs the second during sequential composition: a * b = a, for any choice of b. But in ring theory, this implies a + a = a, and there is only one element for which that fits: a = 0. Moreover, we need x + 0 = x for some x that's not zero. That means we need some way to "ignore" a program which doesn't halt during addition, if it doesn't halt, but some way to pay attention to it if it does halt (to evaluate general statements like x + y = z.) The problem is, that means we need some way to tell if programs halt or not, which is (drum roll please) the Halting Problem.
(So why do dioid languages seem much easier to get than ring languages? Well, unlike rings, dioids do provide idempotency, as a property of addition — a + a = a for any choice of a — and that can be exploited here. It's also worth noting that there are some fundamental theoretical computer science concepts which are dioid-based; for example, regular expressions can be defined by Kleene algebras, which are dioids with an extra "Kleene star" operation.)
Even if we restrict ourselves to the programs that always halt, multiplicative sequential composition poses an extremely tricky problem when we try to address distributivity — in particular, right-distributivity. We need (a + b) * c = (a * c) + (b * c). If we think of the left side as combining the results of a and b, and then running c on those results, then the right side means that we ought to get the same thing if we combine the results of two runs: running a then c, and running b then c. Problem is, c can do all kinds of crazy things, even when it always halts. It could check that its input is the result of what you would get by running a only, or b only, but not anything else. The right side is then effectively c + c, and how do you in general enforce that combining some result with itself has the same semantics as combining two different results and possibly post-processing them?
I suspect that the underlying reason that getting a ring language is hard (at least when you insist on sticking to "intuitive" operations) is because a ring is starting to approach a structure (a Euclidean domain) in which you can perform the Euclidean algorithm and obtain greatest common denominators. Now, if you could perform that algorithm on programs, you'd be able to decompose every program into "prime subprograms". This sounds intuitively a little too powerful — you'd be able to find programs which are in some sense optimal! I'm not certain this would actually violate established undecidability results like Rice's theorem, but it certainly sounds like it's edging up against them.
- Goldbach
(2007-ish)
"only conjecturally Turing-complete"
Goldbach's Conjecture states that every even integer greater than 2 can be written as the sum of two primes. It has, as of this writing, never been proved nor shown false.
So consider a language in which arithmetic processing, and by consequence Turing-completeness, is dependent on Goldbach's Conjecture being true.
How would this work? Well, we would provide operations to construct primes and to add them, and not much else. Further, we set it up so that some property crucial to being Turing-complete, like arbitrary looping, is dependent on, say, factorization of large even numbers. (Perhaps there is a BASIC-like
GOTOstatement, and all line numbers must be even, and each must be 10100 times bigger than the previous, or something to this effect.)Then, if Goldbach's Conjecture is true, we can construct any integer we want, and our language is Turing-complete. But if it's not, we can't, and there are some programs we can't construct, and our language is not Turing-complete.
- Naoko
(2008-ish)
"everything is explicit"
While living in Hyde Park in Chicago, I had the idea to design a high-level programming language where everything is explicit. If you are reading that and thinking "how absurd", congratulations, your brain works! Every string of symbols has multiple possible interpretations, so there will always be some semantic function providing the interpretation; that function must either be implicit, or must be described in terms of some other system — i.e. some other strings of symbols — which suffers the exact same problem. So, if you try to make everything explicit, the best you can really do is make it circularly defined.
But still, it could be an interesting line of inquiry, if we tack onto "everything is explicit" a disclaimer like "insofar as this is possible," and pick a few especially significant things to make explcit. It could be interesting because, in programming, explicit phenomena are manageable phenomena — this is why I picked the name Naoko, which means "obedient child" in Japanese. It is also interesting because we are having it be a high-level language, and one of the hallmarks of being "high-level" is that many things are taken care of for the programmer, i.e. not explicit.
For Naoko, I planned to start with a simple eager functional language, similar to Scheme or Haskell (except eager), and aimed to make at least the following three things explicit:
Lexical environments for variable binding. This was to be done by annotating every variable name with the environment it comes from. Since environment are lexically nested, the variable
fooin the immediate enclosing environment would be referred to byfoo^0, the one from the next level upfoo^1, and so forth.Control flow. This was to be made explicit by continuations; functions could never return, but would be forced to continue a continuation as their final act. In addition, there could be no "call-with-current-continuation"-like construct, as the "current continuation" is naturally a somewhat implicit object. These two things would force the entire program to be written in a very verbose continuation-passing style.
Memory management. The plan was to add explicit allocation and deallocation operations. Explicit allocation does not add much except for, every time you create a list or closure you should put
newin front of it. It's deallocation where it gets interesting. At the last use of a datum, you would need to mark itfree, and if you make copies of it to pass to functions, you would need to explicitly mark themcopy. There's kind of a simplistic syntactic linear typing thing going on.
As if each of these wasn't bad enough alone, their combination and interaction brings up a couple of further issues. I never addressed them fully, but similar ideas with similar interaction eventually materialized in my programming language Unlikely.
- Paneer
(2008-ish)
"comprehensive compositional language system"
A self-modifying programming language which is "input-universal". That is, a programming language which can transform itself in a way similar to how Mascarpone transforms itself, but which also makes the guarantee that it can "turn itself into any other language," meaning that after a finite number of prefix symbols from the input program have been interpreted, the remaining symbols can be interpreted as a program in any other Turing-complete programming language.
(Given a reasonable input alphabet, I suppose, and ignoring tedious and trivial details of encoding.)
This is a matroiska language, because the input can be divided into distinct languages. But it's possibly the most flexible possible matroiska language.
I'm not 100% certain that Mascarpone doesn't already qualify.
One part of this process — say, turning Mascarpone into Pascal — would be turning every symbol into a little state machine that scans and parses itself (when encountered in keyword context, or adds itself to a string constant if encountered between quotes.)
- Seltzer Spigot
(2008-ish)
"computational ascii art"
Once, quite innocently, I ran a Perl script I was writing to produce a simple inverted index. Completely nothing special. But when I ran it, I was surprised to get the following beautiful output — it turns out I got a regular expression wrong (I always think whitespace is
\wwhen it's really\s...) So, the challenge is to find a programming language in which this output is a meaningful program and, in general, programs look something like this.: : : ../../: /: ../../: /: \ : ++.: -: .: /: : :
- Milab
(2009-ish)
"memory mapped, but where?"
In the vein of ZOWIE, some operation critical to Turing-completeness, like control flow, is memory-mapped. The twist is, you don't know where. It could be any memory location (out of an unlimited number), so you have to discover it in order to use it and write an arbitrary program.
The most obvious way to discover this memory location would be through systematic probing, much like you might probe for bad RAM. Such probing, however, would seem to necessitate a loop and a test of some kind, and with such operations available, chances are the language is already Turing-complete, or at least so darn close as to not be very interesting.
So the trick to making this a good design would be figuring out what to hide, and what to provide to find it without making it unnecessary.
- Faradaisical
(2009-ish)
"programmagnetic induction"
Execution of instructions in one program creates a "programmagnetic field" which induces the execution of instructions in another nearby program.
Now, I think that's a pretty good joke language, as is. But it could maybe even be developed into something usable. Of course, it would ideally adhere to as many of the rules of electromagnetic induction as possible — cross product of the flux, and all that jazz. Being able to wind a program around into a coil whose inductance can be measured in Turinghenries would be a bonus! (Would it add ringing to a computation of the Ackermann function...?)