Roco programming language - Specification - version 20071014
------------------------------------------------------------

Roco is a programming language that implements some form of coroutines. They
may be a bit different from what one would expect from coroutines though,
because by default they loop forever, they have no input/output parameters,
and there is no system with multiple stacks or continuations. Each coroutine
has exactly one instruction pointer (IP); no copies of an IP are ever made.

Roco was initially made on 13 October 2007.

Basic Syntax
------------

The basic language syntax of Roco consists of the following keywords, where
### is a user-chosen identifier. When the names of coroutines in this and
higher scopes are parsed, the identifier follows the same rules as a function
name in the C programming language.

co ###{} : define a coroutine with name ###; put the routine body inside
           the {}
co ###;  : forward declare a coroutine; it must be defined elsewhere in the
           *same* scope. Use this to make its name available already
ro       : the name of the root coroutine
yi ###   : yield to a coroutine. Yielding to a coroutine will not push or pop
           anything to/from the coroutine stack
ca ###   : call a coroutine. Calling pushes the address of the current
           coroutine to the coroutine stack, then yields to the given
           coroutine
ac       : the inverse of ca: it pops a coroutine address from the coroutine
           stack and yields to that coroutine. When the stack is empty, it
           ends the program.

A valid coroutine identifier starts with a letter or underscore and can
contain only letters, underscores and digits. Names of existing keywords or
instructions of the core language shouldn't be used as coroutine identifiers.

Auxiliary instructions
----------------------

Apart from the instructions yi, ca and ac, there are also the following
instructions available in the core language.
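
Before going into those instructions, here is a small sketch contrasting ca/ac
with yi as described under Basic Syntax. The coroutine names "hi" and "bye"
are made up for this illustration, and the sketch assumes (as in the example
at the end of this document) that coroutine definitions in the root routine
are not themselves executed as instructions. cout, from the list of auxiliary
instructions, outputs one character.

```
/* "hi" and "bye" are hypothetical names chosen for this sketch */
co hi
{
  cout 72   /* print H (ascii character 72) */
  cout 105  /* print i (ascii character 105) */
  ac        /* pop the caller (the root) from the coroutine stack, yield to it */
}

co bye
{
  cout 33   /* print ! (ascii character 33) */
  ac        /* entered through yi, so the stack is empty: the program ends */
}

ca hi       /* push the root on the coroutine stack, then yield to hi */
yi bye      /* yield to bye without pushing anything */
```

Following the rules above, this should print "Hi!": the "ac" in hi returns to
the root because "ca" pushed it, while the "ac" in bye finds an empty
coroutine stack and ends the program.
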
For each auxiliary instruction, the list below indicates with "o" and "i"
symbols how many output and input parameters it has. For an output parameter,
one needs to put the address of the variable in square brackets [] or a
pointer to the address of the variable in double square brackets [[]]. For an
input parameter, one can use a literal number, an address of a variable in
square brackets [], or a pointer in double square brackets [[]]. All variables
are integers on one big memory heap.

if i: skip the next instruction if i is 0
set o i: set o to i
eq o i1 i2: check if i1 equals i2 (puts 0 or 1 in o)
neq o i1 i2: check if i1 is not equal to i2 (puts 0 or 1 in o)
gt o i1 i2: check if i1 is greater than i2 (puts 0 or 1 in o)
lt o i1 i2: check if i1 is smaller than i2 (puts 0 or 1 in o)
inc o: increment
dec o: decrement
add o i1 i2
sub o i1 i2
mul o i1 i2
div o i1 i2: integer division
mod o i1 i2: modulo division
and o i1 i2
or o i1 i2
xor o i1 i2
not o i: invert bits of i and store result in o
cout i: output a character to user
cin o: input a character from user
iout i: output an integer to user
iin o: input an integer from user

Variables
---------

All variables are on one big heap. The difference between a literal, a
variable and a pointer is:

literal:  a number, like 123. This can only serve as input to operations.
          Negative integers are possible, e.g. -123.
variable: the name of the variable is its number in the heap. For example
          variable123 is denoted by [123] and can have any value.
pointer:  [[123]] is the same as [variable123], so it looks in variable123
          what the address of the actual variable is.

The variables are signed integers of at least 32 bits. The number of
variables (the size of the memory heap) is determined while the program runs
and is large enough as long as there is enough RAM installed. Variables are
not initialized to any value when the program starts. Errors in Roco code
that result in trying to access a variable with e.g.
number 466821123 will make the computer really slow, because it will allocate
that many integers.

Whitespace and comments
-----------------------

Whitespace is ignored, except that for variables [123] and pointers [[123]]
there may be no spaces between the brackets and the numbers. Any character
with ASCII value less than 33 is considered to be whitespace (this includes
spaces, newlines and tabs).

Comments can be entered in the source code in /*comment*/ blocks. Comments
can be nested, so the following will be parsed as 1 complete comment:

/*commented out/*blah blah*/commented out*/

Remarks
-------

Every coroutine has its own instruction pointer, pointing to one of its own
instructions. Whenever one yields to a coroutine, it continues where its
instruction pointer was. At the start of the program, every instruction
pointer is 0. This means that recursion can't be achieved by making a
coroutine call or yield to itself, because it just continues from there.

When the instruction pointer reaches the end of a coroutine, it continues
again at the start. This means coroutines can be used to achieve loops.

Coroutines can be defined inside a coroutine (= nested). As mentioned earlier
in this doc, each coroutine has a name, and when the "compiler" parses "yi"
or "ca" commands, it searches for the name first in the current scope, then
in successively higher scopes (the same as the C programming language rules
for parsing names of functions and variables).

The root coroutine is not given by "co ro{}", but is the top level of the
source code. Commands can be typed immediately in the source code, which
means typing them in the root routine.

To exit the program, use the command "ac" when the coroutine stack is empty.

To go to the root routine, use "yi ro" or "ca ro", or use "ac" when the top
of the coroutine stack is the root routine.
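
The looping behaviour described in these remarks can be sketched with a small
counter. The coroutine name "count" and the choice of variable0 as counter
and variable1 as flag are assumptions made for this illustration only.

```
/* "count" is a hypothetical name; variable0 is the counter, variable1 a flag */
set [0] 0       /* variables are not initialized, so start the counter at 0 */
co count
{
  eq [1] [0] 3  /* variable1 becomes 1 once the counter has reached 3 */
  if [1]        /* skips the "ac" below as long as variable1 is 0 */
  ac            /* counter is 3: pop the root from the stack and yield to it */
  inc [0]       /* add 1 to the counter */
  iout [0]      /* print the counter */
}               /* end reached: the coroutine continues again at "eq" */
ca count        /* push the root, then yield to count */
ac              /* the coroutine stack is empty here: end the program */
```

Read against the rules above, this should print 1, 2 and 3 in sequence (with
no separators, as in the "P20" example of this document) and then stop. Note
that after the final "ac" the instruction pointer of count rests at "inc", so
a later yield to count would continue from there and not from "eq".
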

One will want to have at least an "ac", "ca" or "yi" in every coroutine,
because a coroutine that has no way to go to another coroutine will keep
looping in itself forever once it is entered.

The empty program gives an infinite loop. The smallest program that gives no
infinite loop is "ac".

Due to the coroutine property, if for example "ac" is used in a coroutine to
stop on some condition and this "ac" is not at the end of the coroutine, it
may be surprising that the next time something yields to that coroutine, it
will execute what comes after the "ac" instead of the beginning of the
coroutine! This can cause some unexpected bugs if no care is taken.

Examples
--------

Here's a simple dummy example showing the syntax. It'll print "P20".

/*BEGIN OF PROGRAM*/
cout 80 /*print a P (ascii character 80)*/
co a
{
  add [0] 3 5 /*add 3 and 5 and store the result in variable0*/
  co b; /*forward declare b. This allows us to yield to it here already.
          Without this, we could only do so after the definition.*/
  yi b /*yield to b a first time*/
  co b /*here comes the definition of b, a coroutine nested in a*/
  {
    add [0] [0] 6 /*add 6 to variable0*/
    yi a /*go back to a*/
  }
  yi b /*yield to b a second time*/
  ac /*return to coroutine that called me*/
}
ca a /*call coroutine a*/
set [1] 0 /*we'll use variable1 as a pointer to variable0*/
iout [[1]] /*display the value of variable0 (which is 3+5+6+6=20)*/
ac /*end the program*/
/*END OF PROGRAM*/

For more examples, see hello1.txt up to hello4.txt, pointer.txt, nest.txt,
cat.txt and loop.txt.

To run these examples, compile the interpreter (roco.cpp) with g++ (or e.g.
using Dev-C++ in Windows), then run the interpreter with e.g. hello1.txt as
parameter.

--------
Copyright (c) 2007 Lode Vandevenne
All rights reserved.