slashbinbash.de / SASM

SASM is a stack-oriented programming language with an assembly-style syntax. Conceptually, the language has many similarities to PostScript.

init:
    print "Hello World!"
    ret

Download SASM Interpreter (sasm-20230520.zip)

To run the interpreter, you will need a Java 17 Runtime Environment.

java -jar sasm.jar helloworld.init

Language

Data-Types

The standard data-types are:

Name     Example
Boolean  true or false
Integer  42
String   "string"
List     [true, 1, "string", ["list", var]]

Stack

The stack is the central data structure of the language. All instructions transform the stack in one way or another.

All arguments of an instruction are automatically pushed onto the stack. The instruction then pops the required number of elements from the stack (consume) and pushes the result back onto the stack (produce).

The following example shows how the stack changes when the instructions are executed by the interpreter:

push 6, 2  ;[6, 2]
add 8, 4   ;[12, 6, 2]
mul 2      ;[24, 6, 2]
sub        ;[18, 2]
div        ;[9]

Notice how sub has no arguments. It consumes two elements from the stack and produces one.
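
The push-then-consume behavior above can be modeled in a few lines of Python. This is only a sketch of the semantics described in this section, not the interpreter's actual code. The names run and OPS are invented for illustration. It assumes that arguments are pushed right to left, so that the first argument ends up on top (which is what the stack comments in the example imply), and that div is integer division.

```python
import operator

# Binary operations: each consumes two elements and produces one.
OPS = {
    "add": operator.add,
    "sub": operator.sub,
    "mul": operator.mul,
    "div": operator.floordiv,  # assumption: integer division
}


def run(program, stack=None):
    """Execute (name, args) pairs; the top of the stack is index 0."""
    stack = list(stack or [])
    for name, args in program:
        # Arguments are pushed first, right to left, so the first
        # argument ends up on top of the stack.
        for arg in reversed(args):
            stack.insert(0, arg)
        if name != "push":
            a = stack.pop(0)  # top element
            b = stack.pop(0)  # element below it
            stack.insert(0, OPS[name](a, b))
    return stack


program = [
    ("push", [6, 2]),  # [6, 2]
    ("add", [8, 4]),   # [12, 6, 2]
    ("mul", [2]),      # [24, 6, 2]
    ("sub", []),       # [18, 2]
    ("div", []),       # [9]
]
print(run(program))  # [9]
```

In this model an instruction without arguments, such as sub, skips the push step and works only on what is already on the stack.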

Comments

Comments start with a semicolon ';' and can appear anywhere in the code:

;comment
push A ;comment

Modules

A module is a collection of instructions and labels. Every file or document in SASM is a module. The name of the module is the file name, without the .sasm extension.

Labels

A label is a symbolic reference to an instruction address. The labels are local to the module they are defined in. Labels are data-type objects like numbers and strings.

There are three types of labels:

  1. Public symbolic labels
  2. Private symbolic labels
  3. Private numeric labels

To define a public symbolic label, you write a name, followed by a colon ':'.

label:

The name must be unique within the module. You can reference the label from within the same module like this:

label

To reference a public label from another module, you preface the label with the name of the module, followed by a period.

module.label

To define a private label, preface the name with a period '.'.

.local_label:

The name must be unique within the module. You can reference the label from within the same module like this:

.local_label

You cannot reference a private label from another module.

To define a private numeric label, write a non-negative integer followed by a colon ':':

1:
1:
2:
33:

Numeric labels can be redefined. You reference the label from within the module by appending either 'b' or 'f', to indicate where the label is located relative to the reference:

42b  ;references the previous numeric label 42
42f  ;references the next numeric label 42
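
Because numeric labels can be redefined, a reference has to be resolved relative to where it appears. The following Python sketch shows one way such a lookup could work. It is not taken from the interpreter. The function resolve and its data layout are hypothetical, and it assumes that a label located at the same address as the referencing instruction counts as "previous".

```python
def resolve(labels, reference, ref_address):
    """Resolve a numeric label reference such as '42b' or '42f'.

    labels: (number, address) pairs for every numeric label definition.
    reference: the label number followed by 'b' (back) or 'f' (forward).
    ref_address: address of the instruction that contains the reference.
    Returns the address of the matching definition, or None.
    """
    number, direction = int(reference[:-1]), reference[-1]
    if direction == "b":
        # Closest definition at or before the reference.
        found = [a for n, a in labels if n == number and a <= ref_address]
        return max(found) if found else None
    if direction == "f":
        # Closest definition after the reference.
        found = [a for n, a in labels if n == number and a > ref_address]
        return min(found) if found else None
    raise ValueError("reference must end in 'b' or 'f'")


# Label 1 is defined three times, label 2 once.
labels = [(1, 0), (1, 4), (2, 6), (1, 9)]
print(resolve(labels, "1b", 5))  # 4
print(resolve(labels, "1f", 5))  # 9
print(resolve(labels, "2f", 5))  # 6
print(resolve(labels, "2b", 5))  # None
```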

Instructions

An instruction consists of the name of the instruction, followed by a list of arguments that are separated by commas ','.

[instruction] [arg0],[arg1],...,[argN]

Labels as Instructions

You can use labels instead of instructions. The following two lines do exactly the same thing:

call      list.sort, [4, 2, 7, 0]
list.sort [4, 2, 7, 0]

When the interpreter encounters a label instead of an instruction, it uses the call instruction on the label, by default.

Variadic Instructions

Each instruction has a minimum number of arguments that it consumes from the stack. For instance, the instruction print consumes at least one argument. If no argument is provided, it will pop one element from the stack.

However, print is variadic: it accepts any number of arguments, and all of them are consumed when the instruction is executed. Instructions that are not variadic consume only their fixed number of elements, so any excess arguments remain on the stack.

This allows you to write the following code:

mov   /name, "Bob"
print "Hello ", name, "!"
    "Hello Bob!"

Concatenation

Instructions can be concatenated on one line with the pipe character '|':

push 8, 4, 2, 6, 2     ;[8, 4, 2, 6, 2]
add | mul | sub | div  ;[9]

Variables

Variables are symbolic references to data-type objects. Any object can be assigned to a variable. To assign an object to a variable use the mov instruction.

mov /A, 3  ;A=3

The slash character '/' tells the interpreter not to resolve the variable to a value. Use the slash character when you want to pass a variable name to an instruction, instead of the value.

Once a data-type object is assigned to a variable, you can use the variable in other instructions:

add A, 5  ;[8]

If you want to assign the topmost element of the stack to a variable, you can call:

push 7   ;[7]
mov  /A  ;[] A=7

If you use a variable that is undefined, the interpreter will throw an error.

Visibility

You cannot define global variables in the language. All variables are local to the context they are defined in.

Branching has an effect on the visibility of variables. Jump instructions do not change the context, meaning that after a jump, you can use the previously defined variables. If you use the call instruction, a new context is created in which the variables defined before the call are not accessible until ret is called.
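
The context rules can be pictured as a stack of variable tables. The Python sketch below is a model of the behavior described above, not the interpreter's implementation. The class Contexts and its method names are made up for this example. A jump would leave the tables untouched, so it has no method here.

```python
class Contexts:
    """Hypothetical model of variable contexts; not SASM's real code."""

    def __init__(self):
        self.frames = [{}]  # one table of variables per active context

    def mov(self, name, value):
        # Variables are always defined in the current context.
        self.frames[-1][name] = value

    def get(self, name):
        # Only the current context is searched.
        if name not in self.frames[-1]:
            raise NameError(f"undefined variable: {name}")
        return self.frames[-1][name]

    def call(self):
        self.frames.append({})  # call creates a new, empty context

    def ret(self):
        self.frames.pop()  # ret returns to the caller's context


ctx = Contexts()
ctx.mov("A", 3)      # mov /A, 3
ctx.call()           # call label: A is not accessible in the new context
try:
    ctx.get("A")
except NameError as err:
    print(err)       # undefined variable: A
ctx.ret()            # ret: the caller's context is active again
print(ctx.get("A"))  # 3
```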

Immutability

All data-type objects are immutable, meaning that they cannot be modified after their creation. This is also true for lists. If you pass a list to a function, the function cannot modify the original list; it has to work on a copy.

The effect of this can be seen when using variables. Think of it this way: if you give the list [1, 2, 3, 4] the name A, you expect A to have very distinct properties. It doesn't make sense for a function to implicitly redefine A when the function is applied to A. Thus, the result of the function is unnamed. It is a different list with different properties. You can still choose to name the new list A, but you have to do that explicitly.

mov   /A, [4, 1, 3, 2]  ;[] A=[4, 1, 3, 2]
call  list.sort, A      ;[[1, 2, 3, 4]]
print A                 ;[[1, 2, 3, 4]]
    [4, 1, 3, 2]

Note how sorting A only pushes a sorted copy of A onto the stack. It does not change the list that we named A, which is why printing A results in [4, 1, 3, 2] and not [1, 2, 3, 4].

Keeping data-type objects immutable allows you to make clear assumptions about how functions behave and what values variables have, even if they contain more complex structures.

Branching

If you want to jump to a label in your code, use the jmp instruction.

jmp label

If the jump to a label depends on a condition, use a conditional jump.

cmp 3, 8
jl  labelA  ;if 3 < 8 jump to labelA
jmp labelB  ;else jump to labelB

If you want to jump to a label, execute some code, and return when it is done, use the instructions call and ret.

label:
    [...]
    ret

init:
    call label
    ret

Higher Order Functions

Since labels are data-type objects, you can push a label to the stack, or assign it to a variable, and call it at a later time:

mov /fn, label
[...]
call fn

You can pass labels to other functions:

call list.reduce, math.add, [1, 2, 3] ;[6]

This is similar to function pointers in other languages.

Anonymous Functions

You can create an anonymous function by placing concatenated instructions between two curly braces '{', '}'.

{ add | sub 3 | call fn | dup }

You can write the previous example like this:

call list.reduce, { add }, [1, 2, 3]  ;[6]

Anonymous functions are data-type objects that you can push, pop, assign to variables, and call:

push { add A, B }
[...]
call

Variables used in anonymous functions will not be resolved until the function is called!

Note: The curly braces only tell the interpreter that this block of text is an object, and that it does not have to be interpreted, unless execute is called on the object.
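
If you know Python, the late resolution of variables is comparable to a lambda that looks up names only when it is called. The snippet below is an analogy, not SASM code. The dictionary env stands in for the current variable context and is an assumption of this sketch.

```python
env = {}  # stands in for the current variable context

# Comparable to: push { add A, B }
# A and B do not exist yet, but nothing is resolved at this point.
fn = lambda: env["A"] + env["B"]

env["A"] = 2  # comparable to: mov /A, 2
env["B"] = 3  # comparable to: mov /B, 3

print(fn())  # 5

env["A"] = 10
print(fn())  # 13, because A is looked up again on every call
```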

Stack character

The stack character '_' can be used to place an element from the stack into the arguments list.

push 1, 2, 3, 4  ;[1, 2, 3, 4]
push _, _, 0, 0  ;[1, 2, 0, 0, 3, 4]

It is useful in cases in which the order of arguments is incorrect, or the stack needs to be rearranged.

The example below shows how you can filter a list. list.filter takes a function that returns a boolean value. If true, the element is kept. If false, the element is discarded.

call list.filter, { le _, 3 }, [1, 2, 3, 4]  ;[[1, 2, 3]]

When the argument list is parsed, the interpreter will replace the stack character with the top element of the stack. The resulting instruction calls are:

le 1, 3  ;[true]
le 2, 3  ;[true]
le 3, 3  ;[true]
le 4, 3  ;[false]

Resulting in the following list: [1, 2, 3]
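
The substitution of the stack character can be sketched in Python as follows. This is a model of the behavior described in this section, not the interpreter's code. The names STACK_CHAR, resolve_args, push and le_3 are invented for the example, and it assumes that list.filter places each element on the stack before it calls the function.

```python
STACK_CHAR = object()  # stands in for '_'


def resolve_args(args, stack):
    """Replace each stack character with the current top of the stack.

    The top of the stack is index 0. Placeholders are resolved from
    left to right, and each one pops an element.
    """
    return [stack.pop(0) if arg is STACK_CHAR else arg for arg in args]


def push(args, stack):
    """Model of the push instruction, with stack character support."""
    for value in reversed(resolve_args(args, stack)):
        stack.insert(0, value)


stack = []
push([1, 2, 3, 4], stack)
print(stack)  # [1, 2, 3, 4]
push([STACK_CHAR, STACK_CHAR, 0, 0], stack)
print(stack)  # [1, 2, 0, 0, 3, 4]


def le_3(stack):
    """Model of { le _, 3 }: compare the top of the stack with 3."""
    a, b = resolve_args([STACK_CHAR, 3], stack)
    return a <= b


# Assumed behavior of list.filter: each element is on the stack
# when the function is called.
kept = [x for x in [1, 2, 3, 4] if le_3([x])]
print(kept)  # [1, 2, 3]
```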

Wildcard character

The wildcard character '*' can be used when the value of the object does not matter. For instance, when you are doing comparisons:

lst_test:
    test []        | je 0f
    test [*]       | je 1f
    test [1, *, 3] | je 3f
    ret -1
0:
    ret 0
1:
    ret 1
3:
    ret 3

The first test checks if the list is empty. The second test checks if the list has exactly one element. The wildcard tells the interpreter that it does not matter what value it is. The third test checks if the list has three elements. The first and the third element are set to a concrete value. The second is a wildcard, which means that lst_test will return 3 for any of the following inputs:

[1, 5, 3]
[1, [42, -23], 3]
[1, "test", 3]
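
The matching rules of lst_test can be expressed in Python like this. It is a sketch of the comparison as described above, not the interpreter's implementation. WILDCARD and matches are hypothetical names, and the sketch assumes that a wildcard matches exactly one element of any type, including a nested list.

```python
WILDCARD = object()  # stands in for '*'


def matches(pattern, value):
    """Return True if value matches pattern."""
    if pattern is WILDCARD:
        return True  # the value does not matter
    if isinstance(pattern, list):
        # Lists match if they have the same length and every
        # element matches the element at the same position.
        return (
            isinstance(value, list)
            and len(pattern) == len(value)
            and all(matches(p, v) for p, v in zip(pattern, value))
        )
    return pattern == value


def lst_test(value):
    """Python counterpart of the lst_test example above."""
    if matches([], value):
        return 0
    if matches([WILDCARD], value):
        return 1
    if matches([1, WILDCARD, 3], value):
        return 3
    return -1


print(lst_test([1, 5, 3]))          # 3
print(lst_test([1, [42, -23], 3]))  # 3
print(lst_test([1, "test", 3]))     # 3
print(lst_test([2, 5, 3]))          # -1
```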

Sources