# CROISSANT VIRTUAL MACHINE
Croissant (or *crsn* for short) is an extensible runtime emulating a weird microcomputer (or not so micro, that depends on what extensions you install).
## FAQ
### What is this for?
F U N
### How is the performance?
Silly fast, actually. 60fps animations are perfectly doable if that's your thing.
It's probably faster than you need for most things, actually.
You can slow it down using the `-C` argument, or using sleep instructions.
### What if I don't enjoy writing assembly that looks like weird Lisp?
Maybe this is not for you.
### Shebang?
Yes! You can use crsn as a scripting language!
The first line of a source file is skipped if it starts with `#!`.
### Contributing
Yup, go ahead. You can also develop your own private *crsn* extensions, they work like plugins.
# Architecture
The runtime is built as a register machine with a stack and status flags.
- All mutable state (registers and status), called "execution frame", is local to the running routine or the root of the program.
- A call pushes the active frame onto a frame stack and a clean frame is created for the callee.
- The frame stack is not accessible to the running program, it is entirely handled by the runtime.
- When a call is made, the new frame's argument registers are pre-filled with arguments passed by the caller.
- Return values are inserted into the caller's frame's result registers before its execution resumes.
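
To make the frame rules concrete, here is a small sketch. The routine name `add2` and the values are made up for illustration; it only uses instructions described further down in this document.

```lisp
(
    (ld r0 5)             ; r0 belongs to the root frame
    (call add2 10 20)     ; a clean frame is created, with arg0 = 10 and arg1 = 20
    (add r0 res0)         ; back in the root frame: r0 was not touched by the routine
    (halt)

    (proc add2/2
        (ld r0 999)       ; this r0 is local to the routine's own frame
        (add r1 arg0 arg1)
        (ret r1)          ; the return value shows up as res0 in the caller's frame
    )
)
```

If frames behave as described above, the `ld r0 999` inside the routine should have no effect on the root frame's `r0`.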
## Registers
- 8 general purpose registers `r0`-`r7`
- 8 argument registers `arg0`-`arg7`
- 8 result registers `res0`-`res7`
All registers are 64-bit unsigned integers that can be treated as
signed, if you want to. Overflow is allowed and reported by status flags.
8-, 16-, 32-bit and floating point arithmetic is not currently implemented, but will be added later. Probably. Maybe.
## Status flags
Arithmetic and other operations set status flags that can be used for conditional jumps.
- Equal … Values are equal
- Lower … A < B
- Greater … A > B
- Zero … Value is zero, buffer is empty, etc.
- Positive … Value is positive
- Negative … Value is negative
- Overflow … Arithmetic overflow or underflow, buffer underflow, etc.
- Invalid … Invalid arguments for an instruction
- Carry … Arithmetic carry *this is currently unused*
### Status tests (conditions)
These keywords (among others) are used in conditional branches to specify flag tests:
- `eq` … Equal,
- `ne` … NotEqual,
- `z` … Zero,
- `nz` … NotZero,
- `lt` … Lower,
- `le` … LowerOrEqual,
- `gt` … Greater,
- `ge` … GreaterOrEqual,
- `pos` … Positive,
- `neg` … Negative,
- `npos` … NonPositive,
- `nneg` … NonNegative,
- `c` … Carry,
- `nc` … NotCarry,
- `val`, `valid`, `ok` … Valid,
- `inval`, `nok` … Invalid,
- `ov` … Overflow,
- `nov` … NotOverflow,
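
As a quick illustration of how these keywords are used (the conditional branch syntax itself is described in the Syntax section), this sketch classifies the result of a comparison. The register choice is arbitrary.

```lisp
; Three-way compare of r0 and r1, result code stored in r2
(cmp r0 r1
    (eq? (ld r2 0))    ; r0 == r1
    (lt? (ld r2 1))    ; r0 <  r1
    (gt? (ld r2 2)))   ; r0 >  r1
```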
# Syntax
*The syntax is very much subject to change at the moment. The format described here
is valid at the time this file is added to version control.*
Instructions are written using S-expressions, because they are easy to parse
and everyone loves Lisp.
## Program
A program has this format:
```
(
...<instructions and routines>...
)
```
e.g.
```
(
(ld r0 100) ; load value into a register
(:again) ; a label
(sub r0 1 ; subtract from a register
(nz? ; conditional branch "not zero?"
(j :again))) ; jump to the label :again
)
```
The same program can be written in a compact form:
```
((ld r0 100)(:again)(sub r0 1 (nz? (j :again))))
```
## Instruction
Instructions are written like this:
```
(<keyword> <args>... <conditional branches>...)
```
### Conditional instructions
All instructions can be made conditional by appending `.<cond>` to the keyword, e.g. `(j.ne :LABEL)` means "jump if not equal".
These modifiers are mainly used by the assembler when translating conditional branches to executable code.
Note that the flags can only be tested immediately after the instruction that produced them, or after instructions that do not
affect flags (pseudo-instructions like `def` and `sym`, `nop`, `j`, `fj`, `s`, `call` etc). Instructions that can set flags first
clear all flags to make the result predictable.
Status flags can be saved to and restored from a register using the `stf` and `ldf` instructions. This can also be used to set
or test flags manually, but the binary format may change.
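
A sketch of how `stf` and `ldf` can carry flags across an instruction that would otherwise overwrite them. The labels and register choices are arbitrary; `r7` is used as scratch space for the saved flags.

```lisp
(cmp r0 r1)        ; sets the flags we care about
(stf r7)           ; save them to a register
(add r2 r3)        ; arithmetic clears and re-sets the flags
(ldf r7)           ; restore the flags produced by cmp
(j.ne :differ)     ; conditional instruction: jump if r0 != r1
(ld r4 1)
(j :done)
(:differ)
(ld r4 0)
(:done)
```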
### Instruction arguments
Arguments are always ordered writes-first, reads-last.
This document uses the following notation for arguments:
- `REG` - one of the registers (`rX`, `argX`, `resX`)
- `SYM` - a symbol defined as a register alias (e.g. `(sym x r0)`)
- `@REG` / `@SYM` - access an object referenced by a handle. Handle is simply a numeric value stored in a register of some kind.
- `_` - a special "register" that discards anything written to it.
The "discard register" is used when you do not need the value and only care about side effects or status flags.
- `CONST` - name of a constant defined earlier in the program (e.g. `(def SCREEN_WIDTH 640)`)
- `NUM` - literal values
- unsigned `123`
- signed `-123`
- float `-45.6789`
- hex `0xabcd`, `#abcd`
- binary `0b0101`
- character `'a'`, `'🐁'`. Supports unicode and C-style escapes. Use `\\` for a literal backslash.
- `"str"` - a double-quoted string (`"ahoj\n"`). Supports unicode and C-style escapes. Use `\\` for a literal backslash.
- `:LABEL` - label name
- `PROC` - routine name
- `PROC/A` - routine name with arity (number of arguments)
The different ways to specify a value can be grouped as "reads" and "writes":
- `Rd` - reads: `REG`, `SYM`, `@REG`, `@SYM`, `NUM`, `CONST`
- `Wr` - writes: `REG`, `SYM`, `@REG`, `@SYM`, `_`
- `RW` - intersection of the two sets, capable of reading and writing: `REG`, `SYM`, `@REG`, `@SYM`
Objects (`@reg`, `@sym`) can be read or written as if they were a register, but only if the referenced object supports it.
Other objects may produce a runtime fault or set the INVALID flag.
In the instruction lists below, I will use the symbols `Rd` for reads, `Wr` for writes, `RW` for read-writes, and `@Obj` for object handles,
with optional description after a colon, such as: `(add Wr:dst Rd:a Rd:b)`.
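
The following sketch puts several of these argument kinds side by side. The names `LIMIT` and `counter` are invented for the example.

```lisp
(def LIMIT 0x10)        ; CONST, defined with a hex literal
(sym counter r0)        ; SYM, an alias for r0
(ld counter 'a')        ; Wr is a SYM, Rd is a character literal (code point 97)
(add counter 0b0011)    ; RW is a SYM, Rd is a binary literal
(sub _ counter LIMIT)   ; Wr is the discard register: only the status flags are kept
```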
### Conditional branches
Conditional branches are written like this:
```
(<cond>? <instructions>...)
```
- If there is more than one conditional branch chained to an instruction,
then only one branch is taken - there is no fall-through.
- The definition order is preserved, i.e. if the `inval` flag is to be checked, it should be done
before checking e.g. `nz`, which is, incidentally, true by default, because most flags are cleared by instructions that affect flags.
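
A sketch of why the order matters. It assumes that `div` reports a zero divider through the Invalid flag, which this document does not state explicitly, so treat that part as an assumption.

```lisp
(div r0 r1 r2
    (inval? (ld r3 2))   ; checked first: the operation itself failed (assumed for a zero divider)
    (z? (ld r3 1))       ; result is zero
    (nz? (ld r3 0)))     ; anything else; listed first, this would also catch the invalid case
```

Only one of the three branches runs, so `r3` ends up with exactly one of the codes.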
## Routines
A routine is defined as:
```
(proc <name>/<arity> instructions...)
```
- `name` is a unique routine name
- `arity` is the number of arguments it takes, e.g. `3`.
- you can define multiple routines with the same name and different arities, the correct one will be used depending on how it's called
Or, with named arguments:
```
(proc <name> <arguments>... instructions...)
```
Arguments are simply aliases for the argument registers that can then be used inside the routine.
Here is an example routine to calculate the factorial of `arg0`:
```
(proc fac/1
(cmp arg0 2 (lt? (ret 1))) ; base case: 0! = 1! = 1, also stops the recursion for arg0 < 2
(sub r0 arg0 1)
(call fac r0)
(mul r0 arg0 res0)
(ret r0)
)
```
It can also be written like this:
```
(proc fac num
...
)
```
...or by specifying both the arity and argument names:
```
(proc fac/1 num
...
)
```
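
For a complete example with named arguments, here is a sketch of a two-argument routine and a call to it. The routine name `max` and the argument names are made up for illustration.

```lisp
(
    (call max 3 7)
    (ld r0 res0)          ; the returned value is available in res0
    (halt)

    (proc max a b         ; a and b are aliases for arg0 and arg1
        (cmp a b
            (gt? (ret a)))
        (ret b)
    )
)
```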
# Instruction Set
Crsn instruction set is composed of extensions.
Extensions can define new instructions as well as new syntax, so long as it's composed of valid S-expressions.
## Labels, jumps and barriers
These are defined as part of the built-in instruction set (see below).
- Barrier - marks the boundary between routines to prevent overrun. Cannot be jumped across.
- Local labels - can be jumped to within the same routine, both forward and backward.
- Far labels - can be jumped to from any place in the code using a far jump (disregarding barriers).
This is a very cursed functionality that may or may not have some valid use case.
- Skips - cannot cross a barrier, similar to a jump but without explicitly defining a label.
All local jumps are turned into skips by the assembler.
Skipping across conditional branches may have *surprising results* - conditional branches are expanded
to a varying number of skips and conditional instructions by the assembler. Only use skips if you really know what you're doing.
Jumping to a label is always safer than a manual skip.
## Built-in Instructions
```
; Do nothing
(nop)
; Stop execution
(halt)
; Mark a jump target.
(:LABEL)
; Numbered labels
(:#NUM)
; Mark a far jump target (can be jumped to from another routine).
; This label is preserved in optimized code.
(far :LABEL)
; Jump to a label
(j :LABEL)
; Jump to a far label, which can be in another routine
(fj :LABEL)
; Skip backward or forward
(s Rd)
; Mark a routine entry point (call target).
(routine PROC)
(routine PROC/A)
; Call a routine with arguments.
; The arguments are passed as argX. Return values are stored in resX registers.
(call PROC Rd...)
; Exit the current routine with return values
(ret Rd...)
; Deny jumps, skips and run across this address, producing a run-time fault with a message.
(barrier)
(barrier message)
(barrier "message text")
; Block barriers are used for routines. They are automatically skipped in execution
; and the whole pair can be jumped *across*.
; The label can be a numeric or string label, its sole purpose is tying the two together. They must be unique in the program.
(barrier-open LABEL)
(barrier-close LABEL)
; Generate a run-time fault with a debugger message
(fault)
(fault message)
(fault "message text")
; Copy value
(ld Wr Rd)
; Swap values
(swap RW RW)
; Store status flags to a register
(stf Wr)
; Load status flags from a register
(ldf Rd)
; Define a register alias. The alias is only valid in the current routine or in the root of the program.
(sym SYM REG)
; Define a constant. These are valid in the whole program.
; Value must be known at compile time.
(def CONST VALUE)
```
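
A short sketch combining a few of the built-ins above: constants, register aliases, `swap`, and `fault` used as a sanity check. The names `A`, `B`, `x` and `y` are invented for the example.

```lisp
(
    (def A 1)
    (def B 2)
    (sym x r0)
    (sym y r1)
    (ld x A)
    (ld y B)
    (swap x y)            ; x and y exchange their values
    (cmp x B
        (ne? (fault "swap did not exchange the values")))
    (halt)
)
```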
## Arithmetic Module
This module makes heavy use of status flags.
Many instructions have two forms:
- 3 args ... explicit source and destination
- 2 args ... destination is also used as the first argument
```lisp
; Test properties of a value - zero, positive, negative
(tst Rd)
; Compare two values. Sets EQ, LT and GT; also sets Z, POS and NEG if the values are equal
(cmp Rd Rd)
; Check if a value is in a range (inclusive).
; Sets the EQ, LT and GT flags. Also sets Z, POS and NEG based on the value.
(rcmp Rd:val Rd:start Rd:end)
; Add A+B
(add Wr Rd Rd)
(add RW Rd)
; Subtract A-B
(sub Wr Rd Rd)
(sub RW Rd)
; Multiply A*B
(mul Wr Rd Rd)
(mul RW Rd)
; Divide A/B
(div Wr Rd Rd:divider)
(div RW Rd:divider)
; Divide and get remainder
; Both DST and REM are output registers
(divr Wr:result Wr:remainder Rd Rd:divider)
(divr RW Wr:remainder Rd:divider)
; Get remainder A%B
; This is equivalent to (divr _ REM A B),
; except status flags are updated by the remainder value
(mod Wr Rd Rd:divider)
(mod RW Rd:divider)
; AND A&B
(and Wr Rd Rd)
(and RW Rd)
; OR A|B
(or Wr Rd Rd)
(or RW Rd)
; XOR A^B
(xor Wr Rd Rd)
(xor RW Rd)
; CPL ~A (invert all bits)
(cpl Wr Rd)
(cpl RW)
; Rotate right (wrap around)
(ror Wr Rd:value Rd:count)
(ror RW Rd:count)
; Rotate left (wrap around)
(rol Wr Rd:value Rd:count)
(rol RW Rd:count)
; Logical shift right (fill with zeros)
(lsr Wr Rd Rd:count)
(lsr RW Rd:count)
; Logical shift left (fill with zeros)
(lsl Wr Rd Rd:count)
(lsl RW Rd:count)
; Arithmetic shift right (copy sign bit)
(asr Wr Rd Rd:count)
(asr RW Rd:count)
; Arithmetic shift left (this is identical to `lsl`, added for completeness)
(asl Wr Rd Rd:count)
(asl RW Rd:count)
; Delete an object by its handle. Objects are used by some extensions.
(del @Rd)
```
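
A small sketch showing the 3-argument and 2-argument forms next to each other, with values chosen so the results are easy to check by hand:

```lisp
(ld r0 17)
(divr r1 r2 r0 5)    ; 3-arg style: r1 = 17 / 5 = 3, r2 = 17 % 5 = 2
(add r0 1)           ; 2-arg form: r0 = r0 + 1 = 18
(lsl r3 r0 2)        ; 3-arg form: r3 = 18 << 2 = 72
(and r3 0xF0)        ; 2-arg form: r3 = 72 & 0xF0 = 64
```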
## Buffers Module
This module defines dynamic size integer buffers.
A buffer needs to be created using one of the init instructions:
```lisp
; Create an empty buffer and store its handle into a register
(mkbf Wr)
; Create a buffer of a certain size, filled with zeros.
; COUNT may be a register or an immediate value
(mkbf Wr Rd:count)
; Create a buffer and fill it with characters from a string (unicode code points)
(mkbf Wr "string")
; Create a buffer and fill it with values.
(mkbf Wr (Rd...))
```
Primitive buffer ops (position is always 0-based):
```lisp
; Get buffer size
(bfsz Wr @Obj)
; Read from a position
(bfrd Wr @Obj Rd:index)
; Write to a position
(bfwr @Obj Rd:index Rd)
; Insert at a position, shifting the rest to the right
(bfins @Obj Rd:index Rd)
; Remove item at a position, shifting the rest to the left to fill the empty space
(bfrm Wr @Obj Rd:index)
```
Whole buffer manipulation:
```lisp
; Resize the buffer. Removes trailing elements or appends zeros to match the new size.
(bfrsz @Obj Rd:len)
; Reverse a buffer
(bfrev @Obj)
; Append a buffer
(bfapp @Obj @Obj:other)
; Prepend a buffer
(bfprep @Obj @Obj:other)
```
Stack-style buffer ops:
```lisp
; Push (insert at the end)
(bfpush @Obj Rd)
; Pop (remove from the end)
(bfpop Wr @Obj)
; Reverse push (insert to the beginning)
(bfrpush @Obj Rd)
; Reverse pop (remove from the beginning)
(bfrpop Wr @Obj)
```
To delete a buffer, use the `del` instruction - `(del @Obj)`
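
Putting the buffer instructions together, a typical lifecycle might look like this sketch. The comments show the buffer contents expected from the descriptions above.

```lisp
(mkbf r0 (10 20 30))   ; r0 now holds the buffer handle
(bfpush @r0 40)        ; 10 20 30 40
(bfrpush @r0 5)        ; 5 10 20 30 40
(bfsz r1 @r0)          ; r1 = 5
(bfrd r2 @r0 1)        ; r2 = 10 (positions are 0-based)
(bfpop r3 @r0)         ; r3 = 40, buffer is 5 10 20 30
(bfrev @r0)            ; 30 20 10 5
(del @r0)              ; free the buffer when done
```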
## Screen module
This module uses the `minifb` Rust crate to provide a framebuffer with key and mouse input.
Colors use the `RRGGBB` hex format.
If input events are required, then make sure to periodically call `(sc-blit)` or `(sc-poll)`.
This may not be needed if the auto-blit function is enabled and the display is regularly written.
The default settings are 60 FPS and auto-blit enabled.
NOTE: Logging can significantly reduce crsn run speed.
Make sure the log level is not set to "trace" when you need high-speed updates,
such as animations.
```lisp
; Initialize the screen (opens a window)
(sc-init WIDTH HEIGHT)
; Erase the screen (fill with black)
(sc-erase)
; Fill with a custom color
(sc-erase 0xFF00FF)
; Set pixel color
(sc-px X Y COLOR)
; Set screen option
; 1 ... auto-blit (blit automatically on pixel write when needed to achieve the target FPS)
; 2 ... frame rate
(sc-opt OPTION VALUE)
; Blit (render the pixel buffer).
; This function also updates key and mouse states and handles the window close button
(sc-blit)
; Blit if needed (when the auto-blit function is enabled)
(sc-blit 0)
; Update key and mouse state, handle the window close button
(sc-poll)
; Read mouse position into two registers.
; Sets the overflow flag if the cursor is out of the window
(sc-mouse X Y)
; Check key status. Keys are 0-127. Reads 1 if the key is pressed, 0 if not.
; A list of supported keys can be found in the extension source code.
(sc-key PRESSED KEY)
; Check mouse button state
; 0-left, 1-right, 2-middle
(sc-mbtn PRESSED BTN)
```
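
A sketch of a minimal screen program: it draws a horizontal line and then waits for a left click. The window size, colors and labels are arbitrary, and it assumes `sc-mbtn` reads a non-zero value while the button is held, in the same way `sc-key` reads 1 for a pressed key.

```lisp
(
    (sc-init 320 240)
    (sc-erase 0x000040)           ; dark blue background
    (ld r0 0)
    (:line)
    (sc-px r0 120 0xFFFFFF)       ; white pixel at x = r0, y = 120
    (add r0 1)
    (cmp r0 320 (lt? (j :line)))  ; repeat for x = 0..319
    (sc-blit)                     ; make sure the result is rendered
    (:wait)
    (sc-poll)                     ; keep input state and the close button working
    (sc-mbtn r1 0)                ; 0 = left button
    (tst r1 (z? (j :wait)))       ; loop until the button reads non-zero
    (halt)
)
```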
## Stdio module
- This module currently defines two global handles (resp. constants): `@stdin` and `@stdout`.
- You can think of these handles as streams or SFRs (special function registers).
To use them, simply load data to or from the handles (e.g. `(ld r0 @stdin)`).
- They operate over unicode code points, which are a superset of ASCII.
You can use these special handles in almost all instructions:
```lisp
(cmp @stdin 'y'
(eq? (ld @stdout '1'))
(ne? (ld @stdout '0')))
```
When you compile a program using such handles, you will get a strange looking assembly:
```
0000 : (ld @0x6372736e00000001 72)
0001 : (ld @0x6372736e00000001 101)
0002 : (ld @0x6372736e00000001 108)
```
These are unique constants assigned to the streams at compile time. They are not meant to be used
directly, but the value can be obtained by simply leaving out the '@' sign: `(ld r0 stdin)`.
That can be useful when these stream handles need to be passed to a function. Obviously this makes
more sense when there are different kinds of streams available, not just these two default ones.
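
For instance, a routine that writes to whatever stream it is given could look like the sketch below. The routine name `putc` and its argument names are made up, and passing `stdout` without the `@` sign as a call argument follows from the `(ld r0 stdin)` form above rather than from a documented example.

```lisp
(
    (call putc stdout 'O')
    (call putc stdout 'K')
    (call putc stdout '\n')
    (halt)

    (proc putc stream ch
        (ld @stream ch)   ; write the character to the object referenced by the handle
        (ret)
    )
)
```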