forked from MightyPork/crsn
# CROISSANT VIRTUAL MACHINE

Croissant (or *crsn* for short) is an extensible runtime emulating a weird microcomputer (or not so micro, depending on which extensions you install).

## FAQ

### What is this for?

F U N

### How is the performance?

Silly fast, actually. 60fps animations are perfectly doable if that's your thing.
It's probably faster than you need for most things.

You can slow it down using the `-C` argument, or using sleep instructions.

### What if I don't enjoy writing assembly that looks like weird Lisp?

Maybe this is not for you.

### Shebang?

Yes! You can use crsn as a scripting language!

The first line of a source file is skipped if it starts with `#!`.

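A minimal sketch of such a script. The interpreter line is an assumption: it presumes a launcher binary named `crsn` that is on your `PATH` and takes the source file as its argument, so adjust it to your installation. The body uses the `@cout` handle from the stdio module.

```lisp
#!/usr/bin/env crsn
(
    (ld @cout 'H')   ; write one character per instruction
    (ld @cout 'i')
    (ld @cout '\n')
)
```

Mark the file as executable (`chmod +x`) to run it directly.
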
### Contributing

Yup, go ahead. You can also develop your own private *crsn* extensions; they work like plugins.

# Architecture

The runtime is built as a register machine with a stack and status flags.

- All mutable state (registers and status), called the "execution frame", is local to the running routine or the root of the program.
- A call pushes the active frame onto a frame stack and a clean frame is created for the callee.
- The frame stack is not accessible to the running program; it is handled entirely by the runtime.
- When a call is made, the new frame's argument registers are pre-filled with arguments passed by the caller.
- Return values are inserted into the caller's frame's result registers before its execution resumes.

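The frame rules can be seen in a short program. This is only a sketch built from instructions documented later in this file; the routine name `double` is made up for the example.

```lisp
(
    (ld r0 1)               ; r0 in the root frame
    (call double 21)        ; root frame is pushed; the callee starts with arg0 = 21
    (ld r1 res0)            ; the return value arrives in the caller's res0
    ; r0 still holds 1 here: the callee wrote to its own r0, in its own frame
    (halt)

    (proc double/1
        (add r0 arg0 arg0)  ; writes the callee's private r0
        (ret r0)
    )
)
```
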
## Registers

- 8 general purpose registers `r0`-`r7`
- 8 argument registers `arg0`-`arg7`
- 8 result registers `res0`-`res7`

All registers are 64-bit unsigned integers that can be treated as
signed, if you want to. Overflow is allowed and reported by status flags.

8-, 16-, 32-bit and floating point arithmetic is not currently implemented, but will be added later. Probably. Maybe.

## Status flags

Arithmetic and other operations set status flags that can be used for conditional jumps.

- Equal … Values are equal
- Lower … A < B
- Greater … A > B
- Zero … Value is zero, buffer is empty, etc.
- Positive … Value is positive
- Negative … Value is negative
- Overflow … Arithmetic overflow or underflow, buffer underflow, etc.
- Invalid … Invalid arguments for an instruction
- Carry … Arithmetic carry; used by extensions (currently unused, planned for the byte/halfword/word versions of the arith module)
- Full … full condition; used by extensions
- Empty … empty condition; used by extensions
- EOF … end of a stream, file, etc; used by extensions

### Status tests (conditions)

These keywords (among others) are used in conditional branches to specify flag tests:

- `eq` … Equal
- `ne` … NotEqual
- `z` … Zero
- `nz` … NotZero
- `lt` … Lower
- `le` … LowerOrEqual
- `gt` … Greater
- `ge` … GreaterOrEqual
- `pos` … Positive
- `neg` … Negative
- `npos` … NonPositive
- `nneg` … NonNegative
- `c` … Carry
- `nc` … NotCarry
- `val`, `valid`, `ok` … Valid
- `inval`, `nok` … Invalid
- `ov` … Overflow
- `nov` … NotOverflow
- `f`, `full` … Full
- `nf`, `nfull` … Not full
- `em`, `empty` … Empty
- `nem`, `nempty` … Not empty
- `eof` … EOF
- `neof` … Not EOF

# Syntax

*The syntax is very much subject to change at the moment. The format described here
is valid at the time this file is added to version control.*

Instructions are written using S-expressions, because they are easy to parse
and everyone loves Lisp.

## Program

A program has this format:

```
(
  ...<instructions and routines>...
)
```

e.g.

```
(
  (ld r0 100)      ; load value into a register
  (:again)         ; a label
  (sub r0 1        ; subtract from a register
    (nz?           ; conditional branch "not zero?"
      (j :again))) ; jump to the label :again
)
```

The same program can be written in a compact form:

```
((ld r0 100)(:again)(sub r0 1 (nz? (j :again))))
```

## Instruction

Instructions are written like this:

```
(<keyword> <args>... <conditional branches>...)
```

### Conditional instructions

All instructions can be made conditional by appending `.<cond>` to the keyword, e.g. `(j.ne :LABEL)` means "jump if not equal".
These modifiers are mainly used by the assembler when translating conditional branches to executable code.

Note that the flags can only be tested immediately after the instruction that produced them, or after instructions that do not
affect flags (pseudo-instructions like `def` and `sym`, `nop`, `j`, `fj`, `s`, `call` etc). Instructions that can set flags first
clear all flags to make the result predictable.

Status flags can be saved to and restored from a register using the `stf` and `ldf` instructions. This can also be used to set
or test flags manually, but the binary format may change, so do not rely on specific bit positions.

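As a sketch of how these pieces combine, the fragment below keeps the flags from a comparison alive across an instruction that would otherwise overwrite them. The register choices and the label name are arbitrary.

```lisp
(cmp r0 r1)      ; sets Equal / Lower / Greater
(stf r7)         ; save the flags into r7
(add r2 1)       ; arithmetic clears and rewrites the flags
(ldf r7)         ; restore the flags produced by cmp
(j.eq :same)     ; conditional instruction: jump only if r0 == r1
(ld r3 0)        ; reached only when the values differ
(:same)
```
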
### Instruction arguments

Arguments are always ordered writes-first, reads-last.

This document uses the following notation for arguments:
- `REG` - one of the registers (`rX`, `argX`, `resX`)
- `SYM` - a symbol defined as a register alias (e.g. `(sym x r0)`)
- `@REG` / `@SYM` - access an object referenced by a handle. A handle is simply a numeric value stored in a register of some kind.
- `_` - a special "register" that discards anything written to it.
  The "discard register" is used when you do not need the value and only care about side effects or status flags.
- `CONST` - name of a constant defined earlier in the program (e.g. `(def SCREEN_WIDTH 640)`)
- `NUM` - literal values
  - unsigned `123`
  - signed `-123`
  - float `-45.6789`
  - hex `0xabcd`, `#abcd`
  - binary `0b0101`
  - character `'a'`, `'🐁'`. Supports unicode and C-style escapes. Use `\\` for a literal backslash.
- `"str"` - a double-quoted string (`"ahoj\n"`). Supports unicode and C-style escapes. Use `\\` for a literal backslash.
- `:LABEL` - label name
- `PROC` - routine name
- `PROC/A` - routine name with arity (number of arguments)

The different ways to specify a value can be grouped as "reads" and "writes":

- `Rd` - reads: `REG`, `SYM`, `@REG`, `@SYM`, `NUM`, `CONST`
- `Wr` - writes: `REG`, `SYM`, `@REG`, `@SYM`, `_`
- `RW` - intersection of the two sets, capable of reading and writing: `REG`, `SYM`, `@REG`, `@SYM`

Objects (`@REG`, `@SYM`) can be read or written as if they were a register, but only if the referenced object supports it.
Other objects may produce a runtime fault or set the INVALID flag.

In the instruction lists below, I will use the symbols `Rd` for reads, `Wr` for writes, `RW` for read-writes, and `@Obj` for object handles,
with an optional description after a colon, such as: `(add Wr:dst Rd:a Rd:b)`.

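To make the notation concrete, here is a small fragment that exercises most argument kinds. It is only a sketch; the names `LIMIT`, `counter` and `:loop` are invented for the example.

```lisp
(def LIMIT 10)            ; CONST
(sym counter r0)          ; SYM: alias for a register
(ld counter 0)            ; Wr destination, NUM source
(:loop)
(add counter 1)           ; RW destination, NUM source
(sub _ counter LIMIT      ; result is discarded, only the flags are kept
    (nz? (j :loop)))      ; repeat until counter equals LIMIT
(ld r1 0xff)              ; hex literal
(ld r2 0b0101)            ; binary literal
(ld r3 'a')               ; character literal
```
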
### Conditional branches

Conditional branches are written like this:

```
(<cond>? <instructions>...)
```

- If there is more than one conditional branch chained to an instruction,
  then only one branch is taken - there is no fall-through.
- The definition order is preserved, i.e. if the `inval` flag is to be checked, it should be done
  before checking e.g. `nz`, which is, incidentally, true by default, because most flags are cleared by instructions that affect flags.

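For instance, a three-way comparison can chain one branch per outcome. A sketch, with arbitrary registers and values:

```lisp
(cmp r0 10
    (lt? (ld r1 -1))   ; r0 < 10
    (eq? (ld r1 0))    ; r0 == 10
    (gt? (ld r1 1)))   ; r0 > 10
```

The branches are tested in the order written and at most one of them runs.
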
## Routines

A routine is defined as:

```
(proc <name>/<arity> instructions...)
```

- `name` is a unique routine name
- `arity` is the number of arguments it takes, e.g. `3`.
- you can define multiple routines with the same name and different arities; the correct one will be used depending on how it's called

Or, with named arguments:

```
(proc <name> <arguments>... instructions...)
```

Arguments are simply aliases for the argument registers that can then be used inside the routine.

Here is an example routine to calculate the factorial of `arg0`:

```
(proc fac/1
  (cmp arg0 2 (eq? (ret 2)))  ; base case: 2! = 2 (assumes arg0 >= 2)
  (sub r0 arg0 1)             ; r0 = arg0 - 1
  (call fac r0)               ; recurse; the result arrives in res0
  (mul r0 arg0 res0)          ; r0 = arg0 * (arg0 - 1)!
  (ret r0)
)
```

It can also be written like this:

```
(proc fac num
  ...
)
```

...or by specifying both the arity and argument names:

```
(proc fac/1 num
  ...
)
```

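Putting it together, a complete program using the named-argument form might look like the sketch below. It reuses the factorial routine shown earlier in this section and, like that routine, assumes the argument is at least 2.

```lisp
(
    (call fac 5)        ; the callee sees arg0 = 5
    (ld r0 res0)        ; r0 = 120
    (halt)

    (proc fac num       ; `num` is an alias for arg0
        (cmp num 2 (eq? (ret 2)))
        (sub r0 num 1)
        (call fac r0)
        (mul r0 num res0)
        (ret r0)
    )
)
```
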
# Instruction Set

The crsn instruction set is composed of extensions.

Extensions can define new instructions as well as new syntax, so long as it's composed of valid S-expressions.

## Labels, jumps and barriers

These are defined as part of the built-in instruction set (see below).

- Barrier - marks the boundary between routines to prevent overrun. Cannot be jumped across.
- Local labels - can be jumped to within the same routine, both forward and backward.
- Far labels - can be jumped to from any place in the code using a far jump (disregarding barriers).
  This is a very cursed functionality that may or may not have some valid use case.
- Skips - cannot cross a barrier, similar to a jump but without explicitly defining a label.
  All local jumps are turned into skips by the assembler.

Skipping across conditional branches may have *surprising results* - conditional branches are expanded
to a varying number of skips and conditional instructions by the assembler. Only use skips if you really know what you're doing.

Jumping to a label is always safer than a manual skip.

## Built-in Instructions

```
; Do nothing
(nop)

; Stop execution
(halt)

; Mark a jump target.
(:LABEL)
; Numbered labels
(:#NUM)

; Mark a far jump target (can be jumped to from another routine).
; This label is preserved in optimized code.
(far :LABEL)

; Jump to a label
(j :LABEL)

; Jump to a label that can be in another function
(fj :LABEL)

; Skip backward or forward
(s Rd)

; Mark a routine entry point (call target).
(routine PROC)
(routine PROC/A)

; Call a routine with arguments.
; The arguments are passed as argX. Return values are stored in resX registers.
(call PROC Rd...)

; Exit the current routine with return values
(ret Rd...)

; Deny jumps, skips and running across this address, producing a run-time fault with a message.
(barrier)
(barrier message)
(barrier "message text")

; Block barriers are used for routines. They are automatically skipped in execution
; and the whole pair can be jumped *across*.
; The label can be a numeric or string label; its sole purpose is tying the two together. Labels must be unique in the program.
(barrier-open LABEL)
(barrier-close LABEL)

; Generate a run-time fault with a debugger message
(fault)
(fault message)
(fault "message text")

; Copy value
(ld Wr Rd)

; Swap values
(swap RW RW)

; Store status flags to a register
(stf Wr)

; Load status flags from a register
(ldf Rd)

; Define a register alias. The alias is only valid in the current routine or in the root of the program.
(sym SYM REG)

; Define a constant. These are valid in the whole program.
; Value must be known at compile time.
(def CONST VALUE)
```

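A short sketch combining several of the built-ins. The constant, the aliases and the fault message are invented for illustration, and `cmp` comes from the arithmetic module.

```lisp
(
    (def ANSWER 42)       ; compile-time constant, visible in the whole program
    (sym a r0)            ; aliases, valid in the root of the program
    (sym b r1)
    (ld a ANSWER)
    (ld b 7)
    (swap a b)            ; now a = 7, b = 42
    (cmp a 7
        (ne? (fault "swap did not work")))
    (halt)
)
```
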
## Arithmetic Module

This module makes heavy use of status flags.

Many instructions have two forms:
- 3 args ... explicit source and destination
- 2 args ... destination is also used as the first argument

```lisp
; Test properties of a value - zero, positive, negative
(tst Rd)

; Compare two values. Sets EQ, LT and GT; if the values are equal, also sets Z, POS and NEG based on the value.
(cmp Rd Rd)

; Check if a value is in a range (inclusive).
; Sets the EQ, LT and GT flags. Also sets Z, POS and NEG based on the value.
(rcmp Rd:val Rd:start Rd:end)

; Add A+B
(add Wr Rd Rd)
(add RW Rd)

; Subtract A-B
(sub Wr Rd Rd)
(sub RW Rd)

; Multiply A*B
(mul Wr Rd Rd)
(mul RW Rd)

; Divide A/B
(div Wr Rd Rd:divider)
(div RW Rd:divider)

; Divide and get remainder
; Both `result` and `remainder` are output registers
(divr Wr:result Wr:remainder Rd Rd:divider)
(divr RW Wr:remainder Rd:divider)

; Get remainder A%B
; This is equivalent to (divr _ REM A B),
; except status flags are updated by the remainder value
(mod Wr Rd Rd:divider)
(mod RW Rd:divider)

; AND A&B
(and Wr Rd Rd)
(and RW Rd)

; OR A|B
(or Wr Rd Rd)
(or RW Rd)

; XOR A^B
(xor Wr Rd Rd)
(xor RW Rd)

; CPL ~A (invert all bits)
(cpl Wr Rd)
(cpl RW)

; Rotate right (wrap around)
(ror Wr Rd:value Rd:count)
(ror RW Rd:count)

; Rotate left (wrap around)
(rol Wr Rd:value Rd:count)
(rol RW Rd:count)

; Logical shift right (fill with zeros)
(lsr Wr Rd Rd:count)
(lsr RW Rd:count)

; Logical shift left (fill with zeros)
(lsl Wr Rd Rd:count)
(lsl RW Rd:count)

; Arithmetic shift right (copy sign bit)
(asr Wr Rd Rd:count)
(asr RW Rd:count)

; Arithmetic shift left (this is identical to `lsl`, added for completeness)
(asl Wr Rd Rd:count)
(asl RW Rd:count)

; Delete an object by its handle. Objects are used by some extensions.
(del @Rd)
```

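A few of these instructions in use, showing both the three-argument and the two-argument forms. The fragment is a sketch with arbitrary registers and values.

```lisp
(ld r0 17)
(divr r1 r2 r0 5)     ; r1 = 17 / 5 = 3, r2 = 17 % 5 = 2
(mod r3 r0 5)         ; r3 = 2, flags reflect the remainder
(lsl r4 1 4)          ; r4 = 1 << 4 = 16
(add r4 r1)           ; two-argument form: r4 = r4 + r1 = 19
(and r5 r4 0x0f)      ; r5 = 19 & 15 = 3
```
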
## Buffers Module

This module defines dynamically sized integer buffers.

A buffer needs to be created using one of the init instructions:

```lisp
; Create an empty buffer and store its handle into a register
(mkbf Wr)

; Create a buffer of a certain size, filled with zeros.
; COUNT may be a register or an immediate value
(mkbf Wr Rd:count)

; Create a buffer and fill it with characters from a string (unicode code points)
(mkbf Wr "string")

; Create a buffer and fill it with values.
(mkbf Wr (Rd...))
```

Primitive buffer ops (position is always 0-based):

```lisp
; Get buffer size
(bfsz Wr @Obj)

; Read from a position
(bfrd Wr @Obj Rd:index)

; Write to a position
(bfwr @Obj Rd:index Rd)

; Insert at a position, shifting the rest to the right
(bfins @Obj Rd:index Rd)

; Remove item at a position, shifting the rest to the left to fill the empty space
(bfrm Wr @Obj Rd:index)
```

Whole buffer manipulation:

```lisp
; Resize the buffer. Removes trailing elements or appends zeros to match the new size.
(bfrsz @Obj Rd:len)

; Reverse a buffer
(bfrev @Obj)

; Append a buffer
(bfapp @Obj @Obj:other)

; Prepend a buffer
(bfprep @Obj @Obj:other)
```

Stack-style buffer ops:

```lisp
; Push (insert at the end)
(bfpush @Obj Rd)

; Pop (remove from the end)
(bfpop Wr @Obj)

; Reverse push (insert at the beginning)
(bfrpush @Obj Rd)

; Reverse pop (remove from the beginning)
(bfrpop Wr @Obj)
```

To delete a buffer, use the `del` instruction - `(del @Obj)`.

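A sketch of a typical buffer lifecycle, using only the instructions listed in this section. The handle is kept in `r0` and the buffer is accessed as `@r0`; the values are arbitrary.

```lisp
(mkbf r0 (10 20 30))  ; buffer: 10 20 30
(bfpush @r0 40)       ; buffer: 10 20 30 40
(bfrpush @r0 5)       ; buffer: 5 10 20 30 40
(bfsz r1 @r0)         ; r1 = 5
(bfrd r2 @r0 1)       ; r2 = 10 (positions are 0-based)
(bfpop r3 @r0)        ; r3 = 40, buffer: 5 10 20 30
(bfrev @r0)           ; buffer: 30 20 10 5
(del @r0)             ; free the buffer
```
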
## Screen module

This module uses the `minifb` Rust crate to provide a framebuffer with key and mouse input.

Colors use the `RRGGBB` hex format.

If input events are required, then make sure to periodically call `(sc-blit)` or `(sc-poll)`.
This may not be needed if the auto-blit function is enabled and the display is regularly written.

The default settings are 60 FPS and auto-blit enabled.

NOTE: Logging can significantly reduce crsn run speed.
Make sure the log level is not set to "trace" when you need high-speed updates,
such as animations.

```lisp
; Initialize the screen (opens a window)
(sc-init WIDTH HEIGHT)

; Erase the screen (fill with black)
(sc-erase)
; Fill with a custom color
(sc-erase 0xFF00FF)

; Set pixel color
(sc-px X Y COLOR)

; Set screen option
; 1 ... auto-blit (blit automatically on pixel write when needed to achieve the target FPS)
; 2 ... frame rate
(sc-opt OPTION VALUE)

; Blit (render the pixel buffer).
; This function also updates key and mouse states and handles the window close button
(sc-blit)
; Blit if needed (when the auto-blit function is enabled)
(sc-blit 0)

; Update key and mouse state, handle the window close button
(sc-poll)

; Read mouse position into two registers.
; Sets the overflow flag if the cursor is out of the window
(sc-mouse X Y)

; Check key status. Keys are 0-127. Reads 1 if the key is pressed, 0 if not.
; A list of supported keys can be found in the extension source code.
(sc-key PRESSED KEY)

; Check mouse button state
; 0-left, 1-right, 2-middle
(sc-mbtn PRESSED BTN)
```

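A sketch of a small screen program: it draws a horizontal line and then waits for a left click. The window size and colors are arbitrary. It assumes that `sc-mbtn` writes 0 while the button is not pressed, by analogy with `sc-key`; this file does not state that explicitly.

```lisp
(
    (sc-init 320 240)
    (sc-erase 0x000040)           ; dark blue background
    (sym x r0)
    (ld x 0)
    (:line)
    (sc-px x 120 0xffcc00)        ; one pixel of the line at y = 120
    (add x 1)
    (cmp x 320 (lt? (j :line)))   ; repeat for x = 0..319
    (sc-blit)                     ; render and process input events
    (:wait)
    (sc-poll)                     ; keep key and mouse state fresh
    (sc-mbtn r1 0)                ; left mouse button
    (tst r1 (z? (j :wait)))       ; assumed: 0 means "not pressed"
    (halt)
)
```
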
## Stdio module

- This module defines 4 global handles: `@cin`, `@cout`, `@cin_r`, `@cout_r`.
- You can think of these handles as streams or SFRs (special function registers).
  To use them, simply load data to or from the handles (e.g. `(ld r0 @cin)`).
- They operate over unicode code points, which are a superset of ASCII.
- The "_r" variants work with raw bytes. Do not mix the raw and code-point variants, or you may get problems with multi-byte characters.

End of stream is reported by the `eof` status flag when a stream is read or written.

You can use these special handles in almost all instructions:

```lisp
(cmp @cin 'y'
  (eq? (ld @cout '1'))
  (ne? (ld @cout '0')))
```

When you compile a program using such handles, you will get strange-looking assembly:

```
0000 : (ld @0x6372736e00000001 72)
0001 : (ld @0x6372736e00000001 101)
0002 : (ld @0x6372736e00000001 108)
```

These are unique constants assigned to the streams at compile time. They are not meant to be used
directly, but the value can be obtained by simply leaving out the '@' sign: `(ld r0 cin)`.
That can be useful when these stream handles need to be passed to a function. Obviously this makes
more sense when there are different kinds of streams available, not just the default ones.

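A sketch of passing a stream handle to a routine. The routine name `greet` and its argument name `stream` are hypothetical.

```lisp
(
    (ld r0 cout)            ; the handle value, without the '@'
    (call greet r0)
    (halt)

    (proc greet stream      ; `stream` aliases arg0, which holds the handle
        (ld @stream 'H')    ; write through the handle
        (ld @stream 'i')
        (ld @stream '\n')
        (ret)
    )
)
```
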