Get the source code
There is currently no binary package of VBCC available. This will change when a new official release is made, or when it is working well enough for a beta release from me.
The sources are in several parts:
- VASM assembler: get PulkoMandy's modified version:
git clone "https://pulkomandy.tk/gerrit/vasm"
- VLINK linker: get Frank Wille's latest source snapshot
- VBCC compiler: get PulkoMandy's modified version:
git clone "https://pulkomandy.tk/gerrit/vbcc"
It is also a good idea to download and read the manuals for each of them.
Compiling
VASM
make CPU=unsp SYNTAX=std
Copy vasmunsp_std and vobjdump somewhere in your PATH
VLINK
make
Copy the vlink executable somewhere in your PATH
VBCC
make TARGET=unsp
The first time this asks you various questions about the host system. If you're not on a strange platform, the default answers should be correct.
Copy the vc and vbccunsp executables from the bin/ directory to somewhere in your PATH
Target configuration
The vc executable is a compiler frontend. It does not itself compile the C code, but it knows how to call the compiler, assembler and linker to perform the usual steps of a compilation.
However, this knowledge is not hardcoded in the tool. Instead, it is loaded from a configuration file. This way, a single version of the compiler frontend can be used to work with many different CPU architectures and platforms.
Create a vbcc directory somewhere. When building anything with VBCC, you need to set the VBCC environment variable to point to that directory:
export VBCC=/path/to/vbccdir
The content of this directory:
- config/
  - vsmile
- targets/unsp-vsmile/
  - lib/
    - startup.o
    - libvc.a
  - include/
    - ctype.h
    - stdlib.h
    - string.h
  - vlink.cmd
The content of each file or how to generate them is detailed below.
The configuration file
This file, located in config/vsmile, is the entry point for the configuration. It is enabled by passing the +vsmile option to the vc compiler frontend.
-cc=vbccunsp -I"$VBCC"/targets/unsp-vsmile/include -quiet %s -o= %s %s -O=%ld
-ccv=vbccunsp -I"$VBCC"/targets/unsp-vsmile/include %s -o= %s %s -O=%ld
-as=vasmunsp_std -quiet -ile -Fvobj %s -o %s
-asv=vasmunsp_std -ile -Fvobj %s -o %s
-rm=rm %s
-rmv=rm %s
-ld=vlink -ole -b rawbin1 -Cvbcc -T"$VBCC"/targets/unsp-vsmile/vlink.cmd -L"$VBCC"/targets/unsp-vsmile/lib "$VBCC"/targets/unsp-vsmile/lib/startup.o %s %s -o %s -lvc
-ldv=vlink -ole -b rawbin1 -Cvbcc -T"$VBCC"/targets/unsp-vsmile/vlink.cmd -L"$VBCC"/targets/unsp-vsmile/lib "$VBCC"/targets/unsp-vsmile/lib/startup.o %s %s -o %s -lvc -Mmapfile
-l2=vlink -ole -b rawbin1 -Cvbcc -T"$VBCC"/targets/unsp-vsmile/vlink.cmd -L"$VBCC"/targets/unsp-vsmile/lib %s %s -o %s
-l2v=vlink -ole -b rawbin1 -Cvbcc -T"$VBCC"/targets/unsp-vsmile/vlink.cmd -L"$VBCC"/targets/unsp-vsmile/lib %s %s -o %s -Mmapfile
This defines the commands to run for each step of the compilation: cc for compiling, as for assembling, and ld for linking. The variants with a v suffix are for verbose output.
Some of the paths used refer to the VBCC environment variable we have set before.
For the details of each option used, refer to the documentation of the corresponding tool, but in short:
- vbccunsp is called with the sourcefile and optimization options,
- vasmunsp_std is called to assemble the resulting file and output a vobj relocatable object file,
- vlink is used to link all the vobj files together. It outputs the final cartridge image using the rawbin1 output module (raw binary), with the -ole option to generate it in little endian (as expected by all existing tools).
The frontend looks at the file extension of the input and output, as well as some of the options (like -c, to compile and generate a .o file, but not link), and automatically determines which tools to call.
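To make the role of the %s placeholders more concrete, here is a minimal sketch in C of how a frontend could turn one of these templates into a command line. This is not vc's actual code: the function expand_template is hypothetical, it only handles two %s placeholders, and it assumes the first one receives the input file and the second one the output file, which is what the position of -o in the as template suggests.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not taken from vc: copy tmpl into out, replacing
 * the first "%s" with input and the second "%s" with output.
 * Returns 0 on success, -1 if the result does not fit in out. */
int expand_template(char *out, size_t size, const char *tmpl,
                    const char *input, const char *output)
{
    const char *args[2];
    size_t used = 0, next = 0;

    assert(out != NULL && tmpl != NULL);
    args[0] = input;
    args[1] = output;

    while (*tmpl) {
        const char *piece = tmpl;   /* by default, copy one literal char */
        size_t len = 1;

        if (tmpl[0] == '%' && tmpl[1] == 's' && next < 2) {
            piece = args[next++];   /* placeholder: copy the file name */
            len = strlen(piece);
            tmpl += 2;
        } else {
            tmpl++;
        }
        if (used + len >= size)
            return -1;              /* keep room for the terminating 0 */
        memcpy(out + used, piece, len);
        used += len;
    }
    if (used >= size)
        return -1;
    out[used] = '\0';
    return 0;
}
```

Called with the as template from the configuration file above and the (made up) file names file.asm and file.o, this fills the buffer with "vasmunsp_std -quiet -ile -Fvobj file.asm -o file.o".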
The linker script
The file vlink.cmd contains the linker script. This tells vlink about the memory layout and what to do with the code and data.
MEMORY
{
    ram   : org = 0x0000, len = 0x2800
    lorom : org = 0x0000, len = 0xfff4
    res   : org = 0xfff5, len = 11
    rom   : org = 0x10000, len = 0x3f0000
}

SECTIONS
{
    .bss (NOLOAD): { *(.bss) } > ram
    .empty: { RESERVE(0x4000); } > lorom
    .rodatal: { *(.rodata) } > lorom
    .rodata: { *(.rodata) *(.rodata2) } > rom
    .textl: { *(.text) } > lorom
    .text: { *(.text) } > rom
    .ctorsl: { *(.ctors) } > lorom
    .ctors: { *(.ctors) } > rom
    .dtorsl: { *(.dtors) } > lorom
    .dtors: { *(.dtors) } > rom
    .data: { *(.data) } > ram AT > lorom
    .res: { *(.res); } > res

    __BS = ADDR(.bss);
    __BL = SIZEOF(.bss);
    __DS = ADDR(.data);
    __DD = LOADADDR(.data);
    __DL = SIZEOF(.data);
    __STACK = 0x2800;
}

ENTRY(_vectortable)
First of all this script defines several memory regions:
- RAM from 0000 to 2800 (all addresses are of course in 16-bit words)
- “low” ROM starting at 0000 and until the reset vectors
- Reset vectors section
- “high” ROM after the reset vectors until the end of the cartridge
Note that the “low ROM” section starts at 0 and overlaps the RAM. This is unusual, but it is how V.Smile cartridges are done: they have 4000 words at the start which are unused and unreachable by the CPU.
The script then defines sections. These follow the usual conventions of C compilers:
- bss is for uninitialized data, and is stored at the start of the RAM. Typically, the tilemap will be allocated there, as well as global and static variables if they are not explicitly initialized in the code.
- The empty section is just for these empty 4000 words at the start of the cartridge. It is put in the “low ROM” area.
- All following sections are duplicated, with one instance loaded into low ROM and the other into high ROM. This tells the linker to put things as much as possible in the low ROM area, and then, if there is no space there anymore, start using the high ROM area.
- rodata is for constants (such as strings),
- text is for the executable code,
- ctors and dtors are for code that should be called before and after the main() function (not implemented yet),
- data is for initialized variables. This one is a bit special, since variables have to be in RAM, but their initial values will be stored in ROM. A bit of code in the startup (see below) is used to initialize the variables before calling the main() function.
- res is for the reset and interrupt vectors. These are loaded at address FFF5 in the matching section. We will see below in the startup file how they are filled in.
Finally, the script defines a few variables that will be usable from assembler code, to indicate the start and size of the bss section and of the data section, as well as the stack pointer initial value. These are also used by the startup file.
And the last line defines the vector table as the “entry point”. This allows the linker to follow references from this entry point to other parts of the code (functions, variables, …) and determine what is actually used. Functions that are not called from anywhere can be removed at this stage, for example.
TODO: this linker script needs some changes to let developers more easily put some things explicitly in the “low ROM” section. For example, the reset vectors have to be there. Also check how alignment works for the things that need it.
The startup file
; TODO move this font out of here into the projects that use it.
    .section .rodata
    .globl _RES_FONT_BIN_SA
_RES_FONT_BIN_SA:
    .incbin font.bin

    .section .text

; Default handler for all interrupts (except RESET).
; Just return from the interrupt without doing anything.
; Defined as weak symbols, so, if an application redefines them, the version from the app will
; be used instead.
    .weak BREAK,FIQ,IRQ0,IRQ1,IRQ2,IRQ3,IRQ4,IRQ5,IRQ6,IRQ7
BREAK:
FIQ:
IRQ0:
IRQ1:
IRQ2:
; IRQ3:
IRQ4:
; IRQ5:
IRQ6:
IRQ7:
    RETI

; Handler for the RESET vector.
; Does the early initialization, then jumps into the main function.
_start:
    ; Disable interrupts since we're probably not ready to handle them yet
    IRQ OFF
    ; Set up the stack at the end of RAM
    LD SP, 0x27FF
    ; Copy the initialized variables into RAM
    ; __DL = length of variables section
    ; __DS = start of variables section in RAM
    ; __DD = start of variables section in ROM
    LD R1, __DS
    LD R2, __DD
    LD R3, __DL
    JZ gomain
    ADD R3, R1
_startloop:
    LD R4, (R2++)
    ST R4, (R1++)
    CMP R3, R1
    JNE _startloop
    ; Finally, jump into the main function
gomain:
    GOTO main

; Handlers for indirect calls. Used by VBCC to handle function pointers, because unSP does not have
; an indirect call (CALL R1 or similar) operation and it is not trivial to emulate one.
    .globl __indirect_R1
    .globl __indirect_R2
    .globl __indirect_R3
    .globl __indirect_R4
__indirect_R1:
    LD PC,R1
__indirect_R2:
    LD PC,R2
__indirect_R3:
    LD PC,R3
__indirect_R4:
    LD PC,R4

; The interrupt vectors section, contains pointers to the interrupt handlers
    .section .res,"adr"
_vectortable:
    .globl _vectortable
    .size _vectortable, 11
    .2byte BREAK
    .2byte FIQ
    .2byte _start
    .2byte IRQ0
    .2byte IRQ1
    .2byte IRQ2
    .2byte IRQ3
    .2byte IRQ4
    .2byte IRQ5
    .2byte IRQ6
    .2byte IRQ7
This file defines the initialization routine and a few other things.
The first thing it currently does is include the font binary file used for rendering text.
TODO: move that elsewhere. It does not belong in the startup file, but it was convenient for me to put it there for now.
Then come default definitions for most of the interrupt vectors. They all execute a RETI instruction without doing anything. Currently, the ones used by the application need to be commented out here. Later, this should be replaced with weak symbols that can be replaced if the application provides something else.
The _start function is the reset vector, and so it is the first code that will run when the CPU resets. It turns off the interrupts, initializes the stack pointer, and copies the initial values for variables in the data section into RAM. Then, it jumps into the main function.
Finally, there is the table of reset vectors.
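The copy loop in _start may be easier to follow when written in C. The sketch below is only a host-side model, not code that runs on the console: the function copy_data and the mem array (one uint16_t per word of the address space) are hypothetical, and the parameters ds, dd and dl stand in for the linker symbols __DS, __DD and __DL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Host-side model of the .data initialization done by _start.
 * mem models the address space, one uint16_t per 16-bit word.
 * ds = RAM address of .data, dd = its load address in ROM, dl = its size. */
void copy_data(uint16_t *mem, size_t ds, size_t dd, size_t dl)
{
    size_t end;

    assert(mem != NULL);
    end = ds + dl;              /* ADD R3, R1: first address after .data */

    /* A size of zero skips the loop, like the JZ gomain in the assembly */
    while (ds != end)
        mem[ds++] = mem[dd++];  /* LD R4, (R2++) then ST R4, (R1++) */
}
```

For example, with three words stored at addresses 20 to 22, copy_data(mem, 4, 20, 3) copies them to addresses 4 to 6 and leaves the rest of mem untouched.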
Of course, vbcc will expect this in vobj (.o) format, so let's assemble it:
vasmunsp_std -ile -Fvobj -o startup.o crt0.asm
and place the resulting file startup.o in the targets/unsp-vsmile/lib/ directory, where our frontend config file says to look for it.
Note: the -ile option tells vasm that files included using the incbin directive are in little endian. This gives the same results as the official unSP toolchain when importing binary files.
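As an illustration of what little endian means for a CPU that addresses memory in 16-bit words, here is a small C sketch that packs a byte stream into words. It assumes that each pair of bytes from the file ends up in one word, first byte in the low half. The function name pack_words_le is hypothetical, and what vasm does with a trailing odd byte is not covered here (the sketch simply ignores it).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack a byte stream into 16-bit words, little endian: the first byte of
 * each pair becomes the low half of the word, the second the high half.
 * Returns the number of words written; a trailing odd byte is ignored. */
size_t pack_words_le(const uint8_t *bytes, size_t nbytes, uint16_t *words)
{
    size_t n = nbytes / 2;

    assert(bytes != NULL && words != NULL);
    for (size_t i = 0; i < n; i++)
        words[i] = (uint16_t)(bytes[2 * i] | (bytes[2 * i + 1] << 8));
    return n;
}
```

With this convention, the bytes 0x34 0x12 found in a file become the word 0x1234.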
The C library
vbcc does not come with an open source C library. A full one from other sources could be compiled, but we don't need a complete C library.
So here is a very minimal one with enough support to run the compiler and to run Contiki, which has minimal needs from the C library (just a few string functions).
/*
 * Copyright (C) 2024 Adrien Destugues <pulkomandy@pulkomandy.tk>
 *
 * Distributed under terms of the MIT license.
 */

int strlen(const char* str)
{
    int i = 0;
    while (*str++)
        i++;
    return i;
}

int isprint(char c)
{
    return c >= 32;
}

/* Destination first and returned, to match the declaration in string.h */
char* strcpy(char* dst, const char* src)
{
    char* d = dst;
    char c;
    while ((c = *src++))
        *d++ = c;
    *d = 0;
    return dst;
}

void memset(int* s, int v, int n)
{
    while (--n >= 0)
        *s++ = v;
}

int strncmp(const char* a, const char* b, int n)
{
    while (--n >= 0) {
        if (*a != *b)
            return *a - *b;
        if (*a == 0)
            return 0;
        /* Move on to the next character of both strings */
        a++;
        b++;
    }
    /* The first n characters are equal */
    return 0;
}

/* Division by repeated subtraction. Only correct for a >= 0 and b > 0. */
int __div(int a, int b)
{
    int c = 0;
    while (a >= b) {
        a -= b;
        c++;
    }
    return c;
}

/* Modulo by repeated subtraction. Only correct for a >= 0 and b > 0. */
int __mod(int a, int b)
{
    while (a >= b) {
        a -= b;
    }
    return a;
}
To generate the libvc.a file:
vc +vsmile -nostdlib -O3 -c libc.c
ar cru libvc.a libc.o
We also need some include files. ctype.h and stdlib.h can be empty, but they are used by Contiki so they must exist.
string.h has some basic function declarations and type definitions:
#ifndef __STRING_H
#define __STRING_H 1

/* Adapt according to stddef.h. */
#ifndef __SIZE_T
#define __SIZE_T 1
typedef unsigned int size_t;
#endif

#undef NULL
#define NULL ((void *)0)

/*
  Many of these functions should perhaps be implemented as inline-assembly
  or assembly-functions. Most suitable are:
  - memcpy
  - strcpy
  - strlen
  - strcmp
  - strcat
*/
void *memcpy(void *,const void *,size_t n);
void *memmove(void *,const void *,size_t);
void *memset(void *,int,size_t);
int memcmp(const void *,const void *,size_t);
void *memchr(const void *,int,size_t);
char *strcat(char *,const char *);
char *strncat(char *,const char *,size_t);
char *strchr(const char *,int);
size_t strcspn(const char *,const char *);
char *strpbrk(const char *,const char *);
char *strrchr(const char *,int);
size_t strspn(const char *,const char *);
char *strstr(const char *,const char *);
char *strtok(char *,const char *);
char *strerror(int);
size_t strlen(const char *);
char *strcpy(char *,const char *);
char *strncpy(char *,const char *,size_t);
int strcmp(const char *,const char *);
int strncmp(const char *,const char *,size_t);
int strcoll(const char *,const char *);
size_t strxfrm(char *,const char *,size_t);

#endif
Running the compiler
Note: make sure you have set the VBCC environment variable as explained earlier, otherwise the compiler will not find the target configuration and will complain and not compile anything.
To compile a .c file into a .o file:
vc +vsmile -c file.c -o file.o
To generate a binary file from .o files:
vc +vsmile -ole -o cartridge.bin *.o
And of course you can run the resulting cartridge in MAME if you want:
mame -debug -debugger qt -rompath /path/to/vsmile/bios/ vsmile -cart ./cartridge.bin -nomax
Generating debug symbols for MAME
Pass the -v option to vc to enable verbose mode. With the configuration file above, the verbose link command (ldv) also passes -Mmapfile to vlink, which generates a mapfile containing info about the generated binary.
The mapfile list of symbols can be turned into a file that the MAME debugger can use to create comments:
sed -ne 's! 0x\(.*\) \(.*\):.*!comadd \1,\2!p' mapfile > symbols.mame
Then in MAME, use “source symbols.mame” to load it. You can switch the disassembly view to show comments, and it will be very helpful to understand where you are when stepping through the code. Even local labels are exported, which can be matched with .ic2 files to compare the intermediate representation with the generated code. Very useful when debugging the compiler.
TODO list
Bugs
- There is an internal compiler error when compiling the sound demo
- long int values are mostly untested and broken in various places (code is missing, they will be handled the same as 1-byte values)
- Contiki crashes when trying to call a function pointer
- Probably more problems waiting in Contiki once that is fixed
- Try to compile larger projects such as Copyright Infringement to find more bugs
Memory addressing
- Make it possible to put things in low ROM explicitly using dedicated sections (needed for example for the reset vectors)
- Add a “far pointer” type. The current pointer type is only 16 bit so it can only point into the current data segment
- Add a “far pointer” type also for functions
Startup and library
- The startup code should clear the bss segment to 0 (MAME will do this on first start, but not on reset)
- Implement division and modulo for “long” types, signed vs unsigned, etc.
- Floating point support routines
- Move font.bin outside of crt0/startup, and provide a more convenient way to handle resources (some Makefile rules maybe?)
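For the division and modulo item, one possible starting point is the classic shift-and-subtract algorithm, which needs one loop iteration per bit instead of one per unit of the quotient. The sketch below is written for a host compiler and assumes that “long” is 32 bits wide. The names udivmod32, sdiv32 and smod32 are hypothetical and are not the names vbcc expects for its support routines. The signed versions truncate toward zero, as C requires, and do not handle the overflow case of dividing the most negative value by -1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Unsigned shift-and-subtract division. The divisor must not be 0.
 * Returns the quotient and, if rem is not NULL, stores the remainder. */
uint32_t udivmod32(uint32_t a, uint32_t b, uint32_t *rem)
{
    uint32_t q = 0, r = 0;

    assert(b != 0);
    for (int i = 31; i >= 0; i--) {
        uint32_t carry = r >> 31;       /* bit lost by the shift below */
        r = (r << 1) | ((a >> i) & 1);  /* bring down the next bit of a */
        if (carry || r >= b) {
            r -= b;
            q |= (uint32_t)1 << i;
        }
    }
    if (rem != NULL)
        *rem = r;
    return q;
}

/* Absolute value as an unsigned number (also works for INT32_MIN) */
static uint32_t uabs32(int32_t v)
{
    return v < 0 ? 0u - (uint32_t)v : (uint32_t)v;
}

/* Signed division: the quotient is negative if the signs differ */
int32_t sdiv32(int32_t a, int32_t b)
{
    uint32_t q = udivmod32(uabs32(a), uabs32(b), NULL);
    return ((a < 0) != (b < 0)) ? (int32_t)(0u - q) : (int32_t)q;
}

/* Signed modulo: the remainder takes the sign of the dividend */
int32_t smod32(int32_t a, int32_t b)
{
    uint32_t r;
    udivmod32(uabs32(a), uabs32(b), &r);
    return a < 0 ? (int32_t)(0u - r) : (int32_t)r;
}
```

Keeping the unsigned routine separate means the signed and unsigned variants share the same loop, which matters on a target where code size is limited.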
Optimizations
- Due to the allocation of temporary registers by the backend and related register spilling, this sequence can appear in the generated assembly code:
POP R1, R1, [SP]
PUSH R1, R1, [SP]
LD R1, ...
The POP and PUSH can be eliminated, they are useless. This should be done using the peephole optimizer since the register spilling is generated by the backend and does not come from ICs.
- ALLOCREG ICs are not always located just before the first use of the register. This results in registers being marked as allocated for the next few ICs, when in fact they are not in use yet. Propagating the ALLOCREG down until the first IC that actually uses the register would allow more use of temporary registers.
- Optimize TEST ICs if the latest generated instruction already set the flags to the correct value. In particular, in sequences like the following, the CMP could be dropped:
LD R1, ...
CMP R1, 0
- The frontend does not make use of special addressing modes (post-increment, etc). The backend should detect sequences of ICs that can be collapsed to a single instruction. The IC sequence is something like:
Move pointer to register
Add 1 to register
Store to pointer-register
(there are a few variants for pre and post increments, and some other ICs may be inserted in between, making this not so easy to track).
- When setting several registers to the same value, there is one IC to move a constant to each of them. This results in a lot of repeated
LD Rx,nn ; ST Rx,[yyyy]
with the same nn. This can be optimized at the IC level (create an allocreg to store the constant) or maybe a bit lower (just check if the previous IC used the same register with the same value?)
- BRA IC is always translated as GOTO, use LJMP instead so short jumps are used where possible
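As an example of what such a peephole pass could look like, here is a minimal C sketch for the POP/PUSH case from the first item of this list. It works on an array of already generated assembly lines, which is an assumption made for the example and not how the vbcc backend is organized. The function names are hypothetical. The pair is only removed when the next instruction is an LD that overwrites the same register without reading it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* If line matches fmt (for example "POP R%d, R%d, [SP]%n") with the same
 * register twice and nothing after it, return the register number.
 * Otherwise return -1. */
static int stack_op_reg(const char *line, const char *fmt)
{
    int a, b, end = 0;

    if (sscanf(line, fmt, &a, &b, &end) == 2 && end > 0
            && line[end] == '\0' && a == b)
        return a;
    return -1;
}

/* True if line is "LD Rn, ..." for register reg, with an operand that
 * does not mention Rn itself. */
static int overwrites(const char *line, int reg)
{
    int r, end = 0;
    char name[16];

    if (sscanf(line, "LD R%d,%n", &r, &end) != 1 || end == 0 || r != reg)
        return 0;
    snprintf(name, sizeof name, "R%d", reg);
    return strstr(line + end, name) == NULL;
}

/* Remove useless POP/PUSH pairs in place. Returns the new line count. */
size_t peephole(const char **lines, size_t count)
{
    size_t out = 0;

    assert(lines != NULL);
    for (size_t i = 0; i < count; i++) {
        if (i + 2 < count) {
            int r = stack_op_reg(lines[i], "POP R%d, R%d, [SP]%n");
            if (r >= 0
                    && stack_op_reg(lines[i + 1], "PUSH R%d, R%d, [SP]%n") == r
                    && overwrites(lines[i + 2], r)) {
                i++;        /* drop POP and PUSH, the LD is kept next */
                continue;
            }
        }
        lines[out++] = lines[i];
    }
    return out;
}
```

Because the check only looks at three directly adjacent lines, a label or any other instruction between them prevents the removal, which keeps the transformation conservative.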