ABI 0x0
Jul 4, 2026
I'm learning a lot designing my own programming language, 0x0. Some terms I recall from my days of compiling Linux. Most of the rest leave me totally lost. Here's my journey with 0x0's ABI.
Wine can run a Windows program on Linux. On x86-64, that does not mean every instruction is being interpreted by an emulator. Windows code and Linux code can exist in the same process, in the same virtual address space.
The magic ends when one side tries to call the other.
At that point the CPU does not care about "Windows function" or "Linux function". What matters is registers, stack layout, memory addresses, return values, symbol tables, object formats, and ownership rules. That pile of promises is the ABI: the Application Binary Interface.
An API says:
read_proc_version(args)
An ABI says:
- which register contains args;
- which register contains the return value;
- whether the caller or callee cleans anything up;
- how the stack must be aligned;
- what object file symbol exposes the function;
- what the pointed-to memory looks like;
- who owns that memory after the call returns;
- what counts as success or failure.
That is why ABI work is like writing a border treaty between two runtimes.
The fun part is that this is also a useful way to understand what 0x0
is doing with ELF output. 0x0 is not just "emitting a binary".
It is deciding what binary promises exist, how to stamp them into
artifacts, and how to reject artifacts that do not keep those promises.
Wine: Same Process, Different Contracts
ArcaneNibble's post, "How to call Linux code from a Wine process", is a great real-world ABI tour because it starts from a weird but practical question: if Wine has Windows and Linux code in one process, can a Windows-side program call Linux-side code?
The answer is "yes", but every version of "yes" is really an ABI story.
One route is the raw, low-level route. If Linux shared libraries are already
mapped into the process, you can go hunting through loaded ELF metadata.
The post walks through finding the loaded libc image, finding PT_DYNAMIC,
reading dynamic table entries such as DT_SYMTAB, DT_STRTAB,
and DT_GNU_HASH, then manually locating dlsym.
That is already a good ELF lesson: section headers are mostly a link-time and tooling convenience. The runtime loader cares about program headers and dynamic metadata. When inspecting a loaded image, file offsets are not the main thing anymore. Virtual addresses are.
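To make the "program headers, not section headers" point concrete for myself, here is a minimal Python sketch that walks the ELF64 program header table and reports where PT_DYNAMIC lives. It is my own illustration, not code from the Wine post: the function names are made up, and it reads a file image from bytes. For an image already loaded in a process you would read memory instead and add the load base to p_vaddr.

```python
import struct

PT_LOAD, PT_DYNAMIC = 1, 2
EHDR = struct.Struct("<16sHHIQQQIHHHHHH")   # ELF64 file header, 64 bytes
PHDR = struct.Struct("<IIQQQQQQ")           # ELF64 program header, 56 bytes


def program_headers(image: bytes):
    """Yield (p_type, p_offset, p_vaddr, p_memsz) for each program header."""
    fields = EHDR.unpack_from(image, 0)
    ident, e_phoff, e_phentsize, e_phnum = fields[0], fields[5], fields[9], fields[10]
    # e_ident[4] == 2 means ELF64, e_ident[5] == 1 means little-endian.
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("not a little-endian ELF64 image")
    for i in range(e_phnum):
        p_type, _flags, p_offset, p_vaddr, _paddr, _filesz, p_memsz, _align = (
            PHDR.unpack_from(image, e_phoff + i * e_phentsize))
        yield p_type, p_offset, p_vaddr, p_memsz


def find_dynamic(image: bytes):
    """Return (p_vaddr, p_memsz) of PT_DYNAMIC, or None if there is none."""
    for p_type, _offset, p_vaddr, p_memsz in program_headers(image):
        if p_type == PT_DYNAMIC:
            return p_vaddr, p_memsz
    return None
```

The returned address is where the array of DT_* entries starts. Everything after that (DT_SYMTAB, DT_STRTAB, DT_GNU_HASH) is reached through virtual addresses too, which is why file offsets stop being the main thing.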
But finding the function address is not enough. The post has to call the Linux function with the System V x86-64 calling convention, even though the caller is still a Windows program. A function pointer value by itself is just an address. The ABI tells you how to enter that address without corrupting the conversation.
That matters on x86-64 because Windows and System V use different
calling conventions. The first few integer or pointer arguments
live in different registers, stack rules differ, and a compiler needs
to know which contract it is targeting. In Rust terms, this is the
difference between tagging a function as something like
extern "system" for the Windows side and extern "sysv64" for
the Linux side.
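The register difference is easy to state as data. This Python sketch is only a lookup table I wrote to keep the two orders straight. It covers integer and pointer arguments and deliberately ignores floating-point registers, struct passing, and the Windows shadow space. The function name place_args is made up for the example.

```python
# Integer/pointer argument registers, in order, for the two x86-64 conventions.
SYSV_INT_ARGS = ("rdi", "rsi", "rdx", "rcx", "r8", "r9")
WIN64_INT_ARGS = ("rcx", "rdx", "r8", "r9")


def place_args(convention, count):
    """Map each argument index to a register name, or "stack" once registers run out."""
    regs = {"sysv64": SYSV_INT_ARGS, "win64": WIN64_INT_ARGS}[convention]
    return [regs[i] if i < len(regs) else "stack" for i in range(count)]


for conv in ("sysv64", "win64"):
    print(conv, place_args(conv, 5))
# sysv64 ['rdi', 'rsi', 'rdx', 'rcx', 'r8']
# win64 ['rcx', 'rdx', 'r8', 'r9', 'stack']
```

Same five arguments, and not a single one lands in the same place. A caller compiled for one convention that jumps into a function compiled for the other hands over garbage without any fault being raised.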
The cleaner route in the article is Wine's unixlib mechanism.
Instead of freely mixing Windows and Linux code inside one module,
Wine pairs a PE DLL with a normal native ELF .so. The PE side
asks Wine for a handle to the native side, then calls
__wine_unix_call_dispatcher with:
- a unixlib handle;
- a numeric function index;
- one pointer-sized argument.
The ELF side exports a known data symbol, __wine_unix_call_funcs,
which is an array of function pointers. Each function has the same
simple shape: one pointer argument, one 32-bit status result.
Complex arguments get packed behind the pointer.
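The shape of that bridge can be modelled in a few lines of Python. This is a toy model of the idea, not Wine's implementation: CALL_FUNCS stands in for the exported function array, HANDLES and unix_call are names I invented, a dict stands in for the packed argument struct, and the bounds check is my addition rather than a claim about what Wine does.

```python
STATUS_SUCCESS = 0x00000000
STATUS_INVALID_PARAMETER = 0xC000000D


def fn_add(params):
    """Every entry has the same shape: one argument in, one 32-bit status out."""
    params["sum"] = params["a"] + params["b"]   # results travel back through the packed arguments
    return STATUS_SUCCESS


def fn_strlen(params):
    params["length"] = len(params["text"].encode("utf-8"))
    return STATUS_SUCCESS


# Stand-in for the exported function array: position in the list is the function index.
CALL_FUNCS = [fn_add, fn_strlen]
HANDLES = {1: CALL_FUNCS}   # hypothetical handle -> table mapping


def unix_call(handle, index, params):
    """Model of the dispatcher: pick a table by handle, then a function by index."""
    table = HANDLES.get(handle)
    if table is None or not 0 <= index < len(table):
        return STATUS_INVALID_PARAMETER
    return table[index](params) & 0xFFFFFFFF   # keep the result to 32 bits


args = {"a": 2, "b": 40}
status = unix_call(1, 0, args)
print(hex(status), args["sum"])
```

Because every callable has the same signature, the dispatcher only needs to know how to translate one call shape between the two conventions, no matter how many functions sit behind it.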
This is the part I like most: unixlib makes the ABI bridge boring on purpose. The dispatcher can bridge between the Windows ABI and the System V ABI because the callable surface has been reduced to one uniform function shape.
That is a general systems trick. When the boundary is dangerous, make the shape of the boundary small.
ELF, the Container
ELF is the file format in the middle of this story. It can describe executable images, relocatable object files, shared objects, symbols, relocation records, loadable segments, interpreter paths, dynamic linking data, notes, and more.
An ELF file can say:
- "I am ELF64";
- "I target x86-64";
- "load this segment at this virtual address";
- "this is a relocation against this symbol";
- "this dynamic table points to these runtime structures".
It does not, by itself, tell you that a language's Text
value is a NUL-terminated UTF-8 pointer. It does not tell
you that a return value is split between a payload register
and a tag register. It does not tell you whether an object
file is safe to link with another object file from the
same language.
Those are ABI choices layered on top of the object format.
Wine's unixlib works because it uses ELF for the Linux side,
PE for the Windows side, and a deliberately tiny ABI bridge
between them. 0x0 is interesting for the same reason: it
treats ELF as the serialization format, then defines a small
language ABI on top of it.
What 0x0 Actually Manages
0x0 is a symbolic, Lisp-like, pure-functional language kernel
with a self-hosted compiler chain. The repository has several
native artifact paths:
- direct ELF64/x86-64 executables;
- ELF64 relocatable objects;
- archives;
- linked executables;
- compatibility C and GAS paths used for comparison and recovery.
The important thing is that the ABI is not left as an implied property of the current compiler. The repository writes it down and checks it.
The ABI guide says ABI version 0.1 is Linux x86-64 only. It
uses ELF64, little-endian encoding, and an exact marker:
0x0 ABI 0.1
Direct ELF binaries append that marker as a NUL-terminated
string. Relocatable objects carry it in .note.0x0.abi. The
linker rejects inputs with a different marker and appends
the accepted marker to linked output.
That marker is doing the same kind of work as the unixlib exported symbol array, but at a different layer. It is a cheap, inspectable way to say: this artifact claims to speak this binary contract.
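Here is a small Python sketch of the stamp-and-check idea for the direct ELF case, where the marker is appended as a NUL-terminated string. The marker text comes from the ABI guide. The helper names and the "marker must be the final bytes" check are my assumptions for illustration. I am not showing how the real tools locate it, and the .note.0x0.abi path for objects is not modelled.

```python
ABI_MARKER = b"0x0 ABI 0.1\x00"   # exact marker plus the terminating NUL


def stamp(image: bytes) -> bytes:
    """Append the ABI marker to a finished image."""
    return image + ABI_MARKER


def require_abi(image: bytes) -> None:
    """Raise if the image does not end with the expected marker."""
    if not image.endswith(ABI_MARKER):
        expected = ABI_MARKER[:-1].decode("ascii")
        raise ValueError(f"ABI mismatch: expected trailing marker {expected!r}")


good = stamp(b"\x7fELF" + bytes(60))
require_abi(good)                                   # accepted silently

bad = good.replace(b"ABI 0.1", b"ABI 9.9")          # same bytes, different claim
try:
    require_abi(bad)
except ValueError as err:
    print(err)
```

The check is deliberately exact. There is no "compatible enough" branch, which is what makes the marker cheap to trust.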
The current runtime value ABI is intentionally small:
payload: rax
tag: r15
Current direct-ELF tags are:
0 = Unit / nil
1 = I64 or Bool payload
2 = NUL-terminated UTF-8 Text pointer
3 = cons/list pointer
Function returns use rax and r15. Function arguments
use paired payload and tag registers for up to six arguments.
Payloads go through the normal x86-64 argument registers:
arg0 rdi
arg1 rsi
arg2 rdx
arg3 rcx
arg4 r8
arg5 r9
Tags use another register set:
arg0 r10
arg1 r11
arg2 r12
arg3 r13
arg4 r14
arg5 r15
There is no vague "and then more arguments go somewhere"
clause in ABI 0.1. More than six arguments are rejected
by the current direct ELF path. That is a good thing.
A small ABI with a clear failure mode is better than a
larger ABI that exists only as compiler behavior nobody
has written down yet.
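The register tables above translate almost directly into a checkable rule. This Python sketch pairs each argument's payload and tag with its registers and refuses a seventh argument. The register names and tag numbers are the ones listed above. The function name, the error text, and the Text pointer value are hypothetical.

```python
PAYLOAD_REGS = ("rdi", "rsi", "rdx", "rcx", "r8", "r9")
TAG_REGS = ("r10", "r11", "r12", "r13", "r14", "r15")

TAG_UNIT, TAG_I64, TAG_TEXT, TAG_CONS = 0, 1, 2, 3


def assign_arguments(args):
    """args is a list of (tag, payload) pairs; returns {register: value}."""
    if len(args) > len(PAYLOAD_REGS):
        raise ValueError(f"ABI 0.1 passes at most 6 arguments, got {len(args)}")
    regs = {}
    for i, (tag, payload) in enumerate(args):
        regs[PAYLOAD_REGS[i]] = payload   # value goes in the usual argument register
        regs[TAG_REGS[i]] = tag           # its type tag rides in the paired register
    return regs


# An I64 and a Text pointer (0x401000 is a made-up address).
print(assign_arguments([(TAG_I64, 42), (TAG_TEXT, 0x401000)]))
# {'rdi': 42, 'r10': 1, 'rsi': 4198400, 'r11': 2}
```

The rejection branch is the important line. It is the code-shaped version of "no vague clause".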
Text values are NUL-terminated UTF-8. Lists are cons cells. A cons node is a 24-byte record:
offset 0 = car payload
offset 8 = car tag
offset 16 = cdr payload
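Written as a Python struct, that record looks like the sketch below. I am assuming each field is an 8-byte little-endian word, which is what the offsets and the little-endian ABI imply. The helper name and the example cdr address are made up.

```python
import struct

# car payload, car tag, cdr payload: three little-endian 64-bit words.
CONS = struct.Struct("<QQQ")
TAG_I64 = 1
MASK64 = 0xFFFFFFFFFFFFFFFF


def pack_cons(car_payload, car_tag, cdr_payload):
    """Serialize one cons node; negative I64 payloads are stored as two's complement."""
    return CONS.pack(car_payload & MASK64, car_tag, cdr_payload & MASK64)


cell = pack_cons(7, TAG_I64, 0x402000)   # 0x402000 is a hypothetical address of the next node
print(CONS.size, cell.hex())
```

CONS.size comes out as 24, matching the record size in the ABI notes, with the tag sitting at bytes 8 through 15.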
Allocation in the direct ELF runtime is process-lifetime
allocation backed by Linux syscalls such as mmap.
There is no ABI 0.1 garbage collector hiding behind the
scenes. Again, that limitation is useful because it is
explicit.
Direct ELF: Writing The Process Image
One of the more satisfying parts of 0x0 is that direct
ELF output is literal. The compiler emits ELF64/x86-64
bytes as hex text, then the wrapper turns that hex into
an executable file.
In compiler2/elf-runtime.0x0, the ELF helper writes the
ELF magic bytes, ET_EXEC, machine x86-64, an entry
point, one program header, and one PT_LOAD image.
The compatibility compiler has the same spirit: comments in
compiler/compat-main.0x0 say the backend writes a single
PT_LOAD ELF image and serializes little-endian header
fields as hex strings because the seed and compiler paths
pass artifacts as text.
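To see how little a single-PT_LOAD image needs, here is a Python sketch that builds the same kind of file: ELF header, one program header, then code, all serialized little-endian and dumped as hex text. It is not the 0x0 backend. The load address, segment flags, alignment, and function names are my own choices for the example.

```python
import struct

ET_EXEC, EM_X86_64, PT_LOAD = 2, 62, 1
PF_X, PF_R = 1, 4
BASE = 0x400000            # assumed load address, not taken from the 0x0 sources
HEADERS = 64 + 56          # ELF64 header plus one program header


def elf_image(code: bytes) -> bytes:
    """Build a one-segment ELF64 executable image around raw machine code."""
    size = HEADERS + len(code)
    ident = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)   # ELF64, little-endian, version 1
    ehdr = struct.pack("<16sHHIQQQIHHHHHH", ident, ET_EXEC, EM_X86_64, 1,
                       BASE + HEADERS,   # e_entry: code starts right after the headers
                       64, 0, 0,         # e_phoff, e_shoff, e_flags
                       64, 56, 1,        # e_ehsize, e_phentsize, e_phnum
                       0, 0, 0)          # no section headers at all
    phdr = struct.pack("<IIQQQQQQ", PT_LOAD, PF_R | PF_X,
                       0, BASE, BASE,    # p_offset, p_vaddr, p_paddr
                       size, size,       # p_filesz, p_memsz
                       0x1000)           # p_align
    return ehdr + phdr + code


def as_hex_text(image: bytes) -> str:
    """Serialize the image as hex text, the way artifacts are passed around as text."""
    return image.hex()


# mov edi, 42 ; mov eax, 60 (exit) ; syscall
code = bytes.fromhex("bf2a000000b83c0000000f05")
print(as_hex_text(elf_image(code)))
```

Turning the hex back into bytes with bytes.fromhex and marking the file executable should give a Linux x86-64 program intended to exit with status 42. There are no section headers, no symbols, and no linker involved.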
That sounds primitive until you remember what an
executable needs to be at this stage. For the direct path,
0x0 does not need a general-purpose linker. It needs a
deterministic process image for the language slice it supports.
The startup code is also part of the ABI. The direct ELF
startup initializes the runtime arena, preserves process
metadata for argv and env, calls main, and prints
the supported result types. Compiler artifacts use a different
wrapper: they read source and output paths from process
arguments, compile, then write the output file.
This is another place where API and ABI split apart. The
user writes a main function. The emitted executable has
a process entry point, stack assumptions, syscall snippets,
runtime arena initialization, and result-printing behavior.
The ABI is the glue between those layers.
Object Files And The Linker: Refusing To Guess
The object path is where 0x0 starts looking more like a
traditional native toolchain. It emits ELF64 ET_REL
objects and links them into Linux ELF64 executables.
The interesting bit is that zero-link is not just a
wrapper around ld. It reads ELF section and symbol
tables itself and writes the executable itself. The current
linker accepts the compiler's object slice, resolves
symbols, applies supported relocations, lays out .text,
.rodata, .data, and .bss, and writes separate load
segments for executable and writable data.
The supported relocation set is explicit. The diagnostics are explicit. The ABI marker check is explicit.
For example, objects with the wrong ABI marker are not
"probably fine". They are rejected with an ABI mismatch
diagnostic. The tests mutate a deterministic object from
0x0 ABI 0.1 to 0x0 ABI 9.9 and require the linker to
refuse it.
That is the difference between a binary format and a managed ABI. ELF lets you put the bits somewhere. The toolchain policy decides which bits are allowed to cross the boundary.
ABI As Release Evidence
The part I appreciate most in 0x0 is the amount of ABI
state that lives in plain data files:
- abi/VERSION;
- abi/layout-goldens.tsv;
- abi/value-layouts.tsv;
- abi/backend-contract-matrix.tsv;
- abi/diagnostic-contracts.tsv;
- release/abi-contract-evidence.tsv;
- compat/abi-matrix.tsv.
The value layout schema separates executable/object
ABI 0.1 from ABI v1 value layout families. That
distinction matters. The executable marker says whether
artifacts can link and run together. The value layout
schema says how language values such as Option,
Result, Map, Bytes, Error, paths, handles,
actors, sockets, closures, and host buffers are
represented when they cross runtime boundaries.
Those two ideas are related, but they are not the same. A project that merges them too early will eventually have to answer painful questions like: did this change break object compatibility, source compatibility, runtime value compatibility, or only one backend's implementation detail?
0x0's answer is to make the rows separate and then
gate them:
- make abi-check checks released artifacts and ABI examples;
- make abi-layout-check checks the required layout schema and docs;
- make abi-v1-layout-check checks promoted core value layouts;
- make abi-contract-hardening-check checks the broader ABI evidence;
- linker and object tests check marker handling, relocations, symbol resolution, diagnostics, and output layout.
The tests are not just testing "does the program print 42". They are testing whether the artifacts still speak the same binary language.
The Lesson
The Wine article is interesting because it shows how thin the wall between "Windows process" and "Linux process" can be. The wall is the ABI.
The unixlib approach says: make the wall narrow and well-shaped. Use a PE side, a normal ELF side, a known exported function array, a dispatcher, one pointer argument, and one status result.
0x0 is applying the same kind of discipline to its own
compiler outputs. It does not treat "ELF" as a magic word
that makes binaries real. It defines:
- the target;
- the marker;
- the value representation;
- the call registers;
- the stack discipline;
- the object sections;
- the supported relocations;
- the linker behavior;
- the diagnostics;
- the release evidence.
That is what "managing ABI" looks like in practice. You keep the boundary small, stamp the boundary into the artifact, and make every tool that crosses it prove that it understands the stamp.
Maybe the most useful mental model is this:
ELF is the envelope. ABI is the language inside the envelope.
If the language is underspecified, two tools can both produce perfectly valid ELF and still fail to understand each other. If the ABI is explicit, the tools can be small, strict, and boring in the best possible way.
That is the lesson I want to keep from both Wine and 0x0:
the hard part is not always producing bytes. The hard part
is deciding which bytes are a promise.
Sources And Notes
- ArcaneNibble, "How to call Linux code from a Wine process".
- 0x0 repository files inspected while writing this: README.md, docs/abi.md, docs/abi-value-layouts.md, docs/compiler.md, docs/linker.md, docs/native-toolchain.md, compiler2/elf-runtime.0x0, compiler2/object-format.0x0, compiler/compat-main.0x0, tools/zero-link.py, tools/zero-elf-info.py, and ABI/linker tests under tests/.
Glossary
ABI: Application Binary Interface. The binary-level contract for calls, values, registers, stack layout, symbols, object files, and runtime ownership.
API: Application Programming Interface. The source-level interface a programmer uses, such as a function name, parameters, and documented behavior.
ELF: Executable and Linkable Format. The common binary file format for Linux executables, object files, and shared libraries. It describes segments, sections, symbols, and relocations.
PE: Portable Executable. The Windows executable and DLL file format.
DLL: Dynamic Link Library. A Windows shared library.
SO: Shared Object. A Unix/Linux shared library, usually
ending in .so.
FFI: Foreign Function Interface. A way for code written for one language or runtime to call code written for another.
GAS: GNU Assembler. The assembler syntax/tooling path used by the compatibility object backend.
SysV: System V. In this post, shorthand for the System V x86-64 ABI used by Linux and other Unix-like systems on x86-64.
WinAPI: Windows API. The operating-system API exposed by Windows.
WOW64: Windows-on-Windows 64-bit. The Windows subsystem for running 32-bit Windows code on 64-bit Windows; Wine has analogous concerns for 32-bit-on-64-bit support.
libc: The C standard library implementation used by a Unix-like system, often also exposing POSIX and Linux-specific functions.
POSIX: Portable Operating System Interface. A family of Unix-like operating system API standards.
mmap: A Unix/Linux system call used to map memory
into a process address space.
DT_*: ELF dynamic table tags, such as DT_SYMTAB
and DT_STRTAB.
PT_*: ELF program header types, such as PT_LOAD
and PT_DYNAMIC.
ET_*: ELF file types, such as ET_EXEC and ET_REL.
PT_LOAD: An ELF program header type telling the loader
to map a segment into memory.
PT_DYNAMIC: An ELF program header type pointing to
dynamic linking metadata.
ET_EXEC: An ELF file type for executable files.
ET_REL: An ELF file type for relocatable object
files.
PLT: Procedure Linkage Table. ELF machinery commonly used for calls that are resolved through dynamic linking.
GOT: Global Offset Table. ELF machinery commonly used to hold addresses needed by position-independent and dynamically linked code.
NUL: The zero byte, often used to terminate C strings.
UTF-8: Unicode Transformation Format 8-bit. A byte encoding for Unicode text.
OISA: The intermediate instruction set used by 0x0
in its compiler pipeline.