Optimizing the Calling Convention for Memory Safety: A Deep Dive into Fil-C

May 18, 2026

Memory safety in systems programming often comes with a steep performance tax. For programs that behave adversarially—such as those casting function pointers to incorrect signatures or misusing va_list—ensuring safety typically requires exhaustive runtime checks. Fil-C, a project by Filip Pizlo, aims to solve this by implementing a calling convention that catches type violations with a panic or ascribes safe behavior to them, without sacrificing efficiency in the common case.

To achieve this, Fil-C employs a tiered approach to function calls: a generic, safe-by-default convention that serves as the fallback, and a series of aggressive optimizations that allow the compiler to bypass checks when it can prove the call is safe.

The Generic Calling Convention: The Safety Baseline

Before optimizing, Fil-C defines a generic calling convention that guarantees safety regardless of how a function is called. The process is rigorous:

  1. Resolution: Direct calls are lowered to "getter calls" that resolve symbol names to flight pointers (tuples of capability pointers and integer values).
  2. Verification: The system verifies that the capability is not null, is specifically a function capability, and that the pointer's integer value matches the capability's callable value.
  3. Buffering: Arguments are rounded to 8 bytes, and two thread-local Calling Convention (CC) buffers are allocated—one for the payload and one for capabilities.
  4. Transfer: Control is transferred to the callee, which heap-allocates byref parameters and copies arguments from the CC buffers into local data flow.
  5. Return: The return process mirrors the argument passing, using CC buffers to transfer the result back to the caller.

While robust, this process is inefficient. It avoids registers entirely, requiring constant memory access to thread-local buffers and multiple layers of indirection for every call.
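Steps 2 and 3 of the generic convention can be sketched in a few lines of C. This is a minimal illustration, not Fil-C's runtime: the `capability` and `flight_ptr` layouts, the `cap_kind` enum, and both function names are hypothetical, chosen only to mirror the three verification checks and the 8-byte rounding described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability layout; Fil-C's real one is not shown in the text. */
typedef enum { CAP_OBJECT, CAP_FUNCTION } cap_kind;

typedef struct capability {
    cap_kind kind;       /* what the capability grants access to */
    uintptr_t callable;  /* the only address at which it may be called */
} capability;

/* A flight pointer: a capability pointer paired with an integer value. */
typedef struct flight_ptr {
    const capability *cap;
    uintptr_t value;
} flight_ptr;

/* Step 2: all three checks must pass before control may be transferred. */
bool flight_ptr_is_callable(flight_ptr p)
{
    return p.cap != NULL
        && p.cap->kind == CAP_FUNCTION
        && p.value == p.cap->callable;
}

/* Step 3: payload bytes needed once each argument is rounded up to 8 bytes. */
size_t cc_payload_size(const size_t *arg_sizes, size_t num_args)
{
    assert(arg_sizes != NULL || num_args == 0);
    size_t total = 0;
    for (size_t i = 0; i < num_args; i++)
        total += (arg_sizes[i] + 7) & ~(size_t)7;
    return total;
}
```

For arguments of 1, 4, 8, and 12 bytes, `cc_payload_size` returns 40, because the first three each occupy one 8-byte slot and the last occupies two.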

Register Optimization via Arithmetic Signature Encoding

To eliminate the overhead of CC buffers, Fil-C introduces a register-based calling convention. The core innovation here is the use of arithmetic encoding to represent function signatures as 64-bit integers.

How Arithmetic Encoding Works

Fil-C encodes signatures (up to 16 arguments and 2 return values) into a single int64. By assigning numeric values to types (e.g., int = 0, double = 2, pointer = 7), it creates a perfect hash of the signature. For example, the signature char* (*)(int, char*, double) is encoded as 60125.
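To make the idea concrete, here is one way a mixed-radix ("arithmetic") encoding could pack a signature into a single integer. It is a sketch under stated assumptions: only the codes int = 0, double = 2, and pointer = 7 come from the text, while the radix of 8, the digit order, the use of `uint64_t`, and the names `sig_push` and `sig_encode` are hypothetical. It does not reproduce Fil-C's actual numbering, so it will not yield 60125 for the example above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Limits from the text: up to 16 arguments and 2 return values.
   SIG_NUM_TYPES = 8 is an assumption that fits the codes 0..7. */
enum { SIG_NUM_TYPES = 8, SIG_MAX_ARGS = 16, SIG_MAX_RETS = 2 };

/* int = 0, double = 2 and pointer = 7 come from the text. */
typedef enum { SIG_INT = 0, SIG_DOUBLE = 2, SIG_POINTER = 7 } sig_type;

/* Append one digit in the given radix to the running number. */
static uint64_t sig_push(uint64_t acc, uint64_t radix, uint64_t digit)
{
    assert(digit < radix);
    return acc * radix + digit;
}

/* Encode return types, then argument types, then both counts. */
uint64_t sig_encode(const sig_type *rets, size_t num_rets,
                    const sig_type *args, size_t num_args)
{
    assert(num_rets <= SIG_MAX_RETS && num_args <= SIG_MAX_ARGS);
    uint64_t acc = 0;
    for (size_t i = 0; i < num_rets; i++)
        acc = sig_push(acc, SIG_NUM_TYPES, (uint64_t)rets[i]);
    for (size_t i = 0; i < num_args; i++)
        acc = sig_push(acc, SIG_NUM_TYPES, (uint64_t)args[i]);
    /* Counts go last, so (int) and (int, int) get different numbers
       even though int itself is encoded as 0. */
    acc = sig_push(acc, SIG_MAX_ARGS + 1, num_args);
    acc = sig_push(acc, SIG_MAX_RETS + 1, num_rets);
    return acc;
}
```

Under this hypothetical scheme, char* (*)(int, char*, double) encodes to 185752. The largest possible value is below 2^60, so every signature within the stated limits fits in 64 bits, and because the counts can be read back from the low digits, two different signatures never share a number.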

The Fast Path and Thunks

Every function object in Fil-C contains a signature field and two entry points: a fast_entrypoint (native register-based) and a generic_entrypoint (buffer-based).

When a call is made, the caller checks if the callee's signature matches the expected encoding. If it matches, the caller jumps directly to the fast_entrypoint, passing arguments in registers. If the signatures differ, the system employs a pair of thunks:

  • Caller Entrypoint Thunk: Translates a register-based call into the generic buffer-based convention.
  • Callee Entrypoint Thunk: Translates a generic buffer-based call into a register-based one.

These thunks are generated as linkonce_odr in LLVM IR, ensuring that the linker only keeps one copy across modules. This mechanism allows Fil-C to maintain safety (via the generic path) while achieving a >1% speed-up on the PizBench9019 benchmark.
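The dispatch between the two entry points can be modelled in plain C. The sketch below is illustrative only: the `function_object` and `cc_buffer` layouts, the placeholder signature numbers, and the choice to read missing arguments as zero are assumptions, and the caller thunk is written inline rather than as a separate linkonce_odr function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { CC_MAX_SLOTS = 16 };

/* Stand-in for a thread-local CC buffer of 8-byte slots. */
typedef struct cc_buffer {
    int64_t slots[CC_MAX_SLOTS];
    size_t count;
} cc_buffer;

typedef void (*any_fn)(void);
typedef void (*generic_fn)(const cc_buffer *args, cc_buffer *rets);

typedef struct function_object {
    uint64_t signature;            /* encoded signature of the callee */
    any_fn fast_entrypoint;        /* register-based body */
    generic_fn generic_entrypoint; /* buffer-based entry */
} function_object;

/* Placeholder signature numbers, not real Fil-C encodings. */
enum { SIG_INT_INT_TO_INT = 1, SIG_INT_TO_INT = 2 };

static int add_fast(int a, int b) { return a + b; }
static int negate_fast(int a) { return -a; }

/* Callee entrypoint thunks: unpack the buffer, call the register-based body.
   Reading a missing argument as 0 is an assumption of this sketch. */
static void add_generic(const cc_buffer *args, cc_buffer *rets)
{
    int a = args->count > 0 ? (int)args->slots[0] : 0;
    int b = args->count > 1 ? (int)args->slots[1] : 0;
    rets->slots[0] = add_fast(a, b);
    rets->count = 1;
}

static void negate_generic(const cc_buffer *args, cc_buffer *rets)
{
    int a = args->count > 0 ? (int)args->slots[0] : 0;
    rets->slots[0] = negate_fast(a);
    rets->count = 1;
}

const function_object add_object =
    { SIG_INT_INT_TO_INT, (any_fn)add_fast, add_generic };
const function_object negate_object =
    { SIG_INT_TO_INT, (any_fn)negate_fast, negate_generic };

/* A call site that expects the signature int(int, int). */
int call_int_int(const function_object *callee, int a, int b)
{
    assert(callee != NULL);
    if (callee->signature == SIG_INT_INT_TO_INT) {
        /* Fast path: signatures agree, so the cast is known to be correct. */
        int (*fast)(int, int) = (int (*)(int, int))callee->fast_entrypoint;
        return fast(a, b);
    }
    /* Caller entrypoint thunk: fall back to the buffer-based convention. */
    cc_buffer args = { { a, b }, 2 };
    cc_buffer rets = { { 0 }, 0 };
    callee->generic_entrypoint(&args, &rets);
    return rets.count > 0 ? (int)rets.slots[0] : 0;
}
```

Calling `call_int_int(&add_object, 2, 3)` takes the fast path and returns 5. Calling `call_int_int(&negate_object, 2, 3)` mismatches on signature, goes through the buffers, and returns -2 with the extra argument ignored, so the wrong-signature call never reaches the register-based body through a miscast pointer.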

Eliminating Direct Caller Resolution

Even with register passing, direct calls still require a getter call and capability checks. Fil-C optimizes this further by leveraging ELF symbol mangling.

Signature-Mangled Implementations

Instead of calling a getter, the compiler exports an ELF symbol for the implementation itself, mangled to include the signature (e.g., pizlonatedFI60125_foo). If the caller and callee agree on the signature, the call becomes a direct jump to the implementation, bypassing the getter, the capability check, and the signature check entirely.
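The effect of signature mangling can be shown with a small name-building helper. This is a sketch: the only format evidence in the text is the single example pizlonatedFI60125_foo, so rendering the signature in decimal and joining it to the name with an underscore is an assumption, and `mangle_impl_symbol` and `direct_call_binds` are hypothetical names.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

enum { SYMBOL_MAX = 256 };

/* Build "pizlonatedFI<signature>_<name>"; returns false if it does not fit. */
bool mangle_impl_symbol(char *out, size_t out_size,
                        uint64_t signature, const char *name)
{
    assert(out != NULL && name != NULL);
    int n = snprintf(out, out_size, "pizlonatedFI%" PRIu64 "_%s",
                     signature, name);
    return n >= 0 && (size_t)n < out_size;
}

/* A direct call can bind only if the symbol the caller references is
   exactly the symbol the callee exports. */
bool direct_call_binds(uint64_t caller_sig, uint64_t callee_sig,
                       const char *name)
{
    char wanted[SYMBOL_MAX];
    char exported[SYMBOL_MAX];
    if (!mangle_impl_symbol(wanted, sizeof wanted, caller_sig, name))
        return false;
    if (!mangle_impl_symbol(exported, sizeof exported, callee_sig, name))
        return false;
    return strcmp(wanted, exported) == 0;
}
```

The design point this illustrates is that the signature check moves from run time to symbol resolution: a caller that disagrees with the callee about the signature asks for a different symbol name, so it can never be bound directly to the mismatched implementation.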

Handling Edge Cases with Weak Symbols

This optimization introduces complexities with ELF loading and C++ inline functions. To prevent infinite loops where a thunk calls itself, Fil-C uses hidden visibility for callsite thunks and a specific naming convention (pizlonatedFIP vs pizlonatedFI) to distinguish between the implementation and the alias.

For C++ inline functions—which are often weak definitions in COMDAT groups—the linker might drop the implementation the caller is trying to reference. Fil-C solves this by:

  1. Modifying LLVM to acknowledge that locally defined COMDAT symbols may be NULL.
  2. Emitting a NULL check for direct calls to these symbols.

This ensures that if COMDAT resolution drops a function, the direct call site detects the missing implementation at run time through the NULL check instead of jumping to a null address.
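The shape of such a guarded direct call is sketched below in portable C. A function pointer parameter stands in for the address of the possibly dropped symbol, and the fallback to a checked path is an assumption, since the text does not say what Fil-C does when the symbol turns out to be NULL. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*impl_fn)(int);

/* Counts how often the slower path ran, for demonstration only. */
int checked_path_calls = 0;

/* Stand-in for the signature-mangled implementation. */
int twice_impl(int x) { return 2 * x; }

/* Stand-in for the fully checked route to the same function. */
int twice_checked(int x)
{
    checked_path_calls++;
    return 2 * x;
}

/* direct_impl models a symbol address that may have resolved to NULL. */
int call_direct_or_fallback(impl_fn direct_impl, impl_fn checked_path, int x)
{
    assert(checked_path != NULL); /* the fallback itself must always exist */
    if (direct_impl != NULL)
        return direct_impl(x);    /* common case: one extra compare */
    return checked_path(x);       /* symbol was dropped: take the safe route */
}
```

With `twice_impl` present the call costs a single comparison beyond the jump; passing NULL instead produces the same result through `twice_checked` and increments `checked_path_calls`.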

Summary of Performance Gains

By moving from a generic buffer-based approach to a direct-call register-based approach, Fil-C removes almost all overhead from the common case of function calls. The transition looks like this:

| Feature          | Generic Convention           | Optimized Convention                 |
|------------------|------------------------------|--------------------------------------|
| Argument Passing | Thread-local buffers         | CPU registers                        |
| Resolution       | Getter call                  | Direct jump to mangled symbol        |
| Safety Checks    | Full capability & size check | Single signature match or NULL check |
| Return Values    | Buffer-based                 | CPU registers                        |

Combined, these optimizations remove most of the per-call overhead while keeping Fil-C's strict memory-safety guarantee, showing that strong safety does not have to come with a large performance penalty.
