PVM Recompilation
The recompiler translates PVM bytecode into native x86-64 machine code for direct CPU execution, providing 10-100x performance improvements over interpretation.
Overview
The recompiler performs one-time compilation at program load, generating native code that executes directly on the CPU. PVM registers map to x86-64 CPU registers, and PVM memory is accessed through a base pointer.
Compilation Process
1. Program Analysis
- Skip Values: Pre-computed for fast instruction traversal
- Basic Blocks: Identified by scanning for terminator instructions (jumps, branches, trap, ecalli)
# Basic blocks start at terminator + 1 + skip
basic_blocks = [0]
for n in range(len(instruction_set)):
    if offset_bitmask[n] and inst_map.is_terminating(instruction_set[n]):
        basic_blocks.append(n + 1 + skip(n))
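The scan above can be tried on a toy program. The following is a minimal sketch, not the project's code: the opcode numbers, the `TERMINATORS` set, and the helper names are hypothetical, and `skip` is assumed to be the distance from an opcode to the next set bit in the bitmask, minus one (any upper cap on that distance is omitted here).

```python
from typing import List, Set

# Hypothetical opcode numbers, chosen only for this illustration.
TRAP, JUMP, ADD = 0, 40, 190
TERMINATORS: Set[int] = {TRAP, JUMP}


def skip(bitmask: List[bool], n: int) -> int:
    """Count the operand bytes that follow the opcode at offset n."""
    j = 0
    while n + 1 + j < len(bitmask) and not bitmask[n + 1 + j]:
        j += 1
    return j


def find_basic_blocks(code: List[int], bitmask: List[bool]) -> List[int]:
    """Return the offsets at which basic blocks start."""
    blocks = [0]
    for n, byte in enumerate(code):
        # Only bytes flagged in the bitmask are opcodes; the rest are operands.
        if bitmask[n] and byte in TERMINATORS:
            blocks.append(n + 1 + skip(bitmask, n))
    return blocks


# add (3 bytes), jump (2 bytes), add (3 bytes), trap (1 byte)
code = [ADD, 1, 2, JUMP, 5, ADD, 3, 4, TRAP]
bitmask = [True, False, False, True, False, True, False, False, True]
blocks = find_basic_blocks(code, bitmask)
print(blocks)
```

Note that a terminator at the very end of the program yields a block start equal to the program length, so a consumer of this list may want to drop or bounds-check the final entry.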
2. Assembly Generation
- Labels: Created for each basic block (jump targets)
- Instructions: Each PVM instruction translated by dispatch table
- Gas Metering: Inline gas checks per instruction
def assemble(program):
    asm = PyAssembler()
    # Create a label at every instruction start (set bit in offset_bitmask)
    labels = {i: asm.forward_declare_label()
              for i in range(len(instruction_set))
              if offset_bitmask[i]}
    # Translate each instruction
    for counter in range(len(instruction_set)):
        if offset_bitmask[counter]:
            asm.define_label(labels[counter])
            opcode = instruction_set[counter]
            # Emit gas check
            emit_gas_check(asm, inst_map.gas_cost(opcode))
            # Emit instruction
            inst_map.process_instruction(opcode, program, counter, asm)
    return asm.finalize()
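To make the dispatch-table idea concrete, here is a small self-contained sketch. Everything in it is an assumption for illustration: `ToyAssembler` records text lines rather than encoding machine code, and the opcode numbers, handlers, and gas costs are made up. It only shows the shape of the loop (label, gas check, then a per-opcode handler).

```python
from typing import Callable, Dict, List


class ToyAssembler:
    """Stand-in for a real assembler: collects text instead of bytes."""

    def __init__(self) -> None:
        self.lines: List[str] = []

    def emit(self, text: str) -> None:
        self.lines.append(text)


# Hypothetical handlers: each one emits native code for a single opcode.
def emit_trap(asm: ToyAssembler, code: List[int], pc: int) -> None:
    asm.emit("ud2")


def emit_load_imm(asm: ToyAssembler, code: List[int], pc: int) -> None:
    asm.emit(f"mov rdi, {code[pc + 1]}")


# Hypothetical opcode numbers and gas costs.
DISPATCH: Dict[int, Callable[[ToyAssembler, List[int], int], None]] = {
    0: emit_trap,
    51: emit_load_imm,
}
GAS_COST: Dict[int, int] = {0: 1, 51: 1}


def assemble(code: List[int], bitmask: List[bool]) -> List[str]:
    asm = ToyAssembler()
    for pc, is_opcode in enumerate(bitmask):
        if not is_opcode:
            continue  # operand byte, already consumed by its opcode's handler
        opcode = code[pc]
        asm.emit(f"L{pc}:")                                 # label for jump targets
        asm.emit(f"; gas check, cost {GAS_COST[opcode]}")   # placeholder for the inline check
        DISPATCH[opcode](asm, code, pc)                     # per-opcode translation
    return asm.lines


lines = assemble([51, 7, 0], [True, False, True])
print("\n".join(lines))
```

A table keyed by opcode keeps the main loop free of per-instruction branching, and adding an instruction means adding one handler and one table entry.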
3. Gas Metering
Gas is stored in the VM context at offset gas_offset from R15 (the base pointer). Each instruction subtracts its cost from the gas counter and jumps to the out-of-gas handler if the result is negative:
; Gas check (per instruction)
sub r15, gas_offset_value ; Point to gas field
sub [r15], <gas_cost> ; Decrement gas
js out_of_gas_handler ; Jump if negative
add r15, gas_offset_value ; Restore pointer
Offset calculation: gas_offset = -8 - (8 * 13) - 8 - 8 = -128
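The effect of the `sub` / `js` pair can be modelled in a few lines. This is a sketch under the assumption that the gas field is treated as a signed 64-bit integer; `charge_gas` is a hypothetical helper, not part of the recompiler.

```python
from typing import Tuple

MASK64 = (1 << 64) - 1


def charge_gas(gas: int, cost: int) -> Tuple[int, bool]:
    """Model `sub [gas], cost` followed by `js`.

    Returns (new_gas, out_of_gas), where out_of_gas mirrors the CPU sign flag.
    """
    raw = (gas - cost) & MASK64            # 64-bit wrap-around, as the CPU computes it
    sign_flag = bool(raw >> 63)            # SF is the top bit of the result
    new_gas = raw - (1 << 64) if sign_flag else raw  # reinterpret as signed
    return new_gas, sign_flag


print(charge_gas(10, 3))
print(charge_gas(3, 3))
print(charge_gas(2, 3))
```

Reaching exactly zero does not trigger the handler; only a charge that drives the counter below zero does.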
4. Executable Memory Allocation
Native code requires executable memory with proper permissions:
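import ctypes
import mmap

# Assumption: libc is loaded from the current process (Linux/macOS)
libc = ctypes.CDLL(None, use_errno=True)

def allocate_executable_memory(code: bytes):
    size = len(code)
    page_size = mmap.PAGESIZE
    # Round up to a whole number of pages
    alloc_size = (size + page_size - 1) & ~(page_size - 1)
    # Allocate RW memory
    buf = mmap.mmap(-1, alloc_size, access=mmap.ACCESS_WRITE)
    buf.write(code)
    # Change to RX (read + execute)
    addr = ctypes.addressof(ctypes.c_char.from_buffer(buf))
    aligned_addr = addr & ~(page_size - 1)
    # Wrap arguments explicitly: a bare Python int is passed as a C int and truncated
    rc = libc.mprotect(ctypes.c_void_p(aligned_addr), ctypes.c_size_t(alloc_size),
                       mmap.PROT_READ | mmap.PROT_EXEC)
    if rc != 0:
        raise OSError(ctypes.get_errno(), "mprotect failed")
    return buf, addr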
Register Mapping
PVM's 13 registers map directly to x86-64 CPU registers:
| PVM | x86-64 | Purpose |
|---|---|---|
| r0 | RDI | Return address |
| r1 | RAX | Stack pointer |
| r2 | RSI | Temporary |
| r3 | RBX | Temporary |
| r4 | RDX | Temporary |
| r5 | RBP | Saved |
| r6 | R8 | Saved |
| r7 | R9 | Argument/Return |
| r8 | R10 | Argument |
| r9 | R11 | Argument |
| r10 | R12 | Saved |
| r11 | R13 | Saved |
| r12 | R14 | Saved |
Special Registers:
- R15: Base pointer to PVM memory (guest memory)
- RCX: Temporary register for internal use
All PVM registers persist in CPU registers during execution for zero-overhead access.
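The table can be written down as a lookup structure, which also makes it easy to check that no host register is used twice and that the reserved registers stay free. This is a sketch; the names `PVM_TO_X86`, `RESERVED`, and `host_register` are illustrative and not taken from the implementation.

```python
from typing import Dict

# PVM register index -> x86-64 register, copied from the table above.
PVM_TO_X86: Dict[int, str] = {
    0: "rdi", 1: "rax", 2: "rsi", 3: "rbx", 4: "rdx",
    5: "rbp", 6: "r8", 7: "r9", 8: "r10", 9: "r11",
    10: "r12", 11: "r13", 12: "r14",
}

# Host registers the recompiler keeps for itself.
RESERVED: Dict[str, str] = {
    "r15": "guest memory base pointer",
    "rcx": "internal temporary",
}


def host_register(pvm_reg: int) -> str:
    """Return the x86-64 register that holds the given PVM register."""
    if pvm_reg not in PVM_TO_X86:
        raise ValueError(f"PVM has registers r0..r12, got r{pvm_reg}")
    return PVM_TO_X86[pvm_reg]


# Sanity checks: 13 distinct host registers, none of them reserved.
assert len(set(PVM_TO_X86.values())) == 13
assert not set(PVM_TO_X86.values()) & set(RESERVED)

print(host_register(7))
```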
VM Context
VM context stores registers, gas, jump table, and heap state in memory before the guest memory buffer:
Memory Layout:
┌──────────────────────┐
│ Jump Table │ Variable size (8 bytes × jump_table_len)
├──────────────────────┤
│ Jump Table Length │ 8 bytes (offset: jump_len_offset)
├──────────────────────┤
│ Registers[13] │ 104 bytes (offset: regs_offset = -128 - 8)
├──────────────────────┤
│ Gas │ 8 bytes (offset: gas_offset = -128)
├──────────────────────┤
│ Return Address │ 8 bytes (offset: ret_add_offset = -120)
├──────────────────────┤
│ Return Stack │ 8 bytes (offset: ret_stack_offset = -112)
├──────────────────────┤
│ Heap Start │ 4 bytes (offset: heap_start_offset = -4)
├══════════════════════┤ ← R15 points here (guest memory start)
│ Guest Memory │ 2GB (PVM memory)
└──────────────────────┘
Access Pattern:
; Access gas at offset -128 from R15
mov rax, [r15 - 128]
; Access register r8 (9th register, index 8) at offset -136 + (8×8) = -72
mov rax, [r15 - 72]
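The same negative-offset addressing can be reproduced from Python on a plain byte buffer. The sketch below uses only the gas offset (-128) stated above; the buffer sizes, the little-endian signed 64-bit encoding, and the helper names are assumptions made for illustration.

```python
import struct

GAS_OFFSET = -128      # from the layout above
CTX_SIZE = 256         # illustrative; the real context also holds the jump table
GUEST_SIZE = 64        # tiny stand-in for guest memory

buf = bytearray(CTX_SIZE + GUEST_SIZE)
base = CTX_SIZE        # index that plays the role of R15 (guest memory start)


def write_gas(buf: bytearray, base: int, gas: int) -> None:
    """Store gas as a little-endian signed 64-bit value at base + GAS_OFFSET."""
    struct.pack_into("<q", buf, base + GAS_OFFSET, gas)


def read_gas(buf: bytearray, base: int) -> int:
    """Equivalent of `mov rax, [r15 - 128]`."""
    return struct.unpack_from("<q", buf, base + GAS_OFFSET)[0]


write_gas(buf, base, 1_000_000)
print(read_gas(buf, base))
```

Keeping the context at negative offsets means guest addresses can be used as non-negative offsets from the same base pointer, so a single register serves both purposes.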
Execution Flow
Caller Wrapper
A "caller" function wraps the generated code to save/restore host state:
def create_caller(code_pointer: int, mem_pointer: int):
    asm = PyAssembler()
    # Set up pointers
    asm.mov_imm64(RCX, code_pointer)  # Code entry point
    asm.mov_imm64(R15, mem_pointer)   # PVM memory base
    # Save host CPU state
    push_all_regs(asm)
    # Load PVM registers from VM context
    load_all_regs(asm)
    # Call generated code
    asm.call(RegMem.Reg(RCX))
    # Save PVM registers back to VM context
    save_all_regs(asm)
    # Restore host CPU state
    pop_all_regs(asm)
    asm.ret()
    return asm.finalize()
Signal Handling
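PVM page faults and host calls are handled through signal handlers installed by a C extension (segwrap.c). Page faults arrive as SIGSEGV; the UD2 instruction used for host calls, HALT, and OUT_OF_GAS raises an illegal-instruction signal (SIGILL on Linux):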
Signal Handler Types:
- Status 0: Host call (ecalli); si_data contains the host call number
- Status 1: Page fault; si_data contains the faulting address
- Status 2: UD2 instruction (used for HALT/OUT_OF_GAS)
Execution Wrapper:
def run_code(addr, vm_ctx, vm_pointer, halt_addr):
    # Install signal handlers
    segwrap.initialize()
    # Execute native code
    ret_val = ctypes.c_uint64(0)
    result = segwrap.run_code(ctypes.c_uint64(addr), ctypes.byref(ret_val))
    if result == 0:
        # No signal - normal return (panic)
        return PANIC
    else:
        # Signal occurred - get program state
        pg_data = ProgramData()
        segwrap.get_program_status(ctypes.byref(pg_data))
        if pg_data.status == 0:
            return HOST(pg_data.si_data)  # Host call
        elif pg_data.status == 1:
            return PAGE_FAULT(pg_data.si_data - pg_data.r15)  # Page fault
        elif pg_data.status == 2:
            return HALT if pg_data.si_data == halt_addr else OUT_OF_GAS
HALT Implementation:
; HALT handler in generated code
halt_label:
    ud2 ; Illegal instruction → signal handler reports status 2
Host Call (ecalli) Implementation:
; ecalli <imm>
mov [r15 + si_offset], <imm> ; Store host call number
ud2 ; Trigger the signal handler
; After host call returns, execution continues here
ProgramData Structure
Register state captured on signal:
struct ProgramData {
    uint64_t r8, r9, r10, r11, r12, r13, r14; // PVM r6-r12
    uint64_t r15;     // Memory base pointer
    uint64_t rdi;     // PVM r0
    uint64_t rsi;     // PVM r2
    uint64_t rbp;     // PVM r5
    uint64_t rbx;     // PVM r3
    uint64_t rdx;     // PVM r4
    uint64_t rax;     // PVM r1
    uint64_t rcx;     // Temporary
    uint64_t rsp;     // Stack pointer
    uint64_t rip;     // Instruction pointer
    uint64_t eflags;  // CPU flags
    uint64_t si_data; // Signal data (addr/host call number)
    int8_t status;    // Signal type (0=HOST, 1=PAGE_FAULT, 2=HALT/OOG)
};
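On the Python side, a struct like this is typically mirrored with `ctypes.Structure` so it can be passed by reference to the C extension. The following is a sketch of what such a mirror could look like, with field order copied from the C definition above; the `pvm_registers` helper is hypothetical and simply reorders the captured host registers into PVM order using the register mapping table.

```python
import ctypes
from typing import List

_U64_FIELDS = (
    "r8", "r9", "r10", "r11", "r12", "r13", "r14",  # PVM r6-r12
    "r15",                                          # memory base pointer
    "rdi", "rsi", "rbp", "rbx", "rdx", "rax",       # PVM r0, r2, r5, r3, r4, r1
    "rcx", "rsp", "rip", "eflags", "si_data",
)


class ProgramData(ctypes.Structure):
    # Field order must match the C struct exactly.
    _fields_ = [(name, ctypes.c_uint64) for name in _U64_FIELDS] + [
        ("status", ctypes.c_int8),
    ]


def pvm_registers(pg: ProgramData) -> List[int]:
    """Return PVM r0..r12 from the captured host registers."""
    order = ("rdi", "rax", "rsi", "rbx", "rdx", "rbp",
             "r8", "r9", "r10", "r11", "r12", "r13", "r14")
    return [getattr(pg, name) for name in order]


pg = ProgramData()
pg.rdi, pg.rax, pg.r9 = 11, 22, 33   # PVM r0, r1, r7
regs = pvm_registers(pg)
print(regs)
```

With 19 eight-byte fields ahead of it, `status` sits at byte offset 152, which is a quick way to confirm that the Python and C layouts agree.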
PC Mapping
Native code offsets map back to PVM instruction offsets:
PVM → Native:
def pvm_to_msn_index(pvm_offset: int) -> int:
    # Count set bits in bitmask up to pvm_offset
    bms = offset_bitmask[:pvm_offset]
    return pvm_msn_map[bms.count(True)]
Native → PVM:
import bisect

def msn_to_pvm_index(msn_offset: int) -> int:
    # Binary search in pvm_msn_map (assumes native offsets are in ascending order)
    i = bisect.bisect_right(pvm_msn_map, msn_offset)
    # Entry i - 1 is the last one that starts at or before msn_offset
    return find_pvm_offset_for_index(i - 1)
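A round trip on toy data shows how the two directions fit together. This is a sketch with made-up values: the bitmask, the native offsets, and the function names `pvm_to_native` / `native_to_pvm` are illustrative, and the map is assumed to hold one ascending native offset per PVM instruction.

```python
import bisect
from typing import List

# Opcodes start at PVM offsets 0, 3 and 5.
offset_bitmask: List[bool] = [True, False, False, True, False, True]
# Native code offset of each instruction, in program order (made-up values).
pvm_msn_map: List[int] = [0, 12, 20]


def pvm_to_native(pvm_offset: int) -> int:
    """Instruction index = number of opcode starts before pvm_offset."""
    index = offset_bitmask[:pvm_offset].count(True)
    return pvm_msn_map[index]


def native_to_pvm(native_offset: int) -> int:
    """Find the instruction whose native code contains native_offset."""
    index = bisect.bisect_right(pvm_msn_map, native_offset) - 1
    starts = [i for i, bit in enumerate(offset_bitmask) if bit]
    return starts[index]


print(pvm_to_native(3))
print(native_to_pvm(15))
```

A native offset in the middle of an instruction's code (15 here) maps back to the PVM offset where that instruction begins (3), which is what a fault handler needs to report a PVM program counter.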
Performance
- Compilation Overhead: 1-5ms (one-time)
- Execution Speedup: 10-100x vs interpreter
| Workload | Interpreter | Recompiler | Speedup |
|---|---|---|---|
| Arithmetic | 100ms | 2ms | 50x |
| Memory-heavy | 80ms | 8ms | 10x |
| Mixed | 120ms | 6ms | 20x |
Memory Overhead: 2-5x bytecode size for native code
Platform Support
Supported:
- ✅ x86-64
Unsupported:
- ❌ ARM64 (would need separate register mapping)
- ❌ Windows (signal handling differences)
Usage
import os

# Select the recompiler backend before importing the package
os.environ['PVM_MODE'] = 'recompiler'

from tsrkit_pvm import Recompiler as PVM
from tsrkit_pvm import REC_Memory as Memory
from tsrkit_pvm import REC_Program as Program

program = Program.from_blob(bytecode)
status, pc, gas, regs, mem = PVM.execute(
    program, pc=0, gas=1_000_000,
    registers=[0]*13, memory=Memory()
)
References
- Implementation: tsrkit-pvm/tsrkit_pvm/recompiler/
- Assembler: tsrkit-asm package
- Gray Paper: Appendix A - Virtual Machine