
PVM Recompilation

The recompiler translates PVM bytecode into native x86-64 machine code for direct CPU execution, providing 10-100x performance improvements over interpretation.

Overview

The recompiler performs one-time compilation at program load, generating native code that executes directly on the CPU. PVM registers map to x86-64 CPU registers, and PVM memory is accessed through a base pointer.

Compilation Process

1. Program Analysis

  • Skip Values: Pre-computed for fast instruction traversal
  • Basic Blocks: Identified by scanning for terminator instructions (jumps, branches, trap, ecalli)

# Basic blocks start at terminator + 1 + skip
basic_blocks = [0]
for n in range(len(instruction_set)):
    if offset_bitmask[n] and inst_map.is_terminating(instruction_set[n]):
        basic_blocks.append(n + 1 + skip(n))
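As a rough illustration of the two analysis steps, the sketch below computes skip values from an opcode bitmask and derives basic-block starts for a toy program. The opcode numbers, the TERMINATORS set, and the helper names are assumptions made for this example and are not taken from the tsrkit-pvm source. The skip rule used here is the distance to the next opcode byte, capped at 24.

```python
# Toy model: opcode values and TERMINATORS are illustrative assumptions.
TRAP, JUMP, ADD = 0, 40, 200
TERMINATORS = {TRAP, JUMP}

def skip(bitmask, n):
    """Count operand bytes after the opcode at index n (capped at 24)."""
    j = 0
    while j < 24 and n + 1 + j < len(bitmask) and not bitmask[n + 1 + j]:
        j += 1
    return j

def find_basic_blocks(code, bitmask):
    """Return the offsets at which basic blocks start."""
    blocks = [0]
    for n, byte in enumerate(code):
        if bitmask[n] and byte in TERMINATORS:
            start = n + 1 + skip(bitmask, n)
            if start < len(code):  # ignore a terminator at the very end
                blocks.append(start)
    return blocks

# ADD with two operand bytes, JUMP with one operand byte, ADD, TRAP
code    = [ADD, 7, 8, JUMP, 5, ADD, TRAP]
bitmask = [1,   0, 0, 1,    0, 1,   1]

skips = [skip(bitmask, n) for n in range(len(code)) if bitmask[n]]
blocks = find_basic_blocks(code, bitmask)
print(skips, blocks)
```

Unlike the snippet above, this sketch drops a block start that would fall past the end of the program; whether the real implementation keeps such an entry is not stated here.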

2. Assembly Generation

  • Labels: Created for each basic block (jump targets)
  • Instructions: Each PVM instruction is translated via a dispatch table
  • Gas Metering: Inline gas checks per instruction

def assemble(program):
    asm = PyAssembler()

    # Create a label for every instruction offset (basic-block starts are a subset)
    labels = {i: asm.forward_declare_label()
              for i in range(len(instruction_set))
              if offset_bitmask[i]}

    # Translate each instruction
    for counter in range(len(instruction_set)):
        if offset_bitmask[counter]:
            asm.define_label(labels[counter])
            opcode = instruction_set[counter]

            # Emit gas check
            emit_gas_check(asm, inst_map.gas_cost(opcode))

            # Emit instruction
            inst_map.process_instruction(opcode, program, counter, asm)

    return asm.finalize()
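The forward-declared labels are what allow a jump to be emitted before its target has been assembled. The following is a minimal sketch of that mechanism, assuming a hypothetical TinyAssembler class; it is not the tsrkit-asm API, and only the method names forward_declare_label, define_label, and finalize mirror the snippet above. It emits x86-64 `jmp rel32` (opcode 0xE9) and patches the displacement once all labels are known.

```python
import struct

class TinyAssembler:
    """Hypothetical stand-in used only to illustrate label fixups."""

    def __init__(self):
        self.code = bytearray()
        self.labels = {}   # label id -> byte offset, filled by define_label
        self.fixups = []   # (offset of rel32 field, label id)
        self._next_id = 0

    def forward_declare_label(self):
        self._next_id += 1
        return self._next_id

    def define_label(self, label):
        self.labels[label] = len(self.code)

    def jmp(self, label):
        self.code.append(0xE9)                  # jmp rel32
        self.fixups.append((len(self.code), label))
        self.code += b"\x00\x00\x00\x00"        # placeholder displacement

    def nop(self):
        self.code.append(0x90)

    def finalize(self):
        for pos, label in self.fixups:
            # rel32 is relative to the end of the jump instruction
            rel = self.labels[label] - (pos + 4)
            struct.pack_into("<i", self.code, pos, rel)
        return bytes(self.code)

asm = TinyAssembler()
target = asm.forward_declare_label()
asm.jmp(target)        # target not defined yet
asm.nop()
asm.nop()
asm.define_label(target)
asm.nop()
machine_code = asm.finalize()
print(machine_code.hex())
```

Deferring the patch to finalize keeps translation a single pass over the bytecode, at the cost of storing one fixup record per jump.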

3. Gas Metering

Gas is stored in the VM context at offset gas_offset from R15 (the base pointer). Each instruction decrements the gas counter and checks whether the result went negative:

; Gas check (per instruction)
sub r15, gas_offset_value        ; Point R15 at the gas field
sub qword [r15], <gas_cost>      ; Decrement the 64-bit gas counter
js out_of_gas_handler            ; Jump if the result is negative
add r15, gas_offset_value        ; Restore the base pointer

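Offset calculation: gas_offset = -128 in the VM context layout. Note that the expression -8 - (8 * 13) - 8 - 8 - 8 evaluates to -136, not -128, so check the exact constant against the implementation.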
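The decrement-and-test pattern above can be modeled in a few lines of Python. This is a sketch of the arithmetic only, assuming the gas counter is a signed 64-bit value; the function name charge_gas is made up for the example.

```python
MASK64 = (1 << 64) - 1

def charge_gas(gas: int, cost: int):
    """Model a 64-bit `sub` followed by `js`.

    Returns (remaining_gas, out_of_gas), where out_of_gas mirrors the
    sign flag that `js` tests.
    """
    raw = (gas - cost) & MASK64          # wrap to 64 bits like the CPU
    sign_flag = bool(raw >> 63)          # top bit set means negative
    remaining = raw - (1 << 64) if sign_flag else raw
    return remaining, sign_flag

print(charge_gas(10, 3))
print(charge_gas(3, 3))
print(charge_gas(2, 3))
```

Reaching exactly zero does not trigger the handler; only a charge that drives the counter below zero does.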

4. Executable Memory Allocation

Native code requires executable memory with proper permissions:

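import ctypes
import mmap

libc = ctypes.CDLL(None, use_errno=True)
# Declare argument types so 64-bit addresses are not truncated to C int
libc.mprotect.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int]

def allocate_executable_memory(code: bytes):
    size = len(code)
    page_size = mmap.PAGESIZE
    # Round the size up to a whole number of pages
    alloc_size = (size + page_size - 1) & ~(page_size - 1)

    # Allocate RW memory
    buf = mmap.mmap(-1, alloc_size, access=mmap.ACCESS_WRITE)
    buf.write(code)

    # Change to RX (read + execute)
    addr = ctypes.addressof(ctypes.c_char.from_buffer(buf))
    aligned_addr = addr & ~(page_size - 1)
    prot = mmap.PROT_READ | mmap.PROT_EXEC
    if libc.mprotect(aligned_addr, alloc_size, prot) != 0:
        raise OSError(ctypes.get_errno(), "mprotect failed")

    return buf, addr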

Register Mapping

PVM's 13 registers map directly to x86-64 CPU registers:

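| PVM | x86-64 | Purpose |
|-----|--------|---------|
| r0 | RDI | Return address |
| r1 | RAX | Stack pointer |
| r2 | RSI | Temporary |
| r3 | RBX | Temporary |
| r4 | RDX | Temporary |
| r5 | RBP | Saved |
| r6 | R8 | Saved |
| r7 | R9 | Argument/Return |
| r8 | R10 | Argument |
| r9 | R11 | Argument |
| r10 | R12 | Saved |
| r11 | R13 | Saved |
| r12 | R14 | Saved |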

Special Registers:

  • R15: Base pointer to PVM memory (guest memory)
  • RCX: Temporary register for internal use

All PVM registers persist in CPU registers during execution for zero-overhead access.
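The mapping can be written down as a lookup table and sanity-checked. The sketch below is an illustration only: the dictionary name, the helper functions, and the choice to treat RSP as reserved alongside R15 and RCX are assumptions for this example.

```python
# PVM register index -> x86-64 register, copied from the table above.
PVM_TO_X86 = {
    0: "rdi", 1: "rax", 2: "rsi", 3: "rbx", 4: "rdx", 5: "rbp",
    6: "r8", 7: "r9", 8: "r10", 9: "r11", 10: "r12", 11: "r13", 12: "r14",
}

# Registers the generated code must not hand out to the guest
# (memory base, internal scratch, host stack pointer).
RESERVED = {"r15", "rcx", "rsp"}

def host_register(pvm_reg: int) -> str:
    """Return the x86-64 register that holds the given PVM register."""
    return PVM_TO_X86[pvm_reg]

def mapping_is_valid() -> bool:
    """Check that the mapping is one-to-one and avoids reserved registers."""
    hosts = list(PVM_TO_X86.values())
    return (
        len(hosts) == 13
        and len(set(hosts)) == 13
        and not (set(hosts) & RESERVED)
    )

print(host_register(7), mapping_is_valid())
```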

VM Context

The VM context stores registers, gas, the jump table, and heap state in memory immediately before the guest memory buffer, at negative offsets from R15:

Memory Layout:
┌──────────────────────┐
│ Jump Table           │ Variable size (8 bytes × jump_table_len)
├──────────────────────┤
│ Jump Table Length    │ 8 bytes (offset: jump_len_offset)
├──────────────────────┤
│ Registers[13]        │ 104 bytes (offset: regs_offset = -128 - 8)
├──────────────────────┤
│ Gas                  │ 8 bytes (offset: gas_offset = -128)
├──────────────────────┤
│ Return Address       │ 8 bytes (offset: ret_add_offset = -120)
├──────────────────────┤
│ Return Stack         │ 8 bytes (offset: ret_stack_offset = -112)
├──────────────────────┤
│ Heap Start           │ 4 bytes (offset: heap_start_offset = -4)
├══════════════════════┤ ← R15 points here (guest memory start)
│ Guest Memory         │ 2 GB (PVM memory)
└──────────────────────┘

Access Pattern:

; Access gas at offset -128 from R15
mov rax, [r15 - 128]

; Access register r8 (9th register, index 8) at offset -136 + (8 × 8) = -72
mov rax, [r15 - 72]
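Addressing context fields at negative offsets from the base pointer can be mimicked with a single byte buffer. The sketch below uses only the gas and heap-start offsets quoted above; the buffer sizes and the variable names are assumptions for the example, and the integer `base` plays the role of R15.

```python
import struct

GAS_OFFSET = -128        # 8-byte signed gas counter
HEAP_START_OFFSET = -4   # 4-byte heap start

CONTEXT_SIZE = 256       # hypothetical space reserved below the base
GUEST_SIZE = 64          # tiny stand-in for guest memory

buf = bytearray(CONTEXT_SIZE + GUEST_SIZE)
base = CONTEXT_SIZE      # index of guest address 0, i.e. what R15 holds

# Write context fields below the base, guest data at and above it.
struct.pack_into("<q", buf, base + GAS_OFFSET, 1_000_000)
struct.pack_into("<I", buf, base + HEAP_START_OFFSET, 0x20000)
buf[base + 0] = 0xAB     # guest address 0

(gas,) = struct.unpack_from("<q", buf, base + GAS_OFFSET)
(heap_start,) = struct.unpack_from("<I", buf, base + HEAP_START_OFFSET)
print(gas, hex(heap_start), hex(buf[base]))
```

Keeping the context directly below guest memory means generated code needs only one base register for both guest loads and VM bookkeeping.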

Execution Flow

Caller Wrapper

A "caller" function wraps the generated code to save/restore host state:

def create_caller(code_pointer: int, mem_pointer: int):
    asm = PyAssembler()

    # Set up pointers
    asm.mov_imm64(RCX, code_pointer)  # Code entry point
    asm.mov_imm64(R15, mem_pointer)   # PVM memory base

    # Save host CPU state
    push_all_regs(asm)

    # Load PVM registers from VM context
    load_all_regs(asm)

    # Call generated code
    asm.call(RegMem.Reg(RCX))

    # Save PVM registers back to VM context
    save_all_regs(asm)

    # Restore host CPU state
    pop_all_regs(asm)

    asm.ret()
    return asm.finalize()

Signal Handling

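PVM page faults, host calls, and halts are reported to the host through POSIX signal handling in a C extension (segwrap.c). Page faults arrive as SIGSEGV; the UD2 instruction raises an invalid-opcode exception, which Linux delivers as SIGILL.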

Signal Handler Types:

  • Status 0: Host call (ecalli) - si_data contains host call number
  • Status 1: Page fault - si_data contains faulting address
  • Status 2: UD2 instruction (used for HALT/OUT_OF_GAS)

Execution Wrapper:

def run_code(addr, vm_ctx, vm_pointer, halt_addr):
    # Install signal handlers
    segwrap.initialize()

    # Execute native code
    ret_val = ctypes.c_uint64(0)
    result = segwrap.run_code(ctypes.c_uint64(addr), ctypes.byref(ret_val))

    if result == 0:
        # No signal - normal return (panic)
        return PANIC
    else:
        # Signal occurred - get program state
        pg_data = ProgramData()
        segwrap.get_program_status(ctypes.byref(pg_data))

        if pg_data.status == 0:
            return HOST(pg_data.si_data)  # Host call
        elif pg_data.status == 1:
            # Guest address = faulting host address - memory base
            return PAGE_FAULT(pg_data.si_data - pg_data.r15)  # Page fault
        elif pg_data.status == 2:
            return HALT if pg_data.si_data == halt_addr else OUT_OF_GAS
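The status decoding can be exercised on its own, without the C extension. The sketch below restates the branch logic as a pure function; the Exit dataclass and decode_exit name are hypothetical and stand in for the PANIC, HOST, PAGE_FAULT, HALT, and OUT_OF_GAS results used above.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Exit:
    """Hypothetical result type for this example."""
    kind: str
    value: Optional[int] = None

def decode_exit(status: int, si_data: int, r15: int, halt_addr: int) -> Exit:
    """Map the signal handler's (status, si_data) pair to an exit reason."""
    if status == 0:
        return Exit("host", si_data)              # host call number
    if status == 1:
        return Exit("page_fault", si_data - r15)  # guest-relative address
    if status == 2:
        kind = "halt" if si_data == halt_addr else "out_of_gas"
        return Exit(kind)
    raise ValueError(f"unknown status {status}")

base = 0x7F00_0000_0000
print(decode_exit(0, 5, base, 0x1000))
print(decode_exit(1, base + 0x2000, base, 0x1000))
print(decode_exit(2, 0x1000, base, 0x1000))
```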

HALT Implementation:

; HALT handler in generated code
halt_label:
    ud2                          ; Invalid opcode → signal handler reports status 2

Host Call (ecalli) Implementation:

; ecalli <imm>
mov [r15 + si_offset], <imm>     ; Store host call number
ud2                              ; Trap into the signal handler
; After host call returns, execution continues here

ProgramData Structure

Register state captured on signal:

struct ProgramData {
    uint64_t r8, r9, r10, r11, r12, r13, r14;  // PVM r6-r12
    uint64_t r15;      // Memory base pointer
    uint64_t rdi;      // PVM r0
    uint64_t rsi;      // PVM r2
    uint64_t rbp;      // PVM r5
    uint64_t rbx;      // PVM r3
    uint64_t rdx;      // PVM r4
    uint64_t rax;      // PVM r1
    uint64_t rcx;      // Temporary
    uint64_t rsp;      // Stack pointer
    uint64_t rip;      // Instruction pointer
    uint64_t eflags;   // CPU flags
    uint64_t si_data;  // Signal data (addr/host call number)
    int8_t status;     // Signal type (0=HOST, 1=PAGE_FAULT, 2=HALT/OOG)
};

PC Mapping

Native code offsets map back to PVM instruction offsets:

PVM → Native:

def pvm_to_msn_index(pvm_offset: int) -> int:
    # Count set bits in bitmask up to pvm_offset
    bms = offset_bitmask[:pvm_offset]
    return pvm_msn_map[bms.count(True)]

Native → PVM:

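def msn_to_pvm_index(msn_offset: int) -> int:
    # Linear scan of pvm_msn_map for the first entry past msn_offset
    for i, addr in enumerate(pvm_msn_map):
        if addr > msn_offset:
            # Return the PVM offset of the preceding entry
            return find_pvm_offset_for_index(i - 1)
    # msn_offset falls inside the last entry
    return find_pvm_offset_for_index(len(pvm_msn_map) - 1)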
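Because the native offsets in the map are increasing, the reverse lookup can also be done with a binary search. The following self-contained sketch shows both directions on made-up data; the sample bitmask, the native offsets, and the function names pvm_to_native and native_to_pvm are assumptions for illustration.

```python
import bisect

# True marks a PVM offset where an instruction (opcode byte) starts.
offset_bitmask = [True, False, False, True, False, True]

# Hypothetical native code offsets of the instructions at PVM offsets 0, 3, 5.
pvm_msn_map = [0, 12, 30]

# PVM offsets of the instruction starts, in order: [0, 3, 5].
pvm_offsets = [i for i, is_opcode in enumerate(offset_bitmask) if is_opcode]

def pvm_to_native(pvm_offset: int) -> int:
    """Instruction index = number of opcode bytes before pvm_offset."""
    index = sum(offset_bitmask[:pvm_offset])
    return pvm_msn_map[index]

def native_to_pvm(native_offset: int) -> int:
    """Find the last instruction starting at or before native_offset."""
    index = bisect.bisect_right(pvm_msn_map, native_offset) - 1
    return pvm_offsets[index]

print(pvm_to_native(3), native_to_pvm(29), native_to_pvm(30))
```

The binary search assumes native_offset is non-negative and that pvm_msn_map is sorted, which holds when code is emitted in program order.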

Performance

Compilation Overhead: 1-5ms (one-time)
Execution Speedup: 10-100x vs interpreter

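| Workload | Interpreter | Recompiler | Speedup |
|----------|-------------|------------|---------|
| Arithmetic | 100ms | 2ms | 50x |
| Memory-heavy | 80ms | 8ms | 10x |
| Mixed | 120ms | 6ms | 20x |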

Memory Overhead: 2-5x bytecode size for native code

Platform Support

Supported:

  • ✅ x86-64

Unsupported:

  • ❌ ARM64 (would need separate register mapping)
  • ❌ Windows (signal handling differences)

Usage

import os

# Select the recompiler backend
os.environ['PVM_MODE'] = 'recompiler'

from tsrkit_pvm import Recompiler as PVM
from tsrkit_pvm import REC_Memory as Memory
from tsrkit_pvm import REC_Program as Program

# bytecode: the PVM program blob (bytes)
program = Program.from_blob(bytecode)
status, pc, gas, regs, mem = PVM.execute(
    program, pc=0, gas=1_000_000,
    registers=[0]*13, memory=Memory()
)

References

  • Implementation: tsrkit-pvm/tsrkit_pvm/recompiler/
  • Assembler: tsrkit-asm package
  • Gray Paper: Appendix A - Virtual Machine
