PVM Instructions

The PVM uses a RISC-V-inspired instruction set with a register-based architecture. Instructions are variable-length (1-16 bytes) with compact encoding for common operations.

Instruction Format

Program Structure

PVM programs consist of three components:

  1. Instruction Data (c): Sequence of bytes encoding instructions and their arguments
  2. Opcode Bitmask (k): Bitmap marking which bytes are instruction opcodes
  3. Jump Table (j): Valid targets for dynamic jumps

Encoding

Instructions use little-endian encoding for multi-byte values. Immediate values can be encoded compactly by eliding high-order bytes; elided bytes are restored by sign extension, as 0x00 for non-negative values or 0xFF for negative values.
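The elision rule can be illustrated with a small decoder. This is a minimal sketch, not the tsrkit-pvm implementation: the function names, the 4-byte default operand width, and the choice to decode an empty immediate as zero are all assumptions made for illustration.

```python
def decode_immediate(raw: bytes, width: int = 4) -> int:
    """Decode a little-endian immediate whose high-order bytes were elided.

    `width` is the full operand size in bytes (assumed to be 4 here).
    Returns the unsigned register image after sign extension.
    """
    if len(raw) > width:
        raise ValueError("immediate longer than operand width")
    if not raw:
        return 0  # assumption: a fully elided immediate decodes to zero
    # The top bit of the last encoded byte selects the fill byte.
    fill = 0xFF if raw[-1] & 0x80 else 0x00
    padded = raw + bytes([fill]) * (width - len(raw))
    return int.from_bytes(padded, "little")


def as_signed(value: int, width: int = 4) -> int:
    """Reinterpret an unsigned register image as a two's complement number."""
    bits = 8 * width
    return value - (1 << bits) if value & (1 << (bits - 1)) else value
```

For instance, the single byte 0x64 decodes to 100, while the single byte 0xFE is sign-extended and reads as -2 once passed through as_signed.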

Skip Function: The number of bytes to the next instruction opcode is determined by the bitmask. Each instruction's length is implicit (max 16 bytes).
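As a rough sketch of how a length can be read from the bitmask, the helper below walks forward from an opcode until it meets the next opcode bit. It assumes the bitmask has already been unpacked into one boolean per code byte; the function name and the handling of the final instruction are illustrative, not taken from tsrkit-pvm.

```python
def instruction_length(bitmask: list, pc: int, max_len: int = 16) -> int:
    """Return the byte length of the instruction whose opcode sits at `pc`.

    `bitmask[i]` is True when code byte i is an opcode (assumed layout).
    """
    if not bitmask[pc]:
        raise ValueError("pc does not point at an opcode")
    length = 1
    # Advance until the next opcode bit, the end of the code, or the length cap.
    while (
        pc + length < len(bitmask)
        and not bitmask[pc + length]
        and length < max_len
    ):
        length += 1
    return length
```

With a bitmask of [True, False, False, True], the instruction at offset 0 is three bytes long and the one at offset 3 is a single byte.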

Instruction Categories

Control Flow

Unconditional:

  • trap (0) - Halt with panic
  • fallthrough (1) - No-op
  • jump - Static jump to address
  • jump_ind - Indirect jump via register

Conditional Branches:

  • branch_eq, branch_ne - Equal/not-equal
  • branch_lt_u/s, branch_ge_u/s - Less-than/greater-equal (unsigned/signed)
  • branch_*_imm - Branches with immediate comparison values

Dynamic Jumps:

  • Must target entries in jump table j
  • Address = (index + 1) × 2 (aligned to 2 bytes)
  • Target must be start of basic block
  • Special address 0xFFFF0000 triggers HALT
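The rules above can be sketched as a single resolution function. This is an illustrative outline under stated assumptions: the jump table is a list of code offsets, the set of basic-block starts has been computed beforehand, and the status strings and function name are hypothetical.

```python
HALT_ADDRESS = 0xFFFF0000
JUMP_ALIGNMENT = 2


def resolve_dynamic_jump(address: int, jump_table: list, block_starts: set):
    """Map a dynamic jump address to a (status, new_pc) pair."""
    if address == HALT_ADDRESS:
        return ("halt", None)
    # Address 0 would give index -1; unaligned addresses are never valid.
    if address == 0 or address % JUMP_ALIGNMENT != 0:
        return ("panic", None)
    # Invert Address = (index + 1) * 2.
    index = address // JUMP_ALIGNMENT - 1
    if index >= len(jump_table):
        return ("panic", None)
    target = jump_table[index]
    if target not in block_starts:
        return ("panic", None)
    return ("continue", target)
```

With a two-entry jump table, address 2 selects entry 0 and address 4 selects entry 1; address 6 falls outside the table and panics.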

Register Operations

Load Immediate:

  • load_imm - Load 32-bit immediate value
  • load_imm_64 - Load 64-bit immediate value

Register-Register:

  • add, sub - Addition, subtraction
  • mul, mul_upper_ss/su/uu - Multiplication (32-bit → 64-bit)
  • div_u/s, rem_u/s - Division and remainder (unsigned/signed)
  • and, or, xor - Bitwise operations
  • shl, shr_u/s - Shifts (logical/arithmetic)
  • set_lt_u/s - Set if less-than (unsigned/signed)
  • move_reg - Copy register value

Memory Operations

Load (2 gas):

  • load_u8/u16/u32 - Load 1/2/4 bytes (zero-extended)
  • Memory must be readable or PAGE_FAULT occurs

Store (2 gas):

  • store_u8/u16/u32 - Store 1/2/4 bytes
  • Memory must be writable or PAGE_FAULT occurs

Address Calculation:

  • Addresses computed as: base_register + offset
  • Offsets can be immediate or register values
  • First 64KB of address space (< 0x10000) always triggers PANIC

Host Calls

ecalli (1 gas):

ecalli <host_call_number>

Suspends PVM execution and invokes host call. Host call number provided as immediate argument. See Host Calls for details.

Instruction Encoding Patterns

No Arguments

[opcode]

Examples: trap, fallthrough

One Immediate

[opcode][imm bytes...]

Example: ecalli 5 → [0x0A][0x05]

Register + Immediate

[opcode][reg_index][imm bytes...]

Example: load_imm r3, 100 → [0x33][0x03][0x64]
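A possible encoder for this pattern is sketched below. It assumes a 4-byte operand width and always keeps at least one immediate byte; the function names are hypothetical and the opcode is supplied by the caller rather than looked up.

```python
def encode_immediate(value: int, width: int = 4) -> bytes:
    """Encode a signed value in as few little-endian bytes as sign extension allows."""
    full = value.to_bytes(width, "little", signed=True)
    fill = 0xFF if value < 0 else 0x00
    n = width
    # Drop a high-order byte only if it equals the fill byte and the byte
    # below it still carries the correct sign bit.
    while n > 1 and full[n - 1] == fill and (full[n - 2] & 0x80) == (fill & 0x80):
        n -= 1
    return full[:n]


def encode_reg_imm(opcode: int, reg_index: int, value: int) -> bytes:
    """Assemble [opcode][reg_index][imm bytes...]."""
    return bytes([opcode, reg_index]) + encode_immediate(value)
```

The value 100 fits in one byte, whereas 128 needs two (0x80 0x00) because a lone 0x80 would be read back as a negative number.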

Two Registers

[opcode][dst_reg | src_reg]

Registers packed into nibbles (4 bits each)

Register + Register + Immediate

[opcode][dst | src][imm bytes...]
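The packed register byte used by the two register patterns can be handled with simple bit operations. The sketch below assumes the destination register occupies the low nibble and the source the high nibble; the text above does not state the nibble order, so treat that choice and the function names as assumptions.

```python
def pack_registers(dst: int, src: int) -> int:
    """Pack two 4-bit register indices into one byte (dst in the low nibble, assumed)."""
    if not (0 <= dst < 16 and 0 <= src < 16):
        raise ValueError("register index does not fit in a nibble")
    return (src << 4) | dst


def unpack_registers(byte: int) -> tuple:
    """Split a packed register byte into (dst, src)."""
    return byte & 0x0F, (byte >> 4) & 0x0F
```

Under this assumed layout, dst = 3 and src = 7 pack into the byte 0x73.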

Basic Blocks

Instructions that can alter control flow (jumps, branches, calls) are basic-block terminators. The PVM validates that:

  • Static jumps target basic block starts
  • Dynamic jumps index valid jump table entries
  • Jump table targets point to basic block starts

Invalid jumps trigger PANIC.
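One way to find basic-block starts is a single pass over the opcode positions. The sketch below assumes that offset 0 begins a block and that the instruction following any terminator begins a new one. The set of terminator opcodes is passed in by the caller because the exact opcodes are not listed here; all names are illustrative.

```python
def basic_block_starts(code: bytes, bitmask: list, terminators: set) -> set:
    """Return the code offsets at which basic blocks begin.

    `bitmask[i]` is True when code byte i is an opcode (assumed layout);
    `terminators` is a caller-supplied set of block-terminating opcodes.
    """
    starts = {0}  # assumption: the program entry is always a block start
    opcode_offsets = [i for i, is_opcode in enumerate(bitmask) if is_opcode]
    for position, offset in enumerate(opcode_offsets):
        has_successor = position + 1 < len(opcode_offsets)
        if code[offset] in terminators and has_successor:
            starts.add(opcode_offsets[position + 1])
    return starts
```

A static jump or jump table entry can then be validated with a membership test against the returned set.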

Gas Costs

Category        Instructions                       Cost
Basic           trap, fallthrough, moves, logic    1
Arithmetic      add, sub, and, or, xor, shifts     1
Multiplication  mul, mul_upper_*                   2
Division        div_u/s, rem_u/s                   4
Memory          load_*, store_*                    2
Host calls      ecalli                             Variable
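The fixed costs in the table lend themselves to a simple lookup. The sketch below totals the gas for a sequence of instruction categories; the category keys and the function name are illustrative, and host calls are left out because their cost is variable.

```python
# Fixed per-instruction costs copied from the table above.
GAS_COST = {
    "basic": 1,
    "arithmetic": 1,
    "multiplication": 2,
    "division": 4,
    "memory": 2,
}


def total_gas(categories: list) -> int:
    """Sum the fixed gas cost of a sequence of instruction categories."""
    return sum(GAS_COST[category] for category in categories)
```

A load followed by an add and a store would cost 2 + 1 + 2 = 5 gas.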

Signed vs Unsigned

Many instructions have unsigned (_u) and signed (_s) variants:

  • Unsigned: Treats values as 0 to 2³²-1
  • Signed: Treats values as -2³¹ to 2³¹-1 (two's complement)

Examples:

  • div_u vs div_s - Unsigned vs signed division
  • shr_u vs shr_s - Logical vs arithmetic right shift
  • set_lt_u vs set_lt_s - Unsigned vs signed comparison
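The difference between the variants shows up when the top bit is set. The sketch below models 32-bit registers as Python integers. Reducing the shift amount modulo 32 is an assumption made here for illustration, and the helper names simply mirror the instruction names.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x: int) -> int:
    """Reinterpret a 32-bit register image as two's complement."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def shr_u(x: int, n: int) -> int:
    """Logical right shift: vacated bits are filled with zeros."""
    return (x & MASK32) >> (n % 32)


def shr_s(x: int, n: int) -> int:
    """Arithmetic right shift: vacated bits copy the sign bit."""
    return (to_signed(x) >> (n % 32)) & MASK32


def set_lt_u(a: int, b: int) -> int:
    """1 if a < b when both are read as unsigned, else 0."""
    return int((a & MASK32) < (b & MASK32))


def set_lt_s(a: int, b: int) -> int:
    """1 if a < b when both are read as signed, else 0."""
    return int(to_signed(a) < to_signed(b))
```

The bit pattern 0xFFFFFFFF is the largest unsigned value but reads as -1 when signed, so set_lt_u(0xFFFFFFFF, 1) yields 0 while set_lt_s(0xFFFFFFFF, 1) yields 1.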

Memory Semantics

Page Faults

When accessing unmapped memory:

  1. Instruction execution pauses
  2. PAGE_FAULT status returned with page address
  3. Host allocates page (4KB)
  4. Execution resumes at same instruction

Protected Region

Addresses below 0x10000 (64KB) are permanently inaccessible - any access triggers immediate PANIC.
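The two failure modes can be told apart with a small check before each access. This is a simplified sketch: it tracks mapped pages as a set of page base addresses, wraps addresses to 32 bits (an assumption), and ignores read/write permissions and accesses that straddle a page boundary. The status strings and function name are illustrative.

```python
PROTECTED_LIMIT = 0x10000  # first 64KB is never accessible
PAGE_SIZE = 4096           # 4KB pages


def check_access(address: int, mapped_pages: set):
    """Classify a single-byte access as OK, PAGE_FAULT, or PANIC.

    Returns (status, page_address); page_address is set only for PAGE_FAULT.
    """
    address &= 0xFFFFFFFF  # assumption: addresses wrap to 32 bits
    if address < PROTECTED_LIMIT:
        return ("PANIC", None)
    page_address = address - (address % PAGE_SIZE)
    if page_address not in mapped_pages:
        return ("PAGE_FAULT", page_address)
    return ("OK", None)
```

After a PAGE_FAULT, a host following the steps above would add the reported page address to the mapped set and retry the same instruction.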

Complete Instruction Reference

For the full instruction table with exact opcodes, register assignments, and behavioral specifications, see:

Gray Paper: Appendix A, Section "Virtual Machine", subsection "Instruction Tables"

Key tables:

  • Instructions without arguments
  • Instructions with one immediate
  • Instructions with one register and immediate
  • Instructions with two registers
  • Instructions with two registers and immediate
  • Instructions with register and offset
  • Load/store instructions
  • Branch instructions

Implementation

Tessera implements the PVM instruction set in tsrkit-pvm:

Interpreter (tsrkit_pvm/interpreter/):

  • instructions/tables/ - Instruction implementations organized by encoding pattern
  • pvm.py - Main execution loop
  • program.py - Program loading and validation
  • memory.py - Memory management

Instruction Dispatch:

# Opcode → handler mapping
INSTRUCTION_MAP = {
    0: trap_handler,
    1: fallthrough_handler,
    10: ecalli_handler,
    20: load_imm_64_handler,
    # ... etc
}

Execution Example

# Simple PVM execution
program = Program.from_blob(program_blob)
pc = 0
gas = 1_000_000
registers = [0] * 13
memory = Memory()

while True:
    # Fetch instruction
    opcode = program.code[pc]

    # Execute
    status, pc, gas, registers, memory = execute_instruction(
        opcode, pc, gas, registers, memory
    )

    # Check exit condition
    if status != CONTINUE:
        break

Optimization

The recompiler in tsrkit-pvm translates PVM bytecode to native machine code:

Basic Block Analysis:

  • Identifies basic blocks (sequences without jumps)
  • Pre-computes jump targets
  • Validates jump table

Native Compilation:

  • Translates blocks to native assembly
  • Inlines register operations
  • Eliminates interpretation overhead

See Recompilation for details.

References

  • Gray Paper: Appendix A - Virtual Machine (Instruction Tables)
  • Implementation: tsrkit-pvm/interpreter/instructions/
  • RISC-V Inspiration: Similar to RV32IM subset
