PVM Instructions
The PVM uses a RISC-V-inspired instruction set with a register-based architecture. Instructions are variable-length (1-16 bytes) with compact encoding for common operations.
Instruction Format
Program Structure
PVM programs consist of three components:
- Instruction Data (c): Sequence of bytes encoding instructions and their arguments
- Opcode Bitmask (k): Bitmap marking which bytes are instruction opcodes
- Jump Table (j): Valid targets for dynamic jumps
Encoding
Instructions use little-endian encoding for multi-byte values. Immediate values can be compactly encoded by eliding high-order bytes - elided bytes default to 0x00 for positive values or 0xFF for negative values (sign extension).
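The elision rule can be illustrated with a small decoder. This is a minimal sketch, not the tsrkit-pvm implementation: the function name `decode_immediate`, the default 4-byte operand width, and the choice to decode a fully elided immediate as zero are assumptions made for illustration.

```python
def decode_immediate(raw: bytes, width: int = 4) -> int:
    """Decode a little-endian immediate whose high-order bytes were elided.

    `raw` holds only the bytes present in the instruction stream.
    `width` is the full operand size in bytes (4 is an assumption).
    """
    if len(raw) > width:
        raise ValueError("immediate longer than operand width")
    if not raw:
        return 0  # assumption: a fully elided immediate decodes to zero
    # The top bit of the last byte present selects the fill byte.
    fill = 0xFF if raw[-1] & 0x80 else 0x00
    padded = raw + bytes([fill]) * (width - len(raw))
    return int.from_bytes(padded, "little", signed=True)
```

For example, the single byte 0x64 decodes to 100, while the single byte 0xFF is padded with 0xFF bytes and decodes to -1.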
Skip Function: The number of bytes to the next instruction opcode is determined by the bitmask. Each instruction's length is implicit (max 16 bytes).
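As a rough sketch of how a skip function might work, the code below scans the bitmask for the next opcode marker. It assumes the bitmask has already been unpacked into one 0/1 entry per code byte and that the end of the program acts as an implicit opcode boundary. The names `skip_to_next_opcode` and `instruction_offsets` are hypothetical.

```python
from typing import Iterator, Sequence

MAX_INSTRUCTION_LENGTH = 16  # cap taken from the text above


def skip_to_next_opcode(bitmask: Sequence[int], pc: int) -> int:
    """Return the distance in bytes from the opcode at `pc` to the next opcode."""
    distance = 1
    while (
        distance < MAX_INSTRUCTION_LENGTH
        and pc + distance < len(bitmask)
        and not bitmask[pc + distance]
    ):
        distance += 1
    return distance


def instruction_offsets(bitmask: Sequence[int]) -> Iterator[int]:
    """Yield the offset of every instruction by repeatedly applying the skip."""
    pc = 0
    while pc < len(bitmask):
        yield pc
        pc += skip_to_next_opcode(bitmask, pc)
```

With the bitmask [1, 0, 0, 1, 1, 0], instructions start at offsets 0, 3, and 4.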
Instruction Categories
Control Flow
Unconditional:
- trap (0) - Halt with panic
- fallthrough (1) - No-op
- jump - Static jump to address
- jump_ind - Indirect jump via register
Conditional Branches:
- branch_eq, branch_ne - Equal/not-equal
- branch_lt_u/s, branch_ge_u/s - Less-than/greater-equal (unsigned/signed)
- branch_*_imm - Branches with immediate comparison values
Dynamic Jumps:
- Must target entries in jump table j
- Address = (index + 1) × 2 (aligned to 2 bytes)
- Target must be start of basic block
- Special address 0xFFFF0000 triggers HALT
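The dynamic-jump rules can be sketched as a single resolution step. This is an illustrative sketch under stated assumptions: the function name, the string status values, and passing the set of basic-block starts as an argument are all hypothetical, and the order of the checks is a guess rather than confirmed behavior.

```python
HALT_ADDRESS = 0xFFFF0000
JUMP_ALIGNMENT = 2


def resolve_dynamic_jump(address, jump_table, block_starts):
    """Map a register-held address to a code offset, or to HALT/PANIC."""
    if address == HALT_ADDRESS:
        return ("halt", None)
    # Address = (index + 1) * 2, so zero and odd addresses are never valid.
    if address == 0 or address % JUMP_ALIGNMENT != 0:
        return ("panic", None)
    index = address // JUMP_ALIGNMENT - 1
    if index >= len(jump_table):
        return ("panic", None)
    target = jump_table[index]
    if target not in block_starts:
        return ("panic", None)
    return ("continue", target)
```

With a jump table of [0, 7], address 2 selects entry 0 and address 4 selects entry 1. Address 6 is out of range and panics.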
Register Operations
Load Immediate:
- load_imm - Load 32-bit immediate value
- load_imm_64 - Load 64-bit immediate value
Register-Register:
- add, sub - Addition, subtraction
- mul, mul_upper_ss/su/uu - Multiplication (32-bit → 64-bit)
- div_u/s, rem_u/s - Division and remainder (unsigned/signed)
- and, or, xor - Bitwise operations
- shl, shr_u/s - Shifts (logical/arithmetic)
- set_lt_u/s - Set if less-than (unsigned/signed)
- move_reg - Copy register value
Memory Operations
Load (2 gas):
- load_u8/u16/u32 - Load 1/2/4 bytes (zero-extended)
- Memory must be readable or PAGE_FAULT occurs
Store (2 gas):
- store_u8/u16/u32 - Store 1/2/4 bytes
- Memory must be writable or PAGE_FAULT occurs
Address Calculation:
- Addresses computed as: base_register + offset
- Offsets can be immediate or register values
- First 64KB of address space (< 0x10000) always triggers PANIC
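A minimal sketch of the address calculation and the protected-region check follows. Wrapping the sum to 32 bits is an assumption, and the helper names and string results are hypothetical.

```python
PROTECTED_LIMIT = 0x10000    # first 64KB, per the list above
ADDRESS_MASK = 0xFFFFFFFF    # assumption: addresses wrap at 32 bits


def effective_address(base: int, offset: int) -> int:
    """Compute base_register + offset, wrapped to the assumed address width."""
    return (base + offset) & ADDRESS_MASK


def check_access(address: int) -> str:
    """Return "panic" for the protected region, "ok" otherwise."""
    return "panic" if address < PROTECTED_LIMIT else "ok"
```

An access at 0xFFFF panics, while 0x10000 is the first address that passes this check. Page permissions would still need to be checked separately.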
Host Calls
ecalli (1 gas):
ecalli <host_call_number>
Suspends PVM execution and invokes host call. Host call number provided as immediate argument. See Host Calls for details.
Instruction Encoding Patterns
No Arguments
[opcode]
Examples: trap, fallthrough
One Immediate
[opcode][imm bytes...]
Example: ecalli 5 → [0x0A][0x05]
Register + Immediate
[opcode][reg_index][imm bytes...]
Example: load_imm r3, 100 → [0x33][0x03][0x64] (opcode 51; opcode 20, 0x14, is load_imm_64)
Two Registers
[opcode][dst_reg | src_reg]
Registers packed into nibbles (4 bits each)
Register + Register + Immediate
[opcode][dst | src][imm bytes...]
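The register + register + immediate pattern can be decoded as sketched below. Which nibble holds which register is not stated above, so placing the destination in the low nibble and the source in the high nibble is an assumption. The function name is hypothetical.

```python
def decode_two_regs_imm(instruction: bytes):
    """Split [opcode][dst | src][imm bytes...] into its four fields."""
    opcode = instruction[0]
    packed = instruction[1]
    dst = packed & 0x0F         # assumption: low nibble is the destination
    src = (packed >> 4) & 0x0F  # assumption: high nibble is the source
    # Elided high-order bytes are recovered by sign extension.
    # An empty byte string decodes to 0.
    imm = int.from_bytes(instruction[2:], "little", signed=True)
    return opcode, dst, src, imm
```

Under these assumptions the register byte 0x21 yields dst = 1 and src = 2, and a trailing 0x64 yields an immediate of 100.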
Basic Blocks
Instructions that can alter control flow (jumps, branches, calls) are basic-block terminators. The PVM validates that:
- Static jumps target basic block starts
- Dynamic jumps index valid jump table entries
- Jump table targets point to basic block starts
Invalid jumps trigger PANIC.
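One way to perform this validation is to collect basic-block starts in a single pass and then check the jump table against them. The sketch below assumes a bitmask with one 0/1 entry per code byte and takes the set of terminator opcodes as a parameter, since the full opcode table is not reproduced here. All names are hypothetical.

```python
def basic_block_starts(code, bitmask, terminators):
    """Return the offsets at which basic blocks begin.

    A block starts at offset 0 and at the instruction that follows
    each terminator.
    """
    starts = {0}
    pc = 0
    while pc < len(code):
        # Find the next opcode byte after the current instruction.
        nxt = pc + 1
        while nxt < len(code) and not bitmask[nxt]:
            nxt += 1
        if code[pc] in terminators and nxt < len(code):
            starts.add(nxt)
        pc = nxt
    return starts


def jump_table_is_valid(jump_table, starts):
    """Every jump table entry must point at a basic-block start."""
    return all(target in starts for target in jump_table)
```

The terminator set passed in should come from the instruction tables. Any opcode numbers used when trying this out are placeholders.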
Gas Costs
| Category | Instructions | Cost |
|---|---|---|
| Basic | trap, fallthrough, moves, logic | 1 |
| Arithmetic | add, sub, and, or, xor, shifts | 1 |
| Multiplication | mul, mul_upper_* | 2 |
| Division | div_u/s, rem_u/s | 4 |
| Memory | load_*, store_* | 2 |
| Host calls | ecalli | Variable |
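The table can be turned into a lookup used to charge gas before each instruction. This is a sketch only: the mnemonic lists are filled in from the instruction categories on this page, ecalli is left out because its cost is variable, and refusing to execute when gas is insufficient is an assumption about out-of-gas handling.

```python
# Fixed costs from the table above, grouped by cost.
MNEMONICS_BY_COST = {
    1: ["trap", "fallthrough", "move_reg", "add", "sub",
        "and", "or", "xor", "shl", "shr_u", "shr_s"],
    2: ["mul", "mul_upper_ss", "mul_upper_su", "mul_upper_uu",
        "load_u8", "load_u16", "load_u32",
        "store_u8", "store_u16", "store_u32"],
    4: ["div_u", "div_s", "rem_u", "rem_s"],
}

GAS_COST = {
    name: cost
    for cost, names in MNEMONICS_BY_COST.items()
    for name in names
}


def charge_gas(gas: int, mnemonic: str):
    """Return (remaining_gas, ok). Gas is left untouched when it is insufficient."""
    cost = GAS_COST[mnemonic]
    if cost > gas:
        return gas, False  # assumption: caller reports out-of-gas
    return gas - cost, True
```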
Signed vs Unsigned
Many instructions have unsigned (_u) and signed (_s) variants:
- Unsigned: Treats values as 0 to 2³²-1
- Signed: Treats values as -2³¹ to 2³¹-1 (two's complement)
Examples:
- div_u vs div_s - Unsigned vs signed division
- shr_u vs shr_s - Logical vs arithmetic right shift
- set_lt_u vs set_lt_s - Unsigned vs signed comparison
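The difference is easiest to see on a value with the top bit set. The helpers below are a sketch over 32-bit values as described in this section. Reducing the shift amount modulo 32 is an assumption, and the function bodies are illustrative rather than taken from tsrkit-pvm.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(value: int) -> int:
    """Reinterpret a 32-bit pattern as a two's complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def shr_u(value: int, amount: int) -> int:
    """Logical right shift: vacated high bits are filled with zeros."""
    return (value & MASK32) >> (amount % 32)


def shr_s(value: int, amount: int) -> int:
    """Arithmetic right shift: vacated high bits copy the sign bit."""
    return (to_signed32(value) >> (amount % 32)) & MASK32


def set_lt_u(a: int, b: int) -> int:
    return int((a & MASK32) < (b & MASK32))


def set_lt_s(a: int, b: int) -> int:
    return int(to_signed32(a) < to_signed32(b))
```

The pattern 0xFFFFFFFF is the largest unsigned value but equals -1 when signed, so set_lt_u(0xFFFFFFFF, 1) gives 0 while set_lt_s(0xFFFFFFFF, 1) gives 1.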
Memory Semantics
Page Faults
When accessing unmapped memory:
- Instruction execution pauses
- PAGE_FAULT status returned with page address
- Host allocates page (4KB)
- Execution resumes at same instruction
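The pause, allocate, and resume cycle can be modelled with a toy memory that raises on unmapped pages. Everything here is hypothetical (`SparseMemory`, `PageFault`, `read_with_retry`) and stands in for the real interpreter and host. It only illustrates that the same access is retried after the host maps the page.

```python
PAGE_SIZE = 4096  # 4KB pages, per the list above


class PageFault(Exception):
    """Carries the base address of the page that was not mapped."""

    def __init__(self, page_address: int):
        super().__init__(hex(page_address))
        self.page_address = page_address


class SparseMemory:
    def __init__(self):
        self.pages = {}  # page base address -> bytearray

    def map_page(self, page_address: int) -> None:
        self.pages[page_address] = bytearray(PAGE_SIZE)

    def read_u8(self, address: int) -> int:
        page_address = address - address % PAGE_SIZE
        if page_address not in self.pages:
            raise PageFault(page_address)
        return self.pages[page_address][address - page_address]


def read_with_retry(memory: SparseMemory, address: int):
    """Return (value, faults_handled), retrying the same access after each fault."""
    faults = 0
    while True:
        try:
            return memory.read_u8(address), faults
        except PageFault as fault:
            memory.map_page(fault.page_address)  # the host's role
            faults += 1
```

The first read of an unmapped address reports one handled fault. Reading the same page again reports none.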
Protected Region
Addresses below 0x10000 (64KB) are permanently inaccessible - any access triggers immediate PANIC.
Complete Instruction Reference
For the full instruction table with exact opcodes, register assignments, and behavioral specifications, see:
Gray Paper: Appendix A, Section "Virtual Machine", subsection "Instruction Tables"
Key tables:
- Instructions without arguments
- Instructions with one immediate
- Instructions with one register and immediate
- Instructions with two registers
- Instructions with two registers and immediate
- Instructions with register and offset
- Load/store instructions
- Branch instructions
Implementation
Tessera implements the PVM instruction set in tsrkit-pvm:
Interpreter (tsrkit_pvm/interpreter/):
- instructions/tables/ - Instruction implementations organized by encoding pattern
- pvm.py - Main execution loop
- program.py - Program loading and validation
- memory.py - Memory management
Instruction Dispatch:
# Opcode → handler mapping
INSTRUCTION_MAP = {
    0: trap_handler,
    1: fallthrough_handler,
    10: ecalli_handler,
    20: load_imm_64_handler,
    # ... etc
}
Execution Example
# Simple PVM execution
program = Program.from_blob(program_blob)
pc = 0
gas = 1_000_000
registers = [0] * 13
memory = Memory()
while True:
    # Fetch instruction
    opcode = program.code[pc]
    # Execute
    status, pc, gas, registers, memory = execute_instruction(
        opcode, pc, gas, registers, memory
    )
    # Check exit condition
    if status != CONTINUE:
        break
Optimization
The recompiler in tsrkit-pvm translates PVM bytecode to native machine code:
Basic Block Analysis:
- Identifies basic blocks (sequences without jumps)
- Pre-computes jump targets
- Validates jump table
Native Compilation:
- Translates blocks to native assembly
- Inlines register operations
- Eliminates interpretation overhead
See Recompilation for details.
References
- Gray Paper: Appendix A - Virtual Machine (Instruction Tables)
- Implementation: tsrkit-pvm/interpreter/instructions/
- RISC-V Inspiration: Similar to RV32IM subset