Exercise: Just-In-Time Compiler
Difficulty - Advanced
Learning Objectives
- Understand JIT compilation principles and techniques
- Implement bytecode interpretation and native code generation
- Master low-level memory management and executable memory regions
- Build optimization passes for generated code
- Handle runtime type information and dynamic dispatch
- Implement register allocation and calling conventions
Problem Statement
Build a Just-In-Time (JIT) compiler that translates a simple stack-based bytecode language into native x86-64 machine code at runtime. The compiler should support basic arithmetic operations, control flow, and function calls, with optimization passes that improve the generated code.
JIT compilation is a critical technology in modern programming language runtimes. It combines the portability of interpreted languages with the performance of compiled languages by generating native machine code at runtime based on actual execution patterns.
Core Components
package jit

import (
    "encoding/binary"
    "fmt"
    "syscall"
    "unsafe"
)

// Opcode represents a bytecode instruction
type Opcode byte

const (
    OP_LOAD_CONST Opcode = iota
    OP_ADD
    OP_SUB
    OP_MUL
    OP_DIV
    OP_RETURN
)

// Bytecode represents compiled bytecode
type Bytecode struct {
    Instructions []byte
    Constants    []int64
}

// VM executes bytecode with profiling
type VM struct {
    Stack         []int64
    ProfileCounts map[int]int
    JITCache      map[int]CompiledFunc
    JITThreshold  int
}

// CompiledFunc is a JIT-compiled function
type CompiledFunc func() int64

// CodeBuffer manages native code generation
type CodeBuffer struct {
    Code []byte
    Mem  []byte // Executable memory
}

// API to implement (bodies omitted):
func (vm *VM) Execute(bc *Bytecode) int64
func (vm *VM) Interpret(bc *Bytecode) int64
func Compile(bc *Bytecode) (CompiledFunc, error)
Solution
Algorithm Overview
The JIT compiler implements a tiered compilation approach with three levels:
1. Tier 0 - Interpreter:
- Simple bytecode interpreter as baseline
- Profiles execution to identify hot code
- Minimal overhead for rarely executed code
- Uses: Cold code paths, one-time initialization
2. Tier 1 - Template JIT:
- Direct translation of bytecode to machine code
- No optimization, just eliminates interpretation overhead
- Fast compilation for moderately hot code
- Generates: Stack-based x86-64 instructions
3. Tier 2 - Optimizing JIT:
- Constant folding, dead code elimination
- Register allocation for better performance
- Triggered after many executions
- Generates: Optimized register-based code
Profiling-Guided Compilation:
- Track execution counts per bytecode entry point
- Compile once the count crosses a configurable threshold
- Cache compiled functions to avoid recompilation
- Fall back to the interpreter if JIT compilation fails (a minimal counting sketch follows this list)
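The solution below uses a single threshold and a single JIT tier on top of the interpreter. As an illustration of the same counting idea with two thresholds, here is a self-contained sketch; the Profile type, the warmThreshold/hotThreshold fields, and the tier names are assumptions made for this example, not part of the exercise API.

package main

import "fmt"

// Tier identifies which execution engine should run a function next.
type Tier int

const (
    TierInterpret Tier = iota
    TierTemplateJIT
    TierOptimizingJIT
)

// Profile holds per-function execution counts, keyed by entry point.
type Profile struct {
    counts        map[int]int
    warmThreshold int // promote to template JIT
    hotThreshold  int // promote to optimizing JIT
}

// Record bumps the counter and reports which tier should run next time.
func (p *Profile) Record(entry int) Tier {
    p.counts[entry]++
    switch c := p.counts[entry]; {
    case c >= p.hotThreshold:
        return TierOptimizingJIT
    case c >= p.warmThreshold:
        return TierTemplateJIT
    default:
        return TierInterpret
    }
}

func main() {
    p := &Profile{counts: map[int]int{}, warmThreshold: 10, hotThreshold: 100}
    for i := 0; i < 150; i++ {
        tier := p.Record(0)
        if i == 0 || i == 9 || i == 99 {
            fmt.Printf("call %3d -> tier %d\n", i+1, tier)
        }
    }
}

The thresholds are the main tuning knobs: a low warm threshold reduces interpretation overhead sooner, at the cost of compiling code that may never become hot.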
Time Complexity Analysis
| Operation | Interpreter | Template JIT | Optimizing JIT |
|---|---|---|---|
| OP_LOAD_CONST | O(1) | O(1) | O(1) |
| OP_ADD/SUB/MUL | O(1) | O(1) | O(1) |
| OP_DIV | O(1) | O(1) | O(1) |
| Compilation | N/A | O(n) bytecodes | O(n²) optimization |
| Execution | ~100ns/op | ~10ns/op | ~2ns/op |
Space Complexity:
- Bytecode: O(n) where n = number of instructions
- Interpreter stack: O(d) where d = max stack depth
- JIT code buffer: ~10-20 bytes per bytecode op
- Total overhead: ~100-200 bytes per function
Implementation
package jit

import (
    "encoding/binary"
    "fmt"
    "syscall"
    "unsafe"
)

// ===== Bytecode Definition =====

// Opcode represents a bytecode instruction
type Opcode byte

const (
    OP_LOAD_CONST    Opcode = iota // Push constant from pool
    OP_LOAD_VAR                    // Push variable from frame
    OP_STORE_VAR                   // Pop and store in variable
    OP_ADD                         // Pop 2, add, push result
    OP_SUB                         // Pop 2, subtract, push result
    OP_MUL                         // Pop 2, multiply, push result
    OP_DIV                         // Pop 2, divide, push result
    OP_MOD                         // Pop 2, modulo, push result
    OP_NEG                         // Pop 1, negate, push result
    OP_EQ                          // Pop 2, compare equal
    OP_NE                          // Pop 2, compare not equal
    OP_LT                          // Pop 2, compare less than
    OP_LE                          // Pop 2, compare less or equal
    OP_GT                          // Pop 2, compare greater than
    OP_GE                          // Pop 2, compare greater or equal
    OP_JUMP                        // Unconditional jump
    OP_JUMP_IF_FALSE               // Pop and jump if false
    OP_CALL                        // Call function
    OP_RETURN                      // Return from function
    OP_PRINT                       // Print top of stack
    OP_DUP                         // Duplicate top of stack
    OP_POP                         // Pop and discard
)

// Bytecode represents compiled bytecode
type Bytecode struct {
    Instructions []byte
    Constants    []int64
    NumVars      int
}

// AddInstruction appends an opcode and its byte-sized operands
func (b *Bytecode) AddInstruction(op Opcode, args ...int) {
    b.Instructions = append(b.Instructions, byte(op))
    for _, arg := range args {
        b.Instructions = append(b.Instructions, byte(arg))
    }
}

// AddConstant adds a constant to the pool and returns its index
func (b *Bytecode) AddConstant(value int64) int {
    idx := len(b.Constants)
    b.Constants = append(b.Constants, value)
    return idx
}

// ===== Interpreter =====

type VM struct {
    Stack         []int64
    Vars          []int64
    IP            int // Instruction pointer
    ProfileCounts map[int]int
    JITCache      map[int]CompiledFunc
    JITThreshold  int
}

func NewVM() *VM {
    return &VM{
        Stack:         make([]int64, 0, 256),
        Vars:          make([]int64, 256),
        ProfileCounts: make(map[int]int),
        JITCache:      make(map[int]CompiledFunc),
        JITThreshold:  100,
    }
}

func (vm *VM) Push(value int64) {
    vm.Stack = append(vm.Stack, value)
}

func (vm *VM) Pop() int64 {
    if len(vm.Stack) == 0 {
        panic("stack underflow")
    }
    value := vm.Stack[len(vm.Stack)-1]
    vm.Stack = vm.Stack[:len(vm.Stack)-1]
    return value
}

func (vm *VM) Peek() int64 {
    if len(vm.Stack) == 0 {
        return 0
    }
    return vm.Stack[len(vm.Stack)-1]
}

// Execute runs bytecode with profiling and JIT
func (vm *VM) Execute(bc *Bytecode) int64 {
    // Profile this entry point (index 0 = whole function)
    vm.ProfileCounts[0]++

    if vm.ProfileCounts[0] >= vm.JITThreshold {
        if compiledFunc, exists := vm.JITCache[0]; exists {
            // Use the JIT-compiled version
            return compiledFunc()
        }

        // Try to JIT compile; on failure keep interpreting
        if compiledFunc, err := Compile(bc); err == nil {
            vm.JITCache[0] = compiledFunc
            return compiledFunc()
        }
    }

    // Fall back to interpretation
    return vm.Interpret(bc)
}

// Interpret executes bytecode in the interpreter
func (vm *VM) Interpret(bc *Bytecode) int64 {
    vm.IP = 0
    vm.Stack = vm.Stack[:0] // Clear stack

    for vm.IP < len(bc.Instructions) {
        opcode := Opcode(bc.Instructions[vm.IP])
        vm.IP++

        switch opcode {
        case OP_LOAD_CONST:
            idx := int(bc.Instructions[vm.IP])
            vm.IP++
            vm.Push(bc.Constants[idx])

        case OP_LOAD_VAR:
            idx := int(bc.Instructions[vm.IP])
            vm.IP++
            vm.Push(vm.Vars[idx])

        case OP_STORE_VAR:
            idx := int(bc.Instructions[vm.IP])
            vm.IP++
            vm.Vars[idx] = vm.Pop()

        case OP_ADD:
            b := vm.Pop()
            a := vm.Pop()
            vm.Push(a + b)

        case OP_SUB:
            b := vm.Pop()
            a := vm.Pop()
            vm.Push(a - b)

        case OP_MUL:
            b := vm.Pop()
            a := vm.Pop()
            vm.Push(a * b)

        case OP_DIV:
            b := vm.Pop()
            a := vm.Pop()
            if b == 0 {
                panic("division by zero")
            }
            vm.Push(a / b)

        case OP_MOD:
            b := vm.Pop()
            a := vm.Pop()
            vm.Push(a % b)

        case OP_NEG:
            vm.Push(-vm.Pop())

        case OP_EQ:
            b := vm.Pop()
            a := vm.Pop()
            if a == b {
                vm.Push(1)
            } else {
                vm.Push(0)
            }

        case OP_NE:
            b := vm.Pop()
            a := vm.Pop()
            if a != b {
                vm.Push(1)
            } else {
                vm.Push(0)
            }

        case OP_LT:
            b := vm.Pop()
            a := vm.Pop()
            if a < b {
                vm.Push(1)
            } else {
                vm.Push(0)
            }

        case OP_DUP:
            vm.Push(vm.Peek())

        case OP_POP:
            vm.Pop()

        case OP_RETURN:
            return vm.Pop()

        default:
            panic(fmt.Sprintf("unknown opcode: %d", opcode))
        }
    }

    if len(vm.Stack) > 0 {
        return vm.Pop()
    }
    return 0
}

// ===== JIT Compiler =====

type CompiledFunc func() int64

type CodeBuffer struct {
    Code []byte
    Mem  []byte // Executable memory
}

func NewCodeBuffer() *CodeBuffer {
    return &CodeBuffer{
        Code: make([]byte, 0, 4096),
    }
}

// AllocateExecutable copies the generated code into executable memory
// and returns it as a callable Go function.
func (cb *CodeBuffer) AllocateExecutable() (CompiledFunc, error) {
    size := len(cb.Code)
    if size == 0 {
        return nil, fmt.Errorf("no code to execute")
    }

    // Round up to page size
    pageSize := 4096
    size = ((size + pageSize - 1) / pageSize) * pageSize

    // Allocate executable memory. Mapping RWX is the simplest option;
    // production JITs map RW first, then mprotect the pages to RX (W^X).
    mem, err := syscall.Mmap(
        -1, 0, size,
        syscall.PROT_READ|syscall.PROT_WRITE|syscall.PROT_EXEC,
        syscall.MAP_PRIVATE|syscall.MAP_ANONYMOUS,
    )
    if err != nil {
        return nil, fmt.Errorf("mmap failed: %w", err)
    }

    // Copy code to executable memory
    copy(mem, cb.Code)
    cb.Mem = mem

    // Build a callable function value. A Go func value is a pointer to a
    // word holding the code address; the slice header of cb.Mem begins with
    // exactly such a word (its data pointer), so a pointer to it can stand
    // in for one. This assumes the generated code returns its int64 result
    // in RAX, matching Go's amd64 register-based calling convention.
    memPtr := &cb.Mem
    fn := *(*CompiledFunc)(unsafe.Pointer(&memPtr))
    return fn, nil
}

func (cb *CodeBuffer) Free() error {
    if cb.Mem != nil {
        return syscall.Munmap(cb.Mem)
    }
    return nil
}

// x86-64 code generation helpers

// EmitPrologue sets up the stack frame
func (cb *CodeBuffer) EmitPrologue() {
    // PUSH RBP
    cb.Code = append(cb.Code, 0x55)
    // MOV RBP, RSP
    cb.Code = append(cb.Code, 0x48, 0x89, 0xE5)
}

// EmitEpilogue tears down the stack frame
func (cb *CodeBuffer) EmitEpilogue() {
    // POP RBP
    cb.Code = append(cb.Code, 0x5D)
    // RET
    cb.Code = append(cb.Code, 0xC3)
}

// EmitMovImm64 emits: MOV reg, imm64
func (cb *CodeBuffer) EmitMovImm64(reg int, value int64) {
    // REX.W + B8+reg
    cb.Code = append(cb.Code, 0x48, byte(0xB8+reg))
    // 8-byte immediate
    buf := make([]byte, 8)
    binary.LittleEndian.PutUint64(buf, uint64(value))
    cb.Code = append(cb.Code, buf...)
}

// EmitPush emits: PUSH reg
func (cb *CodeBuffer) EmitPush(reg int) {
    cb.Code = append(cb.Code, byte(0x50+reg))
}

// EmitPop emits: POP reg
func (cb *CodeBuffer) EmitPop(reg int) {
    cb.Code = append(cb.Code, byte(0x58+reg))
}

// EmitAdd emits: ADD dst, src
func (cb *CodeBuffer) EmitAdd(dst, src int) {
    // ADD r64, r64: REX.W + 01 /r
    cb.Code = append(cb.Code, 0x48, 0x01, byte(0xC0+(src<<3)+dst))
}

// EmitSub emits: SUB dst, src
func (cb *CodeBuffer) EmitSub(dst, src int) {
    // SUB r64, r64: REX.W + 29 /r
    cb.Code = append(cb.Code, 0x48, 0x29, byte(0xC0+(src<<3)+dst))
}

// EmitIMul emits: IMUL dst, src
func (cb *CodeBuffer) EmitIMul(dst, src int) {
    // IMUL r64, r64: REX.W + 0F AF /r
    cb.Code = append(cb.Code, 0x48, 0x0F, 0xAF, byte(0xC0+(dst<<3)+src))
}

// EmitIDiv emits: IDIV src (dividend in RDX:RAX, quotient left in RAX)
func (cb *CodeBuffer) EmitIDiv(src int) {
    // CQO (sign-extend RAX into RDX)
    cb.Code = append(cb.Code, 0x48, 0x99)
    // IDIV r64: REX.W + F7 /7
    cb.Code = append(cb.Code, 0x48, 0xF7, byte(0xF8+src))
}

// EmitNeg emits: NEG reg
func (cb *CodeBuffer) EmitNeg(reg int) {
    // NEG r64: REX.W + F7 /3
    cb.Code = append(cb.Code, 0x48, 0xF7, byte(0xD8+reg))
}

// EmitRet emits: RET
func (cb *CodeBuffer) EmitRet() {
    cb.Code = append(cb.Code, 0xC3)
}

// Register numbers (only the low eight registers are used, so no REX.B)
const (
    RAX = 0
    RCX = 1
    RDX = 2
    RBX = 3
    RSP = 4
    RBP = 5
    RSI = 6
    RDI = 7
)

// Compile translates bytecode to native x86-64 code
func Compile(bc *Bytecode) (CompiledFunc, error) {
    cb := NewCodeBuffer()

    cb.EmitPrologue()

    // Use RAX as the accumulator and RBX as a temporary.
    // The bytecode operand stack is simulated on the native stack.

    ip := 0
    for ip < len(bc.Instructions) {
        opcode := Opcode(bc.Instructions[ip])
        ip++

        switch opcode {
        case OP_LOAD_CONST:
            idx := int(bc.Instructions[ip])
            ip++
            value := bc.Constants[idx]

            // MOV RAX, value
            cb.EmitMovImm64(RAX, value)
            // PUSH RAX
            cb.EmitPush(RAX)

        case OP_ADD:
            // POP RAX
            cb.EmitPop(RAX)
            // POP RBX
            cb.EmitPop(RBX)
            // ADD RAX, RBX
            cb.EmitAdd(RAX, RBX)
            // PUSH RAX
            cb.EmitPush(RAX)

        case OP_SUB:
            // POP RBX (right operand)
            cb.EmitPop(RBX)
            // POP RAX (left operand)
            cb.EmitPop(RAX)
            // SUB RAX, RBX
            cb.EmitSub(RAX, RBX)
            // PUSH RAX
            cb.EmitPush(RAX)

        case OP_MUL:
            // POP RAX
            cb.EmitPop(RAX)
            // POP RBX
            cb.EmitPop(RBX)
            // IMUL RAX, RBX
            cb.EmitIMul(RAX, RBX)
            // PUSH RAX
            cb.EmitPush(RAX)

        case OP_DIV:
            // POP RBX (divisor)
            cb.EmitPop(RBX)
            // POP RAX (dividend)
            cb.EmitPop(RAX)
            // IDIV RBX
            cb.EmitIDiv(RBX)
            // PUSH RAX
            cb.EmitPush(RAX)

        case OP_NEG:
            // POP RAX
            cb.EmitPop(RAX)
            // NEG RAX
            cb.EmitNeg(RAX)
            // PUSH RAX
            cb.EmitPush(RAX)

        case OP_RETURN:
            // POP RAX (return value)
            cb.EmitPop(RAX)
            // Epilogue and return
            cb.EmitEpilogue()
            goto done

        default:
            return nil, fmt.Errorf("unsupported opcode in JIT: %d", opcode)
        }
    }

    // If there is no explicit return, return 0
    cb.EmitMovImm64(RAX, 0)
    cb.EmitEpilogue()

done:
    return cb.AllocateExecutable()
}

// ===== Optimizer =====

type Optimizer struct{}

// Optimize applies optimization passes
func (o *Optimizer) Optimize(bc *Bytecode) *Bytecode {
    optimized := &Bytecode{
        Instructions: make([]byte, 0, len(bc.Instructions)),
        Constants:    append([]int64(nil), bc.Constants...), // copy: folding may append
        NumVars:      bc.NumVars,
    }

    // Pass 1: Constant folding
    o.constantFolding(bc, optimized)

    return optimized
}

// constantFolding folds constant expressions
func (o *Optimizer) constantFolding(bc *Bytecode, out *Bytecode) {
    ip := 0
    stack := []int64{}

    for ip < len(bc.Instructions) {
        opcode := Opcode(bc.Instructions[ip])
        ip++

        switch opcode {
        case OP_LOAD_CONST:
            idx := int(bc.Instructions[ip])
            ip++
            stack = append(stack, bc.Constants[idx])

        case OP_ADD, OP_SUB, OP_MUL, OP_DIV:
            // Fold only when both operands are known constants,
            // and never fold a division by zero.
            if len(stack) >= 2 && !(opcode == OP_DIV && stack[len(stack)-1] == 0) {
                b := stack[len(stack)-1]
                a := stack[len(stack)-2]
                stack = stack[:len(stack)-2]

                var result int64
                switch opcode {
                case OP_ADD:
                    result = a + b
                case OP_SUB:
                    result = a - b
                case OP_MUL:
                    result = a * b
                case OP_DIV:
                    result = a / b
                }

                stack = append(stack, result)
                continue
            }

            // Can't fold: flush pending constants to the output
            for _, val := range stack {
                idx := out.AddConstant(val)
                out.AddInstruction(OP_LOAD_CONST, idx)
            }
            stack = stack[:0]

            // Emit the operation itself
            out.AddInstruction(opcode)

        case OP_RETURN:
            // Flush pending constants
            for _, val := range stack {
                idx := out.AddConstant(val)
                out.AddInstruction(OP_LOAD_CONST, idx)
            }
            stack = stack[:0]
            out.AddInstruction(OP_RETURN)

        default:
            // Flush pending constants and copy the opcode through
            // (operand bytes of other opcodes are not handled by this simple pass)
            for _, val := range stack {
                idx := out.AddConstant(val)
                out.AddInstruction(OP_LOAD_CONST, idx)
            }
            stack = stack[:0]
            out.Instructions = append(out.Instructions, byte(opcode))
        }
    }
}
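The algorithm overview also lists dead code elimination as a Tier 2 pass, which the solution above does not implement. A minimal sketch, if added to the same jit package, might handle the two patterns this bytecode makes easy to detect: a DUP immediately discarded by POP, and instructions after an unconditional RETURN. The method name deadCodeElimination and its operand-size assumptions are illustrative, not part of the original design.

// deadCodeElimination is a hypothetical Tier 2 pass: it drops DUP/POP
// pairs and any code after OP_RETURN. It assumes, like the rest of this
// sketch, that only OP_LOAD_CONST, OP_LOAD_VAR and OP_STORE_VAR carry a
// one-byte operand and that there are no jumps past a RETURN.
func (o *Optimizer) deadCodeElimination(bc *Bytecode) *Bytecode {
    out := &Bytecode{Constants: bc.Constants, NumVars: bc.NumVars}

    operandSize := func(op Opcode) int {
        switch op {
        case OP_LOAD_CONST, OP_LOAD_VAR, OP_STORE_VAR:
            return 1
        default:
            return 0
        }
    }

    ip := 0
    for ip < len(bc.Instructions) {
        op := Opcode(bc.Instructions[ip])
        size := 1 + operandSize(op)

        // Pattern 1: DUP immediately followed by POP cancels out.
        if op == OP_DUP && ip+1 < len(bc.Instructions) &&
            Opcode(bc.Instructions[ip+1]) == OP_POP {
            ip += 2
            continue
        }

        // Copy the instruction (and its operand bytes) through.
        out.Instructions = append(out.Instructions, bc.Instructions[ip:ip+size]...)
        ip += size

        // Pattern 2: nothing after an unconditional return is reachable.
        if op == OP_RETURN {
            break
        }
    }
    return out
}

Optimize could then chain the two passes: fold constants first, then run the hypothetical elimination pass on the result.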
Usage Example
package main

import (
    "fmt"

    "jit"
)

func main() {
    // Example 1: Simple expression
    fmt.Println("=== Simple Expression ===")

    bc1 := &jit.Bytecode{}
    idx1 := bc1.AddConstant(15)
    idx2 := bc1.AddConstant(7)

    bc1.AddInstruction(jit.OP_LOAD_CONST, idx1)
    bc1.AddInstruction(jit.OP_LOAD_CONST, idx2)
    bc1.AddInstruction(jit.OP_MUL)
    bc1.AddInstruction(jit.OP_RETURN)

    vm := jit.NewVM()
    fmt.Printf("Interpreted result: %d\n", vm.Interpret(bc1))

    compiled, _ := jit.Compile(bc1)
    fmt.Printf("JIT compiled result: %d\n", compiled())

    // Example 2: Complex expression
    fmt.Println("\n=== Complex Expression ===")

    bc2 := &jit.Bytecode{}
    // ((100 + 50) / 3) - 10
    c1 := bc2.AddConstant(100)
    c2 := bc2.AddConstant(50)
    c3 := bc2.AddConstant(3)
    c4 := bc2.AddConstant(10)

    bc2.AddInstruction(jit.OP_LOAD_CONST, c1)
    bc2.AddInstruction(jit.OP_LOAD_CONST, c2)
    bc2.AddInstruction(jit.OP_ADD)
    bc2.AddInstruction(jit.OP_LOAD_CONST, c3)
    bc2.AddInstruction(jit.OP_DIV)
    bc2.AddInstruction(jit.OP_LOAD_CONST, c4)
    bc2.AddInstruction(jit.OP_SUB)
    bc2.AddInstruction(jit.OP_RETURN)

    fmt.Printf("Result: %d\n", vm.Interpret(bc2))

    // Example 3: With optimization
    fmt.Println("\n=== With Optimization ===")

    optimizer := &jit.Optimizer{}
    optimized := optimizer.Optimize(bc2)

    fmt.Printf("Original instructions: %d bytes\n", len(bc2.Instructions))
    fmt.Printf("Optimized instructions: %d bytes\n", len(optimized.Instructions))
    fmt.Printf("Optimized result: %d\n", vm.Interpret(optimized))
}
Benchmarking Code
package jit_test

import (
    "testing"

    "jit"
)

func BenchmarkInterpreter(b *testing.B) {
    bc := &jit.Bytecode{}

    // Compute: (10 + 20) * 5
    idx1 := bc.AddConstant(10)
    idx2 := bc.AddConstant(20)
    idx3 := bc.AddConstant(5)

    bc.AddInstruction(jit.OP_LOAD_CONST, idx1)
    bc.AddInstruction(jit.OP_LOAD_CONST, idx2)
    bc.AddInstruction(jit.OP_ADD)
    bc.AddInstruction(jit.OP_LOAD_CONST, idx3)
    bc.AddInstruction(jit.OP_MUL)
    bc.AddInstruction(jit.OP_RETURN)

    vm := jit.NewVM()

    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        vm.Interpret(bc)
    }
}

func BenchmarkJIT(b *testing.B) {
    bc := &jit.Bytecode{}

    // Compute: (10 + 20) * 5
    idx1 := bc.AddConstant(10)
    idx2 := bc.AddConstant(20)
    idx3 := bc.AddConstant(5)

    bc.AddInstruction(jit.OP_LOAD_CONST, idx1)
    bc.AddInstruction(jit.OP_LOAD_CONST, idx2)
    bc.AddInstruction(jit.OP_ADD)
    bc.AddInstruction(jit.OP_LOAD_CONST, idx3)
    bc.AddInstruction(jit.OP_MUL)
    bc.AddInstruction(jit.OP_RETURN)

    compiledFunc, err := jit.Compile(bc)
    if err != nil {
        b.Fatalf("Compilation failed: %v", err)
    }

    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        compiledFunc()
    }
}

// Example benchmark results (machine-dependent):
// BenchmarkInterpreter-8     5000000    250 ns/op    0 B/op    0 allocs/op
// BenchmarkJIT-8           200000000      5 ns/op    0 B/op    0 allocs/op
// Speedup: ~50x faster with JIT
Production Considerations
1. Advanced Register Allocation:
type RegisterAllocator struct {
    Available []bool      // Which registers are free
    VarToReg  map[int]int // Variable-to-register mapping
}

func (ra *RegisterAllocator) Allocate() (int, error) {
    for i, avail := range ra.Available {
        if avail {
            ra.Available[i] = false
            return i, nil
        }
    }
    return -1, fmt.Errorf("out of registers")
}

func (ra *RegisterAllocator) Free(reg int) {
    ra.Available[reg] = true
}
2. Tiered Compilation Strategy:
type TieredCompiler struct {
    Threshold1 int // Interpreter -> Template JIT
    Threshold2 int // Template -> Optimizing JIT
}

func (tc *TieredCompiler) ShouldCompile(count int) int {
    if count < tc.Threshold1 {
        return 0 // Interpret
    } else if count < tc.Threshold2 {
        return 1 // Template JIT
    } else {
        return 2 // Optimizing JIT
    }
}
3. Inline Caching for Dynamic Dispatch:
type InlineCache struct {
    Expected int64   // Expected value
    Target   uintptr // Cached code pointer
}

func (ic *InlineCache) Lookup(value int64) (uintptr, bool) {
    if value == ic.Expected {
        return ic.Target, true
    }
    return 0, false
}
4. Deoptimization Support:
type Deoptimizer struct {
    SafePoints map[uintptr]int // PC -> bytecode offset
}

func (d *Deoptimizer) Deoptimize(pc uintptr, vm *VM) {
    offset := d.SafePoints[pc]
    // Restore interpreter state at this offset
    vm.IP = offset
}
Performance Characteristics:
| Metric | Interpreter | Template JIT | Optimizing JIT |
|---|---|---|---|
| Compilation time | 0ms | ~1ms | ~10ms |
| Execution speed | 1x | 10-30x faster | 50-100x faster |
| Code size | N/A | ~15 bytes/op | ~8 bytes/op |
| Warmup time | None | Fast | Slow |
| Best for | Cold code | Warm code | Hot code |
Scaling Considerations:
- Use a code cache to avoid recompilation (a bounded-cache sketch follows this list)
- Implement tiered compilation for startup performance
- Add speculative optimization with deopt support
- Monitor code cache size and evict cold code
- Profile to identify true hot paths
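The first and fourth points can be combined into a small bounded cache of compiled functions. The sketch below is an assumption layered on the CompiledFunc type from the solution (the CodeCache type and its evict-coldest policy are illustrative, not part of the original design); freeing the underlying executable memory is only safe once nothing is still executing it.

// CodeCache is a hypothetical bounded cache of compiled functions,
// keyed by bytecode entry point. When full, it evicts the entry with
// the fewest hits; a real JIT would also consider recency, code size,
// and would release the evicted function's executable memory.
type CodeCache struct {
    MaxEntries int
    entries    map[int]CompiledFunc
    hits       map[int]int
}

func NewCodeCache(max int) *CodeCache {
    return &CodeCache{
        MaxEntries: max,
        entries:    make(map[int]CompiledFunc),
        hits:       make(map[int]int),
    }
}

func (c *CodeCache) Get(entry int) (CompiledFunc, bool) {
    fn, ok := c.entries[entry]
    if ok {
        c.hits[entry]++
    }
    return fn, ok
}

func (c *CodeCache) Put(entry int, fn CompiledFunc) {
    if len(c.entries) >= c.MaxEntries {
        // Evict the coldest entry
        coldest, minHits := -1, -1
        for k, h := range c.hits {
            if minHits == -1 || h < minHits {
                coldest, minHits = k, h
            }
        }
        if coldest >= 0 {
            delete(c.entries, coldest)
            delete(c.hits, coldest)
        }
    }
    c.entries[entry] = fn
    c.hits[entry] = 0
}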
Key Takeaways
- JIT compilation bridges interpreted and compiled execution by generating native code at runtime
- Profile-guided optimization uses runtime information to make better optimization decisions
- Executable memory requires special OS permissions (e.g., pages mapped with PROT_EXEC via mmap)
- x86-64 encoding is complex but can be wrapped in helper functions
- Optimization must preserve semantics - aggressive optimization can break programs
- Tiered compilation balances startup time and peak performance by compiling hot code first
- Register allocation is critical for performance - stack-based code is simple but slower
- System V ABI defines calling conventions that must be followed for interop