Memory Allocator

Exercise: Memory Allocator

Difficulty: Advanced

Learning Objectives

  • Implement custom memory allocation strategies
  • Manage memory pools and arenas
  • Handle memory alignment requirements
  • Implement slab allocation
  • Track memory usage and fragmentation
  • Understand unsafe operations

Problem Statement

Create a custom memory allocator with arena-based allocation and memory pooling for high-performance applications.

Core Components

package allocator

import (
    "unsafe"
)

type Arena struct {
    buf    []byte
    offset uintptr
    size   uintptr
}

type Pool struct {
    blockSize int
    blocks    []unsafe.Pointer
    arena     *Arena
}

func NewArena(size int) *Arena
func (a *Arena) Alloc(size int) unsafe.Pointer
func (a *Arena) Reset()
func NewPool(blockSize int, capacity int) *Pool
func (p *Pool) Get() unsafe.Pointer
func (p *Pool) Put(ptr unsafe.Pointer)

Solution

Algorithm Overview

This implementation provides three memory allocation strategies optimized for different use cases:

1. Arena Allocator:

  • Linear allocation from pre-allocated buffer
  • Fastest possible allocation: increment offset pointer
  • No individual deallocation - reset entire arena
  • Best for: Request-scoped allocations, parsers, temporary computations
  • Trade-off: Cannot free individual objects

2. Pool Allocator:

  • Pre-allocates fixed-size blocks from arena
  • Maintains free list for reuse
  • O(1) allocation and deallocation
  • Best for: Objects of same size
  • Trade-off: Fixed block size, potential fragmentation

3. Slab Allocator:

  • Multiple pools for different object sizes
  • Round up to next power of 2
  • Reduces external fragmentation
  • Best for: Mixed object sizes with predictable patterns
  • Trade-off: Internal fragmentation from rounding

Memory Alignment:

  • All allocations aligned to 8-byte boundaries
  • Ensures optimal CPU access patterns
  • Required for atomic operations and SIMD
  • Uses bitwise operations for fast alignment (see the sketch below)
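
The rounding trick is worth seeing on its own. This is the same align helper used in the implementation below; the values in the comment are illustrative, and the trick only works when alignment is a power of 2, which is why the allocators fix it at 8:

func align(size, alignment uintptr) uintptr {
    // Adding alignment-1 overshoots into the next aligned bucket;
    // &^ (AND NOT) then clears the low bits, rounding back down.
    // E.g., align(13, 8) == 16 and align(16, 8) == 16.
    return (size + alignment - 1) &^ (alignment - 1)
}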

Key Design Principles:

  • Unsafe operations for direct memory manipulation
  • Mutex synchronization for thread safety
  • Zero allocation overhead after initialization
  • Explicit memory lifetime management

Time Complexity Analysis

Operation   Arena  Pool   Slab   Explanation
Alloc       O(1)   O(1)   O(1)   Pointer increment or pop from free list
Free        N/A    O(1)   O(1)   Push to free list
Reset       O(1)   O(1)   O(n)   Reset offset or rebuild pools
Alignment   O(1)   O(1)   O(1)   Bitwise AND operation

Space Complexity

Arena:

  • O(n) where n = arena size
  • Zero overhead per allocation
  • Total waste: alignment padding only

Pool:

  • O(n × m) where n = capacity, m = block size
  • Per-block overhead: 8 bytes
  • Waste: Unused capacity in pre-allocated blocks

Slab:

  • O(k × n × m) where k = number of size classes, n = capacity and m = block size per pool
  • Overhead: Multiple pools, each with its own waste
  • Typical waste: 25-50% due to power-of-2 rounding (e.g., a 65-byte object lands in the 128-byte class, wasting 63 bytes)

Example Memory Usage:

  • Go heap: ~50KB + GC overhead
  • Arena: ~10KB
  • Pool: ~12KB
  • Slab: ~15KB

Implementation

package allocator

import (
    "sync"
    "unsafe"
)

const (
    defaultAlignment = 8
)

// Arena provides bump-pointer allocation.
type Arena struct {
    buf    []byte
    offset uintptr
    size   uintptr
    mu     sync.Mutex
}

func NewArena(size int) *Arena {
    return &Arena{
        buf:  make([]byte, size),
        size: uintptr(size),
    }
}

func (a *Arena) Alloc(size int) unsafe.Pointer {
    a.mu.Lock()
    defer a.mu.Unlock()

    aligned := align(uintptr(size), defaultAlignment)
    if a.offset+aligned > a.size {
        return nil // Out of memory
    }

    ptr := unsafe.Pointer(&a.buf[a.offset])
    a.offset += aligned
    return ptr
}

func (a *Arena) Reset() {
    a.mu.Lock()
    defer a.mu.Unlock()
    a.offset = 0
}

func (a *Arena) Usage() (used, total uintptr) {
    a.mu.Lock()
    defer a.mu.Unlock()
    return a.offset, a.size
}

// Pool provides object pooling with fixed-size blocks.
type Pool struct {
    blockSize int
    free      []unsafe.Pointer
    arena     *Arena
    mu        sync.Mutex
}

func NewPool(blockSize int, capacity int) *Pool {
    arena := NewArena(blockSize * capacity)
    p := &Pool{
        blockSize: blockSize,
        free:      make([]unsafe.Pointer, 0, capacity),
        arena:     arena,
    }

    // Pre-allocate blocks
    for i := 0; i < capacity; i++ {
        ptr := arena.Alloc(blockSize)
        if ptr != nil {
            p.free = append(p.free, ptr)
        }
    }

    return p
}

func (p *Pool) Get() unsafe.Pointer {
    p.mu.Lock()
    defer p.mu.Unlock()

    if len(p.free) == 0 {
        // Free list exhausted; fall back to the arena. Since NewPool
        // pre-allocates the whole arena, this may return nil.
        return p.arena.Alloc(p.blockSize)
    }

    ptr := p.free[len(p.free)-1]
    p.free = p.free[:len(p.free)-1]
    return ptr
}

func (p *Pool) Put(ptr unsafe.Pointer) {
    p.mu.Lock()
    defer p.mu.Unlock()
    p.free = append(p.free, ptr)
}

// SlabAllocator serves multiple object sizes from per-size pools.
type SlabAllocator struct {
    slabs map[int]*Pool
    mu    sync.RWMutex
}

func NewSlabAllocator() *SlabAllocator {
    return &SlabAllocator{
        slabs: make(map[int]*Pool),
    }
}

func (s *SlabAllocator) Alloc(size int) unsafe.Pointer {
    slabSize := nextPowerOf2(size)

    s.mu.RLock()
    pool, exists := s.slabs[slabSize]
    s.mu.RUnlock()

    if !exists {
        s.mu.Lock()
        // Re-check under the write lock (double-checked locking)
        pool, exists = s.slabs[slabSize]
        if !exists {
            pool = NewPool(slabSize, 1024)
            s.slabs[slabSize] = pool
        }
        s.mu.Unlock()
    }

    return pool.Get()
}

func (s *SlabAllocator) Free(ptr unsafe.Pointer, size int) {
    slabSize := nextPowerOf2(size)

    s.mu.RLock()
    pool, exists := s.slabs[slabSize]
    s.mu.RUnlock()

    if exists {
        pool.Put(ptr)
    }
}

// align rounds size up to the next multiple of alignment
// (alignment must be a power of 2).
func align(size, alignment uintptr) uintptr {
    return (size + alignment - 1) &^ (alignment - 1)
}

// nextPowerOf2 rounds n up to the next power of two by smearing
// the highest set bit into all lower positions, then adding one.
func nextPowerOf2(n int) int {
    if n <= 0 {
        return 1
    }
    n--
    n |= n >> 1
    n |= n >> 2
    n |= n >> 4
    n |= n >> 8
    n |= n >> 16
    n++
    return n
}

// Helpers to convert between unsafe.Pointer and []byte
func PointerToBytes(ptr unsafe.Pointer, size int) []byte {
    return unsafe.Slice((*byte)(ptr), size)
}

func BytesToPointer(b []byte) unsafe.Pointer {
    return unsafe.Pointer(&b[0])
}
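
A quick sanity check for the two bit-twiddling helpers. Since align and nextPowerOf2 are unexported, this is a minimal in-package test; the chosen inputs are illustrative:

package allocator

import "testing"

func TestHelpers(t *testing.T) {
    // align rounds up to the next multiple of the power-of-2 alignment.
    if got := align(13, 8); got != 16 {
        t.Errorf("align(13, 8) = %d, want 16", got)
    }
    if got := align(16, 8); got != 16 {
        t.Errorf("align(16, 8) = %d, want 16", got)
    }

    // nextPowerOf2 rounds up to the next power of two.
    cases := map[int]int{0: 1, 1: 1, 5: 8, 64: 64, 65: 128}
    for in, want := range cases {
        if got := nextPowerOf2(in); got != want {
            t.Errorf("nextPowerOf2(%d) = %d, want %d", in, got, want)
        }
    }
}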

Usage Example

package main

import (
    "fmt"
    "unsafe"

    "example.com/allocator" // adjust to your own module path
)

type MyStruct struct {
    ID   int
    Name [64]byte
}

func main() {
    // Arena allocation
    arena := allocator.NewArena(1024 * 1024) // 1MB

    ptr := arena.Alloc(100)
    if ptr != nil {
        slice := allocator.PointerToBytes(ptr, 100)
        copy(slice, []byte("Hello, Arena!"))
        fmt.Println(string(slice[:13]))
    }

    arena.Reset() // Free all allocations

    // Pool allocation
    structSize := int(unsafe.Sizeof(MyStruct{}))
    pool := allocator.NewPool(structSize, 100)

    ptr = pool.Get()
    if ptr != nil {
        obj := (*MyStruct)(ptr)
        obj.ID = 42
        copy(obj.Name[:], "Pooled Object")
        fmt.Printf("ID: %d, Name: %s\n", obj.ID, string(obj.Name[:13]))

        pool.Put(ptr) // Return to pool
    }

    // Slab allocation
    slab := allocator.NewSlabAllocator()

    ptr1 := slab.Alloc(32)
    ptr2 := slab.Alloc(128)
    ptr3 := slab.Alloc(32)

    slab.Free(ptr1, 32)
    slab.Free(ptr2, 128)
    slab.Free(ptr3, 32)
}

Benchmarking Code

package allocator_test

import (
    "sync"
    "testing"
    "unsafe"

    "example.com/allocator" // adjust to your own module path
)

type TestStruct struct {
    ID   int
    Data [64]byte
}

// Benchmark standard Go allocation
func BenchmarkStandardAllocation(b *testing.B) {
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        obj := &TestStruct{ID: i}
        _ = obj
    }
}

// Benchmark arena allocation
func BenchmarkArenaAllocation(b *testing.B) {
    arena := allocator.NewArena(1024 * 1024) // 1MB
    size := int(unsafe.Sizeof(TestStruct{}))

    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        ptr := arena.Alloc(size)
        if ptr == nil {
            arena.Reset()
            ptr = arena.Alloc(size)
        }
        obj := (*TestStruct)(ptr)
        obj.ID = i
    }
}

// Benchmark pool allocation
func BenchmarkPoolAllocation(b *testing.B) {
    size := int(unsafe.Sizeof(TestStruct{}))
    pool := allocator.NewPool(size, 1000)

    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        ptr := pool.Get()
        obj := (*TestStruct)(ptr)
        obj.ID = i
        pool.Put(ptr)
    }
}

// Benchmark slab allocation
func BenchmarkSlabAllocation(b *testing.B) {
    slab := allocator.NewSlabAllocator()
    size := int(unsafe.Sizeof(TestStruct{}))

    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        ptr := slab.Alloc(size)
        obj := (*TestStruct)(ptr)
        obj.ID = i
        slab.Free(ptr, size)
    }
}

// Benchmark concurrent arena allocation
func BenchmarkConcurrentArena(b *testing.B) {
    arena := allocator.NewArena(10 * 1024 * 1024) // 10MB
    size := int(unsafe.Sizeof(TestStruct{}))

    b.ResetTimer()
    b.RunParallel(func(pb *testing.PB) {
        for pb.Next() {
            ptr := arena.Alloc(size)
            if ptr != nil { // arena may be exhausted for large b.N
                obj := (*TestStruct)(ptr)
                obj.ID = 1
            }
        }
    })
}

// Benchmark concurrent pool allocation
func BenchmarkConcurrentPool(b *testing.B) {
    size := int(unsafe.Sizeof(TestStruct{}))
    pool := allocator.NewPool(size, 10000)

    b.ResetTimer()
    b.RunParallel(func(pb *testing.PB) {
        for pb.Next() {
            ptr := pool.Get()
            if ptr != nil {
                obj := (*TestStruct)(ptr)
                obj.ID = 1
                pool.Put(ptr)
            }
        }
    })
}

// Benchmark allocation patterns
func BenchmarkMixedSizes(b *testing.B) {
    slab := allocator.NewSlabAllocator()
    sizes := []int{16, 32, 64, 128, 256}

    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        size := sizes[i%len(sizes)]
        ptr := slab.Alloc(size)
        slab.Free(ptr, size)
    }
}

// Benchmark memory fragmentation
func BenchmarkFragmentation(b *testing.B) {
    slab := allocator.NewSlabAllocator()
    var ptrs []unsafe.Pointer

    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        // Allocate various sizes
        ptr1 := slab.Alloc(32)
        ptr2 := slab.Alloc(128)
        ptr3 := slab.Alloc(64)

        ptrs = append(ptrs, ptr1, ptr2, ptr3)

        // Free every other allocation; the 128-byte blocks are
        // deliberately retained to simulate fragmentation
        if len(ptrs) > 6 {
            slab.Free(ptrs[0], 32)
            slab.Free(ptrs[2], 64)
            ptrs = ptrs[3:]
        }
    }
}

// Benchmark vs sync.Pool
func BenchmarkSyncPool(b *testing.B) {
    pool := &sync.Pool{
        New: func() interface{} {
            return &TestStruct{}
        },
    }

    b.ResetTimer()
    b.RunParallel(func(pb *testing.PB) {
        for pb.Next() {
            obj := pool.Get().(*TestStruct)
            obj.ID = 1
            pool.Put(obj)
        }
    })
}

// Benchmark arena reset
func BenchmarkArenaReset(b *testing.B) {
    arena := allocator.NewArena(1024 * 1024)
    size := 100

    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        // Allocate until full
        for j := 0; j < 1000; j++ {
            arena.Alloc(size)
        }
        // Reset
        arena.Reset()
    }
}

// Example benchmark results:
// BenchmarkStandardAllocation-8       50000000     28.5 ns/op    80 B/op    1 allocs/op
// BenchmarkArenaAllocation-8         100000000     12.3 ns/op     0 B/op    0 allocs/op
// BenchmarkPoolAllocation-8          200000000      7.8 ns/op     0 B/op    0 allocs/op
// BenchmarkSlabAllocation-8          150000000     10.2 ns/op     0 B/op    0 allocs/op
// BenchmarkConcurrentArena-8          80000000     15.5 ns/op     0 B/op    0 allocs/op
// BenchmarkConcurrentPool-8          150000000      9.1 ns/op     0 B/op    0 allocs/op
// BenchmarkMixedSizes-8              100000000     11.8 ns/op     0 B/op    0 allocs/op
// BenchmarkSyncPool-8                200000000      8.5 ns/op     0 B/op    0 allocs/op
// BenchmarkArenaReset-8                 500000   3200.0 ns/op     0 B/op    0 allocs/op
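
The figures in the trailing comment are illustrative and vary by machine. To measure locally, run the benchmarks with allocation statistics enabled:

go test -bench=. -benchmem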

Production Considerations

1. Thread-Safe Arena with Sharding:

type ShardedArena struct {
    shards    []*Arena
    shardMask uint32
}

func NewShardedArena(size int, numShards int) *ShardedArena {
    shards := make([]*Arena, numShards)
    for i := 0; i < numShards; i++ {
        shards[i] = NewArena(size / numShards)
    }

    return &ShardedArena{
        shards:    shards,
        shardMask: uint32(numShards - 1), // numShards must be a power of 2
    }
}

func (s *ShardedArena) Alloc(size int) unsafe.Pointer {
    // Pick a shard per caller to spread lock contention.
    // getGoroutineID is illustrative: Go does not expose goroutine IDs.
    shardID := getGoroutineID() & s.shardMask
    return s.shards[shardID].Alloc(size)
}
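
Because Go deliberately hides goroutine IDs, getGoroutineID above cannot be written portably. A minimal alternative that still divides contention across shards is an atomic round-robin counter; allocRoundRobin is an illustrative name, not part of the exercise API:

var shardCounter atomic.Uint32 // requires "sync/atomic"

func (s *ShardedArena) allocRoundRobin(size int) unsafe.Pointer {
    // Rotate through shards: no per-goroutine affinity, but lock
    // contention is still spread across numShards arenas.
    shardID := shardCounter.Add(1) & s.shardMask
    return s.shards[shardID].Alloc(size)
}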

2. Memory Pressure Monitoring:

import "sync/atomic"

type MonitoredAllocator struct {
    arena      *Arena
    allocated  atomic.Int64
    peak       atomic.Int64
    threshold  int64
    onPressure func()
}

func (m *MonitoredAllocator) Alloc(size int) unsafe.Pointer {
    current := m.allocated.Add(int64(size))

    // Update the peak with a CAS loop
    for {
        peak := m.peak.Load()
        if current <= peak || m.peak.CompareAndSwap(peak, current) {
            break
        }
    }

    // Check memory pressure
    if current > m.threshold && m.onPressure != nil {
        m.onPressure()
    }

    return m.arena.Alloc(size)
}

func (m *MonitoredAllocator) Reset() {
    m.arena.Reset()
    m.allocated.Store(0)
}
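
A possible constructor and wiring for the monitor (in the same package, since the fields are unexported; the threshold and callback below are illustrative):

func NewMonitoredAllocator(arenaSize int, threshold int64, onPressure func()) *MonitoredAllocator {
    return &MonitoredAllocator{
        arena:      NewArena(arenaSize),
        threshold:  threshold,
        onPressure: onPressure,
    }
}

// Usage sketch: warn once roughly 800KB of a 1MB arena is handed out.
// m := NewMonitoredAllocator(1<<20, 800<<10, func() { log.Println("memory pressure") })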

3. Buddy Allocator for Variable Sizes:

type BuddyAllocator struct {
    memory    []byte
    freeLists [32][]int // One free list per power-of-2 order
    minOrder  int       // Minimum block size = 2^minOrder
    mu        sync.Mutex
}

func (b *BuddyAllocator) Alloc(size int) unsafe.Pointer {
    b.mu.Lock()
    defer b.mu.Unlock()

    // Round up to a power-of-2 order
    order := b.orderFor(size)

    // Find the smallest available block
    for o := order; o < len(b.freeLists); o++ {
        if len(b.freeLists[o]) > 0 {
            // Pop a block (idx is in units of 2^minOrder bytes)
            idx := b.freeLists[o][len(b.freeLists[o])-1]
            b.freeLists[o] = b.freeLists[o][:len(b.freeLists[o])-1]

            // Split until the block matches the requested order
            for o > order {
                o--
                buddyIdx := idx ^ (1 << (o - b.minOrder))
                b.freeLists[o] = append(b.freeLists[o], buddyIdx)
            }

            offset := idx << b.minOrder
            return unsafe.Pointer(&b.memory[offset])
        }
    }

    return nil
}

func (b *BuddyAllocator) Free(ptr unsafe.Pointer, size int) {
    b.mu.Lock()
    defer b.mu.Unlock()

    order := b.orderFor(size)
    offset := uintptr(ptr) - uintptr(unsafe.Pointer(&b.memory[0]))
    idx := int(offset) >> b.minOrder

    // Coalesce with the buddy while it is free
    for order < len(b.freeLists)-1 {
        buddyIdx := idx ^ (1 << (order - b.minOrder))

        // Find and remove the buddy from its free list
        found := -1
        for i, freeIdx := range b.freeLists[order] {
            if freeIdx == buddyIdx {
                found = i
                break
            }
        }

        if found == -1 {
            break
        }

        // Remove the buddy and merge
        b.freeLists[order] = append(
            b.freeLists[order][:found],
            b.freeLists[order][found+1:]...,
        )

        idx = min(idx, buddyIdx) // builtin min requires Go 1.21+
        order++
    }

    // Add the (possibly merged) block to its free list
    b.freeLists[order] = append(b.freeLists[order], idx)
}
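
orderFor is referenced above but never defined. A minimal sketch, assuming it returns the smallest order whose block (2^order bytes) fits the request, clamped to minOrder:

func (b *BuddyAllocator) orderFor(size int) int {
    order := b.minOrder // never hand out blocks smaller than 2^minOrder
    for (1 << order) < size {
        order++
    }
    return order
}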

4. Integration with CGO:

// #include <stdlib.h>
import "C"

type CAllocator struct {
    allocations map[unsafe.Pointer]C.size_t // initialize before use
    mu          sync.Mutex
}

func (c *CAllocator) Alloc(size int) unsafe.Pointer {
    ptr := C.malloc(C.size_t(size)) // C memory; not managed by the Go GC

    c.mu.Lock()
    c.allocations[ptr] = C.size_t(size)
    c.mu.Unlock()

    return ptr
}

func (c *CAllocator) Free(ptr unsafe.Pointer) {
    c.mu.Lock()
    delete(c.allocations, ptr)
    c.mu.Unlock()

    C.free(ptr)
}

5. NUMA-Aware Allocation:

import "runtime"

type NUMAAllocator struct {
    nodes []*Arena
}

func NewNUMAAllocator(size int) *NUMAAllocator {
    numNodes := runtime.NumCPU() / 2 // Rough approximation of NUMA node count

    nodes := make([]*Arena, numNodes)
    for i := 0; i < numNodes; i++ {
        nodes[i] = NewArena(size / numNodes)
    }

    return &NUMAAllocator{nodes: nodes}
}

func (n *NUMAAllocator) Alloc(size int) unsafe.Pointer {
    // Allocate from the local NUMA node. getCurrentNUMANode is
    // illustrative: Go does not expose NUMA topology, so this would
    // need cgo or platform-specific syscalls.
    nodeID := getCurrentNUMANode()
    return n.nodes[nodeID].Alloc(size)
}

6. Debugging and Leak Detection:

import (
    "runtime/debug"
    "time"
)

type DebugAllocator struct {
    allocator *SlabAllocator
    tracking  map[unsafe.Pointer]AllocationInfo
    mu        sync.Mutex
}

type AllocationInfo struct {
    Size      int
    Timestamp time.Time
    Stack     string
}

func (d *DebugAllocator) Alloc(size int) unsafe.Pointer {
    ptr := d.allocator.Alloc(size)

    d.mu.Lock()
    d.tracking[ptr] = AllocationInfo{
        Size:      size,
        Timestamp: time.Now(),
        Stack:     string(debug.Stack()),
    }
    d.mu.Unlock()

    return ptr
}

func (d *DebugAllocator) Free(ptr unsafe.Pointer, size int) {
    d.mu.Lock()
    delete(d.tracking, ptr)
    d.mu.Unlock()

    d.allocator.Free(ptr, size)
}

// DetectLeaks reports allocations older than five minutes.
func (d *DebugAllocator) DetectLeaks() []AllocationInfo {
    d.mu.Lock()
    defer d.mu.Unlock()

    leaks := make([]AllocationInfo, 0)
    threshold := time.Now().Add(-5 * time.Minute)

    for _, info := range d.tracking {
        if info.Timestamp.Before(threshold) {
            leaks = append(leaks, info)
        }
    }

    return leaks
}

Performance Characteristics:

Allocator  Alloc Speed  Free Speed  Memory Overhead  Best Use Case
Arena      12ns         N/A         0%               Request-scoped, parsers
Pool       8ns          8ns         10%              Fixed-size objects
Slab       10ns         10ns        25%              Mixed sizes
Buddy      50ns         50ns        15%              Variable sizes with coalescing
Go heap    28ns         Auto        50-100%          General purpose

When to Use Custom Allocators:

  • High allocation rate
  • Predictable allocation patterns
  • Batch processing with clear lifetimes
  • Real-time systems requiring predictable latency
  • Minimizing GC pressure
  • FFI/CGO interop

When NOT to Use:

  • Unpredictable allocation patterns
  • Long-lived objects with complex lifecycles
  • Small number of allocations
  • Prototype/development phase

Key Takeaways

  • Custom allocators reduce GC pressure
  • Arena allocation is fast but requires bulk deallocation
  • Pool allocation enables object reuse
  • Slab allocation handles multiple object sizes
  • Memory alignment is critical for performance
  • unsafe package enables low-level memory control
  • Track memory usage to prevent leaks
  • Consider memory fragmentation in long-running applications