Lock-Free Concurrent Structures

Exercise: Lock-Free Concurrent Data Structures

Difficulty - Advanced

Estimated Time - 8-10 hours

Learning Objectives

  • Understand lock-free programming principles and theory
  • Implement lock-free data structures using atomic operations
  • Master Compare-And-Swap algorithms
  • Handle the ABA problem and memory reclamation
  • Build wait-free and obstruction-free algorithms
  • Measure scalability and performance of lock-free vs mutex-based code

Problem Statement

Build lock-free concurrent data structures that allow multiple threads to safely access shared data without using traditional locks. Implement a lock-free queue, stack, and hash table using atomic operations and compare-and-swap algorithms.

Lock-free programming is essential for high-performance concurrent systems where lock contention becomes a bottleneck. Modern systems like database engines, message brokers, and real-time systems rely on lock-free algorithms for scalability.

Real-World Scenario:

// Traditional queue with mutex - all operations serialize on one lock
type MutexQueue struct {
    mu    sync.Mutex
    items []int
}

func (q *MutexQueue) Enqueue(val int) {
    q.mu.Lock()
    q.items = append(q.items, val)
    q.mu.Unlock()
}

// Lock-free queue - scales with cores
type LockFreeQueue struct {
    head unsafe.Pointer // atomic
    tail unsafe.Pointer // atomic
}

func (q *LockFreeQueue) Enqueue(val int) {
    node := &Node{Value: val}
    for {
        tail := (*Node)(atomic.LoadPointer(&q.tail))
        next := atomic.LoadPointer(&tail.Next)

        if next == nil {
            if atomic.CompareAndSwapPointer(&tail.Next, nil, unsafe.Pointer(node)) {
                atomic.CompareAndSwapPointer(&q.tail, unsafe.Pointer(tail), unsafe.Pointer(node))
                return
            }
        } else {
            atomic.CompareAndSwapPointer(&q.tail, unsafe.Pointer(tail), next)
        }
    }
}

// Result: often several times higher throughput under heavy contention

Requirements

Functional Requirements

  1. Lock-Free Queue: Implement Michael-Scott lock-free queue
  2. Lock-Free Stack: Implement Treiber stack
  3. Lock-Free Hash Table: Implement concurrent hash table
  4. Memory Reclamation: Handle safe memory reclamation
  5. ABA Problem Solution: Prevent ABA problem with tagged pointers
  6. Progress Guarantees: Ensure wait-free or lock-free progress
  7. Benchmarks: Compare performance vs mutex-based implementations

Non-Functional Requirements

  • Support 1-64 concurrent threads
  • Scale linearly with number of cores
  • No data races or memory leaks
  • Handle high contention scenarios
  • Minimal memory overhead

Background and Theory

Lock-Free Programming Fundamentals

Progress Guarantees (illustrated in the sketch below):

  1. Blocking: Thread may wait indefinitely for a lock
  2. Obstruction-Free: Thread makes progress if it runs in isolation
  3. Lock-Free: At least one thread makes progress in finite steps
  4. Wait-Free: Every thread makes progress in bounded steps
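
These guarantees map directly onto familiar Go constructs. A minimal counter sketch (illustrative names; assumes sync and sync/atomic are imported):

var (
    mu      sync.Mutex
    counter int64
)

// Blocking: a goroutine holding the lock can stall all others.
func incBlocking() {
    mu.Lock()
    counter++
    mu.Unlock()
}

// Lock-free: some goroutine's CAS always succeeds system-wide, but an
// individual goroutine may retry indefinitely under contention.
func incLockFree() {
    for {
        old := atomic.LoadInt64(&counter)
        if atomic.CompareAndSwapInt64(&counter, old, old+1) {
            return
        }
    }
}

// Wait-free: every goroutine completes in a bounded number of steps
// (atomic.AddInt64 maps to a hardware fetch-and-add on common CPUs).
func incWaitFree() {
    atomic.AddInt64(&counter, 1)
}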

Key Concepts:

  • Atomic Operations: Operations that complete without interruption
  • Compare-And-Swap: Atomically compare and update a value
  • Memory Ordering: Guarantees about when memory writes become visible
  • ABA Problem: Value changes from A to B and back to A, making it seem unchanged

Compare-And-Swap

CAS is the fundamental atomic primitive for lock-free algorithms:

// Conceptually, CAS(addr *int64, old, new int64) bool performs the
// following as one indivisible step:
//
//     if *addr == old {
//         *addr = new
//         return true
//     }
//     return false
//
// In Go, it is provided by the sync/atomic package:
atomic.CompareAndSwapInt64(&value, oldValue, newValue)

CAS Loop Pattern:

for {
    old := atomic.LoadInt64(&value)
    updated := old + 1 // Compute the new value

    if atomic.CompareAndSwapInt64(&value, old, updated) {
        break // Success
    }
    // CAS failed: another thread modified value, retry
}

The ABA Problem

The ABA problem occurs when:

  1. Thread 1 reads value A
  2. Thread 2 changes A to B, then back to A
  3. Thread 1's CAS succeeds even though the value changed (traced below)
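
Concretely, here is how that interleaving corrupts a Treiber stack whose nodes are reused (a hypothetical trace, not runnable code):

Stack: top -> A -> B -> C

Thread 1 (Pop): reads top = A and next = B, then is preempted
Thread 2:       pops A, pops B, pushes A back (A's node is reused)
                stack is now: top -> A -> C
Thread 1:       CAS(&top, A, B) succeeds because top == A again,
                but it installs B, which is no longer in the stack
Result:         top -> B (stale node), and C is unreachable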

Solution: Tagged Pointers

type TaggedPointer struct {
    Ptr uintptr // Pointer bits (x86-64/ARM64 use at most 48 bits)
    Tag uint32  // Version tag stored in the upper bits
}

// Pack pointer and tag into a single uint64
// (assumes 48-bit virtual addresses, as on current x86-64/ARM64)
func Pack(ptr unsafe.Pointer, tag uint32) uint64 {
    return uint64(uintptr(ptr)) | (uint64(tag) << 48)
}

// Unpack extracts the pointer and tag
func Unpack(packed uint64) (unsafe.Pointer, uint32) {
    ptr := unsafe.Pointer(uintptr(packed & 0x0000FFFFFFFFFFFF))
    tag := uint32(packed >> 48)
    return ptr, tag
}

Memory Reclamation Problem

Lock-free algorithms face a challenge: when can we safely free a node?

Problem:

// Thread 1: Dequeue begins
node := q.head          // reads a pointer to the head node

// Thread 2: completes a dequeue of the same node and reclaims it
free(node)              // Dangerous!

// Thread 1: resumes and dereferences freed memory
value := node.Value     // use-after-free
return value            // CRASH (in a manually managed runtime)

Solutions:

  1. Hazard Pointers: Threads mark pointers they're using (read-side protocol sketched after this list)
  2. Epoch-Based Reclamation: Group frees into epochs
  3. Reference Counting: Track references to each node
  4. Quiescent-State-Based Reclamation: Free when all threads quiesce
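
Of these, hazard pointers have the most explicit per-operation protocol. A minimal sketch of the read side for a stack pop, using the stack types defined later and assuming a shared hp instance with the Acquire/Release/CanFree helpers shown in Challenge 2:

func (s *LockFreeStack) PopWithHazard(threadID int) (int, bool) {
    for {
        top := atomic.LoadPointer(&s.top)
        if top == nil {
            return 0, false
        }

        // 1. Advertise the pointer before dereferencing it.
        hp.Acquire(threadID, top)

        // 2. Re-validate: if top changed, the hazard may guard a stale node.
        if top != atomic.LoadPointer(&s.top) {
            hp.Release(threadID)
            continue
        }

        // 3. Safe to dereference while the hazard is published.
        node := (*StackNode)(top)
        next := atomic.LoadPointer(&node.Next)

        if atomic.CompareAndSwapPointer(&s.top, top, next) {
            hp.Release(threadID)
            // 4. The node may only be freed once CanFree(top) reports
            //    that no thread holds a hazard on it.
            return node.Value, true
        }
        hp.Release(threadID)
    }
}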

Michael-Scott Lock-Free Queue

The classic lock-free FIFO queue:

Initial state:
head -> [dummy] <- tail
         next=nil

Enqueue(A):
head -> [dummy] -> [A] <- tail
         next=A    next=nil

Enqueue(B):
head -> [dummy] -> [A] -> [B] <- tail
         next=A    next=B   next=nil

Dequeue():
head -> [A] -> [B] <- tail
         next=B   next=nil
(returns A's value; A becomes the new dummy node)

Treiber Stack

Simple lock-free LIFO stack:

Initial: top -> nil

Push(A):
top -> [A] -> nil

Push(B):
top -> [B] -> [A] -> nil

Pop():
top -> [A] -> nil
(returns B)

Use Cases

1. High-Performance Queues

  • Message passing systems
  • Task schedulers
  • Network packet buffers

2. Concurrent Data Structures

  • Non-blocking hash tables
  • Skip lists
  • Priority queues

3. Real-Time Systems

  • Audio/video processing
  • Game engines
  • Trading systems

4. System Software

  • OS kernel data structures
  • Database transaction logs
  • RCU in Linux

Implementation Challenges

Challenge 1: ABA Problem

Issue: CAS cannot detect if value changed and reverted.

Solution: Use tagged pointers or version numbers:

type Node struct {
    Value int
    Next  uint64 // Packed pointer + tag
}

// pack combines a node pointer and tag, like Pack above
func pack(n *Node, tag uint32) uint64 {
    return uint64(uintptr(unsafe.Pointer(n))) | (uint64(tag) << 48)
}

func (n *Node) GetNext() (*Node, uint32) {
    packed := atomic.LoadUint64(&n.Next)
    ptr := unsafe.Pointer(uintptr(packed & 0x0000FFFFFFFFFFFF))
    tag := uint32(packed >> 48)
    return (*Node)(ptr), tag
}

func (n *Node) CASNext(old, new *Node, oldTag, newTag uint32) bool {
    oldPacked := pack(old, oldTag)
    newPacked := pack(new, newTag)
    return atomic.CompareAndSwapUint64(&n.Next, oldPacked, newPacked)
}

Challenge 2: Memory Reclamation

Issue: Cannot free nodes that other threads may be accessing.

Solution: Hazard pointers - threads advertise which pointers they're using:

const MaxThreads = 64

type HazardPointer struct {
    pointers [MaxThreads]unsafe.Pointer
}

func (hp *HazardPointer) Acquire(threadID int, ptr unsafe.Pointer) {
    atomic.StorePointer(&hp.pointers[threadID], ptr)
}

func (hp *HazardPointer) Release(threadID int) {
    atomic.StorePointer(&hp.pointers[threadID], nil)
}

func (hp *HazardPointer) CanFree(ptr unsafe.Pointer) bool {
    for i := 0; i < MaxThreads; i++ {
        if atomic.LoadPointer(&hp.pointers[i]) == ptr {
            return false // Another thread is using it
        }
    }
    return true
}

Challenge 3: Memory Ordering

Issue: Compiler/CPU reordering can break lock-free algorithms.

Solution: Use proper memory barriers:

// Go's sync/atomic operations are sequentially consistent, which
// subsumes the acquire/release ordering these algorithms need:
atomic.LoadPointer(&ptr)                     // at least acquire semantics
atomic.StorePointer(&ptr, val)               // at least release semantics
atomic.CompareAndSwapPointer(&ptr, old, new) // full barrier
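
For contrast, here is what the same access looks like without atomics; this is a data race that the Go race detector flags, and the compiler or CPU may legally reorder or cache the reads (a deliberately broken sketch against the queue type from the solution):

// BUGGY: plain reads race with CAS updates from other goroutines.
func (q *LockFreeQueue) badIsEmpty() bool {
    head := (*QueueNode)(q.head) // must be atomic.LoadPointer(&q.head)
    return head.Next == nil      // must be atomic.LoadPointer(&head.Next)
}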

Challenge 4: Performance Tuning

Issue: False sharing and cache line bouncing hurt performance.

Solution: Cache line padding:

type CacheLinePadded struct {
    Value int64
    _     [7]int64 // Pad struct to 64 bytes (one cache line)
}
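
A common application is sharding a hot counter so that writers on different cores don't invalidate each other's cache lines. A sketch (the shard count and id scheme are illustrative):

const numShards = 16 // often sized to GOMAXPROCS

type ShardedCounter struct {
    shards [numShards]CacheLinePadded
}

// Add increments the shard owned by a worker id; each shard lives on
// its own cache line, so updates don't bounce a shared line.
func (c *ShardedCounter) Add(id int, delta int64) {
    atomic.AddInt64(&c.shards[id%numShards].Value, delta)
}

// Sum aggregates all shards (approximate while writers are active).
func (c *ShardedCounter) Sum() int64 {
    var total int64
    for i := range c.shards {
        total += atomic.LoadInt64(&c.shards[i].Value)
    }
    return total
}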

Hints

Hint 1: Start with Lock-Free Stack

The stack is simpler than the queue:

type Node struct {
    Value int
    Next  unsafe.Pointer // atomic
}

type LockFreeStack struct {
    top unsafe.Pointer // atomic
}

func (s *LockFreeStack) Push(value int) {
    node := &Node{Value: value}
    for {
        top := atomic.LoadPointer(&s.top)
        node.Next = top
        if atomic.CompareAndSwapPointer(&s.top, top, unsafe.Pointer(node)) {
            return
        }
    }
}

Hint 2: Michael-Scott Queue Needs Dummy Node

Keep a dummy node to simplify empty queue handling:

type LockFreeQueue struct {
    head unsafe.Pointer // atomic
    tail unsafe.Pointer // atomic
}

func NewLockFreeQueue() *LockFreeQueue {
    dummy := &Node{}
    return &LockFreeQueue{
        head: unsafe.Pointer(dummy),
        tail: unsafe.Pointer(dummy),
    }
}

Hint 3: Handle Tail Lag in Queue

Tail pointer may lag behind actual tail:

func (q *LockFreeQueue) Enqueue(value int) {
    node := &Node{Value: value}
    for {
        tail := (*Node)(atomic.LoadPointer(&q.tail))
        next := atomic.LoadPointer(&tail.Next)

        // Check if tail is still consistent
        if unsafe.Pointer(tail) != atomic.LoadPointer(&q.tail) {
            continue
        }

        if next == nil {
            // Try to link node after the current tail
            if atomic.CompareAndSwapPointer(&tail.Next, nil, unsafe.Pointer(node)) {
                // Try to swing tail to the new node
                atomic.CompareAndSwapPointer(&q.tail, unsafe.Pointer(tail), unsafe.Pointer(node))
                return
            }
        } else {
            // Tail is lagging, try to advance it
            atomic.CompareAndSwapPointer(&q.tail, unsafe.Pointer(tail), next)
        }
    }
}

Hint 4: Use Epoch-Based Reclamation

Simple memory reclamation without hazard pointers:

type Epoch struct {
    current int64
    retired [3][]unsafe.Pointer // retired lists for 3 epochs
}

// Enter returns the epoch the calling thread is operating in.
func (e *Epoch) Enter(threadID int) int64 {
    return atomic.LoadInt64(&e.current)
}

// Retire queues a node for later reclamation in the given epoch.
func (e *Epoch) Retire(ptr unsafe.Pointer, epoch int64) {
    idx := epoch % 3
    e.retired[idx] = append(e.retired[idx], ptr)
}

// TryAdvance bumps the epoch; the "all threads have left the old
// epoch" check is elided in this sketch.
func (e *Epoch) TryAdvance() {
    nextEpoch := atomic.AddInt64(&e.current, 1)
    oldEpoch := (nextEpoch + 1) % 3 // the epoch two behind the new current

    // In a manually managed runtime each pointer in e.retired[oldEpoch]
    // would be freed here; in Go, clearing the list lets the GC reclaim them.
    e.retired[oldEpoch] = nil
}

Hint 5: Benchmark with Contention

Test scalability with multiple goroutines:

func BenchmarkLockFreeQueue_Contention(b *testing.B) {
    q := NewLockFreeQueue()

    b.ResetTimer()
    // RunParallel spawns GOMAXPROCS goroutines by default
    b.RunParallel(func(pb *testing.PB) {
        for pb.Next() {
            q.Enqueue(42)
            q.Dequeue()
        }
    })
}
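
To exercise these, run the test suite with the race detector and then the contention benchmarks (standard Go toolchain commands):

go test -race ./...
go test -bench=Contention -benchmem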

Solution


Approach

We'll implement the following:

  1. Treiber Stack - Simple lock-free LIFO stack
  2. Michael-Scott Queue - Classic lock-free FIFO queue
  3. Lock-Free Hash Table - Concurrent hash table with CAS
  4. Tagged-Pointer Stack - Treiber stack variant with ABA protection

Implementation

package lockfree

import (
    "sync"
    "sync/atomic"
    "unsafe"
)

// ===== Lock-Free Stack =====

// StackNode represents a node in the stack
type StackNode struct {
    Value int
    Next  unsafe.Pointer // atomic pointer to next node
}

// LockFreeStack implements a lock-free LIFO stack
type LockFreeStack struct {
    top unsafe.Pointer // atomic pointer to top node
}

// NewLockFreeStack creates a new lock-free stack
func NewLockFreeStack() *LockFreeStack {
    return &LockFreeStack{}
}

// Push adds a value to the stack
func (s *LockFreeStack) Push(value int) {
    node := &StackNode{
        Value: value,
    }

    for {
        top := atomic.LoadPointer(&s.top)
        node.Next = top

        if atomic.CompareAndSwapPointer(&s.top, top, unsafe.Pointer(node)) {
            return
        }
        // CAS failed, retry
    }
}

// Pop removes and returns the top value
func (s *LockFreeStack) Pop() (int, bool) {
    for {
        top := atomic.LoadPointer(&s.top)
        if top == nil {
            return 0, false // Stack is empty
        }

        topNode := (*StackNode)(top)
        next := atomic.LoadPointer(&topNode.Next)

        if atomic.CompareAndSwapPointer(&s.top, top, next) {
            return topNode.Value, true
        }
        // CAS failed, retry
    }
}

// IsEmpty returns true if the stack is empty
func (s *LockFreeStack) IsEmpty() bool {
    return atomic.LoadPointer(&s.top) == nil
}

// ===== Lock-Free Queue =====

// QueueNode represents a node in the queue
type QueueNode struct {
    Value int
    Next  unsafe.Pointer // atomic pointer to next node
}

// LockFreeQueue implements the Michael-Scott lock-free FIFO queue
type LockFreeQueue struct {
    head unsafe.Pointer // atomic pointer to head
    tail unsafe.Pointer // atomic pointer to tail
}

// NewLockFreeQueue creates a new lock-free queue
func NewLockFreeQueue() *LockFreeQueue {
    // Use a dummy node to simplify the implementation
    dummy := &QueueNode{}
    return &LockFreeQueue{
        head: unsafe.Pointer(dummy),
        tail: unsafe.Pointer(dummy),
    }
}

// Enqueue adds a value to the queue
func (q *LockFreeQueue) Enqueue(value int) {
    node := &QueueNode{
        Value: value,
    }

    for {
        tail := atomic.LoadPointer(&q.tail)
        tailNode := (*QueueNode)(tail)
        next := atomic.LoadPointer(&tailNode.Next)

        // Check if tail is still consistent
        if tail == atomic.LoadPointer(&q.tail) {
            if next == nil {
                // Tail is pointing to the last node, try to link the new node
                if atomic.CompareAndSwapPointer(&tailNode.Next, nil, unsafe.Pointer(node)) {
                    // Enqueue succeeded, try to swing tail
                    atomic.CompareAndSwapPointer(&q.tail, tail, unsafe.Pointer(node))
                    return
                }
            } else {
                // Tail is falling behind, try to advance it
                atomic.CompareAndSwapPointer(&q.tail, tail, next)
            }
        }
    }
}

// Dequeue removes and returns the first value
func (q *LockFreeQueue) Dequeue() (int, bool) {
    for {
        head := atomic.LoadPointer(&q.head)
        tail := atomic.LoadPointer(&q.tail)
        headNode := (*QueueNode)(head)
        next := atomic.LoadPointer(&headNode.Next)

        // Check if head is still consistent
        if head == atomic.LoadPointer(&q.head) {
            if head == tail {
                // Queue is empty or tail is falling behind
                if next == nil {
                    return 0, false // Queue is empty
                }
                // Tail is falling behind, try to advance it
                atomic.CompareAndSwapPointer(&q.tail, tail, next)
            } else {
                // Read value before CAS; otherwise another dequeue might reclaim the node
                nextNode := (*QueueNode)(next)
                value := nextNode.Value

                // Try to swing head to the next node
                if atomic.CompareAndSwapPointer(&q.head, head, next) {
                    return value, true
                }
            }
        }
    }
}

// IsEmpty returns true if the queue is empty
func (q *LockFreeQueue) IsEmpty() bool {
    head := atomic.LoadPointer(&q.head)
    headNode := (*QueueNode)(head)
    next := atomic.LoadPointer(&headNode.Next)
    return next == nil
}

// ===== Lock-Free Hash Table =====

const (
    InitialBuckets = 16
    LoadFactor     = 0.75
)

// HashNode represents a node in a hash table bucket list
type HashNode struct {
    Key   int
    Value int
    Hash  uint64
    Next  unsafe.Pointer // atomic
}

// LockFreeHashTable implements a lock-free hash table
// (fixed-size bucket array; resizing is left as an extension)
type LockFreeHashTable struct {
    buckets unsafe.Pointer // atomic pointer to the bucket slice
    count   int64          // atomic count of elements
    size    int64          // atomic size of bucket array
}

// NewLockFreeHashTable creates a new lock-free hash table
func NewLockFreeHashTable() *LockFreeHashTable {
    buckets := make([]unsafe.Pointer, InitialBuckets)
    return &LockFreeHashTable{
        buckets: unsafe.Pointer(&buckets),
        size:    InitialBuckets,
    }
}

// hash computes the hash for a key (MurmurHash3-style finalizer)
func (ht *LockFreeHashTable) hash(key int) uint64 {
    h := uint64(key)
    h ^= h >> 16
    h *= 0x85ebca6b
    h ^= h >> 13
    h *= 0xc2b2ae35
    h ^= h >> 16
    return h
}

// getBucket returns the bucket slot for a hash
func (ht *LockFreeHashTable) getBucket(hash uint64) *unsafe.Pointer {
    size := atomic.LoadInt64(&ht.size)
    buckets := *(*[]unsafe.Pointer)(atomic.LoadPointer(&ht.buckets))
    idx := hash % uint64(size)
    return &buckets[idx]
}

// Put inserts or updates a key-value pair
func (ht *LockFreeHashTable) Put(key, value int) {
    hash := ht.hash(key)
    node := &HashNode{
        Key:   key,
        Value: value,
        Hash:  hash,
    }

    for {
        bucket := ht.getBucket(hash)
        head := atomic.LoadPointer(bucket)

        // Check if the key already exists
        for curr := head; curr != nil; {
            currNode := (*HashNode)(curr)
            if currNode.Hash == hash && currNode.Key == key {
                // Update existing value in place (assumes 64-bit int)
                atomic.StoreInt64((*int64)(unsafe.Pointer(&currNode.Value)), int64(value))
                return
            }
            curr = atomic.LoadPointer(&currNode.Next)
        }

        // Key doesn't exist, try to insert at the head of the bucket
        node.Next = head
        if atomic.CompareAndSwapPointer(bucket, head, unsafe.Pointer(node)) {
            atomic.AddInt64(&ht.count, 1)
            return
        }
    }
}

// Get retrieves a value by key
func (ht *LockFreeHashTable) Get(key int) (int, bool) {
    hash := ht.hash(key)
    bucket := ht.getBucket(hash)
    head := atomic.LoadPointer(bucket)

    for curr := head; curr != nil; {
        currNode := (*HashNode)(curr)
        if currNode.Hash == hash && currNode.Key == key {
            value := int(atomic.LoadInt64((*int64)(unsafe.Pointer(&currNode.Value))))
            return value, true
        }
        curr = atomic.LoadPointer(&currNode.Next)
    }

    return 0, false
}

// Delete removes a key-value pair.
// Note: unlinking from the middle of a bucket list is simplified here;
// a production version needs mark bits (Harris/Michael lists) to be safe
// against concurrent deletes of adjacent nodes.
func (ht *LockFreeHashTable) Delete(key int) bool {
    hash := ht.hash(key)

    for {
        bucket := ht.getBucket(hash)
        head := atomic.LoadPointer(bucket)

        var prev *HashNode
        curr := head
        for curr != nil {
            currNode := (*HashNode)(curr)

            if currNode.Hash == hash && currNode.Key == key {
                next := atomic.LoadPointer(&currNode.Next)

                if prev == nil {
                    // Delete head
                    if atomic.CompareAndSwapPointer(bucket, curr, next) {
                        atomic.AddInt64(&ht.count, -1)
                        return true
                    }
                } else {
                    // Delete middle/end
                    if atomic.CompareAndSwapPointer(&prev.Next, curr, next) {
                        atomic.AddInt64(&ht.count, -1)
                        return true
                    }
                }
                break // CAS failed, retry from the bucket head
            }

            prev = currNode
            curr = atomic.LoadPointer(&currNode.Next)
        }

        if curr == nil {
            return false // Key not found
        }
    }
}

// Size returns the number of elements
func (ht *LockFreeHashTable) Size() int {
    return int(atomic.LoadInt64(&ht.count))
}

// ===== Mutex-Based Comparison =====

// MutexQueue for performance comparison
type MutexQueue struct {
    mu    sync.Mutex
    items []int
}

func NewMutexQueue() *MutexQueue {
    return &MutexQueue{
        items: make([]int, 0),
    }
}

func (q *MutexQueue) Enqueue(value int) {
    q.mu.Lock()
    q.items = append(q.items, value)
    q.mu.Unlock()
}

func (q *MutexQueue) Dequeue() (int, bool) {
    q.mu.Lock()
    defer q.mu.Unlock()

    if len(q.items) == 0 {
        return 0, false
    }

    value := q.items[0]
    q.items = q.items[1:]
    return value, true
}

// MutexStack for performance comparison
type MutexStack struct {
    mu    sync.Mutex
    items []int
}

func NewMutexStack() *MutexStack {
    return &MutexStack{
        items: make([]int, 0),
    }
}

func (s *MutexStack) Push(value int) {
    s.mu.Lock()
    s.items = append(s.items, value)
    s.mu.Unlock()
}

func (s *MutexStack) Pop() (int, bool) {
    s.mu.Lock()
    defer s.mu.Unlock()

    if len(s.items) == 0 {
        return 0, false
    }

    value := s.items[len(s.items)-1]
    s.items = s.items[:len(s.items)-1]
    return value, true
}

// ===== Tagged Pointer for ABA Prevention =====

// TaggedPointer combines pointer bits and a version tag in one uint64.
// Caution: storing Go pointers as integer bits hides them from the GC;
// this is illustrative and assumes the nodes are kept alive elsewhere.
type TaggedPointer uint64

// Pack creates a tagged pointer (assumes 48-bit virtual addresses)
func Pack(ptr unsafe.Pointer, tag uint32) TaggedPointer {
    return TaggedPointer(uint64(uintptr(ptr)) | (uint64(tag) << 48))
}

// Unpack extracts pointer and tag
func Unpack(tp TaggedPointer) (unsafe.Pointer, uint32) {
    ptr := unsafe.Pointer(uintptr(tp & 0x0000FFFFFFFFFFFF))
    tag := uint32(tp >> 48)
    return ptr, tag
}

// StackNodeTagged with ABA prevention
type StackNodeTagged struct {
    Value int
    Next  TaggedPointer
}

// LockFreeStackTagged uses tagged pointers
type LockFreeStackTagged struct {
    top uint64 // atomic TaggedPointer
}

func NewLockFreeStackTagged() *LockFreeStackTagged {
    return &LockFreeStackTagged{}
}

func (s *LockFreeStackTagged) Push(value int) {
    node := &StackNodeTagged{Value: value}

    for {
        oldTop := atomic.LoadUint64(&s.top)
        _, oldTag := Unpack(TaggedPointer(oldTop))

        node.Next = TaggedPointer(oldTop)
        newTop := uint64(Pack(unsafe.Pointer(node), oldTag+1))

        if atomic.CompareAndSwapUint64(&s.top, oldTop, newTop) {
            return
        }
    }
}

func (s *LockFreeStackTagged) Pop() (int, bool) {
    for {
        oldTop := atomic.LoadUint64(&s.top)
        // Check the pointer bits, not the raw word: the tag may be
        // nonzero even when the stack is empty
        oldPtr, oldTag := Unpack(TaggedPointer(oldTop))
        if oldPtr == nil {
            return 0, false
        }

        topNode := (*StackNodeTagged)(oldPtr)
        nextPtr, _ := Unpack(topNode.Next)

        newTop := uint64(Pack(nextPtr, oldTag+1))

        if atomic.CompareAndSwapUint64(&s.top, oldTop, newTop) {
            return topNode.Value, true
        }
    }
}

Testing

package lockfree

import (
    "runtime"
    "sync"
    "testing"
)

func TestLockFreeStack_Basic(t *testing.T) {
    s := NewLockFreeStack()

    s.Push(1)
    s.Push(2)
    s.Push(3)

    if v, ok := s.Pop(); !ok || v != 3 {
        t.Errorf("Expected 3, got %d", v)
    }

    if v, ok := s.Pop(); !ok || v != 2 {
        t.Errorf("Expected 2, got %d", v)
    }

    if v, ok := s.Pop(); !ok || v != 1 {
        t.Errorf("Expected 1, got %d", v)
    }

    if _, ok := s.Pop(); ok {
        t.Error("Expected empty stack")
    }
}

func TestLockFreeQueue_Basic(t *testing.T) {
    q := NewLockFreeQueue()

    q.Enqueue(1)
    q.Enqueue(2)
    q.Enqueue(3)

    if v, ok := q.Dequeue(); !ok || v != 1 {
        t.Errorf("Expected 1, got %d", v)
    }

    if v, ok := q.Dequeue(); !ok || v != 2 {
        t.Errorf("Expected 2, got %d", v)
    }

    if v, ok := q.Dequeue(); !ok || v != 3 {
        t.Errorf("Expected 3, got %d", v)
    }

    if _, ok := q.Dequeue(); ok {
        t.Error("Expected empty queue")
    }
}

func TestLockFreeQueue_Concurrent(t *testing.T) {
    q := NewLockFreeQueue()
    numGoroutines := runtime.NumCPU()
    itemsPerGoroutine := 1000

    var wg sync.WaitGroup

    // Concurrent enqueue
    for i := 0; i < numGoroutines; i++ {
        wg.Add(1)
        go func(id int) {
            defer wg.Done()
            for j := 0; j < itemsPerGoroutine; j++ {
                q.Enqueue(id*itemsPerGoroutine + j)
            }
        }(i)
    }

    wg.Wait()

    // Count dequeued items
    count := 0
    for {
        if _, ok := q.Dequeue(); !ok {
            break
        }
        count++
    }

    expected := numGoroutines * itemsPerGoroutine
    if count != expected {
        t.Errorf("Expected %d items, got %d", expected, count)
    }
}

func TestLockFreeHashTable_Basic(t *testing.T) {
    ht := NewLockFreeHashTable()

    ht.Put(1, 100)
    ht.Put(2, 200)
    ht.Put(3, 300)

    if v, ok := ht.Get(1); !ok || v != 100 {
        t.Errorf("Expected 100, got %d", v)
    }

    if v, ok := ht.Get(2); !ok || v != 200 {
        t.Errorf("Expected 200, got %d", v)
    }

    // Update
    ht.Put(1, 111)
    if v, ok := ht.Get(1); !ok || v != 111 {
        t.Errorf("Expected 111, got %d", v)
    }

    // Delete
    if !ht.Delete(2) {
        t.Error("Delete failed")
    }

    if _, ok := ht.Get(2); ok {
        t.Error("Key should be deleted")
    }
}

func BenchmarkLockFreeStack_Contention(b *testing.B) {
    s := NewLockFreeStack()

    b.RunParallel(func(pb *testing.PB) {
        for pb.Next() {
            s.Push(42)
            s.Pop()
        }
    })
}

func BenchmarkMutexStack_Contention(b *testing.B) {
    s := NewMutexStack()

    b.RunParallel(func(pb *testing.PB) {
        for pb.Next() {
            s.Push(42)
            s.Pop()
        }
    })
}

func BenchmarkLockFreeQueue_Contention(b *testing.B) {
    q := NewLockFreeQueue()

    b.RunParallel(func(pb *testing.PB) {
        for pb.Next() {
            q.Enqueue(42)
            q.Dequeue()
        }
    })
}

func BenchmarkMutexQueue_Contention(b *testing.B) {
    q := NewMutexQueue()

    b.RunParallel(func(pb *testing.PB) {
        for pb.Next() {
            q.Enqueue(42)
            q.Dequeue()
        }
    })
}

func BenchmarkLockFreeHashTable_Get(b *testing.B) {
    ht := NewLockFreeHashTable()

    // Pre-populate
    for i := 0; i < 1000; i++ {
        ht.Put(i, i*10)
    }

    b.ResetTimer()
    b.RunParallel(func(pb *testing.PB) {
        key := 0
        for pb.Next() {
            ht.Get(key % 1000)
            key++
        }
    })
}

func TestLockFreeStackTagged_ABAProof(t *testing.T) {
    s := NewLockFreeStackTagged()

    s.Push(1)
    s.Push(2)

    // Pop and push the same value
    s.Pop()
    s.Push(2)

    // Tagged pointer prevents the ABA problem
    if v, ok := s.Pop(); !ok || v != 2 {
        t.Errorf("Expected 2, got %d", v)
    }
}

Usage Example

package main

import (
    "fmt"
    "lockfree"
    "runtime"
    "sync"
    "time"
)

func main() {
    fmt.Println("=== Lock-Free Stack ===")
    stack := lockfree.NewLockFreeStack()

    stack.Push(10)
    stack.Push(20)
    stack.Push(30)

    for !stack.IsEmpty() {
        if val, ok := stack.Pop(); ok {
            fmt.Printf("Popped: %d\n", val)
        }
    }

    fmt.Println("\n=== Lock-Free Queue ===")
    queue := lockfree.NewLockFreeQueue()

    queue.Enqueue(1)
    queue.Enqueue(2)
    queue.Enqueue(3)

    for !queue.IsEmpty() {
        if val, ok := queue.Dequeue(); ok {
            fmt.Printf("Dequeued: %d\n", val)
        }
    }

    fmt.Println("\n=== Lock-Free Hash Table ===")
    ht := lockfree.NewLockFreeHashTable()

    ht.Put(1, 100)
    ht.Put(2, 200)
    ht.Put(3, 300)

    if val, ok := ht.Get(2); ok {
        fmt.Printf("Get(2) = %d\n", val)
    }

    ht.Delete(2)
    fmt.Printf("After delete, size = %d\n", ht.Size())

    fmt.Println("\n=== Performance Comparison ===")
    compareLockFreeVsMutex()
}

func compareLockFreeVsMutex() {
    iterations := 100000
    numGoroutines := runtime.NumCPU()

    // Lock-free queue
    lfQueue := lockfree.NewLockFreeQueue()
    start := time.Now()
    var wg sync.WaitGroup

    for i := 0; i < numGoroutines; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for j := 0; j < iterations; j++ {
                lfQueue.Enqueue(j)
                lfQueue.Dequeue()
            }
        }()
    }

    wg.Wait()
    lockFreeTime := time.Since(start)

    // Mutex queue
    mQueue := lockfree.NewMutexQueue()
    start = time.Now()

    for i := 0; i < numGoroutines; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for j := 0; j < iterations; j++ {
                mQueue.Enqueue(j)
                mQueue.Dequeue()
            }
        }()
    }

    wg.Wait()
    mutexTime := time.Since(start)

    fmt.Printf("Lock-free queue: %v\n", lockFreeTime)
    fmt.Printf("Mutex queue: %v\n", mutexTime)
    fmt.Printf("Speedup: %.2fx\n", float64(mutexTime)/float64(lockFreeTime))
}

Testing Your Solution

Test these scenarios:

  1. Correctness: Basic push/pop, enqueue/dequeue operations
  2. FIFO/LIFO Order: Verify queue/stack ordering
  3. Concurrent Access: Multiple threads accessing simultaneously
  4. Empty Data Structure: Pop/dequeue from empty structure
  5. Memory Safety: No data races
  6. ABA Problem: Verify tagged pointers prevent ABA
  7. Performance: Compare lock-free vs mutex-based
  8. Scalability: Test with 1, 2, 4, 8, 16 threads

Verification Checklist:

  • No data races detected by Go race detector
  • Correct FIFO/LIFO ordering maintained
  • All pushed/enqueued items are retrieved
  • Handles empty data structure gracefully
  • Performance scales with number of cores
  • Lock-free faster than mutex under contention
  • Memory is properly reclaimed
  • ABA problem handled correctly

Bonus Challenges

  1. Wait-Free Queue: Implement wait-free bounded queue
  2. Lock-Free Deque: Double-ended queue with lock-free operations
  3. Lock-Free Priority Queue: Heap-based priority queue
  4. Lock-Free Skip List: Sorted concurrent skip list
  5. Hazard Pointers: Implement proper hazard pointer system
  6. Elimination Stack: Stack with elimination array for better scaling
  7. Lock-Free B-Tree: Concurrent B-tree index
  8. Transactional Memory: Software transactional memory
  9. Universal Construction: Transform sequential to concurrent
  10. Formal Verification: Prove correctness with TLA+ or Spin

Key Takeaways

  • Lock-free algorithms scale better than locks under high contention
  • CAS is the fundamental primitive for lock-free programming
  • ABA problem must be addressed with tagged pointers or versioning
  • Memory reclamation is challenging - use hazard pointers or epochs
  • Not all algorithms have efficient lock-free versions - sometimes locks are better
  • Memory ordering is critical - use proper barriers
  • Testing is harder - race conditions are subtle
  • Lock-free doesn't mean wait-free - some threads may retry many times
  • Profile before optimizing - lock-free isn't always faster

Lock-free programming is a powerful technique for high-performance concurrent systems, but it requires deep understanding of memory models, atomics, and concurrent algorithms.
