Notes on synchronization in the Linux Kernel
23 Jul 2015

Just a small collection of notes on synchronization mechanisms in the kernel. The underlying reference is the Linux Kernel Development book by Robert Love.
Atomic Operations
- Indivisible operations - executed without interruption
- Keep state consistent across threads of execution
- Some architectural support
- Atomic Integer Operations
  - Use the special type `atomic_t` for ints
    - prevents the compiler from optimizing the value
    - hides architecture specific implementation differences
    - defined in `linux/types.h`
  - Developers need to limit the usage of `atomic_t` to `24 bits` to not break on SPARC
    - on SPARC a lock is embedded in the lower 8 bits (this may no longer be relevant)
  - atomic operations are in `asm/atomic.h`
  - defining and using an `atomic_t` - see the sketch below
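A minimal sketch of defining and using an `atomic_t`; the counter name and values are just for illustration:

```c
#include <linux/kernel.h>
#include <asm/atomic.h>      /* linux/atomic.h in later kernels */

/* statically defined and initialized atomic counter */
static atomic_t my_count = ATOMIC_INIT(0);

static void atomic_counter_example(void)
{
	atomic_set(&my_count, 4);          /* my_count = 4, atomically */
	atomic_inc(&my_count);             /* my_count = 5 */
	atomic_add(3, &my_count);          /* my_count = 8 */
	printk("count is %d\n", atomic_read(&my_count));
}
```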
  - Some atomic operations:
    - using `atomic_add`, `atomic_inc`, `atomic_dec` is lighter weight than complex locking
    - `int atomic_dec_and_test(atomic_t *v)`
      - decrements `v`
      - returns true if `v` is now zero, false otherwise (see the refcount sketch below)
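For example, a reference-count release path can rely on `atomic_dec_and_test()` instead of wrapping a decrement-and-compare in a lock. A minimal sketch; `struct my_obj` and its helpers are hypothetical:

```c
#include <linux/slab.h>
#include <asm/atomic.h>

struct my_obj {
	atomic_t refcount;
	/* ... payload ... */
};

static struct my_obj *my_obj_alloc(void)
{
	struct my_obj *obj = kmalloc(sizeof(*obj), GFP_KERNEL);

	if (obj)
		atomic_set(&obj->refcount, 1);   /* initial reference */
	return obj;
}

static void my_obj_put(struct my_obj *obj)
{
	/*
	 * atomic_dec_and_test() decrements and returns true only when the
	 * count reaches zero, so exactly one caller frees the object,
	 * with no explicit lock around the decrement-and-compare.
	 */
	if (atomic_dec_and_test(&obj->refcount))
		kfree(obj);
}
```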
  - List of Atomic Operations
  - Implemented as inline functions & inline assembly
  - word-sized reads are always atomic
  - on most architectures a read/write of a single byte is atomic (no two simultaneous operations)
  - a read is consistent (ordered either before or after a concurrent write)
  - Thus on most architectures we get consistent atomicity, but consistent atomicity does not guarantee consistent ordering (use memory barriers to enforce ordering)
  - 64-bit architectures
    - the size of `atomic_t` cannot differ across architectures, so `atomic_t` is consistently 32-bit everywhere
    - `atomic64_t` is used for 64-bit atomic operations on 64-bit machines
    - nearly all of the atomic operations listed above have a corresponding 64-bit (`atomic64_*`) version
- Atomic Bit Operations
  - architecture specific code, see `asm/bitops.h` for details
  - operate on generic pointers
  - on 32-bit machines
    - bit 31 is the most significant bit, bit 0 the least significant bit
  - Non-atomic versions of the bit operations are provided; their names are prefixed with `__`
    - `test_bit()` is the atomic version, `__test_bit()` the non-atomic version
    - when locking is not needed, use the non-atomic versions for performance
  - Example usage setting and clearing bits atomically is sketched below
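A small sketch of the atomic bit operations on a plain `unsigned long`; the bit numbers are arbitrary:

```c
#include <linux/kernel.h>
#include <asm/bitops.h>

static unsigned long flags_word;   /* word whose bits are manipulated atomically */

static void bitops_example(void)
{
	set_bit(0, &flags_word);            /* atomically set bit 0 */
	set_bit(7, &flags_word);            /* atomically set bit 7 */
	clear_bit(0, &flags_word);          /* atomically clear bit 0 */
	change_bit(7, &flags_word);         /* atomically flip bit 7 */

	/* atomically set bit 3 and return its previous value */
	if (test_and_set_bit(3, &flags_word))
		printk("bit 3 was already set\n");

	/* non-atomic variant: fine when nothing else can touch flags_word */
	__set_bit(4, &flags_word);
}
```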
Spin Locks
- Allow for creation of critical regions in code
- spinlock can be held by at most one thread at a time
- contending threads busy loop waiting for the lock to unlock
- if lock is uncontended acquire lock and proceed instantly
- busy looping "wastes" processor time - thus must minimize the length of time a spinlock is held
- advantage: doesn't result in context switches
- can be used in contexts which do not support blocking (interrupt context) or (preemption disabled?)
- Spin Lock Methods
  - Defined in `linux/spinlock.h`
- On uniprocessor systems spinlocks compile away
  - "some people" say they have side effects and are not no-ops
    - need to verify
  - they act as markers to enable/disable kernel preemption
- Spinlocks are not recursive
- trying to acquire the same lock again will hang the thread of execution
- Using a spinlock in interrupt context requires disabling `local interrupts` on this processor
  - if interrupts are not disabled, an interrupt handler may spin on the lock this processor already holds and deadlock
  - Thus need to use the methods which also disable interrupts
    - `spin_lock_irqsave`
      - saves the current state of interrupts
      - disables interrupts locally
      - obtains the spinlock
    - `spin_unlock_irqrestore`
      - unlocks the given lock
      - returns interrupts to their `previous state`
        - This is key: if interrupts were disabled on entry then we should keep them disabled
      - allows nesting of different spinlocks and interrupt-disabling code
- Locks should be associated with the data structures they lock
  - make the association of lock with data clearer through naming conventions (see the sketch below)
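A minimal sketch putting these pieces together: a hypothetical queue structure whose lock is named after the data it protects, initialized with `spin_lock_init()`, and taken with the `irqsave` variant on the assumption that the queue is also touched from interrupt context:

```c
#include <linux/spinlock.h>
#include <linux/list.h>

struct my_queue {
	struct list_head items;
	spinlock_t       items_lock;   /* protects items */
};

static void my_queue_init(struct my_queue *q)
{
	INIT_LIST_HEAD(&q->items);
	spin_lock_init(&q->items_lock);   /* dynamic initialization */
}

static void my_queue_add(struct my_queue *q, struct list_head *entry)
{
	unsigned long flags;

	/* save the current interrupt state, disable local interrupts, take the lock */
	spin_lock_irqsave(&q->items_lock, flags);
	list_add_tail(entry, &q->items);
	/* drop the lock and restore interrupts to their previous state */
	spin_unlock_irqrestore(&q->items_lock, flags);
}
```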
- Kernel Configs to help Spin Lock Debugging
  - `CONFIG_DEBUG_SPINLOCK`
    - detects use of uninitialized spinlocks
    - detects unlocking a lock which was never locked
  - `CONFIG_DEBUG_LOCK_ALLOC`
    - debugging lock lifecycles?
- Dynamic Allocation of Spinlock
  - `spin_lock_init()` initializes a dynamically created spinlock, i.e. one referenced through a pointer
  - `spin_trylock()` - tries to obtain the lock; returns 0 on failure, non-zero on success
- Spin Locks and Bottom Halves
  - Bottom halves can preempt process context
  - two tasklets of the same type don't run concurrently
  - a tasklet never preempts another tasklet on the same processor
  - interrupt context can preempt both bottom halves and process context
  - Process context must disable bottom-half processing (e.g. `spin_lock_bh()`) around locks shared with a bottom half - see the sketch below
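A sketch of process-context code sharing a counter with a bottom half; `spin_lock_bh()` takes the lock and disables bottom-half processing so a tasklet on this CPU cannot preempt the holder and spin on the same lock. The counter is hypothetical:

```c
#include <linux/spinlock.h>

static DEFINE_SPINLOCK(stats_lock);   /* protects packet_count, also used by a tasklet */
static unsigned long packet_count;

static void bump_count_from_process_context(void)
{
	spin_lock_bh(&stats_lock);     /* lock + disable bottom halves locally */
	packet_count++;
	spin_unlock_bh(&stats_lock);   /* unlock + re-enable bottom halves */
}
```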
- Reader-Writer Spin Locks
  - Deal with the asymmetrical nature of reading and writing
  - Writing demands mutual exclusion
    - no reading permitted
    - no other writing permitted
  - Reading requires write exclusion
    - no writers permitted
    - multiple readers ok
  - Reader-Writer spinlocks increase throughput by allowing multiple readers
  - Provide separate variants of read and write locks (see the sketch below)
    - one or more readers can hold the read lock
    - only one writer, and no readers, can hold the write lock
  - alternative terminology: (reader/writer), (shared/exclusive), (concurrent/exclusive)
  - read locks cannot later be upgraded to write locks without first releasing the read lock and then asking for a write lock
  - reader locks are recursive: the same thread can re-obtain the same read lock without deadlock
    - Thus if interrupt handlers only ever take the read lock, there is no need to disable interrupts for readers
      - use `read_lock()` instead of `read_lock_irqsave()`
      - writers must still use `write_lock_irqsave()` where interrupts are involved
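A minimal reader-writer spinlock sketch; the protected configuration value is hypothetical:

```c
#include <linux/spinlock.h>

static DEFINE_RWLOCK(cfg_lock);   /* protects cfg_value */
static int cfg_value;

static int cfg_read(void)
{
	int v;

	read_lock(&cfg_lock);      /* shared: many readers may hold this concurrently */
	v = cfg_value;
	read_unlock(&cfg_lock);
	return v;
}

static void cfg_write(int v)
{
	write_lock(&cfg_lock);     /* exclusive: waits until no readers or writers remain */
	cfg_value = v;
	write_unlock(&cfg_lock);
}
```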
  - Need to think about an appropriate priority for writers, else lots of readers may starve a writer
- Spinlocks are for small timescales (nanoseconds?); for larger timescales blocking semaphores are advisable instead
Semaphores
- For larger timescales (ms scale), where putting the process context to sleep is advisable
- Sets the processor free to execute other code
- Should only be obtained in `process context`; cannot be used in `interrupt context`, since interrupt context is not schedulable
- Cannot hold a `spinlock` while acquiring a semaphore, since you may sleep and cause a deadlock on the `spinlock` (how is this enforced?)
- Semaphores don't disable kernel preemption, so they don't affect scheduling latency as much as spinlocks
- Counting vs Binary Semaphores
  - Unlike spinlocks, semaphores allow an arbitrary number of simultaneous lock holders
  - the `usage count` or `count` is the number of permissible simultaneous lock holders
  - Binary semaphore/mutex - enforces mutual exclusion - `usage count` is one
  - Counting semaphore - `usage count` greater than one
  - While `count` is greater than `zero`, acquiring the semaphore succeeds
- Declared in `asm/semaphore.h`
- `struct semaphore` is the key structure
* `init_MUTEX(sem)` - initialize a dynamically created semaphore
* To acquire, use the `down_*` methods.
* To release use `up_*` methods
* `down_interruptible()`
* on failure to acquire the lock, puts the calling process in `TASK_INTERRUPTIBLE` and sleeps
* If the `process` receives a `signal` it wakes up and fails by returning `-EINTR`
* `down()`
* place task in `TASK_UNINTERRUPTIBLE`, then sleep
* process will not respond to signals
* prefer `down_interruptible()` over `down()`
* `down_trylock()`
* acquire semaphore without blocking
* if the lock is already held, returns non-zero without acquiring it
- Semaphore Methods
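A minimal sketch of binary-semaphore style usage with an initial count of 1; `sema_init()` is the generic initializer (the `init_MUTEX()` mentioned above is the older equivalent for a count of 1):

```c
#include <linux/errno.h>
#include <linux/semaphore.h>   /* asm/semaphore.h in older kernels */

static struct semaphore work_sem;

static void work_setup(void)
{
	sema_init(&work_sem, 1);   /* count of 1: binary semaphore */
}

static int do_work(void)
{
	/* may sleep; returns -EINTR if a signal arrives while waiting */
	if (down_interruptible(&work_sem))
		return -EINTR;

	/* ... critical region; sleeping here is allowed ... */

	up(&work_sem);             /* release the semaphore */
	return 0;
}
```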
Reader-Writer Semaphores
- Similar to `reader-writer` spinlocks; declared in `linux/rwsem.h`
- Static Creation: `static DECLARE_RWSEM(name);`
- Dynamic Creation: `init_rwsem(struct rw_semaphore *sem)`
- Enforce mutual exclusion among writers with a usage `count` of 1
- Allow simultaneous reads
- All readers and writers use uninterruptible sleep
- For non-blocking lock acquisition use `down_read_trylock()` and `down_write_trylock()`
  - return non-zero if the lock was acquired
  - return zero if the lock cannot be acquired
- `downgrade_write()` converts a `write` lock to a read lock
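A minimal reader-writer semaphore sketch; the protected data is hypothetical:

```c
#include <linux/rwsem.h>

static DECLARE_RWSEM(cfg_rwsem);   /* statically created rw_semaphore */
static int cfg_data;

static int cfg_get(void)
{
	int v;

	down_read(&cfg_rwsem);     /* shared; sleeps uninterruptibly if a writer holds it */
	v = cfg_data;
	up_read(&cfg_rwsem);
	return v;
}

static void cfg_set(int v)
{
	down_write(&cfg_rwsem);    /* exclusive */
	cfg_data = v;
	/* downgrade_write(&cfg_rwsem) could convert this to a read lock here */
	up_write(&cfg_rwsem);
}
```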
Mutexes
- `struct mutex` - used to simplify the common use case of semaphores (a count of one)
- Static Definition: `DEFINE_MUTEX(name)`
- Dynamic Definition: `mutex_init(&mutex)`
- Locking and Unlocking the Mutex: `mutex_lock()` / `mutex_unlock()` (see the sketch below)
- Does not require usage counts
- Constraints imposed on mutex usage
  - usage count is always `one` - a single task holds the mutex at a time
  - only the original locker is allowed to unlock the mutex
  - recursive locks and unlocks are not allowed
  - a process cannot exit while holding a mutex
  - a mutex cannot be acquired by an interrupt handler or bottom half
  - Constraints checked via `CONFIG_DEBUG_MUTEXES`
- Usage guidelines
  - start with a mutex; use a semaphore only if the mutex constraints are not satisfiable
- When to use Spinlocks vs (Semaphore, Mutex)?
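A minimal mutex sketch; the cached state is hypothetical:

```c
#include <linux/mutex.h>

static DEFINE_MUTEX(cache_mutex);   /* statically defined mutex */
static int cache_state;

static void cache_update(int v)
{
	mutex_lock(&cache_mutex);     /* may sleep; not usable from interrupt context */
	cache_state = v;
	mutex_unlock(&cache_mutex);   /* must be called by the same task that locked */
}
```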
Completion Variables
- Event signalling between tasks
- One task waits for signal, other task signals on completion
- On completion wake up all sleeping tasks
- Ex. of completion variable usage: waking up the parent when a child exits
  - See `kernel/sched.c` and `kernel/fork.c`
- Static: `DECLARE_COMPLETION(mr_comp);`
- Dynamic: `init_completion()`
- To wait for completion call `wait_for_completion()`
- To signal completion call `complete()`
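A minimal completion sketch: one side waits, the other signals; the two functions are assumed to run in different threads of execution:

```c
#include <linux/completion.h>

static DECLARE_COMPLETION(setup_done);   /* statically created completion */

static void consumer_thread(void)
{
	/* sleeps (uninterruptibly) until setup_done is signalled */
	wait_for_completion(&setup_done);
	/* ... proceed, setup has finished ... */
}

static void producer_thread(void)
{
	/* ... perform the setup work ... */
	complete(&setup_done);   /* wake up tasks waiting on setup_done */
}
```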
BKL: The big kernel lock
- Spinlock used to ease transition to SMP
- global spinlock
- Lock dropped on sleep
- Is a recursive lock: multiple acquisitions by the same process are allowed
- New users are forbidden
- Transition away from it is ongoing
- `lock_kernel()` acquires the lock
- `unlock_kernel()` releases it; releases are counted to match recursive acquisitions
- `kernel_locked()` returns non-zero if the lock is held, `0` if it is not
- Declared in `linux/smp_lock.h`
- problem with a single lock - difficult to determine which two parties actually need to synchronize with each other
Sequential Locking
- Simple mechanism for reading/writing shared data
- maintains a sequence counter
- when the sequence counter is odd, a write is taking place
- check that the sequence counter is even prior to and after the read to be sure no write was underway
- Write lock always succeeds as long as there are no other writers
- Favors writers over readers
- Ideal cases
  - many readers
  - few writers
  - readers never starve writers
  - simple data
- Ex: the `seqlock` protecting `jiffies` (a sketch of the pattern follows below)
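A minimal seqlock sketch in the spirit of the `jiffies` reader loop; the 64-bit counter here is hypothetical:

```c
#include <linux/types.h>
#include <linux/seqlock.h>

static DEFINE_SEQLOCK(counter_lock);
static u64 counter;

static u64 counter_read(void)
{
	u64 v;
	unsigned int seq;

	do {
		seq = read_seqbegin(&counter_lock);       /* snapshot the sequence count */
		v = counter;
	} while (read_seqretry(&counter_lock, seq));      /* retry if a write intervened */

	return v;
}

static void counter_bump(void)
{
	write_seqlock(&counter_lock);    /* sequence count becomes odd */
	counter++;
	write_sequnlock(&counter_lock);  /* sequence count becomes even again */
}
```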
Preemption Disabling
- kernel is preemptive
- spin locks disable preemption for the duration they are held
- during modification of per-processor data, disable preemption to prevent corrupting in-flight data
- `preempt_disable` and `preempt_enable` can be nested
  - the methods maintain a count of the number of times preemption has been disabled
  - If the count is `0` the kernel is preemptible
- Alternatively, calling `get_cpu()` to obtain an index into per-processor data will disable kernel preemption; `put_cpu()` will re-enable kernel preemption (sketch below)
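A sketch of protecting per-processor data by disabling preemption; `get_cpu()` returns the current processor index and disables preemption until `put_cpu()`. The per-CPU array and its size are hypothetical:

```c
#include <linux/preempt.h>
#include <linux/smp.h>

#define MY_MAX_CPUS 64                          /* hypothetical upper bound */
static unsigned long per_cpu_hits[MY_MAX_CPUS];

static void bump_this_cpu(void)
{
	int cpu;

	cpu = get_cpu();        /* disables preemption, returns current CPU index */
	per_cpu_hits[cpu]++;    /* safe: the task cannot be migrated off this CPU */
	put_cpu();              /* re-enables preemption */
}

static void touch_per_cpu_data(void)
{
	preempt_disable();      /* nestable: increments the preemption count */
	/* ... manipulate per-processor data ... */
	preempt_enable();       /* kernel is preemptible again once the count is 0 */
}
```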
Ordering and Barriers
- Ensuring ordering of memory loads and stores
- Compilers and processors like to reorder loads and stores
- Barriers prevent the compiler and the processor from reordering instructions
- `x86` does not do out-of-order stores; other processors might
- Reorderings must respect data dependencies:
  - e.g. `a = 1; b = a;` cannot be reordered, since there is a data dependency of `b` on `a`
- static reordering: by the compiler
  - reflected in the object code
- dynamic reordering: by the processor
- `rmb()` - a read memory barrier
  - loads before the `rmb()` will never be reordered with loads after the call
- `read_barrier_depends()`
  - a read barrier only for loads where the subsequent load depends on the previous load
  - much quicker on some architectures
  - on some architectures it becomes a `noop`
- `wmb()` - a write memory barrier
  - stores before the `wmb()` will never be reordered with stores after the call
  - `x86` does not reorder stores
- `mb()` - a read/write memory barrier
  - loads and stores before the `mb()` will never be reordered with loads and stores after the call
- `smp_rmb`, `smp_wmb` and `smp_read_barrier_depends`
  - on SMP kernels these are the usual memory barriers; on uniprocessor kernels they compile down to just compiler barriers
  - compiler barriers are nearly free - they only prevent static (compiler) rearrangement
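A sketch of the classic pairing of a write barrier with a read barrier; the producer and consumer are assumed to run on different processors, and the point is that if the consumer observes the new `b`, the barriers guarantee it also observes the new `a`:

```c
#include <asm/barrier.h>   /* asm/system.h in older kernels */

static int a, b;

/* runs on one CPU */
static void producer(void)
{
	a = 3;
	smp_wmb();   /* the store to a is ordered before the store to b */
	b = 4;
}

/* runs on another CPU */
static void consumer(int *c, int *d)
{
	*c = b;
	smp_rmb();   /* the load of a is ordered after the load of b */
	*d = a;      /* if *c == 4 then *d is guaranteed to be 3 */
}
```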