    arm64: spinlock: serialise spin_unlock_wait against concurrent lockers · d86b8da0
    Will Deacon authored
    Boqun Feng reported a rather nasty ordering issue with spin_unlock_wait
    on architectures implementing spin_lock with LL/SC sequences and acquire
    semantics:
     | CPU 1                   CPU 2                     CPU 3
     | ==================      ====================      ==============
     |                                                   spin_unlock(&lock);
     |                         spin_lock(&lock):
     |                           r1 = *lock; // r1 == 0;
     |                         o = READ_ONCE(object); // reordered here
     | object = NULL;
     | smp_mb();
     | spin_unlock_wait(&lock);
     |                           *lock = 1;
     | smp_mb();
     | o->dead = true;
     |                         if (o) // true
     |                           BUG_ON(o->dead); // true!!
    The crux of the problem is that spin_unlock_wait(&lock) can return on
    CPU 1 whilst CPU 2 is in the process of taking the lock. This can be
    resolved by upgrading spin_unlock_wait to a LOCK operation, forcing it
    to serialise against a concurrent locker and giving it acquire semantics
    in the process (although it is not at all clear whether this is needed -
    different callers seem to assume different things about the barrier
    semantics and architectures are similarly disjoint in their
    implementations of the macro).
    This patch implements spin_unlock_wait using an LL/SC sequence with
    acquire semantics on arm64. For v8.1 systems with the LSE atomics, the
    exclusive writeback is omitted, since the spin_lock operation is
    indivisible and no intermediate state can be observed.
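As a rough, portable illustration of the idea (not the arm64 assembly from this patch), the sketch below uses C11 atomics on a hypothetical test-and-set lock where 0 means free and 1 means held. All `sketch_*` names are invented for this example. The compare-exchange that writes the "free" value back stands in for the exclusive writeback: because it is a read-modify-write with acquire semantics, it has to be ordered against a concurrent locker's update of the same word, which a plain load does not guarantee.

```c
#include <stdatomic.h>

/* Hypothetical lock word: 0 = free, 1 = held. Not the arm64 ticket layout. */
typedef struct {
	atomic_uint val;
} sketch_lock_t;

static void sketch_lock(sketch_lock_t *l)
{
	unsigned int expected = 0;

	/* Acquire on success orders the critical section after the lock. */
	while (!atomic_compare_exchange_weak_explicit(&l->val, &expected, 1,
			memory_order_acquire, memory_order_relaxed))
		expected = 0;	/* held or spurious failure: retry from "free" */
}

static void sketch_unlock(sketch_lock_t *l)
{
	atomic_store_explicit(&l->val, 0, memory_order_release);
}

/*
 * Load-only wait: may return while a locker sits between reading the
 * lock word and writing it, which is the window shown in the diagram.
 */
static void sketch_unlock_wait_load_only(sketch_lock_t *l)
{
	while (atomic_load_explicit(&l->val, memory_order_acquire) != 0)
		;
}

/*
 * Serialising wait: spin until the word reads as free, then write the
 * same value back atomically. The write-back leaves the lock free but
 * makes this a read-modify-write on the lock word.
 */
static void sketch_unlock_wait(sketch_lock_t *l)
{
	unsigned int expected = 0;

	while (!atomic_compare_exchange_weak_explicit(&l->val, &expected, 0,
			memory_order_acquire, memory_order_relaxed))
		expected = 0;
}
```

The sketch does not model the LSE case, where the message above notes that the writeback is omitted because the lock operation itself is indivisible.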
    Signed-off-by: Will Deacon <will.deacon@arm.com>
spinlock.h 7.24 KB