mutex: speed up generic mutex implementations
author Nick Piggin <npiggin@suse.de>
Tue, 21 Oct 2008 08:59:15 +0000 (10:59 +0200)
committer Linus Torvalds <torvalds@linux-foundation.org>
Thu, 23 Oct 2008 16:18:20 +0000 (09:18 -0700)
commit a8ddac7e53e89cb877965097d05adfeb1c91def3
tree db4ee686e50f7fb57b0cef20e0a8e7f06151e317
parent 5a439c565799cb8d290d71ce375e86be64d43a4b

- Atomic operations that both modify the variable and return a value imply
  full SMP memory barriers before and after the memory operations involved
  (a failing atomic_cmpxchg, atomic_add_unless, etc. does not imply a barrier
  because it does not modify the target). See Documentation/atomic_ops.txt.
  So remove the extra barriers and branches.

- All architectures support atomic_cmpxchg. This has no relation to
  __HAVE_ARCH_CMPXCHG, so we can take the atomic_cmpxchg path unconditionally.
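
The two points above can be illustrated with a minimal userspace sketch.
It is not the kernel code: it assumes C11 <stdatomic.h> in place of the
kernel's atomic_t API, and the names sketch_mutex, sketch_slowpath,
sketch_lock_fastpath, sketch_unlock_fastpath and sketch_trylock are
hypothetical. The counter convention (1 = unlocked, 0 = locked,
negative = contended) is likewise an assumption made for the sketch. The
default sequentially consistent read-modify-write operations stand in for
the kernel's value-returning atomics, so no separate barrier calls appear
around them, and the trylock uses compare-and-exchange unconditionally.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace analogue of a counter-based mutex.
 * Assumed convention: 1 = unlocked, 0 = locked, negative = contended. */
typedef struct {
    atomic_int count;
} sketch_mutex;

/* Stand-in for the contended slow path; here it only counts calls. */
static int slowpath_calls;

static void sketch_slowpath(sketch_mutex *m)
{
    (void)m;
    slowpath_calls++;
}

/* Lock fast path: one atomic decrement, then one branch.
 * atomic_fetch_sub returns the old value, so the new value is old - 1.
 * The operation is sequentially consistent, so no extra barrier is
 * written before or after it. */
static void sketch_lock_fastpath(sketch_mutex *m)
{
    if (atomic_fetch_sub(&m->count, 1) - 1 < 0)
        sketch_slowpath(m);
}

/* Unlock fast path: one atomic increment, then one branch.
 * A new value <= 0 means someone contended while the lock was held. */
static void sketch_unlock_fastpath(sketch_mutex *m)
{
    if (atomic_fetch_add(&m->count, 1) + 1 <= 0)
        sketch_slowpath(m);
}

/* Trylock: succeeds only if the counter moves from 1 to 0.
 * A failed compare-and-exchange leaves the counter unmodified. */
static bool sketch_trylock(sketch_mutex *m)
{
    int expected = 1;
    return atomic_compare_exchange_strong(&m->count, &expected, 0);
}
```

In this sketch a second sketch_trylock on a held mutex fails without
touching the counter, while a second sketch_lock_fastpath drives the
counter negative and falls into the slow path.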

This reduces a simple single-threaded fastpath lock+unlock test from 590 cycles
to 203 cycles on a ppc970 system.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
include/asm-generic/mutex-dec.h
include/asm-generic/mutex-xchg.h