This implements an optimised mutex fastpath for powerpc, making use of
acquire and release barrier semantics. This takes the mutex
lock+unlock benchmark from 203 to 173 cycles on a G5.
Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Paul Mackerras <paulus@samba.org>