Daniel Lohmann wrote:
g++ 4.2.3
Hi,
I have the following code, which uses the new __sync_lock_test_and_set()
builtin:
class Mutex {
    int locked;
public:
    Mutex() {
        locked = 0;
    }
    void lock();
    void unlock();
};

void Mutex::lock() {
    while( __sync_lock_test_and_set( &locked, 1) == 0 )
        ;
}

void Mutex::unlock() {
    __sync_lock_release( &locked );
}
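A side note on the loop condition above: __sync_lock_test_and_set() returns the previous contents of the variable, so a spin lock normally keeps spinning while that previous value is non-zero (the lock was already held), not while it is zero. A minimal sketch of that variant follows; the SpinLock class and its try_lock() helper are hypothetical names used only for illustration, not the poster's code:

```cpp
// Hypothetical SpinLock (illustration only, not the Mutex class above).
// It uses the same GCC builtins, but spins while the previous value
// was non-zero, i.e. while somebody else holds the lock.
class SpinLock {
    int locked_;
public:
    SpinLock() : locked_(0) {}

    // Atomically store 1 and look at the old value.
    // Returns true if the lock was free and now belongs to the caller.
    bool try_lock() {
        return __sync_lock_test_and_set(&locked_, 1) == 0;
    }

    // Busy-wait until try_lock() succeeds.
    void lock() {
        while (!try_lock())
            ;
    }

    // Store 0 with release semantics.
    void unlock() {
        __sync_lock_release(&locked_);
    }
};
```

With this condition the generated loop would branch back on a non-zero result instead of a zero one; the question about the final repz ret is unaffected by that.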
After compiling with -O3 -fomit-frame-pointer, the resulting code for
the Mutex::lock() method looks as follows:
00000010 <Mutex::lock()>:
  10:   8b 54 24 04             mov    0x4(%esp),%edx
  14:   b8 01 00 00 00          mov    $0x1,%eax
  19:   87 02                   xchg   %eax,(%edx)
  1b:   85 c0                   test   %eax,%eax
  1d:   74 f5                   je     14 <Mutex::lock()+0x4>
  1f:   f3 c3                   repz ret
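To make the last two lines of the listing easier to read, here is a small sketch of the arithmetic behind them. It assumes the usual x86 encoding rules: a short jump is two bytes whose displacement is relative to the next instruction, 0xf3 is the REP/REPZ prefix byte, and 0xc3 is a near RET. The helper names are made up for this example:

```cpp
#include <cstdint>

// Target of a two-byte short jump (opcode + rel8): address of the next
// instruction plus the sign-extended 8-bit displacement.
std::uint32_t short_jump_target(std::uint32_t insn_addr, std::uint8_t rel8) {
    return insn_addr + 2 + static_cast<std::int8_t>(rel8);
}

// True if the two bytes are a REP/REPZ prefix (0xf3) followed by a
// near RET (0xc3), i.e. the "repz ret" shown by the disassembler.
bool is_rep_ret(std::uint8_t b0, std::uint8_t b1) {
    return b0 == 0xf3 && b1 == 0xc3;
}
```

For the je at 0x1d with displacement byte 0xf5 (that is, -11), short_jump_target(0x1d, 0xf5) gives 0x14, which matches the branch back to the mov $0x1,%eax. The bytes f3 c3 at 0x1f are simply a one-byte prefix in front of an ordinary one-byte ret.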
I am wondering about the repz prefix before the ret. A "repeat RET while
the Z flag is set" obviously does not make sense from a functional point
of view. So I assume that it is actually a side effect of the repz
prefix that is exploited here to guarantee "something" with respect to
instruction reordering, fetching, caching, or ...?
So what exactly is this "something"?
And what exactly could happen under which circumstances if we don't use it?
Google does not reveal much. If one googles for "repz ret" one gets a
*load* of hits -- but just because of the fact that "ret" appears
immediately after "repz" in the alphabetically sorted list of x86
instructions :-)
If you grep the GCC source, you'll find:
;; Used by x86_machine_dependent_reorg to avoid penalty on single byte RET
;; instruction Athlon and K8 have.
(define_insn "return_internal_long"
  [(return)
   (unspec [(const_int 0)] UNSPEC_REP)]
  "reload_completed"
  "rep\;ret"
  [(set_attr "length" "1")
   (set_attr "length_immediate" "0")
   (set_attr "prefix_rep" "1")
   (set_attr "modrm" "0")])