On 25.02.2008, at 16:03, Andrew Haley wrote:
Daniel Lohmann wrote:
g++ 4.2.3
Hi,
I have the following code, which uses the new
__sync_lock_test_and_set() builtin:
class Mutex {
    int locked;
public:
    Mutex() {
        locked = 0;
    }
    void lock();
    void unlock();
};

void Mutex::lock() {
    while( __sync_lock_test_and_set( &locked, 1) == 0 )
        ;
}

void Mutex::unlock() {
    __sync_lock_release( &locked );
}
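
For reference, here is a minimal self-contained sketch of a spin lock built on the same two builtins. It assumes, as the GCC documentation describes, that __sync_lock_test_and_set() returns the previous contents of the variable. The sketch therefore spins while the previous value was nonzero, that is, while someone else already held the lock. The names SpinLock and try_lock() are made up for illustration and are not part of the code above.

```cpp
// Sketch of a spin lock on top of the GCC __sync builtins.
// SpinLock and try_lock() are illustrative names only.
class SpinLock {
    volatile int locked;
public:
    SpinLock() : locked(0) {}

    // __sync_lock_test_and_set() stores 1 and returns the PREVIOUS value,
    // so a result of 0 means this caller has just taken the lock.
    bool try_lock() {
        return __sync_lock_test_and_set(&locked, 1) == 0;
    }

    // Spin until an attempt finds the lock free.
    void lock() {
        while (!try_lock())
            ;
    }

    // Write 0 back with release semantics.
    void unlock() {
        __sync_lock_release(&locked);
    }
};
```

A caller would bracket its critical section with lock() and unlock(). try_lock() is split out only so that the return-value convention of the builtin is easy to check in isolation.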
After compiling with -O3 -fomit-frame-pointer, the resulting code
for the Mutex::lock() method looks as follows:
00000010 <Mutex::lock()>:
  10:   8b 54 24 04             mov    0x4(%esp),%edx
  14:   b8 01 00 00 00          mov    $0x1,%eax
  19:   87 02                   xchg   %eax,(%edx)
  1b:   85 c0                   test   %eax,%eax
  1d:   74 f5                   je     14 <Mutex::lock()+0x4>
  1f:   f3 c3                   repz ret
I am wondering about the repz prefix before the ret. A "repeat RET
while the zero flag is set" obviously does not make sense from a
functional point of view. So I assume that a side effect of the
repz prefix is being exploited here to guarantee "something" with
respect to instruction reordering, fetching, caching, or ...?
So what exactly is this "something"?
And what exactly could happen under which circumstances if we don't
use it?
Google does not reveal much. Searching for "repz ret" returns a
*load* of hits, but only because "ret" appears immediately after
"repz" in alphabetically sorted lists of x86 instructions :-)
If you grep the gcc source you'll find
;; Used by x86_machine_dependent_reorg to avoid penalty on single byte RET
;; instruction Athlon and K8 have.
(define_insn "return_internal_long"
  [(return)
   (unspec [(const_int 0)] UNSPEC_REP)]
  "reload_completed"
  "rep\;ret"
  [(set_attr "length" "1")
   (set_attr "length_immediate" "0")
   (set_attr "prefix_rep" "1")
   (set_attr "modrm" "0")])
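
In other words, the pattern emits the ordinary ret opcode (c3) with a rep prefix byte (f3) in front of it, which is the two-byte "f3 c3" at offset 1f in the listing above. As far as I know the prefix does not change what the return does. It only makes the instruction longer than one byte. A small sketch, using the bytes from the disassembly and a made-up helper name ends_with_rep_ret(), shows how to tell the two forms apart in a code buffer:

```cpp
#include <cstddef>

// Bytes of Mutex::lock() exactly as shown in the disassembly (offsets 0x10..0x20).
static const unsigned char lock_code[] = {
    0x8b, 0x54, 0x24, 0x04,        // mov  0x4(%esp),%edx
    0xb8, 0x01, 0x00, 0x00, 0x00,  // mov  $0x1,%eax
    0x87, 0x02,                    // xchg %eax,(%edx)
    0x85, 0xc0,                    // test %eax,%eax
    0x74, 0xf5,                    // je   14
    0xf3, 0xc3                     // repz ret
};

// Hypothetical helper: true if the buffer ends in the two-byte
// "rep ret" form (f3 c3) rather than in a bare one-byte ret (c3).
bool ends_with_rep_ret(const unsigned char* code, std::size_t len) {
    return len >= 2 && code[len - 2] == 0xf3 && code[len - 1] == 0xc3;
}
```

Calling ends_with_rep_ret(lock_code, sizeof lock_code) on the bytes above reports the prefixed form. A function ending in a plain c3 would not match.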
Oh, it is something *that* machine specific...
Thanks Andrew!