On 5/21/21 7:44 AM, Florian Weimer wrote:
> * Dave Hansen via Libc-alpha:
>> Our system calls are *REALLY* fast. We can even do a vsyscall for this
>> if we want to get the overhead down near zero. Userspace can also cache
>> the "I did the prctl()" state in thread-local storage if it wants to
>> avoid the syscall.
> Why can't userspace look at XCR0 to make the decision?

The thing we're trying to avoid is a #NM exception from XFD (the new
first-use detection feature) that occurs on the first use of AMX. XCR0
will have XCR0[AMX]=1, even if XFD is "armed" and ready to generate the #NM.
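
For illustration, the thread-local caching mentioned in the quoted text could look roughly like the sketch below. This is only a sketch under assumptions: request_amx_permission() is a hypothetical stand-in for whatever prctl()-style request is eventually agreed on (it is stubbed here so the example builds anywhere), and request_count exists only to show how often the "syscall" would actually run.

```c
#include <stdbool.h>

/* Illustration only: counts how many times the (simulated) syscall runs. */
static int request_count;

/*
 * Hypothetical stand-in for the proposed "let this thread use AMX" request.
 * The real interface is an assumption here; this stub just pretends the
 * kernel granted the request. Returns 0 on success.
 */
static int request_amx_permission(void)
{
    request_count++;
    return 0;
}

/* Per-thread cache of the "I did the prctl()" state. */
static _Thread_local bool amx_permitted;

/*
 * Returns true if the calling thread may use AMX. Only the first call on
 * each thread reaches the syscall stand-in; later calls read the cached flag.
 */
bool ensure_amx_permitted(void)
{
    if (amx_permitted)
        return true;
    if (request_amx_permission() != 0)
        return false;
    amx_permitted = true;
    return true;
}
```

A library would call ensure_amx_permitted() before its first AMX instruction. The flag records what the thread asked for, which is exactly the information XCR0 cannot provide, since XCR0[AMX] reads as 1 whether or not XFD is still armed.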