* Dave Hansen:

> On 5/21/21 7:44 AM, Florian Weimer wrote:
>> * Dave Hansen via Libc-alpha:
>>> Our system calls are *REALLY* fast.  We can even do a vsyscall for this
>>> if we want to get the overhead down near zero.  Userspace can also cache
>>> the "I did the prctl()" state in thread-local storage if it wants to
>>> avoid the syscall.
>> Why can't userspace look at XCR0 to make the decision?
>
> The thing we're trying to avoid is a #NM exception from XFD (the new
> first-use detection feature) that occurs on the first use of AMX.
> XCR0 will have XCR0[AMX]=1, even if XFD is "armed" and ready to
> generate the #NM.

I see.  So essentially the hardware wants to offer transparent
initialize-on-use, but Linux does not seem to want to implement it this
way.

Is there still a chance to bring the hardware and Linux into alignment?

Thanks,
Florian
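
For illustration only, here is a minimal C sketch of the two points in the exchange above: XCR0 alone cannot tell userspace whether XFD is still armed, and the "I did the prctl()" state can be cached in thread-local storage so the request is made at most once per thread. The names `xcr0_has_amx`, `amx_ensure_enabled`, `amx_request_fn`, and `demo_request` are hypothetical and do not come from the thread. The permission request itself is left as a callback because the thread does not specify the interface, and the XCR0 value is passed in as a parameter rather than read with XGETBV so that the sketch stays portable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* XCR0 state-component bits for AMX (bit 17 = TILECFG, bit 18 = TILEDATA). */
#define XCR0_TILECFG  (UINT64_C(1) << 17)
#define XCR0_TILEDATA (UINT64_C(1) << 18)

/* Hypothetical hook standing in for the permission request discussed in
 * the thread. Returns true if the kernel granted use of AMX. */
typedef bool (*amx_request_fn)(void);

/* Per-thread cache of the "I did the prctl()" state. */
static _Thread_local bool amx_enabled_cached;

/* True if XCR0 advertises both AMX components. This says nothing about
 * XFD: the bits can be set while XFD is still armed to raise #NM. */
static bool xcr0_has_amx(uint64_t xcr0)
{
    const uint64_t mask = XCR0_TILECFG | XCR0_TILEDATA;
    return (xcr0 & mask) == mask;
}

/* Returns true once AMX may be used on the calling thread. The request
 * callback runs only until it first succeeds; later calls on the same
 * thread are answered from the thread-local cache. */
static bool amx_ensure_enabled(uint64_t xcr0, amx_request_fn request)
{
    assert(request);

    if (amx_enabled_cached)
        return true;            /* fast path: no syscall needed */
    if (!xcr0_has_amx(xcr0))
        return false;           /* AMX state not enabled in XCR0 at all */
    if (!request())
        return false;           /* request denied; do not cache failure */

    amx_enabled_cached = true;
    return true;
}

/* Hypothetical stand-in for the real request; it only counts calls so
 * the caching behaviour can be observed. */
static int demo_request_calls;

static bool demo_request(void)
{
    demo_request_calls++;
    return true;
}
```

Calling `amx_ensure_enabled(XCR0_TILECFG | XCR0_TILEDATA, demo_request)` twice on one thread invokes `demo_request` only once. The XCR0 check is kept as a necessary precondition, not a sufficient one, which is the distinction Dave Hansen draws above.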