On Wed, Oct 25, 2023 at 08:54:34AM -0400, Steven Rostedt wrote:

> I didn't want to overload that for something completely different. This is
> not a "restartable sequence".

Your hack is arguably worse. At least rseq already exists and most
threads will already have it set up if you have a recent enough glibc.

> > So what if it doesn't ? Can we kill it for not playing nice ?
>
> No, it's no different than a system call running for a long time. You could

Then why ask for it? What's the point. Also, did you define
sched_yield() semantics for OTHER to something useful? Because if you
didn't you just invoked UB :-) We could be setting your pets on fire.

> set this bit and leave it there for as long as you want, and it should not
> affect anything.

It would affect the worst case interference terms of the system at the
very least.

> If you look at what Thomas's PREEMPT_AUTO.patch

I know what it does, it also means your thing doesn't work the moment
you set things up to have the old full-preempt semantics back. It
doesn't work in the presence of RT/DL tasks, etc..

More importantly, it doesn't work for RT/DL tasks, so having the bit set
and not having OTHER policy is an error.

Do you want an interface that randomly doesn't work ?

> We could possibly make it adjustable.

Tunables are not a good thing.

> The reason I've been told over the last few decades of why people implement
> 100% user space spin locks is because the overhead of going int the kernel
> is way too high.

Over the last few decades that has been a blatant falsehood. At some
point (right before the whole meltdown trainwreck) amluto had syscall
overhead down to less than 150 cycles.

Then of course meltdown happened and it all went to shit.

But even today (on good hardware or with mitigations=off):

  gettid-1m:    179,650,423 cycles
  xadd-1m:       23,036,564 cycles

syscall is the cost of roughly 8 atomic ops. More expensive, sure. But
not insanely so.
I've seen atomic ops go up to >1000 cycles if you contend them hard enough.
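
For anyone who wants to reproduce that kind of comparison, something along these lines should do. To be clear, this is a rough sketch and not the harness behind the numbers above: the names now_ns(), bench_gettid() and bench_xadd() are made up for illustration, it reports wall-clock nanoseconds rather than cycles (run it under perf stat if you want cycles), and it only covers the uncontended single-thread case.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Monotonic clock in nanoseconds (hypothetical helper). */
uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Time n raw gettid() syscalls. Going through syscall(2) directly
 * makes sure every iteration really enters the kernel.
 */
uint64_t bench_gettid(long n)
{
	uint64_t t0 = now_ns();

	for (long i = 0; i < n; i++)
		syscall(SYS_gettid);

	return now_ns() - t0;
}

/*
 * Time n uncontended atomic fetch-and-add operations on *counter.
 * On x86-64 the compiler is expected to emit a lock-prefixed
 * add/xadd here; other architectures will use their own atomics.
 */
uint64_t bench_xadd(long n, long *counter)
{
	uint64_t t0 = now_ns();

	for (long i = 0; i < n; i++)
		__atomic_fetch_add(counter, 1, __ATOMIC_SEQ_CST);

	return now_ns() - t0;
}
```

Call both with n = 1000000 from a small main() and divide the two results to get the syscall/atomic ratio on your own box. The contended case would need several threads hammering the same counter, which this sketch deliberately leaves out.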