On Mon, Jul 20, 2015 at 2:09 PM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>
> Annoying problem one: the segment base field is only 32 bits in the GDT.

Ok. So if we go this way, we'd make the rule be something like "the
segment base is the CPU number shifted up by the page size", and then
you'd have to add some magic offset that we'd declare as the "per-cpu
page offset".

>> - user space can just load the segment selector in %gs
>
> IIRC this is very expensive -- 40 cycles or so. At this point
> userspace might as well just use a real lock cmpxchg.

So cmpxchg may be as many cycles, but

 (a) you can choose to load the segment just once, and do several
     operations with it

 (b) often - but admittedly not always - the real cost of a
     non-cpu-local cmpxchg tends to be the cacheline ping-pong,
     not the CPU cycles.

So I agree, loading a segment isn't free. But it's not *that*
expensive, and you could always decide to keep the segment loaded and
just do

 - read segment selector
 - if NULL segment, reload it

although that only works if you own the segment entirely and can keep
it as the percpu segment (i.e. obviously not the Wine case, for
example).

> Does it solve the Wine problem? If Wine uses gs for something and
> calls a function that does this, Wine still goes boom, right?

So the advantage of just making a global segment descriptor available
is that it's not *that* expensive to just save/restore segments. So
either Wine could do it, or any library users would do it.

But anyway, I'm not sure this is a good idea. The advantage of it is
that the kernel support really is _very_ minimal.

                 Linus
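
For illustration only, here is a rough sketch of the address arithmetic
implied by the "segment base is the CPU number shifted up by the page
size" rule. Nothing here is an existing kernel interface: PAGE_SHIFT = 12
assumes 4 KiB pages, PERCPU_PAGE_OFFSET is a made-up placeholder for
whatever "per-cpu page offset" would be agreed on, and both function names
are hypothetical. Because the GDT base field is only 32 bits, the sketch
also assumes the CPU number fits in 20 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. */
#define PAGE_SHIFT 12u

/* Hypothetical placeholder for the agreed "per-cpu page offset". */
#define PERCPU_PAGE_OFFSET UINT64_C(0x7f0000000000)

/*
 * Segment base under the proposed rule: the CPU number shifted up by
 * the page size. The GDT base field is 32 bits wide, so with 4 KiB
 * pages the CPU number has to stay below 2^20.
 */
static inline uint32_t percpu_segment_base(uint32_t cpu)
{
    assert(cpu < (UINT32_C(1) << (32 - PAGE_SHIFT)));
    return cpu << PAGE_SHIFT;
}

/*
 * Linear address reached by a %gs-relative access whose offset is
 * PERCPU_PAGE_OFFSET + off: the 32-bit base is zero-extended and
 * added to the offset.
 */
static inline uint64_t percpu_linear_address(uint32_t cpu, uint32_t off)
{
    assert(off < (UINT32_C(1) << PAGE_SHIFT));
    return (uint64_t)percpu_segment_base(cpu) + PERCPU_PAGE_OFFSET + off;
}
```

With these assumptions, the same %gs-relative offset lands in a different
page depending on which CPU's base is loaded: CPU 0 gets the page at
PERCPU_PAGE_OFFSET, CPU 1 the page right after it, and so on.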