Rusty Russell wrote:
> You're thinking of it in a convoluted way, by converting to offsets
> from the per-cpu section, then converting it back.  How about this
> explanation: the local cpu's versions are offset from where the compiler
> thinks they are by __per_cpu_offset[cpu].  We set the segment base to
> __per_cpu_offset[cpu], so "%gs:per_cpu__foo" gets us straight to the
> local cpu version.  __per_cpu_offset[cpu] is always positive (kernel
> image sits at bottom of kernel address space).

We're talking kernel virtual addresses, so the physical load address
doesn't matter, of course.

So, take this kernel I have here as an explicit example:

$ nm -n vmlinux
[...]
c0431100 A __per_cpu_start
[...]
c0433800 D per_cpu__cpu_gdt_descr
c0433880 D per_cpu__cpu_tlbstate

And say that this CPU has its percpu data allocated at 0xc100000.

So, in this case the %gs base will be loaded with
0xc100000-0xc0431100 = 0x4bccef00

The offset of per_cpu__cpu_gdt_descr is 0xc0433800, so
%gs:per_cpu__cpu_gdt_descr will compute 0x4bccef00+0xc0433800 to get the
final linear address.  Since 0xc0433800 is negative, this is actually a
subtraction, and it therefore requires the segment to have a 4G limit.

Which makes Xen sad.

>> Especially since "__per_cpu_start" is actually very
>> large, and so this scheme pretty much relies on being able to wrap
>> around the segment limit, and will be very bad for Xen.
>
> __per_cpu_start is large, yes.  But there's no reason to use it in
> address calculation.  The second half of your statement is not correct.

__per_cpu_start is added to all per_cpu__* addresses.

>> An alternative is to put the "-__per_cpu_start" into the addressing mode
>> when constructing the address of the per-cpu variable.
>
> I think you're thinking of TLS relocations?  I don't use them...

No, but this is just as bad.

    J
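
For anyone who wants to check the arithmetic in the example above, here is a minimal Python sketch of the 32-bit calculation. The function and constant names are made up for illustration and are not kernel identifiers; the three addresses are the ones quoted from the nm output and the assumed percpu allocation. Python integers don't wrap, so the 32-bit truncation is done explicitly with a mask.

```python
MASK32 = 0xFFFFFFFF  # linear addresses are 32 bits wide here

# Values taken from the example in this mail.
PER_CPU_START = 0xC0431100   # __per_cpu_start
GDT_DESCR_SYM = 0xC0433800   # per_cpu__cpu_gdt_descr
PERCPU_AREA = 0x0C100000     # "0xc100000", where this CPU's copy lives


def gs_base(percpu_area, per_cpu_start):
    """Segment base = percpu area - __per_cpu_start, truncated to 32 bits."""
    return (percpu_area - per_cpu_start) & MASK32


def linear_address(base, offset):
    """Linear address = segment base + offset, truncated to 32 bits."""
    return (base + offset) & MASK32


def wraps_past_4g(base, offset):
    """True if base + offset only lands correctly by wrapping past 4G."""
    return base + offset > MASK32


base = gs_base(PERCPU_AREA, PER_CPU_START)
linear = linear_address(base, GDT_DESCR_SYM)

print(hex(base))                           # 0x4bccef00
print(hex(linear))                         # 0xc102700
print(wraps_past_4g(base, GDT_DESCR_SYM))  # True
```

The result lands 0x2700 bytes into the percpu area, which matches the distance between per_cpu__cpu_gdt_descr and __per_cpu_start, but only because the sum wraps around 4G. The offset alone (0xc0433800) is far larger than any small segment limit, which is the reason the segment needs the full 4G limit.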