SGDT would be easy to use, and it is logical that it is faster since it
reads an internal register. SIDT does too, but unlike the GDT, the IDT
has a secondary limit (it can never be larger than 4096 bytes), so all
limits in the range 4095-65535 are exactly equivalent.

Anything that causes a write to the GDT will #PF if the GDT is mapped
read-only. So yes, we need to force the accessed bit to be set. This
shouldn't be a problem and in fact ought to be a performance improvement.

On September 29, 2015 10:35:38 AM PDT, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>On Sep 29, 2015 2:01 AM, "Ingo Molnar" <mingo@xxxxxxxxxx> wrote:
>>
>>
>> * Denys Vlasenko <dvlasenk@xxxxxxxxxx> wrote:
>>
>> > On 09/28/2015 09:58 AM, Ingo Molnar wrote:
>> > >
>> > > * Denys Vlasenko <dvlasenk@xxxxxxxxxx> wrote:
>> > >
>> > >> On 09/26/2015 09:50 PM, H. Peter Anvin wrote:
>> > >>> NAK. We really should map the GDT read-only on all 64 bit systems,
>> > >>> since we can't hide the address from SLDT. Same with the IDT.
>> > >>
>> > >> Sorry, I don't understand your point.
>> > >
>> > > So the problem is that right now the SGDT instruction (which is unprivileged)
>> > > leaks the real address of the kernel image:
>> > >
>> > > fomalhaut:~> ./sgdt
>> > > SGDT: ffff88303fd89000 / 007f
>> > >
>> > > that 'ffff88303fd89000' is a kernel address.
>> >
>> > Thank you.
>> > I do know that SGDT and friends are unprivileged on x86
>> > and thus they allow userspace (and guest kernels in paravirt)
>> > learn things they don't need to know.
>> >
>> > I don't see how making GDT page-aligned and page-sized
>> > changes anything in this regard. SGDT will still work,
>> > and still leak GDT address.
>>
>> Well, as I try to explain it in the other part of my mail, doing so enables us to
>> remap the GDT to a less security sensitive virtual address that does not leak the
>> kernel's randomized address:
>>
>> > > Your observation in the changelog and your patch:
>> > >
>> > >>>> It is page-sized because of paravirt. [...]
>> > >
>> > > ... conflicts with the intention to mark (remap) the primary GDT address read-only
>> > > on native kernels as well.
>> > >
>> > > So what we should do instead is to use the page alignment properly and remap the
>> > > GDT to a read-only location, and load that one.
>> >
>> > If we'd have a small GDT (i.e. what my patch does), we still can remap the
>> > entire page which contains small GDT, and simply don't care that some other data
>> > is also visible through that RO page.
>>
>> That's generally considered fragile: suppose an attacker has a limited information
>> leak that can read absolute addresses with system privilege but he doesn't know
>> the kernel's randomized base offset. With a 'partial page' mapping there could be
>> function pointers near the GDT, part of the page the GDT happens to be on, that
>> leak this information.
>>
>> (Same goes for crypto keys or other critical information (like canary information,
>> salts, etc.) accidentally ending up nearby.)
>>
>> Arguably it's a bit tenuous, but when playing remapping games it's generally
>> considered good to be page aligned and page sized, with zero padding.
>>
>> > > This would have a couple of advantages:
>> > >
>> > >  - This would give kernel address randomization more teeth on x86.
>> > >
>> > >  - An additional advantage would be that rootkits overwriting the GDT would have
>> > >    a bit more work to do.
>> > >
>> > >  - A third advantage would be that for NUMA systems we could 'mirror' the GDT into
>> > >    node-local memory and load those. This makes GDT load cache-misses a bit less
>> > >    expensive.
>> >
>> > GDT is per-cpu. Isn't per-cpu memory already NUMA-local?
>>
>> Indeed it is:
>>
>> fomalhaut:~> for ((cpu=1; cpu<9; cpu++)); do taskset $cpu ./sgdt ; done
>> SGDT: ffff88103fa09000 / 007f
>> SGDT: ffff88103fa29000 / 007f
>> SGDT: ffff88103fa29000 / 007f
>> SGDT: ffff88103fa49000 / 007f
>> SGDT: ffff88103fa49000 / 007f
>> SGDT: ffff88103fa49000 / 007f
>> SGDT: ffff88103fa29000 / 007f
>> SGDT: ffff88103fa69000 / 007f
>>
>> I confused it with the IDT, which is still global.
>>
>> This also means that the GDT in itself does not leak kernel addresses at the
>> moment, except it leaks the layout of the percpu area.
>>
>> So my suggestion would be to:
>>
>>  - make the GDT unconditionally page aligned and sized, then remap it to a
>>    read-only address unconditionally as well, like we do it for the IDT.
>
>Does anyone know what happens if you stick a non-accessed segment in
>the GDT, map the GDT RO, and access it? The docs are extremely vague
>on the interplay between segmentation and paging on the segmentation
>structures themselves. My guess is that it causes #PF. This might
>break set_thread_area users unless we change set_thread_area to force
>the accessed bit on.
>
>There's a possible worse failure mode: if someone pokes an un-accessed
>segment into SS or CS using sigreturn, then it's within the realm of
>possibility that IRET would generate #PF (hey Intel and AMD, please
>document this!). I don't think that would be rootable, but at the
>very least we'd want to make sure it doesn't OOPS by either making it
>impossible or adding an explicit test to sigreturn.c.
>
>hpa pointed out in another thread that the GDT *must* be writable on
>32-bit kernels because we use a task gate for NMI and jumping through
>a task gate writes to the GDT.
>
>On another note, SGDT is considerably faster than LSL, at least on
>Sandy Bridge. The vdso might be able to take advantage of that for
>getcpu.
>
>--Andy

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
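
For readers following the "force the accessed bit" point, here is a minimal sketch of what that means at the descriptor level. It assumes the usual 8-byte x86 code/data descriptor layout, in which the accessed flag is bit 0 of the type field, i.e. bit 40 of the 64-bit value. The function names are hypothetical, and this is not the kernel's set_thread_area code, only an illustration of the bit manipulation under discussion.

```python
# Sketch only: bit positions follow the standard 8-byte segment descriptor
# layout; the helper names are made up for this illustration.

ACCESSED_BIT = 1 << 40   # type field bit 0 (byte 5, bit 0)
S_BIT = 1 << 44          # 1 = code/data descriptor, 0 = system descriptor


def is_accessed(desc: int) -> bool:
    """Return True if a code/data descriptor already has its accessed bit set."""
    return bool(desc & S_BIT) and bool(desc & ACCESSED_BIT)


def force_accessed(desc: int) -> int:
    """Pre-set the accessed bit so a later segment load need not write it.

    System descriptors (S = 0) use the type field differently, so they are
    returned unchanged.
    """
    if desc & S_BIT:
        return desc | ACCESSED_BIT
    return desc


if __name__ == "__main__":
    # Flat 32-bit user data segment: read/write, DPL 3, present, not accessed.
    user_data = 0x00CFF2000000FFFF
    print(f"{force_accessed(user_data):016x}")
```

If the bit is already set when the selector is loaded, the CPU has nothing to update in the descriptor, which is why pre-setting it would sidestep the question of what happens on a write to a read-only GDT.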
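
To make the "base / limit" output quoted in the thread easier to read, the following sketch decodes the 10-byte value that SGDT and SIDT store in 64-bit mode (a 16-bit limit followed by a 64-bit base, little-endian). The source of the `./sgdt` tool is not shown in the thread, so the output format below is only inferred from the quoted lines. The `effective_idt_limit` helper encodes the 4096-byte IDT cap exactly as stated in the reply above, and all function names are hypothetical.

```python
import struct

# Sketch only: helper names are assumptions made for this illustration.


def decode_pseudo_descriptor(buf: bytes) -> tuple[int, int]:
    """Split a 10-byte SGDT/SIDT result into (base, limit)."""
    limit, base = struct.unpack("<HQ", buf)  # 2-byte limit, then 8-byte base
    return base, limit


def format_like_thread(base: int, limit: int) -> str:
    """Format the values in the 'SGDT: base / limit' style quoted in the thread."""
    return f"SGDT: {base:016x} / {limit:04x}"


def descriptor_slots(limit: int) -> int:
    """Number of 8-byte descriptor slots covered by a GDT limit."""
    return (limit + 1) // 8


def effective_idt_limit(limit: int) -> int:
    """Apply the 4096-byte IDT cap described above: limits 4095-65535 are equivalent."""
    return min(limit, 4095)


if __name__ == "__main__":
    raw = struct.pack("<HQ", 0x007F, 0xFFFF88303FD89000)
    base, limit = decode_pseudo_descriptor(raw)
    print(format_like_thread(base, limit))
    print(descriptor_slots(limit), "descriptor slots")
```

A limit of 007f therefore corresponds to a 128-byte table, which is why a page-sized, zero-padded GDT leaves most of its page unused.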