On Thu, Jun 26, 2014 at 5:19 PM, Dave Hansen <dave.hansen@xxxxxxxxx> wrote:
> On 06/26/2014 04:15 PM, Andy Lutomirski wrote:
>> So here's my mental image of how I might do this if I were doing it
>> entirely in userspace: I'd create a file or memfd for the bound tables
>> and another for the bound directory. These files would be *huge*: the
>> bound directory file would be 2GB and the bounds table file would be
>> 2^48 bytes or whatever it is. (Maybe even bigger?)
>>
>> Then I'd just map pieces of those files wherever they'd need to be,
>> and I'd make the mappings sparse. I suspect that you don't actually
>> want a vma for each piece of bound table that gets mapped -- the space
>> of vmas could end up incredibly sparse. So I'd at least map (in the
>> vma sense, not the pte sense) and entire bound table at a time. And
>> I'd probably just map the bound directory in one big piece.
>>
>> Then I'd populate it in the fault handler.
>>
>> This is almost what the code is doing, I think, modulo the files.
>>
>> This has one killer problem: these mappings need to be private (cowed
>> on fork). So memfd is no good.
>
> This essentially uses the page cache's radix tree as a parallel data
> structure in order to keep a vaddr->mpx_vma map. That's not a bad idea,
> but it is a parallel data structure that does not handle copy-on-write
> very well.
>
> I'm pretty sure we need the semantics that anonymous memory provides.
>
>> There's got to be an easyish way to
>> modify the mm code to allow anonymous maps with vm_ops. Maybe a new
>> mmap_region parameter or something? Maybe even a special anon_vma,
>> but I don't really understand how those work.
>
> Yeah, we very well might end up having to go down that path.
>
>> Also, egads: what happens when a bound table entry is associated with
>> a MAP_SHARED page?
>
> Bounds table entries are for pointers. Do we keep pointers inside of
> MAP_SHARED-mapped things? :)

Sure, if it's MAP_SHARED | MAP_ANONYMOUS.
For example:

    struct thing {
        struct thing *next;
    };

    struct thing *storage = mmap(..., MAP_SHARED | MAP_ANONYMOUS, ...);
    storage[0].next = &storage[1];
    fork();

I'm not suggesting that this needs to *work* in the first incarnation of this :)

--Andy
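
For anyone who wants to try the example above, a self-contained version might look roughly like this. It is only a sketch: the mapping length, the PROT_READ | PROT_WRITE protection, the extra `value` field, and the function name `shared_pointer_survives_fork` are all assumptions added for illustration, since the original snippet elides the mmap arguments. It shows only that a pointer stored inside a MAP_SHARED | MAP_ANONYMOUS region stays meaningful in both processes after fork(); it says nothing about how MPX bounds tables would treat that pointer.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

struct thing {
    struct thing *next;
    int value;              /* hypothetical payload, not in the original example */
};

/*
 * Returns 1 if the child could follow the stored pointer and the parent
 * observed the child's write, 0 if not, and -1 on a setup error.
 */
int shared_pointer_survives_fork(void)
{
    size_t len = 2 * sizeof(struct thing);  /* assumed size: two elements */
    struct thing *storage = mmap(NULL, len, PROT_READ | PROT_WRITE,
                                 MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (storage == MAP_FAILED)
        return -1;

    /* A pointer that lives inside the shared mapping and points into it. */
    storage[0].next = &storage[1];
    storage[1].next = NULL;
    storage[1].value = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(storage, len);
        return -1;
    }

    if (pid == 0) {
        /* Child: the mapping sits at the same virtual address and is not
         * copied on write, so this store lands in the shared pages. */
        storage[0].next->value = 42;
        _exit(0);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        munmap(storage, len);
        return -1;
    }

    int ok = WIFEXITED(status) && WEXITSTATUS(status) == 0 &&
             storage[0].next == &storage[1] &&
             storage[1].value == 42;

    munmap(storage, len);
    return ok;
}
```

The point is that parent and child end up with the same pointer value in the same physical page, so bounds metadata that is private (cowed on fork) and the pointer it describes can diverge once either side writes a new pointer into the shared region.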