On 11/28/2017 12:12 AM, Michal Hocko wrote:
> On Mon 27-11-17 15:26:27, John Hubbard wrote:
> [...]
>> Let me add a belated report, then: we ran into this limit while implementing
>> an early version of Unified Memory[1], back in 2013. The implementation
>> at the time depended on tracking that assumed "one allocation == one vma".
>
> And you tried hard to make those VMAs really separate? E.g. with
> prot_none gaps?

We didn't do that, and in fact I'm probably failing to grasp the underlying
design idea that you have in mind there...hints welcome...

What we did was to hook into the mmap callbacks in the kernel driver, after
userspace mmap'd a region (via a custom allocator API). And we had an ioctl
in there, to connect up other allocation attributes that couldn't be passed
through via mmap. Again, this was for regions of memory that were to be
migrated between CPU and device (GPU).

>
>> So, with only 64K vmas, we quickly ran out, and changed the design to work
>> around that. (And later, the design was *completely* changed to use a separate
>> tracking system altogether).
>>
>> The existing limit seems rather too low, at least from my perspective. Maybe
>> it would be better, if expressed as a function of RAM size?
>
> Dunno. Whenever we tried to do RAM scaling it turned out a bad idea
> after years when memory grown much more than the code author expected.
> Just look how we scaled hash table sizes... But maybe you can come up
> with something clever. In any case tuning this from the userspace is a
> trivial thing to do and I am somehow skeptical that any early boot code
> would trip over the limit.

I agree that this is not a limit that boot code is likely to hit. And maybe
tuning from userspace really is the right approach here, considering that
there is a real cost to going too large.
Just philosophically here, hard limits like this seem a little awkward if
they are set once in, say, 1999 (gross exaggeration here, for effect) and
then not updated to stay with the times, right? In other words, one should
not routinely need to tune most things. That's why I was wondering if
something crude and simple would work, such as just a ratio of RAM to vma
count. (I'm more just trying to understand the "rules" here than to
debate--I don't have a strong opinion on this.)

The fact that this apparently failed with hash tables is interesting; I'd
love to read more if you have any notes or links. I spotted a 2014 LWN
article ( https://lwn.net/Articles/612100 ) about hash table resizing, and
some commits that fixed resizing bugs, such as 12311959ecf8a ("rhashtable:
fix shift by 64 when shrinking") ...was it just a storm of bugs that
showed up?

thanks,
John Hubbard
--
To unsubscribe from this list: send the line "unsubscribe linux-api" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html