On Thu, Feb 20, 2025 at 09:57:37AM +0100, David Hildenbrand wrote:
> On 20.02.25 09:51, Lorenzo Stoakes wrote:
> > On Wed, Feb 19, 2025 at 12:56:31PM -0800, Kalesh Singh wrote:
> > > > We also can't change smaps in the way you want, it _has_ to still give
> > > > output per VMA information.
> > >
> > > Sorry I wasn't suggesting to change the entries in smaps, rather
> > > agreeing to your marker suggestion. Maybe a set of ranges for each
> > > smaps entry that has guards? It doesn't solve the use case, but does
> > > make these regions visible to userspace.
> >
> > No, you are not providing a use case for this. /proc/$pid/pagemap does not
> > contaminate the smaps output, mess with efforts to make it RCU readable,
> > require updating the ioctl interface, etc. so it is clearly the better
> > choice.
> >
> > >
> > > >
> > > > The proposed change that would be there would be a flag or something
> > > > indicating that the VMA has guard regions _SOMEWHERE_ in it.
> > > >
> > > > Since this doesn't solve your problem, adds complexity, and nobody else
> > > > seems to need it, I would suggest this is not worthwhile and I'd rather not
> > > > do this.
> > > >
> > > > Therefore for your needs there are literally only two choices here:
> > > >
> > > > 1. Add a bit to /proc/$pid/pagemap OR
> > > > 2. a new interface.
> > > >
> > > > I am not in favour of a new interface here, if we can just extend pagemap.
> > > >
> > > > What you'd have to do is:
> > > >
> > > > 1. Find virtual ranges via /proc/$pid/maps
> > > > 2. iterate through /proc/$pid/pagemap to retrieve state for all ranges.
> > > >
> > >
> > > Could we also consider an smaps field like:
> > >
> > > VmGuards: [AAA, BBB), [CCC, DDD), ...
> > >
> > > or something of that sort?
> >
> > No, absolutely, categorically not. You realise these could be thousands of
> > characters long right?
> >
> > /proc/$pid/pagemap resolves this without contaminating this output.
> >
> > > > Well I'm glad that you guys find it useful for _something_ ;)
> > > >
> > > > Again this wasn't written only for you (it is broadly a good feature for
> > > > upstream), but I did have your use case in mind, so I'm a little
> > > > disappointed that it doesn't help, as I like to solve problems.
> > > >
> > > > But I'm glad it solves at least some for you...
> > >
> > > I recall Liam had a proposal to store the guard ranges in the maple tree?
> > >
> > > I wonder if that can be used in combination with this approach to have
> > > a better representation of this?
> >
> > This was an alternative proposal made prior to the feature being
> > implemented (and you and others at Google were welcome to comment and many
> > were cc'd, etc.).
> >
> > There is no 'in combination with'. This feature would take weeks/months to
> > implement, fundamentally impact the maple tree VMA implementation
> > and... not actually achieve anything + immediately be redundant.
> >
> > Plus it'd likely be slower, have locking implications, would have kernel
> > memory allocation implications, a lot more complexity and probably other
> > problems besides (we discussed this at length at the time and a number of
> > issues came up, I can't recall all of them).
> >
> > To be crystal clear - we are emphatically NOT changing /proc/$pid/maps to
> > lie about VMAs regardless of underlying implementation, nor adding
> > thousands of characters to /proc/$pid/smaps entries.
>
> Yes. Calling it a "guard region" might be part of the problem
> (/"misunderstanding"), because it reminds people of "virtual memory
> regions".
>
> "Guard markers" or similar might have been clearer that these operate on
> individual PTEs, require page table scanning etc ... which makes them a lot
> more scalable and fine-grained and provides all these benefits, with the
> downside being that we don't end up with that many "virtual memory regions"
> that maps/smaps operate on.

Honestly David you and the naming...
:P

I disagree, sorry. Saying 'guard' anything might make people think one
thing or another. We can't account for that.

I mean don't get me started on 'pinning' or any of the million other
overloaded terms we use...

I _hugely_ publicly went out of my way to express the limitations, I gave a
talk, we had meetings, I mentioned it in the series. Honestly if at that
point you still don't realise, that's not a naming problem. It's a 'did not
participate with upstream' problem.

I like guard regions, as they're not pages as we previously referred to
them. People have no idea what a marker is, it doesn't sound like it spans
ranges, no, I don't like it, sorry.

And sorry but this naming topic is closed :) I already let you change the
naming of the MADV_'s, which broke my heart, there will not be a second
heart breaking...

>
> [...]
>
> >
> > As I said to you earlier, the _best_ we could do in smaps would be to add a
> > flag like 'Grd' or something to indicate some part of the VMA is
> > guarded. But I won't do that unless somebody has an -actual use case- for
> > it.
>
> Right, and that would limit where you have to manually scan. Something
> similar is being done with uffd-wp markers IIRC.

Yeah that's a good point, but honestly if you're reading smaps, which reads
the page tables, and then reading /proc/$pid/pagemap, you end up reading
the page tables TWICE, which seems inefficient vs. just reading
/proc/$pid/maps, then reading /proc/$pid/pagemap and reading the page
tables once.

>
> --
> Cheers,
>
> David / dhildenb
>
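
For reference, the two-step approach discussed in this thread (find the
virtual ranges in /proc/$pid/maps, then walk /proc/$pid/pagemap once) could
look roughly like the sketch below. It is only an illustration under stated
assumptions: the helper names (parse_maps, flagged_ranges, scan) are made up
here, and ASSUMED_GUARD_BIT is a placeholder, since the thread only proposes
adding such a bit. Check the kernel's pagemap documentation for what your
kernel actually defines.

```python
import os
import struct

PM_ENTRY_BYTES = 8  # each pagemap entry is one 64-bit word per virtual page

# ASSUMPTION: bit position of a "guard region" flag in a pagemap entry.
# The thread only proposes such a bit; verify against
# Documentation/admin-guide/mm/pagemap.rst for your kernel.
ASSUMED_GUARD_BIT = 58


def parse_maps(text):
    """Yield (start, end) address pairs from /proc/<pid>/maps content."""
    for line in text.splitlines():
        if not line.strip():
            continue
        start, end = (int(x, 16) for x in line.split()[0].split("-"))
        yield start, end


def flagged_ranges(entries, start, page_size, bit=ASSUMED_GUARD_BIT):
    """Coalesce consecutive pages whose entry has `bit` set into [lo, hi)."""
    ranges = []
    run_start = None
    for i, entry in enumerate(entries):
        addr = start + i * page_size
        if (entry >> bit) & 1:
            if run_start is None:
                run_start = addr
        elif run_start is not None:
            ranges.append((run_start, addr))
            run_start = None
    if run_start is not None:
        ranges.append((run_start, start + len(entries) * page_size))
    return ranges


def scan(pid="self", bit=ASSUMED_GUARD_BIT):
    """Step 1: read maps. Step 2: read pagemap once for every range."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    with open(f"/proc/{pid}/maps") as f:
        vmas = list(parse_maps(f.read()))
    found = []
    with open(f"/proc/{pid}/pagemap", "rb") as pm:
        for start, end in vmas:
            npages = (end - start) // page_size
            try:
                pm.seek(start // page_size * PM_ENTRY_BYTES)
                # A real tool would read very large VMAs in chunks.
                raw = pm.read(npages * PM_ENTRY_BYTES)
            except (OSError, OverflowError):
                continue  # some special mappings cannot be read
            n = len(raw) // PM_ENTRY_BYTES
            entries = struct.unpack(f"={n}Q", raw[: n * PM_ENTRY_BYTES])
            found += flagged_ranges(entries, start, page_size, bit)
    return found
```

Calling scan() on a Linux system returns a list of (start, end) address
pairs. The page tables are walked only once, by the pagemap read, because
/proc/$pid/maps does not touch them.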