On 11 Mar 2023 12:24:58 +0000 William Kucharski <william.kucharski@xxxxxxxxxx>
>> On Mar 10, 2023, at 04:25, David Hildenbrand <david@xxxxxxxxxx> wrote:
>> On 10.03.23 02:40, William Kucharski wrote:
>>>> On Mar 9, 2023, at 17:05, Zach O'Keefe <zokeefe@xxxxxxxxxx> wrote:
>>>>
>>>>> I think the hugepage alignment in their environment was somewhat luck.
>>>>> One suggestion made was to change stack size to avoid alignment and
>>>>> hugepage usage.  That 'works' but seems kind of hackish.
>>>>
>>>> That was my first thought, if the alignment was purely due to luck,
>>>> and not somebody manually specifying it. Agreed it's kind of hackish
>>>> if anyone can get bit by this by sheer luck.
>>> I don't agree it's "hackish" at all, but I go more into that below.
>>>>
>>>>> Also, David H pointed out the somewhat recent commit to align
>>>>> sufficiently large mappings to THP boundaries.  This is going to
>>>>> make all stacks huge page aligned.
>>>>
>>>> I think that change was reverted by Linus in commit 0ba09b173387
>>>> ("Revert "mm: align larger anonymous mappings on THP boundaries""),
>>>> until its perf regressions were better understood -- and I haven't
>>>> seen a revamp of it.
>>> It's too bad it was reverted, though I understand the concerns
>>> regarding it.
>>> From my point of view, if an address is properly aligned and a caller is
>>> asking for 2M+ to be mapped, it's going to be advantageous from a purely
>>> system-focused point of view to do that mapping with a THP.
>>
>> Just noting that, if user space requests multiple smaller mappings, and
>> the kernel decides to all place them in the same PMD, all VMAs might get
>> merged and you end up with a properly aligned VMA where khugepaged would
>> happily place a THP.
>>
>> That case is, of course, different to the "user space asks for 2M+"
>> mapping case, but from khugepaged perspective they might look alike --
>> and it might be unclear if a THP is valuable or not (IOW maybe that THP
>> could be better used somewhere else).
>
> That's a really, really good point.
>
> My general philosophy on the subject (if the address is aligned and the
> caller is asking for a THP-sized allocation, why not map it with a THP if
> you can) kind of falls apart when it's the system noticing it can coalesce
> a bunch of smaller allocations into one THP via khugepaged.
>
> Arguably it's the difference between the caller knowing it's asking for
> something THP-sized on its behalf and the system deciding to remap a bunch
> of disparate mappings using a THP because _it_ can.
>
> If we were to say allow a caller's request for a THP-sized
> allocation/mapping take priority over those from khugepaged, it would not
> only be a major vector for abuse, it would also lead to completely
> indeterminate behavior ("When I start my browser after a reboot I get a
> bunch of THPs, but after the system's been up for a few weeks, I don't,
> how come?")

Given transparent_hugepage_flags, how would it be abused? And indeterminate?

>
> I don't have a good answer here.
>
>     -- Bill