On Mon, Mar 11, 2024 at 12:28 PM Peter Xu <peterx@xxxxxxxxxx> wrote:
>
> On Mon, Mar 11, 2024 at 11:59:59AM -0700, Axel Rasmussen wrote:
> > I'd prefer not to require root or CAP_SYS_ADMIN or similar for
> > UFFDIO_POISON, because those control access to lots more things
> > besides, which we don't necessarily want the process using UFFD to be
> > able to do. :/

I agree; UFFDIO_POISON should not require CAP_SYS_ADMIN.

> >
> > Ratelimiting seems fairly reasonable to me. I do see the concern about
> > dropping some addresses though.
>
> Do you know how much could an admin rely on such addresses? How frequent
> would MCE generate normally in a sane system?

I'm not sure about how much admins rely on the address themselves.
+cc Jiaqi Yan

It's possible for a sane hypervisor dealing with a buggy guest / guest
userspace to trigger lots of these pr_errs. Consider the case where a
guest userspace uses HugeTLB-1G, finds poison (which HugeTLB used to
ignore), and then ignores SIGBUS. It will keep getting MCEs / SIGBUSes.

The sane hypervisor will use UFFDIO_POISON to prevent the guest from
re-accessing *real* poison, but we will still get the pr_err, and we
still keep injecting MCEs into the guest. We have observed scenarios
like this before.

>
> > Perhaps we can mitigate that concern by defining our own ratelimit
> > interval/burst configuration?
>
> Any details?
>
> > Another idea would be to only ratelimit it if !CONFIG_DEBUG_VM or
> > similar. Not sure if that's considered valid or not. :)
>
> This, OTOH, sounds like an overkill..
>
> I just checked again on the detail of ratelimit code, where we by default
> it has:
>
> #define DEFAULT_RATELIMIT_INTERVAL (5 * HZ)
> #define DEFAULT_RATELIMIT_BURST 10
>
> So it allows a 10 times burst rather than 2.. IIUC it means even if
> there're continous 10 MCEs it won't get suppressed, until the 11th came, in
> 5 seconds interval. I think it means it's possibly even less of a concern
> to directly use pr_err_ratelimited().

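For illustration only, here is a rough userspace model of the windowed
burst/interval behaviour discussed in the quoted text. It is a sketch,
not the kernel's ratelimit code: the struct and the ratelimit_model_*
names are hypothetical, time is passed in explicitly as milliseconds
instead of jiffies, and the window reset rule is my simplified reading
of the quoted description (up to `burst` messages per `interval`, the
rest suppressed until the window expires).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model of a windowed rate limit (not kernel code). */
struct ratelimit_model {
	uint64_t interval_ms;	/* window length; 0 disables limiting */
	unsigned int burst;	/* messages allowed per window */
	uint64_t begin_ms;	/* start of the current window */
	unsigned int printed;	/* messages allowed in the current window */
	unsigned int missed;	/* messages suppressed in the current window */
	bool started;		/* false until the first event is seen */
};

static void ratelimit_model_init(struct ratelimit_model *rs,
				 uint64_t interval_ms, unsigned int burst)
{
	assert(rs != NULL);
	rs->interval_ms = interval_ms;
	rs->burst = burst;
	rs->begin_ms = 0;
	rs->printed = 0;
	rs->missed = 0;
	rs->started = false;
}

/* Return true if an event arriving at now_ms may be logged. */
static bool ratelimit_model_allow(struct ratelimit_model *rs, uint64_t now_ms)
{
	if (rs->interval_ms == 0)
		return true;

	if (!rs->started) {
		rs->begin_ms = now_ms;
		rs->started = true;
	}

	/* Window expired: open a new one and reset both counters. */
	if (now_ms > rs->begin_ms + rs->interval_ms) {
		rs->begin_ms = now_ms;
		rs->printed = 0;
		rs->missed = 0;
	}

	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}

	rs->missed++;
	return false;
}

/* Feed `count` back-to-back events at now_ms; return how many were allowed. */
static unsigned int ratelimit_model_feed(struct ratelimit_model *rs,
					 uint64_t now_ms, unsigned int count)
{
	unsigned int allowed = 0;

	for (unsigned int i = 0; i < count; i++)
		if (ratelimit_model_allow(rs, now_ms))
			allowed++;
	return allowed;
}
```

With interval_ms = 5000 and burst = 10 (mirroring the defaults quoted
above), eleven back-to-back events let the first ten through and
suppress only the eleventh; events are allowed again once the 5 second
window has passed. A custom policy would just pass different values to
ratelimit_model_init(). In the kernel, I'd assume the equivalent is a
DEFINE_RATELIMIT_STATE() with __ratelimit(), but I haven't written that
patch.
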
I'm okay with any rate limiting everyone agrees on. IMO, silencing
these pr_errs if they came from UFFDIO_POISON (or, perhaps, if they did
not come from real hardware MCE events) sounds like the most correct
thing to do, but I don't mind. Just don't make UFFDIO_POISON require
CAP_SYS_ADMIN. :)

Thanks.