Hi Dave,

On 2/20/25 5:40 AM, Dave Martin wrote:
> On Thu, Feb 20, 2025 at 11:35:56AM +0100, Peter Newman wrote:
>> Hi Reinette,
>>
>> On Wed, Feb 19, 2025 at 6:55 PM Reinette Chatre
>> <reinette.chatre@xxxxxxxxx> wrote:
>>>
>>> Hi Dave and Peter,
>>>
>>> On 2/19/25 6:09 AM, Peter Newman wrote:
>>>> Hi Dave,
>>>>
>>>> On Wed, Feb 19, 2025 at 2:41 PM Dave Martin <Dave.Martin@xxxxxxx> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> On Wed, Jan 22, 2025 at 02:20:25PM -0600, Babu Moger wrote:
>>>>>> Assign/unassign counters on resctrl group creation/deletion. Two counters
>>>>>> are required per group, one for MBM total event and one for MBM local
>>>>>> event.
>>>>>>
>>>>>> There are a limited number of counters available for assignment. If these
>>>>>> counters are exhausted, the kernel will display the error message: "Out of
>>>>>> MBM assignable counters". However, it is not necessary to fail the
>>>>>> creation of a group due to assignment failures. Users have the flexibility
>>>>>> to modify the assignments at a later time.
>>>>>
>>>>> If we are doing this, should turning mbm_cntr_assign mode on also
>>>>> trigger auto-assignment for all extant monitoring groups?
>>>>>
>>>>> Either way though, this auto-assignment feels like a potential nuisance
>>>>> for userspace.
>>>
>>> hmmm ... this auto-assignment was created with the goal to help userspace.
>>> In mbm_cntr_assign mode the user will only see data when a counter is assigned
>>> to an event. mbm_cntr_assign mode is selected as default on a system that
>>> supports ABMC. Without auto assignment a user will thus see different
>>> behavior when reading the monitoring events when the user switches to a kernel with
>>> assignable counter support: Before assignable counter support events will have
>>> data, with assignable counter support the events will not have data.
>>>
>>> I understood that interfaces should not behave differently when user space
>>> switches kernels and that is what the auto assignment aims to solve.
>>>
>>>>>
>>>>> If the userspace use-case requires too many monitoring groups for the
>>>>> available counters, then the kernel will auto-assign counters to a
>>>>> random subset of groups which may or may not be the ones that userspace
>>>>> wanted to monitor; then userspace must manually look for the assigned
>>>>> counters and unassign some of them before they can be assigned where
>>>>> userspace actually wanted them.
>>>>>
>>>>> This is not impossible for userspace to cope with, but it feels
>>>>> awkward.
>>>>>
>>>>> Is there a way to inhibit auto-assignment?
>>>>>
>>>>> Or could automatic assignments be considered somehow "weak", so that
>>>>> new explicit assignments can steal automatically assigned counters
>>>>> without the need to unassign them explicitly?
>>>>
>>>> We had an incomplete discussion about this early on[1]. I guess I
>>>> didn't revisit it because I found it was trivial to add a flag that
>>>> inhibits the assignment behavior during mkdir and had moved on to
>>>> bigger issues.
>>>
>>> Could you please remind me how a user will set this flag?
>>
>> Quoting my original suggestion[1]:
>>
>> "info/L3_MON/mbm_assign_on_mkdir?
>>
>> boolean (parsed with kstrtobool()), defaulting to true?"
>>
>> After mount, any groups that got counters on creation would have to be
>> cleaned up, but at least that can be done with forward progress once
>> the flag is cleared.
>>
>> I was able to live with that as long as there aren't users polling for
>> resctrl to be mounted and immediately creating groups. For us, a
>> single container manager service manages resctrl.
>>
>>>
>>>>
>>>> If an agent creating directories isn't coordinated with the agent
>>>> managing counters, a series of creating and destroying a group could
>>>> prevent a monitor assignment from ever succeeding because it's not
>>>> possible to atomically discover the name of the new directory that
>>>> stole the previously-available counter and reassign it.
>>>>
>>>> However, if the counter-manager can get all the counters assigned once
>>>> and only move them with atomic reassignments, it will become
>>>> impossible to snatch them with a mkdir.
>>>>
>>>
>>> You have many points that make auto-assignment less than ideal but I
>>> remain concerned that not doing something like this will break
>>> existing users who are not as familiar with resctrl internals.
>>
>> I agree auto-assignment should be the default. I just want an official
>> way to turn it off.
>>
>> Thanks!
>> -Peter
>>
>> [1] https://lore.kernel.org/lkml/CALPaoCiJ9ELXkij-zsAhxC1hx8UUR+KMPJH6i8c8AT6_mtXs+Q@xxxxxxxxxxxxxx/
>>
>
> +1
>
> That's basically my position -- the auto-assignment feels like a
> _potential_ nuisance for ABMC-aware users, but it depends on what they
> are trying to do. Migration of non-ABMC-aware users will be easier for
> basic use cases if auto-assignment occurs by default (as in this
> series).
>
> Having an explicit way to turn this off seems perfectly reasonable
> (and could be added later on, if not provided in this series).
>
>
> What about the question re whether turning mbm_cntr_assign mode on
> should trigger auto-assignment?
>
> Currently turning this mode off and then on again has the effect of
> removing all automatic assignments for extant groups. This feels
> surprising and/or unintentional (?)

Connecting to what you started off by saying, I also see auto-assignment
as the way to provide a smooth transition for "non-ABMC-aware" users.

To me, a user that turns this mode off and then on again can be considered
"ABMC-aware", and turning it "off and then on again" seems like an
intuitive way to get to a "clean slate" wrt counter assignments. This may
also be a convenient way for an "ABMC-aware" user space to unassign all
counters, and thus also helpful if resctrl supports the flag that Peter
proposed.
The flag's name already seems to allow for this: "mbm_assign_on_mkdir"
could be interpreted as "only auto assign on mkdir"?

I am not taking a stand for one or the other approach but am instead trying
to be more specific about pros/cons. Could you please provide more insight
into the use case you have in mind so that we can see how resctrl could
behave with few surprises?

Reinette
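
To make the pros/cons a bit more concrete, below is a minimal sketch in
Python of the behaviors under discussion. It is a toy user-space model, not
resctrl code: the CounterPool class, its methods, and the "total"/"local"
event labels are hypothetical names used only for illustration. The only
names taken from this thread are the proposed mbm_assign_on_mkdir flag and
its suggested default of true. The model assumes two counters per group,
that a failed auto-assignment does not fail mkdir, and that turning the mode
off and then on again drops every assignment.

```python
class CounterPool:
    """Toy model of a fixed pool of assignable MBM counters (not kernel code)."""

    EVENTS = ("total", "local")  # hypothetical labels for the two MBM events

    def __init__(self, num_counters, mbm_assign_on_mkdir=True):
        self.free = num_counters
        # Mirrors the proposed flag; defaults to true as in the suggestion.
        self.mbm_assign_on_mkdir = mbm_assign_on_mkdir
        self.groups = {}  # group name -> set of events holding a counter

    def assign(self, group, event):
        """Give one counter to (group, event); return False if none is free."""
        if event in self.groups[group]:
            return True
        if self.free == 0:
            return False  # corresponds to "Out of MBM assignable counters"
        self.free -= 1
        self.groups[group].add(event)
        return True

    def unassign(self, group, event):
        """Return the counter held by (group, event), if any, to the pool."""
        if event in self.groups[group]:
            self.groups[group].remove(event)
            self.free += 1

    def mkdir(self, group):
        """Create a group; auto-assignment failures do not fail creation."""
        self.groups[group] = set()
        if self.mbm_assign_on_mkdir:
            for event in self.EVENTS:
                self.assign(group, event)

    def rmdir(self, group):
        """Remove a group and free whatever counters it held."""
        self.free += len(self.groups.pop(group))

    def mode_off_then_on(self):
        """Model the "clean slate": every existing assignment is dropped."""
        for events in self.groups.values():
            self.free += len(events)
            events.clear()


pool = CounterPool(num_counters=4)
for name in ("g1", "g2", "g3"):
    pool.mkdir(name)  # g1 and g2 take all four counters, g3 gets none

pool.mode_off_then_on()           # all counters are free again
pool.mbm_assign_on_mkdir = False  # later mkdirs no longer take counters
pool.assign("g3", "total")        # a counter manager now places counters itself
```

In this model the group that misses out is simply whichever one is created
last, which is the "random subset" concern raised earlier, and clearing the
flag after the off/on toggle is what lets a single manager keep control of
the pool.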