On Fri, Jan 31, 2020 at 12:07 PM David Laight <David.Laight@xxxxxxxxxx> wrote:
>
> From: Rafael J. Wysocki
> > Sent: 30 January 2020 14:47
> >
> > In certain system configurations it may not be desirable to use some
> > C-states assumed to be available by intel_idle and the driver needs
> > to be prevented from using them even before the cpuidle sysfs
> > interface becomes accessible to user space.  Currently, the only way
> > to achieve that is by setting the 'max_cstate' module parameter to a
> > value lower than the index of the shallowest of the C-states in
> > question, but that may be overly intrusive, because it effectively
> > makes all of the idle states deeper than the 'max_cstate' one go
> > away (and the C-state to avoid may be in the middle of the range
> > normally regarded as available).
> >
> > To allow that limitation to be overcome, introduce a new module
> > parameter called 'states_off' to represent a list of idle states to
> > be disabled by default in the form of a bitmask and update the
> > documentation to cover it.
>
> The problem I see is that there are (at least) 3 different ways of
> referring to the C-States:

So the mask is not referring to the C-states in the first place.

> 1) The state names, C1, C1E, C3, C7 etc.
>    I'm not sure these are visible outside intel_idle.c.

Yes, they are, in sysfs.

> 2) The maximum allowed latency in us.
> 3) The index into the cpu-dependant tables in intel_idle.c.
>
> Boot parameters that set 3 are completely hopeless for normal
> users. The C-state names might be - but they aren't documented.
>
> Unless you know exactly which cpu table is being used the
> only constraint a user can request is the latency.

So this mask refers to the idle states numbering in sysfs, as stated
in the documentation update.  That covers state0 which is not a
C-state too.

> (I've had the misfortune to read intel_idle.c in the last week.
> Almost impenetrable TLA ridden uncommented code.)

I have some patches to improve that, will post them after this is
settled.

> ...
> > + * The positions of the bits that are set in the two's complement representation
> > + * of this value are the indices of the idle states to be disabled by default
> > + * (as reflected by the names of the corresponding idle state directories in
> > + * sysfs, "state0", "state1" ... "state<i>" ..., where <i> is the index of the
> > + * given state).
>
> What has 'two's complement' got to do with anything?

Well, it is the representation in which bits are used.  Kind of as
opposed to decimal or hex digits.

But I can replace that phrase with "bits that are set in this number"
easily enough.

> ...
> > +The value of the ``states_off`` module parameter (0 by default) represents a
> > +list of idle states to be disabled by default in the form of a bitmask.  Namely,
> > +the positions of the bits that are set in the two's complement representation of
> > +that value are the indices of idle states to be disabled by default (as
> > +reflected by the names of the corresponding idle state directories in ``sysfs``,
> > +:file:`state0`, :file:`state1` ... :file:`state<i>` ..., where ``<i>`` is the
> > +index of the given idle state; see :ref:`idle-states-representation` in
> > +:doc:`cpuidle`).  For example, if ``states_off`` is equal to 3, the driver will
> > +disable idle states 0 and 1 by default, and if it is equal to 8, idle state 3
> > +will be disabled by default and so on (bit positions beyond the maximum idle
> > +state index are ignored).  The idle states disabled this way can be enabled (on
> > +a per-CPU basis) from user space via ``sysfs``.
>
> A few line breaks would make that easier to read.

Fair enough.

Thanks!
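
For illustration only (this is not part of the patch), the mapping from
a 'states_off' value to the sysfs state indices described above can be
sketched in a few lines of Python.  The helper name and the
'state_count' argument are made up for the example; the real number of
idle states depends on which table intel_idle selects for the CPU.

```python
def disabled_states(states_off, state_count):
    """Return the sysfs idle state indices disabled by default.

    Hypothetical helper for illustration: bit i set in states_off means
    "state<i>" is disabled; bits at or beyond state_count are ignored.
    """
    return [i for i in range(state_count) if states_off & (1 << i)]


# Assumes a made-up configuration with 4 idle states (state0..state3).
print(disabled_states(3, 4))     # bits 0 and 1 set -> [0, 1]
print(disabled_states(8, 4))     # bit 3 set -> [3]
print(disabled_states(0x30, 4))  # only bits 4 and 5 set, both out of range -> []
```

The first two calls correspond to the examples in the documentation
text quoted above (3 disables states 0 and 1, 8 disables state 3).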