Re: RFC: Drm-connector properties managed by another driver / privacy screen support

Hans de Goede <hdegoede@xxxxxxxxxx> · Wed, 15 Apr 2020 13:39:23 +0200

Hi,

On 4/15/20 12:22 PM, Daniel Vetter wrote:
On Wed, Apr 15, 2020 at 12:11 PM Hans de Goede <hdegoede@xxxxxxxxxx> wrote:

Hi,

On 4/15/20 11:52 AM, Daniel Vetter wrote:

<snip>

iv. What every SoC subsystem does:

- lcdshadow drivers register drivers
- drm drivers look them up
- if stuff isn't there yet, we delay loading with EPROBE_DEFER until
the entire thing is assembled.

That's what we're doing already for other standardized components like
drm_bridge or drm_panel, and I think that's also the right approach
for backlight and anything else like that. Hand-rolling our own
EPROBE_DEFER handling, or some other duct-tape monsters imo just leads
to real pain. Also, with EPROBE_DEFER we have one standard way of
building a driver from component, which spans subsystems and is also
the underlying magic that makes stuff like component.c work.

On the SoCs we have devicetree telling us what components there
are, so we can wait for them to show up. The only way to figure out
if the lcdshadow thing is there on a ThinkPad is asking thinkpad_acpi,
or duplicating a lot of code from thinkpad_acpi. Edit:
also see below for a possible solution.

Yup it sucks. I think all we can do is have a small acpi match
function (which yes will duplicate some of the thinkpad_acpi driver
logic) to re-create that information and give us a "should we have a
lcdshadow driver for this $pci_device" answer.

Ok, so questions about this solution:

1. Where should that match-function live ?

2. An acpi_thinkpad derived match-function will only be able to
   answer if there is an lcdshadow device/driver for the internal
   panel. It will not be able to tie this info to a certain PCI
   device. My plan is to pass NULL as dev_name when registering
   the lcdshadow-device and have lcdshadow_get(dev, <connector-name>)
   skip device-name matching (consider everything a match) for
   lcdshadow-devices registered with NULL as dev_name.

   So I guess in this case the mini match function should just
   ignore the passed in device?

This need for an in-kernel source of truth for "which backlight, if
any, do I need" is why the backlight stuff never got fixed. It's a
really hard problem, and doing the table flip and just letting
userspace deal with the mess at least avoids having to face the fact
that the kernel is totally failing here. It's made worse for backlight
because of 20 years of hacked up systems that "work", and we can't
regress any of them.

Right, as discussed during last plumbers Christian Kellner and I
have written a plan to slowly resolve this. Unfortunately Christian
has not found the time to work on this yet.

I really want to avoid that situation for anything new like lcdshadow.

Ack.

Wrt the actual userspace interface, I think the drm core should handle
this as much as possible. Same way we let drm core handle a lot of the
atomic property stuff, to make sure things are standardized.

Agreed.

So

- add an lcdshadow pointer to struct drm_connector
- implement the property glue code in drm core completely, at least
for the read side
- for the write side we might want to have some drm wrappers drivers
can call to at the appropriate times to e.g. restore privacy screen
settings when the panel gets enabled. In case that's needed.

Also one thing that the EPROBE_DEFER stuff forces us to handle
correctly is to track these depencies. That's the other nightmare in
backlight land, essentially you have no idea of knowing (in userspace)
whether the backlight driver you want is actually loaded, resulting in
lots of fun. The kernel is the only thing that can know, and for hw
that's built-in there's really no excuse to not know. So a model where
stuff gets assembled after drm_dev_register() is imo just plain buggy.

This means that the lcdshadow subsystem needs to have some idea of
whether it needs a driver, so that it can correctly differentiate
between EPROBE_DEFER and ENOENT error cases. In the latter the driver
should continue loading ofc.

Right, so how would the lcdshadow subsystem do this? The only way
would be for it to say try and modprobe thinkpad_acpi from its
core init function (if thinkpad_acpi is enabled).  IOW it is the usual
x86 mess.  I guess we could have something like this in a theoretical
to be added lcdshadow subsystem:

/* Add comment explaining why we need this messy stuff here */
const char * const shadow_providers[] = {
#ifdef CONFIG_THINKPAD_ACPI_MODULE
         "thinkpad_acpi",
#endif
#ifdef CONFIG_OTHER_MODULE
         "other",
#endif
         NULL
};

int module_init(void)
{
         /* do actual setup of the ?class? */

         for (i = 0; shadow_providers[i]; i++)
                 request_module(shadow_providers[i]);

         return 0;
}

Hm I think explicitly loading drivers feels very much not device model
like. Don't these drivers auto-load, matching on acpi functions?

thinkpad_acpi does autoload based on a number of ACPI device-ids,
the idea behind the above request_module is to avoid the need
to have the acpi-match function you mentioned above.

Basically what would happen is e.g. :

1. i915 loads, calls lcdshadow_get(dev, "eDP-1");
2. if this is the first lcdshadow_get() call then
   the lcdshadow core will do the request_module calls,
   allowing any of these modules to get loaded + probed and
   call e.g. lcdshadow_register(&mylcdshadowdev, <gfx-adapter-dev-name>, "eDP-1");
3. After the request modules the lcdshadow_get() will walk over
   the list of registered devices, including the ones just registered
   by the request_module calls and then hopefully find a match

So by doing the request-module calls before checking for
a matching lcdshadow dev, we avoid the need of having some of
the knowledge currently abstracted away in the thinkpad_acpi driver
duplicated inside the drm code somewhere.

I guess if that doesn't exist, then we'd need to fix that one first :-/
In general no request_module please, that shouldn't be needed either.

The trouble with request_module is also that (afaiui) it doesn't
really work well with parallel module load and all that, for
EPROBE_DEFER to work we do need to be able to answer "should we have a
driver?" independently of whether that driver has loaded already or
not.

The idea here is to avoid using EPROBE_DEFER (on x86 at least)
and either directly return the lcdshadow_dev or ENOENT. Also
see below.

And then simply have the function which gets a lcd_shadow provider
provide -ENOENT if there are none.

One problem here is that this will work for modules since
the drm-core will depend on modules from the lcdshadow-core,
so the lcdshadow-core module will get loaded first and will
then try to load thinkpad_acpi, which will return with -ENODEV
from its module_init on non ThinkPads. We could even put the
request_module loop in some other init_once function so that
things will also work when the lcdshadow-core is builtin.

But how do we ensure that thinkpad_acpi will get to register
its lcdshadow before say i915 gets probed if everything is builtin?

EPROBE_DEFER. Everyone hates it, but it does work. It's really kinda
neat magic. The only pain point is that you might need to wire
EPROBE_DEFER through a few layers in i915, but we do have a lot of
that in place already,

AFAIK we do not have any EPROBE_DEFER handling in i915 in place at
all! There are _0_ checks for an EPROBE_DEFER return in all of the i915
code. We actually have a similar problem with backlight control when
controlled by an external PWM controller such as the PWM controller
of the Crystal Cove PMIC found on some BYT/CHT drivers or the
PWM controller found in the LPSS bits of BYT/CHT SoCs.

Since again we lack a clear hardware model like device tree,
we use lookup tables for the (external) PWM controllers on x86,
see the pwm_add_table calls in drivers/acpi/acpi_lpss.c
and drivers/mfd/intel_soc_pmic_core.c and the pwm_get call
in the i915 code simply continues on its happy way without
backlight control if the pwm_get fails, rather then dealing
with -EPROBE_DEFER. I looked into untangling this back then
but the i915 code really is not ready to unroll everything
it has done already once we are probing connectors.

I actually "fixed" the PWM issue by:

1. Adding a module-name field to the PWM lookup table
registered by the pwm_add_table calls.

2. Have the PWM core call request_module on that module_name
if it finds a match in the registered lookup_table (which
get registered early on), but the matching pwm_dev has not
been registered yet.

So I agree with you that ideally i915 would handle EPROBE_DEFER
but atm AFAIK it really does not handle that at all and
we are pretty far away from getting to a point where it
does handle that.

Assuming we are going to add some device/model specific
lcdshadow knowledge inside the lcdshadow core as you
suggested with your "small acpi match function" above,
we could do something similar to what the vga_switcheroo
code is doing for this and have a lcdshadow_defer_probe()
helper and call that really early in i915_pci_probe(),
which currently already has this for the vgaswitcheroo case:

        if (vga_switcheroo_client_probe_defer(pdev))
                return -EPROBE_DEFER;

The reason why I suggested the request_module approach
is because as said it will allow lcdshadow_get()
to return either a device or -ENOENT and never -EPROBE_DEFER
and currently none of the i915 code, nor the nouveau code
nor the amdgpu code has any EPROBE_DEFER checks. They all
assume that once there probe function is called they can
continue on forward without unrolling and exiting from
probe with EPROBE_DEFER ever. I agree that that is somewhat
of a bad assumption now a days but fixing that is going to
be massive undertakig and I'm afraid that trying to deal
with it will lead to many many regressions.

So I would rather work around it by using request_module.

plus we have solid error-injecting load error
tests in igt. So that part shouldn't be a big deal.

I guess my SOFTDEP solution has the same issue though. Do you
know has this is dealt with for kvmgt ?

kvmgt? What do you mean there? kvmgt is just a user of i915-gem, if
you enable it in the config (and module option too iirc) then it works
more like an additional uapi layer, similar to how we handle fbdev
emulation. So totally different scenario, since the main driver is
100% in control of the situation and controls the register/unregister
lifetime cycles completely. There's no indepedent part somewhere else
for kvmgt or fbdev emulation.

Or I'm totally misunderstanding what you mean with this example here.

The i915 driver used to have a:

MODULE_SOFTDEP("pre: kvmgt")

But it seems that that has been replaced with some mechanism
or maybe the need for kvmgt to be loaded first has been removed/

Regards,

Hans

_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel