Re: [PATCH v5 08/10] drm/bridge: samsung-dsim: use supporting variable for out_bridge

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Luca,

On Wed, Jan 29, 2025 at 12:51:46PM +0100, Luca Ceresoli wrote:
> > > > > For hotplugging we cannot use drmm and devm and instead we use get/put,
> > > > > to let the "next bridge" disappear with the previous one still present.
> > > > > So the trivial idea is to add a drm_of_get_bridge(), similar to
> > > > > {drmm,devm_drm}_of_get_bridge() except it uses plain
> > > > > drm_panel_bridge_add() instead of devm/drmm variants. But then the
> > > > > caller (which is the panel consumer) will have to dispose of the struct
> > > > > drm_bridge pointer by calling:
> > > > > 
> > > > >  - drm_bridge_put() in case a)
> > > > >  - drm_panel_bridge_remove in case b)
> > > > > 
> > > > > And that's the problem I need to solve.    
> > > > 
> > > > I'm not sure the problem is limited to panel_bridge. Your question is
> > > > essentially: how do I make sure a driver-specific init is properly freed
> > > > at drm_bridge_put time. This was done so far mostly at bridge remove
> > > > time, but we obviously can't do that anymore.
> > > > 
> > > > But we'd have the same issue if, say, we needed to remove a workqueue
> > > > from a driver.
> > > > 
> > > > I think we need a destroy() hook for bridges, just like we have for
> > > > connectors for example that would deal with calling
> > > > drm_panel_bridge_remove() if necessary, or any other driver-specific
> > > > sequence.  
> > > 
> > > The .destroy hook looked appealing at first, however as I tried to
> > > apply the idea to bridges I'm not sure it matches. Here's why.
> > > 
> > > The normal (and sane) flow for a bridge is:
> > > 
> > >  A) probe
> > >     1. allocate private struct embedding struct drm_bridge
> > >        (I have an _alloc() variant ready for v5 to improve this as you proposed)
> > >     2. get resources, initialize struct fields
> > >     3. drm_bridge_add(): publish bridge into global bridge_list
> > > 
> > > Now the bridge can be found and pointers taken and used.  
> > 
> > We agree so far.
> 
> Good :-)
> 
> > > And on hardware removal, in reverse order:
> > >  
> > >  B) remove (hardware is hot-unplugged)
> > >     3. unpublish bridge
> > >     2. release resources, cleanup
> > >     1. kfree private struct  
> > 
> > I think the sequence would rather be something like:
> > 
> > B') remove
> >   3. unpublish bridge
> >   2. release device resources
> >   1. release reference
> > 
> > C') last put
> >   2. release KMS resources
> >   1. kfree private struct
> 
> Just to ensure we are on the same page: patch 3 is already implementing
> this model except for C'2.
> 
> Well, in reality it even implements a .destroy callback at C'2, even
> though it was not meant for the usage you have in mind and it's
> scheduled for removal in v6 -- even though as I said I'm OK in
> re-adding it if it is useful.
> 
> Mainly I'm not sure I understand for which ultimate goal you propose to
> postpone releasing KMS resources to C'.
> 
> Is it (1) because we _want_ to postpone releasing KMS resources? In this
> case I don't know the use case, so if you have a practical example it
> would probably help a lot.

It's not that we want it, it's that it's already happening :)

The main DRM device is only torn down not when its devices goes away,
but when the last application closes its fd to the device file. Thus,
there's a significant window (possibly infinitely long) between the
device being removed and the DRM device being freed, during which the
application will still be able to issue ioctl that might reach the
driver.

Thus, we have two kind of resources: the ones tied to the device
(clocks, register mappings, etc.) that will go away when the device is
removed, and the ones tied to the DRM device (connectors, bridges, etc.)
that need to stick until the DRM device is free'd. It's the difference
between devm and drmm actions.

> Moreover, for the panel bridge specifically, it would mean postponing
> the destruction of the struct panel_bridge, which however has a pointer
> to the panel. But the panel is probably hot-unplugged at the same time
> as the previous removable bridge(s), we'd have a time window between B'
> and C' where there is a pointer to a freed struct panel. We'd need to
> ensure that pointer is cleared at B'2, even though it is a "KMS
> resource" and not a "device resource".

You're correct, but we have to start somewhere. Fixing the issue for
bridges only will already fix it for all setups using only bridges, even
if the ones using panels are still broken.

I'm also mentoring someone at the moment to fix this for panels, so it's
only temporary.

> Or is it (2) because there are cases where we don't know how else we
> could release the KMS resources? AFAIK all bridge drivers are able to
> release everything in their remove function (B'2) with the exception of
> the panel bridge, so this sounds like a workaround for just one user
> that apparently we all agree should be improved on its own anyway.
> 
> Note I'm not strongly against (2), if it simplifies the path towards
> dynamic bridge lifetime by postponing the panel bridge rework. I just
> need to understand the plan.
> 
> Another question is what is a device resource and what is a KMS
> resource. What's the boolean expression to classify a
> resource in one or the other family? For example, in your example
> quoted above ("But we'd have the same issue if, say, we needed to
> remove a workqueue from a driver"), is the workqueue a KMS resource?

It depends on what the workqueue is doing. If it's to handle atomic
commits like the writeback code, then it's KMS facing. If it's to handle
interrupts, it's device facing.

It's hard to come up with a boolean classification, but it's basically
"can any ioctl code path end up using that resource?".

> I need to understand your idea if I want to implement it.
> 
> > > Some drivers do real stuff in B2, so it is important that B3 happens
> > > before B2, isn't it? We don't want other drivers to find and use a
> > > bridge that is being dismantled, or afterwards.  
> > 
> > Yeah, B3/B'3 should definitely happen first.
> > 
> > > B3 should normally happen by removing the bridge from the global
> > > bridge_list, or other bridges might find it. However setting the "gone"
> > > bool and teaching of_drm_find_bridge() & Co to skip bridges with
> > > gone==true would allow to postpone the actual removal, if needed.
> > > 
> > > With that said, with hotplugging there will be two distinct events:
> > > 
> > >  * hardware removal
> > >  * last ref is put
> > > 
> > > The second event could happen way later than the first one. During the
> > > time frame between the two events we need the bridge to be unpublished
> > > and the bridge resources to be already released, as the hardware is
> > > gone. We cannot do this at the last put, it's too late.
> > > 
> > > So I think the only sane sequence is:
> > > 
> > >  * on hardware removal:
> > >      B3) unpublish bridge (drm_bridge_remove() or just set gone flag)
> > >      B2) free resources, deinit whatever needed
> > >  * when last ref is put
> > >      B1) kfree (likely via devm)  
> > 
> > No, devm will have destroyed it in B'2. We need to destroy it in the
> > cleanup hook of kref_put
> 
> devm will have destroyed what? Sorry I'm not following.
>
> If you mean "it" == "the private struct", then no, this is not the
> case. drm_bridge_init in patch 3 does not kfree the private struct but
> instead registers a devm action to call drm_bridge_put. Then, at the
> last put, drm_bridge_free() will actually kfree the private struct.
> 
> In this v5, kree()ing the private struct at the last put is done via
> a callback. In my work towards v6 the principle is the same but I have
> reworked it all, implementing a devm_drm_bridge_alloc() macro as you
> suggested (BTW that was a great improvement, thanks) and removing the
> .destroy callback as it was not needed.
> 
> In case it helps, here's a preview of my v6, with some added comments to
> support this discussion:
> 
> /* Internal function (for refcounted bridges) */
> void __drm_bridge_free(struct kref *kref)
> {
>         struct drm_bridge *bridge = container_of(kref, struct drm_bridge, refcount);
>         void *container = ((void*)bridge) - bridge->container_offset;
> 
>         DRM_DEBUG("bridge=%p, container=%p FREE\n", bridge, container);
> 
>         kfree(container);
> }
> EXPORT_SYMBOL(__drm_bridge_free);
> 
> static inline void drm_bridge_put(struct drm_bridge *bridge)
> {
>         if (!drm_bridge_is_refcounted(bridge))
>                 return;
> 
>         DRM_DEBUG("bridge=%p PUT\n", bridge);
> 
>         kref_put(&bridge->refcount, __drm_bridge_free);
> }
> 
> static void drm_bridge_put_void(void *data)
> {
>         struct drm_bridge *bridge = (struct drm_bridge *)data;
> 
>         drm_bridge_put(bridge);
> }
> 
> // fold this into __devm_drm_bridge_alloc() or keep for consistency
> // with drm_encoder.c?
> static int __devm_drm_bridge_init(struct device *dev, struct drm_bridge *bridge,
>                                   size_t offset, const struct drm_bridge_funcs *funcs)
> {
>         int err;
> 
>         bridge->container_offset = offset;
>         kref_init(&bridge->refcount);
>         bridge->is_refcounted = 1;
> 
>         err = devm_add_action_or_reset(dev, drm_bridge_put_void, bridge); // <== devm just puts one ref, does not kfree
>         if (err)
>                 return err;
> 
>         bridge->funcs = funcs;
> 
>         return 0;
> }
> 
> void *__devm_drm_bridge_alloc(struct device *dev, size_t size, size_t offset,
>                               const struct drm_bridge_funcs *funcs)
> {
>         void *container;
>         struct drm_bridge *bridge;
>         int ret;
> 
>         if (!funcs) {
>                 dev_warn(dev, "Missing funcs pointer\n");
>                 return ERR_PTR(-EINVAL);
>         }
> 
>         container = kzalloc(size, GFP_KERNEL);     // <== NOT allocating with devm
>         if (!container)
>                 return ERR_PTR(-ENOMEM);
> 
>         bridge = container + offset;
> 
>         ret = __devm_drm_bridge_init(dev, bridge, offset, funcs);
>         if (ret)
>                 return ERR_PTR(ret);
> 
>         DRM_DEBUG("bridge=%p, container=%p, funcs=%ps ALLOC\n", bridge, container, funcs);
> 
>         return container;
> }
> EXPORT_SYMBOL(__devm_drm_bridge_alloc);

Awesome, I guess we were actually understanding each other the whole
time then :)

I'm still kind of sure we'll require a destroy callback to call in
__drm_bridge_free, but if it works, I guess it's good enough for now.

Maxime

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux