Hi Stefan, thank you for the report. I think what happens is the following: On Wed, 2018-09-12 at 15:26 -0700, Stefan Agner wrote: > Hi, > > While working on Apalis iMX6 with four displays, I encountered the > following crash: > [...] > [ 3.781163] imx-drm display-subsystem: bound 120000.hdmi (ops dw_hdmi_imx_ops) > [ 3.788441] imx-drm display-subsystem: binding ldb (ops imx_ldb_ops) > [ 3.794902] imx_ldb_bind, imx_ldb ec0da010 component_bind calls devres_open_group right before component->ops->bind The devmem_kzalloc'ed imx_ldb_channel that contains the encoder is allocated in this devres group. > [ 3.799818] drm_encoder_init, encoder ec0da388 > [ 3.804363] drm_encoder_init, ret 0 drm_encoder_init adds the encoder in imx_ldb to the mode_config.encoder_list. > [ 3.807908] imx_ldb_register, encoder ec0da388 > [ 3.812432] imx_ldb_register, ret 0 > [ 3.815951] imx_ldb_bind encoder LVDS-46, ec0da388 ->func c0e6371c > [ 3.822227] imx_ldb_bind, done > [ 3.825331] imx-drm display-subsystem: bound ldb (ops imx_ldb_ops) > [ 3.831614] imx-drm display-subsystem: binding parallel-display-controller1 (ops imx_pd_ops) > [ 3.840285] drm_encoder_init, encoder ec8a2b78 > [ 3.844931] imx-drm display-subsystem: Linked as a consumer to panel > [ 3.851472] imx-drm display-subsystem: bound parallel-display-controller1 (ops imx_pd_ops) > [ 3.859785] imx-drm display-subsystem: binding parallel-display-controller0 (ops imx_pd_ops) > [ 3.868561] imx-drm display-subsystem: failed to bind parallel-display-controller0 (ops imx_pd_ops): -517 > [ 3.878225] imx-drm display-subsystem: Dropping the link to panel > [ 3.884679] imx_ldb_unbind > [ 3.887421] imx_ldb_unbind, encoder LVDS-46, ec0da388 ->func c0e6371c > [ 3.893950] imx_ldb_unbind, encoder (null), ec0da838 ->func 0 > [ 3.899723] imx_ldb_unbind, done component_unbind calls devres_release_group after component->ops- >unbind, freeing imx_ldb and with it the encoder that is still in the mode_config.encoder_list > [ 3.908345] imx_drm_bind, component_bind_all, ret -517 > [ 3.913638] drm_mode_config_cleanup, encoder TMDS-44, ec861894 ->funcs c0e63a18 > [ 3.921260] drm_mode_config_cleanup, calling encoder->funcs->destroy! > [ 3.927897] drm_encoder_cleanup, encoder ec861894 > [ 3.932717] drm_mode_config_cleanup, encoder (null), ec0da388 ->funcs 0 Use after free. > [ 3.939356] drm_mode_config_cleanup, calling encoder->funcs->destroy! > [ 3.946050] Unable to handle kernel NULL pointer dereference at virtual address 00000004 [...] > The Apalis iMX6 device tree I work with uses a dumb VGA bridge and a LVDS > display directly specified in the ldb node. In the case above I did not > compile in the dumb VGA bridge driver, that is the reason why binding > parallel-display-controller0 fails. I suppose this means that component drivers must not allocate the encoder in the component devres group during bind. Not only their own failure, but any other component's failure can cause component_unbind to be called in the cleanup path of component_bind_all, freeing the memory before drm_mode_config_cleanup is called. > The LVDS encoder (0xec1a0388 in that case) is somehow already cleaned up > when calling drm_mode_config_cleanup(), which leads to a null pointer > deference when trying to call destroy. > > I was unable to figure out why the DRM encoder in the second > drm_mode_config_cleanup() call is already zeroed out. The component > framework calls imx_ldb_unbind() first, but that should not free > up the encoder afaict. Any idea? > > I was also wondering in the deferred probing case, functions like > imx_ldb_bind() call devm_kzalloc on every try. Since the underlying > device is not removed, doesn't this leads to multiple active allocations? See above. To me it looks like we have to allocate imx_ldb in probe and deal with possibly partially preinitialzied structure when bind is called a second time. regards Philipp _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/dri-devel