On Fri, 04 Aug 2023 21:17:56 +0200, Curtis Malainey wrote: > > On Fri, Aug 4, 2023 at 1:58 AM Takashi Iwai <tiwai@xxxxxxx> wrote: > > > > Now I've been reconsidering the problem, and thought of another > > possible workaround. We may add the card's refcount control for the > > device init and release, so that we delay the actual resource free. > > The basic idea is to take card->card_ref at the device init and unref > > it at the actual device release callback. Then the snd_card_free() > > call is held until all those refcounted devices are released. > > > > Below is a PoC patch (yes, this can be split, too ;) > > A quick test on VM seems OK, but needs more reviews and tests. > > > > It contains somewhat ugly conditional put_device() at the dev_free > > ops. We can make those a bit saner with some helpers later, too. > > > > BTW, this approach may avoid another potential problem by the delayed > > release; if we finish the driver remove without waiting for the actual > > device releases, it may hit a code (the piece of the device release > > callback) of the already removed module, and it's not guaranteed to be > > present. > > I'm not sure whether this really happens, but it's another thing to be > > considered. > > > > > > thanks, > > > > Takashi > > > > --- > > diff --git a/include/sound/core.h b/include/sound/core.h > > index f6e0dd648b80..00c514a80a4a 100644 > > --- a/include/sound/core.h > > +++ b/include/sound/core.h > > Unfortunately it doesn't hold up in my testing, hit the devm vs device > race bug after a little over 30min of hammering snd-dummy (on top of > your for-next branch) (snip) > I was talking with Stephen Boyd about this bug and his recommendation > was to keep snd_card always out of devm and just allocate a pointer in > devm to snd_card to puppet it as if it was managed via devm and just > reference count rather than release the card so that its always > handled through device->release. What do you think? This would go > alongside the current patch proposed. Indeed, that's another issue about devres vs delayed kobj release. A quick fix would be something like below, I suppose. (note: totally untested) thanks, Takashi --- a/sound/core/init.c +++ b/sound/core/init.c @@ -188,7 +188,10 @@ EXPORT_SYMBOL(snd_card_new); static void __snd_card_release(struct device *dev, void *data) { - snd_card_free(data); + struct snd_card **card_p = data; + + if (card_p) + snd_card_free(*card_p); } /** @@ -221,21 +224,25 @@ int snd_devm_card_new(struct device *parent, int idx, const char *xid, struct snd_card **card_ret) { struct snd_card *card; + struct snd_card **card_devres; int err; *card_ret = NULL; - card = devres_alloc(__snd_card_release, sizeof(*card) + extra_size, - GFP_KERNEL); + card_devres = devres_alloc(__snd_card_release, sizeof(void *), GFP_KERNEL); + if (!card_devres) + return -ENOMEM; + devres_add(parent, card_devres); + + card = kzalloc(sizeof(*card) + extra_size, GFP_KERNEL); if (!card) return -ENOMEM; + card->managed = true; err = snd_card_init(card, parent, idx, xid, module, extra_size); - if (err < 0) { - devres_free(card); /* in managed mode, we need to free manually */ + if (err < 0) return err; - } - devres_add(parent, card); + *card_devres = card; *card_ret = card; return 0; } @@ -295,8 +302,7 @@ static int snd_card_init(struct snd_card *card, struct device *parent, mutex_unlock(&snd_card_mutex); dev_err(parent, "cannot find the slot for index %d (range 0-%i), error: %d\n", idx, snd_ecards_limit - 1, err); - if (!card->managed) - kfree(card); /* manually free here, as no destructor called */ + kfree(card); /* manually free here, as no destructor called */ return err; } set_bit(idx, snd_cards_lock); /* lock it */ @@ -592,8 +598,7 @@ static int snd_card_do_free(struct snd_card *card) #endif if (card->release_completion) complete(card->release_completion); - if (!card->managed) - kfree(card); + kfree(card); return 0; }