Re: ALSA sound core device deinit crash

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, 11 Jul 2016 08:45:20 +0200,
b_lkasam@xxxxxxxxxxxxxx wrote:
> 
> On 2016-07-08 11:23, Takashi Iwai wrote:
> > (Please don't drop Cc to ML)
> >
> > On Fri, 08 Jul 2016 07:35:37 +0200,
> > b_lkasam@xxxxxxxxxxxxxx wrote:
> >>
> >> On 2016-07-08 10:58, Takashi Iwai wrote:
> >> > On Fri, 08 Jul 2016 07:19:05 +0200,
> >> > b_lkasam@xxxxxxxxxxxxxx wrote:
> >> >>
> >> >> Hi Alsa team,
> >> >> There is kernel crash observed when soundcard register failure case as
> >> >> below ->
> >> >>
> >> >> diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c old mode
> >> >> 100644 new mode 100755 index 0495890..60a1eb0
> >> >> --- a/sound/soc/soc-core.c
> >> >> +++ b/sound/soc/soc-core.c
> >> >> @@ -1382,6 +1382,7 @@ static int soc_probe_link_dais(struct
> >> >> snd_soc_card *card, int num, int order)
> >> >> /* do machine specific initialization */ if (dai_link->init) { ret =
> >> >> dai_link->init(rtd);
> >> >> + ret = -ENODEV; //  -> we can force error here to reproduce crash
> >> >> easily.
> >> >> if (ret < 0) {
> >> >> dev_err(card->dev, "ASoC: failed to init %s: %d\n", dai_link->name,
> >> >> ret);
> >> >>
> >> >> If sound card fails at initialize at above, it is crashing in
> >> >> pcm_chmap_ctl_private_free().
> >> >>
> >> >> <1>[   40.646642] [01-01-2016 00:07:16 CPU:0x3] Unable to handle
> >> >> kernel
> >> >> paging request at virtual address ffffffc0da644b68
> >> >> <1>[   40.646664] [01-01-2016 00:07:16 CPU:0x3] pgd = ffffffc001d28000
> >> >> <1>[   40.646673] [01-01-2016 00:07:16 CPU:0x3] [ffffffc0da644b68]
> >> >> *pgd=00000000857fc003, *pud=00000000857fc003, *pmd=000000017ddd8003,
> >> >> *pte=00c000015a644793
> >> >> <0>[   40.646697] [01-01-2016 00:07:16 CPU:0x3] Internal error: Oops:
> >> >> 9600004f [#1] PREEMPT SMP
> >> >> <6>[   40.646708] [01-01-2016 00:07:16 CPU:0x3] Modules linked in:
> >> >> brcm_bt_drv fm_drv brcm_hci_ldisc texfat(PO)
> >> >> <6>[   40.646735] [01-01-2016 00:07:16 CPU:0x3] CPU: 3 PID: 299 Comm:
> >> >> kworker/u8:8 Tainted: P           O   3.18.24-g41b2dda2-00002-gbe25a74
> >> >> #1
> >> >> <6>[   40.646744] [01-01-2016 00:07:16 CPU:0x3] Hardware name:
> >> >> Qualcomm
> >> >> Technologies, Inc. MSM 8996 v3.x + PMI8996 MTP (DT)
> >> >> <6>[   40.646763] [01-01-2016 00:07:16 CPU:0x3] Workqueue: deferwq
> >> >> deferred_probe_work_func
> >> >> <6>[   40.646773] [01-01-2016 00:07:16 CPU:0x3] task: ffffffc0e951c880
> >> >> ti: ffffffc03368c000 task.ti: ffffffc03368c000
> >> >> <6>[   40.646787] [01-01-2016 00:07:16 CPU:0x3] PC is at
> >> >> pcm_chmap_ctl_private_free+0x1c/0x2c
> >> >> <6>[   40.646798] [01-01-2016 00:07:16 CPU:0x3] LR is at
> >> >> snd_ctl_free_one+0x20/0x34
> >> >>
> >> >>
> >> >> FIX:
> >> >>
> >> >> Can you look at the change below and share your comments on this?
> >> >> diff --git a/sound/core/device.c b/sound/core/device.c old mode 100644
> >> >> new mode 100755 index 41bec30..eaffde1
> >> >> --- a/sound/core/device.c
> >> >> +++ b/sound/core/device.c
> >> >> @@ -219,6 +219,7 @@ void snd_device_free_all(struct snd_card *card)
> >> >>
> >> >>           if (snd_BUG_ON(!card))
> >> >>                   return;
> >> >> -       list_for_each_entry_safe_reverse(dev, next, &card->devices,
> >> >> list)
> >> >> +               list_for_each_entry_safe(dev, next, &card->devices,
> >> >> list)
> >> >>                   __snd_device_free(dev); }
> >> >>
> >> >>
> >> >> Since control sound device has the lowest type value
> >> >> (SNDRV_DEV_CONTROL), it will be the first entry linked in the
> >> >> card->devices linked list head and will be the last one to be freed.
> >> >>
> >> >> This issue seems to be resolved by modifying the sequence the sound
> >> >> devices in the card->devices list are freed as shown below (from
> >> >> “prev”
> >> >> direction to “next” direction) but I’m not sure if this is a right
> >> >> approach from ALSA perspective.
> >> >
> >> > This doesn't look correct.  The strange thing is that this error
> >> > shouldn't happen no matter which free loop direction is.  The chmap
> >> > ctl should have been already removed by the disconnection before
> >> > freeing.  It means that either the disconnection isn't done properly
> >> > or something else is missing.
> >> >
> >> > Could you give the full stack trace?  It's important to know which
> >> > code path triggers it.
> >> >
> >> >
> >> > thanks,
> >> >
> >> > Takashi
> >>
> >>
> >> Hi Takashi,
> >>
> >> Here is full stack trace -->
> >>
> >>
> >> [Callstack]
> >> : (struct snd_pcm_chmap *)info = 0xFFFFFFC0DA6A5200
> >> : (struct snd_pcm *)pcm = 0xFFFFFFC0DA644A80 //part of buddy page,
> >> read-only
> >> : info->stream = 0
> >> : snd_card_free() was called
> >>
> >> -012|pcm_chmap_ctl_private_free()
> >> //info->pcm->streams[info->stream].chmap_kctl = NULL
> >> -013|snd_ctl_free_one()
> >> -014|snd_ctl_remove()
> >> -015|snd_ctl_dev_free()
> >> -016|__snd_device_free()
> >> -017|snd_device_free_all()
> >> -018|snd_card_do_free(inline)
> >> -018|release_card_device()
> >> -019|device_release()
> >> -020|kobject_cleanup(inline)
> >> -020|kobject_release()
> >> -021|kobject_put()
> >> -022|put_device()
> >> -023|snd_card_free_when_closed()
> >> -024|snd_card_free()
> >> -025|snd_soc_instantiate_card(inline) -> //Here instantiate card
> >> failed for some reason, then triggers snd_card_free
> >> -025|snd_soc_register_card()
> >> -026|msm8996_asoc_machine_probe()
> >> -027|platform_drv_probe()
> >> -028|really_probe(inline)
> >> -028|driver_probe_device()
> >> -029|__device_attach()
> >> -030|bus_for_each_drv()
> >> -031|device_attach()
> >> -032|bus_probe_device()
> >> -033|deferred_probe_work_func()
> >> -034|static_key_count(inline)
> >> -034|static_key_false(inline)
> >> -034|trace_workqueue_execute_end(inline)
> >> -034|process_one_work()
> >> -035|worker_thread()
> >> -036|kthread()
> >> -037|ret_from_fork(asm)
> >> ---|end of frame
> >>
> >>
> >> -----------------------
> >> trigger point in soc-core.c
> >> in API snd_soc_instantiate_card(),
> >>
> >>         card_probe_error:
> >>                  if (card->remove)
> >>                         card->remove(card);
> >>
> >>                  snd_card_free(card->snd_card); -> this is
> >> snd_card_free which internally leads to above function call.
> >> ------------------------------------------
> >
> > OK, thanks.  So this is the case where it frees before registering,
> > and indeed there is a bug in the PCM chmap code.  It's freed at
> > disconnection but the disconnect is called only when it was
> > registered.
> >
> > Below is a quick fix.  Give it a try.
> >
> >
> > thanks,
> >
> > Takashi
> >
> > ---
> > diff --git a/sound/core/control.c b/sound/core/control.c
> > index 9ff081cd03f4..fb096cb20a80 100644
> > --- a/sound/core/control.c
> > +++ b/sound/core/control.c
> > @@ -160,6 +160,8 @@ void snd_ctl_notify(struct snd_card *card,
> > unsigned int mask,
> >
> >  	if (snd_BUG_ON(!card || !id))
> >  		return;
> > +	if (card->shutdown)
> > +		return;
> >  	read_lock(&card->ctl_files_rwlock);
> >  #if IS_ENABLED(CONFIG_SND_MIXER_OSS)
> >  	card->mixer_oss_change_count++;
> > diff --git a/sound/core/pcm.c b/sound/core/pcm.c
> > index 308c9ecf73db..8e980aa678d0 100644
> > --- a/sound/core/pcm.c
> > +++ b/sound/core/pcm.c
> > @@ -849,6 +849,14 @@ int snd_pcm_new_internal(struct snd_card *card,
> > const char *id, int device,
> >  }
> >  EXPORT_SYMBOL(snd_pcm_new_internal);
> >
> > +static void free_chmap(struct snd_pcm_str *pstr)
> > +{
> > +	if (pstr->chmap_kctl) {
> > +		snd_ctl_remove(pstr->pcm->card, pstr->chmap_kctl);
> > +		pstr->chmap_kctl = NULL;
> > +	}
> > +}
> > +
> >  static void snd_pcm_free_stream(struct snd_pcm_str * pstr)
> >  {
> >  	struct snd_pcm_substream *substream, *substream_next;
> > @@ -871,6 +879,7 @@ static void snd_pcm_free_stream(struct
> > snd_pcm_str * pstr)
> >  		kfree(setup);
> >  	}
> >  #endif
> > +	free_chmap(pstr);
> >  	if (pstr->substream_count)
> >  		put_device(&pstr->dev);
> >  }
> > @@ -1135,10 +1144,7 @@ static int snd_pcm_dev_disconnect(struct
> > snd_device *device)
> >  	for (cidx = 0; cidx < 2; cidx++) {
> >  		if (!pcm->internal)
> >  			snd_unregister_device(&pcm->streams[cidx].dev);
> > -		if (pcm->streams[cidx].chmap_kctl) {
> > -			snd_ctl_remove(pcm->card, pcm->streams[cidx].chmap_kctl);
> > -			pcm->streams[cidx].chmap_kctl = NULL;
> > -		}
> > +		free_chmap(&pcm->streams[cidx]);
> >  	}
> >  	mutex_unlock(&pcm->open_mutex);
> >  	mutex_unlock(&register_mutex);
> 
> 
> hi Takashi,
> Thanks for the patch.
> 
> It is working fine after above patch.

Good to hear.  I queued the fix now.  It'll be likely included in
4.7-final.


thanks,

Takashi
_______________________________________________
Alsa-devel mailing list
Alsa-devel@xxxxxxxxxxxxxxxx
http://mailman.alsa-project.org/mailman/listinfo/alsa-devel




[Index of Archives]     [ALSA User]     [Linux Audio Users]     [Kernel Archive]     [Asterisk PBX]     [Photo Sharing]     [Linux Sound]     [Video 4 Linux]     [Gimp]     [Yosemite News]

  Powered by Linux