On 07/09/2024 8:46 am, Takashi Iwai wrote: > On Fri, 06 Sep 2024 20:42:09 +0200, > Ariadne Conill wrote: >> This patch attempted to work around a DMA issue involving Xen, but >> causes subtle kernel memory corruption. >> >> When I brought up this patch in the XenDevel matrix channel, I was >> told that it had been requested by the Qubes OS developers because >> they were trying to fix an issue where the sound stack would fail >> after a few hours of uptime. They wound up disabling SG buffering >> entirely instead as a workaround. >> >> Accordingly, I propose that we should revert this workaround patch, >> since it causes kernel memory corruption and that the ALSA and Xen >> communities should collaborate on fixing the underlying problem in >> such a way that SG buffering works correctly under Xen. >> >> This reverts commit 53466ebdec614f915c691809b0861acecb941e30. >> >> Signed-off-by: Ariadne Conill <ariadne@ariadne.space> >> Cc: stable@xxxxxxxxxxxxxxx >> Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx >> Cc: alsa-devel@xxxxxxxxxxxxxxxx >> Cc: Takashi Iwai <tiwai@xxxxxxx> > The relevant code has been largely rewritten for 6.12, so please check > the behavior with sound.git tree for-next branch. I guess the same > issue should happen as the Xen workaround was kept and applied there, > too, but it has to be checked at first. > > If the issue is persistent with there, the fix for 6.12 code would be > rather much simpler like the blow. > > > thanks, > > Takashi > > --- a/sound/core/memalloc.c > +++ b/sound/core/memalloc.c > @@ -793,9 +793,6 @@ static void *snd_dma_sg_alloc(struct snd_dma_buffer *dmab, size_t size) > int type = dmab->dev.type; > void *p; > > - if (cpu_feature_enabled(X86_FEATURE_XENPV)) > - return snd_dma_sg_fallback_alloc(dmab, size); > - > /* try the standard DMA API allocation at first */ > if (type == SNDRV_DMA_TYPE_DEV_WC_SG) > dmab->dev.type = SNDRV_DMA_TYPE_DEV_WC; > > Individual subsystems ought not to know or care about XENPV; it's a layering violation. If the main APIs don't behave properly, then it probably means we've got a bug at a lower level (e.g. Xen SWIOTLB is a constant source of fun) which is probably affecting other subsystems too. I think we need to re-analyse the original bug. Right now, the behaviour resulting from 53466ebde is worse than what it was trying to fix. ~Andrew