On Sun, Apr 19, 2015 at 05:43:18PM +0200, Dorian Gray wrote:
> I think the case is closed.
> Now that I know it's not USB, but wireless driver, I looked through
> the new k3.19.5's changelog and saw this:
>
>
> commit b943e69d33fac1e5f6db57868e061096b0aae67a
> Author: Larry Finger <Larry.Finger@xxxxxxxxxxxx>
> Date:   Sat Mar 21 15:16:05 2015 -0500
>
>     rtlwifi: Fix IOMMU mapping leak in AP mode
>
>     commit be0b5e635883678bfbc695889772fed545f3427d upstream.
>
>     Transmission of an AP beacon does not call the TX interrupt service
>     routine, which usually does the cleanup. Instead, cleanup is handled
>     in a tasklet completion routine. Unfortunately, this routine has a
>     serious bug in that it does not release the DMA mapping before it
>     frees the skb, thus one IOMMU mapping is leaked for each beacon.
>     The test system failed with no free IOMMU mapping slots
>     approximately one hour after hostapd was used to start an AP.
>
>     This issue was reported and tested at
>     https://github.com/lwfinger/rtlwifi_new/issues/30.
>
>     Reported-and-tested-by: Kevin Mullican <kevin@xxxxxxxxxxxx>
>     Cc: Kevin Mullican <kevin@xxxxxxxxxxxx>
>     Signed-off-by: Shao Fu <shaofu@xxxxxxxxxxx>
>     Signed-off-by: Larry Finger <Larry.Finger@xxxxxxxxxxxx>
>     Signed-off-by: Kalle Valo <kvalo@xxxxxxxxxxxxxx>
>     Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
>
>
> Looks very related, especially because my wireless card is also always
> in AP mode, however I haven't been actually using it lately, so
> probably that's why I didn't notice anything related to it (and kept
> focused on USB), until I used dump_dma.
>
> Well, due to my minimal knowledge regarding kernel's internals I can't
> be 100% sure that this was it, but so far 3.19.5 is working stable
> (uptime 6hrs and counting).

Sweet!

>
> Thank you Konrad (and everyone else involved) for helping me out to
> pinpoint the actual culprit.

Sure thing. Happy to have been able to help!
> Jake
>
>
> On 18 April 2015 at 21:59, Dorian Gray <yourfavouritegod@xxxxxxxxx> wrote:
> > On 18 April 2015 at 12:10, Dorian Gray <yourfavouritegod@xxxxxxxxx> wrote:
> >> On 17 April 2015 at 22:06, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx> wrote:
> >>> On Fri, Apr 17, 2015 at 05:14:20PM +0200, Dorian Gray wrote:
> >>>> On 16 April 2015 at 20:42, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx> wrote:
> >>>> > An easier way is to compile the kernel with CONFIG_DMA_API_DEBUG
> >>>> > and then load the attached module.
> >>>> >
> >>>> > That should tell you who and what else is holding on the buffers.
> >>>>
> >>>> Ok, I have compiled 3.19.4 w/ CONFIG_DMA_API_DEBUG=y + the module you sent me.
> >>>> Now, I'm not sure if I've done it right - I waited until the error
> >>>> occurred and then modprobe'd dump_dma.
> >>>> I have attached the kernel log, but it tells me not much, if anything...
> >>>
> >>> The network driver is quite hungry for DMA. Did it do the same thing
> >>> in the earlier kernels?
> >>>
> >>> Thanks.
> >>>>
> >>>> Thanks again.
> >>>> Jake
> >>>
> >>>
> >>
> >> Yeah, you're right:
> >>
> >> # grep rtl8192se dump_dma_k3.19.4.log | wc -l
> >> 6789
> >> #
> >> # grep rtl8192se dump_dma_k3.17.8.log | wc -l
> >> 162
> >> #
> >>
> >> So, wlan driver would be the real culprit then..?
> >> I would have never thought...
> >>
> >> I guess I'm gonna test 3.19.4 once more (just to be sure) with
> >> rtl8192se removed and see what happens.
> >>
> >> Thanks!
> >> Jake
> >
> >
> > [update]
> >
> > Ok, 6 hours of uptime (3.19.4 + blacklisted rtl8192se) and everything
> > was fine...
> > However, I was checking periodically and noticed that 'radeon' also
> > tends to grow continuously over time, whereas ethernet driver sticks
> > to, more or less, the same range:
> >
> > # uname -r
> > 3.19.4
> > #
> > # grep -Eo 'radeon|r8169' L1.log | sort | uniq -c
> >      62 r8169
> >    4183 radeon
> > #
> > # grep -Eo 'radeon|r8169' L2.log | sort | uniq -c
> >      33 r8169
> >    5582 radeon
> > #
> > # grep -Eo 'radeon|r8169' L3.log | sort | uniq -c
> >      54 r8169
> >    7007 radeon
> > #
> > # grep -Eo 'radeon|r8169' L4.log | sort | uniq -c
> >      49 r8169
> >    7429 radeon
> > #
> > # grep -Eo 'radeon|r8169' L5.log | sort | uniq -c
> >      34 r8169
> >    9360 radeon
> > #
> >
> > It doesn't grow that much in 3.17.8:
> >
> > # uname -r
> > 3.17.8
> > #
> > # grep -Eo 'radeon|r8169|rtl8192se' L1.log | sort | uniq -c
> >     265 r8169
> >    1229 radeon
> >     142 rtl8192se
> > #
> > # grep -Eo 'radeon|r8169|rtl8192se' L2.log | sort | uniq -c
> >     187 r8169
> >    3159 radeon
> >     124 rtl8192se
> > #
> > # grep -Eo 'radeon|r8169|rtl8192se' L3.log | sort | uniq -c
> >      41 r8169
> >    1894 radeon
> >      39 rtl8192se
> > #
> > # grep -Eo 'radeon|r8169|rtl8192se' L4.log | sort | uniq -c
> >      64 r8169
> >    3370 radeon
> >      77 rtl8192se
> > #
> > # grep -Eo 'radeon|r8169|rtl8192se' L5.log | sort | uniq -c
> >      52 r8169
> >    2597 radeon
> >      49 rtl8192se
> > #
> >
> >
> > Btw, at some point (3.19.4) I encountered this:
> > [21631.181909] DMA-API: debugging out of memory - disabling
> >
> > Jake
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
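
The per-snapshot tallies quoted in this thread were collected with one grep per log. A minimal sketch of gathering them in a single pass follows, assuming the dump_dma output has been saved to plain-text files such as the L1.log ... L5.log named in the thread. The function name count_drivers is made up for illustration and is not part of any tool mentioned here.

```shell
#!/bin/sh
# Hypothetical helper: tally driver-name matches in each saved dump_dma log.
# Usage: count_drivers 'radeon|r8169|rtl8192se' L1.log L2.log ...
count_drivers() {
    pattern=$1
    shift
    for log in "$@"; do
        # grep -Eo prints one output line per match; awk tallies them by
        # name, and sort fixes the order (awk's "for in" order is arbitrary).
        grep -Eo "$pattern" "$log" |
            awk -v f="$log" '{ n[$0]++ }
                END { for (d in n) printf "%s %s %d\n", f, d, n[d] }' |
            LC_ALL=C sort
    done
}
```

Called as count_drivers 'radeon|r8169|rtl8192se' L1.log L2.log L3.log L4.log L5.log, it prints one "file driver count" line per driver per snapshot, so a count that keeps climbing across snapshots is easier to spot than in separate uniq -c outputs.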