On Sun, Nov 24, 2024 at 11:23 PM Baochen Qiang <quic_bqiang@xxxxxxxxxxx> wrote: > > > > On 11/23/2024 8:43 AM, Tim Harvey wrote: > > On Thu, Nov 21, 2024 at 9:51 PM Baochen Qiang <quic_bqiang@xxxxxxxxxxx> wrote: > >> > >> > >> > >> On 11/22/2024 5:50 AM, Tim Harvey wrote: > >>> On Tue, Nov 19, 2024 at 6:32 PM Baochen Qiang <quic_bqiang@xxxxxxxxxxx> wrote: > >>>> > >>>> > >>>> > >>>> On 11/20/2024 4:16 AM, Tim Harvey wrote: > >>>>> Greetings, > >>>>> > >>>>> I've got an ath11k card that is failing to init on an IMX8MM system > >>>>> with 4GB of DRAM: > >>>>> [ 7.551582] ath11k_pci 0000:01:00.0: BAR 0 [mem > >>>>> 0x18000000-0x181fffff 64bit]: assigned > >>>>> [ 7.551713] ath11k_pci 0000:01:00.0: enabling device (0000 -> 0002) > >>>>> [ 7.552401] ath11k_pci 0000:01:00.0: MSI vectors: 16 > >>>>> [ 7.552440] ath11k_pci 0000:01:00.0: qcn9074 hw1.0 > >>>>> [ 7.887186] mhi mhi0: Loaded FW: ath11k/QCN9074/hw1.0/amss.bin, > >>>>> sha256: 5ee1b7b204541b5f99984f21d694ececaec08fbce1b520ffe6fe740b02a4afd7 > >>>>> [ 8.435964] ath11k_pci 0000:01:00.0: chip_id 0x0 chip_family 0x0 > >>>>> board_id 0xff soc_id 0xffffffff > >>>>> [ 8.435991] ath11k_pci 0000:01:00.0: fw_version 0x270206d0 > >>>>> fw_build_timestamp 2022-08-04 12:48 fw_build_id > >>>>> WLAN.HK.2.7.0.1-01744-QCAHKSWPL_SILICONZ-1 > >>>>> [ 8.441700] ath11k_pci 0000:01:00.0: Loaded FW: > >>>>> ath11k/QCN9074/hw1.0/board-2.bin, sha256: > >>>>> dbf0ca14aa1229eccd48f26f1026901b9718b143bd30b51b8ea67c84ba6207f1 > >>>>> [ 9.753764] ath11k_pci 0000:01:00.0: Loaded FW: > >>>>> ath11k/QCN9074/hw1.0/m3.bin, sha256: > >>>>> b6d957f335073a15a8de809398e1506f0200a08747eaf7189c843cf519ffc1de > >>>>> [ 9.789791] ath11k_pci 0000:01:00.0: swiotlb buffer is full (sz: > >>>>> 1048583 bytes), total 32768 (slots), used 2528 (slots) > >>>>> [ 9.789853] ath11k_pci 0000:01:00.0: failed to set up tcl_comp ring (0) :-12 > >>>>> [ 9.790238] ath11k_pci 0000:01:00.0: failed to init DP: -12 > >>>>> root@noble-venice:~# cat /proc/cmdline > >>>>> console=ttymxc1,115200 earlycon=ec_imx6q,0x30890000,115200 > >>>>> root=PARTUUID=5cdde84f-01 rootwait net.ifnames=0 cma=196M > >>>>> > >>>>> The IMX8MM's DRAM base is at 1GB so anything above 3GB hits the 32bit > >>>>> address boundary. If I pass in a mem=3096M the device registers just > >>>>> fine. > >>>> yeah ... that parameter makes kernel alloc memory below 32bit boundary, thus swiotlb is not necessary. > >>> > >>> Hi Baochen, > >>> > >>> Yes, that makes sense as I step through the code. On IMX8M with DRAM > >>> 3GB or less dma_capable(...) is true so swiotlb bounce buffers are not > >>> needed. > >>> > >>>> > >>>>> > >>>>> I found this to be the case with modern kernels however I found > >>>>> differing behavior with older kernels: > >>>>> - 6.6 and 6.1 the device registers with 4GB DRAM but crashes on client connect > >>>>> - 5.15 devices registers with 4GB DRAM and appears to work just fine > >>>> are you using Linus' tree or the stable tree? > >>>> > >>> > >>> For 6.6 I tested stable. > >> can you try Linus's tree ? as I know the stable tree is possible to miss some important fix. > >> > >>> > >>> This likely has something to do with commit dbd73acb22d8 ("wifi: > >>> ath11k: enable 36 bit mask for stream DMA") but it would seem to me > >>> that patch was trying to avoid the entire 32bit DMA limitation. Maybe > >>> that patch sets the ath11k device DMA mask to 36 bits but maybe the > >>> IMX8M PCI DMA is only capable of 32bits? > >> that patch is making situation better, not worse. that said, it helps to avoid swiotlb in > >> ath11k DMA, rather than to get it involved. > >> > > > > Yes, that patch would be an improvement on systems capable of > > addressing 64bit memory but not on the IMX8M which is seemingly > > capable of only 32bit DMA over PCI. > > > >>> > >>>>> > >>>>> Could anyone explain what is going on here? Obviously there have been > >>>>> changes at some point to start using swiotlb which I believe was all > >>>>> about avoiding 32bit DMA limitations but I'm not clear how I should be > >>>>> configuring this for IMX8MM with 4GB DRAM. Maybe my kernel IOMMU > >>>>> configuration is incorrect somehow? > >>>> there are quite some options associated with IOMMU, not sure which one might be causing this. But basically you may check: > >>>> > >>>> CONFIG_IOMMU_IOVA > >>>> CONFIG_IOMMU_API > >>>> CONFIG_IOMMU_SUPPORT > >>>> CONFIG_IOMMU_DMA=y > >>>> > >>> > >>> These are enabled which I believe appropriate for IMX8M. If I want to > >>> utilize the full 4GB DRAM on IMX then I must use IOMMU and swiotlb > >>> which would mean a performance hit due to copying mem to/from bounce > >>> buffers not to mention the fact that I can't figure out how to > >>> configure the system to avoid the 'swiotlb swiotlb buffer is full' > >>> issue. > > > > My statement regarding needing an IOMMU above is wrong; apparently the > > IMX8M SoC's don't have an IOMMU but the fact I have it enabled in the > > kernel should be a don't-care. If I understand swiotlb correctly, if I > > did have an IOMMU then it would be used instead of swiotlb. > > > >>> > >>> Enabling CONFIG_SWIOTLB_DYNAMIC does not help nor does increasing the > >>> number of slots - it has something to do with the number/size of DMA > >>> buffers that ath11k is asking for: > >> yeah, ath11k asks for fixed size DMA buffer regardless of that config. > >> > >>> # dmesg | grep swiotlb_tbl_map_single > >>> [ 5.237731] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 16384 (slots=32768/ 32) > >>> [ 5.247519] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 16416 (slots=32768/ 64) > >>> [ 5.261794] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 16448 (slots=32768/ 96) > >>> [ 5.275114] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 16480 (slots=32768/ 128) > >>> [ 5.287757] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 16512 (slots=32768/ 160) > >>> [ 5.299688] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 16544 (slots=32768/ 192) > >>> [ 5.312482] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 16576 (slots=32768/ 224) > >>> [ 5.324493] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 16608 (slots=32768/ 256) > >>> [ 5.337001] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 16640 (slots=32768/ 288) > >>> [ 5.346754] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 16672 (slots=32768/ 320) > >>> [ 5.356571] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 16704 (slots=32768/ 352) > >>> [ 5.366372] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 16736 (slots=32768/ 384) > >>> [ 5.376164] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 16768 (slots=32768/ 416) > >>> [ 5.385944] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 16800 (slots=32768/ 448) > >>> [ 5.395712] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 16832 (slots=32768/ 480) > >>> [ 5.408270] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 16864 (slots=32768/ 512) > >>> [ 5.419768] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 16896 (slots=32768/ 544) > >>> [ 5.430966] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 16928 (slots=32768/ 576) > >>> [ 5.442368] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 16960 (slots=32768/ 608) > >>> [ 5.452422] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 16992 (slots=32768/ 640) > >>> [ 5.463507] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 17024 (slots=32768/ 672) > >>> [ 5.473536] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 17056 (slots=32768/ 704) > >>> [ 5.485661] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 17088 (slots=32768/ 736) > >>> [ 5.495404] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 17120 (slots=32768/ 768) > >>> [ 5.509626] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 17152 (slots=32768/ 800) > >>> [ 5.519353] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 17184 (slots=32768/ 832) > >>> [ 5.529077] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 17216 (slots=32768/ 864) > >>> [ 5.538799] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 17248 (slots=32768/ 896) > >>> [ 5.548517] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 17280 (slots=32768/ 928) > >>> [ 5.558238] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 17312 (slots=32768/ 960) > >>> [ 5.567965] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 17344 (slots=32768/ 992) > >>> [ 5.578943] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 0 (slots=32768/ 992) > >>> [ 5.578964] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 52B index= 8192 (slots=32768/ 993) > >>> [ 5.599793] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 32 (slots=32768/ 992) > >>> [ 5.599861] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 68B index= 8193 (slots=32768/ 993) > >>> [ 5.609589] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 64 (slots=32768/ 993) > >>> [ 5.628921] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 96 (slots=32768/ 992) > >>> [ 5.638703] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 68B index= 17376 (slots=32768/ 993) > >>> [ 5.649602] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 128 (slots=32768/ 992) > >>> [ 5.659389] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 160 (slots=32768/ 992) > >>> [ 5.674038] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 96B index= 17377 (slots=32768/ 993) > >>> [ 5.685016] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 192 (slots=32768/ 992) > >>> [ 5.694819] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 224 (slots=32768/ 992) > >>> [ 5.694831] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 52B index= 17378 (slots=32768/ 993) > >>> [ 5.714194] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 40B index= 17379 (slots=32768/ 994) > >>> [ 5.725089] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 256 (slots=32768/ 992) > >>> [ 5.753507] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 6224B index= 17380 (slots=32768/ 996) > >>> [ 5.764668] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 288 (slots=32768/ 992) > >>> [ 5.774456] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 320 (slots=32768/ 992) > >>> [ 5.774620] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 6224B index= 17384 (slots=32768/ 996) > >>> [ 5.795091] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 352 (slots=32768/ 992) > >>> [ 5.795241] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 6224B index= 17388 (slots=32768/ 996) > >>> [ 5.815724] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 384 (slots=32768/ 992) > >>> [ 5.815884] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 6224B index= 17392 (slots=32768/ 996) > >>> [ 5.836357] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 416 (slots=32768/ 992) > >>> [ 5.836368] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 52B index= 8194 (slots=32768/ 993) > >>> [ 5.855856] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 6224B index= 17396 (slots=32768/ 997) > >>> [ 5.866818] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 448 (slots=32768/ 992) > >>> [ 5.866978] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 6224B index= 17400 (slots=32768/ 996) > >>> [ 5.887451] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 480 (slots=32768/ 992) > >>> [ 5.897231] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 512 (slots=32768/ 992) > >>> [ 5.897389] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 6224B index= 17404 (slots=32768/ 996) > >>> [ 5.917866] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 544 (slots=32768/ 992) > >>> [ 5.918026] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 6224B index= 17408 (slots=32768/ 996) > >>> [ 5.938489] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 576 (slots=32768/ 992) > >>> [ 5.938642] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 6224B index= 17412 (slots=32768/ 996) > >>> [ 5.959121] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 608 (slots=32768/ 992) > >>> [ 5.959135] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 52B index= 8195 (slots=32768/ 993) > >>> [ 5.978619] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 6224B index= 17416 (slots=32768/ 997) > >>> [ 5.989588] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 640 (slots=32768/ 992) > >>> [ 5.989738] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 6224B index= 17420 (slots=32768/ 996) > >>> [ 6.010215] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 672 (slots=32768/ 992) > >>> [ 6.020001] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 704 (slots=32768/ 992) > >>> [ 6.020158] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 6224B index= 17424 (slots=32768/ 996) > >>> [ 6.040643] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 736 (slots=32768/ 992) > >>> [ 6.040798] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 6224B index= 17428 (slots=32768/ 996) > >>> [ 6.061287] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 768 (slots=32768/ 992) > >>> [ 6.061437] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 6224B index= 17432 (slots=32768/ 996) > >>> [ 6.081918] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 800 (slots=32768/ 992) > >>> [ 6.081929] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 52B index= 8196 (slots=32768/ 993) > >>> [ 6.101409] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 6224B index= 17436 (slots=32768/ 997) > >>> [ 6.112375] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 832 (slots=32768/ 992) > >>> [ 6.112528] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 6224B index= 17440 (slots=32768/ 996) > >>> [ 6.133004] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 864 (slots=32768/ 992) > >>> [ 6.142785] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 896 (slots=32768/ 992) > >>> [ 6.142949] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 6224B index= 17444 (slots=32768/ 996) > >>> [ 6.163426] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 928 (slots=32768/ 992) > >>> [ 6.163576] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 6224B index= 17448 (slots=32768/ 996) > >>> [ 6.184058] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 960 (slots=32768/ 992) > >>> [ 6.184208] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 6224B index= 17452 (slots=32768/ 996) > >>> [ 6.204691] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 992 (slots=32768/ 992) > >>> [ 6.204704] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 52B index= 8197 (slots=32768/ 993) > >>> [ 6.224183] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 6224B index= 17456 (slots=32768/ 997) > >>> [ 6.235148] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 1024 (slots=32768/ 992) > >>> [ 6.235308] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 6224B index= 17460 (slots=32768/ 996) > >>> [ 6.255777] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 1056 (slots=32768/ 992) > >>> [ 6.265552] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 1088 (slots=32768/ 992) > >>> [ 6.265633] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 2128B index= 17464 (slots=32768/ 994) > >>> [ 6.286142] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 1120 (slots=32768/ 992) > >>> [ 6.286182] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 72B index= 17466 (slots=32768/ 993) > >>> [ 7.574489] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 1152 (slots=32768/ 992) > >>> [ 7.584645] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 60B index= 17467 (slots=32768/ 993) > >>> [ 7.595593] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 1184 (slots=32768/ 992) > >>> [ 7.595608] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 52B index= 8198 (slots=32768/ 993) > >>> [ 7.605359] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 1216 (slots=32768/ 993) > >>> [ 7.624703] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 452B index= 1248 (slots=32768/ 993) > >>> [ 7.635603] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 1280 (slots=32768/ 992) > >>> [ 7.645344] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 52B index= 1312 (slots=32768/ 993) > >>> [ 7.656247] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 1314 (slots=32768/ 992) > >>> [ 7.683567] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= > >>> 65535B index= 1346 (slots=32768/ 992) > >>> [ 7.696095] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single > >>> size=1048583B index= -1 (slots=32768/ 992) > >>> > >>> I'm still trying to understand the swiotlb allocation to see if there > >>> is some configuration change I should be making. > >> > >> I suspect you hit the same issue mentioned here: > >> > >> https://lore.kernel.org/all/CAOMZO5A7+nxACoBPY0k8cOpVQByZtEV_N1489MK5wETHF_RXWA@xxxxxxxxxxxxxx/ > >> > >> so can you check if below commit present in your kernel, and if not could you pick it up > >> and try again? > >> > >> commit 14cebf689a78 ("swiotlb: Reinstate page-alignment for mappings >= PAGE_SIZE") > ignore this request since it should be no related to your issue :( > > >> > > > > I bisected the 'swiotlb buffer is full' issue back to commit > > aaf244141ed7 ("wifi: ath11k: fix IOMMU errors on buffer rings") which > > looks to me to be a legitimate fix and if I revert it swiotlb is now > > happy and the driver registers but I get the crash on client connect > > that I was seeing in 6.6 so that commit fixes an issue, but causes > > swiotlb to not be fulfilled. > not really ... that commit is not the cause to your issue. you don;t see the 'swiotlb > full' error after revert it simply because dma_map_single() is NOT called then. > > > > > > The issue seems to be that the swiotlb memory buffer allocator is > > getting too fragmented to be useful with what ath11k is now asking for > > (a lot of 2K and 64K buffers and then finally a 1048583B buffer which > > fails due to the fragmentation of the swiotlb buffer. > no, the direct cause to 'swiotlb full' error is that kernel does not allow a swiotlb map > request larger than 256kb [1]: > > 'A single allocation from swiotlb is limited to IO_TLB_SIZE * IO_TLB_SEGSIZE bytes, which > is 256 KiB with current definitions' > > while here ath11k is requesting a buffer of 1048583 bytes. > > > howevr the question is that why swiotlb is involved here: for streamed DMA operation > ath11k is capable of addressing 64GB memory (with 36bit DMA mask), in your case this > covers whole system memory. the most possible reason I can think of is that swiotlb is > forcebly enabled in your kernel (with swiotlb=force?) such that each DMA buffer would be > bounced by swiotlb regardless of its physical address. > I do not have swiotlb forced explicitly. Again, this is because I'm on a IMX8MM with 4GiB DRAM which has no IOMMU and a 32bit DMA where peripherals can not access memory over 3GiB as its base DRAM starting at 1GiB (so swiotlb is getting used with a DRAM size >3GiB). Reverting commit d0e2523bfa9c ("ath11k: allocate HAL_WBM2SW_RELEASE ring from cacheable memory") indeed resolves this issue. I notice that ath12k has a similar architecture as ath11k where ath12k_dp_srng_setup() looks like what ath11k_dp_srng_setup() before the change to allocate its buffers from cacheable memory so it's probably just a matter of time before the same changes are made to ath12k which will break that for this platform/memory-size as well. So the way I see to resolve this either: a) revert commit d0e2523bfa9c ("ath11k: allocate HAL_WBM2SW_RELEASE ring from cacheable memory") - to stop asking for buffers >256KiB b) find some other use of that upper 1GiB so that it can't be allocated by DMA and swiotlb isn't needed c) tell my board users to use mem=3096M and lose that last 1GiB of DRAM > > > [1] Documentation/core-api/swiotlb.rst > > > > > I'm guessing that this has gone unnoticed for a while because there > > are maybe not a lot of systems out there that require swiotlb with > > ath11k (either no IOMMU or more memory than DMA can address) and my > > guess is that if you test ath11k with swiotlb=force you will easily > > see this 'swiotlb buffer is full' issue on other systems. > > > > I'm not that knowledgeable about ath11k but I do know that ath10 and > > ath12k do not have this issue with swiotlb. Debugging a bit shows that > > there are a lot of large DMA buffers being requested by ath11k and I'm > > wondering if that could be reduced or optimized somehow. > > > >> > >>> > >>> To avoid using swiotlb is there some way to limit the memory region > >>> used for DMA operations to below 32bit boundary yet still allow the > >>> memory above 32bit to be useful in the system for userspace maybe? > >> if you are using dma_alloc_coherent() I'm afraid there is no way for that. the API > >> internally ignores any zone flags passed with the 'gfp' argument. see > >> > >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/dma/mapping.c#n615 > >> > > > > is DMA_RESTRICTED_POOL a solution for me? > i don;t think this help since this is used in coherent DMA? > While DMA_RESTRICTED_POOL does allow defining the area used by swiotlb it doesn't change the way swiotlb allocates buffers or the fact that swiotlb is used at all. Best Regards, Tim > > > > Best Regards, > > > > Tim > > >