On 11/22/2024 5:50 AM, Tim Harvey wrote:
> On Tue, Nov 19, 2024 at 6:32 PM Baochen Qiang <quic_bqiang@xxxxxxxxxxx> wrote:
>>
>> On 11/20/2024 4:16 AM, Tim Harvey wrote:
>>> Greetings,
>>>
>>> I've got an ath11k card that is failing to init on an IMX8MM system
>>> with 4GB of DRAM:
>>> [ 7.551582] ath11k_pci 0000:01:00.0: BAR 0 [mem 0x18000000-0x181fffff 64bit]: assigned
>>> [ 7.551713] ath11k_pci 0000:01:00.0: enabling device (0000 -> 0002)
>>> [ 7.552401] ath11k_pci 0000:01:00.0: MSI vectors: 16
>>> [ 7.552440] ath11k_pci 0000:01:00.0: qcn9074 hw1.0
>>> [ 7.887186] mhi mhi0: Loaded FW: ath11k/QCN9074/hw1.0/amss.bin, sha256: 5ee1b7b204541b5f99984f21d694ececaec08fbce1b520ffe6fe740b02a4afd7
>>> [ 8.435964] ath11k_pci 0000:01:00.0: chip_id 0x0 chip_family 0x0 board_id 0xff soc_id 0xffffffff
>>> [ 8.435991] ath11k_pci 0000:01:00.0: fw_version 0x270206d0 fw_build_timestamp 2022-08-04 12:48 fw_build_id WLAN.HK.2.7.0.1-01744-QCAHKSWPL_SILICONZ-1
>>> [ 8.441700] ath11k_pci 0000:01:00.0: Loaded FW: ath11k/QCN9074/hw1.0/board-2.bin, sha256: dbf0ca14aa1229eccd48f26f1026901b9718b143bd30b51b8ea67c84ba6207f1
>>> [ 9.753764] ath11k_pci 0000:01:00.0: Loaded FW: ath11k/QCN9074/hw1.0/m3.bin, sha256: b6d957f335073a15a8de809398e1506f0200a08747eaf7189c843cf519ffc1de
>>> [ 9.789791] ath11k_pci 0000:01:00.0: swiotlb buffer is full (sz: 1048583 bytes), total 32768 (slots), used 2528 (slots)
>>> [ 9.789853] ath11k_pci 0000:01:00.0: failed to set up tcl_comp ring (0) :-12
>>> [ 9.790238] ath11k_pci 0000:01:00.0: failed to init DP: -12
>>> root@noble-venice:~# cat /proc/cmdline
>>> console=ttymxc1,115200 earlycon=ec_imx6q,0x30890000,115200 root=PARTUUID=5cdde84f-01 rootwait net.ifnames=0 cma=196M
>>>
>>> The IMX8MM's DRAM base is at 1GB so anything above 3GB hits the 32bit
>>> address boundary. If I pass in a mem=3096M the device registers just
>>> fine.
>> yeah ... that parameter makes the kernel allocate memory below the 32bit boundary, thus swiotlb is not necessary.
>
> Hi Baochen,
>
> Yes, that makes sense as I step through the code. On IMX8M with DRAM
> 3GB or less dma_capable(...) is true so swiotlb bounce buffers are not
> needed.
>
>>
>>>
>>> I found this to be the case with modern kernels however I found
>>> differing behavior with older kernels:
>>> - 6.6 and 6.1 the device registers with 4GB DRAM but crashes on client connect
>>> - 5.15 device registers with 4GB DRAM and appears to work just fine
>> are you using Linus' tree or the stable tree?
>>
>
> For 6.6 I tested stable.

Can you try Linus's tree? As far as I know, the stable tree may be missing some important fixes.

>
> This likely has something to do with commit dbd73acb22d8 ("wifi:
> ath11k: enable 36 bit mask for stream DMA") but it would seem to me
> that patch was trying to avoid the entire 32bit DMA limitation. Maybe
> that patch sets the ath11k device DMA mask to 36 bits but maybe the
> IMX8M PCI DMA is only capable of 32bits?

That patch makes the situation better, not worse. That is, it helps ath11k avoid swiotlb for its DMA rather than get it involved.

>
>>>
>>> Could anyone explain what is going on here? Obviously there have been
>>> changes at some point to start using swiotlb which I believe was all
>>> about avoiding 32bit DMA limitations but I'm not clear how I should be
>>> configuring this for IMX8MM with 4GB DRAM. Maybe my kernel IOMMU
>>> configuration is incorrect somehow?
>> there are quite a few options associated with IOMMU, not sure which one might be causing this. But basically you may check:
>>
>> CONFIG_IOMMU_IOVA
>> CONFIG_IOMMU_API
>> CONFIG_IOMMU_SUPPORT
>> CONFIG_IOMMU_DMA=y
>>
>
> These are enabled which I believe appropriate for IMX8M. If I want to
> utilize the full 4GB DRAM on IMX then I must use IOMMU and swiotlb
> which would mean a performance hit due to copying mem to/from bounce
> buffers not to mention the fact that I can't figure out how to
> configure the system to avoid the 'swiotlb swiotlb buffer is full'
> issue.
>
> Enabling CONFIG_SWIOTLB_DYNAMIC does not help nor does increasing the
> number of slots - it has something to do with the number/size of DMA
> buffers that ath11k is asking for:

Yeah, ath11k asks for fixed-size DMA buffers regardless of that config.

> # dmesg | grep swiotlb_tbl_map_single
> [ 5.237731] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 16384 (slots=32768/ 32)
> [ 5.247519] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 16416 (slots=32768/ 64)
> [ 5.261794] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 16448 (slots=32768/ 96)
> [ 5.275114] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 16480 (slots=32768/ 128)
> [ 5.287757] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 16512 (slots=32768/ 160)
> [ 5.299688] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 16544 (slots=32768/ 192)
> [ 5.312482] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 16576 (slots=32768/ 224)
> [ 5.324493] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 16608 (slots=32768/ 256)
> [ 5.337001] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 16640 (slots=32768/ 288)
> [ 5.346754] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 16672 (slots=32768/ 320)
> [ 5.356571] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 16704 (slots=32768/ 352)
> [ 5.366372] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 16736 (slots=32768/ 384)
> [ 5.376164] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 16768 (slots=32768/ 416)
> [ 5.385944] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 16800 (slots=32768/ 448)
> [ 5.395712] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 16832 (slots=32768/ 480)
> [ 5.408270] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 16864 (slots=32768/ 512)
> [ 5.419768] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 16896 (slots=32768/ 544)
> [ 5.430966] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 16928 (slots=32768/ 576)
> [ 5.442368] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 16960 (slots=32768/ 608)
> [ 5.452422] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 16992 (slots=32768/ 640)
> [ 5.463507] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 17024 (slots=32768/ 672)
> [ 5.473536] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 17056 (slots=32768/ 704)
> [ 5.485661] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 17088 (slots=32768/ 736)
> [ 5.495404] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 17120 (slots=32768/ 768)
> [ 5.509626] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 17152 (slots=32768/ 800)
> [ 5.519353] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 17184 (slots=32768/ 832)
> [ 5.529077] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 17216 (slots=32768/ 864)
> [ 5.538799] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 17248 (slots=32768/ 896)
> [ 5.548517] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 17280 (slots=32768/ 928)
> [ 5.558238] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 17312 (slots=32768/ 960)
> [ 5.567965] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 17344 (slots=32768/ 992)
> [ 5.578943] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 0 (slots=32768/ 992)
> [ 5.578964] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 52B index= 8192 (slots=32768/ 993)
> [ 5.599793] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 32 (slots=32768/ 992)
> [ 5.599861] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 68B index= 8193 (slots=32768/ 993)
> [ 5.609589] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 64 (slots=32768/ 993)
> [ 5.628921] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 96 (slots=32768/ 992)
> [ 5.638703] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 68B index= 17376 (slots=32768/ 993)
> [ 5.649602] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 128 (slots=32768/ 992)
> [ 5.659389] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 160 (slots=32768/ 992)
> [ 5.674038] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 96B index= 17377 (slots=32768/ 993)
> [ 5.685016] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 192 (slots=32768/ 992)
> [ 5.694819] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 224 (slots=32768/ 992)
> [ 5.694831] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 52B index= 17378 (slots=32768/ 993)
> [ 5.714194] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 40B index= 17379 (slots=32768/ 994)
> [ 5.725089] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 256 (slots=32768/ 992)
> [ 5.753507] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 6224B index= 17380 (slots=32768/ 996)
> [ 5.764668] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 288 (slots=32768/ 992)
> [ 5.774456] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 320 (slots=32768/ 992)
> [ 5.774620] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 6224B index= 17384 (slots=32768/ 996)
> [ 5.795091] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 352 (slots=32768/ 992)
> [ 5.795241] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 6224B index= 17388 (slots=32768/ 996)
> [ 5.815724] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 384 (slots=32768/ 992)
> [ 5.815884] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 6224B index= 17392 (slots=32768/ 996)
> [ 5.836357] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 416 (slots=32768/ 992)
> [ 5.836368] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 52B index= 8194 (slots=32768/ 993)
> [ 5.855856] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 6224B index= 17396 (slots=32768/ 997)
> [ 5.866818] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 448 (slots=32768/ 992)
> [ 5.866978] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 6224B index= 17400 (slots=32768/ 996)
> [ 5.887451] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 480 (slots=32768/ 992)
> [ 5.897231] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 512 (slots=32768/ 992)
> [ 5.897389] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 6224B index= 17404 (slots=32768/ 996)
> [ 5.917866] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 544 (slots=32768/ 992)
> [ 5.918026] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 6224B index= 17408 (slots=32768/ 996)
> [ 5.938489] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 576 (slots=32768/ 992)
> [ 5.938642] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 6224B index= 17412 (slots=32768/ 996)
> [ 5.959121] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 608 (slots=32768/ 992)
> [ 5.959135] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 52B index= 8195 (slots=32768/ 993)
> [ 5.978619] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 6224B index= 17416 (slots=32768/ 997)
> [ 5.989588] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 640 (slots=32768/ 992)
> [ 5.989738] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 6224B index= 17420 (slots=32768/ 996)
> [ 6.010215] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 672 (slots=32768/ 992)
> [ 6.020001] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 704 (slots=32768/ 992)
> [ 6.020158] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 6224B index= 17424 (slots=32768/ 996)
> [ 6.040643] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 736 (slots=32768/ 992)
> [ 6.040798] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 6224B index= 17428 (slots=32768/ 996)
> [ 6.061287] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 768 (slots=32768/ 992)
> [ 6.061437] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 6224B index= 17432 (slots=32768/ 996)
> [ 6.081918] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 800 (slots=32768/ 992)
> [ 6.081929] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 52B index= 8196 (slots=32768/ 993)
> [ 6.101409] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 6224B index= 17436 (slots=32768/ 997)
> [ 6.112375] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 832 (slots=32768/ 992)
> [ 6.112528] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 6224B index= 17440 (slots=32768/ 996)
> [ 6.133004] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 864 (slots=32768/ 992)
> [ 6.142785] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 896 (slots=32768/ 992)
> [ 6.142949] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 6224B index= 17444 (slots=32768/ 996)
> [ 6.163426] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 928 (slots=32768/ 992)
> [ 6.163576] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 6224B index= 17448 (slots=32768/ 996)
> [ 6.184058] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 960 (slots=32768/ 992)
> [ 6.184208] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 6224B index= 17452 (slots=32768/ 996)
> [ 6.204691] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 992 (slots=32768/ 992)
> [ 6.204704] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 52B index= 8197 (slots=32768/ 993)
> [ 6.224183] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 6224B index= 17456 (slots=32768/ 997)
> [ 6.235148] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 1024 (slots=32768/ 992)
> [ 6.235308] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 6224B index= 17460 (slots=32768/ 996)
> [ 6.255777] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 1056 (slots=32768/ 992)
> [ 6.265552] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 1088 (slots=32768/ 992)
> [ 6.265633] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 2128B index= 17464 (slots=32768/ 994)
> [ 6.286142] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 1120 (slots=32768/ 992)
> [ 6.286182] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 72B index= 17466 (slots=32768/ 993)
> [ 7.574489] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 1152 (slots=32768/ 992)
> [ 7.584645] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 60B index= 17467 (slots=32768/ 993)
> [ 7.595593] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 1184 (slots=32768/ 992)
> [ 7.595608] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 52B index= 8198 (slots=32768/ 993)
> [ 7.605359] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 1216 (slots=32768/ 993)
> [ 7.624703] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 452B index= 1248 (slots=32768/ 993)
> [ 7.635603] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 1280 (slots=32768/ 992)
> [ 7.645344] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 52B index= 1312 (slots=32768/ 993)
> [ 7.656247] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 1314 (slots=32768/ 992)
> [ 7.683567] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size= 65535B index= 1346 (slots=32768/ 992)
> [ 7.696095] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=1048583B index= -1 (slots=32768/ 992)
>
> I'm still trying to understand the swiotlb allocation to see if there
> is some configuration change I should be making.

I suspect you hit the same issue mentioned here:

https://lore.kernel.org/all/CAOMZO5A7+nxACoBPY0k8cOpVQByZtEV_N1489MK5wETHF_RXWA@xxxxxxxxxxxxxx/

So can you check whether the commit below is present in your kernel and, if not, pick it up and try again?

commit 14cebf689a78 ("swiotlb: Reinstate page-alignment for mappings >= PAGE_SIZE")

>
> To avoid using swiotlb is there some way to limit the memory region
> used for DMA operations to below 32bit boundary yet still allow the
> memory above 32bit to be useful in the system for userspace maybe?

If you are using dma_alloc_coherent(), I'm afraid there is no way to do that. The API internally ignores any zone flags passed in the 'gfp' argument. See

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/dma/mapping.c#n615

> Best Regards,
>
> Tim
>
>>>
>>> I'm also unclear why there was no apparent problem with older kernels
>>> such as 5.15.
>>>
>>> Best Regards,
>>>
>>> Tim
>>>
>>
>
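
For reference, here is a rough Python sketch of the slot arithmetic behind the swiotlb_tbl_map_single log in this thread. It is only an illustration under stated assumptions: the constants IO_TLB_SHIFT = 11 (2 KiB slots) and IO_TLB_SEGSIZE = 128 (maximum contiguous slots per mapping) are taken from include/linux/swiotlb.h in recent kernels and should be checked against the tree in use, and the helper names slots_needed() and fits_in_segment() are made up for this example.

```python
# Assumed swiotlb constants (include/linux/swiotlb.h in recent kernels);
# verify them against the kernel tree actually in use.
IO_TLB_SHIFT = 11                 # log2 of the slot size
IO_TLB_SIZE = 1 << IO_TLB_SHIFT   # 2048 bytes per slot
IO_TLB_SEGSIZE = 128              # max contiguous slots for one mapping


def slots_needed(nbytes: int) -> int:
    """Number of slots a mapping of nbytes occupies, rounded up."""
    return (nbytes + IO_TLB_SIZE - 1) // IO_TLB_SIZE


def fits_in_segment(nbytes: int) -> bool:
    """True if the mapping fits in one contiguous run of slots."""
    return slots_needed(nbytes) <= IO_TLB_SEGSIZE


if __name__ == "__main__":
    # Mapping sizes copied from the log in this thread.
    for size in (52, 6224, 65535, 1048583):
        print(f"{size:>8} B -> {slots_needed(size):>3} slots, "
              f"fits in one segment: {fits_in_segment(size)}")
    # Pool size implied by "slots=32768" in the log.
    pool_mib = 32768 * IO_TLB_SIZE // (1 << 20)
    print(f"32768 slots -> {pool_mib} MiB pool")
```

Under these assumptions a 65535-byte mapping takes 32 slots, which matches the index steps of 32 in the log, and a 6224-byte mapping takes 4. The 1048583-byte request would need 513 contiguous slots, more than the assumed 128-slot segment limit. That would be consistent with a larger pool not helping, but it is only arithmetic on the logged numbers, not a diagnosis.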