Hi Robin, On 9/1/22 5:24 AM, Robin Murphy wrote: > On 2022-09-01 03:18, Dongli Zhang wrote: >> Hi Yu, >> >> On 8/31/22 5:24 PM, Yu Zhao wrote: >>> On Wed, Aug 31, 2022 at 4:20 PM Dongli Zhang <dongli.zhang@xxxxxxxxxx> wrote: >>>> >>>> Hi Yu, >>>> >>>> As we discussed in the past, the swiotlb panic on purpose >>> >>> We should not panic() at all, especially on a platform that has been >>> working well since at least 4.14. >>> >>> Did you check out this link I previously shared with you? >>> https://urldefense.com/v3/__https://lore.kernel.org/r/CAHk-=wit-DmhMfQErY29JSPjFgebx_Ld*pnerc4J2Ag990WwAA@xxxxxxxxxxxxxx/__;Kw!!ACWV5N9M2RV99hQ!PXzSLurBv7VqxI1451TV4zO3_-BYj4grk-HYBsXzSnA6nZcXaBzdsQ-rF2DAqlICSRPMt-efYv_Uu2A2CQ$ >>> >> >> Thanks for sharing! I used to read that in the past. To be honest I am still >> confused on when to use BUG/panic and when to not, as I still see many usage in >> some patches. >> >> Just about swiotlb, it may panic in many cases during boot, e.g.,: >> >> https://urldefense.com/v3/__https://bugs.launchpad.net/ubuntu/*source/linux/*bug/1955655__;Kys!!ACWV5N9M2RV99hQ!MO4S9r0PisrW6Z-kZqUitAMbuGNMX6aceQd5VApqXllP6f1jaPCyL9x50um7chsn2uGL_pNqdBzTjZK5GebeKrI$ ; > > > That's really a different thing, but frankly that panic is also bogus anyway - > there is no good reason to have different behaviour for failing to allocate a > buffer slot because the buffer is full vs. failing to allocate a buffer slot > because there is no buffer. If we can fail gracefully some of the time we should > fail gracefully all of the time. Yes, there's a slight difference in that one Thank you very much for the feedback! Currently the swiotlb remap logic is more gracefully to handle failure, to re-try with smaller nslabs (line 352), but will still panic at the end if the value is reduced to smaller than IO_TLB_MIN_SLABS. 337 retry: 338 bytes = PAGE_ALIGN(nslabs << IO_TLB_SHIFT); 339 if (flags & SWIOTLB_ANY) 340 tlb = memblock_alloc(bytes, PAGE_SIZE); 341 else 342 tlb = memblock_alloc_low(bytes, PAGE_SIZE); 343 if (!tlb) { 344 pr_warn("%s: Failed to allocate %zu bytes tlb structure\n", 345 __func__, bytes); 346 return; 347 } 348 349 if (remap && remap(tlb, nslabs) < 0) { 350 memblock_free(tlb, PAGE_ALIGN(bytes)); 351 352 nslabs = ALIGN(nslabs >> 1, IO_TLB_SEGSIZE); 353 if (nslabs < IO_TLB_MIN_SLABS) 354 panic("%s: Failed to remap %zu bytes\n", 355 __func__, bytes); 356 goto retry; 357 } > case has a chance of succeeding if retried in future while the other definitely > doesn't, but it's not SWIOTLB's place to decide that the entire system is > terminally unusable just because some device can't make a DMA mapping. Currently the swiotlb panic at line 735 on purpose if swiotlb initialization is failed, but when any DMA requires a swiotlb mapped slot. 724 phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr, 725 size_t mapping_size, size_t alloc_size, 726 unsigned int alloc_align_mask, enum dma_data_direction dir, 727 unsigned long attrs) 728 { 729 struct io_tlb_mem *mem = dev->dma_io_tlb_mem; 730 unsigned int offset = swiotlb_align_offset(dev, orig_addr); 731 unsigned int i; 732 int index; 733 phys_addr_t tlb_addr; 734 735 if (!mem || !mem->nslabs) 736 panic("Can not allocate SWIOTLB buffer earlier and can't now provide you with the DMA bounce buffer"); I agree it is a very interesting point whether it is a good idea to leave the decision to swiotlb for a panic. One thing to take care is the driver should gracefully handle the error returned by swiotlb. Otherwise, there may be IO hang (e.g., hung_task) instead of an IO failure. > > Similarly for the other panics at init time, which seem to have originated from > a mechanical refactoring of the memblock API with the expectation of preserving > the existing behaviour at the time. Those have then just been moved around > without anyone thinking to question why they're there, and if memblock *does* > now return usable error information, why don't we start handling that error > properly like we do in other init paths? > > Really there is no reason to panic anywhere in SWIOTLB. This is old code, things > have moved on over the last 20 years, and we can and should do much better. I'll > add this to my list of things to look at for cleanup once I find a bit of free > time, unless anyone else fancies taking it on. Thank you very much for the feedback. Looking forward to how this can be improved! Dongli Zhang