Re: [PATCH v2] Revert "swiotlb: panic if nslabs is too small"

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Robin,

On 9/1/22 5:24 AM, Robin Murphy wrote:
> On 2022-09-01 03:18, Dongli Zhang wrote:
>> Hi Yu,
>>
>> On 8/31/22 5:24 PM, Yu Zhao wrote:
>>> On Wed, Aug 31, 2022 at 4:20 PM Dongli Zhang <dongli.zhang@xxxxxxxxxx> wrote:
>>>>
>>>> Hi Yu,
>>>>
>>>> As we discussed in the past, the swiotlb panic on purpose
>>>
>>> We should not panic() at all, especially on a platform that has been
>>> working well since at least 4.14.
>>>
>>> Did you check out this link I previously shared with you?
>>> https://urldefense.com/v3/__https://lore.kernel.org/r/CAHk-=wit-DmhMfQErY29JSPjFgebx_Ld*pnerc4J2Ag990WwAA@xxxxxxxxxxxxxx/__;Kw!!ACWV5N9M2RV99hQ!PXzSLurBv7VqxI1451TV4zO3_-BYj4grk-HYBsXzSnA6nZcXaBzdsQ-rF2DAqlICSRPMt-efYv_Uu2A2CQ$
>>>
>>
>> Thanks for sharing! I used to read that in the past. To be honest I am still
>> confused on when to use BUG/panic and when to not, as I still see many usage in
>> some patches.
>>
>> Just about swiotlb, it may panic in many cases during boot, e.g.,:
>>
>> https://urldefense.com/v3/__https://bugs.launchpad.net/ubuntu/*source/linux/*bug/1955655__;Kys!!ACWV5N9M2RV99hQ!MO4S9r0PisrW6Z-kZqUitAMbuGNMX6aceQd5VApqXllP6f1jaPCyL9x50um7chsn2uGL_pNqdBzTjZK5GebeKrI$ ;
> 
> 
> That's really a different thing, but frankly that panic is also bogus anyway -
> there is no good reason to have different behaviour for failing to allocate a
> buffer slot because the buffer is full vs. failing to allocate a buffer slot
> because there is no buffer. If we can fail gracefully some of the time we should
> fail gracefully all of the time. Yes, there's a slight difference in that one

Thank you very much for the feedback!

Currently the swiotlb remap logic is more gracefully to handle failure, to
re-try with smaller nslabs (line 352), but will still panic at the end if the
value is reduced to smaller than IO_TLB_MIN_SLABS.

 337 retry:
 338         bytes = PAGE_ALIGN(nslabs << IO_TLB_SHIFT);
 339         if (flags & SWIOTLB_ANY)
 340                 tlb = memblock_alloc(bytes, PAGE_SIZE);
 341         else
 342                 tlb = memblock_alloc_low(bytes, PAGE_SIZE);
 343         if (!tlb) {
 344                 pr_warn("%s: Failed to allocate %zu bytes tlb structure\n",
 345                         __func__, bytes);
 346                 return;
 347         }
 348
 349         if (remap && remap(tlb, nslabs) < 0) {
 350                 memblock_free(tlb, PAGE_ALIGN(bytes));
 351
 352                 nslabs = ALIGN(nslabs >> 1, IO_TLB_SEGSIZE);
 353                 if (nslabs < IO_TLB_MIN_SLABS)
 354                         panic("%s: Failed to remap %zu bytes\n",
 355                               __func__, bytes);
 356                 goto retry;
 357         }

> case has a chance of succeeding if retried in future while the other definitely
> doesn't, but it's not SWIOTLB's place to decide that the entire system is
> terminally unusable just because some device can't make a DMA mapping.
Currently the swiotlb panic at line 735 on purpose if swiotlb initialization is
failed, but when any DMA requires a swiotlb mapped slot.

 724 phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr,
 725                 size_t mapping_size, size_t alloc_size,
 726                 unsigned int alloc_align_mask, enum dma_data_direction dir,
 727                 unsigned long attrs)
 728 {
 729         struct io_tlb_mem *mem = dev->dma_io_tlb_mem;
 730         unsigned int offset = swiotlb_align_offset(dev, orig_addr);
 731         unsigned int i;
 732         int index;
 733         phys_addr_t tlb_addr;
 734
 735         if (!mem || !mem->nslabs)
 736                 panic("Can not allocate SWIOTLB buffer earlier and can't
now provide you with the DMA bounce buffer");

I agree it is a very interesting point whether it is a good idea to leave the
decision to swiotlb for a panic. One thing to take care is the driver should
gracefully handle the error returned by swiotlb. Otherwise, there may be IO hang
(e.g., hung_task) instead of an IO failure.

> 
> Similarly for the other panics at init time, which seem to have originated from
> a mechanical refactoring of the memblock API with the expectation of preserving
> the existing behaviour at the time. Those have then just been moved around
> without anyone thinking to question why they're there, and if memblock *does*
> now return usable error information, why don't we start handling that error
> properly like we do in other init paths?
> 
> Really there is no reason to panic anywhere in SWIOTLB. This is old code, things
> have moved on over the last 20 years, and we can and should do much better. I'll
> add this to my list of things to look at for cleanup once I find a bit of free
> time, unless anyone else fancies taking it on.

Thank you very much for the feedback. Looking forward to how this can be improved!

Dongli Zhang



[Index of Archives]     [LKML Archive]     [Linux ARM Kernel]     [Linux ARM]     [Git]     [Yosemite News]     [Linux SCSI]     [Linux Hams]

  Powered by Linux