Re: [usb-storage] Re: cma: deadlock using usb-storage and fs

On 12/17/18 1:57 PM, Laura Abbott wrote:
> On 12/17/18 10:29 AM, Gaël PORTAY wrote:
>> Alan,
>>
>> On Mon, Dec 17, 2018 at 10:45:17AM -0500, Alan Stern wrote:
>>> On Sun, 16 Dec 2018, Gaël PORTAY wrote:
>>> ...
>>>
>>>> The second task wants to writeback/flush the pages through USB, which, I
>>>> assume, is due to the page migration. The usb-storage driver triggers a CMA
>>>> allocation but gets blocked in cma_alloc since the first task holds the
>>>> mutex (It is a FAT formatted partition, if it helps).
>>>>
>>>>     usb-storage     D    0   349      2 0x00000000
>>>>     Backtrace:
>>> ...
>>>>     [<bf1c7550>] (usb_sg_wait [usbcore]) from [<bf2bd618>]
>>>> (usb_stor_bulk_transfer_sglist.part.2+0x80/0xdc [usb_storage]) r9:0001e000
>>>> r8:eca594ac r7:0001e000 r6:c0008200 r5:eca59514 r4:eca59488
>>>
>>> It looks like there is a logical problem in the CMA allocator.  The
>>> call in usb_sg_wait() specifies GFP_NOIO, which is supposed to prevent
>>> allocations from blocking on any I/O operations.  Therefore we
>>> shouldn't be waiting for the CMA mutex.
>>>
>>
>> Right.
>>
>>> Perhaps the CMA allocator needs to drop the mutex while doing
>>> writebacks/flushes, or perhaps it needs to be reorganized some other
>>> way.  I don't know anything about it.
>>>
>>> Does the CMA code have any maintainers who might need to know about
>>> this, or is it all handled by the MM maintainers?
>>
>> I did not find maintainers for either CMA or MM.
>>
>> That is why I have sent this mail to mm mailing list but to no one in
>> particular.
>>
> 
> Last time I looked at this, we needed the cma_mutex for serialization
> so unless we want to rework that, I think we need to not use CMA in the
> writeback case (i.e. GFP_IO).

I am wondering if we still need to hold the cma_mutex while calling
alloc_contig_range().  Looking back at the history, it appears that
the reason for holding the mutex was to prevent two threads from operating
on the same pageblock.

Commit 2c7452a075d4 ("mm/page_isolation.c: make start_isolate_page_range()
fail if already isolated") will cause alloc_contig_range() to return -EBUSY
if two callers are attempting to operate on the same pageblock.  This was
added because memory hotplug as well as gigantic page allocation call
alloc_contig_range() and could conflict with each other or with CMA.
cma_alloc() has logic to retry if -EBUSY is returned.  However, IIUC it
assumes the -EBUSY is the result of specific pages being busy, as opposed to
someone else operating on the pageblock.  Therefore, the retry logic to 'try
a different set of pages' is not what one would/should attempt when someone
else is operating on the pageblock.

Would it be possible, or make sense, to remove the mutex and retry on
-EBUSY?  Or am I missing some other reason for holding the mutex?
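
To make the idea concrete, here is a rough user-space model of "no mutex,
retry on -EBUSY".  It is not kernel code: the names model_isolate(),
model_unisolate(), model_cma_alloc() and NR_BLOCKS are made up for
illustration, and a per-pageblock atomic flag stands in for the "already
isolated" check described above.  Note that it simply moves on to the next
candidate range on -EBUSY, so it does not distinguish busy pages from a
pageblock that someone else is operating on.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_BLOCKS 8   /* hypothetical size of the modeled area, in pageblocks */

/* One flag per pageblock: true while some caller has it isolated. */
static atomic_bool isolated[NR_BLOCKS];

/* Undo isolation of blocks [start, end). */
static void model_unisolate(int start, int end)
{
    for (int i = start; i < end; i++)
        atomic_store(&isolated[i], false);
}

/*
 * Model of "fail if already isolated": claim every block in
 * [start, start + count), or claim none and return -EBUSY.
 */
static int model_isolate(int start, int count)
{
    for (int i = start; i < start + count; i++) {
        bool expected = false;

        if (!atomic_compare_exchange_strong(&isolated[i], &expected, true)) {
            model_unisolate(start, i);   /* roll back our partial claim */
            return -EBUSY;
        }
    }
    return 0;
}

/*
 * Mutex-free allocation loop: walk the candidate ranges and move on
 * whenever one reports -EBUSY.  Returns the first block index of the
 * claimed range, or -ENOMEM if no range could be claimed.
 */
static int model_cma_alloc(int count)
{
    assert(count > 0 && count <= NR_BLOCKS);

    for (int start = 0; start + count <= NR_BLOCKS; start++) {
        if (model_isolate(start, count) == 0)
            return start;   /* range stays isolated, i.e. owned by the caller */
    }
    return -ENOMEM;
}
```

In this model, if another caller already holds blocks 0-1, then
model_cma_alloc(2) skips the conflicting ranges and returns 2 without ever
sleeping on a lock.
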
-- 
Mike Kravetz



