Re: Bug in processing dependencies by async_tx_submit() ?

On 11/1/07, Yuri Tikhonov <yur@xxxxxxxxxxx> wrote:
>
>  Hi Dan,
>
>   Honestly, I tried to fix this quickly using an approach similar to the one
>  you proposed, with one addition though (in fact, a deletion: removing
>  BUG_ON(chan == tx->chan) in async_tx_run_dependencies()). This led to a
>  "Kernel stack overflow", which happened because async_tx_submit() and
>  async_trigger_callback() ended up calling each other recursively.
>

I had a feeling the fix could not be that easy...
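
To picture the stack overflow described in the quoted text, here is a minimal user-space sketch. It is not the async_tx code: struct toy_op and both functions are hypothetical names made up for illustration. It only contrasts running a chain of dependent operations by nested calls (stack depth grows with the chain length) with running it in a loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an operation with one dependent operation. */
struct toy_op {
    int done;
    struct toy_op *next;   /* operation that depends on this one, or NULL */
};

/* Nested form: conceptually one stack frame per dependent, similar to two
 * functions calling each other once per link. Returns the depth reached. */
int toy_run_recursive(struct toy_op *op, int depth)
{
    if (op == NULL)
        return depth;
    op->done = 1;
    return toy_run_recursive(op->next, depth + 1);
}

/* Loop form: stack use does not depend on the chain length.
 * Returns the number of operations run. */
int toy_run_iterative(struct toy_op *op)
{
    int count = 0;

    for (; op != NULL; op = op->next) {
        op->done = 1;
        count++;
    }
    return count;
}

/* Small usage example on a three-element chain a -> b -> c. */
int toy_recursion_demo(void)
{
    struct toy_op c = { 0, NULL };
    struct toy_op b = { 0, &c };
    struct toy_op a = { 0, &b };

    assert(toy_run_recursive(&a, 0) == 3);
    assert(a.done && b.done && c.done);

    a.done = b.done = c.done = 0;
    assert(toy_run_iterative(&a) == 3);
    assert(a.done && b.done && c.done);
    return 0;
}
```

Both forms visit the same operations; only the nested one needs stack space proportional to the number of chained dependencies.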

>   So I then restricted the interrupt scheduling in async_tx_submit() to the
>  cases where it is really needed, i.e. when dependent operations are to be
>  run on different channels.
>
>   The resulting kernel locked up while processing the mkfs command on top of
>  the RAID array. The place where it spins is the dma_sync_wait() function.
>
>   This happens because of the specific implementation of
>  dma_wait_for_async_tx().

So I take it you are not implementing interrupt based callbacks in your driver?

>   The "iter" that we finally wait for there corresponds to the last allocated
>  but not-yet-submitted descriptor. But if that "iter" depends on another
>  descriptor which has cookie > 0 but has not yet been submitted to the h/w
>  channel because the threshold has not been reached by that moment, then we
>  may wait in dma_wait_for_async_tx() forever. I think it makes more sense to
>  take the first descriptor which was submitted to the channel but has possibly
>  not been put into the h/w chain, i.e. one with cookie > 0, and do
>  dma_sync_wait() on that descriptor.
>
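
The selection rule suggested in the quoted paragraph can be sketched in user space as follows. This is a toy model under stated assumptions, not the kernel's dma_wait_for_async_tx(): struct toy_desc, TOY_EBUSY and toy_first_submitted() are hypothetical names, and "submitted" is modelled simply as cookie > 0.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_EBUSY 16   /* assumed marker value for "no cookie assigned yet" */

/* Hypothetical stand-in for a descriptor in a dependency chain. */
struct toy_desc {
    int cookie;               /* > 0 once submitted to a channel */
    struct toy_desc *parent;  /* descriptor this one depends on, or NULL */
};

/* Walk from tx toward its ancestors and return the nearest descriptor that
 * already has a cookie > 0, or NULL if nothing in the chain was submitted. */
struct toy_desc *toy_first_submitted(struct toy_desc *tx)
{
    struct toy_desc *iter = tx;

    while (iter != NULL && iter->cookie <= 0)
        iter = iter->parent;
    return iter;
}

/* Small usage example: c depends on b, b depends on a; only a has a cookie. */
int toy_wait_demo(void)
{
    struct toy_desc a = { 7, NULL };
    struct toy_desc b = { -TOY_EBUSY, &a };
    struct toy_desc c = { -TOY_EBUSY, &b };

    assert(toy_first_submitted(&c) == &a);
    assert(toy_first_submitted(&a) == &a);
    return 0;
}
```

In this model the caller would poll the returned descriptor rather than the tail of the chain; whether that alone is sufficient is exactly what the rest of the thread questions.
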
>   When I modified dma_wait_for_async_tx() in this way, the kernel lock-up
>  disappeared. Nevertheless, mkfs still hangs after some time. So it looks
>  like something is still missing in the support for the dependency-chaining
>  feature...
>

I am preparing a new patch that replaces ASYNC_TX_DEP_ACK with
ASYNC_TX_CHAIN_ACK.  The plan is to make the entire chain of
dependencies available up until the last transaction is submitted.
This allows the entire dependency chain to be walked at
async_tx_submit time so that we can properly handle these multiple
dependency cases.  I'll send it out when it passes my internal
tests...
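
As a rough illustration of what walking the whole chain at submit time makes possible, here is a minimal user-space sketch. It is not the pending patch: struct toy_tx and toy_count_channel_switches() are hypothetical, and the only thing modelled is detecting dependencies that cross channels, the case mentioned earlier in the thread as needing an interrupt.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a transaction in a dependency chain. */
struct toy_tx {
    int chan;               /* assumed channel id the transaction runs on */
    struct toy_tx *parent;  /* transaction this one depends on, or NULL */
};

/* Walk the chain from tx back to its root and count the dependencies
 * whose two ends sit on different channels. */
int toy_count_channel_switches(const struct toy_tx *tx)
{
    int switches = 0;

    for (; tx != NULL && tx->parent != NULL; tx = tx->parent) {
        if (tx->chan != tx->parent->chan)
            switches++;
    }
    return switches;
}

/* Small usage example: chain a(0) <- b(0) <- c(1) <- d(0). */
int toy_chain_demo(void)
{
    struct toy_tx a = { 0, NULL };
    struct toy_tx b = { 0, &a };
    struct toy_tx c = { 1, &b };
    struct toy_tx d = { 0, &c };

    assert(toy_count_channel_switches(&d) == 2);
    assert(toy_count_channel_switches(&b) == 0);
    return 0;
}
```

The walk only works if every link of the chain is still reachable when the last transaction is submitted, which is the property the plan above is meant to provide.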

--
Dan