Re: mdadm/Create wait_for_zero_forks is stuck


 



On Thu, May 30, 2024 at 4:54 AM Logan Gunthorpe <logang@xxxxxxxxxxxx> wrote:
>
> Hi Xiao,
>
> Sorry it took so long but I had a chance to dig into the bug today. It's
> not what I had originally thought, but I do have a solution.
>
> Turns out the problem is that multiple SIGCHLD signals can be coalesced
> into one signal if they happen at the same time between reads to the
> signalfd. This is just the way Linux works and I didn't account for it
> in the code.
>
> To fix this, we need to wait for every child that may have completed
> each time a SIGCHLD is received.
>
> I've made two patches which you can get from:
>
> https://github.com/lsgunth/mdadm/commits/write_zeros_sigbug/
>
> I tested it with several hundred runs of your test script and it seems
> to fix the problem. Please review and test for yourself.

Hi Logan

Thanks very much. I've tested more than 1000 times and it doesn't get stuck anymore.

>
> On 2024-05-22 20:05, Xiao Ni wrote:
> > I did a test in a simple C program.
>
> I made a similar test program to try it out, and I think the reason it
> wasn't working for you was the coalescing; simply blocking solves the
> (now only theoretical) race at startup. Once the coalescing problem is
> fixed, we still need to move the block earlier to fix the race. I've
> attached the code for that program if you want to try it out.

It's the same resolution as in the patches :)
I tried it in my C program and it worked well too. It is the coalescing
problem. And yes, we need to block the signal earlier (patch 2). But for
patch 1, I still prefer the name wstatus over wst.
>
> Thanks for finding and triaging the bug!
>
> Logan

Best Regards
Xiao





