On 7/21/22 05:54, Helge Deller wrote:
> On 7/21/22 01:15, Sam James wrote:
>>> On 20 Jul 2022, at 18:06, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
>>>
>>> On Wed, Jul 20, 2022 at 07:00:32PM +0800, Hillf Danton wrote:
>>>
>>>> To help debug it, de-union d_in_lookup_hash with d_alias and add debug
>>>> info after dentry is killed. If any warning hits, we know where to add
>>>> something like
>>>>
>>>>     WARN_ON(dentry->d_flags & DCACHE_DENTRY_KILLED);
>>>>
>>>> before hlist_bl_add or hlist_add.
>>
>>> [snip]
>>> I wonder if anyone had seen anything similar outside of parisc...
>
> Me too.
> Of course it could be caused by the platform code, as we have had
> issues with caches, spinlocks and so on.
> On older kernels we also have seen RCU stalls in d_alloc_parallel().
>
>>> I don't know if I have any chance to reproduce it here - the only
>>> parisc box I've got is a 715/100 (assuming the disk is still alive)
>>> and it's 32bit, unlike the reported setups and, er, not fast.
>
> It's fun to boot it, but it will be too slow for actual testing.
>
>>> qemu seems to have some parisc support, but it's 32bit-only at the
>>> moment...
>
> Yes. I think it will be hard to reproduce it in the VM.
>
>> I don't think I've seen this on parisc either, but I don't think
>> I've used tmpfs that heavily. I'll try it in case it's somehow more
>> likely to trigger it.
>
> It happened on the debian buildd server with tmpfs. To rule out tmpfs
> I switched to ext4 (on SATA SSD) and it happened there as well.
> I assume Dave's report is on ext3/ext4 with SCSI discs.
>
>> Helge, were there any particular steps to reproduce this? Or just
>> start doing your normal Debian builds on a tmpfs and it happens
>> soon enough?
>
> Currently it's not easy to reproduce for me either.
> It happens on the debian buildd server (4-way c8000 machine) while building
> the webkit2gtk package. I think it happens at the end when sbuild
> cleans the build directories by deleting all files.
> Maybe there is a filesystem test toolkit which you could try which hammers
> the fs by deleting lots of files in parallel?

I currently can't reproduce the issue any longer.
In case it pops up again, I'll follow up here again.

Helge
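
For reference, the "delete lots of files in parallel" idea quoted above
could be approximated with something like the following. This is only a
rough sketch and not a known reproducer for the issue in this thread. The
function names, directory layout, and default counts are all made up for
illustration, and it uses a thread pool for simplicity (separate processes
would probably be closer to what sbuild's cleanup looks like).

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def populate(root, dirs=8, files_per_dir=200):
    """Create `dirs` subdirectories of empty files under root (hypothetical layout)."""
    subdirs = []
    for d in range(dirs):
        sub = os.path.join(root, f"d{d}")
        os.mkdir(sub)
        for f in range(files_per_dir):
            # Empty files are enough here: the goal is dentry churn, not data I/O.
            with open(os.path.join(sub, f"f{f}"), "w"):
                pass
        subdirs.append(sub)
    return subdirs


def remove_dir(path):
    """Unlink every file in path, then remove the directory; return the file count."""
    names = os.listdir(path)
    for name in names:
        os.unlink(os.path.join(path, name))
    os.rmdir(path)
    return len(names)


def stress_once(base=None, workers=4, dirs=8, files_per_dir=200):
    """One create-then-parallel-delete round; returns the number of files removed.

    `base` selects the filesystem under test (e.g. a tmpfs or ext4 mount point);
    None falls back to the default temporary directory.
    """
    root = tempfile.mkdtemp(prefix="unlink-stress-", dir=base)
    subdirs = populate(root, dirs, files_per_dir)
    # Each worker deletes one subdirectory at a time, so several directories
    # are being emptied concurrently.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        removed = sum(pool.map(remove_dir, subdirs))
    os.rmdir(root)
    return removed
```

Calling stress_once() in a loop, with base pointing at the mount to test
and workers set to the number of CPUs, would give a crude approximation of
the cleanup phase. Whether that is enough to trigger anything is an open
question.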