Re: PROBLEM: double fault in md_end_io

On Wed, Apr 14, 2021 at 5:36 PM Song Liu <song@xxxxxxxxxx> wrote:
>
> On Tue, Apr 13, 2021 at 5:05 AM Paweł Wiejacha
> <pawel.wiejacha@xxxxxxxxxxxx> wrote:
> >
> > Hello Song,
> >
> > That code does not compile, but I guess that what you meant was
> > something like this:
>
> Yeah.. I am really sorry for the noise.
>
> >
> > diff --git drivers/md/md.c drivers/md/md.c
> > index 04384452a..cbc97a96b 100644
> > --- drivers/md/md.c
> > +++ drivers/md/md.c
> > @@ -78,6 +78,7 @@ static DEFINE_SPINLOCK(pers_lock);
> >
> >  static struct kobj_type md_ktype;
> >
> > +struct kmem_cache *md_io_cache;
> >  struct md_cluster_operations *md_cluster_ops;
> >  EXPORT_SYMBOL(md_cluster_ops);
> >  static struct module *md_cluster_mod;
> > @@ -5701,8 +5702,8 @@ static int md_alloc(dev_t dev, char *name)
> >          */
> >         mddev->hold_active = UNTIL_STOP;
> >
> > -   error = mempool_init_kmalloc_pool(&mddev->md_io_pool, BIO_POOL_SIZE,
> > -                     sizeof(struct md_io));
> > +   error = mempool_init_slab_pool(&mddev->md_io_pool, BIO_POOL_SIZE,
> > +                     md_io_cache);
> >     if (error)
> >         goto abort;
> >
> > @@ -9542,6 +9543,10 @@ static int __init md_init(void)
> >  {
> >     int ret = -ENOMEM;
> >
> > +   md_io_cache = KMEM_CACHE(md_io, 0);
> > +   if (!md_io_cache)
> > +       goto err_md_io_cache;
> > +
> >     md_wq = alloc_workqueue("md", WQ_MEM_RECLAIM, 0);
> >     if (!md_wq)
> >         goto err_wq;
> > @@ -9578,6 +9583,8 @@ static int __init md_init(void)
> >  err_misc_wq:
> >     destroy_workqueue(md_wq);
> >  err_wq:
> > +   kmem_cache_destroy(md_io_cache);
> > +err_md_io_cache:
> >     return ret;
> >  }
> >
> > @@ -9863,6 +9870,7 @@ static __exit void md_exit(void)
> >     destroy_workqueue(md_rdev_misc_wq);
> >     destroy_workqueue(md_misc_wq);
> >     destroy_workqueue(md_wq);
> > +   kmem_cache_destroy(md_io_cache);
> >  }
> >
> >  subsys_initcall(md_init);
>
> [...]
>
> >
> > $ watch -n0.2 'cat /proc/meminfo | paste - - | tee -a ~/meminfo'
> > MemTotal:       528235648 kB    MemFree:        20002732 kB
> > MemAvailable:   483890268 kB    Buffers:            7356 kB
> > Cached:         495416180 kB    SwapCached:            0 kB
> > Active:         96396800 kB     Inactive:       399891308 kB
> > Active(anon):      10976 kB     Inactive(anon):   890908 kB
> > Active(file):   96385824 kB     Inactive(file): 399000400 kB
> > Unevictable:       78768 kB     Mlocked:           78768 kB
> > SwapTotal:             0 kB     SwapFree:              0 kB
> > Dirty:          88422072 kB     Writeback:        948756 kB
> > AnonPages:        945772 kB     Mapped:            57300 kB
> > Shmem:             26300 kB     KReclaimable:    7248160 kB
> > Slab:            7962748 kB     SReclaimable:    7248160 kB
> > SUnreclaim:       714588 kB     KernelStack:       18288 kB
> > PageTables:        10796 kB     NFS_Unstable:          0 kB
> > Bounce:                0 kB     WritebackTmp:          0 kB
> > CommitLimit:    264117824 kB    Committed_AS:   21816824 kB
> > VmallocTotal:   34359738367 kB  VmallocUsed:      561588 kB
> > VmallocChunk:          0 kB     Percpu:            65792 kB
> > HardwareCorrupted:     0 kB     AnonHugePages:         0 kB
> > ShmemHugePages:        0 kB     ShmemPmdMapped:        0 kB
> > FileHugePages:         0 kB     FilePmdMapped:         0 kB
> > HugePages_Total:       0        HugePages_Free:        0
> > HugePages_Rsvd:        0        HugePages_Surp:        0
> > Hugepagesize:       2048 kB     Hugetlb:               0 kB
> > DirectMap4k:      541000 kB     DirectMap2M:    11907072 kB
> > DirectMap1G:    525336576 kB
> >
>
> And thanks for this information.
>
> I have set up a system to run the test, the code I am using is the top of the
> md-next branch. I will update later tonight on the status.

I have not been able to reproduce the issue after 6 hours of testing. That may
be because I am running the tests on 3 partitions of the same NVMe SSD. I will
try again on a different host with multiple SSDs.

Pawel, have you tried to reproduce the issue on the md-next branch?

Thanks,
Song



