On 8/16/23 9:32 AM, Yin, Fengwei wrote: > Hi Jens, > > On 7/6/2023 3:29 PM, kernel test robot wrote: >> >> >> Hello, >> >> kernel test robot noticed a -33.1% regression of stress-ng.io-uring.ops_per_sec on: >> >> >> commit: caec5ebe77f97d948dcf46f07d622bda7f1f6dfd ("io_uring: rely solely on FMODE_NOWAIT") >> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master >> >> >> NOTE: >> one thing we want to mention is we reported >> "[linux-next:master] [io_uring] caec5ebe77: stress-ng.io-uring.ops_per_sec 32.5% improvement" >> on >> https://lore.kernel.org/all/202306061247.510feb23-oliver.sang@xxxxxxxxx/ >> however, by further checking, we realized the test machine ran in abnormal >> status at that time due to BIOS issue, which we finally fixed recently. >> please just ignore that report upon linus-next/master. >> >> >> testcase: stress-ng >> test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory >> parameters: >> >> nr_threads: 100% >> testtime: 60s >> class: pts >> test: io-uring >> cpufreq_governor: performance >> >> >> In addition to that, the commit also has significant impact on the following tests: >> >> +------------------+-------------------------------------------------------------------------------------------------+ >> | testcase: change | stress-ng: stress-ng.io-uring.ops_per_sec -1.3% regression | >> | test machine | 36 threads 1 sockets Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz (Cascade Lake) with 128G memory | >> | test parameters | class=pts | >> | | cpufreq_governor=performance | >> | | nr_threads=100% | >> | | test=io-uring | >> | | testtime=60s | >> +------------------+-------------------------------------------------------------------------------------------------+ >> >> >> If you fix the issue in a separate patch/commit (i.e. not just a new version of >> the same patch/commit), kindly add following tags >> | Reported-by: kernel test robot <oliver.sang@xxxxxxxxx> >> | Closes: https://lore.kernel.org/oe-lkp/202307060958.3594f22f-oliver.sang@xxxxxxxxx >> >> >> Details are as below: >> --------------------------------------------------------------------------------------------------> >> >> >> To reproduce: >> >> git clone https://github.com/intel/lkp-tests.git >> cd lkp-tests >> sudo bin/lkp install job.yaml # job file is attached in this email >> bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run >> sudo bin/lkp run generated-yaml-file >> >> # if come across any failure that blocks the test, >> # please remove ~/.lkp and /lkp dir to run from a clean state. >> >> ========================================================================================= >> class/compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime: >> pts/gcc-12/performance/x86_64-rhel-8.3/100%/debian-11.1-x86_64-20220510.cgz/lkp-icl-2sp7/io-uring/stress-ng/60s >> >> commit: >> e9833d8701 ("block: mark bdev files as FMODE_NOWAIT if underlying device supports it") >> caec5ebe77 ("io_uring: rely solely on FMODE_NOWAIT") > About this regression, some findings in my side: > - LKP use initrd to do stress-ng testing. That means the stress-ng is using the > file in initrd as test file. > - The commit caec5ebe77 makes io_uring rely on FMODE_NOWAIT. But the tmpfs and > the file on initrd doesn't has that bit set. I suppose this is why we can see > this regression with LKP which is using initrd. > > I tried to let stress-ng.io_uring use the file on tmpfs and also can see > signifcient performance change with this commit. > > - If apply following change based on commit caec5ebe77: > diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c > index 7c426584e35a..e755697c773f 100644 > --- a/io_uring/io_uring.c > +++ b/io_uring/io_uring.c > @@ -1765,6 +1765,17 @@ static void io_iopoll_req_issued(struct io_kiocb *req, unsigned int issue_flags) > */ > static bool __io_file_supports_nowait(struct file *file, umode_t mode) > { > + if (S_ISREG(mode)) { > + struct block_device *bdev = file->f_inode->i_sb->s_bdev; > + > + if (IS_ENABLED(CONFIG_BLOCK) && > + (!bdev || bdev_nowait(bdev)) && > + !io_is_uring_fops(file)) > + return true; > + > + return false; > + } > + > /* any ->read/write should understand O_NONBLOCK */ > if (file->f_flags & O_NONBLOCK) > return true; > > The regression is gone with LKP. > > - If apply following change based on commit caec5ebe77: > diff --git a/mm/shmem.c b/mm/shmem.c > index e40a08c5c6d7..e9df664f0063 100644 > --- a/mm/shmem.c > +++ b/mm/shmem.c > @@ -3962,9 +3962,16 @@ const struct address_space_operations shmem_aops = { > }; > EXPORT_SYMBOL(shmem_aops); > > +static int shmem_file_open(struct inode *inode, struct file *filp) > +{ > + filp->f_mode |= FMODE_NOWAIT; > + > + return generic_file_open(inode, filp); > +} > + > static const struct file_operations shmem_file_operations = { > .mmap = shmem_mmap, > - .open = generic_file_open, > + .open = shmem_file_open, > .get_unmapped_area = shmem_get_unmapped_area, > #ifdef CONFIG_TMPFS > .llseek = shmem_file_llseek, > > The performance change when running stress-ng.io_uring with testing file > in tmpfs is gone. > > This is just the information FYI. I may miss something obviously here. Thanks. This actually highlighted a problem with the old nowait logic, in that it assumed !bdev would mean that nowait was fine. Looking at shmem, we definitely need IOCB_NOWAIT handling in there to make that safe. So the old code was buggy, and conversely, we can't also just add the FMODE_NOWAIT without first making those improvements to shmem first. Basically you'd want to ensure that the read_iter and write_iter paths honor IOCB_NOWAIT. Once that's done, then FMODE_NOWAIT can indeed be set in the open helper. I might take a stab at this, but I'm out sick right now so don't think it'd be cohesive if I did it right now. -- Jens Axboe