On Fri, 10 Mar 2023 at 14:43, Christian Löhle <CLoehle@xxxxxxxxxxxxxx> wrote:
>
> I have benchmarked the FUA/Cache behavior a bit.
> I don't have an actual filesystem benchmark that does what I wanted and is easy to port to the target so I used:
>
> # call with
> # for loop in {1..3}; do sudo dd if=/dev/urandom bs=1M of=/dev/mmcblk2; done; for loop in {1..5}; do time ./filesystembenchmark.sh; umount /mnt; done
> mkfs.ext4 -F /dev/mmcblk2
> mount /dev/mmcblk2 /mnt
> for i in {1..3}
> do
> cp -r linux-6.2.2 /mnt/$i
> done
> for i in {1..3}
> do
> rm -r /mnt/$i
> done
> for i in {1..3}
> do
> cp -r linux-6.2.2 /mnt/$i
> done
>
>
> I found a couple of DUTs that I can link, I also tested one industrial card.
>
> DUT1: blue PCB Foresee eMMC
> https://pine64.com/product/32gb-emmc-module/
> DUT2: green PCB SiliconGo eMMC
> Couldn't find that one online anymore unfortunately
> DUT3: orange hardkernel PCB 8GB
> https://www.hardkernel.com/shop/8gb-emmc-module-c2-android/
> DUT4: orange hardkernel PCB white dot
> https://rlx.sk/en/odroid/3198-16gb-emmc-50-module-xu3-android-for-odroid-xu3.html
> DUT5: Industrial card

Thanks a lot for helping out with testing! Much appreciated!

>
>
> The test issued 461 DO_REL_WR during one of the iterations for DUT5
>
> DUT1:
> Cache, no FUA:
> 13:04.49
> 13:13.82
> 13:30.59
> 13:28:13
> 13:20:64
> FUA:
> 13:30.32
> 13:36.26
> 13:10.86
> 13:32.52
> 13:48.59
>
> DUT2:
> FUA:
> 8:11.24
> 7:47.73
> 7:48.00
> 7:48.18
> 7:47.38
> Cache, no FUA:
> 8:10.30
> 7:48.97
> 7:48.47
> 7:47.93
> 7:44.18
>
> DUT3:
> Cache, no FUA:
> 7:02.82
> 6:58.94
> 7:03.20
> 7:00.27
> 7:00.88
> FUA:
> 7:05.43
> 7:03.44
> 7:04.82
> 7:03.26
> 7:04.74
>
> DUT4:
> FUA:
> 7:23.92
> 7:20.15
> 7:20.52
> 7:19.10
> 7:20.71
> Cache, no FUA:
> 7:20.23
> 7:20.48
> 7:19.94
> 7:18.90
> 7:19.88

Without going into the details of the above, it seems like for DUT1,
DUT2, DUT3 and DUT4 there are good reasons why we should move forward
with the $subject patch.

Do you agree?

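For reference, a minimal sketch (Python, not part of the original
thread) of how the quoted DUT1 to DUT4 timings can be reduced to one
number per device: the mean FUA time minus the mean cache-flush time.
The helper names to_seconds() and fua_penalty() are made up for this
illustration. It assumes the timings are minutes:seconds.hundredths as
printed by "time", and that entries such as "13:28:13" are the same
format with a colon in place of the dot.

```python
import re
from statistics import mean


def to_seconds(stamp: str) -> float:
    """Convert 'M:SS.cc' to seconds.

    Assumption: 'M:SS:cc' (seen twice in the DUT1 data) means the same
    thing, so both ':' and '.' are accepted as separators.
    """
    minutes, seconds, hundredths = (int(p) for p in re.split(r"[:.]", stamp))
    return minutes * 60 + seconds + hundredths / 100


# Timings copied verbatim from the quoted results for DUT1 to DUT4.
RESULTS = {
    "DUT1": {
        "cache": ["13:04.49", "13:13.82", "13:30.59", "13:28:13", "13:20:64"],
        "fua": ["13:30.32", "13:36.26", "13:10.86", "13:32.52", "13:48.59"],
    },
    "DUT2": {
        "cache": ["8:10.30", "7:48.97", "7:48.47", "7:47.93", "7:44.18"],
        "fua": ["8:11.24", "7:47.73", "7:48.00", "7:48.18", "7:47.38"],
    },
    "DUT3": {
        "cache": ["7:02.82", "6:58.94", "7:03.20", "7:00.27", "7:00.88"],
        "fua": ["7:05.43", "7:03.44", "7:04.82", "7:03.26", "7:04.74"],
    },
    "DUT4": {
        "cache": ["7:20.23", "7:20.48", "7:19.94", "7:18.90", "7:19.88"],
        "fua": ["7:23.92", "7:20.15", "7:20.52", "7:19.10", "7:20.71"],
    },
}


def fua_penalty(runs: dict) -> float:
    """Mean FUA time minus mean cache-flush time, in seconds."""
    fua = mean(to_seconds(s) for s in runs["fua"])
    cache = mean(to_seconds(s) for s in runs["cache"])
    return fua - cache


for dut, runs in RESULTS.items():
    print(f"{dut}: FUA vs. cache flush {fua_penalty(runs):+.2f} s (mean of 5 runs)")
```

Under those assumptions the difference comes out positive for all four
devices, i.e. the FUA runs are slower on average, although with five
runs each and a visibly slower first iteration on DUT2 the margins are
small compared to the run-to-run spread.
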
>
> Cache, no FUA:
> 7:19.36
> 7:02.11
> 7:01.53
> 7:01.35
> 7:00.37
> Cache, no FUA CQE:
> 7:17.55
> 7:00.73
> 6:59.25
> 6:58.44
> 6:58.60
> FUA:
> 7:15.10
> 6:58.99
> 6:58.94
> 6:59.17
> 6:60.00
> FUA CQE:
> 7:11.03
> 6:58.04
> 6:56.89
> 6:56.43
> 6:56:28
>
> If anyone has any comments or disagrees with the benchmark, or has a specific eMMC to test, let me know.

If I understand correctly, for DUT5, it seems like using FUA may be
slightly better than just cache-flushing, right?

For CQE, it seems like FUA could be even slightly better, at least for
DUT5. Do you know if REQ_OP_FLUSH translates into MMC_ISSUE_DCMD or
MMC_ISSUE_SYNC in your case? See mmc_cqe_issue_type().

When it comes to CQE, maybe Adrian has some additional thoughts around
this? Perhaps we should keep using REQ_FUA, if we have CQE?

Kind regards
Uffe