Hi Jan,

On 3/4/2024 12:59 PM, Yujie Liu wrote:
> From the perf profile, we can see that the contention of folio lru lock
> becomes more intense. We also did a simple one-file "dd" test. Looks
> like it is more likely that low-order folios are allocated after commit
> ab4443fe3c (Fengwei will help provide the data soon). Therefore, the
> average folio size decreases while the total folio amount increases,
> which leads to touching lru lock more often.

I did the following testing:

I created an xfs image in tmpfs, mounted it on /mnt, and created a 12G
test file (sparse-file). One process then read the file on an Ice Lake
machine with 256G of system memory, so we can be sure this is a
sequential file read with no page reclaim triggered.

At the same time, I profiled the distribution of the order parameter of
filemap_alloc_folio() calls to understand how the large folio orders for
the page cache are generated.

Here is what we got:

- Commit f0b7a0d1d46625db:

  $ dd bs=4k if=/mnt/sparse-file of=/dev/null
  3145728+0 records in
  3145728+0 records out
  12884901888 bytes (13 GB, 12 GiB) copied, 2.52208 s, 5.01 GB/s

  filemap_alloc_folio
       page order    : count     distribution
          0          : 57       |                                        |
          1          : 0        |                                        |
          2          : 20       |                                        |
          3          : 2        |                                        |
          4          : 4        |                                        |
          5          : 98300    |****************************************|

- Commit ab4443fe3ca6:

  $ dd bs=4k if=/mnt/sparse-file of=/dev/null
  3145728+0 records in
  3145728+0 records out
  12884901888 bytes (13 GB, 12 GiB) copied, 2.51469 s, 5.1 GB/s

  filemap_alloc_folio
       page order    : count     distribution
          0          : 21       |                                        |
          1          : 0        |                                        |
          2          : 196615   |****************************************|
          3          : 98303    |*******************                     |
          4          : 98303    |*******************                     |

The file read throughput is almost the same, but the order distribution
looks like a regression with ab4443fe3ca6: more small-order page cache
folios are allocated than with the parent commit.

Thanks.

Regards
Yin, Fengwei
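
As a rough cross-check of the two histograms above, the total folio count
and the average folio size can be derived from the per-order counts. This
is only a sketch: it assumes 4 KiB base pages, and the names (summarize,
PARENT, PATCHED) are made up for illustration and are not part of any
kernel tooling.

```python
# Per-order allocation counts copied from the filemap_alloc_folio
# histograms in this mail: {page order: number of folios allocated}.
PARENT = {0: 57, 1: 0, 2: 20, 3: 2, 4: 4, 5: 98300}      # f0b7a0d1d46625db
PATCHED = {0: 21, 1: 0, 2: 196615, 3: 98303, 4: 98303}   # ab4443fe3ca6


def summarize(hist):
    """Return (folio count, pages covered, average pages per folio)."""
    folios = sum(hist.values())
    # An order-n folio spans 2**n base pages.
    pages = sum(count << order for order, count in hist.items())
    return folios, pages, pages / folios


for name, hist in (("parent", PARENT), ("patched", PATCHED)):
    folios, pages, avg = summarize(hist)
    print(f"{name}: {folios} folios, {pages} pages, {avg:.2f} pages/folio")
```

Under these assumptions the averages come out at roughly 32 pages per
folio for the parent commit and roughly 8 pages per folio for
ab4443fe3ca6, which means about four times as many folios for the same
12G read.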