I recently learned that FUSE has introduced passthrough support, which
appears to significantly enhance performance, as discussed in this
article: [LWN.net](https://lwn.net/Articles/832430/).
I plan to develop some upper-layer applications based on this feature.
However, during my testing, I found that for small files (1M), read
performance with passthrough seems to be slightly worse than without it.
Below are the details of my test cases:
https://github.com/wswsmao/fuse-performance/blob/main/file_access_test.c
I generated files of 1M, 500M, and 1000M to read in these tests, using
the following script:
https://github.com/wswsmao/fuse-performance/blob/main/generate_large_files.sh
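Roughly, each measurement reads the whole file in fixed-size chunks and records the elapsed time and the number of read calls. The following is a simplified Python sketch of that loop, not the actual C program linked above. The function name `timed_read`, its parameters, and the assumption that random mode visits every block exactly once in shuffled order are illustrative only.

```python
import os
import random
import time


def timed_read(path, buf_size, mode="sequential", seed=0):
    """Read a file in buf_size chunks; return (elapsed_ms, read_calls)."""
    size = os.path.getsize(path)
    n_blocks = (size + buf_size - 1) // buf_size  # round up
    offsets = [i * buf_size for i in range(n_blocks)]
    if mode == "random":
        # Assumption: every block is read once, in shuffled order.
        random.Random(seed).shuffle(offsets)

    fd = os.open(path, os.O_RDONLY)
    calls = 0
    start = time.perf_counter()
    try:
        for off in offsets:
            os.pread(fd, buf_size, off)  # one read call per block
            calls += 1
    finally:
        os.close(fd)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return elapsed_ms, calls
```

Under these assumptions, a 1M file read with a 4096-byte buffer issues 256 read calls, which matches the "Read Calls" column in the report at the end of this post.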
### Test Environment Information:
```
$ uname -r
6.11.5-200.fc40.x86_64
```
```
$ mount
/dev/vda1 on / type ext4 (rw,noatime)
...
```
### Testing Steps:
I cloned the latest code from the libfuse upstream community and
compiled it to obtain passthrough_hp.
The latest passthrough_hp supports passthrough by default. Therefore,
when testing with passthrough, I used the following command:
```
$ ls -lh source_dir/
total 1.5G
-rw-r--r-- 1 root root 1.0M Nov 28 02:45 sequential_file_1
-rw-r--r-- 1 root root 500M Nov 28 02:45 sequential_file_2
-rw-r--r-- 1 root root 1000M Nov 28 02:45 sequential_file_3
$ ./lattest_passthrough_hp source_dir/ mount_point/
```
For testing without passthrough, I used the following command:
```
$ ./lattest_passthrough_hp source_dir/ mount_point/ --nopassthrough
```
Then, I executed the test script on mount_point.
During debugging, I took the case of the 1M file read with a 4K buffer
and added print statements to the FUSE daemon's read function. Without
passthrough, I observed 11 prints, with a maximum read size of 131072
bytes, and ftrace likewise captured 11 fuse_readahead operations. With
passthrough, however, even after increasing the ext4 read-ahead size
with `blockdev --setra $num /dev/vda1`, there was no significant
performance improvement.
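For reference, the arithmetic behind these numbers can be sketched as follows. `blockdev --setra` takes its value in 512-byte sectors, so a read-ahead size in bytes has to be converted before it is passed as `$num`. The helper names `setra_sectors` and `min_requests` are hypothetical and only illustrate the calculation.

```python
SECTOR_BYTES = 512  # blockdev --setra counts 512-byte sectors


def setra_sectors(readahead_bytes):
    """Convert a read-ahead size in bytes to a --setra sector count."""
    if readahead_bytes % SECTOR_BYTES != 0:
        raise ValueError("read-ahead size must be a multiple of 512 bytes")
    return readahead_bytes // SECTOR_BYTES


def min_requests(file_bytes, max_request_bytes):
    """Fewest requests needed to cover the file at the given maximum size."""
    return -(-file_bytes // max_request_bytes)  # ceiling division


if __name__ == "__main__":
    # Matching the 131072-byte maximum read size seen in the daemon.
    print(setra_sectors(131072))
    # Lower bound on daemon reads for the 1M file.
    print(min_requests(1 << 20, 131072))
```

A 1M file needs at least 8 requests of 131072 bytes, so the 11 reads observed without passthrough are consistent with some of the requests being smaller than the maximum.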
I would like to understand why, in this case, passthrough seems to
perform worse than the non-passthrough mode.
Thank you for your assistance.
Best regards,
Abushwang
Attached is my test report for your reference.
## Without passthrough
### Size = 1.0M
| Mode | Buffer Size (bytes) | Time (ms) | Read Calls |
| ------------ | ------------- | ----------- | ------------ |
| sequential | 4096 | 7.99 | 256 |
| sequential | 131072 | 6.46 | 8 |
| sequential | 262144 | 7.52 | 4 |
| random | 4096 | 51.40 | 256 |
| random | 131072 | 10.62 | 8 |
| random | 262144 | 8.69 | 4 |
### Size = 500M
| Mode | Buffer Size (bytes) | Time (ms) | Read Calls |
| ------------ | ------------- | ----------- | ------------ |
| sequential | 4096 | 3662.68 | 128000 |
| sequential | 131072 | 3399.55 | 4000 |
| sequential | 262144 | 3565.99 | 2000 |
| random | 4096 | 28444.48 | 128000 |
| random | 131072 | 5012.85 | 4000 |
| random | 262144 | 3636.87 | 2000 |
### Size = 1000M
| Mode | Buffer Size (bytes) | Time (ms) | Read Calls |
| ------------ | ------------- | ----------- | ------------ |
| sequential | 4096 | 8164.34 | 256000 |
| sequential | 131072 | 7704.75 | 8000 |
| sequential | 262144 | 7970.08 | 4000 |
| random | 4096 | 57275.82 | 256000 |
| random | 131072 | 10311.90 | 8000 |
| random | 262144 | 7839.20 | 4000 |
## With passthrough
### Size = 1.0M
| Mode | Buffer Size (bytes) | Time (ms) | Read Calls |
| ------------ | ------------- | ----------- | ------------ |
| sequential | 4096 | 8.50 | 256 |
| sequential | 131072 | 7.54 | 8 |
| sequential | 262144 | 8.71 | 4 |
| random | 4096 | 52.16 | 256 |
| random | 131072 | 9.10 | 8 |
| random | 262144 | 9.54 | 4 |
### Size = 500M
| Mode | Buffer Size (bytes) | Time (ms) | Read Calls |
| ------------ | ------------- | ----------- | ------------ |
| sequential | 4096 | 3320.70 | 128000 |
| sequential | 131072 | 3234.08 | 4000 |
| sequential | 262144 | 2881.98 | 2000 |
| random | 4096 | 28457.52 | 128000 |
| random | 131072 | 4558.78 | 4000 |
| random | 262144 | 3476.05 | 2000 |
### Size = 1000M
| Mode | Buffer Size (bytes) | Time (ms) | Read Calls |
| ------------ | ------------- | ----------- | ------------ |
| sequential | 4096 | 6842.04 | 256000 |
| sequential | 131072 | 6677.01 | 8000 |
| sequential | 262144 | 6268.29 | 4000 |
| random | 4096 | 58478.65 | 256000 |
| random | 131072 | 9435.85 | 8000 |
| random | 262144 | 7031.16 | 4000 |
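To make the two reports easier to compare, the relative change in read time can be computed from the tables above. This is a small Python sketch; the times are copied from the tables, while the dictionary layout and the function names `percent_change` and `compare` are purely illustrative.

```python
BUFFERS = (4096, 131072, 262144)

# Times in ms from the tables above, keyed by (file size, mode);
# each tuple is ordered like BUFFERS.
WITHOUT = {
    ("1M", "sequential"): (7.99, 6.46, 7.52),
    ("1M", "random"): (51.40, 10.62, 8.69),
    ("500M", "sequential"): (3662.68, 3399.55, 3565.99),
    ("500M", "random"): (28444.48, 5012.85, 3636.87),
    ("1000M", "sequential"): (8164.34, 7704.75, 7970.08),
    ("1000M", "random"): (57275.82, 10311.90, 7839.20),
}
WITH = {
    ("1M", "sequential"): (8.50, 7.54, 8.71),
    ("1M", "random"): (52.16, 9.10, 9.54),
    ("500M", "sequential"): (3320.70, 3234.08, 2881.98),
    ("500M", "random"): (28457.52, 4558.78, 3476.05),
    ("1000M", "sequential"): (6842.04, 6677.01, 6268.29),
    ("1000M", "random"): (58478.65, 9435.85, 7031.16),
}


def percent_change(without_ms, with_ms):
    """Positive means passthrough was slower, negative means faster."""
    return (with_ms - without_ms) / without_ms * 100.0


def compare():
    """Return (size, mode, buffer, percent change) for every table row."""
    rows = []
    for (size, mode), base in WITHOUT.items():
        for buf, t0, t1 in zip(BUFFERS, base, WITH[(size, mode)]):
            rows.append((size, mode, buf, percent_change(t0, t1)))
    return rows


if __name__ == "__main__":
    for size, mode, buf, pct in compare():
        print(f"{size:>5} {mode:<10} {buf:>6} {pct:+6.1f}%")
```

For example, the 1M sequential read with a 4096-byte buffer comes out roughly 6% slower with passthrough, while the same case on the 1000M file is roughly 16% faster.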