[performance] fuse: No Significant Performance Improvement with Passthrough Enabled?

I recently learned that FUSE has introduced passthrough support, which appears to significantly enhance performance, as discussed in this article: [LWN.net](https://lwn.net/Articles/832430/).

I plan to develop some upper-layer applications on top of this feature. However, during my testing I found that reading small files with passthrough enabled seems to perform worse than without it. The test program I used is here:
https://github.com/wswsmao/fuse-performance/blob/main/file_access_test.c
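
In essence, the test is a timed read loop like the sketch below (a simplified approximation of file_access_test.c, not the exact code from the repository; the buffer size and sequential/random choice are the parameters varied in the tables at the end):

```c
/* Simplified sketch of the read benchmark: read a file either
 * sequentially or at random chunk offsets with a fixed buffer size,
 * and report the elapsed time of the read() calls in milliseconds. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

static double read_file(const char *path, size_t buffer_size, int random_mode)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) { perror("open"); exit(1); }

    off_t file_size = lseek(fd, 0, SEEK_END);
    size_t nchunks = ((size_t)file_size + buffer_size - 1) / buffer_size;
    char *buf = malloc(buffer_size);

    struct timespec start, end;
    clock_gettime(CLOCK_MONOTONIC, &start);

    for (size_t i = 0; i < nchunks; i++) {
        /* sequential: walk the file in order; random: pick a random chunk */
        off_t off = random_mode
            ? (off_t)(rand() % nchunks) * (off_t)buffer_size
            : (off_t)i * (off_t)buffer_size;
        if (pread(fd, buf, buffer_size, off) < 0) { perror("pread"); exit(1); }
    }

    clock_gettime(CLOCK_MONOTONIC, &end);
    free(buf);
    close(fd);

    return (end.tv_sec - start.tv_sec) * 1000.0 +
           (end.tv_nsec - start.tv_nsec) / 1e6;
}

int main(int argc, char *argv[])
{
    if (argc != 4) {
        fprintf(stderr, "usage: %s <file> <buffer_size> <0=seq|1=rand>\n",
                argv[0]);
        return 1;
    }
    double ms = read_file(argv[1], (size_t)atol(argv[2]), atoi(argv[3]));
    printf("elapsed: %.2f ms\n", ms);
    return 0;
}
```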

I generated files of 1M, 500M, and 1000M with the following script and then read them with the test program above:
https://github.com/wswsmao/fuse-performance/blob/main/generate_large_files.sh
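
The generator only needs to produce files of the requested sizes; a minimal C stand-in for what generate_large_files.sh does is sketched below (an assumption that the script simply writes files of the given sizes filled with arbitrary data; the actual script is shell and may differ):

```c
/* Hypothetical stand-in for generate_large_files.sh: write a file of
 * the requested size (in MiB) filled with a simple byte pattern. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    if (argc != 3) {
        fprintf(stderr, "usage: %s <path> <size_mib>\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return 1; }

    static char block[1 << 20];          /* 1 MiB write unit */
    memset(block, 'a', sizeof(block));

    long size_mib = atol(argv[2]);
    for (long i = 0; i < size_mib; i++) {
        if (write(fd, block, sizeof(block)) != (ssize_t)sizeof(block)) {
            perror("write");
            return 1;
        }
    }

    close(fd);
    return 0;
}
```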

### Test Environment Information:

```
$ uname -r
6.11.5-200.fc40.x86_64
```

```
$ mount
/dev/vda1 on / type ext4 (rw,noatime)
...
```

### Testing Steps:

I cloned the latest code from the libfuse upstream repository and built it to obtain the passthrough_hp example.

The latest passthrough_hp enables passthrough by default, so for the passthrough test I used the following command:

```
ls -lh source_dir/
total 1.5G
-rw-r--r-- 1 root root  1.0M Nov 28 02:45 sequential_file_1
-rw-r--r-- 1 root root  500M Nov 28 02:45 sequential_file_2
-rw-r--r-- 1 root root 1000M Nov 28 02:45 sequential_file_3

./lattest_passthrough_hp source_dir/ mount_point/
```

For testing without passthrough, I used the following command:

```
./lattest_passthrough_hp source_dir/ mount_point/ --nopassthrough
```

Then I ran the test program against the files under mount_point/.


During debugging, for the 1M file read with a 4K buffer, I added print statements in the FUSE daemon's read handler. Without passthrough, I observed 11 prints, with the maximum read size being 131072 bytes, and I also captured 11 fuse_readahead events with ftrace. In passthrough mode, however, even after increasing the ext4 read-ahead with `blockdev --setra $num /dev/vda1`, the performance improvement was not significant.
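
For reference, in the libfuse low-level API the read handler where such a print statement goes looks roughly like the sketch below (modelled on libfuse's passthrough examples; `my_read` is an illustrative name, not the actual passthrough_hp handler, and only the handler is shown, not a complete filesystem):

```c
#define FUSE_USE_VERSION 34

#include <fuse_lowlevel.h>
#include <stdio.h>

static void my_read(fuse_req_t req, fuse_ino_t ino, size_t size,
                    off_t off, struct fuse_file_info *fi)
{
    /* Debug print: one line per FUSE_READ request that reaches the
     * daemon. The `size` argument is what showed the 131072-byte
     * maximum mentioned above; with passthrough enabled, read requests
     * are served directly from the backing file by the kernel and do
     * not reach this handler. */
    fprintf(stderr, "read: ino=%lu size=%zu off=%ld\n",
            (unsigned long)ino, size, (long)off);

    /* Reply by splicing directly from the backing file descriptor. */
    struct fuse_bufvec buf = FUSE_BUFVEC_INIT(size);
    buf.buf[0].flags = FUSE_BUF_IS_FD | FUSE_BUF_FD_SEEK;
    buf.buf[0].fd = (int)fi->fh;
    buf.buf[0].pos = off;
    fuse_reply_data(req, &buf, FUSE_BUF_SPLICE_MOVE);
}
```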

I would like to understand why, in this case, passthrough appears to perform worse than the non-passthrough mode.

Thank you for your assistance.

Best regards,

Abushwang

My test report is included below for reference.

## without passthrough

### Size = 1.0M

| Mode       | Buffer Size (bytes) | Time (ms) | Read Calls |
| ------------ | ------------- | ----------- | ------------ |
| sequential | 4096        | 7.99      | 256        |
| sequential | 131072      | 6.46      | 8          |
| sequential | 262144      | 7.52      | 4          |
| random     | 4096        | 51.40     | 256        |
| random     | 131072      | 10.62     | 8          |
| random     | 262144      | 8.69      | 4          |


### Size = 500M

| Mode       | Buffer Size (bytes) | Time (ms) | Read Calls |
| ------------ | ------------- | ----------- | ------------ |
| sequential | 4096        | 3662.68   | 128000     |
| sequential | 131072      | 3399.55   | 4000       |
| sequential | 262144      | 3565.99   | 2000       |
| random     | 4096        | 28444.48  | 128000     |
| random     | 131072      | 5012.85   | 4000       |
| random     | 262144      | 3636.87   | 2000       |

### Size = 1000M

| Mode       | Buffer Size (bytes) | Time (ms) | Read Calls |
| ------------ | ------------- | ----------- | ------------ |
| sequential | 4096        | 8164.34   | 256000     |
| sequential | 131072      | 7704.75   | 8000       |
| sequential | 262144      | 7970.08   | 4000       |
| random     | 4096        | 57275.82  | 256000     |
| random     | 131072      | 10311.90  | 8000       |
| random     | 262144      | 7839.20   | 4000       |


## with passthrough

### Size = 1.0M

| Mode       | Buffer Size (bytes) | Time (ms) | Read Calls |
| ------------ | ------------- | ----------- | ------------ |
| sequential | 4096        | 8.50      | 256        |
| sequential | 131072      | 7.54      | 8          |
| sequential | 262144      | 8.71      | 4          |
| random     | 4096        | 52.16     | 256        |
| random     | 131072      | 9.10      | 8          |
| random     | 262144      | 9.54      | 4          |


### Size = 500M

| Mode       | Buffer Size (bytes) | Time (ms) | Read Calls |
| ------------ | ------------- | ----------- | ------------ |
| sequential | 4096        | 3320.70   | 128000     |
| sequential | 131072      | 3234.08   | 4000       |
| sequential | 262144      | 2881.98   | 2000       |
| random     | 4096        | 28457.52  | 128000     |
| random     | 131072      | 4558.78   | 4000       |
| random     | 262144      | 3476.05   | 2000       |


### Size = 1000M

| Mode       | Buffer Size (bytes) | Time (ms) | Read Calls |
| ------------ | ------------- | ----------- | ------------ |
| sequential | 4096        | 6842.04   | 256000     |
| sequential | 131072      | 6677.01   | 8000       |
| sequential | 262144      | 6268.29   | 4000       |
| random     | 4096        | 58478.65  | 256000     |
| random     | 131072      | 9435.85   | 8000       |
| random     | 262144      | 7031.16   | 4000       |

