Hi David
As your patch was written on top on linux-next I was required to make
some small modifications to make it work on mainline (6.13-rc6).
The following patch is working fine for me on mainline, but i think it
would be better to wait for your confirmation / validation (or new
patch) before applying it on production.
#-------- PATCH --------#
diff --git a/linux-6.13-rc6/nba/_orig_fs.netfs.direct_write.c
b/linux-6.13-rc6/fs/netfs/direct_write.c
index 88f2adf..94a1ee8 100644
--- a/linux-6.13-rc6/nba/_orig_fs.netfs.direct_write.c
+++ b/linux-6.13-rc6/fs/netfs/direct_write.c
@@ -67,7 +67,7 @@ ssize_t netfs_unbuffered_write_iter_locked(struct
kiocb *iocb, struct iov_iter *
* allocate a sufficiently large bvec array and may
shorten the
* request.
*/
- if (async || user_backed_iter(iter)) {
+ if (user_backed_iter(iter)) {
n = netfs_extract_user_iter(iter, len,
&wreq->iter, 0);
if (n < 0) {
ret = n;
@@ -77,6 +77,11 @@ ssize_t netfs_unbuffered_write_iter_locked(struct
kiocb *iocb, struct iov_iter *
wreq->direct_bv_count = n;
wreq->direct_bv_unpin =
iov_iter_extract_will_pin(iter);
} else {
+ /* If this is a kernel-generated async DIO
request,
+ * assume that any resources the iterator points
to
+ * (eg. a bio_vec array) will persist till the
end of
+ * the op.
+ */
wreq->iter = *iter;
}
#-------- TESTS --------#
Using this patch Linux 6.13-rc6 build with no error and '--direct-io=on'
is working :
18:38:47 root@deb12-lab-10d:~# uname -a
Linux deb12-lab-10d.lab.lan 6.13.0-rc6-amd64 #0 SMP PREEMPT_DYNAMIC Mon
Jan 6 18:14:07 CET 2025 x86_64 GNU/Linux
18:39:29 root@deb12-lab-10d:~# losetup
NAME SIZELIMIT OFFSET AUTOCLEAR RO BACK-FILE
DIO LOG-SEC
/dev/loop2046 0 0 0 0
/mnt/FBX24T/FS-LAN/bckcrypt2046 1 4096
18:39:32 root@deb12-lab-10d:~# dmsetup ls | grep bckcrypt
bckcrypt (254:7)
18:39:55 root@deb12-lab-10d:~# cryptsetup status bckcrypt
/dev/mapper/bckcrypt is active and is in use.
type: LUKS2
cipher: aes-xts-plain64
keysize: 512 bits
key location: keyring
device: /dev/loop2046
loop: /mnt/FBX24T/FS-LAN/bckcrypt2046
sector size: 512
offset: 32768 sectors
size: 8589901824 sectors
mode: read/write
18:40:36 root@deb12-lab-10d:~# df -h | egrep 'cifs|bckcrypt'
//10.0.10.100/FBX24T cifs 22T 13T 9,0T 60% /mnt/FBX24T
/dev/mapper/bckcrypt btrfs 4,0T 3,3T 779G 82%
/mnt/bckcrypt
09:08:44 root@deb12-lab-10d:~# LANG=en_US.UTF-8
09:08:46 root@deb12-lab-10d:~# dd if=/dev/zero
of=/mnt/bckcrypt/test/test.dd bs=256M count=16 oflag=direct
status=progress
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 14 s, 302 MB/s
16+0 records in
16+0 records out
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 14.2061 s, 302 MB/s
No write errors using '--direct-io=on' option of losetup with this patch
=> writing to the back-file is more than 20x faster ...
It seems to be ok !
Let me know if something's wrong in this patch or if it can safely be
used in production.
Again thanks everyone for help.
Nicolas
Le 2025-01-06 13:07, nicolas.baranger@xxxxxx a écrit :
Hi David
Thanks for the job !
I will buid Linux 6.10 and mainline with the provided change and I'm
comming here as soon as I get results from tests (CET working time).
Thanks again for help in this issue
Nicolas
Le 2025-01-06 12:37, David Howells a écrit :
Hi Nicolas,
Does the attached fix your problem?
David
---
netfs: Fix kernel async DIO
Netfslib needs to be able to handle kernel-initiated asynchronous DIO
that
is supplied with a bio_vec[] array. Currently, because of the async
flag,
this gets passed to netfs_extract_user_iter() which throws a warning
and
fails because it only handles IOVEC and UBUF iterators. This can be
triggered through a combination of cifs and a loopback blockdev with
something like:
mount //my/cifs/share /foo
dd if=/dev/zero of=/foo/m0 bs=4K count=1K
losetup --sector-size 4096 --direct-io=on /dev/loop2046 /foo/m0
echo hello >/dev/loop2046
This causes the following to appear in syslog:
WARNING: CPU: 2 PID: 109 at fs/netfs/iterator.c:50
netfs_extract_user_iter+0x170/0x250 [netfs]
and the write to fail.
Fix this by removing the check in netfs_unbuffered_write_iter_locked()
that
causes async kernel DIO writes to be handled as userspace writes.
Note
that this change relies on the kernel caller maintaining the existence
of
the bio_vec array (or kvec[] or folio_queue) until the op is complete.
Fixes: 153a9961b551 ("netfs: Implement unbuffered/DIO write support")
Reported by: Nicolas Baranger <nicolas.baranger@xxxxxx>
Closes:
https://lore.kernel.org/r/fedd8a40d54b2969097ffa4507979858@xxxxxx/
Signed-off-by: David Howells <dhowells@xxxxxxxxxx>
cc: Steve French <smfrench@xxxxxxxxx>
cc: Jeff Layton <jlayton@xxxxxxxxxx>
cc: netfs@xxxxxxxxxxxxxxx
cc: linux-cifs@xxxxxxxxxxxxxxx
cc: linux-fsdevel@xxxxxxxxxxxxxxx
---
fs/netfs/direct_write.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/fs/netfs/direct_write.c b/fs/netfs/direct_write.c
index eded8afaa60b..42ce53cc216e 100644
--- a/fs/netfs/direct_write.c
+++ b/fs/netfs/direct_write.c
@@ -67,7 +67,7 @@ ssize_t netfs_unbuffered_write_iter_locked(struct
kiocb *iocb, struct iov_iter *
* allocate a sufficiently large bvec array and may shorten the
* request.
*/
- if (async || user_backed_iter(iter)) {
+ if (user_backed_iter(iter)) {
n = netfs_extract_user_iter(iter, len, &wreq->buffer.iter, 0);
if (n < 0) {
ret = n;
@@ -77,6 +77,11 @@ ssize_t netfs_unbuffered_write_iter_locked(struct
kiocb *iocb, struct iov_iter *
wreq->direct_bv_count = n;
wreq->direct_bv_unpin = iov_iter_extract_will_pin(iter);
} else {
+ /* If this is a kernel-generated async DIO request,
+ * assume that any resources the iterator points to
+ * (eg. a bio_vec array) will persist till the end of
+ * the op.
+ */
wreq->buffer.iter = *iter;
}
}