Re: [RFC 0/7] ext4: Allocator changes for atomic write support with DIO

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 04/12/2023 13:38, Ojaswin Mujoo wrote:
So are we supposed to be doing atomic writes on unwritten ranges only in the
file to get the aligned allocations?
If we do an atomic write on a hole, ext4 will give us an aligned extent
provided the hole is big enough to accomodate it.

However, if we do an atomic write on a existing extent (written or
unwritten) ext4 would check if it satisfies the alignment and length
requirement and returns an error if it doesn't.

This seems a rather big drawback.

Since we don't have cow
like functionality afaik the only way we could let this kind of write go
through is by punching the pre-existing extent which is not something we
can directly do in the same write operation, hence we return error.

Well, as you prob saw, for XFS we were relying on forcing extent alignment, and not CoW (yet).


I actually tried that, and I got a WARN triggered:

# mkfs.ext4 /dev/sda
mke2fs 1.46.5 (30-Dec-2021)
Creating filesystem with 358400 1k blocks and 89760 inodes
Filesystem UUID: 7543a44b-2957-4ddc-9d4a-db3a5fd019c9
Superblock backups stored on blocks:
         8193, 24577, 40961, 57345, 73729, 204801, 221185

Allocating group tables: done
Writing inode tables: done
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done

[   12.745889] mkfs.ext4 (150) used greatest stack depth: 13304 bytes left
# mount /dev/sda mnt
[   12.798804] EXT4-fs (sda): mounted filesystem
7543a44b-2957-4ddc-9d4a-db3a5fd019c9 r/w with ordered data mode. Quota
mode: none.
# touch mnt/file
#
# /test-statx -a /root/mnt/file
statx(/root/mnt/file) = 0
dump_statx results=5fff
   Size: 0               Blocks: 0          IO Block: 1024    regular file
Device: 08:00           Inode: 12          Links: 1
Access: (0644/-rw-r--r--)  Uid:     0   Gid:     0
Access: 2023-12-04 10:27:40.002848720+0000
Modify: 2023-12-04 10:27:40.002848720+0000
Change: 2023-12-04 10:27:40.002848720+0000
  Birth: 2023-12-04 10:27:40.002848720+0000
stx_attributes_mask=0x703874
         STATX_ATTR_WRITE_ATOMIC set
         unit min: 1024
         uunit max: 524288
Attributes: 0000000000400000 (........ ........ ........ ........
........ .?--.... ..---... .---.-..)
#



looks ok so far, then write 4KB at offset 0:

# /test-pwritev2 -a -d -p 0 -l 4096  /root/mnt/file
file=/root/mnt/file write_size=4096 offset=0 o_flags=0x4002 wr_flags=0x24

...

Please note that I tested on my own dev branch, which contains changes over
[1], but I expect it would not make a difference for this test.
Hmm this should not ideally happen, can you please share your test
script with me if possible?

It's doing nothing special, just RWF_ATOMIC flag is set for DIO write:

https://github.com/johnpgarry/linux/blob/236870d48ecb19c1cf89dc439e188182a0524cd4/samples/vfs/test-pwritev2.c


Script to test using pwritev2() can be found here:
https://urldefense.com/v3/__https://gist.github.com/OjaswinM/e67accee3cbb7832bd3f1a9543c01da9__;!!ACWV5N9M2RV99hQ!Lr0j4iDHrfXisXOGZ82HNPefBtVDDGe9zbLhey7rRDfPE7A_tsrrQ9Dw_4Ng_qS7xTGCZaEWBKtd6pqA_LIBfA$
Please note that the posix_memalign() call in the program should PAGE align.
Why do you say that? direct IO seems to be working when the userspace
buffer is 512 byte aligned, am I missing something?

ah, sorry, if you just use 1x IOV vector then no special alignment are required, so ignore this. Indeed, I need to improve kernel checks for alignment anyway.

Thanks,
John





[Index of Archives]     [Reiser Filesystem Development]     [Ceph FS]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite National Park]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]     [Linux Media]

  Powered by Linux