> From: Matthew Wilcox <mailto:willy@xxxxxxxxxxxxx> > Sent: Wednesday, January 24, 2024 10:03 PM > To: Mo, Yuezhang <mailto:Yuezhang.Mo@xxxxxxxx> > Cc: mailto:linkinjeon@xxxxxxxxxx; mailto:sj1557.seo@xxxxxxxxxxx; mailto:linux-fsdevel@xxxxxxxxxxxxxxx > Subject: Re: [PATCH] exfat: fix file not locking when writing zeros in exfat_file_mmap() > > > (3) xfs_io -t -f -c "mmap -rw 0 3072" -c "mwrite -S 0x5a 2048 512" $filename > > (3.1) "mmap -rw 0 3072" > > - write zeros to 1024~3071 > > - "valid_size" is changed to 3072 > > - "size" is still 4096 > > That's not what the code says you do. Is this from a trace or were you > making up an example? > > loff_t start = ((loff_t)vma->vm_pgoff << PAGE_SHIFT); > loff_t end = min_t(loff_t, i_size_read(inode), > start + vma->vm_end - vma->vm_start); > ret = exfat_file_zeroed_range(file, ei->valid_size, end); > > vm_end - vm_start will be 4kB because Linux rounds to PAGE_SIZE even if > you ask to map 3072 bytes. I just gave an example to explain why there is an operation of writing zeros in exfat_file_mmap(). Yes, it's better to explain it with traced data, the data in my example confuses you. > Update ->valid_size in the writeback path. If I'm reading > exfat_get_block() correctly, you already correctly zero pages in the page > cache that are read after ->valid_size, so there is very little work for > you to do. Do you mean to move update valid_size to ->page_mkwrite()? If yes, we need to consider this case. (1) The file size is 12KB (3 pages), valid_size is 0. (2) mmap maps these 3 pages. (3) Only the second page is written. (4) Need to be done in ->page_mkwrite(), (4.1) Mark the second page as dirty(Currently implemented in filemap_page_mkwrite()) (4.2) Mark the first page as dirty(do extra in exfat). (4.3) Update valid_size to the end of the second page(do extra in exfat). Is my understanding correct? If yes, how to do (4.2)? Are there examples of this? > Oh! I just figured out why you probably do it this way. > > (1) xfs_io -t -f -c "pwrite -S 0x59 0 1024" $filename > (2) xfs_io -t -f -c "truncate 4T" $filename > (3) xfs_io -t -f -c "mmap -rw 3T 4096" -c "mwrite -S 0x5a 3T 512" $filename > > Now (at whatever point you're delaying writing zeroes to) you have to > write 3TB of zeroes. >From this example, it seems that the current implementation is not good. I referred to mmap code of ntfs3, and ntfs3 also writes zeros in ->mmap. Is the current implementation acceptable as a workaround? > And it's probably better to do that at mmap time > than at page fault time, so you can still return an error. It's a bit > weird to return ENOSPC from mmap, but here we are. exfat_file_zeroed_range() never returns NOSPEC, because exfat_file_zeroed_range() doesn't change the file size(i.e. not allocate new clusters). > > It'd be nice to have a comment to explain this. Also, it seems to me > that I can write a script that floods the kernel log with: > > exfat_err(inode->i_sb, > "mmap: fail to zero from %llu to %llu(%d)", > start, end, ret); > > That error message should probably be taken out entirely (maybe use a > tracepoint if you really want some kind of logging). The error message can be seen directly by developers during development. How about use pr_err_ratelimited() ?