Hello Andreas,

>>>
> Gang,
>
> On Thu, May 4, 2017 at 5:33 AM, Gang He <ghe@xxxxxxxx> wrote:
>> Hello Guys,
>>
>> I found an interesting thing on the GFS2 file system: after I did a
>> direct I/O write of a whole file, I could still see some page cache
>> pages attached to the inode.
>> This GFS2 behavior does not seem to follow POSIX file system semantics.
>> I just want to know whether this is a known issue, or something we can
>> fix.
>> By the way, I ran the same test on the EXT4 and OCFS2 file systems, and
>> their results look OK.
>> My test command lines and outputs are pasted below.
>>
>> For the EXT4 file system:
>> tb-nd1:/mnt/ext4 # rm -rf f3
>> tb-nd1:/mnt/ext4 # dd if=/dev/urandom of=./f3 bs=1M count=4 oflag=direct
>> 4+0 records in
>> 4+0 records out
>> 4194304 bytes (4.2 MB, 4.0 MiB) copied, 0.0393563 s, 107 MB/s
>> tb-nd1:/mnt/ext4 # vmtouch -v f3
>> f3
>> [                                                         ] 0/1024
>>
>>            Files: 1
>>      Directories: 0
>>   Resident Pages: 0/1024  0/4M  0%
>>          Elapsed: 0.000424 seconds
>> tb-nd1:/mnt/ext4 #
>>
>> For the OCFS2 file system:
>> tb-nd1:/mnt/ocfs2 # rm -rf f3
>> tb-nd1:/mnt/ocfs2 # dd if=/dev/urandom of=./f3 bs=1M count=4 oflag=direct
>> 4+0 records in
>> 4+0 records out
>> 4194304 bytes (4.2 MB, 4.0 MiB) copied, 0.0592058 s, 70.8 MB/s
>> tb-nd1:/mnt/ocfs2 # vmtouch -v f3
>> f3
>> [                                                         ] 0/1024
>>
>>            Files: 1
>>      Directories: 0
>>   Resident Pages: 0/1024  0/4M  0%
>>          Elapsed: 0.000226 seconds
>>
>> For the GFS2 file system:
>> tb-nd1:/mnt/gfs2 # rm -rf f3
>> tb-nd1:/mnt/gfs2 # dd if=/dev/urandom of=./f3 bs=1M count=4 oflag=direct
>> 4+0 records in
>> 4+0 records out
>> 4194304 bytes (4.2 MB, 4.0 MiB) copied, 0.0579509 s, 72.4 MB/s
>> tb-nd1:/mnt/gfs2 # vmtouch -v f3
>> f3
>> [ oo                                                  oOo ] 48/1024
>
> I cannot reproduce, at least not so easily. What kernel version is
> this? If it's not a mainline kernel, can you reproduce on mainline?

I can always reproduce it. I am using kernel version 4.11.0-rc4-2-default; although that is not the latest version, it is new enough.
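For what it's worth, vmtouch gets its residency numbers from mincore(2), so the check can be reproduced without vmtouch. Below is a minimal sketch of the same query (Linux and CPython assumed; the helper name resident_pages is mine, not from vmtouch):

```python
import ctypes
import mmap
import os

def resident_pages(path):
    """Return (resident, total) page-cache page counts for `path`,
    the same numbers vmtouch -v reports, queried via mincore(2)."""
    size = os.path.getsize(path)
    pagesize = mmap.PAGESIZE
    npages = (size + pagesize - 1) // pagesize
    if npages == 0:
        return 0, 0
    fd = os.open(path, os.O_RDONLY)
    try:
        # MAP_PRIVATE + PROT_WRITE: writable so ctypes can take the
        # mapping's address, private so the file itself cannot be modified.
        mm = mmap.mmap(fd, size, flags=mmap.MAP_PRIVATE,
                       prot=mmap.PROT_READ | mmap.PROT_WRITE)
    finally:
        os.close(fd)
    try:
        vec = (ctypes.c_ubyte * npages)()
        buf = ctypes.c_char.from_buffer(mm)   # page-aligned mapping address
        libc = ctypes.CDLL(None, use_errno=True)
        rc = libc.mincore(ctypes.c_void_p(ctypes.addressof(buf)),
                          ctypes.c_size_t(size), vec)
        del buf                               # release the buffer export
        if rc != 0:
            raise OSError(ctypes.get_errno(), "mincore failed")
        # Bit 0 of each vector byte is set if that page is resident.
        return sum(b & 1 for b in vec), npages
    finally:
        mm.close()
```

Running this on f3 right after the `dd ... oflag=direct` write should, per the results above, report 0 resident pages on the EXT4 and OCFS2 mounts and a nonzero count on the affected GFS2 mount.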
By the way, I added some printk calls to the GFS2 and OCFS2 kernel modules, and I found that GFS2 direct I/O always falls back to buffered I/O; I am not sure whether this behavior is by design. Of course, even when GFS2 falls back to buffered I/O, the code still makes sure the related page cache pages are invalidated, but the test result is not as expected, so I need to look at the code more deeply. The printk outputs look like:

[  198.176774] gfs2_file_write_iter: enter ino 132419 0 - 1048576
[  198.176785] gfs2_direct_IO: enter ino 132419 pages 0 0 - 1048576
[  198.176787] gfs2_direct_IO: exit ino 132419 - (0)  <<== gfs2_direct_IO always returns 0 and then falls back to buffered I/O; is this behavior by design?
[  198.184640] gfs2_file_write_iter: exit ino 132419 - (1048576)  <<== write_iter appears to return the right byte count.
[  198.189151] gfs2_file_write_iter: enter ino 132419 1048576 - 1048576
[  198.189163] gfs2_direct_IO: enter ino 132419 pages 8 1048576 - 1048576  <<== the inode's page count is greater than zero.
[  198.189165] gfs2_direct_IO: exit ino 132419 - (0)
[  198.195901] gfs2_file_write_iter: exit ino 132419 - (1048576)

But for OCFS2:

[  120.331053] ocfs2_file_write_iter: enter ino 297475 0 - 1048576
[  120.331065] ocfs2_direct_IO: enter ino 297475 pages 0 0 - 1048576
[  120.343129] ocfs2_direct_IO: exit ino 297475 (1048576)  <<== ocfs2_direct_IO returns the right byte count.
[  120.343132] ocfs2_file_write_iter: exit ino 297475 - (1048576)
[  120.347705] ocfs2_file_write_iter: enter ino 297475 1048576 - 1048576
[  120.347713] ocfs2_direct_IO: enter ino 297475 pages 0 1048576 - 1048576  <<== the inode's page count is always zero.
[  120.354096] ocfs2_direct_IO: exit ino 297475 (1048576)
[  120.354099] ocfs2_file_write_iter: exit ino 297475 - (1048576)

Thanks
Gang

>
> Thanks,
> Andreas