Re: [PATCH] fsync_range, was: Re: munmap, msync: synchronization

(Oops -- I see that I forgot to attach the test program in my last
mail. Appended below, now.)

On 04/23/2014 05:45 PM, Christoph Hellwig wrote:
> On Wed, Apr 23, 2014 at 04:33:06PM +0200, Michael Kerrisk (man-pages) wrote:
>> # Take journaling and atime out of the equation:
>>
>> $ sudo umount /dev/sdb6
>> $ sudo tune2fs -O ^has_journal /dev/sdb6
>> [sudo] password for mtk: 
>> tune2fs 1.42.8 (20-Jun-2013)
>> $ sudo mount -o norelatime,strictatime /dev/sdb6 /testfs
> 
> The second strictatime argument overrides the earlier norelatime,
> so you put it into the picture.

Oh -- have I misunderstood something? I was wanting classical behavior:
atime always updated (but only synced to disk by FILESYNC). Is that not
what I should get with norelatime+strictatime?
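
(Side note -- just a sketch, reusing the mount point from the transcript
above: the options the kernel actually applied can be read back from
/proc/mounts, which may help settle the question:)

```shell
# Read back the effective mount options; with strictatime in force,
# neither "relatime" nor "noatime" should appear for /testfs.
grep ' /testfs ' /proc/mounts || echo "mount not present on this machine"
```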

>> But I have a question:
>>
>> When I precreate a 10MB file, and repeat the tests (this time with 
>> 100 loops), I no longer see any significant difference between 
>> FFILESYNC and FDATASYNC. What am I missing? Sample runs here, 
>> though I did the tests repeatedly with broadly similar results 
>> each time:
> 
> Not sure.  Do you also see this on other filesystems?

=======

So, here's some results from XFS:

# 1000 loops. 1MB file, 1MB fsync_range()
# As with ext4, FDATASYNC is faster than FFILESYNC (as expected)

$ sudo umount /dev/sdb6; sudo mount -o norelatime,strictatime /dev/sdb6 /testfs
$ time ./t_fsync_range /testfs/f 1000 0 1000000 f 0 1000000
fsync_range(3, 0x20, 0, 1000000)
Performed 16000 writes
Performed 1000 sync operations

real	0m52.264s
user	0m0.018s
sys	0m0.926s
$ sudo umount /dev/sdb6; sudo mount -o norelatime,strictatime /dev/sdb6 /testfs
$ time ./t_fsync_range /testfs/f 1000 0 1000000 d 0 1000000
fsync_range(3, 0x10, 0, 1000000)
Performed 16000 writes
Performed 1000 sync operations

real	0m33.689s
user	0m0.002s
sys	0m0.915s

# (Note that I did not disable XFS journalling--it's not possible to
# do so, right?)

====

# 100 loops, 100MB file, 100MB fsync_range()
# FDATASYNC and FFILESYNC times are again similar

$ time ./t_fsync_range /testfs/f 100 0 100000000 f 0 100000000
fsync_range(3, 0x20, 0, 100000000)
Performed 152600 writes
Performed 100 sync operations

real	4m45.257s
user	0m0.004s
sys	0m5.607s

$ time ./t_fsync_range /testfs/f 100 0 100000000 d 0 100000000
fsync_range(3, 0x10, 0, 100000000)
Performed 152600 writes
Performed 100 sync operations

real	4m43.925s
user	0m0.010s
sys	0m3.824s

# Again, the same pattern: no difference between FFILESYNC and FDATASYNC

=====
On JFS, I get

1000 loops, 1MB file, 1MB fsync_range, FFILESYNC:
* Quite a lot of variability (11.3 to 16.5 secs)
1000 loops, 1MB file, 1MB fsync_range, FDATASYNC:
* Quite a lot of variability (8.6 to 10.9 secs)
==> FDATASYNC is on average faster than FFILESYNC

100 loops, 100 MB file, 100MB fsync_range, FFILESYNC:
281 seconds (just a single test)
100 loops, 100 MB file, 100MB fsync_range, FDATASYNC:
280 seconds (just a single test)

So, again, the same pattern: for a large file sync, there seems to be no
difference between FFILESYNC and FDATASYNC.

>> Add another question: is there any piece of sync_file_range() 
>> functionality that could or should be incorporated in this API?
> 
> I don't think so.  sync_file_range is a complete mess and impossible
> to use correctly for data integrity operations.  Especially the whole
> notion that submitting I/O and waiting for it are separate operations
> is incompatible with a data integrity call.

Okay -- I just thought it worth checking.

Cheers,

Michael

========
#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

#define errExit(msg) 	do { perror(msg); exit(EXIT_FAILURE); \
			} while (0)

/* flags for fsync_range */
#define FDATASYNC	0x0010
#define FFILESYNC	0x0020

#define SYS_fsync_range 317

static int
fsync_range(unsigned int fd, int how, loff_t start, loff_t length)
{
    return syscall(SYS_fsync_range, fd, how, start, length);
}

#define BUF_SIZE 65536
static char buf[BUF_SIZE];

int
main(int argc, char *argv[])
{
    int j, fd, nloops, how;
    size_t writeLen, syncLen, wlen;
    size_t bufSize;
    off_t writeOffset, syncOffset;
    int scnt, wcnt;

    if (argc != 8 || strcmp(argv[1], "--help") == 0) {
        fprintf(stderr, "%s pathname nloops write-offset write-length {f|d} "
                "sync-offset sync-len\n", argv[0]);
        exit((argc != 8) ? EXIT_FAILURE : EXIT_SUCCESS);
    }

    fd = open(argv[1], O_RDWR | O_CREAT, S_IRUSR | S_IWUSR);
    if (fd == -1)
	errExit("open");

    nloops = atoi(argv[2]);
    writeOffset = atoi(argv[3]);
    writeLen = atoi(argv[4]);
    how = (argv[5][0] == 'd') ? FDATASYNC :
	  (argv[5][0] == 'f') ? FFILESYNC : 0;
    syncOffset = atoi(argv[6]);
    syncLen = atoi(argv[7]);

    if (how != 0)
        fprintf(stderr, "fsync_range(%d, 0x%x, %lld, %zd)\n",
	        fd, how, (long long) syncOffset, syncLen);

    scnt = 0;
    wcnt = 0;

    for (j = 0; j < nloops; j++) {
	memset(buf, j % 256, BUF_SIZE);
	if (lseek(fd, writeOffset, SEEK_SET) == -1)
	    errExit("lseek");

	wlen = writeLen;
        while (wlen > 0) {
            bufSize = (wlen > BUF_SIZE) ? BUF_SIZE : wlen;
	    wlen -= bufSize;
    
	    if (write(fd, buf, bufSize) != (ssize_t) bufSize) {
	        fprintf(stderr, "Write failed\n");
	        exit(EXIT_FAILURE);
	    }

	    wcnt++;
        }

	if (how != 0) {
	    scnt++;
	    if (fsync_range(fd, how, syncOffset, syncLen) == -1)
	        errExit("fsync_range");
	}
    }

    fprintf(stderr, "Performed %d writes\n", wcnt);
    fprintf(stderr, "Performed %d sync operations\n", scnt);
    exit(EXIT_SUCCESS);
}



-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/



