Re: [fuse-devel] Writing to FUSE via mmap extremely slow (sometimes) on some machines?

Thanks for clarifying. I have modified the mmap test program (see
attached) to optionally read in the entire file when the WORKAROUND=
environment variable is set, thereby preventing the FUSE reads in the
write phase. I can now see a batch of reads, followed by a batch of
writes.

What’s interesting: when polling using “while :; do grep ^Bdi
/sys/kernel/debug/bdi/0:93/stats; sleep 0.1; done” and running the
mmap test program, I see:

BdiDirtied:            3566304 kB
BdiWritten:            3563616 kB
BdiWriteBandwidth:       13596 kBps

BdiDirtied:            3566304 kB
BdiWritten:            3563616 kB
BdiWriteBandwidth:       13596 kBps

BdiDirtied:            3566528 kB (+224 kB) <-- starting to dirty pages
BdiWritten:            3564064 kB (+448 kB) <-- starting to write
BdiWriteBandwidth:       10700 kBps <-- only bandwidth update!

BdiDirtied:            3668224 kB (+101696 kB) <-- all pages dirtied
BdiWritten:            3565632 kB (+1568 kB)
BdiWriteBandwidth:       10700 kBps

BdiDirtied:            3668224 kB
BdiWritten:            3665536 kB (+99904 kB) <-- all pages written
BdiWriteBandwidth:       10700 kBps

BdiDirtied:            3668224 kB
BdiWritten:            3665536 kB
BdiWriteBandwidth:       10700 kBps

This seems to suggest that the bandwidth estimate is only updated during
the ramp-up of the transfer (it changes once, from 13596 to 10700 kBps),
but not during the bulk of the transfer itself, resulting in an inaccurate
estimate. This effect is worse when the test program doesn’t pre-read the
output file, so that the kernel sends out fewer FUSE write requests.

On Mon, Mar 9, 2020 at 3:36 PM Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
>
> On Mon, Mar 9, 2020 at 3:32 PM Michael Stapelberg
> <michael+lkml@xxxxxxxxxxxxx> wrote:
> >
> > Here’s one more thing I noticed: when polling
> > /sys/kernel/debug/bdi/0:93/stats, I see that BdiDirtied and BdiWritten
> > remain at their original values while the kernel sends FUSE read
> > requests, and only go up when the kernel transitions into sending
> > FUSE write requests. Notably, the page dirtying throttling happens in
> > the read phase, which is most likely why the write bandwidth is
> > (correctly) measured as 0.
> >
> > Do we have any ideas on why the kernel sends FUSE reads at all?
>
> Memory writes (stores) need the memory page to be up-to-date wrt. the
> backing file before proceeding.   This means that if the page hasn't
> yet been cached by the kernel, it needs to be read first.
>
> Thanks,
> Miklos
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h> 
#include <fcntl.h>
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <stdint.h>

/*
 * An implementation of copy ("cp") that uses memory maps.  Various
 * error checking has been removed to promote readability
 */

// Where we want the source file's memory map to live in virtual memory
// The destination file resides immediately after the source file
#define MAP_LOCATION 0x6100

int main (int argc, char *argv[]) {
 int fdin, fdout;
 char *src, *dst;
 struct stat statbuf;
 off_t fileSize = 0;

 if (argc != 3) {
   fprintf (stderr, "usage: a.out <fromfile> <tofile>\n");
   exit(1);
 }

 /* open the input file */
 if ((fdin = open (argv[1], O_RDONLY)) < 0) {
   fprintf (stderr, "can't open %s for reading\n", argv[1]);
   exit(1);
 }

 /* open/create the output file */
 if ((fdout = open (argv[2], O_RDWR | O_CREAT | O_TRUNC, 0600)) < 0) {
   fprintf (stderr, "can't create %s for writing\n", argv[2]);
   exit(1);
 }

 /* find size of input file */
 if (fstat (fdin, &statbuf) < 0) {
   fprintf (stderr, "fstat error\n");
   exit(1);
 }
 fileSize = statbuf.st_size;

 /* go to the location corresponding to the last byte */
 if (lseek (fdout, fileSize - 1, SEEK_SET) == -1) {
   fprintf (stderr, "lseek error\n");
   exit(1);
 }

 /* write a dummy byte at the last location so the file has its final size */
 if (write (fdout, "", 1) != 1) {
   fprintf (stderr, "write error\n");
   exit(1);
 }
 
 /* 
  * memory map the input file.  Only the first two arguments are
  * interesting: 1) the location and 2) the size of the memory map 
  * in virtual memory space. Note that the location is only a "hint";
  * the OS can choose to return a different virtual memory address.
  * This is illustrated by the printf command below.
 */

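 src = mmap ((void*) MAP_LOCATION, fileSize,
	     PROT_READ, MAP_SHARED | MAP_POPULATE, fdin, 0);

 /* memory map the output file after the input file */
 dst = mmap ((char*) MAP_LOCATION + fileSize, fileSize,
	     PROT_READ | PROT_WRITE, MAP_SHARED, fdout, 0);

 /* the memcpy below would crash on a failed mapping, so check both */
 if (src == MAP_FAILED || dst == MAP_FAILED) {
   perror ("mmap");
   exit(1);
 }

 printf("pid: %d\n", getpid());
 /* no literal 0x here: glibc's %p already prints that prefix */
 printf("Mapped src: %p  and dst: %p\n", (void*) src, (void*) dst);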

 if (getenv("WORKAROUND") != NULL) {
   printf("workaround: reading output file before dirtying its pages\n");
   uint8_t sum = 0;
   uint8_t *ptr = (uint8_t*)dst;
   for (off_t i = 0; i < fileSize; i++) {
     sum += *ptr;
     ptr++;
   }
   printf("sum: %d\n", sum);
   sleep(1);
   printf("writing\n");
 }

 /* Copy the input file to the output file */
 memcpy (dst, src, fileSize);

 printf("memcpy done\n");

 // we should probably unmap memory and close the files
} /* main */
