Re: The ctime of fstat is not correct which lead to "tar" utility error

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



George,

Sorry. I sent you a version of the fix which was stale. Can you try with:
https://review.gluster.org/20549

This patch passes the test case you've given.

regards,
Raghavendra

On Fri, Jul 20, 2018 at 2:59 PM, Lian, George (NSB - CN/Hangzhou) <george.lian@xxxxxxxxxxxxxxx> wrote:

Hi,

 

Sorry, there seems still have issue.

 

We use dd application of linux tools instead of my demo program, and if the file is not exist before dd, the issue still be there.

 

The test command is

rm -rf /mnt/test/file.txt ; dd if=/dev/zero of=/mnt/test/file.txt bs=512 count=1 oflag=sync;stat /mnt/test/file.txt;tar -czvf /tmp/abc.gz

1)     If we set md-cache-timeout to 0, the issue will not happen

2)     If we set md-cache-timeout to 1, the issue will 100% reproduced! (with new patch you mentioned in the mail)

 

Please see detail test result as the below:

 

bash-4.4# gluster v set export md-cache-timeout 0

volume set: failed: Volume export does not exist

bash-4.4# gluster v set test md-cache-timeout 0

volume set: success

bash-4.4# dd if=/dev/zero of=/mnt/test/file.txt bs=512 count=1 oflag=sync;stat /mnt/test/file.txt;tar -czvf /tmp/abc.gz /mnt/test/file.txt;stat /mnt/test/file.txt^C

bash-4.4# rm /mnt/test/file.txt

bash-4.4# dd if=/dev/zero of=/mnt/test/file.txt bs=512 count=1 oflag=sync;stat /mnt/test/file.txt;tar -czvf /tmp/abc.gz /mnt/test/file.txt;stat /mnt/test/file.txt

1+0 records in

1+0 records out

512 bytes copied, 0.00932571 s, 54.9 kB/s

  File: /mnt/test/file.txt

  Size: 512             Blocks: 1          IO Block: 131072 regular file

Device: 33h/51d Inode: 9949244856126716752  Links: 1

Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)

Access: 2018-07-13 17:55:02.756000000 +0000

Modify: 2018-07-13 17:55:02.764000000 +0000

Change: 2018-07-13 17:55:02.768000000 +0000

Birth: -

tar: Removing leading `/' from member names

/mnt/test/file.txt

  File: /mnt/test/file.txt

  Size: 512             Blocks: 1          IO Block: 131072 regular file

Device: 33h/51d Inode: 9949244856126716752  Links: 1

Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)

Access: 2018-07-13 17:55:02.776000000 +0000

Modify: 2018-07-13 17:55:02.764000000 +0000

Change: 2018-07-13 17:55:02.768000000 +0000

Birth: -

bash-4.4# gluster v set test md-cache-timeout 1

volume set: success

bash-4.4# rm /mnt/test/file.txt

bash-4.4# dd if=/dev/zero of=/mnt/test/file.txt bs=512 count=1 oflag=sync;stat /mnt/test/file.txt;tar -czvf /tmp/abc.gz /mnt/test/file.txt;stat /mnt/test/file.txt

1+0 records in

1+0 records out

512 bytes copied, 0.0107589 s, 47.6 kB/s

  File: /mnt/test/file.txt

  Size: 512             Blocks: 1          IO Block: 131072 regular file

Device: 33h/51d Inode: 13569976446871695205  Links: 1

Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)

Access: 2018-07-13 17:55:11.548000000 +0000

Modify: 2018-07-13 17:55:11.560000000 +0000

Change: 2018-07-13 17:55:11.560000000 +0000

Birth: -

tar: Removing leading `/' from member names

/mnt/test/file.txt

tar: /mnt/test/file.txt: file changed as we read it

  File: /mnt/test/file.txt

  Size: 512             Blocks: 1          IO Block: 131072 regular file

Device: 33h/51d Inode: 13569976446871695205  Links: 1

Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)

Access: 2018-07-13 17:55:11.580000000 +0000

Modify: 2018-07-13 17:55:11.560000000 +0000

Change: 2018-07-13 17:55:11.564000000 +0000

Birth: -

 

 

Best Regards,

George

From: gluster-devel-bounces@gluster.org [mailto:gluster-devel-bounces@gluster.org] On Behalf Of Raghavendra Gowdappa
Sent: Friday, July 20, 2018 4:01 PM
To: Lian, George (NSB - CN/Hangzhou) <george.lian@xxxxxxxxxxxxxxx>
Cc: Zhang, Bingxuan (NSB - CN/Hangzhou) <bingxuan.zhang@nokia-sbell.com>; Raghavendra G <raghavendra@xxxxxxxxxxx>; Gluster-devel@xxxxxxxxxxx


Subject: Re: The ctime of fstat is not correct which lead to "tar" utility error

 

 

 

On Fri, Jul 20, 2018 at 1:22 PM, Lian, George (NSB - CN/Hangzhou) <george.lian@xxxxxxxxxxxxxxx> wrote:

>>>We recently identified an issue with stat-prefetch. Fix can be found at:

>>>https://review.gluster.org/#/c/20410/11

 

>>>Can you let us know whether this helps?

 

 

The patch can resolve this issue, I have verified in Gluster 4.2(master trunk branch) and Gluster 3.12.3!

 

Thanks we'll merge it.

 

 

Thanks & Best Regards,

George

 

From: gluster-devel-bounces@gluster.org [mailto:gluster-devel-bounces@gluster.org] On Behalf Of Raghavendra Gowdappa
Sent: Thursday, July 19, 2018 5:06 PM
To: Lian, George (NSB - CN/Hangzhou) <
george.lian@xxxxxxxxxxxxxxx>
Cc: Zhang, Bingxuan (NSB - CN/Hangzhou) <
bingxuan.zhang@nokia-sbell.com>; Gluster-devel@xxxxxxxxxxx; Raghavendra G <raghavendra@xxxxxxxxxxx>
Subject: Re: The ctime of fstat is not correct which lead to "tar" utility error

 

 

 

On Thu, Jul 19, 2018 at 2:29 PM, Lian, George (NSB - CN/Hangzhou) <george.lian@xxxxxxxxxxxxxxx> wrote:

Hi, Gluster Experts,

 

In glusterfs version 3.12.3, There seems a fstat issue for ctime after we use fsync,

We have a demo execute binary which write some data and then do fsync for this file, it named as tt,

Then run tar command right after tt command, it will always error with tar: /mnt/test/file1.txt: file changed as we read it

 

The command output is list as the below, the source code and volume info configuration attached FYI,

This issue will be 100% reproducible! (/mnt/test is mountpoint of glusterfs volume test , which the volume info is attached in mail)

------------------------------------------------------------------

./tt;tar -czvf /tmp/abc.gz /mnt/test/file1.txt

mtime:1531247107.272000000

ctime:1531247107.272000000

tar: Removing leading `/' from member names

/mnt/test/file1.txt

tar: /mnt/test/file1.txt: file changed as we read it

----------------------------------------------------------------------

 

After my investigation, the xattrop for changelog is later than the fsync response , this is mean:

In function  afr_fsync_cbk will call afr_delayed_changelog_wake_resume (this, local->fd, stub);

 

In our case, it always a pending changelog , so glusterfs save the metadata information to stub, and handle pending changelog first,

But the changelog will also change the ctime, from the packet captured by tcpdump, the response packet of xattrop will not include the metadata information,  and the wake_resume also not handle this metadata changed case.

 

So in this case, the metadata in mdc_cache is not right, and when cache is valid, the application will get WRONG metadata!

 

For verify my guess, if I change the configuration for this volume

gluster v set test md-cache-timeout 0 or

gluster v set export stat-prefetch off

This issue will be GONE!

 

We recently identified an issue with stat-prefetch. Fix can be found at:

 

Can you let us know whether this helps?

 

 

 

And I restore the configuration to default, which mean stat-prefetch is on and md-cache-timeout is 1 second,

I try invalidate the md-cache in source code as the below in function mdc_fync_cbk on md-cache.c

The issue also will be GONE!

 

So GLusterFS Experts,

Could you please verify this issue, and share your comments on my investigation?

And your finally solutions is highly appreciated!

 

Does the following fix you've posted solves the problem?

 

 

------------------------------------changes in function mdc_fsync_cbk

int

mdc_fsync_cbk (call_frame_t *frame, void *cookie, xlator_t *this,

               int32_t op_ret, int32_t op_errno,

               struct iatt *prebuf, struct iatt *postbuf, dict_t *xdata)

{

        mdc_local_t  *local = NULL;

 

        local = frame->local;

 

        if (op_ret != 0)

                goto out;

 

        if (!local)

                goto out;

 

        mdc_inode_iatt_set_validate(this, local->fd->inode, prebuf, postbuf,

                                     _gf_true);

/* new added for ctime issue*/

        mdc_inode_iatt_invalidate(this, local->fd->inode);                                   

/* new added end*/

out:

        MDC_STACK_UNWIND (fsync, frame, op_ret, op_errno, prebuf, postbuf,

                          xdata);

 

        return 0;

}

---------------------------------------------------------------------------------

Best Regards,

George

 

 


_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-devel

[Index of Archives]     [Gluster Users]     [Ceph Users]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux