Re: cephfs issue with moving files between data pools gives Input/output error


 



On Mon, Oct 1, 2018 at 12:43 PM Marc Roos <M.Roos@xxxxxxxxxxxxxxxxx> wrote:
Hmmm, I did not know that, so does it become a soft link or something?

This is totally new to me, and also not what I would expect of a mv on a
fs. I know this is normal to expect when copying between pools, also
from the s3cmd client. But I think most people will not expect this
behaviour. Can't the move be implemented as a move?

How can users even know which folders have a 'different layout'?
What happens if we export such a mixed-pool filesystem via smb? How
would smb deal with the 'move' between those directories?

Since the CephX permissions are thoroughly outside of POSIX, handling this is unfortunately just your problem. :(

Consider it the other way around — what if a mv *did* copy the file data into a new pool, and somebody who had the file open was suddenly no longer able to access it? There's no feasible way for us to handle that with rules that fall inside of POSIX; what we have now is better.

John's right; it would be great if we could do a server-side "re-stripe" or "re-layout" or something, but that will also be an "outside POSIX" operation and never the default.
-Greg
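
In practice that means the client key used for the mount needs OSD caps on
both data pools if files are going to be renamed between directories with
different layouts. A rough sketch using the names that appear later in this
thread (client.cephfs.m, fs_meta, fs_data.ec, fs_data.r1); read access on the
old pool should be enough for the moved files, use rwx if the client also
writes there:

    # Re-set the client's caps so objects left behind in the replicated
    # pool stay readable after a mv into the EC-backed directory.
    ceph auth caps client.cephfs.m \
        mds 'allow rw path=/m' \
        mgr 'allow r' \
        mon 'allow r' \
        osd 'allow rwx pool=fs_meta, allow rwx pool=fs_data.ec, allow r pool=fs_data.r1'

    # Check the result:
    ceph auth get client.cephfs.m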
 




-----Original Message-----
From: Gregory Farnum [mailto:gfarnum@xxxxxxxxxx]
Sent: Monday, 1 October 2018 21:28
To: Marc Roos
Cc: ceph-users; jspray; ukernel
Subject: Re: cephfs issue with moving files between data
pools gives Input/output error

Moving a file into a directory with a different layout does not, and is
not intended to, copy the underlying file data into a different pool
with the new layout. If you want to do that you have to make it happen
yourself by doing a copy.
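
In other words, "doing a copy" means re-writing the file so its data gets
re-striped under the directory's layout. A minimal sketch (the mount point
and file names below are only examples; the layout xattrs are read with
getfattr from the attr package):

    # Which pool does the directory's layout point at, and which pool does
    # the file's own layout (fixed when the file was created) still use?
    getfattr -n ceph.dir.layout /mnt/cephfs/a
    getfattr -n ceph.file.layout /mnt/cephfs/a/testfile.txt

    # Re-write the data: the new file inherits the directory's layout and
    # so lands in the directory's data pool, then replace the original.
    cp -p /mnt/cephfs/a/testfile.txt /mnt/cephfs/a/testfile.txt.new
    mv /mnt/cephfs/a/testfile.txt.new /mnt/cephfs/a/testfile.txt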

On Mon, Oct 1, 2018 at 12:16 PM Marc Roos <M.Roos@xxxxxxxxxxxxxxxxx>
wrote:



        I will explain the test again, I think you might have some bug in
        your cephfs copy between data pools.

        c04 has mounted the root cephfs
        /a (has data pool a, ec21)
        /test (has data pool b, r1)

        test2 has mounted
        /m  (nfs mount of cephfs /a)
        /m2 (cephfs mount of /a)

        Creating the test file.
        [root@c04 test]# echo asdfasdfasdfasdfasdfasdfasdfasdfasdf >
        testfile.txt

        Then, on c04, I move the test file from the test folder (pool b) to
        the a folder/pool.

        Now on test2
        [root@test2 m]# ls -arlt
        -rw-r--r--  1 nobody nobody            21 Oct  1 20:48 r1.txt
        -rw-r--r--  1 nobody nobody            21 Oct  1 20:49 r1-copy.txt
        -rw-r--r--  1 nobody nobody            37 Oct  1 21:02 testfile.txt

        [root@test2 /]# cat /mnt/m/testfile.txt
        cat: /mnt/m/old/testfile.txt: Input/output error

        [root@test2 /]# cat /mnt/m2/testfile.txt
        cat: /mnt/m2/old/testfile.txt: Operation not permitted

        Now I am creating a copy of the test file in the same directory back
        on c04.

        [root@c04 a]# cp testfile.txt testfile-copy.txt
        [root@c04 a]# ls -alrt
        -rw-r--r-- 1 root root         21 Oct  1 20:49 r1-copy.txt
        -rw-r--r-- 1 root root         37 Oct  1 21:02 testfile.txt
        -rw-r--r-- 1 root root         37 Oct  1 21:07 testfile-copy.txt

        Now I try to access the copy of testfile.txt back on test2 (without
        unmounting or changing permissions).

        [root@test2 /]# cat /mnt/m/testfile-copy.txt
        asdfasdfasdfasdfasdfasdfasdfasdfasdf
        [root@test2 /]# cat /mnt/m2/testfile-copy.txt
        asdfasdfasdfasdfasdfasdfasdfasdfasdf
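
        (To confirm where the moved file's data actually lives, run on c04 in
        the a folder from a node with an admin keyring for rados; the pool
        names are the ones from the caps quoted further down, and the object
        name is just the file's inode number in hex plus ".00000000":)

        ino=$(printf '%x' $(stat -c %i testfile.txt))
        rados -p fs_data.r1 stat ${ino}.00000000    # expected: found here
        rados -p fs_data.ec stat ${ino}.00000000    # expected: not found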







        -----Original Message-----
        From: Yan, Zheng [mailto:ukernel@xxxxxxxxx]
        Sent: Saturday, 29 September 2018 6:55
        To: Marc Roos
        Subject: Re: cephfs issue with moving files between data
        pools gives Input/output error

        check_pool_perm on pool 30 ns  need Fr, but no read perm

        The client does not have permission to read the pool.  ceph-fuse did
        return EPERM for the kernel readpage request, but the kernel returns
        -EIO for any readpage error.
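
        (For reference, the pool id from that log line can be mapped back to
        a name, and the caps the client actually holds can be checked; the
        client name below is the one from Marc's config further down:)

        # which pool is id 30?
        ceph osd pool ls detail | grep '^pool 30 '
        # which OSD caps does this client hold?
        ceph auth get client.cephfs.m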
        On Fri, Sep 28, 2018 at 10:09 PM Marc Roos <M.Roos@xxxxxxxxxxxxxxxxx>
        wrote:
        >
        >
        > Is this useful? I think this is the section of the client log from
        > when I ran:
        >
        > [@test2 m]$ cat out6
        > cat: out6: Input/output error
        >
        > 2018-09-28 16:03:39.082200 7f1ad01f1700 10 client.3246756
fill_statx
        > on 0x100010943bc snap/devhead mode 040557 mtime 2018-09-28
        > 14:49:35.349370 ctime 2018-09-28 14:49:35.349370
        > 2018-09-28 16:03:39.082223 7f1ad01f1700  3 client.3246756
ll_getattrx
        > 0x100010943bc.head = 0
        > 2018-09-28 16:03:39.082727 7f1ae813f700 10 client.3246756
fill_statx
        > on
        > 0x10001698ac5 snap/devhead mode 0100644 mtime 2018-09-28
        > 14:45:50.323273 ctime 2018-09-28 14:47:47.028679
        > 2018-09-28 16:03:39.082737 7f1ae813f700  3 client.3246756
ll_getattrx
        > 0x10001698ac5.head = 0
        > 2018-09-28 16:03:39.083149 7f1ac07f8700  3 client.3246756 ll_open

        > 0x10001698ac5.head 0
        > 2018-09-28 16:03:39.083160 7f1ac07f8700 10 client.3246756
_getattr
        > mask As issued=1
        > 2018-09-28 16:03:39.083165 7f1ac07f8700  3 client.3246756
may_open
        > 0x7f1a7810ad00 = 0
        > 2018-09-28 16:03:39.083169 7f1ac07f8700 10 break_deleg: breaking
        > delegs on 0x10001698ac5.head(faked_ino=0 ref=2 ll_ref=1
cap_refs={}
        > open={1=1}
        > mode=100644 size=17/0 nlink=1 mtime=2018-09-28 14:45:50.323273
        > caps=pAsLsXsFs(0=pAsLsXsFs) objectset[0x10001698ac5 ts 0/0
objects 0
        > dirty_or_tx 0] parents=0x7f1a780f1dd0 0x7f1a7810ad00)
        > 2018-09-28 16:03:39.083183 7f1ac07f8700 10 delegations_broken:
        > delegations empty on 0x10001698ac5.head(faked_ino=0 ref=2
ll_ref=1
        > cap_refs={} open={1=1} mode=100644 size=17/0 nlink=1
mtime=2018-09-28
        > 14:45:50.323273 caps=pAsLsXsFs(0=pAsLsXsFs)
objectset[0x10001698ac5 ts

        > 0/0 objects 0 dirty_or_tx 0] parents=0x7f1a780f1dd0
0x7f1a7810ad00)
        > 2018-09-28 16:03:39.083198 7f1ac07f8700 10 client.3246756
        > choose_target_mds from caps on inode
0x10001698ac5.head(faked_ino=0
        > ref=3 ll_ref=1 cap_refs={} open={1=1} mode=100644 size=17/0
nlink=1
        > mtime=2018-09-28 14:45:50.323273 caps=pAsLsXsFs(0=pAsLsXsFs)
        > objectset[0x10001698ac5 ts 0/0 objects 0 dirty_or_tx 0]
        > parents=0x7f1a780f1dd0 0x7f1a7810ad00)
        > 2018-09-28 16:03:39.083209 7f1ac07f8700 10 client.3246756
send_request

        > rebuilding request 1911 for mds.0
        > 2018-09-28 16:03:39.083218 7f1ac07f8700 10 client.3246756
send_request
        > client_request(unknown.0:1911 open #0x10001698ac5 2018-09-28
        > 16:03:39.083194 caller_uid=501, caller_gid=501{501,}) v4 to mds.0
        > 2018-09-28 16:03:39.084088 7f1a82ffd700  5 client.3246756
        > set_cap_epoch_barrier epoch = 24093
        > 2018-09-28 16:03:39.084097 7f1a82ffd700 10 client.3246756  mds.0
seq
        > now
        > 1
        > 2018-09-28 16:03:39.084108 7f1a82ffd700  5 client.3246756
        > handle_cap_grant on in 0x10001698ac5 mds.0 seq 7 caps now
pAsLsXsFscr
        > was pAsLsXsFs
        > 2018-09-28 16:03:39.084118 7f1a82ffd700 10 client.3246756
        > update_inode_file_time 0x10001698ac5.head(faked_ino=0 ref=3
ll_ref=1
        > cap_refs={} open={1=1} mode=100644 size=17/0 nlink=1
mtime=2018-09-28
        > 14:45:50.323273 caps=pAsLsXsFs(0=pAsLsXsFs)
objectset[0x10001698ac5 ts

        > 0/0 objects 0 dirty_or_tx 0] parents=0x7f1a780f1dd0
0x7f1a7810ad00)
        > pAsLsXsFs ctime 2018-09-28 14:47:47.028679 mtime 2018-09-28
        > 14:45:50.323273
        > 2018-09-28 16:03:39.084133 7f1a82ffd700 10 client.3246756   
grant, new
        > caps are Fcr
        > 2018-09-28 16:03:39.084143 7f1a82ffd700 10 client.3246756
insert_trace

        > from 2018-09-28 16:03:39.083217 mds.0 is_target=1 is_dentry=0
        > 2018-09-28 16:03:39.084147 7f1a82ffd700 10 client.3246756 
features
        > 0x3ffddff8eea4fffb
        > 2018-09-28 16:03:39.084148 7f1a82ffd700 10 client.3246756
        > update_snap_trace len 48
        > 2018-09-28 16:03:39.084181 7f1a82ffd700 10 client.3246756
        > update_snap_trace snaprealm(0x1 nref=755 c=0 seq=1 parent=0x0
        > my_snaps=[] cached_snapc=1=[]) seq 1 <= 1 and same parent,
SKIPPING
        > 2018-09-28 16:03:39.084186 7f1a82ffd700 10 client.3246756  hrm
        > is_target=1 is_dentry=0
        > 2018-09-28 16:03:39.084195 7f1a82ffd700 10 client.3246756
        > add_update_cap issued pAsLsXsFscr -> pAsLsXsFscr from mds.0 on
        > 0x10001698ac5.head(faked_ino=0 ref=3 ll_ref=1 cap_refs={}
open={1=1}
        > mode=100644 size=17/0 nlink=1 mtime=2018-09-28 14:45:50.323273
        > caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[0x10001698ac5 ts 0/0
objects

        > 0 dirty_or_tx 0] parents=0x7f1a780f1dd0 0x7f1a7810ad00)
        > 2018-09-28 16:03:39.084268 7f1ac07f8700 10 client.3246756
_create_fh
        > 0x10001698ac5 mode 1
        > 2018-09-28 16:03:39.084280 7f1ac07f8700  3 client.3246756 ll_open

        > 0x10001698ac5.head 0 = 0 (0x7f1a24028e10)
        > 2018-09-28 16:03:39.084373 7f1a82ffd700 10 client.3246756
put_inode on

        > 0x10001698ac5.head(faked_ino=0 ref=5 ll_ref=1 cap_refs={}
open={1=1}
        > mode=100644 size=17/0 nlink=1 mtime=2018-09-28 14:45:50.323273
        > caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[0x10001698ac5 ts 0/0
objects

        > 0 dirty_or_tx 0] parents=0x7f1a780f1dd0 0x7f1a7810ad00)
        > 2018-09-28 16:03:39.084392 7f1a82ffd700 10 client.3246756
put_inode on

        > 0x10001698ac5.head(faked_ino=0 ref=4 ll_ref=1 cap_refs={}
open={1=1}
        > mode=100644 size=17/0 nlink=1 mtime=2018-09-28 14:45:50.323273
        > caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[0x10001698ac5 ts 0/0
objects

        > 0 dirty_or_tx 0] parents=0x7f1a780f1dd0 0x7f1a7810ad00)
        > 2018-09-28 16:03:39.084899 7f1af0161700  3 client.3246756 ll_read

        > 0x7f1a24028e10 0x10001698ac5  0~17
        > 2018-09-28 16:03:39.084911 7f1af0161700 10 client.3246756
        > check_pool_perm on pool 30 ns  need Fr, but no read perm
        > 2018-09-28 16:03:39.086697 7f1ac02ba700  3 client.3246756
ll_release
        > (fh)0x7f1a24028e10 0x10001698ac5
        > 2018-09-28 16:03:39.086707 7f1ac02ba700  8 client.3246756
_release_fh
        > 0x7f1a24028e10 mode 1 on 0x10001698ac5.head(faked_ino=0 ref=3
ll_ref=1

        > cap_refs={} open={1=1} mode=100644 size=17/0 nlink=1
mtime=2018-09-28
        > 14:45:50.323273 caps=pAsLsXsFscr(0=pAsLsXsFscr)
        > objectset[0x10001698ac5 ts 0/0 objects 0 dirty_or_tx 0]
        > parents=0x7f1a780f1dd0 0x7f1a7810ad00)
        > 2018-09-28 16:03:39.086721 7f1ac02ba700 10 client.3246756 _flush
        > 0x10001698ac5.head(faked_ino=0 ref=4 ll_ref=1 cap_refs={}
open={1=0}
        > mode=100644 size=17/0 nlink=1 mtime=2018-09-28 14:45:50.323273
        > caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[0x10001698ac5 ts 0/0
objects

        > 0 dirty_or_tx 0] parents=0x7f1a780f1dd0 0x7f1a7810ad00)
        > 2018-09-28 16:03:39.086729 7f1ac02ba700 10 client.3246756 
nothing to
        > flush
        > 2018-09-28 16:03:39.086731 7f1ac02ba700 10 client.3246756
put_inode on

        > 0x10001698ac5.head(faked_ino=0 ref=4 ll_ref=1 cap_refs={}
open={1=0}
        > mode=100644 size=17/0 nlink=1 mtime=2018-09-28 14:45:50.323273
        > caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[0x10001698ac5 ts 0/0
objects

        > 0 dirty_or_tx 0] parents=0x7f1a780f1dd0 0x7f1a7810ad00)
        > 2018-09-28 16:03:39.086759 7f1ac02ba700 10 client.3246756
check_caps
        > on 0x10001698ac5.head(faked_ino=0 ref=3 ll_ref=1 cap_refs={}
        > open={1=0}
        > mode=100644 size=17/0 nlink=1 mtime=2018-09-28 14:45:50.323273
        > caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[0x10001698ac5 ts 0/0
objects

        > 0 dirty_or_tx 0] parents=0x7f1a780f1dd0 0x7f1a7810ad00) wanted -
used
        > - issued pAsLsXsFscr revoking - flags=0
        > 2018-09-28 16:03:39.086772 7f1ac02ba700 10 client.3246756
        > cap_delay_requeue on 0x10001698ac5.head(faked_ino=0 ref=3
ll_ref=1
        > cap_refs={} open={1=0} mode=100644 size=17/0 nlink=1
mtime=2018-09-28
        > 14:45:50.323273 caps=pAsLsXsFscr(0=pAsLsXsFscr)
        > objectset[0x10001698ac5 ts 0/0 objects 0 dirty_or_tx 0]
        > parents=0x7f1a780f1dd0 0x7f1a7810ad00)
        > 2018-09-28 16:03:39.086780 7f1ac02ba700 10 client.3246756  cap
mds.0
        > issued pAsLsXsFscr implemented pAsLsXsFscr revoking -
        > 2018-09-28 16:03:39.086784 7f1ac02ba700 10 client.3246756
delaying cap

        > release
        > 2018-09-28 16:03:39.086786 7f1ac02ba700 10 client.3246756
_release_fh
        > 0x7f1a24028e10 on inode 0x10001698ac5.head(faked_ino=0 ref=3
ll_ref=1
        > cap_refs={} open={1=0} mode=100644 size=17/0 nlink=1
mtime=2018-09-28
        > 14:45:50.323273 caps=pAsLsXsFscr(0=pAsLsXsFscr)
        > objectset[0x10001698ac5 ts 0/0 objects 0 dirty_or_tx 0]
        > parents=0x7f1a780f1dd0 0x7f1a7810ad00) no async_err state
        > 2018-09-28 16:03:39.086796 7f1ac02ba700 10 client.3246756
put_inode on

        > 0x10001698ac5.head(faked_ino=0 ref=3 ll_ref=1 cap_refs={}
open={1=0}
        > mode=100644 size=17/0 nlink=1 mtime=2018-09-28 14:45:50.323273
        > caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[0x10001698ac5 ts 0/0
objects

        > 0 dirty_or_tx 0] parents=0x7f1a780f1dd0 0x7f1a7810ad00)
        > 2018-09-28 16:03:41.223087 7f1b0bfff700 10 client.3246741
renew_caps()
        > 2018-09-28 16:03:41.223633 7f1b0affd700 10 client.3246741
        > handle_client_session client_session(renewcaps seq 9) v1 from
mds.0
        > 2018-09-28 16:03:41.243573 7f1af3fff700 10 client.3246744
renew_caps()
        > 2018-09-28 16:03:41.243584 7f1af3fff700 10 client.3246744
renew_caps
        > mds.0
        > 2018-09-28 16:03:41.244175 7f1af2ffd700 10 client.3246744
        > handle_client_session client_session(renewcaps seq 9) v1 from
mds.0
        > 2018-09-28 16:03:41.265549 7f1adbfff700 10 client.3246747
renew_caps()
        > 2018-09-28 16:03:41.265570 7f1adbfff700 10 client.3246747
renew_caps
        > mds.0
        > 2018-09-28 16:03:41.266197 7f1adaffd700 10 client.3246747
        > handle_client_session client_session(renewcaps seq 9) v1 from
mds.0
        > 2018-09-28 16:03:41.284915 7f1ac3fff700 10 client.3246750
renew_caps()
        > 2018-09-28 16:03:41.284923 7f1ac3fff700 10 client.3246750
renew_caps
        > mds.0
        > 2018-09-28 16:03:41.285495 7f1ac2ffd700 10 client.3246750
        > handle_client_session client_session(renewcaps seq 9) v1 from
mds.0
        > 2018-09-28 16:03:41.314160 7f1ab3fff700 10 client.3246753
renew_caps()
        > 2018-09-28 16:03:41.314182 7f1ab3fff700 10 client.3246753
renew_caps
        > mds.0
        > 2018-09-28 16:03:41.314717 7f1ab2ffd700 10 client.3246753
        > handle_client_session client_session(renewcaps seq 9) v1 from
mds.0
        > 2018-09-28 16:03:41.333642 7f1aa0ff9700 10 client.3246586
renew_caps()
        > 2018-09-28 16:03:41.333670 7f1aa0ff9700 10 client.3246586
renew_caps
        > mds.0
        > 2018-09-28 16:03:41.334151 7f1a9b7fe700 10 client.3246586
        > handle_client_session client_session(renewcaps seq 9) v1 from
mds.0
        > 2018-09-28 16:03:41.352189 7f1a88ff9700 10 client.3246756
renew_caps()
        > 2018-09-28 16:03:41.352196 7f1a88ff9700 10 client.3246756
renew_caps
        > mds.0
        > 2018-09-28 16:03:41.352630 7f1a82ffd700 10 client.3246756
        > handle_client_session client_session(renewcaps seq 9) v1 from
mds.0
        > 2018-09-28 16:03:41.372079 7f1a6bfff700 10 client.3246759
renew_caps()
        > 2018-09-28 16:03:41.372087 7f1a6bfff700 10 client.3246759
renew_caps
        > mds.0
        > 2018-09-28 16:03:41.372544 7f1a6affd700 10 client.3246759
        > handle_client_session client_session(renewcaps seq 9) v1 from
mds.0
        >
        >
        >
        >
        > -----Original Message-----
        > From: John Spray [mailto:jspray@xxxxxxxxxx]
        > Sent: Friday, 28 September 2018 15:45
        > To: Marc Roos
        > Cc: ceph-users@xxxxxxxxxxxxxx
        > Subject: Re: cephfs issue with moving files between data
        > pools gives Input/output error
        >
        > On Fri, Sep 28, 2018 at 2:28 PM Marc Roos <M.Roos@xxxxxxxxxxxxxxxxx>
        > wrote:
        > >
        > >
        > > It looks like if I move files between different data pools of the
        > > cephfs, something is still referring to the 'old location' and
        > > gives an Input/output error. I assume this because I am using
        > > different client ids for authentication.
        > >
        > > With the same user as configured in ganesha, mounting the (kernel)
        > > erasure coded cephfs at m, I can create file out4.
        > >
        > > At nfs4 client, same location m
        > > I can read out4
        > > I can create out5
        > > I can read out5
        > >
        > > Mounted root cephfs, create file in folder t (test, replicated 1):
        > > I can create out6, I can move out6 to the folder m (erasure coded),
        > > I can read out6.
        > >
        > > At nfs4 client, m location
        > > [@m]# cat out6
        > > cat: out6: Input/output error
        >
        > If it was due to permissions, I would expect to see EPERM rather
        > than EIO.  EIO suggests something more fundamentally broken, like a
        > client version that doesn't understand the latest layout format.
        >
        > Assuming you're using the CephFS FSAL in Ganesha (rather than
        > re-exporting a local mount of CephFS), it should be possible to
        > create an /etc/ceph/ceph.conf file with a "[client]" section that
        > enables debug logging (debug client = 10 or similar), and sets an
        > output location ("log file = /tmp/client.log") -- that might give
        > a bit more information about the nature of the error.
        >
        > John
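
        (A minimal sketch of such a [client] section, using just the two
        settings John mentions; the log path is only an example and must be
        writable by the ceph-fuse / Ganesha process:)

        [client]
            debug client = 10
            log file = /tmp/client.log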
        >
        > >
        > >
        > >
        > > [client.cephfs.t]
        > >      key = xxx==
        > >      caps mds = "allow rw path=/t"
        > >      caps mgr = "allow r"
        > >      caps mon = "allow r"
        > >      caps osd = "allow rwx pool=fs_meta,allow rwx pool=fs_data, allow rwx pool=fs_data.r1"
        > >
        > > [client.cephfs.m]
        > >      key = xxx==
        > >      caps mds = "allow rw path=/m"
        > >      caps mgr = "allow r"
        > >      caps mon = "allow r"
        > >      caps osd = "allow rwx pool=fs_meta,allow rwx pool=fs_data.ec"
        > >
        > >
        > > [@ test]# cat /etc/redhat-release
        > > CentOS Linux release 7.5.1804 (Core)
        > >
        > > [@ test]# rpm -qa | grep ceph | sort
        > > ceph-12.2.8-0.el7.x86_64
        > > ceph-base-12.2.8-0.el7.x86_64
        > > ceph-common-12.2.8-0.el7.x86_64
        > > ceph-fuse-12.2.8-0.el7.x86_64
        > > ceph-mds-12.2.8-0.el7.x86_64
        > > ceph-mgr-12.2.8-0.el7.x86_64
        > > ceph-mon-12.2.8-0.el7.x86_64
        > > ceph-osd-12.2.8-0.el7.x86_64
        > > ceph-radosgw-12.2.8-0.el7.x86_64
        > > ceph-selinux-12.2.8-0.el7.x86_64
        > > collectd-ceph-5.8.0-2.el7.x86_64
        > > libcephfs2-12.2.8-0.el7.x86_64
        > > python-cephfs-12.2.8-0.el7.x86_64
        > >
        > > _______________________________________________
        > > ceph-users mailing list
        > > ceph-users@xxxxxxxxxxxxxx
        > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
        >
        >
        > _______________________________________________
        > ceph-users mailing list
        > ceph-users@xxxxxxxxxxxxxx
        > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


        _______________________________________________
        ceph-users mailing list
        ceph-users@xxxxxxxxxxxxxx
        http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
