Issues with replacing hard links with symlinks in the .glusterfs directory

First off: I've based my work on the 3.7.3 release, since it was the most recent release when I started this project and I couldn't get HEAD to build on FreeBSD. (I'm using a FreeBSD server and Linux clients.)

I realize that many things will be broken by doing this (renaming open files, deleting open files, possibly some other stuff), but I can live with those limitations. 

What I've done:
I've modified the code to fall back to a symlink if making a hard link fails (which it will do somewhat frequently, because the target is on a different filesystem).
I created an extended attribute that marks symlinks which are emulating hard links.
I changed the setattr code to check for this marker before it tries to set the attributes; if the marker is set, the code dereferences the link and then proceeds with the setattr.
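
For readers who want something concrete, the fallback and the dereferencing step can be sketched in plain POSIX C roughly as follows. This is not the actual GlusterFS patch: the helper names link_or_symlink and setattr_mode_deref are made up for illustration, the sketch assumes the hard link fails with EXDEV (the usual errno for a cross-filesystem link), and it treats every symlink as an emulated hard link instead of checking the extended-attribute marker.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: try a hard link first and fall back to a symlink
 * only when the two paths live on different filesystems (EXDEV).
 * Returns 0 if a hard link was made, 1 if a symlink was made, -1 on error. */
int link_or_symlink(const char *target, const char *handle)
{
    if (link(target, handle) == 0)
        return 0;
    if (errno != EXDEV)
        return -1;              /* some other failure: do not hide it */
    if (symlink(target, handle) == 0)
        return 1;
    return -1;
}

/* Hypothetical helper: apply a mode change through a handle path.
 * If the handle is a symlink, resolve it and change the real file.
 * Assumption: every symlink here stands in for a hard link; the marker
 * check described in the text is omitted from this sketch. */
int setattr_mode_deref(const char *handle, mode_t mode)
{
    struct stat st;

    if (lstat(handle, &st) != 0)
        return -1;
    if (!S_ISLNK(st.st_mode))
        return chmod(handle, mode);

    char *real = realpath(handle, NULL);   /* allocated by realpath */
    if (real == NULL)
        return -1;
    int ret = chmod(real, mode);
    free(real);
    return ret;
}
```

A caller would use link_or_symlink(real_path, handle_path) when creating the handle, and setattr_mode_deref(handle_path, mode) wherever the mode is changed through that handle.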

To test this, I made a file and ran chmod +x on it.
The good: attributes were correctly set on the file!
The bad: chmod says it failed with EIO.
My issue: I have no clue where this EIO is coming from.  Under the hood, chmod is calling fchmodat.

After no luck with printf debugging, I just ran gluster under gdb, and set a breakpoint on send_fuse_iov.  Here's the backtrace:
#0  send_fuse_iov (this=0x63a150, finh=0x7fffe0005fe0, iov_out=0x7ffff08e7500, count=2) at fuse-bridge.c:158
#1  0x00007ffff550fcfd in send_fuse_data (this=0x63a150, finh=0x7fffe0005fe0, data="" size=104) at fuse-bridge.c:197
#2  0x00007ffff5511be1 in fuse_attr_cbk (frame=0x7fffe000145c, cookie=0x7fffe000616c, this=0x63a150, op_ret=0, op_errno=117, buf=0x7fffe0006734, xdata=0x0) at fuse-bridge.c:734
#3  0x00007ffff0b08714 in io_stats_stat_cbk (frame=0x7fffe000616c, cookie=0x7fffe000626c, this=0x7fffec014de0, op_ret=0, op_errno=117, buf=0x7fffe0006734, xdata=0x0) at io-stats.c:1344
#4  0x00007ffff0d2397e in mdc_stat_cbk (frame=0x7fffe000626c, cookie=0x7fffe000645c, this=0x7fffec013890, op_ret=0, op_errno=117, buf=0x7fffe0006734, xdata=0x0) at md-cache.c:901
#5  0x00007ffff7b30ad3 in default_stat_cbk (frame=0x7fffe000645c, cookie=0x7fffe00029ec, this=0x7fffec00b910, op_ret=0, op_errno=117, buf=0x7fffe0006734, xdata=0x0) at defaults.c:853
#6  0x00007ffff1be3165 in dht_attr_cbk (frame=0x7fffe00029ec, cookie=0x7fffe0001bdc, this=0x7fffec00a400, op_ret=0, op_errno=0, stbuf=0x7ffff08e78d0, xdata=0x0) at dht-inode-read.c:250
#7  0x00007ffff1e1e7b7 in client3_3_stat_cbk (req=0x7fffe00075dc, iov=0x7fffe000761c, count=1, myframe=0x7fffe0001bdc) at client-rpc-fops.c:535
#8  0x00007ffff78ec67c in rpc_clnt_handle_reply (clnt=0x7fffec02bd40, pollin=0x7fffe40058a0) at rpc-clnt.c:766
#9  0x00007ffff78eca73 in rpc_clnt_notify (trans=0x7fffec02c020, mydata=0x7fffec02bd70, event=RPC_TRANSPORT_MSG_RECEIVED, data="" at rpc-clnt.c:894
#10 0x00007ffff78e8bb2 in rpc_transport_notify (this=0x7fffec02c020, event=RPC_TRANSPORT_MSG_RECEIVED, data="" at rpc-transport.c:544
#11 0x00007ffff32f1591 in socket_event_poll_in (this=0x7fffec02c020) at socket.c:2290
#12 0x00007ffff32f1ad5 in socket_event_handler (fd=12, idx=1, data="" poll_in=1, poll_out=0, poll_err=0) at socket.c:2403
#13 0x00007ffff7b9c02b in event_dispatch_epoll_handler (event_pool=0x635210, event=0x7ffff08e7e30) at event-epoll.c:575
#14 0x00007ffff7b9c409 in event_dispatch_epoll_worker (data="" at event-epoll.c:678
#15 0x00007ffff698137b in start_thread () from /lib64/libpthread.so.0
#16 0x00007ffff63216fd in clone () from /lib64/libc.so.6

At this breakpoint it is trying to send two buffers, one of length 16 and one of length 104. Printing out the individual components of these buffers, we have:
(gdb) p *fouh
$27 = {len = 120, error = 0, unique = 11}
(gdb) p fao
$28 = {attr_valid = 1, attr_valid_nsec = 0, dummy = 0, attr = {ino = 13385484696163529676, size = 6, blocks = 2, atime = 1445490621, mtime = 1444602468, ctime = 1444602468, atimensec = 947118511,
    mtimensec = 479067414, ctimensec = 479067414, mode = 16877, nlink = 2, uid = 1043, gid = 1045, rdev = 4026597375, blksize = 131072, padding = 32767}}
And to round things out, the input:
(gdb) p *finh
$30 = {len = 56, opcode = 3, unique = 11, nodeid = 140736951487340, uid = 0, gid = 0, pid = 6233, padding = 0}
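
As a reading aid for the dump above, here is a small C sketch that decodes a few of the raw numbers. The struct definitions are local mirrors written for this example (names prefixed with sketch_), laid out the way the reply structs appear in <linux/fuse.h>; they are an assumption about the layout, not code taken from fuse-bridge.c. With that layout the two buffers are 16 and 104 bytes, which matches len = 120. The mode 16877 is octal 040755, whose type bits say "directory", and opcode 3 is FUSE_GETATTR in the kernel header (FUSE_SETATTR is 4). Whether either observation is connected to the EIO is not established here.

```c
#define _XOPEN_SOURCE 700
#include <stdint.h>
#include <stdio.h>
#include <sys/stat.h>

/* Local mirrors of the FUSE reply layout (assumed, see lead-in). */
struct sketch_out_header {
    uint32_t len;
    int32_t  error;
    uint64_t unique;
};

struct sketch_attr {
    uint64_t ino, size, blocks, atime, mtime, ctime;
    uint32_t atimensec, mtimensec, ctimensec;
    uint32_t mode, nlink, uid, gid, rdev, blksize, padding;
};

struct sketch_attr_out {
    uint64_t attr_valid;
    uint32_t attr_valid_nsec;
    uint32_t dummy;
    struct sketch_attr attr;
};

/* Name the file type encoded in the S_IFMT bits of a mode value. */
const char *sketch_file_type(uint32_t mode)
{
    switch (mode & S_IFMT) {
    case S_IFREG: return "regular file";
    case S_IFDIR: return "directory";
    case S_IFLNK: return "symlink";
    default:      return "other";
    }
}

/* Name the first few FUSE opcodes as numbered in <linux/fuse.h>. */
const char *sketch_opcode_name(uint32_t opcode)
{
    switch (opcode) {
    case 1:  return "FUSE_LOOKUP";
    case 2:  return "FUSE_FORGET";
    case 3:  return "FUSE_GETATTR";
    case 4:  return "FUSE_SETATTR";
    default: return "(not listed in this sketch)";
    }
}

/* Print a one-line summary of a request opcode and a reply mode. */
void sketch_describe(uint32_t opcode, uint32_t mode)
{
    printf("opcode %u = %s, mode %u = 0%o (%s, perms %03o)\n",
           (unsigned)opcode, sketch_opcode_name(opcode),
           (unsigned)mode, (unsigned)mode,
           sketch_file_type(mode), (unsigned)(mode & 0777));
}
```

Calling sketch_describe(3, 16877) summarizes the request and reply shown in the gdb output.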

Since all of the code that I've changed is on the server side, I assume that some pieces of data are being sent incorrectly, but I cannot identify them.  From what I can tell, the data being sent back to the kernel is correct.

Questions: Does anything look wrong with the data that is being sent to the kernel?
Can anyone think of another reason that this would result in an EIO?
Is there any more information that would help you answer any of these questions?

If you want more real-time conversing than email typically provides, I'm on irc as mjrosenb.  Thanks --Marty
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel
