> If you want to set up a separate key+cert for each server, each one
> having a "CA" file for the others, you certainly can and it works.

^^ everyone had every CA

> However, you'll still have to deal with distributing those new certs.

^^ ssh and cat ;)

>> only control connection is encrypted

> That's not true.  There are *separate* options to control encryption
> for the data path, and in fact that code's much older.

^^ i'll look it up, but in fact since 3.5 the game changed, it seems

> Why separate?  Because the data-path usage of SSL is based on a
> different identity model - probably more what you expected, with a
> separate identity per client instead of a shared one between servers.

^^ clients need to have the server CAs and servers need to have the server CAs.. i knew

> For a long time, the only internal conditions that might have caused
> the .glusterfs links not to be cleaned up were about 1000x less common
> than similar problems which arise when users try to manipulate files
> directly on the bricks.  Perhaps if you could describe what you were
> doing on the bricks, we could help identify what was going on and
> suggest safer ways of achieving the same goals.

^^ create and use until they fail, then try to fix, mostly resulting in recreating. in fact i played around with manipulating data directly, but that's long ago, i only use fuse/nfs

> Syncing what?  I'm guessing a bit here, but it sounds like you were
> trying to do the equivalent of a replace-brick (or perhaps rebalance)
> by hand.  As you've clearly discovered, such attempts are fraught with
> peril.  Again, with some more constructive engagement perhaps we can
> help guide you toward safer solutions.

^^ moving the data from failed bricks on 2 servers into a fuse mount, ON AND ON again

> The "mismatching layout" messages are usually the result of extended
> attributes that are missing from one brick's copy of a directory.
> It's possible that the XFS resize code is racy, in the sense that
> extended attributes become unavailable at some stage even though the
> directory itself is still accessible.  I suggest that you follow up on
> that bug with the XFS developers, who are sure to be much more polite
> and responsive than we are.

^^ xfs failed 100% less until now, and even then the racy code seems to be from glusterfs ;)

> More suggestions might have been available if you had sought them
> earlier.  At this point, none of us can tell what state your volume is
> in, and there are many indications that it's probably a state none of
> us have ever seen or anticipated.

The mailing list is full of "how to resync/migrate etc.", and the answer is nearly always "recreate the volume".

> Our first priority should be to get things back to a known and stable
> state.  Unfortunately the only such state at this point would seem to
> be a clean volume.

i did the "clean volume" trick at least 5 times

02.09.2014, 16:13, "Jeff Darcy" <jdarcy@xxxxxxxxxx>:
>> ssl keys have to be 2048-bit fixed size
>
> No, they don't.
>
>> all keys have to be everywhere (all versions.... which noob
>> programmed that ??)
>
> That noob would be me.
>
> It's not necessary to have the same key on all servers, but using
> different ones would be even more complex and confusing for users.
> Instead, the servers authenticate to one another using a single
> identity.  According to SSL 101, anyone authenticating as an identity
> needs the key for that identity, because it's really the key - not the
> publicly readable cert - that guarantees authenticity.
>
> If you want to set up a separate key+cert for each server, each one
> having a "CA" file for the others, you certainly can and it works.
> However, you'll still have to deal with distributing those new certs.
> That's inherent to how SSL works.  Instead of forcing a particular PKI
> or cert-distribution scheme on users, the GlusterFS SSL implementation
> is specifically intended to let users make those choices.
>
>> only control connection is encrypted
>
> That's not true.  There are *separate* options to control encryption
> for the data path, and in fact that code's much older.  Why separate?
> Because the data-path usage of SSL is based on a different identity
> model - probably more what you expected, with a separate identity per
> client instead of a shared one between servers.
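For completeness, a rough sketch of what that per-server key+cert setup
(and the separate data-path options mentioned above) can look like. It
assumes the default /etc/ssl/glusterfs.{key,pem,ca} paths and a volume
named "myvol"; the hostnames and CNs are placeholders, and the exact
option names are worth double-checking against the docs for your
GlusterFS version:

  # on each server, generate its own key and a self-signed cert
  # (the key size is whatever you choose; 2048 is just a common default)
  openssl genrsa -out /etc/ssl/glusterfs.key 2048
  openssl req -new -x509 -days 3650 -key /etc/ssl/glusterfs.key \
      -subj "/CN=server1" -out /etc/ssl/glusterfs.pem

  # collect every peer's cert ("ssh and cat", as above) into the CA
  # file that each node trusts
  cat server1.pem server2.pem client1.pem > /etc/ssl/glusterfs.ca

  # data-path encryption is controlled per volume, separately from the
  # management connection
  gluster volume set myvol client.ssl on
  gluster volume set myvol server.ssl on
  gluster volume set myvol auth.ssl-allow 'server1,server2,client1'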
>> At a certain point it also used tons of diskspace due to not deleting
>> files in the ".glusterfs" directory (but still being connected and
>> up, serving volumes)
>
> For a long time, the only internal conditions that might have caused
> the .glusterfs links not to be cleaned up were about 1000x less common
> than similar problems which arise when users try to manipulate files
> directly on the bricks.  Perhaps if you could describe what you were
> doing on the bricks, we could help identify what was going on and
> suggest safer ways of achieving the same goals.
>
>> IT WAS A LONG AND PAINFUL SYNCING PROCESS until i thought i was happy
>> ;)
>
> Syncing what?  I'm guessing a bit here, but it sounds like you were
> trying to do the equivalent of a replace-brick (or perhaps rebalance)
> by hand.  As you've clearly discovered, such attempts are fraught with
> peril.  Again, with some more constructive engagement perhaps we can
> help guide you toward safer solutions.
>
>> Due to an online resize of lvm/XFS under glusterfs (i watch the logs
>> nearly all the time) i discovered "mismatching disk layouts",
>> realizing also that
>>
>> server1 was up and happy when you mounted from it, but server2 spewed
>> input/output errors on several directories (for now just in that
>> volume),
>
> The "mismatching layout" messages are usually the result of extended
> attributes that are missing from one brick's copy of a directory.
> It's possible that the XFS resize code is racy, in the sense that
> extended attributes become unavailable at some stage even though the
> directory itself is still accessible.  I suggest that you follow up on
> that bug with the XFS developers, who are sure to be much more polite
> and responsive than we are.
>
>> i tried to rename one directory, and it created a recursive loop
>> inside XFS (e.g. BIGGEST FILE-SYSTEM FAIL: TWO INODES linking to one
>> dir, ideally containing another). i got at least the XFS loop solved.
>
> Another one for the XFS developers.
>
>> Then the pre-last-resort option came up.. deleted the volumes,
>> cleaned all xattrs on that ~2T ... and recreated the volumes, since
>> shd seems to work somehow since 3.4
>
> You mention that you cleared all xattrs.  Did you also clear out
> .glusterfs?  In general, using anything but a completely empty
> directory tree as a brick can be a bit problematic.
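On the "cleaned all xattrs" point, the usual recipe for reusing an old
brick directory looks roughly like the following (a sketch only;
/bricks/myvol1 is a placeholder path, and this is exactly the kind of
direct-on-brick surgery being warned about, so only do it on a brick
that has been removed from the volume):

  # see which gluster metadata is still present on the old brick root
  getfattr -d -m . -e hex /bricks/myvol1

  # remove the volume identity and gfid xattrs from the brick root
  setfattr -x trusted.glusterfs.volume-id /bricks/myvol1
  setfattr -x trusted.gfid /bricks/myvol1

  # drop the hard-link namespace that glusterfs keeps on the brick
  rm -rf /bricks/myvol1/.glusterfs

Even then, every file and directory below the brick root keeps its own
trusted.gfid xattr, which is why a completely empty directory tree is
the only really clean starting point.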
>> Maybe anyone has a suggestion, except "create a new clean volume and
>> move all your TB's".
>
> More suggestions might have been available if you had sought them
> earlier.  At this point, none of us can tell what state your volume is
> in, and there are many indications that it's probably a state none of
> us have ever seen or anticipated.  As you've found, attempting random
> fixes in such a situation often makes things worse.  It would be
> irresponsible for us to suggest that you go down even more unknown and
> untried paths.  Our first priority should be to get things back to a
> known and stable state.  Unfortunately the only such state at this
> point would seem to be a clean volume.

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-users