Re: complete f......p thanks to glusterfs...applause, you crashed weeks of work

Bencho Naut <all@xxxxxxxxxxxx> · Tue, 02 Sep 2014 21:36:36 +0200

>  If you want to set up a separate key+cert for each server, each one
>  having a "CA" file for the others, you certainly can and it works.

^^everyone had every CA 
>  However, you'll still have to deal with distributing those new certs.

^^ssh and cat ;)
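
A minimal sketch of the "ssh and cat" step, assuming each server's certificate is already available locally as a PEM file and that the "CA" file is simply the concatenation of the peers' certificates. The function name and file names are hypothetical, and where GlusterFS expects the resulting file is not stated in this thread.

```python
from pathlib import Path

PEM_HEADER = "-----BEGIN CERTIFICATE-----"


def build_ca_bundle(cert_paths, bundle_path):
    """Concatenate PEM certificates into one bundle file.

    Returns the number of certificate files written to the bundle.
    Raises ValueError before writing anything if an input does not
    look like a PEM certificate.
    """
    parts = []
    for path in cert_paths:
        text = Path(path).read_text().strip()
        if PEM_HEADER not in text:
            raise ValueError(f"{path} does not look like a PEM certificate")
        parts.append(text)
    # One certificate after another, each terminated by a newline.
    Path(bundle_path).write_text("\n".join(parts) + "\n")
    return len(parts)
```

Only the publicly readable certs go into the bundle; each server's private key stays on that server.
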
>>   only control connection is encrypted
>  That's not true.  There are *separate* options to control encryption  for the data path, and in fact that code's much older.

^^i'll look it up, but in fact since 3.5 the game changed it seems

> Why separate?
> Because the data-path usage of SSL is based on a different identity model - probably more what you expected,
> with a separate identity per client instead of a shared one between servers.

^^clients need to have server ca's and  servers need to have server ca's.. i knew

>  For a long time, the only internal conditions that might have caused
>  the .glusterfs links not to be cleaned up were about 1000x less common
>  than similar problems which arise when users try to manipulate files
>  directly on the bricks.  Perhaps if you could describe what you were
>  doing on the bricks, we could help identify what was going on and
>  suggest safer ways of achieving the same goals.
^^create and use until they fail , then try to fix, mostly resulting in recreating
and in fact i played around with manipulating data directly, but that's long ago, i only use fuse/nfs 

>  Syncing what?  I'm guessing a bit here, but it sounds like you were
>  trying to do the equivalent of a replace-brick (or perhaps rebalance) by
>  hand.  As you've clearly discovered, such attempts are fraught with
>  peril.  Again, with some more constructive engagement perhaps we can
>  help guide you toward safer solutions.
^^moving the data from failed bricks on 2 servers into a fuse mount  ON AND ON again

>  The "mismatching layout" messages are usually the result of extended
>  attributes that are missing from one brick's copy of a directory.  It's
>  possible that the XFS resize code is racy, in the sense that extended
>  attributes become unavailable at some stage even though the directory
>  itself is still accessible.  I suggest that you follow up on that bug
>  with the XFS developers, who are sure to be much more polite and
>  responsive than we are.
^^xfs failed 100% less  until now, and even then the racy code seems to be from glusterfs ;)
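
To check the "missing extended attributes" explanation quoted above, here is a rough sketch that reports which brick copies of a directory lack the layout attribute. It assumes the attribute is named `trusted.glusterfs.dht` (not stated in this thread), that it runs as root on Linux, and that the directory paths are supplied by hand; the function name is hypothetical.

```python
import errno
import os

# Assumed name of the directory layout attribute; not given in the thread.
LAYOUT_XATTR = "trusted.glusterfs.dht"


def missing_layout(dir_copies, read_xattr=None):
    """Return the paths in dir_copies that have no layout xattr.

    read_xattr defaults to os.getxattr (Linux only) and can be
    replaced for testing. Errors other than "attribute not present"
    are re-raised.
    """
    if read_xattr is None:
        read_xattr = os.getxattr
    missing = []
    for path in dir_copies:
        try:
            read_xattr(path, LAYOUT_XATTR)
        except OSError as exc:
            if exc.errno != errno.ENODATA:
                raise
            missing.append(path)
    return missing
```

This only reads attributes on the bricks and changes nothing there.
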

>  More suggestions might have been available if you had sought them
>  earlier.  At this point, none of us can tell what state your volume is
>  in, and there are many indications that it's probably a state none of us
>  have ever seen or anticipated.
The mailing list is full of "how to resync/migrate etc", and the answer is nearly all the time "recreate volume"

> Our first priority should be to get things back to a
>  known and stable state.  Unfortunately, the only such
>  state at this point would seem to be a clean volume.
i did the "clean volume" trick at least 5 times
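
For the "clean volume" state, the quoted message further down asks whether `.glusterfs` was cleared along with the xattrs. A small sketch of that check, assuming a brick is only safe to reuse when its directory is empty and its root carries no `trusted.*` attributes; the function name is hypothetical, and listing `trusted.*` attributes requires root on Linux.

```python
import os


def brick_leftovers(brick_dir, list_xattrs=None):
    """Return (entries, xattrs) still present on a would-be clean brick.

    entries: every name in brick_dir, including hidden ones such as
             .glusterfs.
    xattrs:  trusted.* attribute names on the brick root.
    list_xattrs defaults to os.listxattr (Linux only).
    """
    if list_xattrs is None:
        list_xattrs = os.listxattr
    entries = sorted(os.listdir(brick_dir))
    xattrs = sorted(
        name for name in list_xattrs(brick_dir) if name.startswith("trusted.")
    )
    return entries, xattrs
```

Two empty lists mean nothing was left behind at the top level; it does not inspect subdirectories.
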

02.09.2014, 16:13, "Jeff Darcy" <jdarcy@xxxxxxxxxx>:
>>  ssl keys have to be 2048-bit fixed size
>
> No, they don't.
>>  all keys have to be everywhere (all versions....which noob programmed
>>  that ??)
>
> That noob would be me.
>
> It's not necessary to have the same key on all servers, but using
> different ones would be even more complex and confusing for users.
> Instead, the servers authenticate to one another using a single
> identity.  According to SSL 101, anyone authenticating as an identity
> needs the key for that identity, because it's really the key - not the
> publicly readable cert - that guarantees authenticity.
>
> If you want to set up a separate key+cert for each server, each one
> having a "CA" file for the others, you certainly can and it works.
> However, you'll still have to deal with distributing those new certs.
> That's inherent to how SSL works.  Instead of forcing a particular PKI
> or cert-distribution scheme on users, the GlusterFS SSL implementation
> is specifically intended to let users make those choices.
>>  only control connection is encrypted
>
> That's not true.  There are *separate* options to control encryption
> for the data path, and in fact that code's much older.  Why separate?
> Because the data-path usage of SSL is based on a different identity
> model - probably more what you expected, with a separate identity per
> client instead of a shared one between servers.
>>  At a certain point it also used tons of diskspace due to not deleting
>>  files in the ".glusterfs" directory , (but still being connected and
>>  up serving volumes)
>
> For a long time, the only internal conditions that might have caused
> the .glusterfs links not to be cleaned up were about 1000x less common
> than similar problems which arise when users try to manipulate files
> directly on the bricks.  Perhaps if you could describe what you were
> doing on the bricks, we could help identify what was going on and
> suggest safer ways of achieving the same goals.
>>  IT WAS A LONG AND PAINFUL SYNCING PROCESS until i thought i was happy
>>  ;)
>
> Syncing what?  I'm guessing a bit here, but it sounds like you were
> trying to do the equivalent of a replace-brick (or perhaps rebalance) by
> hand.  As you've clearly discovered, such attempts are fraught with
> peril.  Again, with some more constructive engagement perhaps we can
> help guide you toward safer solutions.
>>  Due to an Online-resizing lvm/XFS glusterfs (i watch the logs nearly
>>  all the time) i discovered "mismatching disk layouts" , realizing also
>>  that
>>
>>  server1 was up and happy when you mount from it, but server2 spew
>>  input/output errors on several directories (for now just in that
>>  volume),
>
> The "mismatching layout" messages are usually the result of extended
> attributes that are missing from one brick's copy of a directory.  It's
> possible that the XFS resize code is racy, in the sense that extended
> attributes become unavailable at some stage even though the directory
> itself is still accessible.  I suggest that you follow up on that bug
> with the XFS developers, who are sure to be much more polite and
> responsive than we are.
>>  i tried to rename one directory, it created a recursive loop inside
>>  XFS (e.g.  BIGGEST FILE-SYSTEM FAIL : TWO INODES linking to one dir ,
>>  ideally containing another) i got at least the XFS loop solved.
>
> Another one for the XFS developers.
>>  Then the pre-last resort option came up.. deleted the volumes, cleaned
>>  all xattr on that ~2T ... and recreated the volumes, since shd seems
>>  to work somehow since 3.4
>
> You mention that you cleared all xattrs.  Did you also clear out
> .glusterfs?  In general, using anything but a completely empty directory
> tree as a brick can be a bit problematic.
>>  Maybe anyone has a suggestion , except "create a new clean volume and
>>  move all your TB's" .
>
> More suggestions might have been available if you had sought them
> earlier.  At this point, none of us can tell what state your volume is
> in, and there are many indications that it's probably a state none of us
> have ever seen or anticipated.  As you've found, attempting random
> fixes in such a situation often makes things worse.  It would be
> irresponsible for us to suggest that you go down even more unknown and
> untried paths.  Our first priority should be to get things back to a
> known and stable state.  Unfortunately, the only such
> state at this point would seem to be a clean volume.