Re: 3.7.13, index healing broken?

Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx> · Wed, 13 Jul 2016 11:54:40 +0530

On Wed, Jul 13, 2016 at 11:49 AM, Dmitry Melekhov <dm@xxxxxxxxxx> wrote:

    13.07.2016 10:10, Pranith Kumar Karampuri wrote:

          On Wed, Jul 13, 2016 at 11:27 AM, Dmitry Melekhov <dm@xxxxxxxxxx> wrote:

                13.07.2016 09:50, Pranith Kumar Karampuri wrote:

                          On Wed, Jul 13, 2016 at 11:11 AM, Dmitry Melekhov <dm@xxxxxxxxxx> wrote:

                                13.07.2016 09:36, Pranith Kumar
                                  Karampuri пишет:

                                          On Wed, Jul 13, 2016 at 10:58 AM, Dmitry Melekhov <dm@xxxxxxxxxx> wrote:

                                                13.07.2016 09:26,
                                                  Pranith Kumar
                                                  Karampuri пишет:

                                                          On Wed, Jul 13, 2016 at 10:50 AM, Dmitry Melekhov <dm@xxxxxxxxxx> wrote:

                                                          13.07.2016 09:16, Pranith Kumar Karampuri wrote:

                                                          On Wed, Jul 13, 2016 at 10:38 AM, Dmitry Melekhov <dm@xxxxxxxxxx> wrote:

                                                          13.07.2016 09:04, Pranith Kumar Karampuri wrote:

                                                          On Wed, Jul 13, 2016 at 10:29 AM, Dmitry Melekhov <dm@xxxxxxxxxx> wrote:

                                                          13.07.2016 08:56, Pranith Kumar Karampuri wrote:

                                                          On Wed, Jul 13, 2016 at 10:23 AM, Dmitry Melekhov <dm@xxxxxxxxxx> wrote:

                                                          13.07.2016 08:46, Pranith Kumar Karampuri wrote:

                                                          On Wed, Jul 13, 2016 at 10:10 AM, Dmitry Melekhov <dm@xxxxxxxxxx> wrote:

                                                          13.07.2016 08:36, Pranith Kumar Karampuri wrote:

                                                          On Wed, Jul 13, 2016 at 9:35 AM, Dmitry Melekhov <dm@xxxxxxxxxx> wrote:

                                                          13.07.2016 01:52, Anuradha Talur wrote:

                                                          ----- Original Message -----
                                                          From: "Dmitry Melekhov" <dm@xxxxxxxxxx>
                                                          To: "Pranith Kumar Karampuri" <pkarampu@xxxxxxxxxx>
                                                          Cc: "gluster-users" <gluster-users@xxxxxxxxxxx>
                                                          Sent: Tuesday, July 12, 2016 9:27:17 PM
                                                          Subject: Re: 3.7.13, index healing broken?

                                                          12.07.2016
                                                          17:39, Pranith
                                                          Kumar
                                                          Karampuri
                                                          пишет:

                                                          Wow, what are the steps to recreate the problem?

                                                          Just set the file length to zero; it is always reproducible.

                                                          If you are setting the file length to 0 on one of the bricks
                                                          (looks like that is the case), it is not a bug. Index heal
                                                          relies on failures seen from the mount point(s) to identify
                                                          the files that need heal. It won't be able to recognize any
                                                          file modification done directly on bricks. Same goes for
                                                          heal info command which is the reason heal info also shows
                                                          0 entries.
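The point about index heal can be illustrated with a toy model. This is only a sketch in Python, not GlusterFS code: the class `ToyReplica` and its methods are hypothetical names, and the "index" here is just a dictionary that gets an entry when a write made through the model fails on one brick. A change made directly to a brick's data never touches that dictionary, so an index-driven heal has nothing to act on.

```python
class ToyReplica:
    """Toy 2-way replica (hypothetical model, not GlusterFS internals)."""

    def __init__(self, n_bricks=2):
        self.bricks = [dict() for _ in range(n_bricks)]  # name -> contents
        self.up = [True] * n_bricks
        self.index = {}  # name -> number of a brick holding a good copy

    def write(self, name, data):
        """Write through the 'stack'; record a failure in the index."""
        good, failed = None, False
        for i, brick in enumerate(self.bricks):
            if self.up[i]:
                brick[name] = data
                good = i
            else:
                failed = True  # this brick missed the write
        if failed and good is not None:
            self.index[name] = good

    def index_heal(self):
        """Heal only the files listed in the index; return their names."""
        if not all(self.up):
            return []  # keep the toy simple: heal only when all bricks are up
        healed = []
        for name, good in list(self.index.items()):
            for brick in self.bricks:
                brick[name] = self.bricks[good][name]
            del self.index[name]
            healed.append(name)
        return sorted(healed)


if __name__ == "__main__":
    vol = ToyReplica()
    vol.write("a.txt", b"hello")
    vol.up[1] = False
    vol.write("b.txt", b"world")   # brick 1 is down, so b.txt gets indexed
    vol.up[1] = True
    vol.bricks[1]["a.txt"] = b""   # truncated directly on the brick
    print("indexed:", sorted(vol.index))
    print("healed:", vol.index_heal())
    print("a.txt on brick 1 is still truncated:", vol.bricks[1]["a.txt"] == b"")
```

In this model the only way to notice the truncated `a.txt` is to look at every file on every brick, which is what a full crawl amounts to.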

                                                          Well, this makes self-heal useless then: if any file is
                                                          accidentally corrupted or deleted (yes! if a file is deleted
                                                          directly from a brick, this is not recognized by index heal
                                                          either), then it will not be self-healed, because self-heal
                                                          uses index heal.

                                                          It is better to look into bit-rot feature if you want to
                                                          guard against these kinds of problems.

                                                          Bit rot detects bit problems, not missing files or their
                                                          wrong length, i.e. this is overhead for such a simple task.

                                                          It detects wrong length. Because checksum won't match anymore.
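Why a checksum also catches a wrong length can be shown in a few lines. This is a minimal sketch, assuming a signature is stored for each file and later compared against a fresh hash; the choice of SHA-256 and the names `checksum` and `is_intact` are illustrative assumptions, not a description of how the bit-rot feature is implemented.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Hex digest of the file contents (SHA-256 is an assumption here)."""
    return hashlib.sha256(data).hexdigest()


def is_intact(current: bytes, stored_signature: str) -> bool:
    """True if the current contents still match the stored signature."""
    return checksum(current) == stored_signature


if __name__ == "__main__":
    original = b"some file contents"
    signature = checksum(original)          # recorded while the file was good
    print(is_intact(original, signature))   # unchanged file
    print(is_intact(b"", signature))        # file truncated to zero length
    print(is_intact(original[:-1], signature))  # one byte shorter
```

The cost is that every byte of the file has to be read to compute the hash, which is the overhead being discussed here.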

                                                          Yes, sure. I guess that it will detect missing files too.
                                                          But it needs far more resources than just comparing
                                                          directories in bricks?

                                                          What use case are you trying out that leads to changing
                                                          things directly on the brick?

                                                          I'm trying to test gluster failure tolerance and right now
                                                          I'm not happy with it...

                                                          Which cases of fault tolerance are you not happy with?
                                                          Making changes directly on the brick or anything else as well?

                                                          I'll repeat:

                                                          As I already said, if I for some reason (in a real case it
                                                          can only be by accident) delete a file, this will not be
                                                          detected by the self-heal daemon and will thus lead to a
                                                          lower replication level, i.e. lower failure tolerance.

                                                          To prevent such accidents you need to set SELinux policies
                                                          so that files under the brick are not modified by accident
                                                          by any user. At least that is the solution I remember when
                                                          this was discussed 3-4 years back.

                                                          So the only supported platform is Linux? Or maybe it is
                                                          better to improve self-healing to detect missing or
                                                          wrong-length files; I guess this is a very low-cost
                                                          operation in terms of host resources.

                                                          Just a suggestion; maybe we need to look at alternatives
                                                          in the near future....

                                                          This is a corner case; from a design perspective it is
                                                          generally not a good idea to optimize for the corner case.
                                                          It is better to protect ourselves from the corner case
                                                          (SELinux etc.), or you can also use snapshots to protect
                                                          against these kinds of mishaps.

                                                          Sorry, I don't agree.

                                                          As you know, if a missing or wrong-length file is accessed
                                                          from a fuse client, it is restored (healed), i.e. gluster
                                                          recognizes that the file is wrong and heals it, so I do not
                                                          see any reason not to provide such a function in self-healing.

                                                          Thank you!

                                                          Ah! Now how do you suggest we keep track of which of 10s of
                                                          millions of files the user accidentally deleted from the
                                                          brick without gluster's knowledge? Once it comes to
                                                          gluster's knowledge we can do something. But how does
                                                          gluster become aware of something it is not keeping track
                                                          of? At the time you access it gluster knows something went
                                                          wrong so it restores it. If you change something on the
                                                          bricks even by accident all the data gluster keeps (similar
                                                          to journal) is a waste. Even the disk filesystems will ask
                                                          you to do fsck if something unexpected happens so full
                                                          self-heal is similar operation.

                                                You are absolutely right; the question is why
                                                gluster does not become aware of such a problem
                                                in the case of self-healing?

                                            Because the operations
                                              that are performed
                                              directly on brick do not
                                              go through gluster stack.

                                OK, I'll repeat:

                                 As you know, if a missing or wrong-length file is
                                  accessed from a fuse client, it is restored (healed),
                                  i.e. gluster recognizes that the file is wrong and
                                  heals it, so I do not see any reason not to provide
                                  such a function in self-healing.

                            For which you need to access the file.

                That's right.

                          For which you need full crawl. You can't
                            detect the modification which doesn't go
                            through the stack so this is the only
                            possibility. 

                 OK, then, if self-heal is really useless and no way to
                get this will be provided, I guess we'll use an external
                script to check the consistency of the bricks' directories;
                I don't think ls and diff will take many resources.
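The kind of external check proposed here could look roughly like the following. It is a minimal sketch, assuming both brick directories are reachable from the machine that runs it (for example after copying listings over, or via a network mount) and that a plain replica keeps the same relative paths on each brick. The function names `listing` and `compare` are hypothetical, and skipping a directory named `.glusterfs` is an assumption about what should be excluded as brick-internal metadata.

```python
import os


def listing(brick_root, skip=(".glusterfs",)):
    """Map relative path -> size for every file under brick_root."""
    sizes = {}
    for dirpath, dirnames, filenames in os.walk(brick_root):
        # Prune directories we do not want to descend into.
        dirnames[:] = [d for d in dirnames if d not in skip]
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, brick_root)
            sizes[rel] = os.lstat(full).st_size  # lstat: do not follow symlinks
    return sizes


def compare(brick_a, brick_b):
    """Return (missing_in_a, missing_in_b, size_mismatch) as sorted lists."""
    a, b = listing(brick_a), listing(brick_b)
    missing_in_a = sorted(set(b) - set(a))
    missing_in_b = sorted(set(a) - set(b))
    size_mismatch = sorted(p for p in set(a) & set(b) if a[p] != b[p])
    return missing_in_a, missing_in_b, size_mismatch


if __name__ == "__main__":
    import sys
    only_b, only_a, differ = compare(sys.argv[1], sys.argv[2])
    print("missing in first brick: ", only_b)
    print("missing in second brick:", only_a)
    print("different sizes:        ", differ)
```

A script like this only reports differences in names and sizes; it cannot tell which copy is the good one, and it will also flag files that are legitimately pending heal.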

            How is this different from full self-heal?

    Self-heal does not detect deleted or wrong-length files.

It does detect them when you do a full crawl, which is essentially an ls -laR kind of thing on the whole volume. You don't need any external scripts; maybe just keep doing a full crawl once in a while? If you need any performance improvements here, we will be happy to help. Please give us feedback. All I was saying is that it is not possible to detect them through index heal, because for the index to be populated the operations need to go through the gluster stack.
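As a rough illustration of what "an ls -laR kind of thing" means, the sketch below walks a directory tree and stats every entry. It assumes the volume is mounted on a client at some path (the path is passed on the command line and is entirely up to the reader); the function name `crawl` is hypothetical. Whether a lookup from the mount actually triggers a heal depends on how the volume is configured, and the gluster CLI has its own `gluster volume heal <VOLNAME> full` command for a server-side full crawl, so treat this only as a picture of the access pattern.

```python
import os


def crawl(mount_point):
    """Stat every entry under mount_point; return how many were visited."""
    visited = 0
    for dirpath, dirnames, filenames in os.walk(mount_point):
        for name in dirnames + filenames:
            try:
                os.lstat(os.path.join(dirpath, name))  # one lookup per entry
                visited += 1
            except OSError:
                # The entry vanished or is unreadable; keep crawling.
                pass
    return visited


if __name__ == "__main__":
    import sys
    print("entries visited:", crawl(sys.argv[1]))
```

The work grows with the total number of entries in the volume, which is why a crawl like this is something to run once in a while rather than continuously.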

    Why can't it? I don't know; you just said it is impossible in
    gluster because it can only track changes made through gluster,
    i.e. bricks can have different file sets and this is not recognized
    (true) because, as I understand it, gluster's self-heal assumes that
    the brick's underlying filesystem can't be corrupted by the server
    admin (not true; I can say this as an engineer with almost 25 years
    of experience, i.e. I did this several times ;-) ).

                Thank you!

                p.s.

                still can't understand why it can't be implemented in
                gluster... :-(

-- 
Pranith

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users