Hi,
This is basically an RFC for bringing more structure into the
indices/xattrop directory, thus improving heal performance.
Problem statement - Right now the indices/xattrop directory is a flat
directory containing a bunch of files whose names are the gfids of the
files/directories that need to be healed. These files are hardlinks to
each other. The self heal daemon does a directory read of the xattrop
directory and heals one entry at a time. Since no dependencies are
tracked across the entries, the heal ordering depends on the order in
which the directory read returns the entries to the self heal daemon.
As a result a self heal can take multiple crawls to complete, because
the parents MAY not get healed before their children. The number of
crawls grows with the depth of the directory hierarchy on the brick,
which degrades the performance of self heal.
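To make the crawl count concrete, here is a small sketch in Python (not
GlusterFS code). It assumes a simplified model in which healing an entry
fails while its parent is still pending; crawls_needed, parent_of and
the gfid names are made up for illustration.

```python
def crawls_needed(parent_of, pending, readdir_order):
    """Count the crawls needed until every pending gfid is healed.

    Assumed model: an entry heals only if its parent is not pending
    (already healed, or never needed a heal in the first place).
    """
    unhealed = set(pending)
    crawls = 0
    while unhealed:
        crawls += 1
        progressed = False
        for gfid in readdir_order:
            if gfid not in unhealed:
                continue
            if parent_of.get(gfid) in unhealed:
                continue  # parent still unhealed, so this heal fails for now
            unhealed.discard(gfid)
            progressed = True
        if not progressed:
            raise RuntimeError("no progress in a full crawl")
    return crawls


# d1 contains d2, d2 contains f3; all three need heal.
parent_of = {"d2": "d1", "f3": "d2"}
pending = ["d1", "d2", "f3"]
worst = crawls_needed(parent_of, pending, ["f3", "d2", "d1"])
best = crawls_needed(parent_of, pending, ["d1", "d2", "f3"])
```

With the children returned first, each crawl heals only one level of
the hierarchy; with the parents returned first, one crawl is enough.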
Solution -
Maintain a hierarchy among dependent entries. This can be done in
memory and, if necessary, be logged into a file in the same directory
(indices/xattrop). There is a patch sent
http://review.gluster.org/#/c/5140/
that allows for a parent gfid to be stored per file. So when a brick
daemon needs to create a new entry in the xattrop directory, it checks
whether its parent gfid also needs to be healed. If yes, it inserts this
entry against the parent. This way we can build a hierarchy of linked
lists which is two dimensional: each member of a list points to its
peers, which have the same parent (in one dimension), and to its
children (in the second dimension). This would normally happen when,
say, in an N way replicate scheme, one brick was down and many
creates/writes are happening on the other bricks. At the time of
healing we heal all the parents, followed by their children, and so on.
This way we can be sure that all heals will go through in one crawl.
All independent (unrelated) files that require self heal will have
their gfids stored separately; these could be healed at a higher
priority, since they are low hanging fruit as far as heal is concerned:
they have no dependencies.
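The two dimensional list described above could look roughly like the
following Python sketch. It is only an illustration under assumptions,
not the proposed implementation: Node, HealIndex, add and heal_order
are hypothetical names, and an entry whose parent is not already in the
index is simply treated as a top-level entry.

```python
from collections import deque


class Node:
    def __init__(self, gfid):
        self.gfid = gfid
        self.next_sibling = None  # first dimension: peer with same parent
        self.first_child = None   # second dimension: head of child list


def siblings(node):
    """Walk a sibling list starting at node."""
    while node is not None:
        yield node
        node = node.next_sibling


class HealIndex:
    def __init__(self):
        self.nodes = {}  # gfid -> Node, used for the parent lookup
        self.roots = []  # entries whose parent is not pending heal

    def add(self, gfid, parent_gfid=None):
        if gfid in self.nodes:
            return
        node = Node(gfid)
        self.nodes[gfid] = node
        parent = self.nodes.get(parent_gfid)
        if parent is None:
            self.roots.append(node)
        else:
            # Insert against the parent: push onto its child list.
            node.next_sibling = parent.first_child
            parent.first_child = node

    def heal_order(self):
        # Independent entries (no dependents) go first.
        order = [r.gfid for r in self.roots if r.first_child is None]
        # Then each level of parents before the level of their children.
        queue = deque(r for r in self.roots if r.first_child is not None)
        while queue:
            node = queue.popleft()
            order.append(node.gfid)
            queue.extend(siblings(node.first_child))
        return order


idx = HealIndex()
idx.add("d1")        # directory needing heal
idx.add("f9")        # unrelated file
idx.add("d2", "d1")  # subdirectory of d1
idx.add("f3", "d2")  # file inside d2
idx.add("f4", "d1")  # file inside d1
order = idx.heal_order()
```

Here the unrelated file f9 is healed first, and every directory appears
in the order before anything inside it, so a single pass suffices.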
Having such a mechanism would make healing more deterministic. Also,
the errors that we see during heal (basically because a file does not
exist yet, since its parent directory entry is still unhealed) would go
away, making the self heal process cleaner. This is observed in the
following bug:
https://bugzilla.redhat.com/show_bug.cgi?id=978335
As I mentioned above, this can be logged into a file. Flushing into a
file would make this information persistent across reboots, though the
use cases for that need to be better evaluated. Doing this on every
writable fop that could create an entry in indices/xattrop could hurt
performance. Hence it can be done at periodic intervals instead.
Since this is an additional log file that is used as an input to heal,
it should not cause any problems on upgrade. In the worst case, self
heal does not look into this file and does the old style healing.
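A rough Python sketch of the flush and fallback idea, under assumptions:
the one-line-per-entry "gfid parent" format and the function names are
invented here for illustration, and the periodic timer that would call
flush_hierarchy is left out.

```python
import os
import tempfile


def flush_hierarchy(parent_of, path):
    """Write 'gfid parent' lines and swap the file in atomically."""
    directory = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        for gfid, parent in sorted(parent_of.items()):
            f.write(f"{gfid} {parent or '-'}\n")  # '-' means no pending parent
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)  # readers see the old or the new file, never half


def load_hierarchy(path):
    """Return the gfid -> parent map, or None if there is no log file."""
    try:
        with open(path) as f:
            pairs = [line.split() for line in f if line.strip()]
    except FileNotFoundError:
        return None  # caller falls back to the old style flat crawl
    return {gfid: (None if parent == "-" else parent)
            for gfid, parent in pairs}
```

A missing file is not an error: load_hierarchy returns None and the
caller can do the old style healing, which is what keeps upgrades safe.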
Implementation wise, this can be done in a new translator (in front of
the index translator). Again, doing this would help in upgrade
scenarios.
Please reply with your suggestions.
Thanks & Regards
Raghav