On second thought, this corner case is triggered only when the reopen after handshake has failed. Hence this is definitely not a regression, and we can ignore this case.

----- Original Message -----
> From: "Raghavendra Gowdappa" <rgowdapp@xxxxxxxxxx>
> To: "Pranith Kumar Karampuri" <pkarampu@xxxxxxxxxx>
> Cc: "Anand Avati" <aavati@xxxxxxxxxx>, "Amar Tumballi" <atumball@xxxxxxxxxx>, "devel" <gluster-devel@xxxxxxxxxx>,
> "Mohammed Junaid" <junaid@xxxxxxxxxx>
> Sent: Monday, January 7, 2013 4:46:59 PM
> Subject: Re: Need review for client-reopen changes
>
> Pranith,
>
> This comment is on the second patch. While the implementation looks
> fine, I have some concerns about the idea itself. Consider the
> following situation with a replicate volume of two subvolumes:
>
> 1. process 1 (p1) acquires a mandatory lock.
> 2. stop first server, replace disk
> 3. reopen of the fd opened by p1 fails (since the file is not present).
> 4. self heal completes. parent is notified that child is up. However,
>    the fd is not opened yet.
> 5. now, there is a possibility that another process p2 can
>    successfully write to another fd opened on the same file (on
>    server1), since the lock (from p1) is not yet acquired on server1.
>
> A similar situation can arise even without this patch, but only when p1
> and p2 are not running on the same mount point. With this patch it can
> happen on a single mount point too. I am not sure whether we can
> ignore this corner case. Others, please let us know your opinion on
> this.
>
> regards,
> Raghavendra.
>
> ----- Original Message -----
> > From: "Pranith Kumar Karampuri" <pkarampu@xxxxxxxxxx>
> > To: "devel" <gluster-devel@xxxxxxxxxx>
> > Cc: "Raghavendra Gowdappa" <rgowdapp@xxxxxxxxxx>, "Krishnan
> > Parthasarathi" <kparthas@xxxxxxxxxx>, "Jeff Darcy"
> > <jdarcy@xxxxxxxxxx>, "Amar Tumballi" <atumball@xxxxxxxxxx>
> > Sent: Monday, January 7, 2013 10:15:36 AM
> > Subject: Need review for client-reopen changes
> >
> > hi,
> > http://review.gluster.org/#change,4357
> > http://review.gluster.org/#change,4358
> >
> > are the changes I made to handle re-opens of files in the case
> > where a disk is replaced while a brick is offline. The idea is to
> > attempt re-opens once self-heal completes and the file can be opened.
> > With these changes, readv/fxattrop/writev/finodelk for fds with
> > remote-fd -1 are attempted using anon-fds, and if the fop succeeds
> > then the re-open is attempted on every 1024th success. 1024 is an
> > arbitrary number I used. The re-open of a file could fail because of
> > posix lock re-acquisition failure; that is the reason re-opens are
> > attempted periodically (on every 1024th successful fop on that fd).
> >
> > I think the re-attempt logic could be better.
> > For instance, we can attempt the re-open on the first success on
> > the anon-fd instead of waiting till the 1024th success, and if this
> > re-open fails we could fall back on 'periodic attempts', i.e. on
> > every 1024th success on the anon-fd.
> >
> > Let me know your thoughts.
> >
> > Pranith
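
The re-attempt policy described in the original mail can be sketched roughly as follows. This is only an illustration in Python, not the GlusterFS client code (which is C). The names `FdReopenState`, `should_attempt_reopen`, `on_anon_fop_success` and the `try_reopen` callback are hypothetical and do not come from the patches; only the interval of 1024, the use of -1 for an fd that is not open on the brick, and the "first success, then periodic" fallback are taken from the mail.

```python
from dataclasses import dataclass
from typing import Callable

# Period between re-open attempts; the mail calls 1024 an arbitrary choice.
REOPEN_INTERVAL = 1024


@dataclass
class FdReopenState:
    """Hypothetical per-fd bookkeeping (not the real GlusterFS structure)."""
    remote_fd: int = -1      # -1 means the fd is not open on the brick
    anon_successes: int = 0  # fops that succeeded through the anonymous fd


def should_attempt_reopen(successes: int, eager_first: bool) -> bool:
    """Decide whether this successful anon-fd fop should trigger a re-open."""
    if eager_first and successes == 1:
        # Proposed improvement: try on the very first success.
        return True
    # Behaviour described for the patch: try on every 1024th success.
    return successes % REOPEN_INTERVAL == 0


def on_anon_fop_success(state: FdReopenState,
                        try_reopen: Callable[[], int],
                        eager_first: bool = False) -> bool:
    """Record one successful anon-fd fop; return True once the fd is re-opened.

    try_reopen is an assumed callback that returns the new remote fd, or -1
    if the re-open failed (for example, lock re-acquisition failed).
    """
    if state.remote_fd != -1:
        return True  # already re-opened, nothing to do
    state.anon_successes += 1
    if should_attempt_reopen(state.anon_successes, eager_first):
        new_fd = try_reopen()
        if new_fd != -1:
            state.remote_fd = new_fd
    return state.remote_fd != -1
```

With `eager_first=False` a re-open that keeps failing is retried on the 1024th, 2048th, ... success; with `eager_first=True` there is one extra attempt on the first success and the same periodic schedule afterwards.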