> > > As far as 2 goes, the application can checkpoint by doing fsync and, on
> > > write failures, roll back to the last checkpoint and replay writes from
> > > that checkpoint. Or, glusterfs can retry the writes on behalf of the
> > > application. However, glusterfs retrying writes cannot be a complete
> > > solution, as the error condition we've run into might never get resolved
> > > (for example, running out of space). So, glusterfs has to give up after
> > > some time.

The application should not be expected to replay writes; glusterfs must
retry the failed write. In gluster-swift, we had hit a case where the
application would get EIO but the write had actually failed because of
ENOSPC. https://bugzilla.redhat.com/show_bug.cgi?id=986812

Regards,
-Prashanth Pai

----- Original Message -----
> From: "Raghavendra Gowdappa" <rgowdapp@xxxxxxxxxx>
> To: "Vijay Bellur" <vbellur@xxxxxxxxxx>
> Cc: "Gluster Devel" <gluster-devel@xxxxxxxxxxx>, "Ben Turner" <bturner@xxxxxxxxxx>, "Ira Cooper" <icooper@xxxxxxxxxx>
> Sent: Tuesday, September 29, 2015 4:56:33 PM
> Subject: Re: Handling Failed flushes in write-behind
>
> + gluster-devel
>
> > On Tuesday 29 September 2015 04:45 PM, Raghavendra Gowdappa wrote:
> > > Hi All,
> > >
> > > Currently, on failure to flush the writeback cache, we mark the fd bad.
> > > The rationale behind this is that since the application doesn't know
> > > which of the cached writes failed, the fd is in a bad state and cannot
> > > possibly do a meaningful/correct read. However, this approach (though
> > > POSIX-compliant) is not acceptable for long-standing applications like
> > > QEMU [1]. So, a two-part solution was decided:
> > >
> > > 1. No longer mark the fd bad on failures while flushing data to the
> > >    backend from the write-behind cache.
> > > 2. Retry the writes.
> > >
> > > As far as 2 goes, the application can checkpoint by doing fsync and, on
> > > write failures, roll back to the last checkpoint and replay writes from
> > > that checkpoint. Or, glusterfs can retry the writes on behalf of the
> > > application. However, glusterfs retrying writes cannot be a complete
> > > solution, as the error condition we've run into might never get resolved
> > > (for example, running out of space). So, glusterfs has to give up after
> > > some time.
> > >
> > > It would be helpful if you could give your inputs on how other writeback
> > > systems (e.g., the kernel page cache, NFS, Samba, Ceph, Lustre) behave
> > > in this scenario and what would be a sane policy for glusterfs.
> > >
> > > [1] https://bugzilla.redhat.com/show_bug.cgi?id=1200862
> > >
> > > regards,
> > > Raghavendra
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel@xxxxxxxxxxx
> http://www.gluster.org/mailman/listinfo/gluster-devel
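P.S. The application-side checkpoint-and-replay scheme discussed above could
be sketched roughly as below. This is only an illustration of the idea, not
glusterfs or QEMU code; the class name, the in-memory replay log, and the
retry count are assumptions made for the sketch.

```python
import os

class CheckpointedWriter:
    """Sketch: fsync establishes a checkpoint; on a failed write, seek
    back to the checkpoint offset and replay every write issued since."""

    def __init__(self, path):
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        self.ckpt_offset = 0   # offset at the last successful fsync
        self.pending = []      # writes issued since the last checkpoint

    def write(self, data, retries=3):
        self.pending.append(data)
        for _ in range(retries):
            try:
                os.write(self.fd, data)
                return
            except OSError:
                self._rollback_and_replay()
        # Give up after a bounded number of retries, as the error
        # (e.g. ENOSPC) might never get resolved.
        raise IOError("write failed after %d retries" % retries)

    def _rollback_and_replay(self):
        # Roll back to the last checkpoint and replay every earlier
        # pending write; the caller then retries the current one.
        os.lseek(self.fd, self.ckpt_offset, os.SEEK_SET)
        for data in self.pending[:-1]:
            os.write(self.fd, data)

    def checkpoint(self):
        # If fsync succeeds, the data is durable and the replay log
        # can be discarded.
        os.fsync(self.fd)
        self.ckpt_offset = os.lseek(self.fd, 0, os.SEEK_CUR)
        self.pending = []

    def close(self):
        self.checkpoint()
        os.close(self.fd)
```

The point of the sketch is the division of labour: fsync is the only
durability barrier the application relies on, so anything after the last
successful fsync is considered replayable by the application rather than
guaranteed by the cache.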