----- Original Message -----
> From: "Raghavendra Gowdappa" <rgowdapp@xxxxxxxxxx>
> To: "Shyam" <srangana@xxxxxxxxxx>
> Cc: gluster-devel@xxxxxxxxxxx
> Sent: Tuesday, May 19, 2015 11:46:19 AM
> Subject: Re: Moratorium on new patch acceptance
>
> ----- Original Message -----
> > From: "Shyam" <srangana@xxxxxxxxxx>
> > To: gluster-devel@xxxxxxxxxxx
> > Sent: Tuesday, May 19, 2015 6:13:06 AM
> > Subject: Re: Moratorium on new patch acceptance
> >
> > On 05/18/2015 07:05 PM, Shyam wrote:
> > > On 05/18/2015 03:49 PM, Shyam wrote:
> > >> On 05/18/2015 10:33 AM, Vijay Bellur wrote:
> > >>
> > >> The etherpad did not call out ./tests/bugs/distribute/bug-1161156.t,
> > >> which did not have an owner, so I took a stab at it; the results are
> > >> below.
> > >>
> > >> I also think the failure in ./tests/bugs/quota/bug-1038598.t has the
> > >> same cause as the observation below.
> > >>
> > >> NOTE: Anyone with better knowledge of quota could chip in on what we
> > >> should expect in this case and how to correct the expectations of
> > >> these test cases.
> > >>
> > >> (Details of ./tests/bugs/distribute/bug-1161156.t)
> > >> 1) The failure is in TEST #20.
> > >> Failed line: TEST ! dd if=/dev/zero of=$N0/$mydir/newfile_2 bs=1k
> > >> count=10240 conv=fdatasync
> > >>
> > >> 2) The above line is expected to fail (i.e. dd is expected to fail),
> > >> as the set quota is 20MB and at this point in the test case we are
> > >> attempting to exceed it by another 5MB.
> > >>
> > >> 3) The failure is easily reproducible on my laptop, 2/10 times.
> > >>
> > >> 4) On debugging, I see that when the above dd succeeds (i.e. the test
> > >> fails, meaning dd succeeded in writing more than the set quota),
> > >> there are no write errors from the bricks, nor any errors on the
> > >> final COMMIT RPC call to NFS.
> > >>
> > >> As a result, the expectation of this test fails.
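The expectation in step 2 above can be sketched as a small Python model. This is a hypothetical illustration of what the test assumes (a synchronous quota check rejecting the write with EDQUOT), not the real quota xlator code; the `enforce` helper and the assumed ~15MB of prior usage are made up for the example.

```python
import errno

QUOTA = 20 * 1024 * 1024         # 20MB hard limit set by the test
WRITE = 10240 * 1024             # dd bs=1k count=10240, i.e. a 10MB write
USED = QUOTA - 5 * 1024 * 1024   # assume ~15MB already written, so this
                                 # write would exceed the limit by 5MB

def enforce(used, limit, size):
    """Hypothetical synchronous check: reject any write that would
    cross the hard limit with EDQUOT (122 on Linux)."""
    if used + size > limit:
        raise OSError(errno.EDQUOT, "Disk quota exceeded")

try:
    enforce(USED, QUOTA, WRITE)
    dd_failed = False
except OSError as e:
    dd_failed = (e.errno == errno.EDQUOT)

print(dd_failed)
```

Under this model dd always fails, which is what `TEST !` asserts; the rest of the thread explains why the real enforcement does not behave this synchronously.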
> > >>
> > >> NOTE: Sometimes there is a write failure from one of the bricks (the
> > >> above test uses AFR as well), but AFR self-heal kicks in and fixes
> > >> the problem, as expected, since the write succeeded on one of the
> > >> replicas. I add this observation because the failed regression run
> > >> logs have some EDQUOT errors reported in the client xlator, but only
> > >> from one of the bricks, and there are further AFR self-heal entries
> > >> in the logs.
> > >>
> > >> 5) When the test case succeeds, the writes fail with EDQUOT as
> > >> expected. There are times when the quota is exceeded by, say,
> > >> 1MB-4.8MB, but the test case still passes. This means that if we were
> > >> to try to exceed the quota by 1MB (instead of the 5MB in the test
> > >> case), this test case might always fail.
> > >
> > > Here is why I think this slips past quota sometimes and not others,
> > > making this and the other test case mentioned below spurious:
> > > - Each write is 256K from the client (that is what is sent over the
> > >   wire).
> > > - If more IO was queued by io-threads after passing the quota checks
> > >   (in this 5MB case that requires >20 IOs to be queued; 16 IOs could
> > >   be active in io-threads itself), we could end up writing more than
> > >   the quota amount.
> > >
> > > So, if quota checks whether a write violates the quota, lets it
> > > through, and only on the UNWIND updates the space used for future
> > > checks, we could have more IO outstanding than the quota allows, and
> > > as a result a larger write could pass through, considering the
> > > io-threads queue and active IOs as well. Would this be a fair
> > > description of how quota works?
>
> Yes, this is a possible scenario. There is a finite time window between:
>
> 1. Querying the size of a directory, in other words checking whether the
>    current write can be allowed.
> 2.
>    The "effect" of this write getting reflected in the size of all the
>    parent directories of the file, up to the root.
>
> If 1 and 2 were atomic, a parallel write that would exceed the quota
> limit could not have slipped through. Unfortunately, in the current
> scheme of things they are not atomic. There can be parallel writes in
> this test case because of nfs-client and/or glusterfs write-back (even
> though we have a single-threaded application, dd, running). One way of
> testing this hypothesis is to disable nfs and glusterfs write-back and
> run the same (unmodified) test; the test should then always succeed (dd
> should fail). To disable write-back in nfs you can use the noac option
> while mounting.
>
> The situation becomes worse in real-life scenarios because of the
> parallelism involved at many layers:
>
> 1. Multiple applications, each possibly multithreaded, writing to many
>    (or a single) file(s) in a quota subtree.
> 2. Write-back in the NFS client and glusterfs.
> 3. Multiple bricks holding files of a quota subtree, each brick
>    simultaneously processing many write requests through io-threads.
> 4. Background accounting of directory sizes _after_ a write is complete.
>
> I have tried to fix this issue in the past, though unsuccessfully. It
> seems to me that one effective strategy is to make enforcement and the
> update of parent sizes atomic, but if we do that we add the latency of
> accounting to the latency of the fop. Other options can be explored.
> However, our quota functionality requirements allow a buffer of 10%
> while enforcing limits, so this issue has not been high on our priority
> list till now. Our tests should therefore also expect failures allowing
> for this 10% buffer.
>
> > > I believe this is what is happening in this case. Checking a fix on
> > > my machine, and I will post the same if it proves to help the
> > > situation.
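The window between enforcement (1) and accounting (2) can be made concrete with a toy Python sketch. This is a hypothetical model, not the real quota translator: `RacyQuota`, the barrier, and the writer counts are invented for the example; the barrier simply forces every writer to perform its check before any accounting lands, which is the worst case of the io-threads queuing described above.

```python
import threading

LIMIT = 20 * 1024 * 1024   # 20MB quota, as in the test case
WRITE = 256 * 1024         # 256KB per wire-level write
NWRITERS = 21              # enough queued IOs to overshoot by 5MB

class RacyQuota:
    """Toy model: the size check and the size update are two separate
    critical sections, so concurrent writers can all see a stale
    'used' value between the two."""
    def __init__(self, limit, used):
        self.limit = limit
        self.used = used
        self.lock = threading.Lock()

    def check(self, size):
        with self.lock:            # step 1: enforcement
            return self.used + size <= self.limit

    def account(self, size):
        with self.lock:            # step 2: accounting, done later
            self.used += size

q = RacyQuota(LIMIT, used=LIMIT - WRITE)  # one write away from the limit
barrier = threading.Barrier(NWRITERS)     # all checks before any accounting
admitted = []

def writer():
    if q.check(WRITE):     # every writer sees used == LIMIT - WRITE
        barrier.wait()     # models IOs sitting in the io-threads queue
        q.account(WRITE)   # size updated only after the write "completes"
        admitted.append(WRITE)

threads = [threading.Thread(target=writer) for _ in range(NWRITERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

overshoot = q.used - LIMIT
print(overshoot // 1024, "KB past the limit")  # prints: 5120 KB past the limit
```

All 21 writers pass the check against the same stale size, so the directory ends up a full 5MB past the hard limit, matching the overshoot the test case relies on; making `check` and `account` one critical section removes the overshoot at the cost of serializing accounting into the fop path.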
> >
> > Posted a patch to fix the problem: http://review.gluster.org/#/c/10811/
> >
> > There are arguably other ways to fix/overcome this; this one seemed apt
> > for this test case, though.
> >
> > >>
> > >> 6) A note on dd with conv=fdatasync
> > >> As one of the fixes attempts to overcome this issue by adding
> > >> "conv=fdatasync", I wanted to cover that behavior here.
> > >>
> > >> What the above parameter does is send an NFS_COMMIT (which
> > >> internally becomes a flush FOP) after writing the blocks to the NFS
> > >> share. This commit triggers any pending writes for the file and
> > >> sends the flush to the brick, all of which succeeds at times,
> > >> resulting in the failure of the test case.
> > >>
> > >> NOTE: In the TC ./tests/bugs/quota/bug-1038598.t the failed line is
> > >> in much the same context (LINE 26: TEST ! dd if=/dev/zero
> > >> of=$M0/test_dir/file1.txt bs=1024k count=15), expecting the hard
> > >> limit to be exceeded, and there are no write failures in the logs
> > >> (which would be expected with EDQUOT (122)).
> > > _______________________________________________
> > > Gluster-devel mailing list
> > > Gluster-devel@xxxxxxxxxxx
> > > http://www.gluster.org/mailman/listinfo/gluster-devel