Re: file corruption on Gluster 3.5.1 and Ubuntu 14.04

O_APPEND gets stripped on the server side only if one of the following xlators is loaded:

- stripe
- quiesce
- crypt

If you have any of these loaded, please unload them (or reconfigure the volume without these features) and try again.
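
To illustrate why a stripped O_APPEND matters, here is a minimal local sketch. It is not Gluster code. The function name `simulate` and the record contents are made up for the example, and it uses two plain file descriptors on a temporary file to stand in for two writers whose offsets are stale. With O_APPEND the kernel places each write at the current end of file; without it the second write lands at the stale offset and overwrites the first, which is the pattern described in the strace further down the thread.

```python
import os
import tempfile


def simulate(use_append: bool) -> bytes:
    """Two writers each believe the file is empty and write from offset 0.

    With O_APPEND the kernel ignores the stale position and appends.
    Without it, the second write lands on top of the first.
    """
    fd, path = tempfile.mkstemp()
    os.close(fd)
    flags = os.O_WRONLY | (os.O_APPEND if use_append else 0)
    writer_a = os.open(path, flags)
    writer_b = os.open(path, flags)
    try:
        os.write(writer_a, b"first\n")   # writer A starts at its own offset 0
        os.write(writer_b, b"second\n")  # writer B's private offset is still 0
    finally:
        os.close(writer_a)
        os.close(writer_b)
    with open(path, "rb") as f:
        data = f.read()
    os.remove(path)
    return data


if __name__ == "__main__":
    print("with O_APPEND:   ", simulate(True))
    print("without O_APPEND:", simulate(False))
```

On a local filesystem the first call keeps both records and the second keeps only the later one.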

Thanks


On Sat, Sep 6, 2014 at 3:31 PM, mike <mike@xxxxxxxxxxxxxxxxxxxx> wrote:
I was able to narrow it down to a smallish Python script.

I've attached that to the bug.

https://bugzilla.redhat.com/show_bug.cgi?id=1138970
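
For readers who want to run a similar check on their own mount, an append-integrity test might look roughly like the sketch below. This is not the script attached to the bug. The function names (`append_records`, `missing_records`, `run_check`), the `writer:sequence` record format, and the use of threads instead of separate processes are all assumptions made for illustration. Each writer opens the file with O_APPEND and writes numbered records; afterwards the checker reports any record that never made it into the file.

```python
import os
import threading


def append_records(path: str, writer_id: int, count: int) -> None:
    """Append `count` numbered records using a descriptor opened with O_APPEND."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        for i in range(count):
            os.write(fd, f"{writer_id}:{i}\n".encode())
    finally:
        os.close(fd)


def missing_records(path: str, writers: int, count: int) -> list:
    """Return the expected records that are absent from the file."""
    with open(path, "rb") as f:
        seen = set(f.read().decode(errors="replace").splitlines())
    expected = [f"{w}:{i}" for w in range(writers) for i in range(count)]
    return [record for record in expected if record not in seen]


def run_check(path: str, writers: int = 2, count: int = 200) -> list:
    """Run the writers concurrently, then list any lost records."""
    threads = [
        threading.Thread(target=append_records, args=(path, w, count))
        for w in range(writers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return missing_records(path, writers, count)
```

Pointing `run_check` at a path on the FUSE mount and again at the same volume mounted over NFS would give a direct comparison; an empty list means no appended record was lost.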


On Sep 6, 2014, at 1:05 PM, Justin Clift <justin@xxxxxxxxxxx> wrote:

> Thanks Mike, this is good stuff. :)
>
> + Justin
>
>
> On 06/09/2014, at 8:19 PM, mike wrote:
>> I upgraded the client to Gluster 3.5.2, but there is no difference.
>>
>> The bug is almost certainly in the FUSE client. If I remount the filesystem over NFS, the problem is no longer observable.
>>
>> I spent a little time looking through the xlator/fuse-bridge to see where the offsets are coming from, but I'm really not familiar enough with the code, so it is slow going.
>>
>> Unfortunately, I'm still having trouble reproducing this in a python script that could be readily attached to a bug report.
>>
>> I'll take a crack at that again, but I will file a bug anyway for completeness.
>>
>> On Sep 5, 2014, at 7:10 PM, mike <mike@xxxxxxxxxxxxxxxxxxxx> wrote:
>>
>>> I have narrowed down the source of the bug.
>>>
>>> Here is an strace of glusterfsd http://fpaste.org/131455/40996378/
>>>
>>> The first line represents a write that does *not* make it into the underlying file.
>>>
>>> The last line is the write that stomps the earlier write.
>>>
>>> As I said, the client file is opened in O_APPEND mode, but on the glusterfsd side, the file is just O_CREAT|O_WRONLY. That means the offsets passed to pwrite() need to be valid.
>>>
>>> I correlated this to a tcpdump I took and I can see that in fact, the RPCs being sent have the wrong offset.  Interestingly, glusterfs.write-is-append = 0, which I wouldn't have expected.
>>>
>>> I think the bug lies in the glusterfs fuse client.
>>>
>>> As to your question about Gluster 3.5.2, I may be able to do that if I am unable to find the bug in the source.
>>>
>>> -Mike
>>>
>>> On Sep 5, 2014, at 6:16 PM, Justin Clift <justin@xxxxxxxxxxx> wrote:
>>>
>>>> On 06/09/2014, at 12:10 AM, mike wrote:
>>>>> I have found that the O_APPEND flag is key to this failure - I had overlooked that flag when reading the strace and trying to cobble up a minimal reproduction.
>>>>>
>>>>> I now have a small pair of python scripts that can reliably reproduce this failure.
>>>>
>>>>
>>>> As a thought, is there a reasonable way you can test this on GlusterFS 3.5.2?
>>>>
>>>> There were some important bug fixes in 3.5.2 (from 3.5.1).
>>>>
>>>> Note I'm not saying yours is one of them, I'm just asking if it's
>>>> easy to test and find out. :)
>>>>
>>>> Regards and best wishes,
>>>>
>>>> Justin Clift
>>>>
>>>> --
>>>> GlusterFS - http://www.gluster.org
>>>>
>>>> An open source, distributed file system scaling to several
>>>> petabytes, and handling thousands of clients.
>>>>
>>>> My personal twitter: twitter.com/realjustinclift
>>>>
>>>
>>
>

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-users

