Re: corruption using gluster and iSCSI with LIO

Krutika Dhananjay <kdhananj@xxxxxxxxxx> · Fri, 18 Nov 2016 15:30:28 +0530

Assuming you're using FUSE: if your gluster volume is mounted at /some/dir, for example,
its client log will be at /var/log/glusterfs/some-dir.log
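
To make the naming rule concrete, here is a rough sketch of the mapping from mount point to log file. The helper name `fuse_client_log_path` is made up for illustration, and the sketch assumes the default log directory and that no custom log file was configured at mount time:

```python
def fuse_client_log_path(mount_point: str, log_dir: str = "/var/log/glusterfs") -> str:
    """Guess the FUSE client log path for a gluster mount point.

    Assumption: the log file is named after the mount path, with the leading
    slash dropped and the remaining slashes replaced by hyphens.
    """
    # Split on "/" and drop empty pieces (leading, trailing or doubled slashes).
    parts = [p for p in mount_point.split("/") if p]
    return f"{log_dir}/{'-'.join(parts)}.log"


print(fuse_client_log_path("/some/dir"))  # /var/log/glusterfs/some-dir.log
```

If the computed file does not exist, listing /var/log/glusterfs on the client is the quickest way to see what the log is actually called.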

-Krutika

On Fri, Nov 18, 2016 at 7:13 AM, Olivier Lambert <lambert.olivier@xxxxxxxxx> wrote:
Bricks log attached. Where can I find the FUSE client log?

On Fri, Nov 18, 2016 at 2:22 AM, Krutika Dhananjay <kdhananj@xxxxxxxxxx> wrote:
> Could you attach the fuse client and brick logs?
>
> -Krutika
>
> On Fri, Nov 18, 2016 at 6:12 AM, Olivier Lambert <lambert.olivier@xxxxxxxxx> wrote:
>>
>> Okay, I used the exact same config you provided and added an arbiter
>> node (node3).
>>
>> After halting node2, the VM continued to work after a small "lag"/freeze.
>> I restarted node2 and it came back online: OK.
>>
>> Then, after waiting a few minutes, I halted node1. And **just** at this
>> moment, the VM got corrupted (segmentation fault, /var/log folder empty,
>> etc.).
>>
>> dmesg of the VM:
>>
>> [ 1645.852905] EXT4-fs error (device xvda1): htree_dirblock_to_tree:988: inode #19: block 8286: comm bash: bad entry in directory: rec_len is smaller than minimal - offset=0(0), inode=0, rec_len=0, name_len=0
>> [ 1645.854509] Aborting journal on device xvda1-8.
>> [ 1645.855524] EXT4-fs (xvda1): Remounting filesystem read-only
>>
>> After that, I got a lot more "comm bash: bad entry in directory" messages.
>>
>> Here is the current config with all nodes back online:
>>
>> # gluster volume info
>>
>> Volume Name: gv0
>> Type: Replicate
>> Volume ID: 5f15c919-57e3-4648-b20a-395d9fe3d7d6
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x (2 + 1) = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: 10.0.0.1:/bricks/brick1/gv0
>> Brick2: 10.0.0.2:/bricks/brick1/gv0
>> Brick3: 10.0.0.3:/bricks/brick1/gv0 (arbiter)
>> Options Reconfigured:
>> nfs.disable: on
>> performance.readdir-ahead: on
>> transport.address-family: inet
>> features.shard: on
>> features.shard-block-size: 16MB
>> network.remote-dio: enable
>> cluster.eager-lock: enable
>> performance.io-cache: off
>> performance.read-ahead: off
>> performance.quick-read: off
>> performance.stat-prefetch: on
>> performance.strict-write-ordering: off
>> cluster.server-quorum-type: server
>> cluster.quorum-type: auto
>> cluster.data-self-heal: on
>>
>> # gluster volume status
>> Status of volume: gv0
>> Gluster process                             TCP Port  RDMA Port  Online  Pid
>> ------------------------------------------------------------------------------
>> Brick 10.0.0.1:/bricks/brick1/gv0           49152     0          Y       1331
>> Brick 10.0.0.2:/bricks/brick1/gv0           49152     0          Y       2274
>> Brick 10.0.0.3:/bricks/brick1/gv0           49152     0          Y       2355
>> Self-heal Daemon on localhost               N/A       N/A        Y       2300
>> Self-heal Daemon on 10.0.0.3                N/A       N/A        Y       10530
>> Self-heal Daemon on 10.0.0.2                N/A       N/A        Y       2425
>>
>> Task Status of Volume gv0
>> ------------------------------------------------------------------------------
>> There are no active volume tasks
>>
>> On Thu, Nov 17, 2016 at 11:35 PM, Olivier Lambert <lambert.olivier@xxxxxxxxx> wrote:
>> > It's planned to have an arbiter soon :) It was just preliminary tests.
>> >
>> > Thanks for the settings, I'll test this soon and I'll come back to you!
>> >
>> > On Thu, Nov 17, 2016 at 11:29 PM, Lindsay Mathieson <lindsay.mathieson@xxxxxxxxx> wrote:
>> >> On 18/11/2016 8:17 AM, Olivier Lambert wrote:
>> >>>
>> >>> gluster volume info gv0
>> >>>
>> >>> Volume Name: gv0
>> >>> Type: Replicate
>> >>> Volume ID: 2f8658ed-0d9d-4a6f-a00b-96e9d3470b53
>> >>> Status: Started
>> >>> Snapshot Count: 0
>> >>> Number of Bricks: 1 x 2 = 2
>> >>> Transport-type: tcp
>> >>> Bricks:
>> >>> Brick1: 10.0.0.1:/bricks/brick1/gv0
>> >>> Brick2: 10.0.0.2:/bricks/brick1/gv0
>> >>> Options Reconfigured:
>> >>> nfs.disable: on
>> >>> performance.readdir-ahead: on
>> >>> transport.address-family: inet
>> >>> features.shard: on
>> >>> features.shard-block-size: 16MB
>> >>
>> >> When hosting VMs, it's essential to set these options:
>> >>
>> >> network.remote-dio: enable
>> >> cluster.eager-lock: enable
>> >> performance.io-cache: off
>> >> performance.read-ahead: off
>> >> performance.quick-read: off
>> >> performance.stat-prefetch: on
>> >> performance.strict-write-ordering: off
>> >> cluster.server-quorum-type: server
>> >> cluster.quorum-type: auto
>> >> cluster.data-self-heal: on
>> >>
>> >> Also, with replica two and quorum on (required), your volume will become
>> >> read-only when one node goes down, to prevent the possibility of
>> >> split-brain - you *really* want to avoid that :)
>> >>
>> >> I'd recommend a replica 3 volume; that way one node can go down, but the
>> >> other two still form a quorum and the volume will remain r/w.
>> >>
>> >> If the extra disks are not possible, then an arbiter volume can be set up -
>> >> basically dummy files on the third node.
>> >>
>> >> --
>> >> Lindsay Mathieson
>> >>
>> >> _______________________________________________
>> >> Gluster-users mailing list
>> >> Gluster-users@xxxxxxxxxxx
>> >> http://www.gluster.org/mailman/listinfo/gluster-users