Re: Help needed in understanding GlusterFS logs and debugging elasticsearch failures

Sachidananda URS <surs@xxxxxxxxxx> · Mon, 14 Dec 2015 16:13:53 +0530

Hi,

On Sat, Dec 12, 2015 at 2:35 AM, Vijay Bellur <vbellur@xxxxxxxxxx> wrote:

----- Original Message -----

> From: "Sachidananda URS" <surs@xxxxxxxxxx>

> To: "Gluster Devel" <gluster-devel@xxxxxxxxxxx>

> Sent: Friday, December 11, 2015 10:26:04 AM

> Subject:  Help needed in understanding GlusterFS logs and debugging elasticsearch failures

>

> Hi,

>

> I was trying to use a GlusterFS mount as the backend filesystem for storing

> elasticsearch indices.

>

> As far as I understand the filesystem operations, the lucene engine

> does a lot of renames on the index files, and multiple threads read

> from the same file concurrently.

>

> While writing the index, elasticsearch/lucene complains of index corruption,

> the health of the cluster goes red, and all subsequent operations on the

> index fail.

>

> ===================

>

> [2015-12-10 02:43:45,614][WARN ][index.engine             ] [client-2]

> [logstash-2015.12.09][3] failed engine [merge failed]

> org.apache.lucene.index.MergePolicy$MergeException:

> org.apache.lucene.index.CorruptIndexException: checksum failed (hardware

> problem?) : expected=0 actual=6d811d06

> (resource=BufferedChecksumIndexInput(NIOFSIndexInput(path="/mnt/gluster2/rhs/nodes/0/indices/logstash-2015.12.09/3/index/_a7.cfs")

> [slice=_a7_Lucene50_0.doc]))

>         at

>         org.elasticsearch.index.engine.InternalEngine$EngineMergeScheduler$1.doRun(InternalEngine.java:1233)

>         at

>         org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)

>         at

>         java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

>         at

>         java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

>         at java.lang.Thread.run(Thread.java:745)

> Caused by: org.apache.lucene.index.CorruptIndexException: checksum failed

> (hardware problem?) : expected=0 actual=6d811d06

> (resource=BufferedChecksumIndexInput(NIOFSIndexInput(path="/mnt/gluster2/rhs/nodes/0/indices/logstash-2015.12.09/3/index/_a7.cfs")

> [slice=_a7_Lucene50_0.doc]))

>

> =====================
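
One detail worth pulling out of the exception is the `expected`/`actual` pair. To my understanding, Lucene reads `expected` from the checksum stored in the file footer and computes `actual` from the bytes it actually read, so `expected=0` would mean the stored checksum itself came back as zero. A small Python sketch for extracting these fields from a log follows; `parse_checksum_failure` is a hypothetical helper name, and the regex assumes only the message shape shown above, with the wrapped lines rejoined.

```python
import re

# Matches the Lucene CorruptIndexException message shown above. Whitespace is
# matched loosely because the log wraps the message across several lines.
CHECKSUM_RE = re.compile(
    r"checksum failed \(hardware\s+problem\?\)\s*:\s*"
    r"expected=(?P<expected>[0-9a-f]+)\s+actual=(?P<actual>[0-9a-f]+)"
    r".*?path=\"(?P<path>[^\"]+)\"\)\s*\[slice=(?P<slice>[^\]]+)\]",
    re.DOTALL,
)


def parse_checksum_failure(message):
    """Return expected/actual checksums (as ints), file path and slice, or None."""
    match = CHECKSUM_RE.search(message)
    if match is None:
        return None
    return {
        "expected": int(match.group("expected"), 16),
        "actual": int(match.group("actual"), 16),
        "path": match.group("path"),
        "slice": match.group("slice"),
    }


# The message from the log above, rejoined into one string.
SAMPLE_MESSAGE = (
    "org.apache.lucene.index.CorruptIndexException: checksum failed (hardware "
    "problem?) : expected=0 actual=6d811d06 "
    "(resource=BufferedChecksumIndexInput(NIOFSIndexInput("
    'path="/mnt/gluster2/rhs/nodes/0/indices/logstash-2015.12.09/3/index/_a7.cfs") '
    "[slice=_a7_Lucene50_0.doc]))"
)

if __name__ == "__main__":
    print(parse_checksum_failure(SAMPLE_MESSAGE))
```

Collecting these per failure makes it easy to see whether `expected` is always zero or varies between files.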

>

>

> The server logs do not show anything. The client logs are full of messages like:

>

>

>

> [2015-12-03 18:44:17.882032] I [MSGID: 109066] [dht-rename.c:1410:dht_rename]

> 0-esearch-dht: renaming

> /rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-61881676454442626.tlog

> (hash=esearch-replicate-0/cache=esearch-replicate-0) =>

> /rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-311.ckp

> (hash=esearch-replicate-1/cache=<nul>)

> [2015-12-03 18:45:31.276316] I [MSGID: 109066] [dht-rename.c:1410:dht_rename]

> 0-esearch-dht: renaming

> /rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-2384654015514619399.tlog

> (hash=esearch-replicate-0/cache=esearch-replicate-0) =>

> /rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-312.ckp

> (hash=esearch-replicate-0/cache=<nul>)

> [2015-12-03 18:45:31.587660] I [MSGID: 109066] [dht-rename.c:1410:dht_rename]

> 0-esearch-dht: renaming

> /rhs/nodes/0/indices/logstash-2015.12.03/4/translog/translog-4957943728738197940.tlog

> (hash=esearch-replicate-0/cache=esearch-replicate-0) =>

> /rhs/nodes/0/indices/logstash-2015.12.03/4/translog/translog-312.ckp

> (hash=esearch-replicate-0/cache=<nul>)

> [2015-12-03 18:46:48.424605] I [MSGID: 109066] [dht-rename.c:1410:dht_rename]

> 0-esearch-dht: renaming

> /rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-1731620600607498012.tlog

> (hash=esearch-replicate-1/cache=esearch-replicate-1) =>

> /rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-313.ckp

> (hash=esearch-replicate-1/cache=<nul>)

> [2015-12-03 18:46:48.466558] I [MSGID: 109066] [dht-rename.c:1410:dht_rename]

> 0-esearch-dht: renaming

> /rhs/nodes/0/indices/logstash-2015.12.03/4/translog/translog-5214949393126318982.tlog

> (hash=esearch-replicate-1/cache=esearch-replicate-1) =>

> /rhs/nodes/0/indices/logstash-2015.12.03/4/translog/translog-313.ckp

> (hash=esearch-replicate-1/cache=<nul>)

> [2015-12-03 18:48:06.314138] I [MSGID: 109066] [dht-rename.c:1410:dht_rename]

> 0-esearch-dht: renaming

> /rhs/nodes/0/indices/logstash-2015.12.03/4/translog/translog-9110755229226773921.tlog

> (hash=esearch-replicate-0/cache=esearch-replicate-0) =>

> /rhs/nodes/0/indices/logstash-2015.12.03/4/translog/translog-314.ckp

> (hash=esearch-replicate-1/cache=<nul>)

> [2015-12-03 18:48:06.332919] I [MSGID: 109066] [dht-rename.c:1410:dht_rename]

> 0-esearch-dht: renaming

> /rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-5193443717817038271.tlog

> (hash=esearch-replicate-1/cache=esearch-replicate-1) =>

> /rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-314.ckp

> (hash=esearch-replicate-1/cache=<nul>)

> [2015-12-03 18:49:24.694263] I [MSGID: 109066] [dht-rename.c:1410:dht_rename]

> 0-esearch-dht: renaming

> /rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-2750483795035758522.tlog

> (hash=esearch-replicate-1/cache=esearch-replicate-1) =>

> /rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-315.ckp

> (hash=esearch-replicate-0/cache=<nul>)

>

> ==============================================================
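
The rename messages above are easier to reason about once parsed. Below is a minimal Python sketch, assuming each wrapped log entry has first been rejoined into a single line. The names `parse_rename` and `crosses_subvolume` are made up for illustration and are not part of GlusterFS. It flags renames whose new name hashes to a different replicate subvolume than the one currently holding the file. As far as I understand DHT, in those cases the data stays on the cached subvolume and the hashed subvolume only gets a pointer (linkto) file.

```python
import re

# Pattern for the dht_rename INFO message (MSGID 109066) as it appears in the
# client log, after rejoining the wrapped lines into one line per entry.
RENAME_RE = re.compile(
    r"\[(?P<ts>[^\]]+)\] I \[MSGID: 109066\] "
    r"\[dht-rename\.c:\d+:dht_rename\] \S+: renaming "
    r"(?P<src>\S+) \(hash=(?P<src_hash>[^/]+)/cache=(?P<src_cache>[^)]+)\) => "
    r"(?P<dst>\S+) \(hash=(?P<dst_hash>[^/]+)/cache=(?P<dst_cache>[^)]+)\)"
)


def parse_rename(line):
    """Return the fields of one dht_rename message as a dict, or None."""
    match = RENAME_RE.search(line)
    return match.groupdict() if match else None


def crosses_subvolume(entry):
    """True if the new name hashes to a subvolume other than the one holding the data."""
    return entry["src_cache"] != entry["dst_hash"]


# The first two entries from the client log above, one line each.
SAMPLE_LINES = [
    "[2015-12-03 18:44:17.882032] I [MSGID: 109066] [dht-rename.c:1410:dht_rename] "
    "0-esearch-dht: renaming "
    "/rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-61881676454442626.tlog "
    "(hash=esearch-replicate-0/cache=esearch-replicate-0) => "
    "/rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-311.ckp "
    "(hash=esearch-replicate-1/cache=<nul>)",
    "[2015-12-03 18:45:31.276316] I [MSGID: 109066] [dht-rename.c:1410:dht_rename] "
    "0-esearch-dht: renaming "
    "/rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-2384654015514619399.tlog "
    "(hash=esearch-replicate-0/cache=esearch-replicate-0) => "
    "/rhs/nodes/0/indices/logstash-2015.12.03/1/translog/translog-312.ckp "
    "(hash=esearch-replicate-0/cache=<nul>)",
]

if __name__ == "__main__":
    for line in SAMPLE_LINES:
        entry = parse_rename(line)
        print(entry["ts"], entry["dst"], "cross-subvolume:", crosses_subvolume(entry))
```

Running this over the whole client log and comparing timestamps with the Lucene failures might show whether the corruption lines up with cross-subvolume renames, though the messages quoted here concern translog files rather than the `.cfs` segment named in the exception.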

>

> The same setup works well on any of the disk filesystems.

> This is a 2 x 2 distributed-replicate setup:

>

> # gluster vol info

>

> Volume Name: esearch

> Type: Distributed-Replicate

> Volume ID: 4e4b205e-28ed-4f9e-9fa4-0d020428dede

> Status: Started

> Number of Bricks: 2 x 2 = 4

> Transport-type: tcp,rdma

> Bricks:

> Brick1: 10.70.47.171:/gluster/brick1

> Brick2: 10.70.47.187:/gluster/brick1

> Brick3: 10.70.47.121:/gluster/brick1

> Brick4: 10.70.47.172:/gluster/brick1

> Options Reconfigured:

> performance.read-ahead: off

> performance.write-behind: off
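
Since the client log refers to subvolumes such as `esearch-replicate-0` rather than to bricks, it helps to map one onto the other. As far as I know, consecutive bricks in the `gluster vol info` listing form a replica set, so with a replica count of 2 the first two bricks would back `esearch-replicate-0` and the last two `esearch-replicate-1`. Here is a short Python sketch of that grouping. The function name `replica_sets` is hypothetical, and the `<volname>-replicate-<n>` naming is taken from the log messages above.

```python
def replica_sets(volname, bricks, replica_count):
    """Group bricks, in listing order, into replica sets keyed by subvolume name."""
    if replica_count <= 0 or len(bricks) % replica_count != 0:
        raise ValueError("brick count must be a multiple of the replica count")
    return {
        f"{volname}-replicate-{start // replica_count}": bricks[start:start + replica_count]
        for start in range(0, len(bricks), replica_count)
    }


# Bricks in the order reported by `gluster vol info` above.
BRICKS = [
    "10.70.47.171:/gluster/brick1",
    "10.70.47.187:/gluster/brick1",
    "10.70.47.121:/gluster/brick1",
    "10.70.47.172:/gluster/brick1",
]

if __name__ == "__main__":
    for name, members in replica_sets("esearch", BRICKS, 2).items():
        print(name, members)
```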

>

>

> I need a little help understanding these failures. Let me know if you need

> further information on the setup or access to the system to debug further. I've

> attached the debug logs for further investigation.

>

Would it be possible to turn off all the performance translators (md-cache, quickread, io-cache etc.) and check if the same problem persists? Collecting strace of the elasticsearch process that does I/O on gluster can also help.

I turned off all the performance xlators:

# gluster vol info

Volume Name: esearch
Type: Distributed-Replicate
Volume ID: 4e4b205e-28ed-4f9e-9fa4-0d020428dede
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp,rdma
Bricks:
Brick1: 10.70.47.171:/gluster/brick1
Brick2: 10.70.47.187:/gluster/brick1
Brick3: 10.70.47.121:/gluster/brick1
Brick4: 10.70.47.172:/gluster/brick1
Options Reconfigured:
performance.stat-prefetch: off
performance.md-cache-timeout: 0
performance.quick-read: off
performance.io-cache: off
performance.read-ahead: off
performance.write-behind: off

The problem still persists. Attaching strace logs.
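
For anyone wanting to reproduce the configuration above, here is a rough Python sketch that builds the corresponding `gluster volume set` commands and an `strace` invocation. By default it only prints the commands. The helper names, the `dry_run` switch, the PID, and the choice of strace flags are illustrative assumptions, not a record of exactly what was run here.

```python
import subprocess

# Options and values as listed under "Options Reconfigured" above.
PERF_OPTIONS = {
    "performance.stat-prefetch": "off",
    "performance.md-cache-timeout": "0",
    "performance.quick-read": "off",
    "performance.io-cache": "off",
    "performance.read-ahead": "off",
    "performance.write-behind": "off",
}


def volume_set_commands(volume, options=PERF_OPTIONS):
    """Build one `gluster volume set` command per option."""
    return [["gluster", "volume", "set", volume, key, value]
            for key, value in options.items()]


def strace_command(pid, output="elastic_strace.log"):
    """Build an strace command attaching to a running process.

    -f follows threads and children, -tt adds microsecond timestamps,
    -o writes the trace to a file, -p attaches to the given PID.
    """
    return ["strace", "-f", "-tt", "-o", output, "-p", str(pid)]


def apply(commands, dry_run=True):
    """Print the commands, or execute them when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    apply(volume_set_commands("esearch"))
    apply([strace_command(12345)])  # 12345 is a placeholder PID
```

Keeping `dry_run=True` makes it possible to review the exact commands before touching a live volume.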

-sac

Attachment: elastic_strace.log.bz2

Description: BZip2 compressed data