Re: Jewel 10.2.2 - Error when flushing journal

Hi,

> root@:~# ceph-osd -i 12 --flush-journal
> SG_IO: questionable sense data, results may be incorrect
> SG_IO: questionable sense data, results may be incorrect

As far as I understand, these lines are a hdparm warning (the OSD uses the hdparm command to query the journal device's write cache state).
The message means hdparm was unable to reliably determine whether the drive write cache is enabled. This might indicate a hardware problem.
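You can also query the cache state by hand to see whether hdparm behaves sanely outside of the OSD; a quick sketch (the device path is only an example, point it at your actual journal device):

hdparm -W /dev/sdb      # reports whether the drive write cache is enabled
smartctl -a /dev/sdb    # SMART health and error log for the same device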

> ceph-osd -i 12 --flush-journal

I think it's a good idea to a) check the journal drive (smartctl), b) capture a more verbose log,
i.e. add this to ceph.conf

[osd]
debug filestore = 20/20
debug journal = 20/20

and try flushing the journal once more (note: this won't fix the problem; the point is just to get a useful log).
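If you prefer not to touch ceph.conf, the same debug levels can (as far as I know) also be passed on the command line for a one-off run, roughly like this (the NVMe device path is only an example, and smartctl needs a reasonably recent smartmontools for NVMe devices):

smartctl -a /dev/nvme0n1
ceph-osd -i 12 --flush-journal --debug_filestore 20/20 --debug_journal 20/20

The verbose output should end up in the usual OSD log, e.g. /var/log/ceph/ceph-osd.12.log.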

Best regards,
      Alexey


On Wed, Sep 7, 2016 at 6:48 PM, Mehmet <ceph@xxxxxxxxxx> wrote:
Hey again,

now I have stopped my osd.12 via

root@:~# systemctl stop ceph-osd@12

and when I flush the journal...

root@:~# ceph-osd -i 12 --flush-journal
SG_IO: questionable sense data, results may be incorrect
SG_IO: questionable sense data, results may be incorrect
*** Caught signal (Segmentation fault) **
 in thread 7f421d49d700 thread_name:ceph-osd
 ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
 1: (()+0x96bdde) [0x564545e65dde]
 2: (()+0x113d0) [0x7f422277e3d0]
 3: [0x56455055a3c0]
2016-09-07 17:42:58.128839 7f421d49d700 -1 *** Caught signal (Segmentation fault) **
 in thread 7f421d49d700 thread_name:ceph-osd

 ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
 1: (()+0x96bdde) [0x564545e65dde]
 2: (()+0x113d0) [0x7f422277e3d0]
 3: [0x56455055a3c0]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

     0> 2016-09-07 17:42:58.128839 7f421d49d700 -1 *** Caught signal (Segmentation fault) **
 in thread 7f421d49d700 thread_name:ceph-osd

 ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
 1: (()+0x96bdde) [0x564545e65dde]
 2: (()+0x113d0) [0x7f422277e3d0]
 3: [0x56455055a3c0]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

Segmentation fault

The logfile with further information
- http://slexy.org/view/s2T8AohMfU
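If a disassembly helps (as the NOTE line asks), I can generate one from the installed binary, something like this (assuming the packaged binary lives at /usr/bin/ceph-osd):

root@:~# objdump -rdS /usr/bin/ceph-osd > /tmp/ceph-osd.objdump

but the output is quite large.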

I guess I will get the same message when I flush the other journals.

- Mehmet


On 2016-09-07 13:23, Mehmet wrote:
Hello ceph people,

yesterday I stopped one of my OSDs via

root@:~# systemctl stop ceph-osd@10

and tried to flush the journal for this OSD via

root@:~# ceph-osd -i 10 --flush-journal

but got this output on the screen:

SG_IO: questionable sense data, results may be incorrect
SG_IO: questionable sense data, results may be incorrect
*** Caught signal (Segmentation fault) **
 in thread 7fd846333700 thread_name:ceph-osd
 ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
 1: (()+0x96bdde) [0x55f33b862dde]
 2: (()+0x113d0) [0x7fd84b6143d0]
 3: [0x55f345bbff80]
2016-09-06 22:12:51.850739 7fd846333700 -1 *** Caught signal
(Segmentation fault) **
 in thread 7fd846333700 thread_name:ceph-osd

 ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
 1: (()+0x96bdde) [0x55f33b862dde]
 2: (()+0x113d0) [0x7fd84b6143d0]
 3: [0x55f345bbff80]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is
needed to interpret this.

     0> 2016-09-06 22:12:51.850739 7fd846333700 -1 *** Caught signal
(Segmentation fault) **
 in thread 7fd846333700 thread_name:ceph-osd

 ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
 1: (()+0x96bdde) [0x55f33b862dde]
 2: (()+0x113d0) [0x7fd84b6143d0]
 3: [0x55f345bbff80]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is
needed to interpret this.

Segmentation fault

This is the logfile from my osd.10 with further information
- http://slexy.org/view/s21tfwQ1fZ

Today I stopped another OSD (osd.11):

root@:~# systemctl stop ceph-osd@11

I did not get the above-mentioned error, but this:

root@:~# ceph-osd -i 11 --flush-journal
SG_IO: questionable sense data, results may be incorrect
SG_IO: questionable sense data, results may be incorrect
2016-09-07 13:19:39.729894 7f3601a298c0 -1 flushed journal
/var/lib/ceph/osd/ceph-11/journal for object store
/var/lib/ceph/osd/ceph-11

This is the logfile from my osd.11 with further information
- http://slexy.org/view/s2AlEhV38m

This is not really an issue for me at the moment, because I will set up the
journal partitions again with 20GB (instead of the current 5GB) and then
bring the OSDs back up, roughly as sketched below.
But I thought I should report this error to the mailing list.
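The procedure per OSD would be roughly the following (partition handling is only sketched here; I would also set "osd journal size = 20480" in ceph.conf beforehand):

root@:~# systemctl stop ceph-osd@12
root@:~# ceph-osd -i 12 --flush-journal
(recreate the journal partition on the NVMe device with 20GB, e.g. with parted)
root@:~# ceph-osd -i 12 --mkjournal
root@:~# systemctl start ceph-osd@12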

This is my Setup:

*Software/OS*
- Jewel
#> ceph tell osd.* version | grep version | uniq
"version": "ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)"

#> ceph tell mon.* version
[...] ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)

- Ubuntu 16.04 LTS on all OSD and MON servers
#> uname -a
31.08.2016: Linux reilif 4.4.0-36-generic #55-Ubuntu SMP Thu Aug 11
18:01:55 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

*Server*
3x OSD Server, each with

- 2x Intel(R) Xeon(R) CPU E5-2603 v3 @ 1.60GHz ==> 12 Cores, no Hyper-Threading

- 64GB RAM
- 10x 4TB HGST 7K4000 SAS2 (6Gb/s) disks as OSDs

- 1x INTEL SSDPEDMD400G4 (Intel DC P3700 NVMe) as Journaling Device
for 10-12 Disks

- 1x Samsung SSD 840/850 Pro only for the OS

3x MON Server
- Two of them with 1x Intel(R) Xeon(R) CPU E3-1265L V2 @ 2.50GHz (4 Cores, 8 Threads)
- The third one has 2x Intel(R) Xeon(R) CPU L5430 @ 2.66GHz ==> 8 Cores, no Hyper-Threading

- 32 GB RAM
- 1x Raid 10 (4 Disks)

*Network*
- Currently each server and client has one active connection @ 1x 1GbE; soon
this will be changed to 2x 10GbE fibre, perhaps with LACP where possible (a
rough sketch of the bond config is below).

- We do not use jumbo frames yet.

- Public and cluster network Ceph traffic currently all goes through this
single active 1GbE interface on each server.
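
For the planned LACP setup, the bond would probably look roughly like this in /etc/network/interfaces on Ubuntu 16.04 (ifenslave installed; interface names and addresses are placeholders):

auto bond0
iface bond0 inet static
    address 192.168.0.11
    netmask 255.255.255.0
    bond-mode 802.3ad
    bond-miimon 100
    bond-slaves ens1f0 ens1f1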

hf
- Mehmet
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com