Hello Mark,

On Thu, 04 Jun 2015 20:34:55 +1200 Mark Kirkwood wrote:

> Sorry Christian,
>
> I did briefly wonder, then thought, oh yeah, that fix is already
> merged in... However - on reflection, perhaps *not* in the 0.80
> tree... doh!
>

No worries, I'm just happy to hear that you think it's the same thing as
well.

I upgraded to 0.80.9 today (fun fact: NO predicted or actual data
movement after setting "straw_calc_version 1" and doing a reweight all).
Should it happen again, I know who and where to poke. ^^

Christian

> On 04/06/15 18:57, Christian Balzer wrote:
> >
> > Hello,
> >
> > Actually, after going through the changelogs with a fine comb and the
> > ole Mark I eyeball, I think I might be seeing this:
> > ---
> > osd: fix journal direct-io shutdown (#9073 Mark Kirkwood, Ma Jianpeng,
> > Somnath Roy)
> > ---
> >
> > The details in the various related bug reports certainly make it look
> > related.
> > Funny that nobody involved in those bug reports noticed the
> > similarity.
> >
> > Now I wouldn't have installed 0.80.8 anyway, due to the speed
> > regression bug, but now that 0.80.9 has made it into Jessie backports
> > I shall install that tomorrow and hopefully never see that problem
> > again.
> >
> > Christian
> >
> > On Thu, 28 May 2015 07:01:15 -0700 Gregory Farnum wrote:
> >
> >> On Thu, May 28, 2015 at 12:22 AM, Christian Balzer <chibi@xxxxxxx>
> >> wrote:
> >>>
> >>> Hello Greg,
> >>>
> >>> On Wed, 27 May 2015 22:53:43 -0700 Gregory Farnum wrote:
> >>>
> >>>> The description of the logging abruptly ending and the journal
> >>>> being bad really sounds like part of the disk is going back in
> >>>> time. I'm not sure if XFS internally is set up in such a way that
> >>>> something like losing part of its journal would allow that?
> >>>>
> >>> I'm special. ^o^
> >>> No XFS, EXT4. As stated in the original thread, below.
> >>> And the (OSD) journal is a raw partition on a DC S3700.
> >>>
> >>> And since there was at least a 30-second pause between the
> >>> completion of the "/etc/init.d/ceph stop" and the issuing of the
> >>> shutdown command, the logging abruptly ending seems unlikely to be
> >>> related to the shutdown at all.
> >>
> >> Oh, sorry...
> >> I happened to read this article last night:
> >> http://lwn.net/SubscriberLink/645720/01149aa7c58954eb/
> >>
> >> Depending on configuration (I think you'd need to have a
> >> journal-as-file) you could be experiencing that. And again, not many
> >> people use ext4, so who knows what other ways there are of things
> >> being broken that nobody else has seen yet.
> >>
> >>>
> >>>> If any of the OSD developers have the time, it's conceivable a
> >>>> copy of the OSD journal would be enlightening (if e.g. the header
> >>>> offsets are wrong but there are a bunch of valid journal entries),
> >>>> but this is two reports of this issue from you and none very
> >>>> similar from anybody else. I'm still betting on something in the
> >>>> software or hardware stack misbehaving. (There aren't that many
> >>>> people running Debian; there are lots of people running Ubuntu and
> >>>> we find bad XFS kernels there not infrequently; I think you're
> >>>> hitting something like that.)
> >>>>
> >>> There should be no file system involved with the raw partition SSD
> >>> journal, n'est-ce pas?
> >>
> >> ...and I guess probably you aren't, since you are using partitions.
> >>
> >>>
> >>> The hardware is vastly different: the previous case was on an AMD
> >>> system with onboard SATA (SP5100), this one is a SM storage goat
> >>> with LSI 3008.
> >>>
> >>> The only thing they have in common is the Ceph version, 0.80.7 (via
> >>> the Debian repository, not Ceph), and Debian Jessie as the OS with
> >>> kernel 3.16 (though there were minor updates to that between those
> >>> incidents, backported fixes).
> >>>
> >>> A copy of the journal would consist of the entire 10GB partition,
> >>> since we don't know where in the loop it was at the time, right?
> >>
> >> Yeah.
> >>


-- 
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Global OnLine Japan/Fusion Communications
http://www.gol.com/
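
P.S. For the archives, since the straw_calc_version change comes up now
and then: below is roughly what that change looks like on a firefly
cluster. This is only a sketch under a few assumptions -- the file names
are made up, the rule and replica numbers in the test step have to match
your own pools, and whether the final "reweight all" is wanted depends
on your tree.

  # 1. Pull the current CRUSH map and decompile it.
  ceph osd getcrushmap -o crushmap.bin
  crushtool -d crushmap.bin -o crushmap.txt

  # 2. Edit crushmap.txt and set (or add) in the tunables section:
  #        tunable straw_calc_version 1

  # 3. Recompile, then compare old vs. new mappings *before* injecting
  #    anything; identical output means no data movement is expected.
  crushtool -c crushmap.txt -o crushmap.new
  crushtool -i crushmap.bin --test --show-mappings --rule 0 --num-rep 3 > old.map
  crushtool -i crushmap.new --test --show-mappings --rule 0 --num-rep 3 > new.map
  diff old.map new.map

  # 4. Inject the new map.
  ceph osd setcrushmap -i crushmap.new

  # 5. The "reweight all" bit; as far as I know this recalculates the
  #    bucket weights, which is when the new straw calculation kicks in.
  ceph osd crush reweight-all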
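
And regarding the point above that a copy of the journal would mean the
whole 10GB partition: grabbing it is just imaging the raw device. A
sketch only -- the OSD id and device name below are placeholders, and
the OSD should be stopped first so the journal is quiescent:

  # osd.12 and /dev/sdb1 are made up; use the real id and the real
  # journal partition of the affected OSD.
  /etc/init.d/ceph stop osd.12
  dd if=/dev/sdb1 of=/tmp/osd.12-journal.img bs=1M
  # A mostly-idle journal compresses well before shipping it anywhere.
  gzip /tmp/osd.12-journal.img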