[Moving to ceph-devel]

On Wed, 2 May 2018, Stefan Kooman wrote:
> Hi,
>
> Quoting Stefan Kooman (stefan@xxxxxx):
> > Hi,
> >
> > We see the following in the logs after we start a scrub for some osds:
> >
> > ceph-osd.2.log:2017-12-14 06:50:47.180344 7f0f47db2700 0 log_channel(cluster) log [DBG] : 1.2d8 scrub starts
> > ceph-osd.2.log:2017-12-14 06:50:47.180915 7f0f47db2700 -1 osd.2 pg_epoch: 11897 pg[1.2d8( v 11890'165209 (3221'163647,11890'165209] local-lis/les=11733/11734 n=67 ec=132/132 lis/c 11733/11733 les/c/f 11734/11734/0 11733/11733/11733) [2,45,31] r=0 lpr=11733 crt=11890'165209 lcod 11890'165208 mlcod 11890'165208 active+clean+scrubbing] _scan_snaps no head for 1:1b518155:::rbd_data.620652ae8944a.0000000000000126:29 (have MIN)
> > ceph-osd.2.log:2017-12-14 06:50:47.180929 7f0f47db2700 -1 osd.2 pg_epoch: 11897 pg[1.2d8( v 11890'165209 (3221'163647,11890'165209] local-lis/les=11733/11734 n=67 ec=132/132 lis/c 11733/11733 les/c/f 11734/11734/0 11733/11733/11733) [2,45,31] r=0 lpr=11733 crt=11890'165209 lcod 11890'165208 mlcod 11890'165208 active+clean+scrubbing] _scan_snaps no head for 1:1b518155:::rbd_data.620652ae8944a.0000000000000126:14 (have MIN)
> > ceph-osd.2.log:2017-12-14 06:50:47.180941 7f0f47db2700 -1 osd.2 pg_epoch: 11897 pg[1.2d8( v 11890'165209 (3221'163647,11890'165209] local-lis/les=11733/11734 n=67 ec=132/132 lis/c 11733/11733 les/c/f 11734/11734/0 11733/11733/11733) [2,45,31] r=0 lpr=11733 crt=11890'165209 lcod 11890'165208 mlcod 11890'165208 active+clean+scrubbing] _scan_snaps no head for 1:1b518155:::rbd_data.620652ae8944a.0000000000000126:a (have MIN)
> > ceph-osd.2.log:2017-12-14 06:50:47.214198 7f0f43daa700 0 log_channel(cluster) log [DBG] : 1.2d8 scrub ok
> >
> > So finally it logs "scrub ok", but what does "_scan_snaps no head for ..." mean?
> > Does this indicate a problem?
>
> Still seeing this issue on a freshly installed luminous cluster. I
> *think* it either has to do with "cloned" RBDs that get snapshots by
> themselves or RBDs that are cloned from a snapshot.
>
> Any dev that wants to debug this behaviour if I'm able to reliably
> reproduce this?

I'm interested! The best thing would be if you can reproduce from an
empty pool or new image with debug osd = 20 and debug ms = 1 on the
osds... is that feasible in your environment?
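
Something along these lines should do it. This is only a rough sketch
(not tested here; the pool and image names are placeholders), but it
covers the clone-with-snapshots case you suspect:

  # turn up logging on the osds (or set it in ceph.conf under [osd])
  ceph tell osd.* injectargs '--debug_osd 20 --debug_ms 1'

  # fresh pool, a parent image with a protected snapshot, a clone of
  # that snapshot, and a snapshot on the clone
  ceph osd pool create scrubtest 32
  ceph osd pool application enable scrubtest rbd
  rbd create scrubtest/parent --size 1024
  rbd snap create scrubtest/parent@snap1
  rbd snap protect scrubtest/parent@snap1
  rbd clone scrubtest/parent@snap1 scrubtest/child
  rbd snap create scrubtest/child@snap2

  # write some data to the images so the pgs have rbd_data objects,
  # then scrub each pg in the pool and watch the osd logs
  ceph pg scrub <pgid>

sage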