Re: Luminous 12.2.12 - filestore OSDs take an hour to boot

Thanks for this info -- adding it to our list of reasons never to use
FileStore again.
In your case, are you able to migrate?


On Tue, Jul 14, 2020 at 3:13 PM Eric Smith <Eric.Smith@xxxxxxxxxx> wrote:
>
> FWIW BlueStore is not affected by this problem!
>
> -----Original Message-----
> From: Eric Smith <Eric.Smith@xxxxxxxxxx>
> Sent: Saturday, July 11, 2020 6:40 AM
> To: ceph-users@xxxxxxx
> Subject:  Re: Luminous 12.2.12 - filestore OSDs take an hour to boot
>
> It does appear that long file names are a real problem with FileStore. We have a cluster where 99% of the objects have names longer than N (220+?) characters, so FileStore truncates the file name (as seen below with "_<sha-sum>_0_long") and stores the full object name in the object's xattrs. During boot the OSD goes out to lunch for an amount of time that grows with the number of on-disk objects that meet this criterion (with roughly 2.4 million such objects, the OSD takes over an hour to boot). I plan on testing this same scenario with BlueStore to see if it's also susceptible to these boot / read issues.
>
> Eric
>
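For anyone wanting to gauge how exposed an OSD is before deciding whether to migrate, here is a rough sketch that counts the truncated names described above. It assumes only what the thread shows: that truncated files end in "_long". The function name count_long_names and the idea of pointing it at the OSD's data directory (for example somewhere under /var/lib/ceph/osd/ceph-1) are my own illustration, not anything Ceph ships.

```python
import os
from collections import Counter


def count_long_names(root, suffix="_long"):
    """Walk `root` and count files whose names end in `suffix`.

    Returns (total_files, long_files, per_dir), where per_dir maps each
    directory to the number of long-named files found directly in it.
    The "_long" suffix is taken from the example file name in this thread.
    """
    total = 0
    long_files = 0
    per_dir = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            total += 1
            if name.endswith(suffix):
                long_files += 1
                per_dir[dirpath] += 1
    return total, long_files, per_dir
```

Calling count_long_names on a PG directory and printing per_dir.most_common(10) should show which subdirectories hold the most long-named files. Run it against a stopped OSD or a copy if you are worried about adding load to a live one.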
> -----Original Message-----
> From: Eric Smith <Eric.Smith@xxxxxxxxxx>
> Sent: Friday, July 10, 2020 1:46 PM
> To: ceph-users@xxxxxxx
> Subject:  Re: Luminous 12.2.12 - filestore OSDs take an hour to boot
>
> For what it's worth - all of our objects are generating LONG named object files like so...
>
> \uABCD\ucontent.\srecording\swzdchd\u\utnda-trg-1008007-wzdchd-216203706303281120-230932949-1593482400-159348660000000001\swzdchd\u\utpc2-tp1-1008007-wzdchd-216203706303281120-230932949-1593482400-159348660000000001\u\uwzdchd3._0bfd7c716b839cb7b3ad_0_long
>
> Does this matter? AFAICT it sees this as a long file name and has to look up the object name in the xattrs. Is that bad?
>
> -----Original Message-----
> From: Eric Smith <Eric.Smith@xxxxxxxxxx>
> Sent: Friday, July 10, 2020 6:59 AM
> To: ceph-users@xxxxxxx
> Subject:  Luminous 12.2.12 - filestore OSDs take an hour to boot
>
> I have a cluster running Luminous 12.2.12 with FileStore and it takes my OSDs somewhere around an hour to start (they do start successfully, eventually). I have the following log entries that seem to show the OSD process descending into the PG directory on disk and building an object list of some sort:
>
> 2020-07-09 18:29:28.017207 7f3b680afd80 20 osd.1 137390  clearing temps in 8.14ads3_head pgid 8.14ads3
> 2020-07-09 18:29:28.017211 7f3b680afd80 20 filestore(/var/lib/ceph/osd/ceph-1) collection_list(5012): pool is 8 shard is 3 pgid 8.14ads3
> 2020-07-09 18:29:28.017213 7f3b680afd80 10 filestore(/var/lib/ceph/osd/ceph-1) collection_list(5020): first checking temp pool
> 2020-07-09 18:29:28.017215 7f3b680afd80 20 filestore(/var/lib/ceph/osd/ceph-1) collection_list(5012): pool is -10 shard is 3 pgid 8.14ads3
> 2020-07-09 18:29:28.017221 7f3b680afd80 20 _collection_list_partial start:GHMIN end:GHMAX-64 ls.size 0
> 2020-07-09 18:29:28.017263 7f3b680afd80 20 filestore(/var/lib/ceph/osd/ceph-1) objects: []
> 2020-07-09 18:29:28.017268 7f3b680afd80 10 filestore(/var/lib/ceph/osd/ceph-1) collection_list(5028): fall through to non-temp collection, start 3#-1:00000000::::0#
> 2020-07-09 18:29:28.017272 7f3b680afd80 20 _collection_list_partial start:3#-1:00000000::::0# end:GHMAX-64 ls.size 0
> 2020-07-09 18:29:28.038124 7f3b680afd80 20 list_by_hash_bitwise prefix D
> 2020-07-09 18:29:28.058679 7f3b680afd80 20 list_by_hash_bitwise prefix DA
> 2020-07-09 18:29:28.069432 7f3b680afd80 20 list_by_hash_bitwise prefix DA4
> 2020-07-09 18:29:29.789598 7f3b51a87700 20 filestore(/var/lib/ceph/osd/ceph-1) sync_entry(4010): woke after 5.000074
> 2020-07-09 18:29:29.789634 7f3b51a87700 10 journal commit_start max_applied_seq 53085082, open_ops 0
> 2020-07-09 18:29:29.789639 7f3b51a87700 10 journal commit_start blocked, all open_ops have completed
> 2020-07-09 18:29:29.789641 7f3b51a87700 10 journal commit_start nothing to do
> 2020-07-09 18:29:29.789663 7f3b51a87700 20 filestore(/var/lib/ceph/osd/ceph-1) sync_entry(3994):  waiting for max_interval 5.000000
> 2020-07-09 18:29:34.789815 7f3b51a87700 20 filestore(/var/lib/ceph/osd/ceph-1) sync_entry(4010): woke after 5.000109
> 2020-07-09 18:29:34.789898 7f3b51a87700 10 journal commit_start max_applied_seq 53085082, open_ops 0
> 2020-07-09 18:29:34.789902 7f3b51a87700 10 journal commit_start blocked, all open_ops have completed
> 2020-07-09 18:29:34.789906 7f3b51a87700 10 journal commit_start nothing to do
> 2020-07-09 18:29:34.789939 7f3b51a87700 20 filestore(/var/lib/ceph/osd/ceph-1) sync_entry(3994):  waiting for max_interval 5.000000
> 2020-07-09 18:29:38.651689 7f3b680afd80 20 list_by_hash_bitwise prefix DA41
> 2020-07-09 18:29:39.790069 7f3b51a87700 20 filestore(/var/lib/ceph/osd/ceph-1) sync_entry(4010): woke after 5.000128
> 2020-07-09 18:29:39.790090 7f3b51a87700 10 journal commit_start max_applied_seq 53085082, open_ops 0
> 2020-07-09 18:29:39.790092 7f3b51a87700 10 journal commit_start blocked, all open_ops have completed
> 2020-07-09 18:29:39.790093 7f3b51a87700 10 journal commit_start nothing to do
> 2020-07-09 18:29:39.790102 7f3b51a87700 20 filestore(/var/lib/ceph/osd/ceph-1) sync_entry(3994):  waiting for max_interval 5.000000
> 2020-07-09 18:29:44.790200 7f3b51a87700 20 filestore(/var/lib/ceph/osd/ceph-1) sync_entry(4010): woke after 5.000095
> 2020-07-09 18:29:44.790256 7f3b51a87700 10 journal commit_start max_applied_seq 53085082, open_ops 0
> 2020-07-09 18:29:44.790265 7f3b51a87700 10 journal commit_start blocked, all open_ops have completed
> 2020-07-09 18:29:44.790268 7f3b51a87700 10 journal commit_start nothing to do
> 2020-07-09 18:29:44.790286 7f3b51a87700 20 filestore(/var/lib/ceph/osd/ceph-1) sync_entry(3994):  waiting for max_interval 5.000000
> 2020-07-09 18:29:49.790353 7f3b51a87700 20 filestore(/var/lib/ceph/osd/ceph-1) sync_entry(4010): woke after 5.000066
> 2020-07-09 18:29:49.790374 7f3b51a87700 10 journal commit_start max_applied_seq 53085082, open_ops 0
> 2020-07-09 18:29:49.790376 7f3b51a87700 10 journal commit_start blocked, all open_ops have completed
> 2020-07-09 18:29:49.790378 7f3b51a87700 10 journal commit_start nothing to do
> 2020-07-09 18:29:49.790387 7f3b51a87700 20 filestore(/var/lib/ceph/osd/ceph-1) sync_entry(3994):  waiting for max_interval 5.000000
> 2020-07-09 18:29:50.564479 7f3b680afd80 20 list_by_hash_bitwise prefix DA410000
> 2020-07-09 18:29:50.564501 7f3b680afd80 20 list_by_hash_bitwise prefix DA410000 ob 3#8:b5280000::::head#
> 2020-07-09 18:29:50.564508 7f3b680afd80 20 list_by_hash_bitwise prefix DA41002A
>
> Any idea what's going on here? I can run a find of every file on the filesystem in under 12 minutes so I'm not sure what's taking so long.
>
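To see where the time goes in a log like the one above, one option is to measure the gap between consecutive list_by_hash_bitwise lines on the same thread. The sketch below does that. It assumes the log format is exactly as pasted (timestamp, thread id, level, message); the function name listing_gaps and the regular expression are my own, so adjust them if your logs differ.

```python
import re
from datetime import datetime

# Matches e.g. "2020-07-09 18:29:38.651689 7f3b680afd80 20 list_by_hash_bitwise prefix DA41"
# An optional leading "> " is tolerated so lines pasted from a quoted mail also parse.
LINE_RE = re.compile(
    r"^(?:>\s*)?(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) (\S+) "
    r".*?list_by_hash_bitwise prefix (\S+)"
)


def listing_gaps(lines):
    """Return [(seconds, prefix_before, prefix_after), ...], largest gap first.

    Gaps are measured between consecutive list_by_hash_bitwise lines that
    share a thread id; all other log lines are ignored.
    """
    last_seen = {}  # thread id -> (timestamp, prefix)
    gaps = []
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
        thread, prefix = match.group(2), match.group(3)
        if thread in last_seen:
            prev_stamp, prev_prefix = last_seen[thread]
            gaps.append(((stamp - prev_stamp).total_seconds(), prev_prefix, prefix))
        last_seen[thread] = (stamp, prefix)
    return sorted(gaps, reverse=True)
```

On the excerpt above, the two largest gaps are roughly 11.9 seconds (DA41 to DA410000) and 10.6 seconds (DA4 to DA41), which suggests the time is spent listing individual hash subdirectories rather than in the journal sync thread.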
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


