Re: Placing replaced disks to correct buckets.

Hi,

> We skipped stage 1 and replaced the UUIDs of old disks with the new
> ones in the policy.cfg
> We ran salt '*' pillar.items and confirmed that the output was correct.
> It showed the new UUIDs in the correct places.
> Next we ran salt-run state.orch ceph.stage.3
> PS: All of the above ran successfully.

You should lead with the information that you're using SES, otherwise misunderstandings are likely to come up. Anyway, if you change policy.cfg you should run stage.2 to make sure the changes are applied. Although you state that pillar.items already shows the correct values, I would recommend running that (short) stage, too.
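
For reference, after editing policy.cfg the sequence would look roughly
like this (untested sketch, stage names as in the standard DeepSea
workflow):

salt-run state.orch ceph.stage.2    # push the policy.cfg changes into the pillar
salt '*' pillar.items               # verify the pillar data again
salt-run state.orch ceph.stage.3    # deploy the OSDs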

> The output of ceph osd tree showed that these new disks are currently
> in a ghost bucket, not even under root=default and without a weight.

Where is the respective host listed in the tree? Can you show a ceph osd tree, please? Did you remove all OSDs of one host, so that the complete host is in the "ghost bucket", or just single OSDs? Are other OSDs on that host listed correctly? Since DeepSea is not aware of the crush map, it can't figure out which bucket it should put the new OSDs in, so this part is not automated (yet?); you have to do it yourself. But if the host containing the replaced OSDs is already placed correctly, then there's definitely something wrong.
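
If it turns out that the whole host bucket ended up outside of
root=default, something like this should move it back (untested, bucket
names taken from your error message, adjust to your actual crush map):

ceph osd crush move veeam-mk2-rack1-osd3 rack=veeam-mk2-rack1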

> The first step I then tried was to reweight them but found errors
> below:
> Error ENOENT: device osd.<OSD NR> does not appear in the crush map
> Error ENOENT: unable to set item id 39 name 'osd.39' weight 5.45599 at
> location
> {host=veeam-mk2-rack1-osd3,rack=veeam-mk2-rack1,room=veeam-mk2,root=veeam}:
> does not exist

You can't reweight the OSD if it's not in a bucket yet; try to move it into its dedicated bucket first, if possible.
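
Assuming the host bucket itself is already in the right place, something
along these lines should create the missing crush item and set its
weight in one step (untested, id, weight and host taken from your error
message):

ceph osd crush add osd.39 5.45599 host=veeam-mk2-rack1-osd3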

As already requested by Konstantin, please paste your osd tree.

Regards,
Eugen


Zitat von John Molefe <John.Molefe@xxxxxxxxx>:

Hi David

Removal process/commands ran as follows:

#ceph osd crush reweight osd.<OSD NR> 0
#ceph osd out <OSD NR>
#systemctl stop ceph-osd@<OSD NR>
#umount /var/lib/ceph/osd/ceph-<OSD NR>

#ceph osd crush remove osd.<OSD NR>
#ceph auth del osd.<OSD NR>
#ceph osd rm <OSD NR>
#ceph-disk zap /dev/sd??

Adding them back on:

We skipped stage 1 and replaced the UUIDs of old disks with the new
ones in the policy.cfg
We ran salt '*' pillar.items and confirmed that the output was correct.
It showed the new UUIDs in the correct places.
Next we ran  salt-run state.orch ceph.stage.3
PS: All of the above ran successfully.

The output of ceph osd tree showed that these new disks are currently
in a ghost bucket, not even under root=default and without a weight.

The first step I then tried was to reweight them but found errors
below:
Error ENOENT: device osd.<OSD NR> does not appear in the crush map
Error ENOENT: unable to set item id 39 name 'osd.39' weight 5.45599 at
location
{host=veeam-mk2-rack1-osd3,rack=veeam-mk2-rack1,room=veeam-mk2,root=veeam}:
does not exist

But when I run the command: ceph osd find <OSD NR>
v-cph-admin:/testing # ceph osd find 39
{
    "osd": 39,
    "ip": "143.160.78.97:6870\/24436",
    "crush_location": {}
}

Please let me know if there's any other info that you may need to
assist

Regards
J.
David Turner <drakonstein@xxxxxxxxx> 2019/02/18 17:08 >>>
Also what commands did you run to remove the failed HDDs and the
commands you have so far run to add their replacements back in?

On Sat, Feb 16, 2019 at 9:55 PM Konstantin Shalygin <k0ste@xxxxxxxx>
wrote:




> I recently replaced failed HDDs and removed them from their respective
> buckets as per procedure. But I'm now facing an issue when trying to
> place new ones back into the buckets. I'm getting an error of 'osd nr
> not found' OR 'file or directory not found' OR a command syntax error.
> I have been using the commands below:
>
> ceph osd crush set <osd.nr> <weight> <bucket>
> ceph osd crush <osd nr> set <osd.nr> <weight> <bucket>
>
> I do however find the OSD number when I run the command: ceph osd find <nr>
> Your assistance/response to this will be highly appreciated.
>
> Regards
> John.

Please paste your `ceph osd tree`, your version, and the exact error
you get, including the OSD number.
Less obfuscation is better in this, perhaps, simple case.


k


Vrywaringsklousule / Disclaimer:
http://www.nwu.ac.za/it/gov-man/disclaimer.html



_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



