Re: One mds daemon damaged, filesystem is offline. How to recover?


 



 

    On Saturday, May 22, 2021, 03:14:13 PM GMT+8, Eugen Block <eblock@xxxxxx> wrote:  
 What does the MDS report in its logs from when it went down?

What size do you get when you run
rados -p cephfs_metadata stat 200.00006048
# rados -p cephfs_metadata stat 200.00006048
cephfs_metadata/200.00006048 mtime 2021-05-20 22:47:30.000000, size 1555896

There's a similar report [3] suggesting to try to force an update on  
the object info, you could give that a shot:

> 1. rados -p [cephfs_metadata] setomapval 200.00006048 temporary-key anything
> 2. ceph pg deep-scrub 2.44
> 3. Wait for the scrub to finish
> 4. rados -p [cephfs_metadata] rmomapkey 200.00006048 temporary-key

I gave it a try. Here are the details.

Before trying it:
# rados list-inconsistent-obj 2.44 --format=json-pretty
{
    "epoch": 6996,
    "inconsistents": []
}

After trying step 1:
# rados list-inconsistent-obj 2.44 --format=json-pretty
{
    "epoch": 6996,
    "inconsistents": [
        {
            "object": {
                "name": "200.00006048",
                "nspace": "",
                "locator": "",
                "snap": "head",
                "version": 0
            },
            "errors": [],
            "union_shard_errors": [
                "obj_size_info_mismatch"
            ],
            "shards": [
            ::
That is, it created the inconsistency info for PG 2.44.
It shows size differences as follows (excerpts):

"shards": [
    {
        "osd": 0,
        "primary": true,
        "errors": [
            "obj_size_info_mismatch"
        ],
        "size": 1540096,
        "object_info": {
            "size": 1555896,
            ...

        "osd": 1,
        "primary": false,
        "errors": [
            "obj_size_info_mismatch"
        ],
        "size": 1540096,
        "object_info": {
            "size": 1555896,
            ...

        "osd": 2,
        "primary": false,
        "errors": [
            "obj_size_info_mismatch"
        ],
        "size": 1441792,
        "object_info": {
            "size": 1555896,
            ...

Please note that the physical file size of 200.00006048 on OSDs 0 and 1 is 1540096 bytes.
The physical file size on OSD 2 is 1441792 bytes.
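
For anyone who wants to repeat the comparison, the per-shard sizes can be checked against the recorded object_info size with a few lines of Python. This is only a minimal sketch: `summarize_shards` is a hypothetical helper name, and `REPORT` is an abridged copy of the output quoted above that keeps just the fields shown in this message. It assumes the full `rados list-inconsistent-obj` output has the same layout.

```python
import json

# Abridged from the list-inconsistent-obj output quoted in this message;
# fields that were not quoted are omitted (assumption: the layout matches).
REPORT = """
{
  "epoch": 6996,
  "inconsistents": [
    {
      "object": {"name": "200.00006048", "nspace": "", "locator": "",
                 "snap": "head", "version": 0},
      "errors": [],
      "union_shard_errors": ["obj_size_info_mismatch"],
      "shards": [
        {"osd": 0, "primary": true, "errors": ["obj_size_info_mismatch"],
         "size": 1540096, "object_info": {"size": 1555896}},
        {"osd": 1, "primary": false, "errors": ["obj_size_info_mismatch"],
         "size": 1540096, "object_info": {"size": 1555896}},
        {"osd": 2, "primary": false, "errors": ["obj_size_info_mismatch"],
         "size": 1441792, "object_info": {"size": 1555896}}
      ]
    }
  ]
}
"""


def summarize_shards(report_text):
    """Return (object, osd, on-disk size, object_info size, shortfall) per shard."""
    rows = []
    for item in json.loads(report_text)["inconsistents"]:
        name = item["object"]["name"]
        for shard in item["shards"]:
            recorded = shard["object_info"]["size"]
            on_disk = shard["size"]
            rows.append((name, shard["osd"], on_disk, recorded, recorded - on_disk))
    return rows


if __name__ == "__main__":
    for name, osd, on_disk, recorded, short in summarize_shards(REPORT):
        print(f"{name} osd.{osd}: on-disk {on_disk}, "
              f"object_info {recorded}, short by {short} bytes")
```

With the numbers above, every shard is smaller than the size recorded in object_info, and no shard matches the 1555896 reported by "rados stat".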

I understand that the size of 200.00006048 should be the same on all OSDs.
What should I do in that case?

Please also note that "ceph pg deep-scrub 2.44" did not fix PG 2.44.

Sagara






  



