Re: MDS allocates all memory (>500G) replaying, OOM-killed, repeat

Since my problem is going to be archived on the Internet, I'll keep following up so the next person with this problem might save some time.


The seek error was because ext4 can't seek to 23 TB; switching to an xfs mount to create the backup file succeeded.
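For anyone who wants to see the limit in isolation: a default-format ext4 filesystem (4 KiB blocks) caps individual files at 16 TiB, so even a sparse file can't reach the ~22 TiB offset the export needs. A minimal sketch, with hypothetical mount points:

# hypothetical mount points; this demonstrates the file-size cap,
# not the exact EINVAL the journal tool reported
truncate -s 23T /mnt/ext4/seek-test.bin   # fails with EFBIG ("File too large") on ext4
truncate -s 23T /mnt/xfs/seek-test.bin    # succeeds on xfs; the file stays sparse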


Here is what I wound up doing to fix this (a rough sketch of the corresponding commands follows the list):


  • Bring down all MDSes so they stop flapping
  • Back up journal (as seen in the previous message)
  • Apply journal manually
  • Reset journal manually
  • Clear session table
  • Clear other tables (not sure I needed to do this)
  • Mark FS down
  • Mark the rank 0 MDS as failed
  • Reset the FS (yes, I really mean it)
  • Restart MDSes
  • Finally get some sleep
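
Roughly, those steps map onto the commands below. This is a sketch, not a recipe: exact syntax differs between Ceph releases (this was a Luminous-era cluster called prodstore), <fs_name> is a placeholder, and several of these commands discard metadata, so check the CephFS disaster-recovery documentation before copying any of it.

# on every MDS host: stop the daemons so they stop flapping
systemctl stop ceph-mds.target

# back up the journal (onto an xfs mount), replay its dentries into the
# metadata pool, then reset it
cephfs-journal-tool --cluster=prodstore journal export backup.bin
cephfs-journal-tool --cluster=prodstore event recover_dentries summary
cephfs-journal-tool --cluster=prodstore journal reset

# clear the session table (and, possibly unnecessarily, the other tables)
cephfs-table-tool --cluster=prodstore all reset session
cephfs-table-tool --cluster=prodstore all reset snap
cephfs-table-tool --cluster=prodstore all reset inode

# mark the filesystem down, fail rank 0, and reset the fs map
ceph --cluster=prodstore fs set <fs_name> cluster_down true
ceph --cluster=prodstore mds fail 0
ceph --cluster=prodstore fs reset <fs_name> --yes-i-really-mean-it

# on every MDS host: bring the daemons back up
systemctl start ceph-mds.target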

If anybody has any idea what may have caused this situation, I am keenly interested. If not, hopefully I at least helped someone else.




From: Pickett, Neale T
Sent: Monday, April 1, 2019 12:31
To: ceph-users@xxxxxxxxxxxxxx
Subject: Re: MDS allocates all memory (>500G) replaying, OOM-killed, repeat
 

We decided to go ahead and try truncating the journal, but we wanted to back it up first. However, there are some ridiculous values in the header: the export can't write a backup this large because (I presume) my ext4 filesystem can't seek to that position in the (sparse) output file.


I would not be surprised to learn that memory allocation during replay is attempting something similar, hence the MDS grabbing all available memory. This seems like a new kind of journal corruption that isn't being reported correctly.
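
For context on the numbers below: the offsets in the journal header are absolute byte positions in the MDS journal stream, which only ever moves forward, and the offset the failed seek reports (0x166be9401291) is exactly the expire_pos in hex (see the printf at the end). The live part of the journal is actually tiny; a quick back-of-the-envelope check in the shell:

echo $(( 24653404029749 - 24652730602129 ))   # write_pos - expire_pos
# prints 673427620, i.e. roughly 642 MiB of live journal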


[root@lima /]# time cephfs-journal-tool --cluster=prodstore journal export backup.bin
journal is 24652730602129~673601102
2019-04-01 17:49:52.776977 7fdcb999e040 -1 Error 22 ((22) Invalid argument) seeking to 0x166be9401291
Error ((22) Invalid argument)

real    0m27.832s
user    0m2.028s
sys     0m3.438s
[root@lima /]# cephfs-journal-tool --cluster=prodstore event get summary
Events by type:
  EXPORT: 187
  IMPORTFINISH: 182
  IMPORTSTART: 182
  OPEN: 3133
  SUBTREEMAP: 129
  UPDATE: 42185
Errors: 0
[root@lima /]# cephfs-journal-tool --cluster=prodstore header get
{
    "magic": "ceph fs volume v011",
    "write_pos": 24653404029749,
    "expire_pos": 24652730602129,
    "trimmed_pos": 24652730597376,
    "stream_format": 1,
    "layout": {
        "stripe_unit": 4194304,
        "stripe_count": 1,
        "object_size": 4194304,
        "pool_id": 2,
        "pool_ns": ""
    }
}

[root@lima /]# printf "%x\n" "24653404029749"
166c1163c335
[root@lima /]# printf "%x\n" "24652730602129"
166be9401291

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
