Re: SLOW_OPS problems

I've got an M.2 drive that will hit 94C on the surface of the drive, as seen with my thermal camera, if it doesn't have active cooling on it. :D

FWIW, I don't recall seeing thermal throttling in the 60C range in the past.  We've seen it at higher temps though.

Mark


On 10/15/24 12:48, Anthony D'Atri wrote:
Oh yeah that’s really high for a drive.

Do other drives in the same / other chassis show the same temps, or is this an outlier?

With Dell chassis, for example, I’ve often had to increase the iDRAC fan speed offset to get the drive temps below 40C.

On Oct 15, 2024, at 1:36 PM, Mat Young <mat.young@xxxxxxxxxxxxx> wrote:

The smartlog shows a current temperature of 63C but a worst case of 53C, which doesn’t make a lot of sense. Could the drive be thermally throttling?

Rgds

mat
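One way to look for throttling evidence is to read the thermal counters from the SMART log rather than only the current temperature. The sketch below is a rough illustration, assuming the drive is NVMe and that the input is the JSON produced by `nvme smart-log <device> -o json`. The field names and the Kelvin temperature unit are assumptions about nvme-cli's JSON output and should be checked against the installed version; `thermal_findings` and the 60C threshold are hypothetical choices for this example.

```python
import json

# Assumption: nvme-cli reports the composite "temperature" in Kelvin in JSON mode.
KELVIN_OFFSET = 273


def thermal_findings(smart_log, warn_celsius=60):
    """Return human-readable notes about thermal indicators in a SMART log dict."""
    findings = []

    temp_kelvin = smart_log.get("temperature")
    if temp_kelvin is not None:
        temp_c = temp_kelvin - KELVIN_OFFSET
        if temp_c >= warn_celsius:
            findings.append(f"composite temperature {temp_c}C >= {warn_celsius}C")

    # Assumed field names: time spent above the warning/critical thresholds and
    # the number of thermal-management (throttling) state transitions.
    for key in ("warning_temp_time", "critical_comp_time",
                "thm_temp1_trans_count", "thm_temp2_trans_count"):
        value = smart_log.get(key, 0)
        if value:
            findings.append(f"{key} = {value}")

    return findings


if __name__ == "__main__":
    import sys

    # Example: nvme smart-log /dev/nvme0 -o json | python3 this_script.py
    notes = thermal_findings(json.load(sys.stdin))
    print("\n".join(notes) if notes else "no thermal indicators found")
```

Non-zero transition counters would suggest the drive has throttled at some point; running the same check against the healthy drives of the same model gives a baseline for comparison.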

From: Tim Sauerbein <sauerbein@xxxxxxxxxx>
Sent: Tuesday, October 15, 2024 11:21 AM
To: ceph-users <ceph-users@xxxxxxx>
Subject:  Re: SLOW_OPS problems

Sorry, forgot to mention:

I did a secure erase on the drive yesterday, added it to the OSD again with the same result of slow ops a few hours later.

On 15 Oct 2024, at 16:07, Tim Sauerbein <sauerbein@xxxxxxxxxx> wrote:
> On 14 Oct 2024, at 16:01, Anthony D'Atri <aad@xxxxxxxxxxxxxx> wrote:
>> Remind me, have you sent me a full `smartctl -a` output for this drive?
> See here, looks good though: https://gist.github.com/sauerbein/6423231adb954d28c8c82a8422256355
>> If there’s a firmware update available, updating it with a subsequent secure-erase could plausibly recover it.
> I don't think there is a firmware update publicly available. Other disks of same model and same firmware run without issues in my cluster btw.
> On 14 Oct 2024, at 15:56, Mark Nelson <mark.nelson@xxxxxxxxx> wrote:
>> I've seen similar issues before where smart showed no failures but the drive performed terribly.  You can try trimming the drive or even doing a secure format to see if it helps, but at least in the case I recall it was an issue with the drive itself.
> I think that the disk is just faulty too. Do you have any idea of a test to run on the SSD to prove that independent of Ceph?
> Thanks,
> Tim
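As a rough idea of what a Ceph-independent test could look like: the sketch below times small synchronous writes to a file and reports latency percentiles. It is a minimal stand-in for a proper benchmark tool, not a method anyone in this thread has confirmed. It assumes the suspect SSD has been taken out of the cluster and mounted with a scratch filesystem; the function names, the 4 KiB block size and the write count are hypothetical choices.

```python
import os
import statistics
import time


def sync_write_latencies(path, block_size=4096, count=1000):
    """Write `count` blocks to `path`, fsync after each, return seconds per write."""
    block = os.urandom(block_size)
    latencies = []
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        for _ in range(count):
            start = time.perf_counter()
            os.write(fd, block)
            os.fsync(fd)  # force the data to stable storage before timing stops
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
    return latencies


def summarize(latencies):
    """Return median, 99th percentile and maximum latency in milliseconds."""
    ordered = sorted(latencies)
    p99_index = min(len(ordered) - 1, int(len(ordered) * 0.99))
    return {
        "median_ms": statistics.median(ordered) * 1000,
        "p99_ms": ordered[p99_index] * 1000,
        "max_ms": ordered[-1] * 1000,
    }


if __name__ == "__main__":
    import sys

    # Usage: python3 this_script.py /mnt/suspect-ssd/probe.bin
    print(summarize(sync_write_latencies(sys.argv[1])))
```

Since the slow ops only showed up after a few hours, a short run may look perfectly healthy. Repeating the measurement over a long period, and running the same script on one of the good drives of the same model, would make the comparison more meaningful.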


_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx

--
Best Regards,
Mark Nelson
Head of Research and Development

Clyso GmbH
p: +49 89 21552391 12 | a: Minnesota, USA
w: https://clyso.com | e: mark.nelson@xxxxxxxxx

We are hiring: https://www.clyso.com/jobs/