Hello Anthony, thank you for the answer.
While researching I also came across this type of issue, but what I don't understand is that on the same server the OS drives ("SAMSUNG MZ7WD480") are all fine:
root@sd-01:~# lsblk -D
NAME DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO
sda 0 512B 2G 0
├─sda1 0 512B 2G 0
├─sda2 0 512B 2G 0
└─sda3 0 512B 2G 0
└─md0 0 512B 2G 0
└─md0p1 0 512B 2G 0
sdb 0 512B 2G 0
├─sdb1 0 512B 2G 0
├─sdb2 0 512B 2G 0
└─sdb3 0 512B 2G 0
└─md0 0 512B 2G 0
└─md0p1 0 512B 2G 0
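For anyone comparing this kind of output: lsblk's DISC-GRAN and DISC-MAX columns come straight from sysfs, so a quick sketch like the one below (the helper name `report_discard` is my own, not a standard tool) shows the same information per disk. `discard_max_bytes=0` corresponds to DISC-MAX=0B, i.e. the kernel will not issue discards to that device at all:

```shell
#!/usr/bin/env bash
# Sketch: report the discard limits the kernel exposes in sysfs for each
# block device. discard_max_bytes=0 matches lsblk's DISC-MAX=0B (discard
# disabled, as happens with provisioning_mode=full); nonzero means TRIM/
# discard can actually reach the device.
report_discard() {
  local sysblock=${1:-/sys/block} q dev
  shopt -s nullglob
  for q in "$sysblock"/*/queue; do
    dev=${q%/queue}; dev=${dev##*/}
    printf '%s discard_granularity=%s discard_max_bytes=%s\n' \
      "$dev" \
      "$(cat "$q/discard_granularity" 2>/dev/null || echo '?')" \
      "$(cat "$q/discard_max_bytes" 2>/dev/null || echo '?')"
  done
}
report_discard
```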
root@sd-01:~# find /sys/ -name provisioning_mode -exec grep -H . {} + | sort
/sys/devices/pci0000:00/0000:00:11.4/ata1/host1/target1:0:0/1:0:0:0/scsi_disk/1:0:0:0/provisioning_mode:writesame_16
/sys/devices/pci0000:00/0000:00:11.4/ata2/host2/target2:0:0/2:0:0:0/scsi_disk/2:0:0:0/provisioning_mode:writesame_16
/sys/devices/pci0000:80/0000:80:03.0/0000:81:00.0/host0/port-0:0/end_device-0:0/target0:0:0/0:0:0:0/scsi_disk/0:0:0:0/provisioning_mode:full
/sys/devices/pci0000:80/0000:80:03.0/0000:81:00.0/host0/port-0:1/end_device-0:1/target0:0:1/0:0:1:0/scsi_disk/0:0:1:0/provisioning_mode:full
root@sd-01:~# disklist
HCTL NAME SIZE REV TRAN WWN SERIAL MODEL
1:0:0:0 /dev/sda 447.1G 203Q sata 0x5002538500231d05 S1G1NYAF923 SAMSUNG MZ7WD4
2:0:0:0 /dev/sdb 447.1G 203Q sata 0x5002538500231a41 S1G1NYAF922 SAMSUNG MZ7WD4
0:0:0:0 /dev/sdc 3.6T 046 sas 0x500a0751e6bd969b 2312E6BD969 CT4000MX500SSD
0:0:1:0 /dev/sdd 3.6T 046 sas 0x500a0751e6bd97ee 2312E6BD97E CT4000MX500SSD
0:0:2:0 /dev/sde 3.6T 046 sas 0x500a0751e6bd9805 2312E6BD980 CT4000MX500SSD
0:0:3:0 /dev/sdf 3.6T 046 sas 0x500a0751e6bd9681 2312E6BD968 CT4000MX500SSD
0:0:4:0 /dev/sdg 3.6T 045 sas 0x500a0751e6b5d30a 2309E6B5D30 CT4000MX500SSD
0:0:5:0 /dev/sdh 3.6T 046 sas 0x500a0751e6bd967e 2312E6BD967 CT4000MX500SSD
0:0:6:0 /dev/sdi 3.6T 046 sas 0x500a0751e6bd97e4 2312E6BD97E CT4000MX500SSD
0:0:7:0 /dev/sdj 3.6T 046 sas 0x500a0751e6bd96a0 2312E6BD96A CT4000MX500SSD
So my question is: why does this happen only to the CT4000MX500SSD drives, why did it start just now, and why don't I see it on my other servers?
Maybe it is related to the firmware version ("M3CR046" vs "M3CR045")?
I checked the Crucial website, and "M3CR046" does not actually exist there: https://www.crucial.com/support/ssd-support/mx500-support
In this forum people recommend upgrading to "M3CR046": https://forums.unraid.net/topic/134954-warning-crucial-mx500-ssds-world-of-pain-stay-away-from-these/
But in my ud cluster all the drives are on "M3CR045" and have lower latency, so I'm really confused.
Instead of writing udev rules only for the CT4000MX500SSD, is there a recommended udev rule for Ceph that covers all types of SATA drives?
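For reference, the kind of udev rule commonly used for this situation looks like the sketch below. The rule file name is my own choice, and the model match is an assumption: the MODEL column in the listing above is truncated at "CT4000MX500SSD", so I use a glob; verify the exact string with `udevadm info --attribute-walk /dev/sdc` first, and only force UNMAP if you are confident the drives handle it correctly behind this HBA:

```
# /etc/udev/rules.d/99-scsi-provisioning.rules  (hypothetical file name)
# Force UNMAP-based discard for the Crucial SSDs behind the SAS HBA.
# ATTRS{model} value is assumed/truncated -- confirm with udevadm info.
ACTION=="add|change", SUBSYSTEM=="scsi_disk", ATTRS{model}=="CT4000MX500SSD*", ATTR{provisioning_mode}="unmap"
```

After installing the rule, run `udevadm control --reload && udevadm trigger --subsystem-match=scsi_disk` (or reboot) and re-check the `provisioning_mode` files. For a quick non-persistent test you can also write the value directly, e.g. `echo unmap > /sys/class/scsi_disk/0:0:0:0/provisioning_mode`; it resets on reboot or rescan.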
Anthony D'Atri <aad@xxxxxxxxxxxxxx> wrote on Fri, 22 Mar 2024 at 17:00:
On Mar 22, 2024, at 09:36, Özkan Göksu <ozkangksu@xxxxxxxxx> wrote:
Hello!
After upgrading "5.15.0-84-generic" to "5.15.0-100-generic" (Ubuntu 22.04.2 LTS), commit latency started acting weird with "CT4000MX500SSD" drives.
osd commit_latency(ms) apply_latency(ms)
36 867 867
37 3045 3045
38 15 15
39 18 18
42 1409 1409
43 1224 1224
I downgraded the kernel but the result did not change.
I have a similar build and it didn't get upgraded and it is just fine.
While I was digging I realised a difference.
This is the high-latency cluster, and as you can see, DISC-GRAN=0B and DISC-MAX=0B:
root@sd-01:~# lsblk -D
NAME DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO
sdc 0 0B 0B 0
├─ceph--76b7d255--2a01--4bd4--8d3e--880190181183-osd--block--201d5050--db0c--41b4--85c4--6416ee989d6c 0 0B 0B 0
└─ceph--76b7d255--2a01--4bd4--8d3e--880190181183-osd--block--5a376133--47de--4e29--9b75--2314665c2862
root@sd-01:~# find /sys/ -name provisioning_mode -exec grep -H . {} + | sort
/sys/devices/pci0000:80/0000:80:03.0/0000:81:00.0/host0/port-0:0/end_device-0:0/target0:0:0/0:0:0:0/scsi_disk/0:0:0:0/provisioning_mode:full
------------------------------------------------------------------------------------------
This is the low-latency cluster, and as you can see, DISC-GRAN=4K and DISC-MAX=2G:
root@ud-01:~# lsblk -D
NAME DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO
sdc 0 4K 2G 0
├─ceph--7496095f--18c7--41fd--90f2--d9b3e382bc8e-osd--block--ec86a029--23f7--4328--9600--a24a290e3003 0 4K 2G 0
└─ceph--7496095f--18c7--41fd--90f2--d9b3e382bc8e-osd--block--5b69b748--d899--4f55--afc3--2ea3c8a05ca1
root@ud-01:~# find /sys/ -name provisioning_mode -exec grep -H . {} + | sort
/sys/devices/pci0000:00/0000:00:11.4/ata3/host2/target2:0:0/2:0:0:0/scsi_disk/2:0:0:0/provisioning_mode:writesame_16
I think the problem is related to provisioning_mode, but I really don't understand the reason.
I booted with a live ISO and the drive was still "provisioning_mode:full", so this is not related to my OS at all.
Something changed with the upgrade: I think during the boot sequence the negotiation between the LSI controller, the drives, and the kernel started to assign "provisioning_mode:full", but I'm not sure.
What should I do?
Best regards.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx