On 28/10/2023 08.58, Eyal Lebedinsky wrote:
Fully updated F28. I had to send one of the 7 member disks for RMA.

I notice that the system is very unresponsive. 'top' shows

    PID     USER PR NI VIRT RES SHR S %CPU %MEM TIME+     COMMAND
    1365697 root 20 0  0    0   0   R 93.8 0.0  384:40.55 kworker/u16:3+flush-9:127

This continues even when there are no user actions (ff, tb closed). A few days ago it stopped, but today I see that it kept running all night, even though there were periods of inactivity lasting a few hours.

As another point: a few days ago I received a disk from RMA and the recovery went as fast as expected. I then removed another disk to send for RMA.

Is this expected? Is there anything I can do to improve the situation?

TIA
Maybe a hint. On a whim I decided to look at interrupts on the machine. I see an item in /proc/interrupts that grows by 80-90 every second. It is listed as

    IR-PCI-MSIX-0000:03:00.0 0-edge mpt2sas0-msix0

which is probably related to the RAID card used for this array.

Another hint: I see a job stuck in D state.

    $ ps aux|grep parted
    root 2398175 0.0 0.0 6184 3700 ? D 05:10 0:00 parted -l

This command runs overnight to collect some stats, and it seems that it is hanging. This one started at "2023-10-27 05:10:01", i.e. while the disk was still in the machine (not in the array) but just after it had finished being zeroed.

--
Eyal at Home (eyal@xxxxxxxxxxxxxx)
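For anyone who wants to repeat the two checks above (the growth rate of one /proc/interrupts entry, and the list of processes stuck in D state), here is a rough shell sketch. The function names irq_total, irq_rate and d_state are made up for illustration, and the sketch assumes the usual /proc/interrupts layout, where the per-CPU counters directly follow the IRQ number.

```shell
#!/bin/sh
# Hypothetical helpers, not part of any existing tool.

# Sum the per-CPU counters of every /proc/interrupts line containing a string.
# $1 = string to look for (e.g. mpt2sas0-msix0)
# $2 = file to read (defaults to /proc/interrupts)
irq_total() {
    awk -v pat="$1" '
        index($0, pat) {
            # Field 1 is the IRQ number ("NN:"); the numeric fields after it
            # are the per-CPU counts. Stop at the first non-numeric field.
            for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++) sum += $i
        }
        END { print sum + 0 }
    ' "${2:-/proc/interrupts}"
}

# Print the average number of interrupts per second over an interval.
# $1 = string to look for, $2 = interval in seconds (default 5)
irq_rate() {
    interval="${2:-5}"
    before=$(irq_total "$1")
    sleep "$interval"
    after=$(irq_total "$1")
    echo $(( (after - before) / interval ))
}

# List processes in uninterruptible sleep (state D), such as a hung "parted -l".
d_state() {
    ps -eo pid,stat,args | awk 'NR == 1 || $2 ~ /^D/'
}
```

After sourcing the file, "irq_rate mpt2sas0-msix0 10" prints the average rate over ten seconds, and "d_state" prints the ps header followed by any D-state processes.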