Re: NFS mount lockups since about a month ago

RAID0, so there is no redundancy for the data?

And what kind of underlying hard disks?  Desktop-class drives will
retry a bad block for a long time (i.e. a minute or more), and they
will not report an error until they hit either the default OS timeout
or the disk firmware's own timeout.
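One way to check this on the server is SCT Error Recovery Control, the firmware setting that caps how long the drive retries. This is a sketch using smartmontools, with /dev/sda as a placeholder for your actual device; desktop drives often report ERC as unsupported or disabled:

```
# Show the drive's SCT Error Recovery Control setting (the firmware
# read-retry time limit); "not supported" or "disabled" means a bad
# block can stall reads for minutes.
smartctl -l scterc /dev/sda

# If the drive supports it, cap read/write recovery at 7 seconds
# (the values are in tenths of a second).
smartctl -l scterc,70,70 /dev/sda
```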

The sar data will show if one of the disks is being slow on the server end.
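As a quick way to read that sar data, here is a sketch of a one-liner that flags any disk over 50% utilized, assuming the stock `sar -d` column order (device name in the third field, %util in the last). The sample lines are taken from the sar output quoted below; in practice you would pipe `sar -d` itself into the awk stage:

```shell
# Print device name and %util for any disk over 50% utilized.
# NF > 9 skips blank/short lines; $NF + 0 forces numeric compare
# so the "%util" header line is ignored.
printf '%s\n' \
  '05:29:01 AM dev8-0 36.16 94.01 683.65 0.00 21.51 0.03 0.67 1.11' \
  '05:29:01 AM dev8-48 423.65 71239.92 198.64 0.00 168.63 12.73 29.72 86.07' |
  awk 'NF > 9 && $NF + 0 > 50 { print $3, $NF "%" }'
```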

On the client end you are unlikely to get anything useful from any
samples, since it seems fairly likely the server is not responding to
NFS and/or its disks are not responding.

It could be as simple as this: on login the client reads a
marginal/slow block, and that block takes a while before it finally
reads successfully.  If that is happening, the drive will probably
eventually stop being able to read the block at all, and if you really
are using RAID0, some data will be lost.
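You can look for early signs of such marginal blocks in the drive's SMART attributes. This is a sketch, again with /dev/sda as a placeholder device; any nonzero pending or reallocated sector count points at the kind of slow/bad blocks described above:

```
# Nonzero Current_Pending_Sector or Reallocated_Sector_Ct values
# usually mean the drive is struggling with marginal blocks.
smartctl -A /dev/sda | grep -Ei 'pending|realloc'
```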

All of the NFSv4 issues I have run into involve it simply breaking and
staying broken (usually when the server reboots).  I have never seen
it cause big sudden pauses, but using v3 won't hurt, and I still try
to avoid v4.
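Pinning the mount to v3 is just a mount option. A sketch, with "server" and the paths as placeholders for your setup:

```
# /etc/fstab entry forcing NFSv3 ("hard" retries indefinitely rather
# than returning I/O errors, which is the usual choice for /home):
server:/home  /home  nfs  vers=3,hard  0 0

# Or as a one-off mount for testing:
mount -t nfs -o vers=3 server:/home /mnt
```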

On Thu, Sep 30, 2021 at 11:55 AM Terry Barnaby <terry1@xxxxxxxxxxx> wrote:
>
> On 30/09/2021 11:42, Roger Heflin wrote:
>
> On mine, when I first access the NFS volume it takes 5-10 seconds for the disks to spin up.  They will spin down later in the day if little or nothing is going on, and I will get another delay.
>
> I have also seen delays when a disk hits bad blocks and corrects them.  About half the time that leaves a message in the logs, but sometimes there are no messages at all, and I have had to resort to sar to figure out which disk is causing the issue.
>
> So on my machine I see this (sar -d):
>
> 05:29:01 AM  DEV       tps      rkB/s     wkB/s  dkB/s  areq-sz  aqu-sz  await  %util
> 05:29:01 AM  dev8-0      36.16     94.01  683.65   0.00    21.51    0.03   0.67   1.11
> 05:29:01 AM  dev8-16      0.02      0.00    0.00   0.00     0.00    0.00   0.00   0.00
> 05:29:01 AM  dev8-32      0.02      0.00    0.00   0.00     0.00    0.00   1.00   0.00
> 05:29:01 AM  dev8-48    423.65  71239.92  198.64   0.00   168.63   12.73  29.72  86.07
> 05:29:01 AM  dev8-64      0.02      0.00    0.00   0.00     0.00    0.00   0.00   0.00
> 05:29:01 AM  dev8-80      0.02      0.00    0.00   0.00     0.00    0.00   0.00   0.00
> 05:29:01 AM  dev8-144  2071.22  71311.58  212.22   0.00    34.53   11.37   5.47  54.81
> 05:29:01 AM  dev8-96      0.02      0.00    0.00   0.00     0.00    0.00   0.00   0.00
> 05:29:01 AM  dev8-128  1630.99  71389.49  198.18   0.00    43.89   15.72   9.62  57.05
> 05:29:01 AM  dev8-112  2081.05  71426.01  182.48   0.00    34.41   11.32   5.42  55.68
>
> There is a 4-disk RAID6 check going on.
>
> You will notice that dev8-48 is busier than the other three disks; in this case that is because it is a 3 TB disk, while the other three are newer 6 TB disks that move more data per revolution.
>
> If you have sar set up with 60-second samples, the one disk that pauses should stand out even more obviously than this, since here the 3 TB disk is only marginally slower than the 6 TB ones.
>
>
>
> In my case the server's /home is on a partition of the two main RAID0 disks, which are shared with the OS and so are active most of the time.  No errors reported.
>
> I will try setting up sar with a 60-second sample time on the client; thanks for the idea.
>
>
_______________________________________________
users mailing list -- users@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to users-leave@xxxxxxxxxxxxxxxxxxxxxxx
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/users@xxxxxxxxxxxxxxxxxxxxxxx
Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure


