Network issues with a CephFS client mount via a Cloudstack instance

I’m going to post this to the CloudStack list as well.

When I attempt to rsync a large file to the Ceph volume, the instance becomes unresponsive at the network level. It eventually comes back, but it keeps dropping offline while the file copies. While this is happening, dmesg on the CloudStack host machine shows:

[ 7144.888744] e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
TDH <80>
TDT <d0>
next_to_use <d0>
next_to_clean <7f>
buffer_info[next_to_clean]:
time_stamp <100686d46>
next_to_watch <80>
jiffies <100687140>
next_to_watch.status <0>
MAC Status <80083>
PHY Status <796d>
PHY 1000BASE-T Status <3800>
PHY Extended Status <3000>
PCI Status <10>
[ 7146.872563] e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
TDH <80>
TDT <d0>
next_to_use <d0>
next_to_clean <7f>
buffer_info[next_to_clean]:
time_stamp <100686d46>
next_to_watch <80>
jiffies <100687900>
next_to_watch.status <0>
MAC Status <80083>
PHY Status <796d>
PHY 1000BASE-T Status <3800>
PHY Extended Status <3000>
PCI Status <10>
[ 7148.856703] e1000e 0000:00:19.0 eno1: Detected Hardware Unit Hang:
TDH <80>
TDT <d0>
next_to_use <d0>
next_to_clean <7f>
buffer_info[next_to_clean]:
time_stamp <100686d46>
next_to_watch <80>
jiffies <1006880c0>
next_to_watch.status <0>
MAC Status <80083>
PHY Status <796d>
PHY 1000BASE-T Status <3800>
PHY Extended Status <3000>
PCI Status <10>
[ 7150.199756] e1000e 0000:00:19.0 eno1: Reset adapter unexpectedly
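
For reference, the transfer itself is nothing unusual; it’s roughly the following, where /mnt/cephfs is the CephFS client mount inside the instance (the file name and host here are placeholders, not the real ones):

rsync -avP bigfile.img user@instance:/mnt/cephfs/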

The host machine:

System Information
Manufacturer: Dell Inc.
Product Name: OptiPlex 990

Running CentOS 8.4.

I also see the same error on another host with a different hardware type:

Manufacturer: Hewlett-Packard
Product Name: HP Compaq 8200 Elite SFF PC

but both use the e1000e driver.

I upgraded the kernel to 5.13.x and thought that had fixed the issue, but now I’m seeing the error again.
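
For what it’s worth, this is how I’m checking which kernel and e1000e driver version each host is actually running (the interface name differs per host):

uname -r
ethtool -i eno1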

After migrating the instance to a bigger server-class machine (also e1000e, an old Rackable system) where I have a bigger pipe via bonding, I don’t seem to hit the issue.
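
For completeness, the uplink on that machine is a standard Linux bond; something like the following is how I confirm the bond mode and member NICs (the bond name here is a guess):

cat /proc/net/bonding/bond0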

Just curious whether this could be a known bug with e1000e, and whether there is any kind of workaround.
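
In case it matters, the only mitigation I’ve seen suggested elsewhere for these e1000e "Detected Hardware Unit Hang" messages is disabling segmentation offloads (and sometimes flow control) on the port, along the lines of the commands below. I haven’t verified that this helps in my setup, and it would still need to be made persistent across reboots separately:

ethtool -K eno1 tso off gso off   # turn off TCP/generic segmentation offload
ethtool -A eno1 rx off tx off     # optionally disable pause-frame flow control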

Thanks
-jeremy
