Re: [PATCH] virtio_ring: Fix the stale index in available ring

On 3/15/24 21:05, Michael S. Tsirkin wrote:
On Fri, Mar 15, 2024 at 08:45:10PM +1000, Gavin Shan wrote:
Yes, I guess smp_wmb() ('dmb') is buggy on NVidia's grace-hopper platform. I tried
to reproduce it with my own driver, where one thread writes to the shared buffer
and another thread reads from the buffer. I haven't hit the out-of-order issue so
far.

Make sure the 2 areas you are accessing are in different cache lines.


Yes, I already put those 2 areas in separate cache lines.


My driver may not be correct somewhere, and I will update if I can reproduce
the issue with my driver in the future.

Then maybe your change is just making virtio slower and masks the bug
that is actually elsewhere?

You don't really need a driver. Here's a simple test: without barriers the
assertion will fail. With barriers it will not.
(Warning: didn't bother testing too much, could be buggy.)

---

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

#define FIRST values[0]
#define SECOND values[64]

volatile int values[100] = {};

void* writer_thread(void* arg) {
	while (1) {
		FIRST++;
		// NEED smp_wmb here
		__asm__ volatile("dmb ishst" : : : "memory");
		SECOND++;
	}
}

void* reader_thread(void* arg) {
	while (1) {
		int first = FIRST;
		// NEED smp_rmb here
		__asm__ volatile("dmb ishld" : : : "memory");
		int second = SECOND;
		assert(first - second == 1 || first - second == 0);
	}
}

int main() {
	pthread_t writer, reader;

	pthread_create(&writer, NULL, writer_thread, NULL);
	pthread_create(&reader, NULL, reader_thread, NULL);

	pthread_join(writer, NULL);
	pthread_join(reader, NULL);

	return 0;
}


I had a quick test on NVidia's grace-hopper and Ampere's CPUs, and hit
the assert on both of them. After replacing 'dmb' with 'dsb', I still
hit the assert on both of them. I need to look at the code closely.

[root@virt-mtcollins-02 test]# ./a
a: a.c:26: reader_thread: Assertion `first - second == 1 || first - second == 0' failed.
Aborted (core dumped)

[root@nvidia-grace-hopper-05 test]# ./a
a: a.c:26: reader_thread: Assertion `first - second == 1 || first - second == 0' failed.
Aborted (core dumped)
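
A possible explanation for the assert firing even with the barriers in place:
the reader loads FIRST before SECOND, and the writer can complete any number of
iterations between those two loads, so SECOND can legitimately end up larger
than FIRST. What the barrier pair does guarantee is that a reader which sees a
given value of SECOND will afterwards see FIRST at least that large. Below is a
minimal sketch of a check built on that guarantee. It is not the original test:
it assumes C11 atomics and fences as userspace stand-ins for
smp_wmb()/smp_rmb(), assumes a 64-byte cache line, and uses a bounded loop so
that it terminates. The names first_counter, second_counter, writer_done and
count_ordering_violations() are hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Assumption: 64-byte cache lines; keep the two counters on separate lines. */
static _Alignas(64) atomic_int first_counter;
static _Alignas(64) atomic_int second_counter;
static _Alignas(64) atomic_int writer_done;

static void *writer(void *arg)
{
	int iterations = *(int *)arg;

	for (int i = 1; i <= iterations; i++) {
		atomic_store_explicit(&first_counter, i, memory_order_relaxed);
		/* Orders the store above before the store below (smp_wmb role). */
		atomic_thread_fence(memory_order_release);
		atomic_store_explicit(&second_counter, i, memory_order_relaxed);
	}
	atomic_store_explicit(&writer_done, 1, memory_order_release);
	return NULL;
}

static void *reader(void *arg)
{
	long *violations = arg;

	while (!atomic_load_explicit(&writer_done, memory_order_acquire)) {
		/* Load in the opposite order to the writer's stores. */
		int second = atomic_load_explicit(&second_counter,
						  memory_order_relaxed);
		/* Orders the load above before the load below (smp_rmb role). */
		atomic_thread_fence(memory_order_acquire);
		int first = atomic_load_explicit(&first_counter,
						 memory_order_relaxed);

		/* FIRST may run arbitrarily far ahead, but never behind. */
		if (first < second)
			(*violations)++;
	}
	return NULL;
}

/* Runs one writer and one reader; returns how often first < second was seen. */
long count_ordering_violations(int iterations)
{
	pthread_t writer_tid, reader_tid;
	long violations = 0;

	atomic_store(&first_counter, 0);
	atomic_store(&second_counter, 0);
	atomic_store(&writer_done, 0);

	pthread_create(&writer_tid, NULL, writer, &iterations);
	pthread_create(&reader_tid, NULL, reader, &violations);
	pthread_join(writer_tid, NULL);
	pthread_join(reader_tid, NULL);

	return violations;
}
```

Calling count_ordering_violations() from a small main() and building with
-pthread should report zero violations when the fences behave as specified. A
non-zero count would point at an ordering problem rather than at the test's
own load order.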

Thanks,
Gavin