Re: [PATCH] pci: endpoint: functions: Add a virtnet EP function

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Michael,

Thank you for your feedback!

On Tue, Sep 3, 2019 at 2:25 AM Michael S. Tsirkin <mst@xxxxxxxxxx> wrote:
> On Fri, Aug 23, 2019 at 02:31:45PM -0700, Haotian Wang wrote:
> > This endpoint function enables the PCI endpoint to establish a virtual
> > ethernet link with the PCI host. The main features are:
> > 
> > - Zero modification of PCI host kernel. The only requirement for the
> >   PCI host is to enable virtio, virtio_pci, virtio_pci_legacy and
> >   virito_net.
> 
> Do we need to support legacy? Why not just the modern interface?
> Even if yes, limiting device
> to only legacy support is not a good idea.

I absolutely agree with you on modern interfaces being better. The issue
here is that I did not support legacy because of compatibility reasons
but because I was forced to choose legacy.

In the summer, I asked the hardware team whether I had read-write access
to the capabilities registers from the endpoint but did not receive a
response back then.

Now I can write the code using modern virtio but I cannot easily verify
the epf will actually function on the hardware.

Reading and writing of capabilities list registers requires patches to
the pci endpoint framework and the designware endpoint controller as
well. I will probably work on that after I resolve these other issues.

> > +	if (!atomic_xchg(pending, 0))
> > +		usleep_range(check_queues_usec_min,
> > +			     check_queues_usec_max);
> 
> What's the usleep hackery doing? Set it too low and you
> waste cycles. Set it too high and your latency suffers.
> It would be nicer to just use a completion or something like this.

If the pending bit is set, the kthread will go directly into another
round. The usleep is here because, in case the pending bit is not set,
the kthread waits a certain while and then checks for buffers anyway as
a sort of "fallback" check.

Problem with completion is that there is no condition to complete on. I
can change the usleep_range() to schedule() if that is a more sensible
thing to do.

If you mean wait until the pending bit is set, I can do that. Back when
I wrote this module, the reason for not doing that was the endpoint
might fail to catch notification from the host.

If you are interested, here is a more detailed expanation.

> > +static int pci_epf_virtio_catch_notif(void *data)
> > +{
> > +	u16 changed;
> > +#ifdef CONFIG_PCI_EPF_VIRTIO_SUPPRESS_NOTIFICATION
> > +	void __iomem *avail_idx;
> > +	u16 event;
> > +#endif
> > +
> > +	register const __virtio16 default_notify = epf_cpu_to_virtio16(2);
> > +
> > +	struct pci_epf_virtio *const epf_virtio = data;
> > +	atomic_t *const pending = epf_virtio->pending;
> > +
> > +	while (!kthread_should_stop()) {
> > +		changed = epf_virtio16_to_cpu(epf_virtio->legacy_cfg->q_notify);
> > +		if (changed != 2) {
> > +			epf_virtio->legacy_cfg->q_notify = default_notify;
> > +			/* The pci host has made changes to virtqueues */
> > +			if (changed)
> > +				atomic_cmpxchg(pending, 0, 1);
> > +#ifdef CONFIG_PCI_EPF_VIRTIO_SUPPRESS_NOTIFICATION
> > +			avail_idx = IO_MEMBER_PTR(epf_virtio->avail[changed],
> > +						  struct vring_avail,
> > +						  idx);
> > +			event = epf_ioread16(avail_idx) + event_suppression;
> > +			write_avail_event(epf_virtio->used[changed], event);
> > +#endif
> > +		}
> > +		usleep_range(notif_poll_usec_min,
> > +			     notif_poll_usec_max);
> > +	}
> > +	return 0;
> > +}

The pending bit is set if the notification polling thread sees a value
in legacy_cfg->q_notify that is not 2, because the PCI host virtio_pci
will write either 0 when its rx queue consumes something or 1 if its tx
queue has offered a new buffer. My endpoing function will then set that
value back to 2. In this process there are numerous things that can go
wrong.

The host may write multiple 0 or 1's and the endpoint can only
detect one of them in an notif_poll usleep interval.

The host may write
some non-2 value as the endpoint code just finishes detecting the last
non-2 value and reverting that value back to 2, effectively nullifying
the new non-2 value.

The host may decide to write a non-2 value
immediately after the endpoint revert that value back to 2 but before
the endpoint code finishes the current loop of execution, effectively
making the value not reverted back to 2.

All these and other problems are made worse by the fact that the PCI
host Linux usually runs on much faster cores than the one on PCI
endpoint. This is why relying completely on pending bits is not always
safe. Hence the "fallback" check using usleep hackery exists.
Nevertheless I welcome any suggestion, because I do not like this
treatment myself either.

> > +	net_cfg->max_virtqueue_pairs = (__force __u16)epf_cpu_to_virtio16(1);
> 
> You don't need this without VIRTIO_NET_F_MQ.

Noted.

Best,
Haotian



[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux