Hi,
Thanks for taking a look at this.
On 5/1/20 2:05 AM, Greg KH wrote:
On Thu, Apr 30, 2020 at 04:19:22PM -0500, Jeremy Linton wrote:
On arm64, and possibly other architectures, requesting
IO coherent memory may return Normal-NC if the underlying
hardware isn't coherent. If these pages are then
remapped into userspace as Normal, that defeats the
purpose of getting Normal-NC, as well as resulting in
mappings with differing cache attributes.
What is "Normal-NC"?
A non-cacheable attribute on arm64 pages. I think Mark R & Marc Z
elaborated while I was asleep (thanks!).
.
In particular this happens with libusb, when it attempts
to create zero-copy buffers as is used by rtl-sdr, and
What is "rtl-sdr"
Its the realtek software defined radio (SDR), a really inexpensive TV
dongle that was discovered could be used as a general purpose SDR a
decade or so ago. In particular, this project
https://github.com/osmocom/rtl-sdr/
which is packaged by fedora/etc.
maybe other applications. The result is usually
application death.
So is this a new problem? Old problem? Old problem only showing up on
future devices? On current devices? I need a hint here as to know if
this is a bugfix or just work to make future devices work properly.
This has been a problem on arm devices without IO coherent USB
apparently for years. The rtl-sdr project itself has a disable zero-copy
mode that people have been using on rpi/etc specific builds. Fedora
OTOH, is building it with the same flags on x86 & arm64 which means that
people report problems. This happened a few days ago (on a pinebook),
and I duplicated it on an NXP platform just running the `rtl_test`
artifact with a nooelec from my junk box. Guessing that it was a page
mismatch I went looking for that, rather than disabling the zero copy
since punishing arm machine that have IO coherent USB adapters for the
sins of these low end devices isn't ideal. I found this, and this patch
allows the rtl_test app to run without issues on my NXP/solidrun.
Plus, given that its actually a kernel/libusb problem its likely there
are other applications having similar problems.
If dma_mmap_attr() is used instead of remap_pfn_range,
the page cache/etc attributes can be matched between the
kernel and userspace.
Signed-off-by: Jeremy Linton <jeremy.linton@xxxxxxx>
---
drivers/usb/core/devio.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/usb/core/devio.c b/drivers/usb/core/devio.c
index 6833c918abce..1e7458dd6e5d 100644
--- a/drivers/usb/core/devio.c
+++ b/drivers/usb/core/devio.c
@@ -217,6 +217,7 @@ static int usbdev_mmap(struct file *file, struct vm_area_struct *vma)
{
struct usb_memory *usbm = NULL;
struct usb_dev_state *ps = file->private_data;
+ struct usb_hcd *hcd = bus_to_hcd(ps->dev->bus);
size_t size = vma->vm_end - vma->vm_start;
void *mem;
unsigned long flags;
@@ -250,9 +251,7 @@ static int usbdev_mmap(struct file *file, struct vm_area_struct *vma)
usbm->vma_use_count = 1;
INIT_LIST_HEAD(&usbm->memlist);
- if (remap_pfn_range(vma, vma->vm_start,
- virt_to_phys(usbm->mem) >> PAGE_SHIFT,
- size, vma->vm_page_prot) < 0) {
+ if (dma_mmap_attrs(hcd->self.sysdev, vma, mem, dma_handle, size, 0)) {
Given that this code has not changed since 2016, how has no one noticed
this issue before?
They have there are a lot of reports of sdr failures, but the general
use case is rare?
And have you tested this change out on other systems (i.e. x86) to
ensure that this still works properly?
Yes and no, I did some basic libusb tests on an x86 machine, but its a
bit tricky at the moment for me to get the rtl plugged into a x86 test
machine. (its a work in progress).
And why isn't this call used more by drivers if this is a real issue?
The particulars of asking for iocoherent memory and then mapping it to
userspace is rarer than just asking for kmalloc()/remap() and then
performing the dma ops?
Then there are all the softer issues around arm64 testing/availability
and vendors carrying "fixes" for particular issues (like rtl-sdr
disabling zero copy).
And will this cause issues with how the userspace mapping is handled as
now we rely on userspace to do things differently? Or am I reading the
dma_mmap_attrs() documentation wrong?
I don't think userspace is doing anything differently here, and AFAIK,
on systems with IO coherent adapters this ends up with the same page
mapping as just doing the remap_pfn_rage() with the same attributes as
before. I've looked at dma_map_attrs() a bit, but i'm also trusting it
does what it says on the tin.
Thanks again for looking at this.
thanks,
greg k-h