On Mon, Apr 16, 2018 at 10:16:31AM +0300, Oleksandr Andrushchenko wrote: > On 04/13/2018 06:37 PM, Daniel Vetter wrote: > > On Wed, Apr 11, 2018 at 08:59:32AM +0300, Oleksandr Andrushchenko wrote: > > > On 04/10/2018 08:26 PM, Dongwon Kim wrote: > > > > On Tue, Apr 10, 2018 at 09:37:53AM +0300, Oleksandr Andrushchenko wrote: > > > > > On 04/06/2018 09:57 PM, Dongwon Kim wrote: > > > > > > On Fri, Apr 06, 2018 at 03:36:03PM +0300, Oleksandr Andrushchenko wrote: > > > > > > > On 04/06/2018 02:57 PM, Gerd Hoffmann wrote: > > > > > > > > Hi, > > > > > > > > > > > > > > > > > > I fail to see any common ground for xen-zcopy and udmabuf ... > > > > > > > > > Does the above mean you can assume that xen-zcopy and udmabuf > > > > > > > > > can co-exist as two different solutions? > > > > > > > > Well, udmabuf route isn't fully clear yet, but yes. > > > > > > > > > > > > > > > > See also gvt (intel vgpu), where the hypervisor interface is abstracted > > > > > > > > away into a separate kernel modules even though most of the actual vgpu > > > > > > > > emulation code is common. > > > > > > > Thank you for your input, I'm just trying to figure out > > > > > > > which of the three z-copy solutions intersect and how much > > > > > > > > > And what about hyper-dmabuf? > > > > > > xen z-copy solution is pretty similar fundamentally to hyper_dmabuf > > > > > > in terms of these core sharing feature: > > > > > > > > > > > > 1. the sharing process - import prime/dmabuf from the producer -> extract > > > > > > underlying pages and get those shared -> return references for shared pages > > > > Another thing is danvet was kind of against to the idea of importing existing > > > > dmabuf/prime buffer and forward it to the other domain due to synchronization > > > > issues. He proposed to make hyper_dmabuf only work as an exporter so that it > > > > can have a full control over the buffer. I think we need to talk about this > > > > further as well. > > > Yes, I saw this. But this limits the use-cases so much. > > > For instance, running Android as a Guest (which uses ION to allocate > > > buffers) means that finally HW composer will import dma-buf into > > > the DRM driver. Then, in case of xen-front for example, it needs to be > > > shared with the backend (Host side). Of course, we can change user-space > > > to make xen-front allocate the buffers (make it exporter), but what we try > > > to avoid is to change user-space which in normal world would have remain > > > unchanged otherwise. > > > So, I do think we have to support this use-case and just have to understand > > > the complexity. > > Erm, why do you need importer capability for this use-case? > > > > guest1 -> ION -> xen-front -> hypervisor -> guest 2 -> xen-zcopy exposes > > that dma-buf -> import to the real display hw > > > > No where in this chain do you need xen-zcopy to be able to import a > > dma-buf (within linux, it needs to import a bunch of pages from the > > hypervisor). > > > > Now if your plan is to use xen-zcopy in the guest1 instead of xen-front, > > then you indeed need to import. > This is the exact use-case I was referring to while saying > we need to import on Guest1 side. If hyper-dmabuf is so > generic that there is no xen-front in the picture, then > it needs to import a dma-buf, so it can be exported at Guest2 side. > > But that imo doesn't make sense: > > - xen-front gives you clearly defined flip events you can forward to the > > hypervisor. xen-zcopy would need to add that again. > xen-zcopy is a helper driver which doesn't handle page flips > and is not a KMS driver as one might think of: the DRM UAPI it uses is > just to export a dma-buf as a PRIME buffer, but that's it. > Flipping etc. is done by the backend [1], not xen-zcopy. > > Same for > > hyperdmabuf (and really we're not going to shuffle struct dma_fence over > > the wire in a generic fashion between hypervisor guests). > > > > - xen-front already has the idea of pixel format for the buffer (and any > > other metadata). Again, xen-zcopy and hyperdmabuf lack that, would need > > to add it shoehorned in somehow. > Again, here you are talking of something which is implemented in > Xen display backend, not xen-zcopy, e.g. display backend can > implement para-virtual display w/o xen-zcopy at all, but in this case > there is a memory copying for each frame. With the help of xen-zcopy > the backend feeds xen-front's buffers directly into Guest2 DRM/KMS or > Weston or whatever as xen-zcopy exports remote buffers as PRIME buffers, > thus no buffer copying is required. Why do you need to copy on every frame for xen-front? In the above pipeline, using xen-front I see 0 architectural reasons to have a copy anywhere. This seems to be the core of the confusion we're having here. > > Ofc you won't be able to shovel sound or media stream data over to another > > guest like this, but that's what you have xen-v4l and xen-sound or > > whatever else for. Trying to make a new uapi, which means userspace must > > be changed for all the different use-case, instead of reusing standard > > linux driver uapi (which just happens to send the data to another > > hypervisor guest instead of real hw) imo just doesn't make much sense. > > > > Also, at least for the gpu subsystem: Any new uapi must have full > > userspace available for it, see: > > > > https://dri.freedesktop.org/docs/drm/gpu/drm-uapi.html#open-source-userspace-requirements > > > > Adding more uapi is definitely the most painful way to fix a use-case. > > Personally I'd go as far and also change the xen-zcopy side on the > > receiving guest to use some standard linux uapi. E.g. you could write an > > output v4l driver to receive the frames from guest1. > So, we now know that xen-zcopy was not meant to handle page flips, > but to implement new UAPI to let user-space create buffers either > from Guest2 grant references (so it can be exported to Guest1) or > other way round, e.g. create (from Guest1 grant references to export to > Guest 2). For that reason it adds 2 IOCTLs: create buffer from grefs > or produce grefs for the buffer given. > One additional IOCTL is to wait for the buffer to be released by > Guest2 user-space. > That being said, I don't quite see how v4l can be used here to implement > UAPI I need. Under the assumption that you can make xen-front to zerocopy for the kernel->hypervisor path, v4l could be made to work for the hypervisor->kernel side of the pipeline. But it sounds like we have a confusion already on why or why not xen-front can or cannot do zerocopy. > > > > danvet, can you comment on this topic? > > > > > > > > > > 2. the page sharing mechanism - it uses Xen-grant-table. > > > > > > > > > > > > And to give you a quick summary of differences as far as I understand > > > > > > between two implementations (please correct me if I am wrong, Oleksandr.) > > > > > > > > > > > > 1. xen-zcopy is DRM specific - can import only DRM prime buffer > > > > > > while hyper_dmabuf can export any dmabuf regardless of originator > > > > > Well, this is true. And at the same time this is just a matter > > > > > of extending the API: xen-zcopy is a helper driver designed for > > > > > xen-front/back use-case, so this is why it only has DRM PRIME API > > > > > > 2. xen-zcopy doesn't seem to have dma-buf synchronization between two VMs > > > > > > while (as danvet called it as remote dmabuf api sharing) hyper_dmabuf sends > > > > > > out synchronization message to the exporting VM for synchronization. > > > > > This is true. Again, this is because of the use-cases it covers. > > > > > But having synchronization for a generic solution seems to be a good idea. > > > > Yeah, understood xen-zcopy works ok with your use case. But I am just curious > > > > if it is ok not to have any inter-domain synchronization in this sharing model. > > > The synchronization is done with displif protocol [1] > > > > The buffer being shared is technically dma-buf and originator needs to be able > > > > to keep track of it. > > > As I am working in DRM terms the tracking is done by the DRM core > > > for me for free. (This might be one of the reasons Daniel sees DRM > > > based implementation fit very good from code-reuse POV). > > Hm, not sure what tracking you refer to here all ... I got lost in all the > > replies while catching up. > > > I was just referring to accounting stuff already implemented in the DRM > core, > so I don't have to worry about doing the same for buffers to understand > when they are released etc. > > > > > > 3. 1-level references - when using grant-table for sharing pages, there will > > > > > > be same # of refs (each 8 byte) > > > > > To be precise, grant ref is 4 bytes > > > > You are right. Thanks for correction.;) > > > > > > > > > > as # of shared pages, which is passed to > > > > > > the userspace to be shared with importing VM in case of xen-zcopy. > > > > > The reason for that is that xen-zcopy is a helper driver, e.g. > > > > > the grant references come from the display backend [1], which implements > > > > > Xen display protocol [2]. So, effectively the backend extracts references > > > > > from frontend's requests and passes those to xen-zcopy as an array > > > > > of refs. > > > > > > Compared > > > > > > to this, hyper_dmabuf does multiple level addressing to generate only one > > > > > > reference id that represents all shared pages. > > > > > In the protocol [2] only one reference to the gref directory is passed > > > > > between VMs > > > > > (and the gref directory is a single-linked list of shared pages containing > > > > > all > > > > > of the grefs of the buffer). > > > > ok, good to know. I will look into its implementation in more details but is > > > > this gref directory (chained grefs) something that can be used for any general > > > > memory sharing use case or is it jsut for xen-display (in current code base)? > > > Not to mislead you: one grant ref is passed via displif protocol, > > > but the page it's referencing contains the rest of the grant refs. > > > > > > As to if this can be used for any memory: yes. It is the same for > > > sndif and displif Xen protocols, but defined twice as strictly speaking > > > sndif and displif are two separate protocols. > > > > > > While reviewing your RFC v2 one of the comments I had [2] was that if we > > > can start from defining such a generic protocol for hyper-dmabuf. > > > It can be a header file, which not only has the description part > > > (which then become a part of Documentation/...rst file), but also defines > > > all the required constants for requests, responses, defines message formats, > > > state diagrams etc. all at one place. Of course this protocol must not be > > > Xen specific, but be OS/hypervisor agnostic. > > > Having that will trigger a new round of discussion, so we have it all > > > designed > > > and discussed before we start implementing. > > > > > > Besides the protocol we have to design UAPI part as well and make sure > > > the hyper-dmabuf is not only accessible from user-space, but there will be > > > number > > > of kernel-space users as well. > > Again, why do you want to create new uapi for this? Given the very strict > > requirements we have for new uapi (see above link), it's the toughest way > > to get any kind of support in. > I do understand that adding new UAPI is not good for many reasons. > But here I was meaning that current hyper-dmabuf design is > only user-space oriented, e.g. it provides number of IOCTLs to do all > the work. But I need a way to access the same from the kernel, so, for > example, > some other para-virtual driver can export/import dma-buf, not only > user-space. If you need an import-export helper library, just merge it. Do not attach any uapi to it, just the internal helpers. Much, much, much easier to land. > > That's why I had essentially zero big questions for xen-front (except some > > implementation improvements, and stuff to make sure xen-front actually > > implements the real uapi semantics instead of its own), and why I'm asking > > much more questions on this stuff here. > > > > > > > > 4. inter VM messaging (hype_dmabuf only) - hyper_dmabuf has inter-vm msg > > > > > > communication defined for dmabuf synchronization and private data (meta > > > > > > info that Matt Roper mentioned) exchange. > > > > > This is true, xen-zcopy has no means for inter VM sync and meta-data, > > > > > simply because it doesn't have any code for inter VM exchange in it, > > > > > e.g. the inter VM protocol is handled by the backend [1]. > > > > > > 5. driver-to-driver notification (hyper_dmabuf only) - importing VM gets > > > > > > notified when newdmabuf is exported from other VM - uevent can be optionally > > > > > > generated when this happens. > > > > > > > > > > > > 6. structure - hyper_dmabuf is targetting to provide a generic solution for > > > > > > inter-domain dmabuf sharing for most hypervisors, which is why it has two > > > > > > layers as mattrope mentioned, front-end that contains standard API and backend > > > > > > that is specific to hypervisor. > > > > > Again, xen-zcopy is decoupled from inter VM communication > > > > > > > > No idea, didn't look at it in detail. > > > > > > > > > > > > > > > > Looks pretty complex from a distant view. Maybe because it tries to > > > > > > > > build a communication framework using dma-bufs instead of a simple > > > > > > > > dma-buf passing mechanism. > > > > > > we started with simple dma-buf sharing but realized there are many > > > > > > things we need to consider in real use-case, so we added communication > > > > > > , notification and dma-buf synchronization then re-structured it to > > > > > > front-end and back-end (this made things more compicated..) since Xen > > > > > > was not our only target. Also, we thought passing the reference for the > > > > > > buffer (hyper_dmabuf_id) is not secure so added uvent mechanism later. > > > > > > > > > > > > > Yes, I am looking at it now, trying to figure out the full story > > > > > > > and its implementation. BTW, Intel guys were about to share some > > > > > > > test application for hyper-dmabuf, maybe I have missed one. > > > > > > > It could probably better explain the use-cases and the complexity > > > > > > > they have in hyper-dmabuf. > > > > > > One example is actually in github. If you want take a look at it, please > > > > > > visit: > > > > > > > > > > > > https://github.com/downor/linux_hyper_dmabuf_test/tree/xen/simple_export > > > > > Thank you, I'll have a look > > > > > > > > Like xen-zcopy it seems to depend on the idea that the hypervisor > > > > > > > > manages all memory it is easy for guests to share pages with the help of > > > > > > > > the hypervisor. > > > > > > > So, for xen-zcopy we were not trying to make it generic, > > > > > > > it just solves display (dumb) zero-copying use-cases for Xen. > > > > > > > We implemented it as a DRM helper driver because we can't see any > > > > > > > other use-cases as of now. > > > > > > > For example, we also have Xen para-virtualized sound driver, but > > > > > > > its buffer memory usage is not comparable to what display wants > > > > > > > and it works somewhat differently (e.g. there is no "frame done" > > > > > > > event, so one can't tell when the sound buffer can be "flipped"). > > > > > > > At the same time, we do not use virtio-gpu, so this could probably > > > > > > > be one more candidate for shared dma-bufs some day. > > > > > > > > Which simply isn't the case on kvm. > > > > > > > > > > > > > > > > hyper-dmabuf and xen-zcopy could maybe share code, or hyper-dmabuf build > > > > > > > > on top of xen-zcopy. > > > > > > > Hm, I can imagine that: xen-zcopy could be a library code for hyper-dmabuf > > > > > > > in terms of implementing all that page sharing fun in multiple directions, > > > > > > > e.g. Host->Guest, Guest->Host, Guest<->Guest. > > > > > > > But I'll let Matt and Dongwon to comment on that. > > > > > > I think we can definitely collaborate. Especially, maybe we are using some > > > > > > outdated sharing mechanism/grant-table mechanism in our Xen backend (thanks > > > > > > for bringing that up Oleksandr). However, the question is once we collaborate > > > > > > somehow, can xen-zcopy's usecase use the standard API that hyper_dmabuf > > > > > > provides? I don't think we need different IOCTLs that do the same in the final > > > > > > solution. > > > > > > > > > > > If you think of xen-zcopy as a library (which implements Xen > > > > > grant references mangling) and DRM PRIME wrapper on top of that > > > > > library, we can probably define proper API for that library, > > > > > so both xen-zcopy and hyper-dmabuf can use it. What is more, I am > > > > > about to start upstreaming Xen para-virtualized sound device driver soon, > > > > > which also uses similar code and gref passing mechanism [3]. > > > > > (Actually, I was about to upstream drm/xen-front, drm/xen-zcopy and > > > > > snd/xen-front and then propose a Xen helper library for sharing big buffers, > > > > > so common code of the above drivers can use the same code w/o code > > > > > duplication) > > > > I think it is possible to use your functions for memory sharing part in > > > > hyper_dmabuf's backend (this 'backend' means the layer that does page sharing > > > > and inter-vm communication with xen-specific way.), so why don't we work on > > > > "Xen helper library for sharing big buffers" first while we continue our > > > > discussion on the common API layer that can cover any dmabuf sharing cases. > > > > > > > Well, I would love we reuse the code that I have, but I also > > > understand that it was limited by my use-cases. So, I do not > > > insist we have to ;) > > > If we start designing and discussing hyper-dmabuf protocol we of course > > > can work on this helper library in parallel. > > Imo code reuse is overrated. Adding new uapi is what freaks me out here > > :-) > > > > If we end up with duplicated implementations, even in upstream, meh, not > > great, but also ok. New uapi, and in a similar way, new hypervisor api > > like the dma-buf forwarding that hyperdmabuf does is the kind of thing > > that will lock us in for 10+ years (if we make a mistake). > > > > > > > Thank you, > > > > > Oleksandr > > > > > > > > > > P.S. All, is it a good idea to move this out of udmabuf thread into a > > > > > dedicated one? > > > > Either way is fine with me. > > > So, if you can start designing the protocol we may have a dedicated mail > > > thread for that. I will try to help with the protocol as much as I can > > Please don't start with the protocol. Instead start with the concrete > > use-cases, and then figure out why exactly you need new uapi. Once we have > > that answered, we can start thinking about fleshing out the details. > On my side there are only 2 use-cases, Guest2 only: > 1. Create a PRIME (dma-buf) from grant references > 2. Create grant references from PRIME (dma-buf) So these grant references, are those userspace visible things? I thought the grant references was just the kernel/hypervisor internal magic to make this all work? -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch