On Mon, May 13, 2013 at 1:18 PM, Inki Dae <inki.dae@xxxxxxxxxxx> wrote:
>
>
> 2013/5/13 Rob Clark <robdclark@xxxxxxxxx>
>>
>> On Mon, May 13, 2013 at 8:21 AM, Inki Dae <inki.dae@xxxxxxxxxxx> wrote:
>> >
>> >> In that case you still wouldn't give userspace control over the
>> >> fences. I don't see any way that can end well.
>> >> What if userspace never signals? What if userspace gets killed by
>> >> oom killer? Who keeps track of that?
>> >>
>> >
>> > In all cases, all kernel resources for the user fence will be
>> > released by the kernel once the fence has timed out: never signaling
>> > and the process being killed by the oom killer both make the fence
>> > time out. And if we use the mmap mechanism you mentioned before, I
>> > think user resources could also be freed properly.
>>
>>
>> I tend to agree w/ Maarten here.. there is no good reason for
>> userspace to be *signaling* fences.  The exception might be some blob
>> gpu drivers which don't have enough knowledge in the kernel to figure
>> out what to do.  (In which case you can add driver private ioctls for
>> that.. still not the right thing to do but at least you don't make a
>> public API out of it.)
>>
>
> Please do not care whether those are generic or not. Let's see the
> following three things. First, it's cache operation. As you know, ARM
> SoC has ACP (Accelerator Coherency Port) and can be connected to DMA
> engine or similar devices. And this port is used for cache coherency
> between CPU cache and DMA device. However, most devices on ARM based
> embedded systems don't use the ACP port. So they need proper cache
> operation before and after DMA or CPU access in case of using a
> cacheable mapping. Actually, I see many Linux based platforms call
> cache control interfaces directly for that. I think the reason they do
> so is that the kernel isn't aware of when and how the CPU accessed
> memory.

I think we had kicked around the idea of giving dmabuf's a
prepare/finish ioctl quite some time back.
This is probably something that should be at least a bit decoupled from
fences.  (Possibly 'prepare' waits for dma access to complete, but not
the other way around.)

And I did implement in omapdrm support for simulating coherency via
page fault-in / shoot-down..  It is one option that makes it
completely transparent to userspace, although there is some
performance cost, so I suppose it depends a bit on your use-case.

> And second, user process has to do so many things in case of using
> shared memory with DMA device. User process should understand how DMA
> device is operated and when interfaces for controlling the DMA device
> are called. Such things would make user application so complicated.
>
> And third, it's performance optimization to multimedia and graphics
> devices. As I mentioned already, we should consider sequential
> processing for buffer sharing between CPU and DMA device. This means
> that CPU should stay idle until DMA device is completed and vice
> versa.
>
> That is why I proposed such user interfaces. Of course, these
> interfaces might be so ugly yet: for this, Maarten pointed already out
> and I agree with him. But there must be another better way. Don't you
> think we need a similar thing? With such interfaces, cache control and
> buffer synchronization can be performed in kernel level. Moreover,
> user application doesn't need to consider DMA device controlling
> anymore. Therefore, one thread can access a shared buffer and the
> other can control DMA device with the shared buffer in parallel. We
> can really make the best use of CPU and DMA idle time. In other words,
> we can really make the best use of multi tasking OS, Linux.
>
> So could you please tell me whether there is any reason we don't use
> public API for it? I think we can add and use public API if NECESSARY.

Well, for cache management, I think it is a better idea.. I didn't
really catch that this was the motivation from the initial patch, but
maybe I read it too quickly.
But cache can be decoupled from synchronization, because CPU access is
not asynchronous.  For userspace/CPU access to buffer, you should:

  1) wait for buffer
  2) prepare-access
  3) ... do whatever cpu access to buffer ...
  4) finish-access
  5) submit buffer for new dma-operation

I suppose you could combine the syscall for #1 and #2.. not sure if
that is a good idea or not.  But you don't need to.  And there is
never really any need for userspace to signal a fence.

BR,
-R

> Thanks,
> Inki Dae
>
>>
>> BR,
>> -R
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@xxxxxxxxxxxxxxxxxxxxx
>> http://lists.freedesktop.org/mailman/listinfo/dri-devel
>
>
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/dri-devel
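
The five-step CPU access sequence from the reply (wait, prepare-access,
CPU access, finish-access, submit) can be sketched as a toy model. This
is only an illustration in Python, not a real dma-buf interface: the
class SharedBuffer and its methods wait, prepare_access, finish_access,
submit and dma_complete are hypothetical names, and the cache operations
are reduced to bookkeeping. The point it tries to show is that only the
simulated driver side ever marks DMA as finished; userspace waits and
brackets its access, but never signals anything.

```python
import threading


class SharedBuffer:
    """Toy model of a buffer shared between the CPU and a DMA device.

    All names here are hypothetical; this is not a real dma-buf API.
    """

    def __init__(self, size):
        self.data = bytearray(size)
        self.cpu_access = False           # True between prepare and finish
        self._dma_idle = threading.Event()
        self._dma_idle.set()              # no DMA in flight initially
        self.log = []                     # records the order of operations

    # --- simulated kernel/driver side -----------------------------------
    def dma_complete(self):
        """Called by the simulated driver when the device is done."""
        self._dma_idle.set()
        self.log.append("dma-complete")

    # --- simulated userspace side ---------------------------------------
    def wait(self, timeout=None):
        """Step 1: block until any outstanding DMA has finished."""
        if not self._dma_idle.wait(timeout):
            raise TimeoutError("DMA still in flight")
        self.log.append("wait")

    def prepare_access(self):
        """Step 2: stand-in for cache maintenance before CPU access."""
        if not self._dma_idle.is_set():
            raise RuntimeError("prepare-access while DMA is in flight")
        self.cpu_access = True
        self.log.append("prepare-access")

    def finish_access(self):
        """Step 4: stand-in for cache maintenance after CPU access."""
        if not self.cpu_access:
            raise RuntimeError("finish-access without prepare-access")
        self.cpu_access = False
        self.log.append("finish-access")

    def submit(self):
        """Step 5: hand the buffer to the device for a new DMA operation."""
        if self.cpu_access:
            raise RuntimeError("submit before finish-access")
        self._dma_idle.clear()
        self.log.append("submit")


def cpu_fill(buf, value):
    """Run the five-step sequence once, filling the buffer with `value`."""
    buf.wait()
    buf.prepare_access()
    buf.data[:] = bytes([value]) * len(buf.data)   # step 3: CPU access
    buf.finish_access()
    buf.submit()
```

In this sketch, combining steps 1 and 2 into one call would just mean
having prepare_access call wait first; the model keeps them separate to
match the list above.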