[PATCH/RFC v1 0/1] Buffer sync for non cache-coherent architectures

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello,

this is the initial patch for buffer data synchronization for architectures
with non-coherent cache.


=====================
Rationale
=====================

Architectures with non-coherent CPU cache (e.g. ARM) require a sync both before
and after a hardware operation. Until now, videobuf could work properly for
cache-coherent memory areas only, which for ARM actually means no cache at all.
This is not only reduces functionality, but hinders performance.

We would like to add support for video buffers present in CPU-cached areas as
well, which is especially important for OUTPUT buffers, but valid for CAPTURE
buffers as well (see DMA_FROM_DEVICE below).


We have isolated 4 different types of sync operations. Three of them are the
ones defined in pci.h and dma-mapping.h (enum dma_data_direction):

- DMA_FROM_DEVICE - the buffer is to be used as a destination buffer, the
  data contained within before the operation is unimportant.

  Although the data in memory is not important, we have to prevent a possible
  future writeback to it resulting in an accidental overwrite of contents
  produced by hardware after an operation. This requires cache invalidation
  before the operation.

- DMA_TO_DEVICE - the data in buffer has been touched by (prepared using)
   the CPU and will be used as valid source for an operation. This data
   may still be in cache but not yet in memory. This requires a writeback.

- DMA_BIDIRECTIONAL - the buffer will be used as both source and destination.
  This requires both writeback and cache invalidation.

There is one more operation we are considering, although not really
cache-related:

- FINISH - the operation is finished and the data in buffer (put there by
  hardware) will be used further.

  Operations required here may include:
  * Making the memory pages involved in the operation dirty.
  * Copying data back from a bounce buffer.
  They are not strictly cache-related, but valid from the point of view
  of our proposed approach.



>From the point of view of the videobuf framework, the following scenarios can be
isolated, depending on when the sync has been called and current buffer type:

- before hardware operation
   - OUTPUT: DMA_TO_DEVICE
   - CAPTURE: DMA_FROM_DEVICE

- after hardware operation
   - OUTPUT: nothing
   - CAPTURE: FINISH

DMA_BIDIRECTIONAL would take place for OUTPUT+CAPTURE buffers, which are not
(yet?) used in videobuf.


=====================
sync() in videobuf
=====================

videobuf includes a sync operation, declared in struct videobuf_qtype_ops,
which can be implemented by each memory type-specific module. It is used for
different purposes than cache management though, namely for operations like
copying data back from bounce buffers after an operation.
Only videobuf-dma-sg does anything in its sync currently.

Operations required to be done before hardware operation is started are
currently performed in iolock(), but it is not usable for cache coherency
management. The reason for this is that iolock() is intended to be run once
per buffer, not once per each hardware operation. Cache coherency operations
have to be performed before every operation though, also on previously
iolock()ed buffers.

We believe that the existing videobuf sync() operation can be extended for cache
management. This requires adding sync() calls before each hardware operation
as well. It is left to the callee to determine the kind of sync requested,
taking into account the guidelines below.

No new flags/states/etc. have to be added to videobuf for it to support all
kinds of syncs mentioned above. Current sync type can be determined in
memory-specific code in two steps:

1. CAPTURE or OUTPUT - based on buffer type in struct videobuf_queue.
2. before or after operation - based on buffer state in struct videobuf_buffer:
    VIDEOBUF_DONE or VIDEOBUF_ERROR -> after, otherwise -> before.


Example (pseudo)code for a sync() operation:

#define is_sync_after(vb) \
        (vb->state == VIDEOBUF_DONE || vb->state == VIDEOBUF_ERROR)

int videobuf_foo_sync(struct videobuf_queue *q, struct videobuf_buffer *vb)
{
    /* ... */

    if (is_sync_after(vb) {
        /* Sync after operation */
        if (q->type == V4L2_BUF_TYPE_VIDEO_CAPTURE) {
            do_sync_finish();
        } else if (q->type == V4L2_BUF_TYPE_VIDEO_OUTPUT) {
            /* nothing, unless we are missing something */
        }
    } else {
        /* Sync before operation */
        if (q->type == V4L2_BUF_TYPE_VIDEO_CAPTURE) {
            do_sync_from_device();
        } else if (q->type == V4L2_BUF_TYPE_VIDEO_OUTPUT) {
            do_sync_to_device();
        }
    }

    /* ... */
}



=====================
About this patch
=====================

This is a very small patch, just a starting point for future, memory-specific
development. No existing functionality is changed. These are the only changes
that - in our opinion - are required in the core framework. Each memory-specific
videobuf submodule will have to implement (or ignore) additional sync
functionality depending on its memory type.

A small patch for dma-sg, which is the only type that has its own sync is
included. The only thing it does is recognizing whether the sync is a
"post-operation" one and in that case the same code as previously is called, so
the functionality remains unchanged.

We will be posting cache coherency patches for dma-contig in the near future.


Best regards
--
Pawel Osciak
Linux Platform Group
Samsung Poland R&D Center
--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Input]     [Video for Linux]     [Gstreamer Embedded]     [Mplayer Users]     [Linux USB Devel]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [Yosemite Backpacking]
  Powered by Linux