The FCam camera control API

Hi all,

I wanted to talk a bit more about our project here at Stanford, because I thought several people on this list might be interested in it (although I know some of you may already have heard about it).

The short version:
We released a high-level camera control API for the Nokia N900 a few months ago, which allows precise frame-level control of the sensor and associated devices while running at the full sensor frame rate. Instead of treating the imaging pipeline as a box with control knobs on it that sends out a flow of images, we treat it as a box that transforms image requests into images, with each request encapsulating all sensor configuration (including resolution and frame duration). This makes it very easy to write applications such as an HDR viewfinder, which needs to alternate the sensor exposure time every frame. We've released a sample application for the N900 with features such as full manual camera control, HDR viewfinding/metering, best-of-8 burst mode, and saving raw sensor data as DNGs. Our co-authors at Nokia have also released an HDR app that does on-device HDR fusion and a low-light photography app that improves image quality for dark scenes.
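
For instance, the core of an HDR viewfinder comes down to streaming a pair of alternating requests. A rough sketch (API names from memory of the release - the home page below has the exact signatures):

  #include <FCam/N900.h>
  #include <vector>

  FCam::N900::Sensor sensor;

  // Two requests that differ only in exposure time.
  std::vector<FCam::Shot> pair(2);
  pair[0].exposure = 5000;      // microseconds
  pair[1].exposure = 40000;
  pair[0].image = FCam::Image(640, 480, FCam::UYVY);
  pair[1].image = FCam::Image(640, 480, FCam::UYVY);

  // The sensor cycles through the pair at full frame rate.
  sensor.stream(pair);
  while (true) {
      FCam::Frame f = sensor.getFrame();
      // f.exposure() says which half of the pair this frame is;
      // fuse short/long pairs and display.
  }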

The home page is here:
http://fcam.garage.maemo.org

The long version:
Over the last two years or so, the research group I'm part of has been working on building our own fully open digital camera platform, aimed at researchers and hobbyists working in computational photography. That covers more mundane things like HDR photography or panorama capture, and more out-there things like handshake removal, post-capture refocusable images, and so on. This has all been in collaboration with Nokia Research Center Palo Alto.

After poking around with existing hardware and APIs, we found all of them to have various limitations from our point of view - there are very few open camera hardware platforms out there, and even where open hardware exists, most camera control APIs are not well-suited to computational photography.

Most of the details can be found in our SIGGRAPH 2010 paper, here:
http://graphics.stanford.edu/papers/fcam/

So, we built our own user-space C++ API on top of V4L2, which runs on both the N900 and our home-built F2 Frankencamera (both use the OMAP3). We call it FCam.


What does it do that V4L2 doesn't?

1) Imaging system configuration is done on a per-frame basis - every frame output from the system can have a different set of parameters, deterministically. This is done by moving sensor/imaging pipeline state out of the device and into requests that are fed to the device. One image is produced for every request fed into the system, and multiple requests can be in flight at once. The only blocking call is the one to pop a completed Frame from the output queue. (There's a code sketch after point 3 below showing what this looks like.)

2) Metadata (the original request, statistics unit output, and the state of other associated devices like the lens) and the image data are all tied together into a single packaged Frame that's handed back to the user. This makes it trivial to determine what sensor settings were used to capture the image data, so that one can easily write a metering or autofocus routine, for example.

3) Other devices can be synchronized with sensor operation - for example, the flash can be set to fire N ms into the exposure for a request. The application doesn't have to do the synchronization work itself.
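
Roughly, here's what points 1-3 look like in code. This is a sketch from memory of the released API - see the documentation on the home page for the exact signatures:

  #include <FCam/N900.h>

  FCam::N900::Sensor sensor;
  FCam::N900::Flash  flash;

  // (1) Two differently-configured requests; all sensor state travels
  // with the Shot, and both can be in flight at once.
  FCam::Shot shortExp, longExp;
  shortExp.exposure = 1000;        // microseconds
  longExp.exposure  = 100000;
  shortExp.image = FCam::Image(640, 480, FCam::UYVY);
  longExp.image  = FCam::Image(640, 480, FCam::UYVY);

  // (3) Attach a timed flash action to the long exposure; the runtime
  // does the synchronization with the sensor.
  FCam::Flash::FireAction fire(&flash);
  fire.time       = 10000;         // fire 10 ms into the exposure
  fire.duration   = flash.minDuration();
  fire.brightness = flash.maxBrightness();
  longExp.addAction(fire);

  sensor.capture(shortExp);        // neither call blocks
  sensor.capture(longExp);

  // The only blocking call: pop completed Frames, in request order.
  FCam::Frame f1 = sensor.getFrame();
  FCam::Frame f2 = sensor.getFrame();

  // (2) Each Frame packages the image with the settings actually used
  // and the statistics output, so e.g. FCam::autoExpose(&shot, f1)
  // can update a shot straight from f1's histogram.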

Where did we have problems with V4L2?

Mostly everything was fine, but we ran into some problems that are due to V4L2's design as a video streaming API:
1) Fixed resolution
2) Fixed frame rate
We want to swap between viewfinder and full-resolution frames as fast as possible (and in the future, between arbitrary resolutions or regions of interest), and if we're capturing an HDR burst, say, we don't want the frame rate to be constrained by the longest required exposure.
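
In FCam terms, what we want to express is a burst where each request carries its own frame duration, so short exposures aren't padded out to the period of the longest one. A sketch (resolution and field names as I recall them from the N900 release):

  std::vector<FCam::Shot> burst(3);
  int exposures[] = {1000, 10000, 100000};       // 1, 10, 100 ms
  for (int i = 0; i < 3; i++) {
      burst[i].exposure  = exposures[i];
      // Per-request frame duration; the runtime clamps this to what
      // the sensor can actually achieve at this resolution.
      burst[i].frameTime = exposures[i] + 33000;
      burst[i].image     = FCam::Image(2592, 1968, FCam::RAW);
  }
  sensor.capture(burst);                          // one Frame per request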

Our current implementation works around #2 by adding a custom V4L2 control to change frame duration while streaming - this is clearly against the spirit of the V4L2 API, but is essential for our system to work well. I'd be interested in knowing if there's a better way to deal with this than circumventing the API's promises.
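
Concretely, the workaround is an ordinary user-space control set; the only nonstandard part is the private control ID our patched ISP driver exposes (the ID below is illustrative, not stock V4L2):

  #include <linux/videodev2.h>
  #include <sys/ioctl.h>

  // Hypothetical private control added by our patched driver.
  #define V4L2_CID_FRAME_DURATION (V4L2_CID_PRIVATE_BASE + 0)

  int set_frame_duration(int fd, int usec)
  {
      struct v4l2_control ctrl;
      ctrl.id    = V4L2_CID_FRAME_DURATION;
      ctrl.value = usec;
      // Takes effect without a STREAMOFF/STREAMON cycle.
      return ioctl(fd, VIDIOC_S_CTRL, &ctrl);
  }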

#1 we couldn't do anything about, so if a request has a different resolution/pixel format than the previous request, the runtime simply has to stop V4L2 streaming, reconfigure, and start streaming again. I don't see this part of V4L2 changing in the future, but hopefully the switching time will be reduced as implementations improve (with the shipping N900 OMAP3 ISP driver, this takes about 700 ms).
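
For reference, the reconfiguration path is the standard V4L2 stop/reconfigure/restart sequence - it's this round trip (plus re-negotiating buffers) that costs the ~700 ms:

  #include <linux/videodev2.h>
  #include <string.h>
  #include <sys/ioctl.h>

  // Switch capture resolution: stop streaming, set the new format,
  // restart. Buffer re-requesting (VIDIOC_REQBUFS etc.) omitted.
  int switch_resolution(int fd, int w, int h)
  {
      enum v4l2_buf_type type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
      if (ioctl(fd, VIDIOC_STREAMOFF, &type) < 0) return -1;

      struct v4l2_format fmt;
      memset(&fmt, 0, sizeof(fmt));
      fmt.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
      fmt.fmt.pix.width       = w;
      fmt.fmt.pix.height      = h;
      fmt.fmt.pix.pixelformat = V4L2_PIX_FMT_SGRBG10;  // raw Bayer, for example
      if (ioctl(fd, VIDIOC_S_FMT, &fmt) < 0) return -1;

      return ioctl(fd, VIDIOC_STREAMON, &type);
  }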

We're now working on an F3 Frankencamera with a large (24x24 mm) sensor, using the Cypress LUPA-4000 image sensor (the F2 uses the Aptina MT9P031 cell phone sensor). We have an NSF grant to distribute N900s and F2 or F3 Frankencameras to researchers and classes in the US, to run courses in computational photography and to provide researchers with a platform to experiment with.

We think we've come up with something that works better than typical application-level still camera APIs, both for writing regular camera applications and for the crazier experimental stuff we're interested in. So we're hoping developers and manufacturers take a look at what we've done, and perhaps the capabilities we'd love to have will become commonplace.

Regards,

Eino-Ville (Eddy) Talvala
Camera 2.0 Project, Computer Graphics Lab
Stanford University

