Am Donnerstag, den 22.10.2015, 09:12 +0200 schrieb Daniel Vetter: > On Wed, Oct 21, 2015 at 06:04:01PM +0100, Russell King - ARM Linux wrote: > > On Tue, Oct 20, 2015 at 11:36:27AM +0200, Daniel Vetter wrote: > > > On Fri, Sep 11, 2015 at 04:10:10PM +0200, Lucas Stach wrote: > > > > Hey all, > > > > > > > > this is a new posting of the Etnaviv DRM driver for Vivante embedded GPUs. > > > > This time I've squashed all patches to the DRM driver itself into a single commit > > > > to make it easier for people to look at and review this stuff. > > > > > > > > Aside from squashing of some of the trivial bugfixes I intend to keep all the > > > > individual commits around to retain the authorship of people working on this > > > > driver. If you want to look at the stream of commits please fetch > > > > > > > > git://git.pengutronix.de/git/lst/linux.git etnaviv-for-upstream > > > > > > Finally unlazied and looked at this, assuming it's v2. Piles of comments: > > > > > > - Please don't use dev->struct_mutex in new driver code. With latest > > > linux-next there's no reason at all for driver's to use it, and doing so > > > will massively encumber your driver with everything else. > > > > > > There's a bit a trick involved since as-is struct_mutex allows you to do > > > horrible leaky locking designs around gem_free_object. On a quick look > > > to fix this you probably need a mutex for the drm_mm and various lists, > > > plus per object reservations. Tricky part is the eviction logic in > > > etnaviv_iommu_map_gem, where you need to trylock eviction candidates. If > > > you can't lock them you can't evict them anyway, so no harm done. > > > > > > If that's too much just replace it with your own lock and trylock in > > > gem_free_object, punting to a worker if that fails (ttm has a > > > deferred_free list for that, protected by a spinlock). > > > > > > - ->load and ->unload are deprecated hooks becaus fundamentally racy (and > > > we can't fix it since it would break dri 1 crap). Please use instead: > > > > > > drm_dev_alloc(); > > > /* load code */ > > > drm_dev_register(); > > > > > > and > > > > > > drm_dev_unregister(); > > > /* unload code */ > > > drm_dev_unref(); > > > > > > Laurent just sent a nice patch for rcar to do that conversion. That will > > > also get rid of the deprecated drm_platform_init and drm_dev_put. > > > > > > - check pad for 0 in gem_info ioctl. > > > > My tree already does this, and afaik it was part of Christian's patches. > > I'm not sure whether Lucas' patches are missing something. > > > > +static int etnaviv_ioctl_gem_info(struct drm_device *dev, void *data, > > + struct drm_file *file) > > +{ > > + struct drm_etnaviv_gem_info *args = data; > > + struct drm_gem_object *obj; > > + int ret = 0; > > + > > + if (args->pad) > > + return -EINVAL; > > > > Did you miss it? > > That wasn't there, I guess I looked at an outdated tree :() The tree you've looked at was up-to-date until yesterday when Russell pulled patches into his tree and build atop of that. I've just looked at that tree again and it certainly includes the above check. > > > > > - the flags check for gem_new seems leaky since you only check for flags & > > > ETNA_BO_CACHE_MASK. > > > > Fixed in etnaviv_ioctl_gem_new(). > > > > > - similar input validation issue for op in gem_cpu_prep > > > > Fixed in etnaviv_ioctl_gem_cpu_prep(). > > > > > - maybe add a flags/op to future-proof gem_cpu_fini just in case? Might be > > > useful to optimize cache flushing. > > > > Having just merged Lucas' patch, which carries the op from gem_cpu_prep() > > over to gem_cpu_fini(), I'm wondering if this is hiding a potential > > problem - what if two threads call gem_cpu_prep() but with different 'op' > > arguments. Seems rather fragile, as 'etnaviv_obj->last_cpu_prep_op' > > becomes rather indeterminant. > > Hm, dropping last_cpu_prep_op and adding it to the fini ioctl might be an > option. But I have no idea whether the dma api likes that really. Wrt > safety I don't think there's a concern here - if userspace decides to be > buggy wrt coherency tracking it'll get all the pieces. > Yes, this is unsafe right now with multiple threads. But I think we can't just allow multiple userspace threads to prepare a buffer with different OPs, I'm not even sure it's a good idea to let 2 threads prepare the buffer at the same time at all (even if the OPs are compatible, this gets really funny about when to write back the caches). So I think we need to block the second thread until the first one does a fini. This might have some performance implications, but it makes it a lot easier to get the coherency rules straight. > > > - the naming in gem_submit_reloc confuses me a bit, but that's just a > > > bikeshed ;-) > > > > > > - gem_submit seems to miss a flag, ime definitely needed (just to be able > > > to extend to future hw iterations) > > > > Grumble, another API revision I'll need to maintain in the DDX driver. > > (Even though this isn't in mainline, I'm already up to three different > > kernel APIs for etnadrm.) If we do this, I'll want to lump it together > > with other API changes (like the one below for flags) so I'll wait until > > we've got an answer to the gem_wait_fence question. > > > > > - gem_submit->exec_state doesn't seem to be validated (it's just an if > > > (exec_state == 2D) ... else ... in cmd_select_pipe) > > > > Fixed. > > > > > - all the array allocations aren't checked for integer overflows in > > > gem_submit. Just use kmalloc_array or similar to get this right. That > > > means you need to allocations in submit_create, but imo better safe than > > > security-buggy. Similar problem in submit_reloc, but there > > > copy_from_user will protect you since you only copy individual structs. > > > Still a bit fragile. > > > > I'm not sure kmalloc_array() is the right answer there, but I'll look > > into it - I'd really like to avoid doing lots of small kmalloc()s all > > over the place as each one has a non-zero cost. The more we can lump > > together, the better - but it has to be done safely. > > That was just my preference since I have a hard time reasonining about > overflow checks so like to avoid them. > > > > - flags for gem_wait_fence perhaps? Probably not needed ever. > > > > We could, to be on the safe side, add some padding to the struct, and > > not call it "flags" until we need flags. Christian, Lucas, any thoughts > > from the 3D and VG point of view on this? > > Whatever you call it you must check that it's 0, otherwise it'll be > garbage-filled by some userspace and unuseable. > If we already need to check for 0, we might as well call it flags. I need to think about this a bit more if I can conceive any usage. > > > - gem_userptr should probably check for page alignment since that's the > > > only unit you can map into the iommu. At least I didn't spot that check > > > anywhere. > > > > Added (even though it'll probably break Xvideo...) I'll give it a test > > later this afternoon/evening. > > Same issue on the intel side, we just expect userspace to round things to > pages. Same like mmapping a file. > > > > Random reading all around and looks pretty overall. > > > > > > One more question: Do you have validation tests for the basic ioctl > > > interfaces? I'd like to push igt as the general drm gpu tests suite, and > > > we now have support to run testcases on non-i915 drivers. Some are generic > > > and run everywhere (lots more need to be converted to be generic), but I'd > > > also welcome driver-specific tests, maybe in an etnaviv subdirectory. > > > > I don't think we have - my validation is "does it work with my DDX > > driver without rendering errors" which so far has proven to be good > > at finding bugs. This doesn't use libdrm-etnaviv as I need to maintain > > compatibility with libetnaviv's API (the original open source library > > that talks to Vivante's open source kernel space) to avoid having to > > maintain two near-identical GPU backends. > > > > Christian uses his libdrm-etnaviv library plugged into mesa. So we > > have two independently created and maintained code bases talking to > > this kernel interface. > > Ime with igt the value of a testsuite is more in exercising corner cases, > to avoid too much trouble with security holes (do we validate really > everything). And to make it easier to create specific testcases for bugs > you'll find which are hard to reproduce by just running piglit over mesa > or xts&rendercheck over the ddx. And especially for those bugs it's nice > to have some basic tests in place so it won't take you forever (which > means no one will do it) to write the corner-case tests for when you do > hit some obscure bug. We don't have any tests right now as it's already enough work to keep the existing userspace parts in sync with the changing API. I agree that having simple tests for corner cases and discovered issues is a good idea once the API is hammered out. Regards, Lucas -- Pengutronix e.K. | Lucas Stach | Industrial Linux Solutions | http://www.pengutronix.de/ | _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/dri-devel