On Fri, Feb 09, 2018 at 06:09:09PM +0200, Oleksandr Andrushchenko wrote: > On 02/09/2018 05:53 PM, Ville Syrjälä wrote: > > On Thu, Feb 08, 2018 at 11:53:30AM +0200, Oleksandr Andrushchenko wrote: > >> Hello, Ville! > >> > >> On 02/06/2018 06:04 PM, Ville Syrjälä wrote: > >>> On Tue, Feb 06, 2018 at 11:59:37AM +0200, Oleksandr Andrushchenko wrote: > >>>> Hello, Ville! > >>>> > >>>> Thank you very much for such a comprehensive answer. > >>>> > >>>> Please see inline > >>>> > >>>> > >>>> On 02/05/2018 06:47 PM, Ville Syrjälä wrote: > >>>>> On Mon, Feb 05, 2018 at 06:03:25PM +0200, Oleksandr Andrushchenko wrote: > >>>>>> Hello, > >>>>>> > >>>>>> > >>>>>> I have a DRM driver which implements display protocol for Xen [1] > >>>>>> and this protocol has a dedicated XENDISPL_OP_PG_FLIP request which > >>>>>> tells the other end that some display buffer needs to be displayed, > >>>>>> e.g. it is issued by the frontend DRM driver to the corresponding > >>>>>> backend when the former wants to display a buffer. > >>>>>> Two notes: > >>>>>> 1. Communication between two remote parties can obviously fail. > >>>>>> 2. At the moment only primary plane is supported, so we can think of > >>>>>> the display buffer as of CRTC's primary fb. > >>>>>> > >>>>>> There are two APIs available for user-space to initiate a page-flip > >>>>>> (please correct me if I am wrong here): legacy via DRM_IOCTL_MODE_PAGE_FLIP > >>>>>> and more recent via DRM_IOCTL_MODE_ATOMIC. My current implementation > >>>>>> (which is 4.9 based) uses drm_crtc_funcs.page_flip callback to send > >>>>>> XENDISPL_OP_PG_FLIP request to the backend, so if communication fails > >>>>>> I can return an error code, so the DRM core knows that page flip > >>>>>> cannot be done. > >>>>>> > >>>>>> But now I am about to enable page flipping via DRM_IOCTL_MODE_ATOMIC for > >>>>>> which this happens without DRM_IOCTL_MODE_PAGE_FLIP, but via .atomic_check + > >>>>>> .atomic_flush callbacks. > >>>>>> > >>>>>> The solution I have in mind is that I move the XENDISPL_OP_PG_FLIP request > >>>>>> to the .atomic_flush callback which seems to be the right place to serve > >>>>>> both DRM_IOCTL_MODE_PAGE_FLIP and DRM_IOCTL_MODE_ATOMIC (is this?). > >>>>>> > >>>>>> The questions I have with this are: > >>>>>> > >>>>>> 1. What is the framebuffer I can send to the backend? > >>>>>> I assume I can use crtc->primary->fb from > >>>>>> drm_crtc_helper_funcs.atomic_flush, > >>>>>> is this assumption correct? > >>>>> Planes are explicit in the atomic world, so you should just deal with > >>>>> the plane if and when it's part of the drm_atomic_state. Look at eg. > >>>>> drm_atomic_helper_commit_planes() how to figure out what's in the > >>>>> state. > >>>> Yes, this is clear. The question was about the frame buffer > >>>> I get in drm_crtc_helper_funcs.atomic_flush which only has > >>>> old_crtc_state, so I assumed I can use crtc->primary->fb there. > >>>> Or you mean I have to dig into old_crtc_state->state to find > >>>> out if there is a plane and if it has a frambuffer and if so > >>>> use it instead of crtc->primary->fb? > >>> You will get the plane explicitly as part of the drm_atomic_state. > >>> Normally you shouldn't need to go find it via other means. > >>> > >>> Oh, and never use the plane->fb etc. pointers in an atomic driver. > >>> Those are for legacy purposes only. Atomic drivers should only ever > >>> deal with proper state objects. > >> I am using fb from corresponding state now, thank you > >>>>> And I guess you're already planning on using that helper since > >>>>> you mention .atomic_flush(). > >>>> Not directly, but via drm_mode_config_funcs.atomic_commit > >>> .atomic_commit() is a hook the driver has to provide. Most drivers use > >>> the helper for it, which in turn requires the driver to provide other > >>> hooks such as .atomic_flush(). That is what you appear to be doing. > >> you are correct > >>>>> One should really think of the drm_atomic_state as more of a transaction > >>>>> rather than as a state (since not all objects need be part of it). > >>>> Thank you > >>>>>> 2. Is the check (either to send or not) for crtc->primary->fb != NULL > >>>>>> enough? > >>>>>> I assume there are other cases when .atomic_flush gets called, > >>>>>> so is there a way I can filter out unneeded cases? E.g. those which > >>>>>> are not for page flipping? > >>>>> Each object taking part in the transaction will have an associated > >>>>> new state and old state. > >>>> As I replied above I only have old state in .atomic_flush > >>> Oh right. That way of passing the old state only dates back to earlier > >>> days of atomic. To find the new state what you should do these days > >>> is something like: > >>> > >>> foo(struct drm_crtc *crtc, struct drm_crtc_state *old_state) > >>> { > >>> struct drm_atomic_state *state = old_state->state; > >>> struct drm_crtc_state *new_state = > >>> drm_atomic_get_new_crtc_state(state, crtc); > >>> ... > >>> > >>> The old way was to use the crtc->state pointer as either the old > >>> or new state depending on where you are in the commit sequence. > >>> But that wasn't very good so Maarten changed things so that we > >>> now have explicit old and new states for each object tracked in > >>> the drm_atomic_state. > >>> > >>> I think what we should really do is change most of the hooks to > >>> pass crtc+drm_atomic_state instead, and let each function grab > >>> the old and/or new crtc state from the drm_atomic_state as needed. > >>> I believe that would avoid confusing people with just the old or > >>> new state. > >> That would be great > >>>>> You can compare the two to figure out what > >>>>> changed, and in addition there may already some flags in the base > >>>>> class for the state that indicate what may have changed (eg. > >>>>> drm_crtc_state::mode_changed etc.). Such flags may be set by the > >>>>> core to help figure out what needs doing. > >>>> Yes, thank you > >>>>>> 3. If communication with the backend fails there is no way for me > >>>>>> to tell the DRM core about this as .atomic_flush is called after > >>>>>> "the point of no return". Do you think I can remember the error > >>>>>> code and report it next time .atomic_check is called? So, other page > >>>>>> flips will not run on broken connection, for example. > >>>>> Do you know when it might fail? > >>>> Not really, this is a software protocol to talk from > >>>> the frontend para-virtual DRM driver to its backend > >>>> in other Xen domain > >>> I see. Well, without evidence to the contrary I'd probably just assume > >>> it'll never fail. > >> I am not sure I can guarantee this > >>> That avoids complicating the code with potentially > >>> useless logic. > >> Unfortunately, I'll have to put something in > >>> Hammering it with something like igt for a while might > >>> serve as a good way to test that assumption. Not sure how many tests in > >>> igt currently run on non-i915 though. > >> The first test suite I tried was IGT indeed, but unfortunately > >> it is coupled with i915 too much, so only very basic tests > >> could be run. Then I found [1] which can be easily extended > >> without much efforts, so I stick to fork of it (hope to contribute > >> there later, for example, by adding ION tests) > >>>>> If so you should implement the appropriate > >>>>> checks in .atomic_check() etc. to try and make sure it never happens > >>>>> (barring a total hardware failure for example). > >>>>> > >>>>> Generally what we (i915) do is try to check everything up front, but if > >>>>> an unexpected error does happen later we just muddle through and log an > >>>>> error. > >>>>> > >>>>> For us I think the most common late failure is DP link training failure. > >>>>> That we can't fully check up front since it depends on external factors: > >>>>> sink, dongles, cabling, etc. For those we added a new connector property > >>>>> to signal to userspace that the link is having issues, allowing > >>>>> userspace to reconfigure things such that we lower the link bandwidth > >>>>> requirements. > >>>> I cannot do that this way, because the driver has to run > >>>> seamlessly for user-space, no specific utilities are > >>>> expected to run. > >>> There must be something running or you never get anything on the screen. > >>> > >> Yes, sorry for not being precise: I meant that nothing > >> driver specific, e.g. which knows internals of the driver. > >>>>> The link_status mechanism could perhaps be used to to work around other > >>>>> "late errors". But in general if you want to somehow fix that error you > >>>>> have to know what caused it, no? > >>>> That is correct, so I will print an error message on > >>>> page flip error so user has at list a clue on what's > >>>> wrong > >>>>> So if you just get a random error for > >>>>> seemingly no reason there's very little you can do apart from blindly > >>>>> retrying and logging an error. For the DP case the fallback mechanism is > >>>>> pretty clear: reduce link rate and/or number of lanes. > >>>>> > >>>>> As for signalling the error in the next ioctl call, that could be done > >>>>> I suppose. > >>>> I tried that approach and it seems to work. > >>>>> But again the question is how would userspace know what > >>>>> (if anything) it can do to fix the problem? > >>>> Well, this would be seen as just an error to user-space > >>>> and unfortunately if it is not prepared to deal with then it will > >>>> close. Not sure I can do anything smart about it > >>>>>> 4. As per my understanding I cannot put XENDISPL_OP_PG_FLIP request > >>>>>> into .atomic_check as the check doesn't guarantee that .atomic_flush > >>>>>> will follow. Is this correct or there is a more neat solution exists? > >>>>> Thou shalt not touch the hardware until .atomic_commit(). Note that > >>>>> .atomic_commit() itself is still allowed to fail, but only if you > >>>>> can fail atomically. > >>>>> > >>>>> The .atomic_flush() & co. helper stuff is designed in a way that > >>>>> expects no failures at that late stage of the commit. If that > >>>>> doesn't suit your hardware design then you may opt to not use the > >>>>> helper. > >>>>> > >>>>> Also note that non-blocking operations can make this even more difficult > >>>>> because in that case even .atomic_commit() doesn't generally touch > >>>>> the hardware and instead that task is delegated to eg. a work queue. > >>>>> So by the time the hardware indicates an error the ioctl has long since > >>>>> returned to userspace. > >>>>> > >>>> This is my case as I send a page flip request to the backend > >>>> and later in time receive the corresponding response. > >>>> > >>>> Do you mind looking at my current WIP implementation of > >>>> atomic commit [1], [2]? This work is done as a preparation for > >>>> upstreaming the driver and is still in progress though. > >>> Usually one would handle page flips from the plane's .atomic_update() > >>> hook instead of the crtc's .atomic_flush(). But if you only have the one > >>> plane then this could work. But you should get at the plane fb via > >>> drm_atomic_get_new_plane_state() etc. > >>> > >>> Generally looks like you need some more work on the plane .atomic_check() > >>> function. Using drm_atomic_helper_check_plane_state() should get you > >>> most of the way there I'd imagine. Not sure if disabling the plane > >>> independently of the crtc makes any sense, but if not you should > >>> look at drm_simple_kms_helper.c for an example. Hmm. Not sure if you > >>> couldn't just use drm_simple_kms_helper.c yourself actually. > >> Great! As I am moving away from legacy stuff this is a > >> perfect fit now. So, I will base on drm_simple_kms_helper. > >> > >> BTW, it has a small issue which can be easily solved in the > >> DRM core I believe: if your driver needs to work with vblanks, > >> then you have to provide drm_driver.enable_vblank/disable_vblank > >> because drm_simple_kms_helper.drm_crtc_funcs do not provide any. > >> The problem here is that drm_driver.enable_vblank/disable_vblank > >> are most probably dummy functions and what is more they are marked > >> as deprecated. > > Hmm. Oh, I think someone just wanted people to move to using the > > corresponding hooks in the crtc funcs. > My understanding is that that was the intention > > I'm not sure how people do things > > when they don't have vblank interrupts, but I think in general if you > > don't initialize vblank support those hooks shouldn't get called. > That is correct, but if the driver is drm_simple_kms_helper > based and wants vblanks, then you'll have to do something > like [1], [2]. > At the same time [3], [4] say enable_vblank/disable_vblank > are deprecated in struct drm_driver. > So, I think that drm_simple_kms_helper and its drm_crtc_funcs > should provide corresponding callbacks in order to avoid using > deprecated functionality. I see. Well, I'm sure people are willing to entertain patches ;) > [1] > http://elixir.bootlin.com/linux/v4.15.2/source/drivers/gpu/drm/pl111/pl111_drv.c#L181 > [2] > http://elixir.bootlin.com/linux/v4.15.2/source/drivers/gpu/drm/mxsfb/mxsfb_drv.c#L336 > [3] > http://elixir.bootlin.com/linux/v4.15.2/source/include/drm/drm_drv.h#L206 > [4] > http://elixir.bootlin.com/linux/v4.15.2/source/include/drm/drm_drv.h#L222 -- Ville Syrjälä Intel OTC _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/dri-devel