On Tue, Jun 18, 2019 at 11:54 PM Rodrigo Siqueira <rodrigosiqueiramelo@xxxxxxxxx> wrote: > > On Tue, Jun 18, 2019 at 5:56 AM Daniel Vetter <daniel@xxxxxxxx> wrote: > > > > On Mon, Jun 17, 2019 at 11:49:04PM -0300, Rodrigo Siqueira wrote: > > > On 06/12, Daniel Vetter wrote: > > > > On Wed, Jun 12, 2019 at 10:28:41AM -0300, Rodrigo Siqueira wrote: > > > > > Hi Daniel, > > > > > > > > > > First of all, thank you very much for your patchset. > > > > > > > > > > I tried to make a detailed review of your series, and you can see my > > > > > comments in each patch. You’ll notice that I asked many things related > > > > > to the DRM subsystem with the hope that I could learn a little bit > > > > > more about DRM from your comments. > > > > > > > > > > Before you go through the review, I would like to start a discussion > > > > > about the vkms race conditions. First, I have to admit that I did not > > > > > understand the race conditions that you described before because I > > > > > couldn’t reproduce them. Now, I'm suspecting that I could not > > > > > experience the problem because I'm using QEMU with KVM; with this idea > > > > > in mind, I suppose that we have two scenarios for using vkms in a > > > > > virtual machine: > > > > > > > > > > * Scenario 1: The user has hardware virtualization support; in this > > > > > case, it is a little bit harder to experience race conditions with > > > > > vkms. > > > > > > > > > > * Scenario 2: Without hardware virtualization support, it is much > > > > > easier to experience race conditions. > > > > > > > > > > With these two scenarios in mind, I conducted a bunch of experiments > > > > > for trying to shed light on this issue. I did: > > > > > > > > > > 1. Enabled lockdep > > > > > > > > > > 2. Defined two different environments for testing both using QEMU with > > > > > and without kvm. See below the QEMU hardware options: > > > > > > > > > > * env_kvm: -enable-kvm -daemonize -m 1G -smp cores=2,cpus=2 > > > > > * env_no_kvm: -daemonize -m 2G -smp cores=4,cpus=4 > > > > > > > > > > 3. My test protocol: I) turn on the vm, II) clean /proc/lock_stat, > > > > > III) execute kms_cursor_crc, III) save the lock_stat file, and IV) > > > > > turn off the vm. > > > > > > > > > > 4. From the lockdep_stat, I just highlighted the row related to vkms > > > > > and the columns holdtime-total and holdtime-avg > > > > > > > > > > I would like to highlight that the following data does not have any > > > > > statistical approach, and its intention is solely to assist our > > > > > discussion. See below the summary of the collected data: > > > > > > > > > > Summary of the experiment results: > > > > > > > > > > +----------------+----------------+----------------+ > > > > > | | env_kvm | env_no_kvm | > > > > > + +----------------+----------------+ > > > > > | Test | Before | After | Before | After | > > > > > +----------------+--------+-------+--------+-------+ > > > > > | kms_cursor_crc | S1 | S2 | M1 | M2 | > > > > > +----------------+--------+-------+--------+-------+ > > > > > > > > > > * Before: before apply this patchset > > > > > * After: after apply this patchset > > > > > > > > > > -----------------------------------------+------------------+----------- > > > > > S1: Without this patchset and with kvm | holdtime-total | holdtime-avg > > > > > -----------------------------------------+----------------+------------- > > > > > &(&vkms_out->lock)->rlock: | 21983.52 | 6.21 > > > > > (work_completion)(&vkms_state->crc_wo: | 20.47 | 20.47 > > > > > (wq_completion)vkms_crc_workq: | 3999507.87 | 3352.48 > > > > > &(&vkms_out->state_lock)->rlock: | 378.47 | 0.30 > > > > > (work_completion)(&vkms_state->crc_#2: | 3999066.30 | 2848.34 > > > > > -----------------------------------------+----------------+------------- > > > > > S2: With this patchset and with kvm | | > > > > > -----------------------------------------+----------------+------------- > > > > > &(&vkms_out->lock)->rlock: | 23262.83 | 6.34 > > > > > (work_completion)(&vkms_state->crc_wo: | 8.98 | 8.98 > > > > > &(&vkms_out->crc_lock)->rlock: | 307.28 | 0.32 > > > > > (wq_completion)vkms_crc_workq: | 6567727.05 | 12345.35 > > > > > (work_completion)(&vkms_state->crc_#2: | 6567135.15 | 4488.81 > > > > > -----------------------------------------+----------------+------------- > > > > > M1: Without this patchset and without kvm| | > > > > > -----------------------------------------+----------------+------------- > > > > > &(&vkms_out->state_lock)->rlock: | 4994.72 | 1.61 > > > > > &(&vkms_out->lock)->rlock: | 247190.04 | 39.39 > > > > > (work_completion)(&vkms_state->crc_wo: | 31.32 | 31.32 > > > > > (wq_completion)vkms_crc_workq: | 20991073.78 | 13525.18 > > > > > (work_completion)(&vkms_state->crc_#2: | 20988347.18 | 11904.90 > > > > > -----------------------------------------+----------------+------------- > > > > > M2: With this patchset and without kvm | | > > > > > -----------------------------------------+----------------+------------- > > > > > (wq_completion)vkms_crc_workq: | 42819161.68 | 36597.57 > > > > > &(&vkms_out->lock)->rlock: | 251257.06 | 35.80 > > > > > (work_completion)(&vkms_state->crc_wo: | 69.37 | 69.37 > > > > > &(&vkms_out->crc_lock)->rlock: | 3620.92 | 1.54 > > > > > (work_completion)(&vkms_state->crc_#2: | 42803419.59 | 24306.31 > > > > > > > > > > First, I analyzed the scenarios with KVM (S1 and S2); more > > > > > specifically, I focused on these two classes: > > > > > > > > > > 1. (work_completion)(&vkms_state->crc_wo > > > > > 2. (work_completion)(&vkms_state->crc_#2 > > > > > > > > > > After taking a look at the data, it looks like that this patchset > > > > > greatly reduces the hold time average for crc_wo. On the other hand, > > > > > it increases the hold time average for crc_#2. I didn’t understand > > > > > well the reason for the difference. Could you help me here? > > > > > > > > So there's two real locks here from our code, the ones you can see as > > > > spinlocks: > > > > > > > > &(&vkms_out->state_lock)->rlock: | 4994.72 | 1.61 > > > > &(&vkms_out->lock)->rlock: | 247190.04 | 39.39 > > > > > > > > All the others are fake locks that the workqueue adds, which only exist in > > > > lockdep. They are used to catch special kinds of deadlocks like the below: > > > > > > > > thread A: > > > > 1. mutex_lock(mutex_A) > > > > 2. flush_work(work_A) > > > > > > > > thread B > > > > 1. running work_A: > > > > 2. mutex_lock(mutex_A) > > > > > > > > thread B can't continue becuase mutex_A is already held by thread A. > > > > thread A can't continue because thread B is blocked and the work never > > > > finishes. > > > > -> deadlock > > > > > > > > I haven't checked which is which, but essentially what you measure with > > > > the hold times of these fake locks is how long a work execution takes on > > > > average. > > > > > > > > Since my patches are supposed to fix races where the worker can't keep up > > > > with the vblank hrtimer, then the average worker will (probably) do more, > > > > so that going up is expected. I think. > > > > > > > > I'm honestly not sure what's going on here, never looking into this in > > > > detail. > > > > > > > > > When I looked for the second set of scenarios (M1 and M2, both without > > > > > KVM), the results look much more distant; basically, this patchset > > > > > increased the hold time average. Again, could you help me understand a > > > > > little bit better this issue? > > > > > > > > > > Finally, after these tests, I have some questions: > > > > > > > > > > 1. VKMS is a software-only driver; because of this, how about defining > > > > > a minimal system resource for using it? > > > > > > > > No idea, in reality it's probably "if it fails too often, buy faster cpu". > > > > I do think we should make the code robust against a slow cpu, since atm > > > > that's needed even on pretty fast machines (because our blending code is > > > > really, really slow and inefficient). > > > > > > > > > 2. During my experiments, I noticed that running tests with a VM that > > > > > uses KVM had consistent results. For example, kms_flip never fails > > > > > with QEMU+KVM; however, without KVM, two or three tests failed (one of > > > > > them looks random). If we use vkms for test DRM stuff, should we > > > > > recommend the use of KVM? > > > > > > > > What do you mean without kvm? In general running without kvm shouldn't be > > > > slower, so I'm a bit confused ... I'm running vgem directly on the > > > > machine, by booting into new kernels (and controlling the machine over the > > > > network). > > > > > > Sorry, I should have detailed my point. > > > > > > Basically, I do all my testing with vkms in QEMU VM. In that sense, I > > > did some experiments which I enabled and disabled the KVM (i.e., flag > > > '-enable-kvm') to check the vkms behaviour in these two scenarios. I > > > noticed that the tests are consistent when I use the '-enable-kvm' flag, > > > in that context it seemed a good idea to recommend the use of KVM for > > > those users who want to test vkms with igt. Anyway, don't worry about > > > that I'll try to add more documentation for vkms in the future and in > > > that time we discuss about this again. > > > > Ah, qemu without kvm is going to use software emulation for a lot of the > > kernel stuff. That's going to be terribly slow indeed. > > > > > Anyway, from my side, I think we should merge this series. Do you want > > > to prepare a V2 with the fixes and maybe update the commit messages by > > > using some of your explanations? Or, if you want, I can fix the tiny > > > details and merge it. > > > > I'm deeply burried in my prime cleanup/doc series right now, so will take > > a few days until I get around to this again. If you want, please go ahead > > with merging. > > > > btw if you edit a patch when merging, please add a comment about that to > > the commit message. And ime it's best to only augment the commit message > > and maybe delete an unused variable or so that got forgotten. For > > everything more better to do the edits and resubmit. > > First of all, thank you very much for all your reviews and > explanation. I’ll try the exercise that you proposed on Patch 1. > > I’ll merge patch [4] and [7] since they’re not related to these > series. For the other patches, I’ll merge them after I finish the new > version of writeback series. I’ll ping you later. > > 4. https://patchwork.freedesktop.org/patch/309031/?series=61737&rev=1 > 7. https://patchwork.freedesktop.org/patch/309029/?series=61737&rev=1$ Can you merge them quicker? I have another 3 vkms patches here touching that area with some follow-up stuff ... > Finally, not related with this patchset, can I apply the patch > “drm/drm_vblank: Change EINVAL by the correct errno” [1] or do I need > more SoB? I’ll also apply Oleg patch (drm/vkms: add crc sources list). > > 1. https://patchwork.freedesktop.org/patch/310006/?series=50697&rev=4 If you want get some acks from igt maintainers (those patches landed now, right), but this is good enough. -Daniel > Thanks > > > Thanks, Daniel > > > > > > > > > -Daniel > > > > > > > > > Best Regards > > > > > > > > > > On Thu, Jun 6, 2019 at 7:28 PM Daniel Vetter <daniel.vetter@xxxxxxxx> wrote: > > > > > > > > > > > > Hi all, > > > > > > > > > > > > This here is the first part of a rework for the vkms crc worker. I think > > > > > > this should fix all the locking/races/use-after-free issues I spotted in > > > > > > the code. There's more work we can do in the future as a follow-up: > > > > > > > > > > > > - directly access vkms_plane_state->base in the crc worker, with this > > > > > > approach in this series here that should be safe now. Follow-up patches > > > > > > could switch and remove all the separate crc_data infrastructure. > > > > > > > > > > > > - I think some kerneldoc for vkms structures would be nice. Documentation > > > > > > the various functions is probably overkill. > > > > > > > > > > > > - Implementing a more generic blending engine, as prep for adding more > > > > > > pixel formats, more planes, and more features in general. > > > > > > > > > > > > Tested with kms_pipe_crc, kms_flip and kms_cursor_crc. Seems to not make > > > > > > things worse, but I didn't do a full run. > > > > > > > > > > > > Cheers, Daniel > > > > > > > > > > > > Daniel Vetter (10): > > > > > > drm/vkms: Fix crc worker races > > > > > > drm/vkms: Use spin_lock_irq in process context > > > > > > drm/vkms: Rename vkms_output.state_lock to crc_lock > > > > > > drm/vkms: Move format arrays to vkms_plane.c > > > > > > drm/vkms: Add our own commit_tail > > > > > > drm/vkms: flush crc workers earlier in commit flow > > > > > > drm/vkms: Dont flush crc worker when we change crc status > > > > > > drm/vkms: No _irqsave within spin_lock_irq needed > > > > > > drm/vkms: totally reworked crc data tracking > > > > > > drm/vkms: No need for ->pages_lock in crc work anymore > > > > > > > > > > > > drivers/gpu/drm/vkms/vkms_crc.c | 74 +++++++++------------------- > > > > > > drivers/gpu/drm/vkms/vkms_crtc.c | 80 ++++++++++++++++++++++++++----- > > > > > > drivers/gpu/drm/vkms/vkms_drv.c | 35 ++++++++++++++ > > > > > > drivers/gpu/drm/vkms/vkms_drv.h | 24 +++++----- > > > > > > drivers/gpu/drm/vkms/vkms_plane.c | 8 ++++ > > > > > > 5 files changed, 146 insertions(+), 75 deletions(-) > > > > > > > > > > > > -- > > > > > > 2.20.1 > > > > > > > > > > > > _______________________________________________ > > > > > > dri-devel mailing list > > > > > > dri-devel@xxxxxxxxxxxxxxxxxxxxx > > > > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > Rodrigo Siqueira > > > > > https://siqueira.tech > > > > > > > > -- > > > > Daniel Vetter > > > > Software Engineer, Intel Corporation > > > > http://blog.ffwll.ch > > > > > > -- > > > Rodrigo Siqueira > > > https://siqueira.tech > > > > -- > > Daniel Vetter > > Software Engineer, Intel Corporation > > http://blog.ffwll.ch > > > > -- > > Rodrigo Siqueira > https://siqueira.tech -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/dri-devel