On 16/01/2019 18:04, Lionel Landwerlin
wrote:
On
16/01/2019 16:31, Chris Wilson wrote:
Quoting Lionel Landwerlin (2019-01-16
16:25:26)
On 16/01/2019 16:05, Chris Wilson wrote:
Quoting Lionel Landwerlin (2019-01-16
15:58:00)
On 16/01/2019 15:52, Chris Wilson
wrote:
Quoting Lionel Landwerlin
(2019-01-16 15:36:20)
@@ -1877,6 +1883,21 @@ struct
drm_i915_private {
wait_queue_head_t poll_wq;
bool pollin;
+ /**
+ * Atomic counter incremented
by the interrupt
+ * handling code for each OA
half full interrupt
+ * received.
+ */
+ atomic64_t half_full_count;
+
+ /**
+ * Copy of the atomic
half_full_count that was last
+ * processed in the i915-perf
driver. If both counters
+ * differ, there is data
available to read in the OA
+ * buffer.
+ */
+ u64 half_full_count_last;
Eh? But why a relatively expensive atomic64. You only
need one bit, and
reading the tail pointer from WB memory should just be
cheap. You should
be able to sample the perf ringbuffer pointers very
cheaply... What am I
missing?
-Chris
Fair comment.
The thing is with this series there are 2 mechanism that
notify the poll_wq.
One is the hrtimer that kicks in at regular interval and
polls the
register with the workaround.
The other is the interrupt which doesn't read the
registers and workaround.
What's the complication with the workaround?
It's a bit more than just looking at registers, we actually
have to look
at the content of the buffer to figure out what landed in
memory.
The register values are not synchronized with the memory
writes...
I don't want to look at registers at all for polling, and you
shouldn't
need to since communication is via a ringbuf.
There is a comment in the code
(i915_perf_poll_locked) about not
checking the register after each wakeup because that may be a
very hot path.
The atomic64 sounded like a lesser evil.
I'm clearly not understanding something here...
Does the hardware not do:
update ringbuf data;
wmb() (or post to global observation point in their
parlance)
update ringbuf tail
As far as I understand, the OA unit :
sends its memory write requests to the memory controller
immediately updates the ringbuf tail, without waiting for the
previous requests to complete
By experimentation, I've haven't seen a delta between what is
available in memory and what the OA tail register indicate larger
than 768 bytes, which is about 3 OA reports at the largest size.
There is probably a maximum number of write requests the OA unit can
queue before blocking.
So again maybe you would prefer a 2 stage mechanism :
OA interrupt -----> head/tail pointer worker -----> wake up userspace
hrtimer head/tail pointer --|
Then we just need to sample the ringbuf tail and compare against
how far
we read last time?
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
|
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx