function smp_read_barrier_depends() confuses me

Yubin Ruan <ablacktshirt@xxxxxxxxx> · Mon, 13 Feb 2017 21:11:26 +0800

I have just finished Appendix B of perfbook (2017.01.02a), but the 
function smp_read_barrier_depends(), and how it makes the code below 
correct, really confuses me.

In B.7, paragraph 2, it says:

(1)  "Yes, this does mean that Alpha can in effect fetch the data 
pointed to before it fetches the pointer itself,..."

and after presenting the code example:

1  struct el *insert(long key, long data)
2  {
3     struct el *p;
4     p = kmalloc(sizeof(*p), GFP_ATOMIC);
5     spin_lock(&mutex);
6     p->next = head.next;
7     p->key = key;
8     p->data = data;
9     smp_wmb();
10    head.next = p;
11    spin_unlock(&mutex);
12 }
13
14 struct el *search(long key)
15 {
16    struct el *p;
17    p = head.next;
18    while (p != &head) {
19        /* BUG ON ALPHA!!! */
20        if (p->key == key) {
21            return (p);
22        }
23        p = p->next;
24    };
25    return (NULL);
26 }

it says:

(2) "On Alpha, the smp_wmb() will guarantee that the cache invalidates 
performed by lines 6-8 will reach the interconnect before that of line 
10 does, but makes absolutely no guarantee about the order in which the 
new values will reach the reading CPU's core"

My question is: how exactly does this code break on Alpha, and how does 
smp_read_barrier_depends() help make it correct, as follows:

18    while (p != &head) {
19        smp_read_barrier_depends();
20        if (p->key == key) {
21            return (p);

According to (2), I guess that the code breaks because the "new values" 
arrive at the reading CPU out of order, even though the "cache 
invalidation messages" arrive in order. That is, in Figure B.10, even 
though the reading CPU core gets the invalidation messages for
   p->next
   p->key
   p->data
before the invalidation message for `head.next', it might not get the 
new values of
   p->next
   p->key
   p->data
before the new value of `head.next', which breaks the code. Is that 
correct? None of those paragraphs refers to an exact line of code, so I 
am really confused.

And, if that is correct, can I infer that all CPUs other than Alpha 
guarantee that the "new values" and the "cache invalidation messages" 
arrive at the reading CPU in order, given proper memory barriers such 
as the one at line 9?
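
For comparison, the same publish/search pattern can be sketched in 
userspace C11. This is only an analogy and not the kernel API: the 
release store stands in for smp_wmb() plus the store at line 10, and 
the memory_order_consume loads stand in for the plain loads followed by 
smp_read_barrier_depends(). Current compilers generally treat consume 
as acquire. The sketch assumes writers are serialized by the caller, so 
the spinlock is left out, and unlike the book's listing insert() 
returns the new element.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

struct el {
    struct el *_Atomic next;
    long key;
    long data;
};

/* Circular list with a dummy header, as in the book's example. */
static struct el head = { .next = &head };

/* Assumption: callers serialize writers (the book uses a spinlock). */
struct el *insert(long key, long data)
{
    struct el *p = malloc(sizeof(*p));

    if (p == NULL)
        return NULL;
    /* Initialize the element; nothing can see it yet (lines 6-8). */
    atomic_store_explicit(&p->next,
                          atomic_load_explicit(&head.next,
                                               memory_order_relaxed),
                          memory_order_relaxed);
    p->key = key;
    p->data = data;
    /* Publish: orders the initialization before the pointer store,
     * the role played by smp_wmb() and line 10. */
    atomic_store_explicit(&head.next, p, memory_order_release);
    return p;
}

struct el *search(long key)
{
    /* Dependency-ordered load: later accesses through p must see the
     * initialization that preceded the publishing store. */
    struct el *p = atomic_load_explicit(&head.next, memory_order_consume);

    while (p != &head) {
        if (p->key == key)
            return p;
        p = atomic_load_explicit(&p->next, memory_order_consume);
    }
    return NULL;
}
```

On a single thread, insert(1, 10) followed by search(1) returns the 
element just inserted; the ordering annotations only matter once a 
second thread runs search() concurrently with insert().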

Frankly, I find some of the narrative in Appendix B pretty confusing 
(no offense):

1. In paragraph 4 on page 350 of the two-column perfbook.2017.01.02a, it 
says:

    "Figure B.10 shows how ... Assume that the list header `head' will 
be processed by cache bank 0, and that the new element will be processed 
by cache bank 1 ... For example, it is possible that reading CPU's cache 
bank 1 is very busy, but cache bank 0 is idle..."

  As there is a bank 0 and a bank 1 in both the writing CPU and the 
reading CPU, it is hard to tell which cache bank 0 is processing the 
header `head' and which cache bank 1 is processing the new element, and 
as a result I don't know how that reordering happens.

2. In Figure B.10, both CPUs have an "(r)mb Sequencing" box and a "(w)mb 
Sequencing" box, but not all of these seem necessary. So, what do those 
sequencing boxes mean?

I have read the mail at
    http://h41379.www4.hpe.com/wizard/wiz_2637.html
but cannot find anything directly related to this odd feature of Alpha. 
Can anyone provide a hint (for example, which paragraph to look at)?

regards,
Yubin Ruan