On Wed, Sep 14, 2016 at 11:27 AM, Byungchul Park <max.byungchul.park@xxxxxxxxx> wrote: > On Wed, Sep 14, 2016 at 4:38 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote: >> On Wed, Sep 14, 2016 at 02:12:51AM +0900, Byungchul Park wrote: >>> On Wed, Sep 14, 2016 at 12:05 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote: >>> > >>> > >>> > So the idea is to add support for non-owner serialization primitives, >>> > like completions/semaphores, and so far I've not looked at the code yet. >>> > I did spend 2+ hours trying to decipher your documentation thing, but am >>> > still confused, that thing is exceedingly hard to parse/read. >>> > >>> > So the typical scenario would be something like: >>> > >>> > L a L a >>> > U a >>> > W b C b >>> > >>> > where L := lock, U := unlock, W := wait_for_completion and C := >>> > complete. >>> > >>> > On the left, the blocking thread we can easily observe the 'b depends on >>> > a' relation, since we block while holding a. >>> >>> I think 'a depends on b' relation. >>> >>> Why does b depend on a? b depends on a's what? >> >> b blocks while holding a. In any case, for the graph it doesn't matter, >> its not a directed graph as such, all we care about is acyclic. > > It's a directed graph.The meaning of backward traverse is different > from the meaning of forward traverse, isn't it? > >>> > On the right however that same relation is hard to see, since by the >>> >>> Yes, there's no dependency on the right side of your example. >> >> Well, there is, its just not trivially observable. We must be able to >> acquire a in order to complete b, therefore there is a dependency. > > No. We cannot say there is a dependency unconditionally. There can > be a dependency or not. > > L a L a > U a > ~~~~~~~~~ what if serialized by something? > W b C b > > If something we don't recognize serializes locks, which ensures > 'W b' happens after 'L a , U a' in the other context, then there's > no dependency here. > > We should say 'b depends on a' in only case that the sequence > 'W b and then L a and then C b, where last two ops are in same > context' _actually_ happened at least once. Otherwise, it might > add a false dependency. > > It's same as how original lockdep works with typical locks. It adds > a dependency only when a lock is actually hit. > >>> > time we would run complete, a has already been released. >>> >>> I will change your example a little bit. >>> >>> W b >>> ~~~~~~~ <- serialized >>> L a >>> U a >>> C b >>> >>> (Remind crossrelease considers dependencies in only case that >>> lock sequence serialized is observable.) >> >> What does that mean? Any why? This is a random point in time without >> actual meaning. > > It's not random point. We have to consider meaningful sequences among > those which are globally observable. That's why we need to serialize > those locks. For example, > > W b > L a > U a > C b > > Once this sequence is observable globally, we can say 'It's possible to > run in this sequence. Is this sequence problematic or not?'. > > L a > U a > W b > C b > > If only this sequence can be observable, we should not assume > this sequence can be changed. However once the former sequence > happens, it has a possibility to hit the same sequence again later. > So we can check deadlock possibility with the sequence, > > _not randomly_. > >>> There's a dependency in this case, like 'b depends on a' relation. >>> 'C b' may not be hit, depending on the result of 'L a', that is, >>> acquired or not. >> >> With or without a random reference point, the dependency is there. > > The dependency might not be there. > >>> > I _think_ you propose to keep track of all prior held locks and then use >>> > the union of the held list on the block-chain with the prior held list >>> > from the complete context. >>> >>> Almost right. Only thing we need to do to consider the union is to >>> connect two chains of two contexts by adding one dependency 'b -> a'. >> >> Sure, but how do you arrive at which connection to make. The document is >> entirely silent on this crucial point. > > We need to connect between the crosslock and the first lock among > locks having been acquired since the crosslock was held. Others will be > connected each other by original lockdep. > > By the way, does my document miss this description? If so, sorry. > I will check and update it. > >> The union between the held-locks of the blocked and prev-held-locks of >> the release should give a fair indication I think, but then, I've not >> thought too hard on this yet. > > I make the union only If actually the lock sequence happened once so > that it's globally visible. > >>> > The document describes you have a prior held list, and that its a >>> > circular thing, but it then completely fails to specify what gets added >>> > and how its used. >>> >>> Why does it fail? It keeps them in the form of hlock. This data is enough >>> to generate dependencies. >> >> It fails to explain. It just barely mentions you keep them, it doesn't >> mention how they're used or why. > > Okay. I need to explain more about that. > >>> > Also, by it being a circular list (of indeterminate, but finite size), >>> > there is the possibility of missing dependencies if they got spooled out >>> > by recent activity. >>> >>> Yes, right. They might be missed. It means just missing some chances to >>> check a deadlock. It's all. Better than do nothing. >> >> Sure, but you didn't specify. Again, the document is very ambiguous and >> ill specified. > > Yes, I need to supplement the latter half as you said. I will do it. > >> >>> > The document has an example in the section 'How cross release works' >>> > (the 3rd) that simply cannot be. It lists lock action _after_ acquire, >>> >>> Could you tell me more? Why cannot it be? >> >> Once you block you cannot take more locks, that simply doesnt make >> sense. > > No. Peter.. lockdep check possibilities. All lockdep operations are > performed assuming this same sequence can happen again later but in > a little bit different order between more than two contexts so that actual > deadlock can happen then. I can see what you are talking about. you are only talking about wait_for_completion. wait_for_completion is a case more complicated than lock_page. Could we talk about lock_page first if you don't mind? And then move to wait_for_complete next? Anyway even in wait_complete, It's possible like, (even though it's not much meaningful in case of wait_for_completion) (Triger complete context to start) acquire A wait_for_completion A acquire another (e.g. wait.lock) release it ... (actually sleep) Not much meaningful in wait_for_completion, though it's not wrong. And this is much meaningful to another crosslock like lock_page. The target I'm mainly focusing is page lock. > No doubt, it cannot take more locks while blocked at the time. But it's > not important for lockdep to work. > >>> > but you place the acquire in wait_for_completion. We _block_ there. >>> > >>> > Is this the general idea? >>> > >>> > If so, I cannot see how something like: >>> > >>> > W a W b >>> > C b C a >>> >>> I didn't tell it works in this case. But it can be a future work. I'm not >>> sure but I don't think making it work is impossible at all. But anyway >>> current implementation cannot deal with this case. >> >> +4. 'crosslock a -> crosslock b' dependency >> + >> + Creating this kind of dependency directly is unnecessary since it can >> + be covered by other kinds of dependencies. >> >> Says something different, doesn't it? > > Really sorry. I made you much confused. I will update it. > >>> > On the whole 'release context' thing you want to cast lockdep in, please >>> > consider something like: >>> > >>> > L a L b >>> > L b L a >>> > U a U a >>> > U b U b >>> > >>> > Note that the release order of locks is immaterial and can generate >>> > 'wrong' dependencies. Note how on both sides a is released under b, >>> >>> Who said the release order is important? Wrong for what? >> >> Well, again, you mention 'release context' without actually specifying >> what you mean with that. >> >> L a >> L b >> U b >> >> and >> >> L a >> L b >> U a >> >> Here the unlocks obviously have different context. Without further >> specification its simply impossible to tell what you mean. > > I mentioned what the release context means in the document. Or > I don't understand what you are asking now. > > -- > Thanks, > Byungchul -- Thanks, Byungchul -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>