On Thu, Feb 23, 2017 at 08:39:17PM +0800, Christopher Li wrote: > On Fri, Feb 17, 2017 at 7:12 AM, Luc Van Oostenryck > <luc.vanoostenryck@xxxxxxxxx> wrote: > > During CSE equivalent instructions should hash to the same value > > but we should also try to *not* hash to the same value instructions > > that cannot be equivalent. > > For commutative ops this means that the hash function should itself > > be commutative/symmetrical regarding the exchange of its operands. > > This is already the case but the current hash function is symmetrical > > for all binops, not only the commutative ones. Thus expressions like > > 'a - b' and 'b - a' hash to the same value while it should be the case > > only when 'a == b'. > > > > Fix this by changing the hashing of non-commutative binops so that it > > is anti-symmetrical regarding the exchange of operands while keeping > > commutative ones symmetrical. > > > > This change have no functional effects (in the sense that it shoudl > > CSE exactly the same instructiosn as before), it should only improve > > the efficiency of the hashing+comparing. > > > > Note: on the 5000+ test set I'm using, I can't see any significant > > speedup which is quite normal since most of the functions therein > > have (much) less instructions than the size of the hash table. > > The effect of this patch should only be on much bigger functions. > > I apply and push this series to sparse-next. I don't see them yet, but it's not important. > Just curious if there is test case can show the performance increase. I don't have such test case. To see any effect you would need a function with a number of instructions quite large relatively to the size of the hash table. It's even worse than that, it's the number of pair of dual instructions (like 'a - b' / 'b - a') that should be large (and to see a significant relative difference, those pairs should also be a significant proportion of the total numbers of instructions). If needed I could try to build some artificial cases to determine when a difference appears but ... (see below). > As far as I can tell, this patch has down side as well, it hash the non > commutative operation *twice*. Sure it has some down fall but I it's not true that they are now hashed twice. The number of operations is exactly the same as for the commutative ones: each of src1 & src2 are 'hashed' together with the opcode and the instruction size (and the hash() function is essentially a no-op, not a costly thing anyway). The difference between the two cases is that for commutative instructions src2 is first hashed and then there is a fall-through to the unary ops to add src1's hash. The down side is only a slight increase in code complexity. > If for most of the case this actually > is the higher cost we might want to keep the old behavior. I can redo my tests and give the numbers but the measurements I made didn't didn't showed any significant differences: no speedup but also no slowdown. That said, I completly agree that there is not much justification to this patch. The only reason I made it was because when I modified the code for the commutative case (which don't give a speedup but eliminate more common expression) I took a look at the non-commutative situation and said to myself something like: "Oh, they are hashed symmetrically, like the commutative ones, that's not very logical". So, I have absolutely no problems if this patch is dropped. (much more interesting is the patch following this in the serie: "[RFC] CSE: relax type checking in hashing/compare"). Luc -- To unsubscribe from this list: send the line "unsubscribe linux-sparse" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html