> On 19 Feb 2018, at 12:39, Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>
> On Mon 19-02-18 12:14:26, Robert Harris wrote:
>>
>>> On 19 Feb 2018, at 08:26, Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>>>
>>> On Sun 18-02-18 16:47:55, robert.m.harris@xxxxxxxxxx wrote:
>>>> From: "Robert M. Harris" <robert.m.harris@xxxxxxxxxx>
>>>>
>>>> __fragmentation_index() calculates a value used to determine whether
>>>> compaction should be favoured over page reclaim in the event of allocation
>>>> failure. The calculation itself is opaque and, on inspection, does not
>>>> match its existing description. The function purports to return a value
>>>> between 0 and 1000, representing units of 1/1000. Barring the case of a
>>>> pathological shortfall of memory, the lower bound is instead 500. This is
>>>> significant because it is the default value of sysctl_extfrag_threshold,
>>>> i.e. the value below which compaction should be avoided in favour of page
>>>> reclaim for costly pages.
>>>>
>>>> This patch implements and documents a modified version of the original
>>>> expression that returns a value in the range 0 <= index < 1000. It amends
>>>> the default value of sysctl_extfrag_threshold to preserve the existing
>>>> behaviour.
>>>
>>> It is not really clear to me what is the actual problem you are trying
>>> to solve by this patch. Is there any bug or are you just trying to
>>> improve the current implementation to be more effective?
>>
>> There is not a significant bug.
>>
>> The first problem is that the mathematical expression in
>> __fragmentation_index() is opaque, particularly given the lack of
>> description in the comments or the original commit message. This patch
>> provides such a description.
>>
>> Simply annotating the expression did not make sense since the formula
>> doesn't work as advertised. The fragmentation index is described as
>> being in the range 0 to 1000 but the bounds of the formula are instead
>> 500 to 1000.
>> This patch changes the formula so that its lower bound is 0.
>
> But why do we want to fix that in the first place? Why don't we simply
> deprecate the tunable and remove it altogether? Who is relying on tuning
> this option. Considering how it doesn't work as advertised and nobody
> complaining I have that feeling that it is not really used in wild…

I think it's a useful feature. Ignoring any contrived test case, there
will always be a lower limit on the degree of fragmentation that can be
achieved by compaction. If someone takes the trouble to measure this
then it is entirely reasonable that he or she should be able to inhibit
compaction for cases when fragmentation falls below some correspondingly
sized threshold.

I hope to improve upon the decision-making strategy in the allocator
slow path but that is not a short term goal. The current patch is
an improvement for the interim.

Robert Harris