sched: Fix and optimise calculation of the weight-inverse
authorStephan Baerwolf <stephan.baerwolf@tu-ilmenau.de>
Wed, 11 May 2011 16:03:29 +0000 (18:03 +0200)
committerIngo Molnar <mingo@elte.hu>
Mon, 16 May 2011 09:01:18 +0000 (11:01 +0200)
If the inverse loadweight should be zero, function "calc_delta_mine"
calculates the inverse of "lw->weight" (in 32bit integer ops).

This calculation is actually a little bit impure (because it is
inverting something around "lw-weight"+1), especially when
"lw->weight" becomes smaller.

The correct inverse would be 1/lw->weight multiplied by
"WMULT_CONST" for fixcomma-scaling it into integers.
(So WMULT_CONST/lw->weight ...)

The old, impure algorithm took two divisions for inverting lw->weight,
the new, more exact one only takes one and an additional unlikely-if.

Signed-off-by: Stephan Baerwolf <stephan.baerwolf@tu-ilmenau.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-0pz0wnyalr4tk4ln11xwumdx@git.kernel.org
[ This could explain some aritmetical issues for small shares but nothing
  concrete has been reported yet so we are not confident enough to queue
  this up in sched/urgent and for -stable backport. But if anyone finds
  this commit and sees it to fix some badness then we can certainly
  change our mind! ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
kernel/sched.c

index 70bec4f..c62acf4 100644 (file)
@@ -1330,15 +1330,15 @@ calc_delta_mine(unsigned long delta_exec, unsigned long weight,
 {
        u64 tmp;
 
+       tmp = (u64)delta_exec * weight;
+
        if (!lw->inv_weight) {
                if (BITS_PER_LONG > 32 && unlikely(lw->weight >= WMULT_CONST))
                        lw->inv_weight = 1;
                else
-                       lw->inv_weight = 1 + (WMULT_CONST-lw->weight/2)
-                               / (lw->weight+1);
+                       lw->inv_weight = WMULT_CONST / lw->weight;
        }
 
-       tmp = (u64)delta_exec * weight;
        /*
         * Check whether we'd overflow the 64-bit multiplication:
         */