timer: Reduce timer migration overhead if disabled
author Thomas Gleixner <tglx@linutronix.de>
Tue, 26 May 2015 22:50:33 +0000 (22:50 +0000)
committer Thomas Gleixner <tglx@linutronix.de>
Fri, 19 Jun 2015 13:18:28 +0000 (15:18 +0200)

Eric reported that the timer_migration sysctl is costly performance
wise, as every timer insertion has to check whether the feature is
enabled or not. Further, the check does not live in the timer code, so
we pay for an extra function call which touches an extra cache line
just to figure out that migration is disabled.

We can do better and store that information in the per-CPU (hr)timer
bases. I considered using a static key, but that's a nightmare to
update from the nohz code, and the timer base cache line is hot anyway
when we select a timer base.
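
The idea of caching the policy in the base can be illustrated with a small
user-space model. This is only a sketch, not the kernel code: the names
model_timer_base, model_bases, model_pick_target_cpu and
model_get_target_base are hypothetical, and the target CPU choice is reduced
to a fixed value. It shows that the decision needs nothing but the base that
is already being looked at.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_CPUS 4 /* hypothetical CPU count for this model */

/* Simplified stand-in for a per-CPU timer base (assumption, not the kernel struct). */
struct model_timer_base {
    int  cpu;
    bool migration_enabled; /* cached copy of the migration policy */
};

static struct model_timer_base model_bases[MODEL_NR_CPUS];

/* Hypothetical stand-in for picking a busy target CPU; always CPU 0 here. */
static int model_pick_target_cpu(void)
{
    return 0;
}

/* Start with migration disabled on every base. */
static void model_init_bases(void)
{
    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
        model_bases[cpu].cpu = cpu;
        model_bases[cpu].migration_enabled = false;
    }
}

/*
 * Select the base a timer is queued on. The only state consulted is the
 * flag inside the base itself, so no separate call into other code and no
 * additional global variable is needed to learn that migration is off.
 */
static struct model_timer_base *
model_get_target_base(struct model_timer_base *base, bool pinned)
{
    assert(base != NULL);
    if (base->migration_enabled && !pinned)
        return &model_bases[model_pick_target_cpu()];
    return base;
}
```

In this model a pinned timer, or any timer on a base whose flag is clear,
stays on the local base; only an unpinned timer on a base with the flag set
is redirected.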

The old logic enabled timer migration unconditionally if
CONFIG_NO_HZ was set, even if nohz was disabled on the kernel command
line.

With this modification, we start off with migration disabled. The user
visible sysctl is still set to enabled. If the kernel switches to NOHZ,
migration is enabled, unless the user disabled it via the sysctl
prior to the switch. If nohz=off is on the kernel command line,
migration stays disabled no matter what.
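
The state transitions described above can be modelled in user space as
follows. This is a sketch under stated assumptions, not the kernel
implementation: struct policy_state and the policy_* helpers are hypothetical
names, and the per-CPU flags are a plain array. The effective per-base flag
is treated as the logical AND of the sysctl value and the NOHZ state.

```c
#include <assert.h>
#include <stdbool.h>

#define POLICY_NR_CPUS 4 /* hypothetical CPU count for this model */

/* Hypothetical model of the inputs and the cached per-base result. */
struct policy_state {
    bool sysctl_enabled;   /* user visible knob, defaults to enabled */
    bool nohz_active;      /* set once the switch to NOHZ has happened */
    bool nohz_off_cmdline; /* models nohz=off on the command line */
    bool base_migration_enabled[POLICY_NR_CPUS];
};

/* Recompute the cached flag for every base from the two inputs. */
static void policy_update(struct policy_state *s)
{
    bool on = s->sysctl_enabled && s->nohz_active;

    for (int cpu = 0; cpu < POLICY_NR_CPUS; cpu++)
        s->base_migration_enabled[cpu] = on;
}

/* Boot state: sysctl reads as enabled, migration itself is disabled. */
static void policy_init(struct policy_state *s, bool nohz_off_cmdline)
{
    s->sysctl_enabled = true;
    s->nohz_active = false;
    s->nohz_off_cmdline = nohz_off_cmdline;
    policy_update(s);
}

/* The user writes the sysctl; the cached flags follow. */
static void policy_set_sysctl(struct policy_state *s, bool enabled)
{
    s->sysctl_enabled = enabled;
    policy_update(s);
}

/* The switch to NOHZ never happens when nohz=off was given. */
static void policy_switch_to_nohz(struct policy_state *s)
{
    if (s->nohz_off_cmdline)
        return;
    s->nohz_active = true;
    policy_update(s);
}
```

Keeping the derived value in each base means the insertion path reads one
flag, while the rarely executed update path does the work of combining the
inputs.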

Before:
  47.76%  hog       [.] main
  14.84%  [kernel]  [k] _raw_spin_lock_irqsave
   9.55%  [kernel]  [k] _raw_spin_unlock_irqrestore
   6.71%  [kernel]  [k] mod_timer
   6.24%  [kernel]  [k] lock_timer_base.isra.38
   3.76%  [kernel]  [k] detach_if_pending
   3.71%  [kernel]  [k] del_timer
   2.50%  [kernel]  [k] internal_add_timer
   1.51%  [kernel]  [k] get_nohz_timer_target
   1.28%  [kernel]  [k] __internal_add_timer
   0.78%  [kernel]  [k] timerfn
   0.48%  [kernel]  [k] wake_up_nohz_cpu

After:
  48.10%  hog       [.] main
  15.25%  [kernel]  [k] _raw_spin_lock_irqsave
   9.76%  [kernel]  [k] _raw_spin_unlock_irqrestore
   6.50%  [kernel]  [k] mod_timer
   6.44%  [kernel]  [k] lock_timer_base.isra.38
   3.87%  [kernel]  [k] detach_if_pending
   3.80%  [kernel]  [k] del_timer
   2.67%  [kernel]  [k] internal_add_timer
   1.33%  [kernel]  [k] __internal_add_timer
   0.73%  [kernel]  [k] timerfn
   0.54%  [kernel]  [k] wake_up_nohz_cpu

Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Joonwoo Park <joonwoop@codeaurora.org>
Cc: Wenbo Wang <wenbo.wang@memblaze.com>
Link: http://lkml.kernel.org/r/20150526224512.127050787@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
12 files changed:
include/linux/hrtimer.h
include/linux/sched.h
include/linux/sched/sysctl.h
include/linux/timer.h
kernel/rcu/tree_plugin.h
kernel/sched/core.c
kernel/sysctl.c
kernel/time/hrtimer.c
kernel/time/tick-internal.h
kernel/time/tick-sched.c
kernel/time/timer.c
kernel/time/timer_list.c

index 5db0558..6955102 100644
@@ -163,6 +163,7 @@ enum  hrtimer_base_type {
  * @cpu:               cpu number
  * @active_bases:      Bitfield to mark bases with active timers
  * @clock_was_set_seq: Sequence counter of clock was set events
+ * @migration_enabled: The migration of hrtimers to other cpus is enabled
  * @expires_next:      absolute time of the next event which was scheduled
  *                     via clock_set_next_event()
  * @next_timer:                Pointer to the first expiring timer
@@ -186,6 +187,7 @@ struct hrtimer_cpu_base {
        unsigned int                    cpu;
        unsigned int                    active_bases;
        unsigned int                    clock_was_set_seq;
+       bool                            migration_enabled;
 #ifdef CONFIG_HIGH_RES_TIMERS
        unsigned int                    in_hrtirq       : 1,
                                        hres_active     : 1,
Simple merge