Merge branch 'audit.b32' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit...

diff --git a/Documentation/RCU/checklist.txt b/Documentation/RCU/checklist.txt

index e118a7c..f4dffad 100644
--- a/Documentation/RCU/checklist.txt
+++ b/Documentation/RCU/checklist.txt
@@ -144,9 +144,47 @@ over a rather long period of time, but improvements are always welcome!
         whether the increased speed is worth it.
  
  8.     Although synchronize_rcu() is a bit slower than is call_rcu(),
-       it usually results in simpler code.  So, unless update performance
-       is important or the updaters cannot block, synchronize_rcu()
-       should be used in preference to call_rcu().
+       it usually results in simpler code.  So, unless update
+       performance is critically important or the updaters cannot block,
+       synchronize_rcu() should be used in preference to call_rcu().
+
+       An especially important property of the synchronize_rcu()
+       primitive is that it automatically self-limits: if grace periods
+       are delayed for whatever reason, then the synchronize_rcu()
+       primitive will correspondingly delay updates.  In contrast,
+       code using call_rcu() should explicitly limit update rate in
+       cases where grace periods are delayed, as failing to do so can
+       result in excessive realtime latencies or even OOM conditions.
+
+       Ways of gaining this self-limiting property when using call_rcu()
+       include:
+
+       a.      Keeping a count of the number of data-structure elements
+               used by the RCU-protected data structure, including those
+               waiting for a grace period to elapse.  Enforce a limit
+               on this number, stalling updates as needed to allow
+               previously deferred frees to complete.
+
+               Alternatively, limit only the number awaiting deferred
+               free rather than the total number of elements.
+
+       b.      Limiting update rate.  For example, if updates occur only
+               once per hour, then no explicit rate limiting is required,
+               unless your system is already badly broken.  The dcache
+               subsystem takes this approach -- updates are guarded
+               by a global lock, limiting their rate.
+
+       c.      Trusted update -- if updates can only be done manually by
+               superuser or some other trusted user, then it might not
+               be necessary to automatically limit them.  The theory
+               here is that superuser already has lots of ways to crash
+               the machine.
+
+       d.      Use call_rcu_bh() rather than call_rcu(), in order to take
+               advantage of call_rcu_bh()'s faster grace periods.
+
+       e.      Periodically invoke synchronize_rcu(), permitting a limited
+               number of updates per grace period.
  
  9.     All RCU list-traversal primitives, which include
         list_for_each_rcu(), list_for_each_entry_rcu(),
@@ -177,3 +215,47 @@ over a rather long period of time, but improvements are always welcome!
  
         If you want to wait for some of these other things, you might
         instead need to use synchronize_irq() or synchronize_sched().
+
+12.    Any lock acquired by an RCU callback must be acquired elsewhere
+       with irq disabled, e.g., via spin_lock_irqsave().  Failing to
+       disable irq on a given acquisition of that lock will result in
+       deadlock as soon as the RCU callback happens to interrupt that
+       acquisition's critical section.
+
+13.    SRCU (srcu_read_lock(), srcu_read_unlock(), and synchronize_srcu())
+       may only be invoked from process context.  Unlike other forms of
+       RCU, it -is- permissible to block in an SRCU read-side critical
+       section (demarcated by srcu_read_lock() and srcu_read_unlock()),
+       hence the "SRCU": "sleepable RCU".  Please note that if you
+       don't need to sleep in read-side critical sections, you should
+       be using RCU rather than SRCU, because RCU is almost always
+       faster and easier to use than is SRCU.
+
+       Also unlike other forms of RCU, explicit initialization
+       and cleanup are required via init_srcu_struct() and
+       cleanup_srcu_struct().  These are passed a "struct srcu_struct"
+       that defines the scope of a given SRCU domain.  Once initialized,
+       the srcu_struct is passed to srcu_read_lock(), srcu_read_unlock()
+       and synchronize_srcu().  A given synchronize_srcu() waits only
+       for SRCU read-side critical sections governed by srcu_read_lock()
+       and srcu_read_unlock() calls that have been passed the same
+       srcu_struct.  This property is what makes sleeping read-side
+       critical sections tolerable -- a given subsystem delays only
+       its own updates, not those of other subsystems using SRCU.
+       Therefore, SRCU is less prone to OOM the system than RCU would
+       be if RCU's read-side critical sections were permitted to
+       sleep.
+
+       The ability to sleep in read-side critical sections does not
+       come for free.  First, corresponding srcu_read_lock() and
+       srcu_read_unlock() calls must be passed the same srcu_struct.
+       Second, grace-period-detection overhead is amortized only
+       over those updates sharing a given srcu_struct, rather than
+       being globally amortized as it is for other forms of RCU.
+       Therefore, SRCU should be used in preference to rw_semaphore
+       only in extremely read-intensive situations, or in situations
+       requiring SRCU's read-side deadlock immunity or low read-side
+       realtime latency.
+
+       Note that rcu_assign_pointer() and rcu_dereference() relate to
+       SRCU just as they do to other forms of RCU.
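
Illustration (not part of the patch): item 8a in the first hunk describes
bounding the number of elements awaiting a deferred free and stalling
updates once the bound is reached. The sketch below is a userspace model
of that idea only. It does not use the kernel RCU API: model_defer_free()
stands in for call_rcu(), model_grace_period() stands in for the end of a
grace period, and MAX_PENDING is an arbitrary limit. All of these names
are hypothetical. In real kernel code the stall would more plausibly be a
blocking wait such as synchronize_rcu(), which is only legal where the
updater may sleep.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace model only: none of the names below are kernel APIs. */
#define MAX_PENDING 4	/* hypothetical cap on elements awaiting deferred free */

struct deferred {
	struct deferred *next;
	void *obj;		/* the element whose free is deferred */
};

static struct deferred *pending_head;
static size_t pending_count;	/* elements currently awaiting free */
static size_t stall_count;	/* times an update had to stall */
static size_t freed_count;	/* elements actually freed so far */

/* Stand-in for "a grace period has elapsed": run every queued free. */
static void model_grace_period(void)
{
	while (pending_head) {
		struct deferred *d = pending_head;

		pending_head = d->next;
		free(d->obj);
		free(d);
		pending_count--;
		freed_count++;
	}
}

/*
 * Stand-in for call_rcu() with the self-limiting check from item 8a:
 * if too many frees are already pending, stall until they complete
 * before queueing another one.  Returns 0 on success, -1 on failure.
 */
static int model_defer_free(void *obj)
{
	struct deferred *d;

	if (pending_count >= MAX_PENDING) {
		stall_count++;
		model_grace_period();	/* the model "waits" by draining */
	}

	d = malloc(sizeof(*d));
	if (!d)
		return -1;
	d->obj = obj;
	d->next = pending_head;
	pending_head = d;
	pending_count++;

	/* The invariant the throttle exists to maintain. */
	assert(pending_count <= MAX_PENDING);
	return 0;
}

/* Perform n updates, each retiring one old element; return peak backlog. */
static size_t model_run_updates(size_t n)
{
	size_t peak = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		void *old = malloc(16);

		if (!old || model_defer_free(old) != 0) {
			free(old);
			break;
		}
		if (pending_count > peak)
			peak = pending_count;
	}
	return peak;
}
```

However many updates are issued, the backlog in this model never exceeds
MAX_PENDING, which is the property that unthrottled call_rcu() use lacks
when grace periods are delayed. The variant mentioned in item 8a that
counts all elements, rather than only those awaiting free, would differ
only in which counter is compared against the limit.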