11 years ago rcu: remove unused __list_for_each_rcu() macro
Paul E. McKenney [Thu, 16 Dec 2010 05:12:15 +0000 (21:12 -0800)]
rcu: remove unused __list_for_each_rcu() macro

Signed-off-by: Paul E. McKenney <>
11 years ago rculist: fix borked __list_for_each_rcu() macro
Mariusz Kozlowski [Wed, 15 Dec 2010 22:11:12 +0000 (23:11 +0100)]
rculist: fix borked __list_for_each_rcu() macro

This restores parentheses balance.

Signed-off-by: Mariusz Kozlowski <>
Signed-off-by: Paul E. McKenney <>
11 years ago rcu: reduce __call_rcu()-induced contention on rcu_node structures
Paul E. McKenney [Wed, 15 Dec 2010 01:36:02 +0000 (17:36 -0800)]
rcu: reduce __call_rcu()-induced contention on rcu_node structures

When the current __call_rcu() function was written, the expedited
APIs did not exist.  The __call_rcu() implementation therefore went
to great lengths to detect the end of old grace periods and to start
new ones, all in the name of reducing grace-period latency.  Now the
expedited APIs do exist, and the usage of __call_rcu() has increased
considerably.  This commit therefore causes __call_rcu() to avoid
worrying about grace periods unless there are a large number of
RCU callbacks stacked up on the current CPU.
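
A minimal user-space sketch of the idea, not the kernel's code: the enqueue
path leaves grace periods alone until the per-CPU queue has grown past a
threshold.  The struct, the helper names, and the threshold value are
assumptions made for illustration.

```c
#include <stdbool.h>

/* Hypothetical threshold for "a large number" of queued callbacks. */
#define CB_HIGH_WATERMARK 10000L

/* Hypothetical stand-in for the per-CPU callback state. */
struct cpu_callbacks {
	long qlen;       /* callbacks currently queued on this CPU */
	long gp_checks;  /* how often the slow grace-period path ran */
};

/* Placeholder for the expensive work of inspecting grace periods. */
static void check_grace_periods(struct cpu_callbacks *cbs)
{
	cbs->gp_checks++;
}

/* Enqueue one callback; returns true if the slow path was taken. */
static bool enqueue_callback(struct cpu_callbacks *cbs)
{
	cbs->qlen++;
	if (cbs->qlen <= CB_HIGH_WATERMARK)
		return false;  /* common case: do not touch grace periods */
	check_grace_periods(cbs);
	return true;
}
```

The common case touches only per-CPU state, which is what keeps it away from
shared locks in this model.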

Signed-off-by: Paul E. McKenney <>
11 years ago rcu: limit rcu_node leaf-level fanout
Paul E. McKenney [Wed, 15 Dec 2010 00:07:52 +0000 (16:07 -0800)]
rcu: limit rcu_node leaf-level fanout

Some recent benchmarks have indicated possible lock contention on the
leaf-level rcu_node locks.  This commit therefore limits the number of
CPUs per leaf-level rcu_node structure to 16, in other words, there
can be at most 16 rcu_data structures fanning into a given rcu_node
structure.  Prior to this, the limit was 32 on 32-bit systems and 64 on
64-bit systems.

Note that the fanout of non-leaf rcu_node structures is unchanged.  The
organization of accesses to the rcu_node tree is such that references
to non-leaf rcu_node structures are much less frequent than to the
leaf structures.
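
The effect of the limit is simple arithmetic, sketched below: with a smaller
leaf fanout, the same number of CPUs is spread over more leaf-level nodes, so
fewer CPUs share each leaf lock.  The function name is hypothetical.

```c
/*
 * Number of leaf-level nodes needed to cover ncpus CPUs when each leaf
 * accepts at most leaf_fanout CPUs (ceiling division).
 */
static unsigned int leaf_nodes_needed(unsigned int ncpus,
				      unsigned int leaf_fanout)
{
	return (ncpus + leaf_fanout - 1) / leaf_fanout;
}
```

For example, 64 CPUs fit under a single leaf at fanout 64 but need four leaves
at fanout 16.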

Signed-off-by: Paul E. McKenney <>
11 years ago rcu: fine-tune grace-period begin/end checks
Paul E. McKenney [Fri, 10 Dec 2010 23:02:47 +0000 (15:02 -0800)]
rcu: fine-tune grace-period begin/end checks

Use the CPU's bit in rnp->qsmask to determine whether or not the CPU
should try to report a quiescent state.  Handle overflow in the check
for rdp->gpnum having fallen behind.
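
The overflow handling can be illustrated with a wraparound-safe counter
comparison.  This is a user-space sketch; the helper name counter_before() is
hypothetical, and the commit's actual check is not reproduced here.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * True if counter value a is "before" b, even if the counter wrapped
 * between the two readings.  Unsigned subtraction wraps, so the
 * difference is huge exactly when a trails b.
 */
static bool counter_before(unsigned long a, unsigned long b)
{
	return ULONG_MAX / 2 < a - b;
}
```

A plain `a < b` would report that ULONG_MAX is not before 1, although a
counter that just wrapped has moved from ULONG_MAX forward to 1.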

Signed-off-by: Paul E. McKenney <>
11 years ago rcu: Keep gpnum and completed fields synchronized
Frederic Weisbecker [Fri, 10 Dec 2010 21:11:11 +0000 (22:11 +0100)]
rcu: Keep gpnum and completed fields synchronized

When a CPU that was in an extended quiescent state wakes
up and catches up with grace periods that remote CPUs
completed on its behalf, we update the completed field
but not the gpnum field, which keeps the stale ID of an
earlier grace period.

Later, note_new_gpnum() will interpret the difference between
the local CPU's and the node's grace-period IDs as a new grace
period to handle, and will then start hunting for a quiescent state.

But if every grace period has already been completed, this
interpretation is wrong, and we get stuck in clusters
of spurious softirqs because rcu_report_qs_rdp() turns
this broken state into an infinite loop.

The solution, as suggested by Lai Jiangshan, is to ensure that
the gpnum and completed fields stay synchronized when we catch
up with grace periods completed on our behalf by other CPUs.
This way we won't start noting spurious new grace periods.
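
A simplified user-space model of the fix, under the assumption that the
per-CPU and per-node state can be reduced to two counters each.  The struct
and function names are hypothetical, and the plain comparison ignores counter
wraparound.

```c
#include <stdbool.h>

/* Hypothetical, simplified per-CPU grace-period state. */
struct cpu_gp_state {
	unsigned long gpnum;      /* newest grace period this CPU knows of */
	unsigned long completed;  /* newest grace period known completed */
};

/* Hypothetical, simplified per-node grace-period state. */
struct node_gp_state {
	unsigned long gpnum;
	unsigned long completed;
};

/* Catch up after idle, keeping gpnum from lagging behind completed. */
static void catch_up(struct cpu_gp_state *cpu,
		     const struct node_gp_state *node)
{
	cpu->completed = node->completed;
	if (cpu->gpnum < cpu->completed)  /* plain compare: ignores wrap */
		cpu->gpnum = cpu->completed;
}

/* The CPU believes a new grace period started if the IDs differ. */
static bool sees_new_grace_period(const struct cpu_gp_state *cpu,
				  const struct node_gp_state *node)
{
	return cpu->gpnum != node->gpnum;
}
```

Without the gpnum adjustment in catch_up(), the CPU in this model would keep
reporting a new grace period that has in fact already completed.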

Suggested-by: Lai Jiangshan <>
Signed-off-by: Frederic Weisbecker <>
Cc: Paul E. McKenney <>
Cc: Ingo Molnar <>
Cc: Thomas Gleixner <>
Cc: Peter Zijlstra <>
Cc: Steven Rostedt <>
Signed-off-by: Paul E. McKenney <>
11 years ago rcu: Stop chasing QS if another CPU did it for us
Frederic Weisbecker [Fri, 10 Dec 2010 21:11:10 +0000 (22:11 +0100)]
rcu: Stop chasing QS if another CPU did it for us

When a CPU is idle and other CPUs have handled its extended
quiescent state to complete grace periods on its behalf,
it will catch up with the completed grace-period numbers
when it wakes up.

But at this point there might be no more grace periods to
complete, yet the woken CPU keeps its stale
qs_pending value and continues to chase quiescent
states even if that is no longer needed.

This results in clusters of spurious softirqs until a new
real grace period is started, because if we continue to
chase quiescent states after every grace period has
completed, rcu_report_qs_rdp() is puzzled and makes that
state run into infinite loops.

As suggested by Lai Jiangshan, just reset qs_pending if
someone completed every grace period on our behalf.

Suggested-by: Lai Jiangshan <>
Signed-off-by: Frederic Weisbecker <>
Cc: Paul E. McKenney <>
Cc: Ingo Molnar <>
Cc: Thomas Gleixner <>
Cc: Peter Zijlstra <>
Cc: Steven Rostedt <>
Signed-off-by: Paul E. McKenney <>
11 years ago rcu: increase synchronize_sched_expedited() batching
Tejun Heo [Tue, 23 Nov 2010 05:36:11 +0000 (21:36 -0800)]
rcu: increase synchronize_sched_expedited() batching

The fix in commit #6a0cc49 requires more than three concurrent instances
of synchronize_sched_expedited() before batching is possible.  This
patch uses a ticket-counter-like approach that is also not unrelated to
Lai Jiangshan's Ring RCU to allow sharing of expedited grace periods even
when there are only two concurrent instances of synchronize_sched_expedited().

This commit builds on Tejun's original posting, which may be found at, adding memory barriers, avoiding
overflow of signed integers (other than via atomic_t), and fixing the
detection of batching.
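
The ticket-counter idea can be sketched in user-space C11 as follows.  This is
an illustration under assumptions, not the patch itself: the variable and
function names are hypothetical, and the actual expedited grace period is left
out.  A caller takes a ticket on entry; a caller that completes an expedited
grace period publishes the ticket it took before starting; anyone whose ticket
is at or below the published value may return without doing the work again.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_uint tickets_started;  /* bumped by each caller on entry */
static atomic_uint tickets_done;     /* highest ticket known to be covered */

/* Wraparound-safe "a >= b" for unsigned counters. */
static bool wrap_ge(unsigned int a, unsigned int b)
{
	return a - b <= UINT_MAX / 2;
}

/* Take a ticket; tickets start at 1. */
static unsigned int take_ticket(void)
{
	return atomic_fetch_add(&tickets_started, 1) + 1;
}

/* Has a grace period that began after this ticket was taken finished? */
static bool ticket_covered(unsigned int ticket)
{
	return wrap_ge(atomic_load(&tickets_done), ticket);
}

/* Publish completion for every ticket up to and including snap. */
static void publish_done(unsigned int snap)
{
	unsigned int done = atomic_load(&tickets_done);

	/* Advance tickets_done to snap unless it is already at or past it. */
	while (!wrap_ge(done, snap)) {
		if (atomic_compare_exchange_weak(&tickets_done, &done, snap))
			break;
	}
}
```

In this model two concurrent callers are enough for sharing: the second
caller's published ticket covers the first caller's ticket.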

Signed-off-by: Tejun Heo <>
Signed-off-by: Paul E. McKenney <>
11 years ago rcu: Make synchronize_srcu_expedited() fast if running readers
Paul E. McKenney [Tue, 26 Oct 2010 09:11:40 +0000 (02:11 -0700)]
rcu: Make synchronize_srcu_expedited() fast if running readers

The synchronize_srcu_expedited() function is currently quick if there
are no active readers, but will delay a full jiffy if there are any.
If these readers leave their SRCU read-side critical sections quickly,
this is way too long to wait.  So this commit first waits ten microseconds,
and only then falls back to jiffy-at-a-time waiting.
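
The two-phase wait can be modelled with a simulated clock.  This is a sketch
only: the struct, the names, and the tick length of 10000 microseconds are
assumptions for illustration, not values taken from the commit.

```c
#include <stdbool.h>

#define SHORT_WAIT_US 10UL     /* brief first wait */
#define TICK_US       10000UL  /* assumed length of one jiffy */

/* Hypothetical simulation state. */
struct wait_sim {
	unsigned long now_us;           /* simulated current time */
	unsigned long readers_done_us;  /* time at which readers have left */
	unsigned long long_waits;       /* number of jiffy-long waits taken */
};

static bool readers_active(const struct wait_sim *s)
{
	return s->now_us < s->readers_done_us;
}

/* Wait briefly first, and only then fall back to jiffy-at-a-time waits. */
static void wait_for_readers(struct wait_sim *s)
{
	if (!readers_active(s))
		return;
	s->now_us += SHORT_WAIT_US;
	while (readers_active(s)) {
		s->now_us += TICK_US;
		s->long_waits++;
	}
}
```

Readers that finish within the short wait never cost the caller a full tick in
this model.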

Reported-by: Avi Kivity <>
Reported-by: Marcelo Tosatti <>
Tested-by: Takuya Yoshikawa <>
Signed-off-by: Paul E. McKenney <>
11 years ago rcu: fix race condition in synchronize_sched_expedited()
Paul E. McKenney [Mon, 25 Oct 2010 14:39:22 +0000 (07:39 -0700)]
rcu: fix race condition in synchronize_sched_expedited()

The new (early 2010) implementation of synchronize_sched_expedited() uses
try_stop_cpus() to force a context switch on every CPU.  It also permits
concurrent calls to synchronize_sched_expedited() to share a single call
to try_stop_cpus() through use of an atomically incremented
synchronize_sched_expedited_count variable.  Unfortunately, this is
subject to failure as follows:

o Task A invokes synchronize_sched_expedited(), try_stop_cpus()
  succeeds, but Task A is preempted before getting to the atomic
  increment of synchronize_sched_expedited_count.

o Task B also invokes synchronize_sched_expedited(), with exactly
  the same outcome as Task A.

o Task C also invokes synchronize_sched_expedited(), again with
  exactly the same outcome as Tasks A and B.

o Task D also invokes synchronize_sched_expedited(), but only
  gets as far as acquiring the mutex within try_stop_cpus()
  before being preempted, interrupted, or otherwise delayed.

o Task E also invokes synchronize_sched_expedited(), but only
  gets to the snapshotting of synchronize_sched_expedited_count.

o Tasks A, B, and C all increment synchronize_sched_expedited_count.

o Task E fails to get the mutex, so checks the new value
  of synchronize_sched_expedited_count.  It finds that the
  value has increased, so (wrongly) assumes that its work
  has been done, returning despite there having been no
  expedited grace period since it began.

The solution is to have the lowest-numbered CPU atomically increment
the synchronize_sched_expedited_count variable within the
synchronize_sched_expedited_cpu_stop() function, which is under
the protection of the mutex acquired by try_stop_cpus().  However, this
also requires that piggybacking tasks wait for three rather than two
instances of try_stop_cpus(), because we cannot control the order in
which the per-CPU callback functions occur.

Cc: Tejun Heo <>
Cc: Lai Jiangshan <>
Signed-off-by: Paul E. McKenney <>
11 years ago rcu: update documentation/comments for Lai's adoption patch
Paul E. McKenney [Wed, 20 Oct 2010 19:06:18 +0000 (12:06 -0700)]
rcu: update documentation/comments for Lai's adoption patch

Lai's RCU-callback immediate-adoption patch changes the RCU tracing
output, so update tracing.txt.  Also update a few comments to clarify
the synchronization design.

Signed-off-by: Paul E. McKenney <>
11 years ago rcu,cleanup: simplify the code when cpu is dying
Lai Jiangshan [Wed, 20 Oct 2010 06:13:06 +0000 (14:13 +0800)]
rcu,cleanup: simplify the code when cpu is dying

When we handle the CPU_DYING notifier, the whole system is stopped except
for the current CPU.  We therefore need no synchronization with the other
CPUs.  This allows us to move any orphaned RCU callbacks directly to the
list of any online CPU without needing to run them through the global
orphan lists.  These global orphan lists can therefore be dispensed with.
This commit makes these changes, though it currently victimizes CPU 0 @@@.
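
Moving callbacks directly to another CPU's list amounts to a list splice.  The
following user-space sketch uses a singly linked list with a tail pointer; the
type and function names are hypothetical.  No locking appears because, as in
the scenario described above, nothing else is assumed to be running.

```c
#include <stddef.h>

struct callback {
	struct callback *next;
};

/* Hypothetical per-CPU callback list with O(1) append and splice. */
struct callback_list {
	struct callback *head;
	struct callback **tail;  /* points at the terminating NULL pointer */
	long len;
};

static void callback_list_init(struct callback_list *l)
{
	l->head = NULL;
	l->tail = &l->head;
	l->len = 0;
}

static void callback_list_add(struct callback_list *l, struct callback *cb)
{
	cb->next = NULL;
	*l->tail = cb;
	l->tail = &cb->next;
	l->len++;
}

/* Move everything from the dying CPU's list to the end of the adopter's. */
static void adopt_callbacks(struct callback_list *adopter,
			    struct callback_list *dying)
{
	if (dying->head == NULL)
		return;
	*adopter->tail = dying->head;
	adopter->tail = dying->tail;
	adopter->len += dying->len;
	callback_list_init(dying);
}
```

The splice preserves callback order and leaves the dying CPU's list empty, so
no intermediate orphan list is needed in this model.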

Signed-off-by: Lai Jiangshan <>
Signed-off-by: Paul E. McKenney <>
11 years ago rcu,cleanup: move synchronize_sched_expedited() out of sched.c
Lai Jiangshan [Thu, 21 Oct 2010 03:29:05 +0000 (11:29 +0800)]
rcu,cleanup: move synchronize_sched_expedited() out of sched.c

The first version of synchronize_sched_expedited() used the migration
code in the scheduler, and was therefore implemented in kernel/sched.c.
However, the more recent version of this code no longer uses the
migration code, so this commit moves it to the main RCU source files.

Signed-off-by: Lai Jiangshan <>
Signed-off-by: Paul E. McKenney <>
11 years ago rcu: get rid of obsolete "classic" names in TREE_RCU tracing
Paul E. McKenney [Fri, 1 Oct 2010 04:33:32 +0000 (21:33 -0700)]
rcu: get rid of obsolete "classic" names in TREE_RCU tracing

The TREE_RCU tracing had obsolete rcuclassic_trace_init() and
rcuclassic_trace_cleanup() function names.  This commit brings them
up to date: rcutree_trace_init() and rcutree_trace_cleanup().

Signed-off-by: Paul E. McKenney <>
11 years ago rcu: Distinguish between boosting and boosted
Paul E. McKenney [Thu, 4 Nov 2010 21:55:26 +0000 (14:55 -0700)]
rcu: Distinguish between boosting and boosted

RCU priority boosting's tracing did not distinguish between ongoing
boosting and completion of boosting.  This commit therefore adds this
distinction.

Signed-off-by: Paul E. McKenney <>
Signed-off-by: Paul E. McKenney <>
11 years ago rcu: document TINY_RCU and TINY_PREEMPT_RCU tracing.
Paul E. McKenney [Thu, 4 Nov 2010 21:31:19 +0000 (14:31 -0700)]
rcu: document TINY_RCU and TINY_PREEMPT_RCU tracing.

Add the required verbiage to Documentation/RCU/trace.txt.

Signed-off-by: Paul E. McKenney <>
Signed-off-by: Paul E. McKenney <>
11 years ago rcu: add tracing for TINY_RCU and TINY_PREEMPT_RCU
Paul E. McKenney [Fri, 1 Oct 2010 04:26:52 +0000 (21:26 -0700)]
rcu: add tracing for TINY_RCU and TINY_PREEMPT_RCU

Add tracing for the tiny RCU implementations, including statistics on
boosting in the case of TINY_PREEMPT_RCU and RCU_BOOST.

Signed-off-by: Paul E. McKenney <>
Signed-off-by: Paul E. McKenney <>
11 years ago rcu: priority boosting for TINY_PREEMPT_RCU
Paul E. McKenney [Tue, 28 Sep 2010 00:25:23 +0000 (17:25 -0700)]
rcu: priority boosting for TINY_PREEMPT_RCU

Add priority boosting, but only for TINY_PREEMPT_RCU.  This is enabled
by the default-off RCU_BOOST kernel parameter.  The priority to which to
boost preempted RCU readers is controlled by the RCU_BOOST_PRIO kernel
parameter (defaulting to real-time priority 1) and the time to wait
before boosting the readers blocking a given grace period is controlled
by the RCU_BOOST_DELAY kernel parameter (defaulting to 500 milliseconds).

Signed-off-by: Paul E. McKenney <>
Signed-off-by: Paul E. McKenney <>
11 years ago rcu: move TINY_RCU from softirq to kthread
Paul E. McKenney [Thu, 9 Sep 2010 20:40:39 +0000 (13:40 -0700)]
rcu: move TINY_RCU from softirq to kthread

If RCU priority boosting is to be meaningful, callback invocation must
be boosted in addition to preempted RCU readers.  Otherwise, in the presence
of CPU-bound real-time threads, the grace period ends, but the callbacks don't
get invoked.  If the callbacks don't get invoked, the associated memory
doesn't get freed, so the system is still subject to OOM.

But it is not reasonable to priority-boost RCU_SOFTIRQ, so this commit
moves the callback invocations to a kthread, which can be boosted easily.

Signed-off-by: Paul E. McKenney <>
Signed-off-by: Paul E. McKenney <>
11 years ago rcu: add priority-inversion testing to rcutorture
Paul E. McKenney [Thu, 2 Sep 2010 23:16:14 +0000 (16:16 -0700)]
rcu: add priority-inversion testing to rcutorture

Add an optional test to force long-term preemption of RCU read-side
critical sections, controlled by new test_boost, test_boost_interval,
and test_boost_duration module parameters.  This is to be used to
test RCU priority boosting.

Signed-off-by: Paul E. McKenney <>
11 years ago sched: fix RCU lockdep splat from task_group()
Peter Zijlstra [Thu, 16 Sep 2010 15:50:31 +0000 (17:50 +0200)]
sched: fix RCU lockdep splat from task_group()

This addresses the following RCU lockdep splat:

[0.051203] CPU0: AMD QEMU Virtual CPU version 0.12.4 stepping 03
[0.052999] lockdep: fixing up alternatives.
[0.054106] ===================================================
[0.054999] [ INFO: suspicious rcu_dereference_check() usage. ]
[0.054999] ---------------------------------------------------
[0.054999] kernel/sched.c:616 invoked rcu_dereference_check() without protection!
[0.054999] other info that might help us debug this:
[0.054999] rcu_scheduler_active = 1, debug_locks = 1
[0.054999] 3 locks held by swapper/1:
[0.054999]  #0:  (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff814be933>] cpu_up+0x42/0x6a
[0.054999]  #1:  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff810400d8>] cpu_hotplug_begin+0x2a/0x51
[0.054999]  #2:  (&rq->lock){-.-...}, at: [<ffffffff814be2f7>] init_idle+0x2f/0x113
[0.054999] stack backtrace:
[0.054999] Pid: 1, comm: swapper Not tainted 2.6.35 #1
[0.054999] Call Trace:
[0.054999]  [<ffffffff81068054>] lockdep_rcu_dereference+0x9b/0xa3
[0.054999]  [<ffffffff810325c3>] task_group+0x7b/0x8a
[0.054999]  [<ffffffff810325e5>] set_task_rq+0x13/0x40
[0.054999]  [<ffffffff814be39a>] init_idle+0xd2/0x113
[0.054999]  [<ffffffff814be78a>] fork_idle+0xb8/0xc7
[0.054999]  [<ffffffff81068717>] ? mark_held_locks+0x4d/0x6b
[0.054999]  [<ffffffff814bcebd>] do_fork_idle+0x17/0x2b
[0.054999]  [<ffffffff814bc89b>] native_cpu_up+0x1c1/0x724
[0.054999]  [<ffffffff814bcea6>] ? do_fork_idle+0x0/0x2b
[0.054999]  [<ffffffff814be876>] _cpu_up+0xac/0x127
[0.054999]  [<ffffffff814be946>] cpu_up+0x55/0x6a
[0.054999]  [<ffffffff81ab562a>] kernel_init+0xe1/0x1ff
[0.054999]  [<ffffffff81003854>] kernel_thread_helper+0x4/0x10
[0.054999]  [<ffffffff814c353c>] ? restore_args+0x0/0x30
[0.054999]  [<ffffffff81ab5549>] ? kernel_init+0x0/0x1ff
[0.054999]  [<ffffffff81003850>] ? kernel_thread_helper+0x0/0x10
[0.056074] Booting Node   0, Processors  #1lockdep: fixing up alternatives.
[0.130045]  #2lockdep: fixing up alternatives.
[0.203089]  #3 Ok.
[0.275286] Brought up 4 CPUs
[0.276005] Total of 4 processors activated (16017.17 BogoMIPS).

The cgroup_subsys_state structures referenced by idle tasks are never
freed, because the idle tasks should be part of the root cgroup,
which is not removable.

The problem is that while we do in-fact hold rq->lock, the newly spawned
idle thread's cpu is not yet set to the correct cpu so the lockdep check
in task_group():


will fail.

But this is a chicken and egg problem.  Setting the CPU's runqueue requires
that the CPU's runqueue already be set.  ;-)

So insert an RCU read-side critical section to avoid the complaint.
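
Why wrapping the access in a read-side critical section silences the check can
be shown with a small user-space model.  Everything here is a stand-in: the
model_ functions only imitate the behaviour of rcu_read_lock(),
rcu_read_unlock(), and rcu_dereference_check(), and lookup_group() is a
hypothetical caller.

```c
#include <stdbool.h>

static int rcu_nesting;         /* models the reader nesting depth */
static int lockdep_complaints;  /* models the number of splats emitted */

static void model_rcu_read_lock(void)
{
	rcu_nesting++;
}

static void model_rcu_read_unlock(void)
{
	rcu_nesting--;
}

/* Complain unless inside a reader or the extra condition holds. */
static void *model_deref_check(void *p, bool extra_condition)
{
	if (rcu_nesting == 0 && !extra_condition)
		lockdep_complaints++;
	return p;
}

/* Hypothetical caller whose lock-held condition cannot be verified yet. */
static void *lookup_group(void *group, bool lock_seen_as_held)
{
	void *ret;

	model_rcu_read_lock();
	ret = model_deref_check(group, lock_seen_as_held);
	model_rcu_read_unlock();
	return ret;
}
```

The unprotected access raises a complaint in this model, while the same access
inside the read-side critical section does not, regardless of the lock
condition.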

Signed-off-by: Peter Zijlstra <>
Signed-off-by: Paul E. McKenney <>
11 years ago rcu: using ACCESS_ONCE() to observe the jiffies_stall/rnp->qsmask value
Dongdong Deng [Tue, 28 Sep 2010 08:32:43 +0000 (16:32 +0800)]
rcu: using ACCESS_ONCE() to observe the jiffies_stall/rnp->qsmask value

Use ACCESS_ONCE() to observe the jiffies_stall/rnp->qsmask value
because the caller does not hold the root_rcu/rnp node's lock.  Although
use without ACCESS_ONCE() is safe because the loaded value is used
only once, ACCESS_ONCE() is a good documentation aid: it shows that the
variables are being loaded without the services of a lock.
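
For illustration, a user-space sketch of a lockless single load through a
volatile cast.  The macro follows the well-known volatile-cast idiom and relies
on the GCC/Clang __typeof__ extension; the variable jiffies_stall_model and the
function stall_deadline_passed() are hypothetical.

```c
/* Access x exactly once by going through a volatile-qualified pointer. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* Hypothetical shared variable that other threads may update. */
static unsigned long jiffies_stall_model;

/* Returns nonzero once the current time has reached the deadline. */
static int stall_deadline_passed(unsigned long now)
{
	/* One lockless load; the compiler may not re-read the variable. */
	unsigned long deadline = ACCESS_ONCE(jiffies_stall_model);

	return (long)(now - deadline) >= 0;
}
```

The macro also marks, for the human reader, the places where a variable is read
without holding its lock.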

Signed-off-by: Dongdong Deng <>
CC: Dipankar Sarma <>
CC: Paul E. McKenney <>
Signed-off-by: Paul E. McKenney <>
11 years ago sched: suppress RCU lockdep splat in task_fork_fair
Paul E. McKenney [Thu, 7 Oct 2010 00:32:51 +0000 (17:32 -0700)]
sched: suppress RCU lockdep splat in task_fork_fair

> ===================================================
> [ INFO: suspicious rcu_dereference_check() usage. ]
> ---------------------------------------------------
> /home/greearb/git/linux.wireless-testing/kernel/sched.c:618 invoked rcu_dereference_check() without protection!
> other info that might help us debug this:
> rcu_scheduler_active = 1, debug_locks = 1
> 1 lock held by ifup/23517:
>   #0:  (&rq->lock){-.-.-.}, at: [<c042f782>] task_fork_fair+0x3b/0x108
> stack backtrace:
> Pid: 23517, comm: ifup Not tainted 2.6.36-rc6-wl+ #5
> Call Trace:
>   [<c075e219>] ? printk+0xf/0x16
>   [<c0455842>] lockdep_rcu_dereference+0x74/0x7d
>   [<c0426854>] task_group+0x6d/0x79
>   [<c042686e>] set_task_rq+0xe/0x57
>   [<c042f79e>] task_fork_fair+0x57/0x108
>   [<c042e965>] sched_fork+0x82/0xf9
>   [<c04334b3>] copy_process+0x569/0xe8e
>   [<c0433ef0>] do_fork+0x118/0x262
>   [<c076302f>] ? do_page_fault+0x16a/0x2cf
>   [<c044b80c>] ? up_read+0x16/0x2a
>   [<c04085ae>] sys_clone+0x1b/0x20
>   [<c04030a5>] ptregs_clone+0x15/0x30
>   [<c0402f1c>] ? sysenter_do_call+0x12/0x38

Here a newly created task is having its runqueue assigned.  The new task
is not yet on the tasklist, so cannot go away.  This is therefore a false
positive; suppress it with an RCU read-side critical section.

Reported-by: Ben Greear <>
Signed-off-by: Paul E. McKenney <>
Tested-by: Ben Greear <>
11 years ago net: suppress RCU lockdep false positive in sock_update_classid
Paul E. McKenney [Thu, 7 Oct 2010 00:15:35 +0000 (17:15 -0700)]
net: suppress RCU lockdep false positive in sock_update_classid

> ===================================================
> [ INFO: suspicious rcu_dereference_check() usage. ]
> ---------------------------------------------------
> include/linux/cgroup.h:542 invoked rcu_dereference_check() without protection!
> other info that might help us debug this:
> rcu_scheduler_active = 1, debug_locks = 0
> 1 lock held by swapper/1:
>  #0:  (net_mutex){+.+.+.}, at: [<ffffffff813e9010>] register_pernet_subsys+0x1f/0x47
> stack backtrace:
> Pid: 1, comm: swapper Not tainted #1
> Call Trace:
>  [<ffffffff8107bd3a>] lockdep_rcu_dereference+0xaa/0xb3
>  [<ffffffff813e04b9>] sock_update_classid+0x7c/0xa2
>  [<ffffffff813e054a>] sk_alloc+0x6b/0x77
>  [<ffffffff8140b281>] __netlink_create+0x37/0xab
>  [<ffffffff813f941c>] ? rtnetlink_rcv+0x0/0x2d
>  [<ffffffff8140cee1>] netlink_kernel_create+0x74/0x19d
>  [<ffffffff8149c3ca>] ? __mutex_lock_common+0x339/0x35b
>  [<ffffffff813f7e9c>] rtnetlink_net_init+0x2e/0x48
>  [<ffffffff813e8d7a>] ops_init+0xe9/0xff
>  [<ffffffff813e8f0d>] register_pernet_operations+0xab/0x130
>  [<ffffffff813e901f>] register_pernet_subsys+0x2e/0x47
>  [<ffffffff81db7bca>] rtnetlink_init+0x53/0x102
>  [<ffffffff81db835c>] netlink_proto_init+0x126/0x143
>  [<ffffffff81db8236>] ? netlink_proto_init+0x0/0x143
>  [<ffffffff810021b8>] do_one_initcall+0x72/0x186
>  [<ffffffff81d78ebc>] kernel_init+0x23b/0x2c9
>  [<ffffffff8100aae4>] kernel_thread_helper+0x4/0x10
>  [<ffffffff8149e2d0>] ? restore_args+0x0/0x30
>  [<ffffffff81d78c81>] ? kernel_init+0x0/0x2c9
>  [<ffffffff8100aae0>] ? kernel_thread_helper+0x0/0x10

The sock_update_classid() function calls task_cls_classid(current),
but the calling task cannot go away, so there is no danger of
the associated structures disappearing.  Insert an RCU read-side
critical section to suppress the false positive.

Reported-by: Subrata Modak <>
Signed-off-by: Paul E. McKenney <>
11 years ago Merge commit 'v2.6.36-rc7' into core/rcu
Ingo Molnar [Thu, 7 Oct 2010 07:43:38 +0000 (09:43 +0200)]
Merge commit 'v2.6.36-rc7' into core/rcu

Merge reason: Update from -rc3 to -rc7.

Signed-off-by: Ingo Molnar <>
11 years ago Merge branch 'rcu/urgent' of git://
Ingo Molnar [Thu, 7 Oct 2010 07:43:11 +0000 (09:43 +0200)]
Merge branch 'rcu/urgent' of git://git./linux/kernel/git/paulmck/linux-2.6-rcu into core/rcu

11 years ago Linux 2.6.36-rc7 v2.6.36-rc7
Linus Torvalds [Wed, 6 Oct 2010 20:39:52 +0000 (13:39 -0700)]
Linux 2.6.36-rc7

11 years ago Merge branch 'upstream' of git://
Linus Torvalds [Wed, 6 Oct 2010 20:27:19 +0000 (13:27 -0700)]
Merge branch 'upstream' of git://

* 'upstream' of git://
  MIPS: Octeon: Place cnmips_cu2_setup in __init memory.
  MIPS: Don't place cu2 notifiers in __cpuinitdata
  MIPS: Calculate VMLINUZ_LOAD_ADDRESS based on the length of vmlinux.bin
  MIPS: Alchemy: Resolve prom section mismatches
  MIPS: Fix syscall 64 bit number comments.
  MIPS: Hookup fanotify_init, fanotify_mark, and prlimit64 syscalls.
  MIPS: N32: Fix getdents64 syscall for n32
  MIPS: Remove pr_<level> uses of KERN_<level>
  MIPS: PNX8550: Sort out machine halt, restart and powerdown functions.
  MIPS: GIC: Remove dependencies from Malta files.
  MIPS: Kconfig: Fix and clarify kconfig help text for VSMP and SMTC.
  MIPS: DMA: Fix computation of DMA flags from device's coherent_dma_mask.
  MIPS: Audit: Fix hang in entry.S.
  MIPS: Document why RELOC_HIDE is there.
  MIPS: Octeon: Determine if helper needs to be built
  MIPS: Use generic atomic64 for 32-bit kernels
  MIPS: RM7000: Symbol should be static
  MIPS: kspd: Adjust confusing if indentation
  MIPS: Fix a typo.

11 years ago Merge branch 'for-linus' of git://
Linus Torvalds [Wed, 6 Oct 2010 18:11:18 +0000 (11:11 -0700)]
Merge branch 'for-linus' of git://

* 'for-linus' of git://
  writeback: always use sb->s_bdi for writeback purposes

11 years ago Merge branch 'v2.6.36-rc6-urgent-fixes' of git://
Linus Torvalds [Wed, 6 Oct 2010 16:51:28 +0000 (09:51 -0700)]
Merge branch 'v2.6.36-rc6-urgent-fixes' of git://

* 'v2.6.36-rc6-urgent-fixes' of git://
  xen: do not initialize PV timers on HVM if !xen_have_vector_callback
  xen: do not set xenstored_ready before xenbus_probe on hvm

11 years ago Merge branch 'for-linus' of git://
Linus Torvalds [Wed, 6 Oct 2010 16:50:41 +0000 (09:50 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/mszeredi/fuse

* 'for-linus' of git://
  fuse: Initialize total_len in fuse_retrieve()

11 years ago powerpc: remove unused variable
Stephen Rothwell [Wed, 6 Oct 2010 00:06:44 +0000 (11:06 +1100)]
powerpc: remove unused variable

Since powerpc uses -Werror on arch powerpc, the build was broken like this:

  cc1: warnings being treated as errors
  arch/powerpc/kernel/module.c: In function 'module_finalize':
  arch/powerpc/kernel/module.c:66: error: unused variable 'err'

Signed-off-by: Stephen Rothwell <>
Signed-off-by: Linus Torvalds <>
11 years ago rcu: move check from rcu_dereference_bh to rcu_read_lock_bh_held
Paul E. McKenney [Tue, 5 Oct 2010 21:03:02 +0000 (14:03 -0700)]
rcu: move check from rcu_dereference_bh to rcu_read_lock_bh_held

As suggested by Linus, push the irqs_disabled() down to the
rcu_read_lock_bh_held() level so that all callers get the benefit
of the correct check.
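
A minimal model of why the check belongs in the lower-level helper: once the
helper itself accounts for disabled interrupts, every caller inherits the
correct answer.  The struct and function names are hypothetical stand-ins for
the kernel's context tests.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the execution context. */
struct ctx_model {
	bool in_softirq;     /* bottom halves disabled or being processed */
	bool irqs_disabled;  /* interrupts disabled on this CPU */
};

/*
 * Models the "is a bh read-side critical section held" test.  With
 * interrupts disabled, bottom halves cannot run either, so that state
 * counts as well.
 */
static bool bh_reader_held(const struct ctx_model *c)
{
	return c->in_softirq || c->irqs_disabled;
}
```

Callers that only ask bh_reader_held() need no extra irqs_disabled test of
their own in this model.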

Signed-off-by: Paul E. McKenney <>
11 years ago Merge branch 'core-fixes-for-linus' of git://
Linus Torvalds [Tue, 5 Oct 2010 20:07:43 +0000 (13:07 -0700)]
Merge branch 'core-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'core-fixes-for-linus' of git://
  rcu: rcu_read_lock_bh_held(): disabling irqs also disables bh
  generic-ipi: Fix deadlock in __smp_call_function_single

11 years ago Merge branch 'perf-fixes-for-linus' of git://
Linus Torvalds [Tue, 5 Oct 2010 18:57:37 +0000 (11:57 -0700)]
Merge branch 'perf-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'perf-fixes-for-linus' of git://
  perf trace scripting: Fix extern struct definitions
  perf ui hist browser: Fix segfault on 'a' for annotate
  perf tools: Fix build breakage
  perf, x86: Handle in flight NMIs on P4 platform
  oprofile, ARM: Release resources on failure
  oprofile: Add Support for Intel CPU Family 6 / Model 29

11 years ago wait: using uninitialized member of wait queue
Evgeny Kuznetsov [Tue, 5 Oct 2010 08:47:57 +0000 (12:47 +0400)]
wait: using uninitialized member of wait queue

The "flags" member of "struct wait_queue_t" is used in several places in
the kernel code without being initialized by init_wait().  "flags" is
used in bitwise operations.

If "flags" is not initialized, unexpected behaviour may result:
incorrect flags might be used later in the code.

Added initialization of "wait_queue_t.flags" with zero value into
init_wait().
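
The hazard and the fix can be modelled in a few lines of user-space C.  The
struct, the flag constant, and init_wait_model() are hypothetical stand-ins,
not the kernel's definitions.

```c
/* Hypothetical flag bit, later tested with bitwise operations. */
#define WQ_FLAG_EXCLUSIVE_MODEL 0x01u

/* Hypothetical wait-queue entry. */
struct wait_entry_model {
	unsigned int flags;
	void *private_data;
};

/* Initialize an entry; flags start from a known zero value. */
static void init_wait_model(struct wait_entry_model *w, void *owner)
{
	w->flags = 0;  /* without this, later |= and & see stale bits */
	w->private_data = owner;
}
```

Because later code only sets or clears individual bits, any stale bits left in
an uninitialized flags field would survive those operations.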

Signed-off-by: Evgeny Kuznetsov <>
[ The bit we care about does end up being initialized by both
   prepare_to_wait() and add_to_wait_queue(), so this doesn't seem to
   cause actual bugs, but is definitely the right thing to do -Linus ]
Signed-off-by: Linus Torvalds <>
11 years ago modules: Fix module_bug_list list corruption race
Linus Torvalds [Tue, 5 Oct 2010 18:29:27 +0000 (11:29 -0700)]
modules: Fix module_bug_list list corruption race

With all the recent module loading cleanups, we've minimized the code
that sits under module_mutex, fixing various deadlocks and making it
possible to do most of the module loading in parallel.

However, that whole conversion totally missed the rather obscure code
that adds a new module to the list for BUG() handling.  That code was
doubly obscure because (a) the code itself lives in lib/bugs.c (for
dubious reasons) and (b) it gets called from the architecture-specific
"module_finalize()" rather than from generic code.

Calling it from arch-specific code makes no sense what-so-ever to begin
with, and is now actively wrong since that code isn't protected by the
module loading lock any more.

So this commit moves the "module_bug_{finalize,cleanup}()" calls away
from the arch-specific code, and into the generic code - and in the
process protects it with the module_mutex so that the list operations
are now safe.

Future fixups:
 - move the module list handling code into kernel/module.c where it
   belongs.
 - get rid of 'module_bug_list' and just use the regular list of modules
   (called 'modules' - imagine that) that we already create and maintain
   for other reasons.

Reported-and-tested-by: Thomas Gleixner <>
Cc: Rusty Russell <>
Cc: Adrian Bunk <>
Cc: Andrew Morton <>
Signed-off-by: Linus Torvalds <>
11 years ago xen: do not initialize PV timers on HVM if !xen_have_vector_callback
Stefano Stabellini [Fri, 1 Oct 2010 16:35:46 +0000 (17:35 +0100)]
xen: do not initialize PV timers on HVM if !xen_have_vector_callback

If !xen_have_vector_callback, do not initialize the PV timer unconditionally,
because we still don't know how many cpus are available, and if there is
more than one we won't be able to receive the timer interrupts on
cpu > 0.

This patch fixes a hang at boot when Xen does not support vector
callbacks and the guest has multiple vcpus.

Signed-off-by: Stefano Stabellini <>
Acked-by: Jeremy Fitzhardinge <>
11 years ago xen: do not set xenstored_ready before xenbus_probe on hvm
Stefano Stabellini [Mon, 4 Oct 2010 15:10:06 +0000 (16:10 +0100)]
xen: do not set xenstored_ready before xenbus_probe on hvm

register_xenstore_notifier() should guarantee that the caller gets
notified even if xenstore is already up.
Therefore we revert "do not notify callers from
register_xenstore_notifier" and set xenstored_ready at the right time for
PV on HVM guests too.
In fact, in the case of PV on HVM guests, xenstored is ready only after the
platform pci driver has completed its initialization, so do not set
xenstored_ready before the call to xenbus_probe().

This patch fixes a shutdown_event watcher registration bug that causes
"xm shutdown" not to work properly.

Signed-off-by: Stefano Stabellini <>
Acked-by: Jeremy Fitzhardinge <>
11 years ago Merge branch 'for-linus' of git://
Linus Torvalds [Mon, 4 Oct 2010 20:35:48 +0000 (13:35 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/lrg/voltage-2.6

* 'for-linus' of git://
  regulator: max8649 - fix setting extclk_freq
  regulator: fix typo in current units
  regulator: fix device_register() error handling

11 years ago Merge branch 'merge-powerpc' of git://
Linus Torvalds [Mon, 4 Oct 2010 18:45:35 +0000 (11:45 -0700)]
Merge branch 'merge-powerpc' of git://

* 'merge-powerpc' of git://
  powerpc/5200: tighten up ac97 reset timing
  powerpc/5200: efika.c: Add of_node_put to avoid memory leak
  powerpc/512x: fix clk_get() return value

11 years ago Merge branch 'fix/misc' of git://
Linus Torvalds [Mon, 4 Oct 2010 18:15:59 +0000 (11:15 -0700)]
Merge branch 'fix/misc' of git://git./linux/kernel/git/tiwai/sound-2.6

* 'fix/misc' of git://
  ALSA: i2c/other/ak4xx-adda: Fix a compile warning with CONFIG_PROCFS=n
  ALSA: prevent heap corruption in snd_ctl_new()

11 years ago Merge branch 'hwmon-for-linus' of git://
Linus Torvalds [Mon, 4 Oct 2010 18:15:06 +0000 (11:15 -0700)]
Merge branch 'hwmon-for-linus' of git://git./linux/kernel/git/groeck/staging

* 'hwmon-for-linus' of git://
  hwmon: f71882fg: use a muxed resource lock for the Super I/O port

11 years ago Merge branch 'fixes' of git://
Linus Torvalds [Mon, 4 Oct 2010 18:14:21 +0000 (11:14 -0700)]
Merge branch 'fixes' of git://git./linux/kernel/git/davej/cpufreq

* 'fixes' of git://
  [CPUFREQ] Fix memory leaks in pcc_cpufreq_do_osc
  [CPUFREQ] acpi-cpufreq: add missing __percpu markup

11 years ago Merge branch 'merge-spi' of git://
Linus Torvalds [Mon, 4 Oct 2010 18:13:22 +0000 (11:13 -0700)]
Merge branch 'merge-spi' of git://

* 'merge-spi' of git://
  of/spi: Fix OF-style driver binding of spi devices
  spi: spi-gpio.c tests SPI_MASTER_NO_RX bit twice, but not SPI_MASTER_NO_TX
  spi/mpc8xxx: fix buffer overrun on large transfers

11 years ago Merge git://
Linus Torvalds [Mon, 4 Oct 2010 18:11:01 +0000 (11:11 -0700)]
Merge git://git./linux/kernel/git/davem/net-2.6

* git://
  vlan: dont drop packets from unknown vlans in promiscuous mode
  Phonet: Correct header retrieval after pskb_may_pull
  um: Proper Fix for f25c80a4: remove duplicate structure field initialization
  ip_gre: Fix dependencies wrt. ipv6.
  net-2.6: SYN retransmits: Add new parameter to retransmits_timed_out()
  iwl3945: queue the right work if the scan needs to be aborted
  mac80211: fix use-after-free

11 years ago Merge branch 'drm-intel-fixes' of git://
Linus Torvalds [Mon, 4 Oct 2010 18:10:26 +0000 (11:10 -0700)]
Merge branch 'drm-intel-fixes' of git://git./linux/kernel/git/ickle/drm-intel

* 'drm-intel-fixes' of git://
  drm/i915: Rephrase pwrite bounds checking to avoid any potential overflow
  drm/i915: Sanity check pread/pwrite
  drm/i915: Use pipe state to tell when pipe is off
  drm/i915: vblank status not valid while training display port
  drivers/gpu/drm/i915/i915_gem.c: Add missing error handling code
  drm/i915: Fix refleak during eviction.
  drm/i915: fix GMCH power reporting

11 years ago ksm: fix bad user data when swapping
Hugh Dickins [Sun, 3 Oct 2010 00:49:08 +0000 (17:49 -0700)]
ksm: fix bad user data when swapping

Building under memory pressure, with KSM on 2.6.36-rc5, collapsed with
an internal compiler error: typically indicating an error in swapping.

Perhaps there's a timing issue which makes it now more likely, perhaps
it's just a long time since I tried for so long: this bug goes back to
KSM swapping in 2.6.33.

Notice how reuse_swap_page() allows an exclusive page to be reused, but
only does SetPageDirty if it can delete it from swap cache right then -
if it's currently under Writeback, it has to be left in cache and we
don't SetPageDirty, but the page can be reused.  Fine, the dirty bit
will get set in the pte; but notice how zap_pte_range() does not bother
to transfer pte_dirty to page_dirty when unmapping a PageAnon.

If KSM chooses to share such a page, it will look like a clean copy of
swapcache, and not be written out to swap when its memory is needed;
then stale data is read back from swap when it's needed again.

We could fix this in reuse_swap_page() (or even refuse to reuse a
page under writeback), but it's more honest to fix my oversight in
KSM's write_protect_page().  Several days of testing on three machines
confirms that this fixes the issue they showed.

Signed-off-by: Hugh Dickins <>
Cc: Andrew Morton <>
Cc: Andrea Arcangeli <>
Signed-off-by: Linus Torvalds <>
11 years agoksm: fix page_address_in_vma anon_vma oops
Hugh Dickins [Sun, 3 Oct 2010 00:46:06 +0000 (17:46 -0700)]
ksm: fix page_address_in_vma anon_vma oops

2.6.36-rc1 commit 21d0d443cdc1658a8c1484fdcece4803f0f96d0e "rmap:
resurrect page_address_in_vma anon_vma check" was right to resurrect
that check; but now that it's comparing anon_vma->roots instead of
just anon_vmas, there's a danger of oopsing on a NULL anon_vma.

In most cases no NULL anon_vma ever gets here; but it turns out that
occasionally KSM, when enabled on a forked or forking process, will
itself call page_address_in_vma() on a "half-KSM" page left over from
an earlier failed attempt to merge - whose page_anon_vma() is NULL.

It's my bug that those should be getting here at all: I thought they
were already dealt with, this oops proves me wrong, I'll fix it in
the next release - such pages are effectively pinned until their
process exits, since rmap cannot find their ptes (though swapoff can).

For now just work around it by making page_address_in_vma() safe (and
add a comment on why that check is wanted anyway).  A similar check
in __page_check_anon_rmap() is safe because do_page_add_anon_rmap()
already excluded KSM pages.

Signed-off-by: Hugh Dickins <>
Cc: Andrew Morton <>
Cc: Andrea Arcangeli <>
Cc: Rik van Riel <>
Signed-off-by: Linus Torvalds <>
11 years agoMIPS: Octeon: Place cnmips_cu2_setup in __init memory.
David Daney [Thu, 23 Sep 2010 18:24:09 +0000 (11:24 -0700)]
MIPS: Octeon: Place cnmips_cu2_setup in __init memory.

It is an early_initcall, so it should be in __init memory.

Signed-off-by: David Daney <>
Signed-off-by: Ralf Baechle <>
11 years agoMIPS: Don't place cu2 notifiers in __cpuinitdata
David Daney [Thu, 23 Sep 2010 18:23:29 +0000 (11:23 -0700)]
MIPS: Don't place cu2 notifiers in __cpuinitdata

The notifiers may be called at any time, so the notifier_block cannot
be in init memory.

Signed-off-by: David Daney <>
Signed-off-by: Ralf Baechle <>
11 years agoMIPS: Calculate VMLINUZ_LOAD_ADDRESS based on the length of vmlinux.bin
Shmulik Ladkani [Tue, 31 Aug 2010 10:24:19 +0000 (13:24 +0300)]
MIPS: Calculate VMLINUZ_LOAD_ADDRESS based on the length of vmlinux.bin

Fix VMLINUZ_LOAD_ADDRESS calculation to be based on the length of
vmlinux.bin, the actual uncompressed kernel binary.

Previously it was based on the length of KBUILD_IMAGE (the unstripped ELF
vmlinux), which is bigger than vmlinux.bin.  As a result, vmlinuz was
loaded into a memory address higher than actually needed - a problem for
small memory platforms.

Signed-off-by: Shmulik Ladkani <>
Acked-by: Wu Zhangjin <>
Signed-off-by: Ralf Baechle <>
11 years agoMIPS: Alchemy: Resolve prom section mismatches
Manuel Lauss [Thu, 19 Aug 2010 11:37:13 +0000 (13:37 +0200)]
MIPS: Alchemy: Resolve prom section mismatches

The function prom_init_cmdline() references the variable __initdata

The function prom_get_ethernet_addr() references the variable __initdata

Annotate prom_init_cmdline() as __init, unexport and annotate
prom_get_ethernet_addr() since it's no longer called from within
driver code.

Signed-off-by: Manuel Lauss <>
To: Linux-MIPS <>
Signed-off-by: Ralf Baechle <>
11 years agoMIPS: Fix syscall 64 bit number comments.
Ralf Baechle [Mon, 20 Sep 2010 14:00:19 +0000 (15:00 +0100)]
MIPS: Fix syscall 64 bit number comments.

Noticed and original patch by Philby John <>.

Signed-off-by: Ralf Baechle <>
11 years agoMIPS: Hookup fanotify_init, fanotify_mark, and prlimit64 syscalls.
David Daney [Mon, 23 Aug 2010 21:10:37 +0000 (14:10 -0700)]
MIPS: Hookup fanotify_init, fanotify_mark, and prlimit64 syscalls.

Signed-off-by: David Daney <>
Signed-off-by: Ralf Baechle <>
FUJITA Tomonori [Sat, 14 Aug 2010 07:02:37 +0000 (16:02 +0900)]

Architectures need to set ARCH_DMA_MINALIGN to the minimum DMA
alignment (commit a6eb9fe105d5de0053b261148cee56c94b4720ca). Defining
ARCH_KMALLOC_MINALIGN doesn't work anymore.

Signed-off-by: FUJITA Tomonori <>
Acked-by: Atsushi Nemoto <>
Signed-off-by: Ralf Baechle <>
11 years agoMIPS: N32: Fix getdents64 syscall for n32
Bernhard Walle [Fri, 3 Sep 2010 08:15:34 +0000 (10:15 +0200)]
MIPS: N32: Fix getdents64 syscall for n32

Commit 31c984a5acabea5d8c7224dc226453022be46f33 introduced a new syscall
getdents64. However, in the syscall table, the new syscall still refers to
the old getdents which doesn't work.

The problem appeared on a system that uses eglibc 2.12-r11187, which
utilizes the new syscall and is very confused by it. The fix has been
tested with that eglibc version.

Signed-off-by: Bernhard Walle <>
Signed-off-by: Ralf Baechle <>
11 years agoMIPS: Remove pr_<level> uses of KERN_<level>
Joe Perches [Sun, 12 Sep 2010 05:10:52 +0000 (22:10 -0700)]
MIPS: Remove pr_<level> uses of KERN_<level>

These would result in KERN_<level> actually getting printed.
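
A minimal userspace sketch of why the level string ends up in the output.
It assumes the 2.6.36-era definition of KERN_INFO as the string "<6>" and
models pr_info() with a simplified macro, pr_info_to(), that formats into a
buffer instead of calling printk().  The names pr_info_to(),
has_duplicate_level() and demo_format() are hypothetical and not kernel API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Simplified stand-ins for the kernel macros (assumption: userspace model). */
#define KERN_INFO "<6>"
/* Like pr_info(), this already pastes KERN_INFO in front of the message. */
#define pr_info_to(buf, len, msg) snprintf(buf, len, "%s", KERN_INFO msg)

/* Hypothetical helper: returns 1 if a second level marker follows the first. */
static int has_duplicate_level(const char *line)
{
	assert(line != NULL);
	/* The first three bytes are the genuine "<6>" prefix; look past them. */
	return strlen(line) >= 6 && strncmp(line + 3, KERN_INFO, 3) == 0;
}

static void demo_format(char *wrong, char *right, size_t len)
{
	pr_info_to(wrong, len, KERN_INFO "hello\n"); /* level given twice */
	pr_info_to(right, len, "hello\n");           /* level given once  */
}
```

Only the first marker is consumed as the log level, so in the "wrong" case
the second "<6>" is left in the message text.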

Signed-off-by: Joe Perches <>
To: Jiri Kosina <>
Signed-off-by: Ralf Baechle <>
11 years agoMIPS: PNX8550: Sort out machine halt, restart and powerdown functions.
Ralf Baechle [Sat, 18 Sep 2010 23:09:09 +0000 (00:09 +0100)]
MIPS: PNX8550: Sort out machine halt, restart and powerdown functions.

No rubbish printks - those belong to userspace.  The halt function now
actually halts the system and the poweroff function was deleted because
it didn't actually power down the system.

Signed-off-by: Ralf Baechle <>
11 years agoMIPS: GIC: Remove dependencies from Malta files.
Ralf Baechle [Fri, 17 Sep 2010 16:07:48 +0000 (17:07 +0100)]
MIPS: GIC: Remove dependencies from Malta files.

The dependencies prevent the GIC code from being reused sanely.

Signed-off-by: Ralf Baechle <>
11 years agoMIPS: Kconfig: Fix and clarify kconfig help text for VSMP and SMTC.
Ralf Baechle [Thu, 16 Sep 2010 10:40:41 +0000 (11:40 +0100)]
MIPS: Kconfig: Fix and clarify kconfig help text for VSMP and SMTC.

Only VSMP was known as SMVP and generally the help text was too short to
be helpful.

Signed-off-by: Ralf Baechle <>
11 years agoMIPS: DMA: Fix computation of DMA flags from device's coherent_dma_mask.
Ralf Baechle [Thu, 2 Sep 2010 21:22:23 +0000 (23:22 +0200)]
MIPS: DMA: Fix computation of DMA flags from device's coherent_dma_mask.

This only matters for ISA devices with a 24-bit DMA limit or for devices
with a 32-bit DMA limit on systems with ZONE_DMA32 enabled.  The latter
currently only affects 32-bit PCI cards on Sibyte-based systems with more
than 1GB RAM installed.

Signed-off-by: Ralf Baechle <>
11 years agoMIPS: Audit: Fix hang in entry.S.
Ralf Baechle [Thu, 2 Sep 2010 20:59:58 +0000 (22:59 +0200)]
MIPS: Audit: Fix hang in entry.S.

_TIF_WORK_MASK falsely had _TIF_SYSCALL_AUDIT set.  If a thread's
_TIF_SYSCALL_AUDIT is ever set this will lead to an endless loop on the
way out from a syscall.

Currently this is only a theoretical bug as init/Kconfig doesn't allow
AUDIT_SYSCALL to be enabled for MIPS.

Signed-off-by: Ralf Baechle <>
11 years agoMIPS: Document why RELOC_HIDE is there.
Ralf Baechle [Tue, 17 Aug 2010 15:01:59 +0000 (16:01 +0100)]
MIPS: Document why RELOC_HIDE is there.

Signed-off-by: Ralf Baechle <>
11 years agoMIPS: Octeon: Determine if helper needs to be built
Andreas Bießmann [Wed, 11 Aug 2010 16:49:53 +0000 (18:49 +0200)]
MIPS: Octeon: Determine if helper needs to be built

This patch adds a config switch to determine if we need to build some
workaround helper files.

The staging driver octeon-ethernet references some symbols which are only
built when PCI is enabled. The new config switch enables these symbols in
both cases.

Signed-off-by: Andreas Bießmann <>
Cc: Andreas Bießmann <>
Acked-by: David Daney <>
Signed-off-by: Ralf Baechle <>
11 years agoMIPS: Use generic atomic64 for 32-bit kernels
Deng-Cheng Zhu [Wed, 9 Jun 2010 04:35:25 +0000 (12:35 +0800)]
MIPS: Use generic atomic64 for 32-bit kernels

The 64-bit kernel already has its own atomic64 functions.  For 32-bit
kernels, we use the generic spinlocked version.  The atomic64 types and
related functions are needed for the Linux performance counter subsystem.

Signed-off-by: Deng-Cheng Zhu <>
Acked-by: David Daney <>
Signed-off-by: Ralf Baechle <>
11 years agoMIPS: RM7000: Symbol should be static
Ricardo Mendoza [Fri, 6 Aug 2010 15:42:57 +0000 (11:12 -0430)]
MIPS: RM7000: Symbol should be static

Signed-off-by: Ricardo Mendoza <>
Signed-off-by: Ralf Baechle <>
11 years agoMIPS: kspd: Adjust confusing if indentation
Julia Lawall [Thu, 5 Aug 2010 20:17:22 +0000 (22:17 +0200)]
MIPS: kspd: Adjust confusing if indentation

Indent the branch of an if.

The semantic match that finds this problem is as follows:

// <smpl>
@r disable braces4@
position p1,p2;
statement S1,S2;

if (...) { ... }
if (...) S1@p1 S2@p2

p1 << r.p1;
p2 << r.p2;

if (p1[0].column == p2[0].column):
// </smpl>

Signed-off-by: Julia Lawall <>
Signed-off-by: Ralf Baechle <>
11 years agoMIPS: Fix a typo.
Andrea Gelmini [Thu, 5 Aug 2010 13:51:25 +0000 (15:51 +0200)]
MIPS: Fix a typo.

"Userpace" -> "Userspace"

Signed-off-by: Andrea Gelmini <>
Cc: Andrea Gelmini <>
Cc: Jason Wessel <>
Cc: Martin Hicks <>
Signed-off-by: Ralf Baechle <>
11 years agoperf trace scripting: Fix extern struct definitions
Stephane Eranian [Mon, 20 Sep 2010 22:45:01 +0000 (00:45 +0200)]
perf trace scripting: Fix extern struct definitions

Both python_scripting_ops and perl_scripting_ops have two global definitions.
One in trace-event-scripting.c and one in their respective scripting-engine

The issue is that depending on the linker order one definition or the other
is chosen. One is uninitialized (bss), while the other is initialized. If
the uninitialized version is chosen, then perf does not function properly.

This patch fixes this by adding the extern prefix to the definitions in

Cc: David S. Miller <>
Cc: Frederic Weisbecker <>
Cc: Ingo Molnar <>
Cc: Paul Mackerras <>
Cc: Peter Zijlstra <>
Cc: Robert Richter <>
LKML-Reference: <>
Signed-off-by: Stephane Eranian <>
Signed-off-by: Arnaldo Carvalho de Melo <>
11 years agoperf ui hist browser: Fix segfault on 'a' for annotate
Frederik Deweerdt [Thu, 23 Sep 2010 20:19:01 +0000 (22:19 +0200)]
perf ui hist browser: Fix segfault on 'a' for annotate

There is a typo in util/ui/browsers/hists.c that leads to a segfault when you
press the 'a' key on a non-resolved symbol (plain hex address).

LKML-Reference: <20100923201901.GE31726@gambetta>
Signed-off-by: Frederik Deweerdt <>
Signed-off-by: Arnaldo Carvalho de Melo <>
11 years agoperf tools: Fix build breakage
Kusanagi Kouichi [Sun, 26 Sep 2010 17:17:42 +0000 (14:17 -0300)]
perf tools: Fix build breakage

The patch ecafda6 introduced a problem where all object files would be
always rebuilt, fix it by using:

Reported-by: Arnaldo Carvalho de Melo <>
Cc: Bernd Petrovitsch <>
Signed-off-by: Kusanagi Kouichi <>
Signed-off-by: Arnaldo Carvalho de Melo <>
11 years agowriteback: always use sb->s_bdi for writeback purposes
Christoph Hellwig [Mon, 4 Oct 2010 12:25:33 +0000 (14:25 +0200)]
writeback: always use sb->s_bdi for writeback purposes

We currently use struct backing_dev_info for various different purposes.
Originally it was introduced to describe a backing device which includes
an unplug and congestion function and various bits of readahead information
and VM-relevant flags.  We're also using for tracking dirty inodes for

To make writeback properly find all inodes we need to only access the
per-filesystem backing_device pointed to by the superblock in ->s_bdi
inside the writeback code, and not the instances pointed to by
inode->i_mapping->backing_dev which can be overridden by special devices
or might not be set at all by some filesystems.

Long term we should split out the writeback-relevant bits of struct
backing_device_info (which includes more than the current bdi_writeback)
and only point to it from the superblock while leaving the traditional
backing device as a separate structure that can be overridden by devices.

The one exception for now is the block device filesystem which really
wants different writeback contexts for its different (internal) inodes
to handle the writeout more efficiently.  For now we do this with
a hack in fs-writeback.c because we're so late in the cycle, but in
the future I plan to replace this with a superblock method that allows
for multiple writeback contexts per filesystem.

Signed-off-by: Christoph Hellwig <>
Signed-off-by: Jens Axboe <>
11 years agofuse: Initialize total_len in fuse_retrieve()
Geert Uytterhoeven [Thu, 30 Sep 2010 20:06:21 +0000 (22:06 +0200)]
fuse: Initialize total_len in fuse_retrieve()

fs/fuse/dev.c:1357: warning: ‘total_len’ may be used uninitialized in this

Initialize total_len to zero, else its value will be undefined.

Signed-off-by: Geert Uytterhoeven <>
Signed-off-by: Miklos Szeredi <>
11 years agodrm/i915: Rephrase pwrite bounds checking to avoid any potential overflow
Chris Wilson [Sun, 26 Sep 2010 19:21:44 +0000 (20:21 +0100)]
drm/i915: Rephrase pwrite bounds checking to avoid any potential overflow

... and do the same for pread.
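
The commit message does not show the check itself.  As an illustration only,
here is a minimal sketch of one common overflow-safe form of a bounds check,
next to the naive form it replaces.  The function names range_ok() and
range_ok_naive() and their parameters are hypothetical and are not taken
from the i915 driver.

```c
#include <stdint.h>

/*
 * Naive form: adds two caller-supplied values, so the sum can wrap
 * around modulo 2^64 and slip past the comparison.
 */
static int range_ok_naive(uint64_t offset, uint64_t size, uint64_t obj_size)
{
	return offset + size <= obj_size;
}

/*
 * Overflow-safe form: returns 1 if [offset, offset + size) lies inside
 * an object of obj_size bytes.  It compares size against the space that
 * remains after offset, so no addition of untrusted values takes place.
 */
static int range_ok(uint64_t offset, uint64_t size, uint64_t obj_size)
{
	if (offset > obj_size)
		return 0;
	return size <= obj_size - offset;
}
```

With an offset close to UINT64_MAX the naive form accepts the request
because the sum wraps to a small number, while the rephrased form rejects it.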

Signed-off-by: Chris Wilson <>
11 years agodrm/i915: Sanity check pread/pwrite
Chris Wilson [Sun, 26 Sep 2010 19:50:05 +0000 (20:50 +0100)]
drm/i915: Sanity check pread/pwrite

Move the access control up from the fast paths, which are no longer
universally taken first, up into the caller. This then duplicates some
sanity checking along the slow paths, but is much simpler.
Tracked as CVE-2010-2962.

Reported-by: Kees Cook <>
Signed-off-by: Chris Wilson <>
11 years agohwmon: f71882fg: use a muxed resource lock for the Super I/O port
Giel van Schijndel [Sun, 3 Oct 2010 12:09:49 +0000 (08:09 -0400)]
hwmon: f71882fg: use a muxed resource lock for the Super I/O port

Sleep while acquiring a resource lock on the Super I/O port. This should
prevent collisions from causing the hardware probe to fail with -EBUSY.

Signed-off-by: Giel van Schijndel <>
Acked-by: Hans de Goede <>
Signed-off-by: Guenter Roeck <>
11 years agodrm/i915: Use pipe state to tell when pipe is off
Keith Packard [Sun, 3 Oct 2010 07:33:06 +0000 (00:33 -0700)]
drm/i915: Use pipe state to tell when pipe is off

Instead of waiting for the display line value to settle, we can simply
wait for the pipe configuration register 'state' bit to turn off.

Contrariwise, disabling the plane will not cause the display line
value to stop changing, so instead we wait for the vblank interrupt
bit to get set. And, we only do this when we're not about to wait for
the pipe to turn off.

Signed-off-by: Keith Packard <>
Signed-off-by: Chris Wilson <>
11 years agodrm/i915: vblank status not valid while training display port
Keith Packard [Sun, 3 Oct 2010 07:33:05 +0000 (00:33 -0700)]
drm/i915: vblank status not valid while training display port

While the display port is in training mode, vblank interrupts don't
occur. Because we have to wait for the display port output to turn on
before starting the training sequence, enable the output in 'normal'
mode so that we can tell when a vblank has occurred, then start the
training sequence.

Signed-off-by: Keith Packard <>
Signed-off-by: Chris Wilson <>
11 years agoof/spi: Fix OF-style driver binding of spi devices
Sinan Akman [Sun, 3 Oct 2010 03:28:29 +0000 (21:28 -0600)]
of/spi: Fix OF-style driver binding of spi devices

This patch adds the OF hook to the spi core so that devices
can automatically be registered based on device tree data.  This fixes
a problem with spi devices not binding to drivers after the cleanup of
the spi & i2c binding code.

Signed-off-by: Sinan Akman <>
Signed-off-by: Grant Likely <>
11 years agospi: spi-gpio.c tests SPI_MASTER_NO_RX bit twice, but not SPI_MASTER_NO_TX
Roel Kluin [Sat, 2 Oct 2010 12:03:32 +0000 (14:03 +0200)]
spi: spi-gpio.c tests SPI_MASTER_NO_RX bit twice, but not SPI_MASTER_NO_TX

The SPI_MASTER_NO_TX bit (can't do buffer write) wasn't tested.  This
code was introduced in commit 3c8e1a84 (spi/spi-gpio: add support for
controllers without MISO or MOSI pin).  This patch fixes a bug in
choosing which transfer ops to use.

Signed-off-by: Roel Kluin <>
Signed-off-by: Grant Likely <>
11 years agodrivers/gpu/drm/i915/i915_gem.c: Add missing error handling code
Julia Lawall [Sat, 2 Oct 2010 13:59:17 +0000 (15:59 +0200)]
drivers/gpu/drm/i915/i915_gem.c: Add missing error handling code

Extend the error handling code with operations found in other nearby error
handling code

A simplified version of the semantic match that finds this problem is as
follows: (

// <smpl>
@r exists@
statement S1,S2,S3;
constant C1,C2,C3;

*if (...)
 {... S1 return -C1;}
*if (...)
 {... when != S1
    return -C2;}
*if (...)
 {... S1 return -C3;}
// </smpl>

Signed-off-by: Julia Lawall <>
Signed-off-by: Chris Wilson <>
11 years agoregulator: max8649 - fix setting extclk_freq
Axel Lin [Fri, 1 Oct 2010 05:56:27 +0000 (13:56 +0800)]
regulator: max8649 - fix setting extclk_freq

The SYNC bits are BIT6 and BIT7 of MAX8649_SYNC register.
pdata->extclk_freq could be [0|1|2].
The value must be left-shifted by 6 bits to properly set extclk_freq.
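
A minimal sketch of the bit manipulation described above, assuming an 8-bit
register whose SYNC field occupies BIT6 and BIT7.  The names SYNC_SHIFT,
SYNC_MASK and set_extclk_freq() are hypothetical and are not the driver's
identifiers.

```c
#include <assert.h>
#include <stdint.h>

#define SYNC_SHIFT 6                     /* SYNC field starts at BIT6 */
#define SYNC_MASK  (0x3u << SYNC_SHIFT)  /* BIT6 | BIT7 */

/* Hypothetical helper: returns reg with the SYNC field replaced by freq. */
static uint8_t set_extclk_freq(uint8_t reg, unsigned int freq)
{
	assert(freq <= 2); /* the commit message lists 0, 1 and 2 as valid */
	/* Clear the old field, then place freq at bits 6..7. */
	return (uint8_t)((reg & ~SYNC_MASK) | ((freq << SYNC_SHIFT) & SYNC_MASK));
}
```

Without the shift, a value of 1 or 2 would land in BIT0 or BIT1 and leave the
SYNC field unchanged.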

Signed-off-by: Axel Lin <>
Acked-by: Mark Brown <>
Signed-off-by: Liam Girdwood <>
11 years agoregulator: fix typo in current units
Cyril Chemparathy [Wed, 22 Sep 2010 16:30:15 +0000 (12:30 -0400)]
regulator: fix typo in current units

This patch fixes a typo that incorrectly reports mA numbers as uA.

Signed-off-by: Cyril Chemparathy <>
Acked-by: Mark Brown <>
Signed-off-by: Liam Girdwood <>
11 years agoregulator: fix device_register() error handling
Vasiliy Kulikov [Sun, 19 Sep 2010 12:55:01 +0000 (16:55 +0400)]
regulator: fix device_register() error handling

If device_register() fails then call put_device().
See comment to device_register.

Signed-off-by: Vasiliy Kulikov <>
Acked-by: Mark Brown <>
Signed-off-by: Liam Girdwood <>
11 years agoMerge git://
Linus Torvalds [Fri, 1 Oct 2010 22:03:37 +0000 (15:03 -0700)]
Merge git://git./linux/kernel/git/sfrench/cifs-2.6

* git://
  cifs: prevent infinite recursion in cifs_reconnect_tcon
  cifs: set backing_dev_info on new S_ISREG inodes

11 years agoMerge branch 'x86-fixes-for-linus' of git://
Linus Torvalds [Fri, 1 Oct 2010 22:02:41 +0000 (15:02 -0700)]
Merge branch 'x86-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'x86-fixes-for-linus' of git://
  x86, hpet: Fix bogus error check in hpet_assign_irq()
  x86, irq: Plug memory leak in sparse irq
  x86, cpu: After uncapping CPUID, re-run CPU feature detection

11 years agoMN10300: Fix flush_icache_range()
David Howells [Fri, 1 Oct 2010 09:31:03 +0000 (10:31 +0100)]
MN10300: Fix flush_icache_range()

flush_icache_range() is given virtual addresses to describe the region.  It
deals with these by attempting to translate them through the current set of
page tables.

This is fine for userspace memory and vmalloc()'d areas as they are governed by
page tables.  However, since the regions above 0x80000000 aren't translated
through the page tables by the MMU, the kernel doesn't bother to set up page
tables for them (see paging_init()).

This means flush_icache_range() as it stands cannot be used to flush regions of
the VM area between 0x80000000 and 0x9fffffff where the kernel resides if the
data cache is operating in WriteBack mode.

To fix this, make flush_icache_range() first check for addresses in the upper
half of VM space and deal with them appropriately, before dealing with any
range in the page table mapped area.
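
The ordering described above can be sketched as a split of the requested
range at the 0x80000000 boundary.  This is an illustration under stated
assumptions, not the MN10300 implementation: struct range and
split_flush_range() are hypothetical, and the actual flushing of each part
is left out.

```c
#include <assert.h>
#include <stdint.h>

#define UPPER_HALF_START 0x80000000u /* boundary named in the commit message */

struct range {
	uint32_t start, end; /* [start, end), empty if start == end */
};

/* Hypothetical helper: splits [start, end) at the 0x80000000 boundary. */
static void split_flush_range(uint32_t start, uint32_t end,
			      struct range *mapped, struct range *upper)
{
	assert(start <= end);

	/* Part at or above the boundary: not covered by page tables. */
	upper->start = start > UPPER_HALF_START ? start : UPPER_HALF_START;
	upper->end = end > upper->start ? end : upper->start;

	/* Part below the boundary: translated through the page tables. */
	mapped->start = start;
	mapped->end = end < UPPER_HALF_START ? end : UPPER_HALF_START;
	if (mapped->end < mapped->start)
		mapped->end = mapped->start;
}
```

A caller would handle the upper part first and then walk the page tables
only for the mapped part, which may be empty.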

Ordinarily, this is not a problem, but it has the capacity to make kprobes and
kgdb malfunction.  It should not affect gdbstub, signal frame setup or module
loading as gdb has its own flush functions, and the others take place in the
page table mapped area only.

Signed-off-by: David Howells <>
Acked-by: Akira Takeuchi <>
Signed-off-by: Linus Torvalds <>
11 years agoMerge branch 'drm-fixes' of git://
Linus Torvalds [Fri, 1 Oct 2010 17:58:31 +0000 (10:58 -0700)]
Merge branch 'drm-fixes' of git://git./linux/kernel/git/airlied/drm-2.6

* 'drm-fixes' of git://
  vmwgfx: Fix fb VRAM pinning failure due to fragmentation
  vmwgfx: Remove initialisation of dev::devname
  vmwgfx: Enable use of the vblank system
  vmwgfx: vt-switch (master drop) fixes
  drm/vmwgfx: Fix breakage introduced by commit "drm: block userspace under allocating buffer and having drivers overwrite it (v2)"
  drm: Hold the mutex when dropping the last GEM reference (v2)
  drm/gem: handlecount isn't really a kref so don't make it one.
  drm: i810/i830: fix locked ioctl variant
  drm/radeon/kms: add quirk for MSI K9A2GM motherboard
  drm/radeon/kms: fix potential segfault in r600_ioctl_wait_idle
  drm: Prune GEM vma entries
  drm/radeon/kms: fix up encoder info messages for DFP6
  drm/radeon: fix PCI ID 5657 to be an RV410

11 years agoMerge branch 'for-linus/i2c/2636-rc5' of git://
Linus Torvalds [Fri, 1 Oct 2010 17:55:54 +0000 (10:55 -0700)]
Merge branch 'for-linus/i2c/2636-rc5' of git://

* 'for-linus/i2c/2636-rc5' of git://
  i2c-s3c2410: fix calculation of SDA line delay
  i2c-davinci: Fix race when setting up for TX
  i2c-octeon: Return -ETIMEDOUT in octeon_i2c_wait() on timeout

11 years agoMerge branch 'release' of git://
Linus Torvalds [Fri, 1 Oct 2010 17:54:58 +0000 (10:54 -0700)]
Merge branch 'release' of git://git./linux/kernel/git/lenb/linux-acpi-2.6

* 'release' of git://
  ACPI: invoke DSDT corruption workaround on all Toshiba Satellite
  ACPI, APEI, Fix ERST MOVE_DATA instruction implementation
  ACPI: fan: Fix more unbalanced code block
  ACPI: acpi_pad: simplify code to avoid false gcc build warning
  ACPI, APEI, Fix error path for memory allocation
  ACPI, APEI, HEST Fix the unsuitable usage of platform_data
  ACPI, APEI, Fix acpi_pre_map() return value
  ACPI, APEI, Fix APEI related table size checking
  ACPI: Disable Windows Vista compatibility for Toshiba P305D
  ACPI: Kconfig: fix typo.
  ACPI: add missing __percpu markup in arch/x86/kernel/acpi/cstate.c
  ACPI: Fix typos
  ACPI video: fix a poor warning message
  ACPI: fix build warnings resulting from merge window conflict
  ACPI: EC: add Vista incompatibility DMI entry for Toshiba Satellite L355
  ACPI: expand Vista blacklist to include SP1 and SP2
  ACPI: delete ZEPTO idle=nomwait DMI quirk
  ACPI: enable repeated PCIEXP wakeup by clearing PCIEXP_WAKE_STS on resume
  PM / ACPI: Blacklist systems known to require acpi_sleep=nonvs
  ACPI: Don't report current_now if battery reports in mWh

11 years agoMerge branch 'idle-release' of git://
Linus Torvalds [Fri, 1 Oct 2010 17:53:45 +0000 (10:53 -0700)]
Merge branch 'idle-release' of git://git./linux/kernel/git/lenb/linux-idle-2.6

* 'idle-release' of git://
  intel_idle: Voluntary leave_mm before entering deeper
  acpi_idle: add missing \n to printk
  intel_idle: add missing __percpu markup
  intel_idle: Change mode 755 => 644
  cpuidle: Fix typos
  intel_idle: PCI quirk to prevent Lenovo Ideapad s10-3 boot hang

11 years agoMerge branch 'omap-fixes-for-linus' of git://
Linus Torvalds [Fri, 1 Oct 2010 17:53:06 +0000 (10:53 -0700)]
Merge branch 'omap-fixes-for-linus' of git://git./linux/kernel/git/tmlind/linux-omap-2.6

* 'omap-fixes-for-linus' of git://
  omap: McBSP: tx_irq_completion used in rx_irq_handler
  omap: Fix compile dependency to LEDS_CLASS

11 years agoreiserfs: fix unwanted reiserfs lock recursion
Frederic Weisbecker [Thu, 30 Sep 2010 22:15:38 +0000 (15:15 -0700)]
reiserfs: fix unwanted reiserfs lock recursion

Prevent recursive locking of the reiserfs lock in reiserfs_unpack()
because we may call journal_begin() that requires the lock to be taken
only once, otherwise it won't be able to release the lock while taking
other mutexes, ending up in inverted dependencies between the journal
mutex and the reiserfs lock for example.

This fixes:

  [ INFO: possible circular locking dependency detected ] #3
  lilo/1620 is trying to acquire lock:
   (&journal->j_mutex){+.+...}, at: [<d0325bff>] do_journal_begin_r+0x7f/0x340 [reiserfs]

  but task is already holding lock:
   (&REISERFS_SB(s)->lock){+.+.+.}, at: [<d032a278>] reiserfs_write_lock+0x28/0x40 [reiserfs]

  which lock already depends on the new lock.

  the existing dependency chain (in reverse order) is:

  -> #1 (&REISERFS_SB(s)->lock){+.+.+.}:
         [<c10562b7>] lock_acquire+0x67/0x80
         [<c12facad>] __mutex_lock_common+0x4d/0x410
         [<c12fb0c8>] mutex_lock_nested+0x18/0x20
         [<d032a278>] reiserfs_write_lock+0x28/0x40 [reiserfs]
         [<d0325c06>] do_journal_begin_r+0x86/0x340 [reiserfs]
         [<d0325f77>] journal_begin+0x77/0x140 [reiserfs]
         [<d0315be4>] reiserfs_remount+0x224/0x530 [reiserfs]
         [<c10b6a20>] do_remount_sb+0x60/0x110
         [<c10cee25>] do_mount+0x625/0x790
         [<c10cf014>] sys_mount+0x84/0xb0
         [<c12fca3d>] syscall_call+0x7/0xb

  -> #0 (&journal->j_mutex){+.+...}:
         [<c10560f6>] __lock_acquire+0x1026/0x1180
         [<c10562b7>] lock_acquire+0x67/0x80
         [<c12facad>] __mutex_lock_common+0x4d/0x410
         [<c12fb0c8>] mutex_lock_nested+0x18/0x20
         [<d0325bff>] do_journal_begin_r+0x7f/0x340 [reiserfs]
         [<d0325f77>] journal_begin+0x77/0x140 [reiserfs]
         [<d0326271>] reiserfs_persistent_transaction+0x41/0x90 [reiserfs]
         [<d030d06c>] reiserfs_get_block+0x22c/0x1530 [reiserfs]
         [<c10db9db>] __block_prepare_write+0x1bb/0x3a0
         [<c10dbbe6>] block_prepare_write+0x26/0x40
         [<d030b738>] reiserfs_prepare_write+0x88/0x170 [reiserfs]
         [<d03294d6>] reiserfs_unpack+0xe6/0x120 [reiserfs]
         [<d0329782>] reiserfs_ioctl+0x272/0x320 [reiserfs]
         [<c10c3188>] vfs_ioctl+0x28/0xa0
         [<c10c3bbd>] do_vfs_ioctl+0x32d/0x5c0
         [<c10c3eb3>] sys_ioctl+0x63/0x70
         [<c12fca3d>] syscall_call+0x7/0xb

  other info that might help us debug this:

  2 locks held by lilo/1620:
   #0:  (&sb->s_type->i_mutex_key#8){+.+.+.}, at: [<d032945a>] reiserfs_unpack+0x6a/0x120 [reiserfs]
   #1:  (&REISERFS_SB(s)->lock){+.+.+.}, at: [<d032a278>] reiserfs_write_lock+0x28/0x40 [reiserfs]

  stack backtrace:
  Pid: 1620, comm: lilo Not tainted #3
  Call Trace:
   [<c10560f6>] __lock_acquire+0x1026/0x1180
   [<c10562b7>] lock_acquire+0x67/0x80
   [<c12facad>] __mutex_lock_common+0x4d/0x410
   [<c12fb0c8>] mutex_lock_nested+0x18/0x20
   [<d0325bff>] do_journal_begin_r+0x7f/0x340 [reiserfs]
   [<d0325f77>] journal_begin+0x77/0x140 [reiserfs]
   [<d0326271>] reiserfs_persistent_transaction+0x41/0x90 [reiserfs]
   [<d030d06c>] reiserfs_get_block+0x22c/0x1530 [reiserfs]
   [<c10db9db>] __block_prepare_write+0x1bb/0x3a0
   [<c10dbbe6>] block_prepare_write+0x26/0x40
   [<d030b738>] reiserfs_prepare_write+0x88/0x170 [reiserfs]
   [<d03294d6>] reiserfs_unpack+0xe6/0x120 [reiserfs]
   [<d0329782>] reiserfs_ioctl+0x272/0x320 [reiserfs]
   [<c10c3188>] vfs_ioctl+0x28/0xa0
   [<c10c3bbd>] do_vfs_ioctl+0x32d/0x5c0
   [<c10c3eb3>] sys_ioctl+0x63/0x70
   [<c12fca3d>] syscall_call+0x7/0xb

Reported-by: Jarek Poplawski <>
Tested-by: Jarek Poplawski <>
Signed-off-by: Frederic Weisbecker <>
Cc: Jeff Mahoney <>
Cc: All since 2.6.32 <>
Signed-off-by: Andrew Morton <>
Signed-off-by: Linus Torvalds <>
11 years agoreiserfs: fix dependency inversion between inode and reiserfs mutexes
Frederic Weisbecker [Thu, 30 Sep 2010 22:15:37 +0000 (15:15 -0700)]
reiserfs: fix dependency inversion between inode and reiserfs mutexes

The reiserfs mutex already depends on the inode mutex, so we can't lock
the inode mutex in reiserfs_unpack() without using the safe locking API,
because reiserfs_unpack() is always called with the reiserfs mutex locked.

This fixes:

  [ INFO: possible circular locking dependency detected ]
  2.6.35c #13
  lilo/1606 is trying to acquire lock:
   (&sb->s_type->i_mutex_key#8){+.+.+.}, at: [<d0329450>] reiserfs_unpack+0x60/0x110 [reiserfs]

  but task is already holding lock:
   (&REISERFS_SB(s)->lock){+.+.+.}, at: [<d032a268>] reiserfs_write_lock+0x28/0x40 [reiserfs]

  which lock already depends on the new lock.

  the existing dependency chain (in reverse order) is:

  -> #1 (&REISERFS_SB(s)->lock){+.+.+.}:
         [<c1056347>] lock_acquire+0x67/0x80
         [<c12f083d>] __mutex_lock_common+0x4d/0x410
         [<c12f0c58>] mutex_lock_nested+0x18/0x20
         [<d032a268>] reiserfs_write_lock+0x28/0x40 [reiserfs]
         [<d0329e9a>] reiserfs_lookup_privroot+0x2a/0x90 [reiserfs]
         [<d0316b81>] reiserfs_fill_super+0x941/0xe60 [reiserfs]
         [<c10b7d17>] get_sb_bdev+0x117/0x170
         [<d0313e21>] get_super_block+0x21/0x30 [reiserfs]
         [<c10b74ba>] vfs_kern_mount+0x6a/0x1b0
         [<c10b7659>] do_kern_mount+0x39/0xe0
         [<c10cebe0>] do_mount+0x340/0x790
         [<c10cf0b4>] sys_mount+0x84/0xb0
         [<c12f25cd>] syscall_call+0x7/0xb

  -> #0 (&sb->s_type->i_mutex_key#8){+.+.+.}:
         [<c1056186>] __lock_acquire+0x1026/0x1180
         [<c1056347>] lock_acquire+0x67/0x80
         [<c12f083d>] __mutex_lock_common+0x4d/0x410
         [<c12f0c58>] mutex_lock_nested+0x18/0x20
         [<d0329450>] reiserfs_unpack+0x60/0x110 [reiserfs]
         [<d0329772>] reiserfs_ioctl+0x272/0x320 [reiserfs]
         [<c10c3228>] vfs_ioctl+0x28/0xa0
         [<c10c3c5d>] do_vfs_ioctl+0x32d/0x5c0
         [<c10c3f53>] sys_ioctl+0x63/0x70
         [<c12f25cd>] syscall_call+0x7/0xb

  other info that might help us debug this:

  1 lock held by lilo/1606:
   #0:  (&REISERFS_SB(s)->lock){+.+.+.}, at: [<d032a268>] reiserfs_write_lock+0x28/0x40 [reiserfs]

  stack backtrace:
  Pid: 1606, comm: lilo Not tainted 2.6.35c #13
  Call Trace:
   [<c1056186>] __lock_acquire+0x1026/0x1180
   [<c1056347>] lock_acquire+0x67/0x80
   [<c12f083d>] __mutex_lock_common+0x4d/0x410
   [<c12f0c58>] mutex_lock_nested+0x18/0x20
   [<d0329450>] reiserfs_unpack+0x60/0x110 [reiserfs]
   [<d0329772>] reiserfs_ioctl+0x272/0x320 [reiserfs]
   [<c10c3228>] vfs_ioctl+0x28/0xa0
   [<c10c3c5d>] do_vfs_ioctl+0x32d/0x5c0
   [<c10c3f53>] sys_ioctl+0x63/0x70
   [<c12f25cd>] syscall_call+0x7/0xb

Reported-by: Jarek Poplawski <>
Tested-by: Jarek Poplawski <>
Signed-off-by: Frederic Weisbecker <>
Cc: Jeff Mahoney <>
Cc: <> [2.6.32 and later]
Signed-off-by: Andrew Morton <>
Signed-off-by: Linus Torvalds <>
11 years agoMAINTAINERS: update maintainer for S5P ARM ARCHITECTURES
Kukjin Kim [Thu, 30 Sep 2010 22:15:35 +0000 (15:15 -0700)]

Signed-off-by: Kukjin Kim <>
Acked-by: Ben Dooks <>
Acked-by: Russell King <>
Cc: Kyungmin Park <>
Signed-off-by: Andrew Morton <>
Signed-off-by: Linus Torvalds <>
11 years agoMAINTAINERS: update matroxfb & ncpfs status
Petr Vandrovec [Thu, 30 Sep 2010 22:15:34 +0000 (15:15 -0700)]
MAINTAINERS: update matroxfb & ncpfs status

I moved a couple of years ago, so let's update my email and snail mail.

And I do not have any access to Matrox hardware anymore, and I'm quite
unresponsive to matroxfb bug reports (sorry Alan), so saying that I'm
maintainer is a bit far fetched.

For ncpfs I do not use ncpfs in my daily life either, but at least I can
test that one, so I can stay listed here for odd fixes.

Signed-off-by: Petr Vandrovec <>
Signed-off-by: Andrew Morton <>
Signed-off-by: Linus Torvalds <>
11 years agoproc: make /proc/pid/limits world readable
Jiri Olsa [Thu, 30 Sep 2010 22:15:33 +0000 (15:15 -0700)]
proc: make /proc/pid/limits world readable

Having the limits file world readable will ease the task of system
management on systems where root privileges might be restricted.

An admin whose root privileges are restricted could not otherwise check
other users' process limits.

Also it'd align with most of the /proc stat files.

Signed-off-by: Jiri Olsa <>
Acked-by: Neil Horman <>
Cc: Eugene Teo <>
Signed-off-by: Andrew Morton <>
Signed-off-by: Linus Torvalds <>
11 years agolib/list_sort: do not pass bad pointers to cmp callback
Don Mullis [Thu, 30 Sep 2010 22:15:32 +0000 (15:15 -0700)]
lib/list_sort: do not pass bad pointers to cmp callback

If the length of the original list is a power of two (POT), the first
callback from line 73 will pass a==b, both pointing to the original
list_head.  This is dangerous because the 'list_sort()' user can use
'container_of()' and access the "containing" object, which does not
necessarily exist for the list head.  So the user can access RAM which does
not belong to him.  If this is a write access, we can end up with memory
corruption.
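
To illustrate why a cmp callback must not be handed the bare list head,
here is a minimal userspace sketch.  The types list_head_sketch and
item_sketch and both helper functions are hypothetical stand-ins, not the
kernel's definitions.  It only computes the address that a
container_of()-style subtraction would produce and never dereferences it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins; not the kernel's definitions. */
struct list_head_sketch {
    struct list_head_sketch *next, *prev;
};

struct item_sketch {
    int key;                       /* payload a cmp callback would read */
    struct list_head_sketch node;  /* embedded list linkage */
};

/*
 * Address that a container_of()-style computation yields for a node.
 * Done in integer arithmetic so that nothing is dereferenced.
 */
static uintptr_t containing_addr(const struct list_head_sketch *node)
{
    assert(node != NULL);
    return (uintptr_t)node - offsetof(struct item_sketch, node);
}

/* Returns 1 if the computed container address is the given real item. */
static int is_container_of(const struct list_head_sketch *node,
                           const struct item_sketch *item)
{
    return containing_addr(node) == (uintptr_t)item;
}
```

For a node embedded in a real item_sketch, the computed address is the
item itself.  For a standalone head, the same subtraction lands on memory
below the head that belongs to no item, which is the hazard described
above.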

Signed-off-by: Don Mullis <>
Tested-by: Artem Bityutskiy <>
Signed-off-by: Artem Bityutskiy <>
Cc: <>
Signed-off-by: Andrew Morton <>
Signed-off-by: Linus Torvalds <>
11 years agosys_semctl: fix kernel stack leakage
Dan Rosenberg [Thu, 30 Sep 2010 22:15:31 +0000 (15:15 -0700)]
sys_semctl: fix kernel stack leakage

The semctl syscall has several code paths that lead to the leakage of
uninitialized kernel stack memory (namely the IPC_INFO, SEM_INFO,
IPC_STAT, and SEM_STAT commands) during the use of the older, obsolete
version of the semid_ds struct.

The copy_semid_to_user() function declares a semid_ds struct on the stack
and copies it back to the user without initializing or zeroing the
"sem_base", "sem_pending", "sem_pending_last", and "undo" pointers,
allowing the leakage of 16 bytes of kernel stack memory.
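
The 16 bytes correspond to four pointers of 4 bytes each on a 32-bit
system.  As a rough userspace sketch of the usual remedy for this class of
bug (zero the whole struct before filling in the reported fields), and not
the kernel patch itself: old_semid_sketch, fill_semid_sketch() and
pointers_are_clear() are hypothetical names, and the struct layout is
simplified.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical, simplified stand-in for the old-style struct: a few
 * reported fields plus the four pointer fields named in the text.
 */
struct old_semid_sketch {
    long otime;
    long ctime;
    void *sem_base;
    void *sem_pending;
    void *sem_pending_last;
    void *undo;
    unsigned short nsems;
};

/* Zero the whole struct first, then set only the fields meant to be reported. */
static void fill_semid_sketch(struct old_semid_sketch *out,
                              long otime, long ctime, unsigned short nsems)
{
    assert(out != NULL);
    memset(out, 0, sizeof(*out));
    out->otime = otime;
    out->ctime = ctime;
    out->nsems = nsems;
}

/* Returns 1 if none of the four pointer fields carries stale data. */
static int pointers_are_clear(const struct old_semid_sketch *s)
{
    return !s->sem_base && !s->sem_pending &&
           !s->sem_pending_last && !s->undo;
}
```

Zeroing the whole object rather than assigning the four pointers one by
one also clears any padding bytes, so nothing stale is left in a buffer
that is later copied out wholesale.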

The code is still reachable on 32-bit systems: when calling semctl(),
newer glibc versions automatically OR the IPC command with the IPC_64
flag, but invoking the syscall directly allows users to use the older
versions of the struct.

Signed-off-by: Dan Rosenberg <>
Cc: Manfred Spraul <>
Signed-off-by: Andrew Morton <>
Signed-off-by: Linus Torvalds <>