9 years agoLinux 3.2.28 v3.2.28
Ben Hutchings [Sun, 19 Aug 2012 17:15:38 +0000 (18:15 +0100)]
Linux 3.2.28

9 years agodrm/radeon: do not reenable crtc after moving vram start address
Jerome Glisse [Fri, 27 Jul 2012 20:32:24 +0000 (16:32 -0400)]
drm/radeon: do not reenable crtc after moving vram start address

commit 81ee8fb6b52ec69eeed37fe7943446af1dccecc5 upstream.

It seems we can not update the crtc scanout address. After disabling
crtc, update to base address do not take effect after crtc being
reenable leading to at least frame being scanout from the old crtc
base address. Disabling crtc display request lead to same behavior.

So after changing the vram address if we don't keep crtc disabled
we will have the GPU trying to read some random system memory address
with some iommu this will broke the crtc engine and will lead to
broken display and iommu error message.

So to avoid this, disable crtc. For flicker less boot we will need
to avoid moving the vram start address.

This patch should also fix :

Signed-off-by: Jerome Glisse <>
Signed-off-by: Ben Hutchings <>
9 years agodrm/radeon: fix bank tiling parameters on cayman
Alex Deucher [Tue, 31 Jul 2012 15:05:11 +0000 (11:05 -0400)]
drm/radeon: fix bank tiling parameters on cayman

commit 5b23c9045a8b61352986270b2d109edf5085e113 upstream.

Handle the 16 bank case.

Signed-off-by: Alex Deucher <>
[bwh: Backported to 3.2: adjust context, indentation]
Signed-off-by: Ben Hutchings <>
9 years agodrm/radeon: fix bank tiling parameters on evergreen
Alex Deucher [Tue, 31 Jul 2012 15:01:10 +0000 (11:01 -0400)]
drm/radeon: fix bank tiling parameters on evergreen

commit c8d15edc17d836686d1f071e564800e1a2724fa6 upstream.

Handle the 16 bank case.

Signed-off-by: Alex Deucher <>
Signed-off-by: Ben Hutchings <>
9 years agos390/compat: fix mmap compat system calls
Heiko Carstens [Wed, 8 Aug 2012 07:32:20 +0000 (09:32 +0200)]
s390/compat: fix mmap compat system calls

commit e85871218513c54f7dfdb6009043cb638f2fecbe upstream.

The native 31 bit and the compat behaviour for the mmap system calls differ:

In native 31 bit mode the passed in address for the mmap system call will be
unmodified passed to sys_mmap_pgoff().
In compat mode however the passed in address will be modified with
compat_ptr() which masks out the most significant bit.

The result is that in native 31 bit mode each mmap request (with MAP_FIXED)
will fail where the most significat bit is set, while in compat mode it
may succeed.

This odd behaviour was introduced with d3815898 "[S390] mmap: add missing
compat_ptr conversion to both mmap compat syscalls".

To restore a consistent behaviour accross native and compat mode this
patch functionally reverts the above mentioned commit.

Signed-off-by: Heiko Carstens <>
Signed-off-by: Martin Schwidefsky <>
Signed-off-by: Ben Hutchings <>
9 years agos390/compat: fix compat wrappers for process_vm system calls
Heiko Carstens [Tue, 7 Aug 2012 07:48:13 +0000 (09:48 +0200)]
s390/compat: fix compat wrappers for process_vm system calls

commit 82aabdb6f1eb61e0034ec23901480f5dd23db7c4 upstream.

The compat wrappers incorrectly called the non compat versions of
the system process_vm system calls.

Signed-off-by: Heiko Carstens <>
Signed-off-by: Martin Schwidefsky <>
Signed-off-by: Ben Hutchings <>
9 years agodrm/i915: correctly order the ring init sequence
Daniel Vetter [Tue, 7 Aug 2012 07:54:14 +0000 (09:54 +0200)]
drm/i915: correctly order the ring init sequence

commit 0d8957c8a90bbb5d34fab9a304459448a5131e06 upstream.

We may only start to set up the new register values after having
confirmed that the ring is truely off. Otherwise the hw might lose the
newly written register values. This is caught later on in the init
sequence, when we check whether the register writes have stuck.

Reviewed-by: Jani Nikula <>
Tested-by: Yang Guang <>
Signed-off-by: Daniel Vetter <>
Signed-off-by: Ben Hutchings <>
9 years agodrm/i915: Add wait_for in init_ring_common
Sean Paul [Fri, 16 Mar 2012 16:43:22 +0000 (12:43 -0400)]
drm/i915: Add wait_for in init_ring_common

commit f01db988ef6f6c70a6cc36ee71e4a98a68901229 upstream.

I have seen a number of "blt ring initialization failed" messages
where the ctl or start registers are not the correct value. Upon further
inspection, if the code just waited a little bit, it would read the
correct value. Adding the wait_for to these reads should eliminate the

Signed-off-by: Sean Paul <>
Reviewed-by: Ben Widawsky <>
Signed-off-by: Daniel Vetter <>
Signed-off-by: Ben Hutchings <>
9 years agoInput: eeti_ts: pass gpio value instead of IRQ
Arnd Bergmann [Mon, 30 Apr 2012 16:21:37 +0000 (16:21 +0000)]
Input: eeti_ts: pass gpio value instead of IRQ

commit 4eef6cbfcc03b294d9d334368a851b35b496ce53 upstream.

The EETI touchscreen asserts its IRQ line as soon as it has data in its
internal buffers. The line is automatically deasserted once all data has
been read via I2C. Hence, the driver has to monitor the GPIO line and
cannot simply rely on the interrupt handler reception.

In the current implementation of the driver, irq_to_gpio() is used to
determine the GPIO number from the i2c_client's IRQ value.

As irq_to_gpio() is not available on all platforms, this patch changes
this and makes the driver ignore the passed in IRQ. Instead, a GPIO is
added to the platform_data struct and gpio_to_irq is used to derive the
IRQ from that GPIO. If this fails, bail out. The driver is only able to
work in environments where the touchscreen GPIO can be mapped to an

Without this patch, building raumfeld_defconfig results in:

drivers/input/touchscreen/eeti_ts.c: In function 'eeti_ts_irq_active':
drivers/input/touchscreen/eeti_ts.c:65:2: error: implicit declaration of function 'irq_to_gpio' [-Werror=implicit-function-declaration]

Signed-off-by: Daniel Mack <>
Signed-off-by: Arnd Bergmann <>
Cc: Dmitry Torokhov <>
Cc: Sven Neumann <>
Cc: Haojian Zhuang <>
[bwh: Backported to 3.2: raumfeld_controller_i2c_board_info.irq was
 initialised using gpio_to_irq(), but this doesn't seem to matter]
Signed-off-by: Ben Hutchings <>
9 years agoARM: pxa: remove irq_to_gpio from ezx-pcap driver
Arnd Bergmann [Sun, 5 Aug 2012 14:58:37 +0000 (14:58 +0000)]
ARM: pxa: remove irq_to_gpio from ezx-pcap driver

commit 59ee93a528b94ef4e81a08db252b0326feff171f upstream.

The irq_to_gpio function was removed from the pxa platform
in linux-3.2, and this driver has been broken since.

There is actually no in-tree user of this driver that adds
this platform device, but the driver can and does get enabled
on some platforms.

Without this patch, building ezx_defconfig results in:

drivers/mfd/ezx-pcap.c: In function 'pcap_isr_work':
drivers/mfd/ezx-pcap.c:205:2: error: implicit declaration of function 'irq_to_gpio' [-Werror=implicit-function-declaration]

Signed-off-by: Arnd Bergmann <>
Acked-by: Haojian Zhuang <>
Cc: Samuel Ortiz <>
Cc: Daniel Ribeiro <>
Signed-off-by: Ben Hutchings <>
9 years agoALSA: hda - Fix double quirk for Quanta FL1 / Lenovo Ideapad
David Henningsson [Wed, 8 Aug 2012 06:43:37 +0000 (08:43 +0200)]
ALSA: hda - Fix double quirk for Quanta FL1 / Lenovo Ideapad

commit 012e7eb1e501d0120e0383b81477f63091f5e365 upstream.

The same ID is twice in the quirk table, so the second one is not used.

Signed-off-by: David Henningsson <>
Signed-off-by: Takashi Iwai <>
Signed-off-by: Ben Hutchings <>
9 years agoALSA: hda - remove quirk for Dell Vostro 1015
David Henningsson [Tue, 7 Aug 2012 12:03:29 +0000 (14:03 +0200)]
ALSA: hda - remove quirk for Dell Vostro 1015

commit e9fc83cb2e5877801a255a37ddbc5be996ea8046 upstream.

This computer is confirmed working with model=auto on kernel 3.2.
Also, parsing fails with hda-emu with the current model.

Signed-off-by: David Henningsson <>
Signed-off-by: Takashi Iwai <>
Signed-off-by: Ben Hutchings <>
9 years agoe1000e: NIC goes up and immediately goes down
Tushar Dave [Tue, 31 Jul 2012 02:02:43 +0000 (02:02 +0000)]
e1000e: NIC goes up and immediately goes down

commit b7ec70be01a87f2c85df3ae11046e74f9b67e323 upstream.

Found that commit d478eb44 was a bad commit.
If the link partner is transmitting codeword (even if NULL codeword),
then the RXCW.C bit will be set so check for RXCW.CW is unnecessary.
Ref: RH BZ 840642

Reported-by: Fabio Futigami <>
Signed-off-by: Tushar Dave <>
CC: Marcelo Ricardo Leitner <>
Tested-by: Aaron Brown <>
Signed-off-by: Peter P Waskiewicz Jr <>
Signed-off-by: Ben Hutchings <>
9 years agoALSA: hda - add dock support for Thinkpad X230
Felix Kaechele [Mon, 6 Aug 2012 21:02:01 +0000 (23:02 +0200)]
ALSA: hda - add dock support for Thinkpad X230

commit c8415a48fcb7a29889f4405d38c57db351e4b50a upstream.

As with the ThinkPad Models X230 Tablet and T530 the X230 needs a qurik to
correctly set up the pins for the dock port.

Signed-off-by: Felix Kaechele <>
Signed-off-by: Takashi Iwai <>
Signed-off-by: Ben Hutchings <>
9 years agoiwlwifi: disable greenfield transmissions as a workaround
Johannes Berg [Sun, 5 Aug 2012 16:31:46 +0000 (18:31 +0200)]
iwlwifi: disable greenfield transmissions as a workaround

commit 50e2a30cf6fcaeb2d27360ba614dd169a10041c5 upstream.

There's a bug that causes the rate scaling to get stuck
when it has to use single-stream rates with a peer that
can do GF and SGI; the two are incompatible so we can't
use them together, but that causes the algorithm to not
work at all, it always rejects updates.

Disable greenfield for now to prevent that problem.

Reviewed-by: Emmanuel Grumbach <>
Tested-by: Cesar Eduardo Barros <>
Signed-off-by: Johannes Berg <>
Signed-off-by: John W. Linville <>
[bwh: Backported to 3.2: adjust filename]
Signed-off-by: Ben Hutchings <>
9 years agort61pci: fix NULL pointer dereference in config_lna_gain
Stanislaw Gruszka [Fri, 3 Aug 2012 10:49:14 +0000 (12:49 +0200)]
rt61pci: fix NULL pointer dereference in config_lna_gain

commit deee0214def5d8a32b8112f11d9c2b1696e9c0cb upstream.

We can not pass NULL libconf->conf->channel to rt61pci_config() as it
is dereferenced unconditionally in rt61pci_config_lna_gain() subroutine.


Reported-and-tested-by: <>
Signed-off-by: Stanislaw Gruszka <>
Signed-off-by: John W. Linville <>
Signed-off-by: Ben Hutchings <>
9 years agocfg80211: process pending events when unregistering net device
Daniel Drake [Thu, 2 Aug 2012 17:41:48 +0000 (18:41 +0100)]
cfg80211: process pending events when unregistering net device

commit 1f6fc43e621167492ed4b7f3b4269c584c3d6ccc upstream.

libertas currently calls cfg80211_disconnected() when it is being
brought down. This causes an event to be allocated, but since the
wdev is already removed from the rdev by the time that the event
processing work executes, the event is never processed or freed.

Fix this leak, and other possible situations, by processing the event
queue when a device is being unregistered. Thanks to Johannes Berg for
the suggestion.

Signed-off-by: Daniel Drake <>
Reviewed-by: Johannes Berg <>
Signed-off-by: John W. Linville <>
Signed-off-by: Ben Hutchings <>
9 years agoALSA: hda - add dock support for Thinkpad T430s
Philipp A. Mohrenweiser [Mon, 6 Aug 2012 11:14:18 +0000 (13:14 +0200)]
ALSA: hda - add dock support for Thinkpad T430s

commit 4407be6ba217514b1bc01488f8b56467d309e416 upstream.

Add a model/fixup string "lenovo-dock", for Thinkpad T430s, to allow
sound in docking station.

Tested on Lenovo T430s with ThinkPad Mini Dock Plus Series 3

Signed-off-by: Philipp A. Mohrenweiser <>
Signed-off-by: Takashi Iwai <>
Signed-off-by: Ben Hutchings <>
9 years agoARM: mxs: Remove MMAP_MIN_ADDR setting from mxs_defconfig
Marek Vasut [Fri, 3 Aug 2012 18:54:48 +0000 (20:54 +0200)]
ARM: mxs: Remove MMAP_MIN_ADDR setting from mxs_defconfig

commit 3bed491c8d28329e34f8a31e3fe64d03f3a350f1 upstream.

The CONFIG_DEFAULT_MMAP_MIN_ADDR was set to 65536 in mxs_defconfig,
this caused severe breakage of userland applications since the upper
limit for ARM is 32768. By default CONFIG_DEFAULT_MMAP_MIN_ADDR is
set to 4096 and can also be changed via /proc/sys/vm/mmap_min_addr
if needed.

Quoting Russell King [1]:

"4096 is also fine for ARM too. There's not much point in having
defconfigs change it - that would just be pure noise in the config

the CONFIG_DEFAULT_MMAP_MIN_ADDR can be removed from the defconfig

This problem was introduced by commit cde7c41 (ARM: configs: add
defconfig for mach-mxs).


Signed-off-by: Marek Vasut <>
Cc: Russell King <>
Cc: Wolfgang Denk <>
Signed-off-by: Shawn Guo <>
Signed-off-by: Ben Hutchings <>
9 years agoath9k: Add PID/VID support for AR1111
Mohammed Shafi Shajakhan [Thu, 2 Aug 2012 06:28:50 +0000 (11:58 +0530)]
ath9k: Add PID/VID support for AR1111

commit d4e5979c0da95791aa717c18e162540c7a596360 upstream.

AR1111 is same as AR9485. The h/w
difference between them is quite insignificant,
Felix suggests only very few baseband features
may not be available in AR1111. The h/w code for
AR9485 is already present, so AR1111 should
work fine with the addition of its PID/VID.

Cc: Felix Bitterli <>
Reported-by: Tim Bentley <>
Signed-off-by: Mohammed Shafi Shajakhan <>
Tested-by: Tim Bentley <>
Signed-off-by: John W. Linville <>
Signed-off-by: Ben Hutchings <>
9 years agomac80211: cancel mesh path timer
Johannes Berg [Wed, 1 Aug 2012 19:03:21 +0000 (21:03 +0200)]
mac80211: cancel mesh path timer

commit dd4c9260e7f23f2e951cbfb2726e468c6d30306c upstream.

The mesh path timer needs to be canceled when
leaving the mesh as otherwise it could fire
after the interface has been removed already.

Signed-off-by: Johannes Berg <>
Signed-off-by: Ben Hutchings <>
9 years agoKVM: VMX: Advertise CPU_BASED_RDPMC_EXITING for nested guests
Stefan Bader [Thu, 9 Aug 2012 09:33:12 +0000 (12:33 +0300)]
KVM: VMX: Advertise CPU_BASED_RDPMC_EXITING for nested guests

Based on commit fee84b079d5ddee2247b5c1f53162c330c622902 upstream.

  Intercept RDPMC and forward it to the PMU emulation code.

Newer vmx support will only allow to load the kvm_intel module
if RDPMC_EXITING is supported. Even without the actual support
this part of the change is required on 3.2 hosts.

Signed-off-by: Stefan Bader <>
Signed-off-by: Avi Kivity <>
Signed-off-by: Ben Hutchings <>
9 years agodrm/i915: fixup seqno allocation logic for lazy_request
Daniel Vetter [Wed, 25 Jan 2012 15:32:49 +0000 (16:32 +0100)]
drm/i915: fixup seqno allocation logic for lazy_request

commit 53d227f282eb9fa4c7cdbfd691fa372b7ca8c4c3 upstream.

Currently we reserve seqnos only when we emit the request to the ring
(by bumping dev_priv->next_seqno), but start using it much earlier for
ring->oustanding_lazy_request. When 2 threads compete for the gpu and
run on two different rings (e.g. ddx on blitter vs. compositor)
hilarity ensued, especially when we get constantly interrupted while
reserving buffers.

Breakage seems to have been introduced in

commit 6f392d548658a17600da7faaf8a5df25ee5f01f6
Author: Chris Wilson <>
Date:   Sat Aug 7 11:01:22 2010 +0100

    drm/i915: Use a common seqno for all rings.

This patch fixes up the seqno reservation logic by moving it into
i915_gem_next_request_seqno. The ring->add_request functions now
superflously still return the new seqno through a pointer, that will
be refactored in the next patch.

Note that with this change we now unconditionally allocate a seqno,
even when ->add_request might fail because the rings are full and the
gpu died. But this does not open up a new can of worms because we can
already leave behind an outstanding_request_seqno if e.g. the caller
gets interrupted with a signal while stalling for the gpu in the
eviciton paths. And with the bugfix we only ever have one seqno
allocated per ring (and only that ring), so there are no ordering
issues with multiple outstanding seqnos on the same ring.

v2: Keep i915_gem_get_seqno (but move it to i915_gem.c) to make it
clear that we only have one seqno counter for all rings. Suggested by
Chris Wilson.

v3: As suggested by Chris Wilson use i915_gem_next_request_seqno
instead of ring->oustanding_lazy_request to make the follow-up
refactoring more clearly correct. Also improve the commit message
with issues discussed on irc.

Tested-by: Nicolas Kalkhof nkalkhof()at()
Reviewed-by: Chris Wilson <>
Signed-Off-by: Daniel Vetter <>
Signed-off-by: Ben Hutchings <>
9 years agohfsplus: fix overflow in sector calculations in hfsplus_submit_bio
Janne Kalliomäki [Sun, 17 Jun 2012 21:05:24 +0000 (17:05 -0400)]
hfsplus: fix overflow in sector calculations in hfsplus_submit_bio

commit a6dc8c04218eb752ff79cdc24a995cf51866caed upstream.

The variable io_size was unsigned int, which caused the wrong sector number
to be calculated after aligning it. This then caused mount to fail with big
volumes, as backup volume header information was searched from a
wrong sector.

Signed-off-by: Janne Kalliomäki <>
Signed-off-by: Christoph Hellwig <>
Signed-off-by: Linus Torvalds <>
Signed-off-by: Ben Hutchings <>
9 years agortlwifi: rtl8192cu: Change buffer allocation for synchronous reads
Larry Finger [Wed, 11 Jul 2012 19:37:28 +0000 (14:37 -0500)]
rtlwifi: rtl8192cu: Change buffer allocation for synchronous reads

commit 3ce4d85b76010525adedcc2555fa164bf706a2f3 upstream.

In commit a7959c1, the USB part of rtlwifi was switched to convert
_usb_read_sync() to using a preallocated buffer rather than one
that has been acquired using kmalloc. Although this routine is named
as though it were synchronous, there seem to be simultaneous users,
and the selection of the index to the data buffer is not multi-user
safe. This situation is addressed by adding a new spinlock. The routine
cannot sleep, thus a mutex is not allowed.

Signed-off-by: Larry Finger <>
Signed-off-by: John W. Linville <>
Signed-off-by: Ben Hutchings <>
9 years agoe1000: add dropped DMA receive enable back in for WoL
Dean Nelson [Thu, 19 Jan 2012 17:47:24 +0000 (17:47 +0000)]
e1000: add dropped DMA receive enable back in for WoL

commit b868179c47e9e8eadcd04c1f3105998e528988a3 upstream.

Commit d5bc77a223b0e9b9dfb002048d2b34a79e7d0b48 broke Wake-on-LAN by
inadvertently dropping the enabling of DMA receives.

Restore the enabling of DMA receives for WoL.

This is applicable to 3.1+ stable trees.

Reported-by: Tobias Klausmann <>
Signed-off-by: Dean Nelson <>
Tested-by: Tobias Klausmann <>
Tested-by: Aaron Brown <>
Signed-off-by: Jeff Kirsher <>
Signed-off-by: Ben Hutchings <>
9 years agonet/tun: fix ioctl() based info leaks
Mathias Krause [Sun, 29 Jul 2012 19:45:14 +0000 (19:45 +0000)]
net/tun: fix ioctl() based info leaks

[ Upstream commits a117dacde0288f3ec60b6e5bcedae8fa37ee0dfc
  and 8bbb181308bc348e02bfdbebdedd4e4ec9d452ce ]

The tun module leaks up to 36 bytes of memory by not fully initializing
a structure located on the stack that gets copied to user memory by the

Signed-off-by: Mathias Krause <>
Signed-off-by: David S. Miller <>
Signed-off-by: Ben Hutchings <>
9 years agotcp: perform DMA to userspace only if there is a task waiting for it
Jiri Kosina [Fri, 27 Jul 2012 10:38:50 +0000 (10:38 +0000)]
tcp: perform DMA to userspace only if there is a task waiting for it

[ Upstream commit 59ea33a68a9083ac98515e4861c00e71efdc49a1 ]

Back in 2006, commit 1a2449a87b ("[I/OAT]: TCP recv offload to I/OAT")
added support for receive offloading to IOAT dma engine if available.

The code in tcp_rcv_established() tries to perform early DMA copy if
applicable. It however does so without checking whether the userspace
task is actually expecting the data in the buffer.

This is not a problem under normal circumstances, but there is a corner
case where this doesn't work -- and that's when MSG_TRUNC flag to
recvmsg() is used.

If the IOAT dma engine is not used, the code properly checks whether
there is a valid ucopy.task and the socket is owned by userspace, but
misses the check in the dmaengine case.

This problem can be observed in real trivially -- for example 'tbench' is a
good reproducer, as it makes a heavy use of MSG_TRUNC. On systems utilizing
IOAT, you will soon find tbench waiting indefinitely in sk_wait_data(), as they
have been already early-copied in tcp_rcv_established() using dma engine.

This patch introduces the same check we are performing in the simple
iovec copy case to the IOAT case as well. It fixes the indefinite
recvmsg(MSG_TRUNC) hangs.

Signed-off-by: Jiri Kosina <>
Signed-off-by: David S. Miller <>
Signed-off-by: Ben Hutchings <>
9 years agonet: fix rtnetlink IFF_PROMISC and IFF_ALLMULTI handling
Jiri Benc [Fri, 27 Jul 2012 02:58:22 +0000 (02:58 +0000)]
net: fix rtnetlink IFF_PROMISC and IFF_ALLMULTI handling

[ Upstream commit b1beb681cba5358f62e6187340660ade226a5fcc ]

When device flags are set using rtnetlink, IFF_PROMISC and IFF_ALLMULTI
flags are handled specially. Function dev_change_flags sets IFF_PROMISC and
IFF_ALLMULTI bits in dev->gflags according to the passed value but
do_setlink passes a result of rtnl_dev_combine_flags which takes those bits
from dev->flags.

This can be easily trigerred by doing:

tcpdump -i eth0 &
ip l s up eth0

ip sets IFF_UP flag in ifi_flags and ifi_change, which is combined with
IFF_PROMISC by rtnl_dev_combine_flags, causing __dev_change_flags to set
IFF_PROMISC in gflags.

Reported-by: Max Matveev <>
Signed-off-by: Jiri Benc <>
Signed-off-by: David S. Miller <>
Signed-off-by: Ben Hutchings <>
9 years agoUSB: kaweth.c: use GFP_ATOMIC under spin_lock
Dan Carpenter [Fri, 27 Jul 2012 01:46:51 +0000 (01:46 +0000)]
USB: kaweth.c: use GFP_ATOMIC under spin_lock

[ Upstream commit e4c7f259c5be99dcfc3d98f913590663b0305bf8 ]

The problem is that we call this with a spin lock held.  The call tree
kaweth_start_xmit() holds kaweth->device_lock.
-> kaweth_async_set_rx_mode()
   -> kaweth_control()
      -> kaweth_internal_control_msg()

The kaweth_internal_control_msg() function is only called from
kaweth_control() which used GFP_ATOMIC for its allocations.

Signed-off-by: Dan Carpenter <>
Signed-off-by: David S. Miller <>
Signed-off-by: Ben Hutchings <>
9 years agotcp: Add TCP_USER_TIMEOUT negative value check
Hangbin Liu [Thu, 26 Jul 2012 22:52:21 +0000 (22:52 +0000)]
tcp: Add TCP_USER_TIMEOUT negative value check

[ Upstream commit 42493570100b91ef663c4c6f0c0fdab238f9d3c2 ]

TCP_USER_TIMEOUT is a TCP level socket option that takes an unsigned int. But
patch "tcp: Add TCP_USER_TIMEOUT socket option"(dca43c75) didn't check the negative
values. If a user assign -1 to it, the socket will set successfully and wait
for 4294967295 miliseconds. This patch add a negative value check to avoid
this issue.

Signed-off-by: Hangbin Liu <>
Signed-off-by: David S. Miller <>
Signed-off-by: Ben Hutchings <>
9 years agowanmain: comparing array with NULL
Alan Cox [Tue, 24 Jul 2012 08:16:25 +0000 (08:16 +0000)]
wanmain: comparing array with NULL

[ Upstream commit 8b72ff6484fe303e01498b58621810a114f3cf09 ]

gcc really should warn about these !

Signed-off-by: Alan Cox <>
Signed-off-by: David S. Miller <>
Signed-off-by: Ben Hutchings <>
9 years agocaif: fix NULL pointer check
Alan Cox [Tue, 24 Jul 2012 02:42:14 +0000 (02:42 +0000)]
caif: fix NULL pointer check

[ Upstream commit c66b9b7d365444b433307ebb18734757cb668a02 ]

Reported-by: <>
Signed-off-by: Alan Cox <>
Signed-off-by: David S. Miller <>
Signed-off-by: Ben Hutchings <>
9 years agocipso: don't follow a NULL pointer when setsockopt() is called
Paul Moore [Tue, 17 Jul 2012 11:07:47 +0000 (11:07 +0000)]
cipso: don't follow a NULL pointer when setsockopt() is called

[ Upstream commit 89d7ae34cdda4195809a5a987f697a517a2a3177 ]

As reported by Alan Cox, and verified by Lin Ming, when a user
attempts to add a CIPSO option to a socket using the CIPSO_V4_TAG_LOCAL
tag the kernel dies a terrible death when it attempts to follow a NULL
pointer (the skb argument to cipso_v4_validate() is NULL when called via
the setsockopt() syscall).

This patch fixes this by first checking to ensure that the skb is
non-NULL before using it to find the incoming network interface.  In
the unlikely case where the skb is NULL and the user attempts to add
a CIPSO option with the _TAG_LOCAL tag we return an error as this is
not something we want to allow.

A simple reproducer, kindly supplied by Lin Ming, although you must
have the CIPSO DOI #3 configure on the system first or you will be
caught early in cipso_v4_validate():

#include <sys/types.h>
#include <sys/socket.h>
#include <linux/ip.h>
#include <linux/in.h>
#include <string.h>

struct local_tag {
char type;
char length;
char info[4];

struct cipso {
char type;
char length;
char doi[4];
struct local_tag local;

int main(int argc, char **argv)
int sockfd;
struct cipso cipso = {
.type = IPOPT_CIPSO,
.length = sizeof(struct cipso),
.local = {
.type = 128,
.length = sizeof(struct local_tag),

memset(cipso.doi, 0, 4);
cipso.doi[3] = 3;

sockfd = socket(AF_INET, SOCK_DGRAM, 0);
#define SOL_IP 0
setsockopt(sockfd, SOL_IP, IP_OPTIONS,
&cipso, sizeof(struct cipso));

return 0;

CC: Lin Ming <>
Reported-by: Alan Cox <>
Signed-off-by: Paul Moore <>
Signed-off-by: David S. Miller <>
Signed-off-by: Ben Hutchings <>
9 years agocaif: Fix access to freed pernet memory
Sjur Brændeland [Sun, 15 Jul 2012 10:10:14 +0000 (10:10 +0000)]
caif: Fix access to freed pernet memory

[ Upstream commit 96f80d123eff05c3cd4701463786b87952a6c3ac ]

unregister_netdevice_notifier() must be called before
unregister_pernet_subsys() to avoid accessing already freed
pernet memory. This fixes the following oops when doing rmmod:

Call Trace:
 [<ffffffffa0f802bd>] caif_device_notify+0x4d/0x5a0 [caif]
 [<ffffffff81552ba9>] unregister_netdevice_notifier+0xb9/0x100
 [<ffffffffa0f86dcc>] caif_device_exit+0x1c/0x250 [caif]
 [<ffffffff810e7734>] sys_delete_module+0x1a4/0x300
 [<ffffffff810da82d>] ? trace_hardirqs_on_caller+0x15d/0x1e0
 [<ffffffff813517de>] ? trace_hardirqs_on_thunk+0x3a/0x3
 [<ffffffff81696bad>] system_call_fastpath+0x1a/0x1f

 [<ffffffffa0f7f561>] caif_get+0x51/0xb0 [caif]

Signed-off-by: Sjur Brændeland <>
Acked-by: "Eric W. Biederman" <>
Signed-off-by: David S. Miller <>
Signed-off-by: Ben Hutchings <>
9 years agosctp: Fix list corruption resulting from freeing an association on a list
Neil Horman [Mon, 16 Jul 2012 09:13:51 +0000 (09:13 +0000)]
sctp: Fix list corruption resulting from freeing an association on a list

[ Upstream commit 2eebc1e188e9e45886ee00662519849339884d6d ]

A few days ago Dave Jones reported this oops:

[22766.294255] general protection fault: 0000 [#1] PREEMPT SMP
[22766.295376] CPU 0
[22766.295384] Modules linked in:
[22766.387137]  ffffffffa169f292 6b6b6b6b6b6b6b6b ffff880147c03a90
[22766.387135] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 00000000000
[22766.387136] Process trinity-watchdo (pid: 10896, threadinfo ffff88013e7d2000,
[22766.387137] Stack:
[22766.387140]  ffff880147c03a10
[22766.387140]  ffffffffa169f2b6
[22766.387140]  ffff88013ed95728
[22766.387143]  0000000000000002
[22766.387143]  0000000000000000
[22766.387143]  ffff880003fad062
[22766.387144]  ffff88013c120000
[22766.387145] Call Trace:
[22766.387145]  <IRQ>
[22766.387150]  [<ffffffffa169f292>] ? __sctp_lookup_association+0x62/0xd0
[22766.387154]  [<ffffffffa169f2b6>] __sctp_lookup_association+0x86/0xd0 [sctp]
[22766.387157]  [<ffffffffa169f597>] sctp_rcv+0x207/0xbb0 [sctp]
[22766.387161]  [<ffffffff810d4da8>] ? trace_hardirqs_off_caller+0x28/0xd0
[22766.387163]  [<ffffffff815827e3>] ? nf_hook_slow+0x133/0x210
[22766.387166]  [<ffffffff815902fc>] ? ip_local_deliver_finish+0x4c/0x4c0
[22766.387168]  [<ffffffff8159043d>] ip_local_deliver_finish+0x18d/0x4c0
[22766.387169]  [<ffffffff815902fc>] ? ip_local_deliver_finish+0x4c/0x4c0
[22766.387171]  [<ffffffff81590a07>] ip_local_deliver+0x47/0x80
[22766.387172]  [<ffffffff8158fd80>] ip_rcv_finish+0x150/0x680
[22766.387174]  [<ffffffff81590c54>] ip_rcv+0x214/0x320
[22766.387176]  [<ffffffff81558c07>] __netif_receive_skb+0x7b7/0x910
[22766.387178]  [<ffffffff8155856c>] ? __netif_receive_skb+0x11c/0x910
[22766.387180]  [<ffffffff810d423e>] ? put_lock_stats.isra.25+0xe/0x40
[22766.387182]  [<ffffffff81558f83>] netif_receive_skb+0x23/0x1f0
[22766.387183]  [<ffffffff815596a9>] ? dev_gro_receive+0x139/0x440
[22766.387185]  [<ffffffff81559280>] napi_skb_finish+0x70/0xa0
[22766.387187]  [<ffffffff81559cb5>] napi_gro_receive+0xf5/0x130
[22766.387218]  [<ffffffffa01c4679>] e1000_receive_skb+0x59/0x70 [e1000e]
[22766.387242]  [<ffffffffa01c5aab>] e1000_clean_rx_irq+0x28b/0x460 [e1000e]
[22766.387266]  [<ffffffffa01c9c18>] e1000e_poll+0x78/0x430 [e1000e]
[22766.387268]  [<ffffffff81559fea>] net_rx_action+0x1aa/0x3d0
[22766.387270]  [<ffffffff810a495f>] ? account_system_vtime+0x10f/0x130
[22766.387273]  [<ffffffff810734d0>] __do_softirq+0xe0/0x420
[22766.387275]  [<ffffffff8169826c>] call_softirq+0x1c/0x30
[22766.387278]  [<ffffffff8101db15>] do_softirq+0xd5/0x110
[22766.387279]  [<ffffffff81073bc5>] irq_exit+0xd5/0xe0
[22766.387281]  [<ffffffff81698b03>] do_IRQ+0x63/0xd0
[22766.387283]  [<ffffffff8168ee2f>] common_interrupt+0x6f/0x6f
[22766.387283]  <EOI>
[22766.387285]  [<ffffffff8168eed9>] ? retint_swapgs+0x13/0x1b
[22766.387285] Code: c0 90 5d c3 66 0f 1f 44 00 00 4c 89 c8 5d c3 0f 1f 00 55 48
89 e5 48 83
ec 20 48 89 5d e8 4c 89 65 f0 4c 89 6d f8 66 66 66 66 90 <0f> b7 87 98 00 00 00
48 89 fb
49 89 f5 66 c1 c0 08 66 39 46 02
[22766.387307] RIP
[22766.387311]  [<ffffffffa168a2c9>] sctp_assoc_is_match+0x19/0x90 [sctp]
[22766.387311]  RSP <ffff880147c039b0>
[22766.387142]  ffffffffa16ab120
[22766.599537] ---[ end trace 3f6dae82e37b17f5 ]---
[22766.601221] Kernel panic - not syncing: Fatal exception in interrupt

It appears from his analysis and some staring at the code that this is likely
occuring because an association is getting freed while still on the
sctp_assoc_hashtable.  As a result, we get a gpf when traversing the hashtable
while a freed node corrupts part of the list.

Nominally I would think that an mibalanced refcount was responsible for this,
but I can't seem to find any obvious imbalance.  What I did note however was
that the two places where we create an association using
sctp_primitive_ASSOCIATE (__sctp_connect and sctp_sendmsg), have failure paths
which free a newly created association after calling sctp_primitive_ASSOCIATE.
sctp_primitive_ASSOCIATE brings us into the sctp_sf_do_prm_asoc path, which
issues a SCTP_CMD_NEW_ASOC side effect, which in turn adds a new association to
the aforementioned hash table.  the sctp command interpreter that process side
effects has not way to unwind previously processed commands, so freeing the
association from the __sctp_connect or sctp_sendmsg error path would lead to a
freed association remaining on this hash table.

I've fixed this but modifying sctp_[un]hash_established to use hlist_del_init,
which allows us to proerly use hlist_unhashed to check if the node is on a
hashlist safely during a delete.  That in turn alows us to safely call
sctp_unhash_established in the __sctp_connect and sctp_sendmsg error paths
before freeing them, regardles of what the associations state is on the hash

I noted, while I was doing this, that the __sctp_unhash_endpoint was using
hlist_unhsashed in a simmilar fashion, but never nullified any removed nodes
pointers to make that function work properly, so I fixed that up in a simmilar

I attempted to test this using a virtual guest running the SCTP_RR test from
netperf in a loop while running the trinity fuzzer, both in a loop.  I wasn't
able to recreate the problem prior to this fix, nor was I able to trigger the
failure after (neither of which I suppose is suprising).  Given the trace above
however, I think its likely that this is what we hit.

Signed-off-by: Neil Horman <>
CC: "David S. Miller" <>
CC: Vlad Yasevich <>
CC: Sridhar Samudrala <>
Signed-off-by: David S. Miller <>
Signed-off-by: Ben Hutchings <>
9 years agosch_sfb: Fix missing NULL check
Alan Cox [Thu, 12 Jul 2012 03:39:11 +0000 (03:39 +0000)]
sch_sfb: Fix missing NULL check

[ Upstream commit 7ac2908e4b2edaec60e9090ddb4d9ceb76c05e7d ]


Signed-off-by: Alan Cox <>
Acked-by: Eric Dumazet <>
Signed-off-by: David S. Miller <>
Signed-off-by: Ben Hutchings <>
9 years agobnx2: Fix bug in bnx2_free_tx_skbs().
Michael Chan [Tue, 10 Jul 2012 10:04:40 +0000 (10:04 +0000)]
bnx2: Fix bug in bnx2_free_tx_skbs().

[ Upstream commit c1f5163de417dab01fa9daaf09a74bbb19303f3c ]

In rare cases, bnx2x_free_tx_skbs() can unmap the wrong DMA address
when it gets to the last entry of the tx ring.  We were not using
the proper macro to skip the last entry when advancing the tx index.

Reported-by: Zongyun Lai <>
Reviewed-by: Jeffrey Huang <>
Signed-off-by: Michael Chan <>
Signed-off-by: David S. Miller <>
Signed-off-by: Ben Hutchings <>
9 years agoLinux 3.2.27 v3.2.27
Ben Hutchings [Thu, 9 Aug 2012 23:25:22 +0000 (00:25 +0100)]
Linux 3.2.27

9 years agopch_uart: Fix parity setting issue
Tomoya MORINAGA [Fri, 6 Jul 2012 08:19:43 +0000 (17:19 +0900)]
pch_uart: Fix parity setting issue

commit 38bd2a1ac736901d1cf4971c78ef952ba92ef78b upstream.

Parity Setting value is reverse.
E.G. In case of setting ODD parity, EVEN value is set.
This patch inverts "if" condition.

Signed-off-by: Tomoya MORINAGA <>
Acked-by: Alan Cox <>
Signed-off-by: Greg Kroah-Hartman <>
Signed-off-by: Ben Hutchings <>
9 years agopch_uart: Fix rx error interrupt setting issue
Tomoya MORINAGA [Fri, 6 Jul 2012 08:19:42 +0000 (17:19 +0900)]
pch_uart: Fix rx error interrupt setting issue

commit 9539dfb7ac1c84522fe1f79bb7dac2990f3de44a upstream.

Rx Error interrupt(E.G. parity error) is not enabled.
So, when parity error occurs, error interrupt is not occurred.
As a result, the received data is not dropped.

This patch adds enable/disable rx error interrupt code.

Signed-off-by: Tomoya MORINAGA <>
Acked-by: Alan Cox <>
Signed-off-by: Greg Kroah-Hartman <>
[Backported by Tomoya MORINGA: adjusted context]
Signed-off-by: Ben Hutchings <>
9 years agopch_uart: Fix missing break for 16 byte fifo
Alan Cox [Mon, 2 Jul 2012 17:51:38 +0000 (18:51 +0100)]
pch_uart: Fix missing break for 16 byte fifo

commit 9bc03743fff0770dc5a5324ba92e67cc377f16ca upstream.

Otherwise we fall back to the wrong value.

Reported-by: <>
Signed-off-by: Alan Cox <>
Signed-off-by: Greg Kroah-Hartman <>
Signed-off-by: Ben Hutchings <>
9 years agodrop_monitor: dont sleep in atomic context
Eric Dumazet [Mon, 4 Jun 2012 00:18:19 +0000 (00:18 +0000)]
drop_monitor: dont sleep in atomic context

commit bec4596b4e6770c7037f21f6bd27567b152dc0d6 upstream.

drop_monitor calls several sleeping functions while in atomic context.

 BUG: sleeping function called from invalid context at mm/slub.c:943
 in_atomic(): 1, irqs_disabled(): 0, pid: 2103, name: kworker/0:2
 Pid: 2103, comm: kworker/0:2 Not tainted 3.5.0-rc1+ #55
 Call Trace:
  [<ffffffff810697ca>] __might_sleep+0xca/0xf0
  [<ffffffff811345a3>] kmem_cache_alloc_node+0x1b3/0x1c0
  [<ffffffff8105578c>] ? queue_delayed_work_on+0x11c/0x130
  [<ffffffff815343fb>] __alloc_skb+0x4b/0x230
  [<ffffffffa00b0360>] ? reset_per_cpu_data+0x160/0x160 [drop_monitor]
  [<ffffffffa00b022f>] reset_per_cpu_data+0x2f/0x160 [drop_monitor]
  [<ffffffffa00b03ab>] send_dm_alert+0x4b/0xb0 [drop_monitor]
  [<ffffffff810568e0>] process_one_work+0x130/0x4c0
  [<ffffffff81058249>] worker_thread+0x159/0x360
  [<ffffffff810580f0>] ? manage_workers.isra.27+0x240/0x240
  [<ffffffff8105d403>] kthread+0x93/0xa0
  [<ffffffff816be6d4>] kernel_thread_helper+0x4/0x10
  [<ffffffff8105d370>] ? kthread_freezable_should_stop+0x80/0x80
  [<ffffffff816be6d0>] ? gs_change+0xb/0xb

Rework the logic to call the sleeping functions in right context.

Use standard timer/workqueue api to let system chose any cpu to perform
the allocation and netlink send.

Also avoid a loop if reset_per_cpu_data() cannot allocate memory :
use mod_timer() to wait 1/10 second before next try.

Signed-off-by: Eric Dumazet <>
Cc: Neil Horman <>
Reviewed-by: Neil Horman <>
Signed-off-by: David S. Miller <>
Signed-off-by: Ben Hutchings <>
9 years agodrop_monitor: prevent init path from scheduling on the wrong cpu
Neil Horman [Tue, 1 May 2012 08:18:02 +0000 (08:18 +0000)]
drop_monitor: prevent init path from scheduling on the wrong cpu

commit 4fdcfa12843bca38d0c9deff70c8720e4e8f515f upstream.

I just noticed after some recent updates, that the init path for the drop
monitor protocol has a minor error.  drop monitor maintains a per cpu structure,
that gets initalized from a single cpu.  Normally this is fine, as the protocol
isn't in use yet, but I recently made a change that causes a failed skb
allocation to reschedule itself .  Given the current code, the implication is
that this workqueue reschedule will take place on the wrong cpu.  If drop
monitor is used early during the boot process, its possible that two cpus will
access a single per-cpu structure in parallel, possibly leading to data

This patch fixes the situation, by storing the cpu number that a given instance
of this per-cpu data should be accessed from.  In the case of a need for a
reschedule, the cpu stored in the struct is assigned the rescheule, rather than
the currently executing cpu

Tested successfully by myself.

Signed-off-by: Neil Horman <>
CC: David Miller <>
Signed-off-by: David S. Miller <>
Signed-off-by: Ben Hutchings <>
9 years agodrop_monitor: Make updating data->skb smp safe
Neil Horman [Fri, 27 Apr 2012 10:11:49 +0000 (10:11 +0000)]
drop_monitor: Make updating data->skb smp safe

commit 3885ca785a3618593226687ced84f3f336dc3860 upstream.

Eric Dumazet pointed out to me that the drop_monitor protocol has some holes in
its smp protections.  Specifically, its possible to replace data->skb while its
being written.  This patch corrects that by making data->skb an rcu protected
variable.  That will prevent it from being overwritten while a tracepoint is
modifying it.

Signed-off-by: Neil Horman <>
Reported-by: Eric Dumazet <>
CC: David Miller <>
Acked-by: Eric Dumazet <>
Signed-off-by: David S. Miller <>
Signed-off-by: Ben Hutchings <>
9 years agodrop_monitor: fix sleeping in invalid context warning
Neil Horman [Fri, 27 Apr 2012 10:11:48 +0000 (10:11 +0000)]
drop_monitor: fix sleeping in invalid context warning

commit cde2e9a651b76d8db36ae94cd0febc82b637e5dd upstream.

Eric Dumazet pointed out this warning in the drop_monitor protocol to me:

[   38.352571] BUG: sleeping function called from invalid context at kernel/mutex.c:85
[   38.352576] in_atomic(): 1, irqs_disabled(): 0, pid: 4415, name: dropwatch
[   38.352580] Pid: 4415, comm: dropwatch Not tainted 3.4.0-rc2+ #71
[   38.352582] Call Trace:
[   38.352592]  [<ffffffff8153aaf0>] ? trace_napi_poll_hit+0xd0/0xd0
[   38.352599]  [<ffffffff81063f2a>] __might_sleep+0xca/0xf0
[   38.352606]  [<ffffffff81655b16>] mutex_lock+0x26/0x50
[   38.352610]  [<ffffffff8153aaf0>] ? trace_napi_poll_hit+0xd0/0xd0
[   38.352616]  [<ffffffff810b72d9>] tracepoint_probe_register+0x29/0x90
[   38.352621]  [<ffffffff8153a585>] set_all_monitor_traces+0x105/0x170
[   38.352625]  [<ffffffff8153a8ca>] net_dm_cmd_trace+0x2a/0x40
[   38.352630]  [<ffffffff8154a81a>] genl_rcv_msg+0x21a/0x2b0
[   38.352636]  [<ffffffff810f8029>] ? zone_statistics+0x99/0xc0
[   38.352640]  [<ffffffff8154a600>] ? genl_rcv+0x30/0x30
[   38.352645]  [<ffffffff8154a059>] netlink_rcv_skb+0xa9/0xd0
[   38.352649]  [<ffffffff8154a5f0>] genl_rcv+0x20/0x30
[   38.352653]  [<ffffffff81549a7e>] netlink_unicast+0x1ae/0x1f0
[   38.352658]  [<ffffffff81549d76>] netlink_sendmsg+0x2b6/0x310
[   38.352663]  [<ffffffff8150824f>] sock_sendmsg+0x10f/0x130
[   38.352668]  [<ffffffff8150abe0>] ? move_addr_to_kernel+0x60/0xb0
[   38.352673]  [<ffffffff81515f04>] ? verify_iovec+0x64/0xe0
[   38.352677]  [<ffffffff81509c46>] __sys_sendmsg+0x386/0x390
[   38.352682]  [<ffffffff810ffaf9>] ? handle_mm_fault+0x139/0x210
[   38.352687]  [<ffffffff8165b5bc>] ? do_page_fault+0x1ec/0x4f0
[   38.352693]  [<ffffffff8106ba4d>] ? set_next_entity+0x9d/0xb0
[   38.352699]  [<ffffffff81310b49>] ? tty_ldisc_deref+0x9/0x10
[   38.352703]  [<ffffffff8106d363>] ? pick_next_task_fair+0x63/0x140
[   38.352708]  [<ffffffff8150b8d4>] sys_sendmsg+0x44/0x80
[   38.352713]  [<ffffffff8165f8e2>] system_call_fastpath+0x16/0x1b

It stems from holding a spinlock (trace_state_lock) while attempting to register
or unregister tracepoint hooks, making in_atomic() true in this context, leading
to the warning when the tracepoint calls might_sleep() while its taking a mutex.
Since we only use the trace_state_lock to prevent trace protocol state races, as
well as hardware stat list updates on an rcu write side, we can just convert the
spinlock to a mutex to avoid this problem.

Signed-off-by: Neil Horman <>
Reported-by: Eric Dumazet <>
CC: David Miller <>
Acked-by: Eric Dumazet <>
Signed-off-by: David S. Miller <>
Signed-off-by: Ben Hutchings <>
9 years agort2x00: Add support for BUFFALO WLI-UC-GNM2 to rt2800usb.
Jeongdo Son [Thu, 14 Jun 2012 17:28:01 +0000 (02:28 +0900)]
rt2x00: Add support for BUFFALO WLI-UC-GNM2 to rt2800usb.

commit a769f9577232afe2c754606a83aad85127e7052a upstream.

This is a RT3070 based device.

Signed-off-by: Jeongdo Son <>
Signed-off-by: John W. Linville <>
Signed-off-by: Ben Hutchings <>
9 years agodrm/i915: prefer wide & slow to fast & narrow in DP configs
Jesse Barnes [Thu, 21 Jun 2012 22:13:50 +0000 (15:13 -0700)]
drm/i915: prefer wide & slow to fast & narrow in DP configs

commit 2514bc510d0c3aadcc5204056bb440fa36845147 upstream.

High frequency link configurations have the potential to cause trouble
with long and/or cheap cables, so prefer slow and wide configurations
instead.  This patch has the potential to cause trouble for eDP
configurations that lie about available lanes, so if we run into that we
can make it conditional on eDP.

Signed-off-by: Jesse Barnes <>
Signed-off-by: Daniel Vetter <>
Signed-off-by: Ben Hutchings <>
9 years agom68k: Make sys_atomic_cmpxchg_32 work on classic m68k
Andreas Schwab [Fri, 27 Jul 2012 22:20:34 +0000 (00:20 +0200)]
m68k: Make sys_atomic_cmpxchg_32 work on classic m68k

commit 9e2760d18b3cf179534bbc27692c84879c61b97c upstream.

User space access must always go through uaccess accessors, since on
classic m68k user space and kernel space are completely separate.

Signed-off-by: Andreas Schwab <>
Tested-by: Thorsten Glaser <>
Signed-off-by: Geert Uytterhoeven <>
Signed-off-by: Ben Hutchings <>
9 years agoore: Fix out-of-bounds access in _ios_obj()
Boaz Harrosh [Wed, 1 Aug 2012 14:48:36 +0000 (17:48 +0300)]
ore: Fix out-of-bounds access in _ios_obj()

commit 9e62bb4458ad2cf28bd701aa5fab380b846db326 upstream.

_ios_obj() is accessed by group_index not device_table index.

The oc->comps array is only a group_full of devices at a time
it is not like ore_comp_dev() which is indexed by a global
device_table index.

This did not BUG until now because exofs only uses a single
COMP for all devices. But with other FSs like PanFS this is
not true.

This bug was only in the write_path, all other users were
using it correctly

[This is a bug since 3.2 Kernel]

Signed-off-by: Boaz Harrosh <>
Signed-off-by: Ben Hutchings <>
9 years agoALSA: hda - Support dock on Lenovo Thinkpad T530 with ALC269VC
Takashi Iwai [Thu, 2 Aug 2012 07:04:39 +0000 (09:04 +0200)]
ALSA: hda - Support dock on Lenovo Thinkpad T530 with ALC269VC

commit 707fba3fa76a4c8855552f5d4c1a12430c09bce8 upstream.

Lenovo Thinkpad T530 with ALC269VC codec has a dock port but BIOS
doesn't set up the pins properly.  Enable the pins as well as on
Thinkpad X230 Tablet.

Reported-and-tested-by: Mario <>
Signed-off-by: Takashi Iwai <>
Signed-off-by: Ben Hutchings <>
9 years agoALSA: snd-usb: fix clock source validity index
Daniel Mack [Wed, 1 Aug 2012 08:16:53 +0000 (10:16 +0200)]
ALSA: snd-usb: fix clock source validity index

commit aff252a848ce21b431ba822de3dab9c4c94571cb upstream.

uac_clock_source_is_valid() uses the control selector value to access
the bmControls bitmap of the clock source unit. This is wrong, as
control selector values start from 1, while the bitmap uses all
available bits.

In other words, "Clock Validity Control" is stored in D3..2, not D5..4
of the clock selector unit's bmControls.

Signed-off-by: Daniel Mack <>
Reported-by: Andreas Koch <>
Signed-off-by: Takashi Iwai <>
Signed-off-by: Ben Hutchings <>
9 years agomm: hugetlbfs: close race during teardown of hugetlbfs shared page tables
Mel Gorman [Tue, 31 Jul 2012 23:46:20 +0000 (16:46 -0700)]
mm: hugetlbfs: close race during teardown of hugetlbfs shared page tables

commit d833352a4338dc31295ed832a30c9ccff5c7a183 upstream.

If a process creates a large hugetlbfs mapping that is eligible for page
table sharing and forks heavily with children some of whom fault and
others which destroy the mapping then it is possible for page tables to
get corrupted.  Some teardowns of the mapping encounter a "bad pmd" and
output a message to the kernel log.  The final teardown will trigger a
BUG_ON in mm/filemap.c.

This was reproduced in 3.4 but is known to have existed for a long time
and goes back at least as far as 2.6.37.  It was probably was introduced
in 2.6.20 by [39dde65c: shared page table for hugetlb page].  The messages
look like this;

[  ..........] Lots of bad pmd messages followed by this
[  127.164256] mm/memory.c:391: bad pmd ffff880412e04fe8(80000003de4000e7).
[  127.164257] mm/memory.c:391: bad pmd ffff880412e04ff0(80000003de6000e7).
[  127.164258] mm/memory.c:391: bad pmd ffff880412e04ff8(80000003de0000e7).
[  127.186778] ------------[ cut here ]------------
[  127.186781] kernel BUG at mm/filemap.c:134!
[  127.186782] invalid opcode: 0000 [#1] SMP
[  127.186783] CPU 7
[  127.186784] Modules linked in: af_packet cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf ext3 jbd dm_mod coretemp crc32c_intel usb_storage ghash_clmulni_intel aesni_intel i2c_i801 r8169 mii uas sr_mod cdrom sg iTCO_wdt iTCO_vendor_support shpchp serio_raw cryptd aes_x86_64 e1000e pci_hotplug dcdbas aes_generic container microcode ext4 mbcache jbd2 crc16 sd_mod crc_t10dif i915 drm_kms_helper drm i2c_algo_bit ehci_hcd ahci libahci usbcore rtc_cmos usb_common button i2c_core intel_agp video intel_gtt fan processor thermal thermal_sys hwmon ata_generic pata_atiixp libata scsi_mod
[  127.186801]
[  127.186802] Pid: 9017, comm: hugetlbfs-test Not tainted 3.4.0-autobuild #53 Dell Inc. OptiPlex 990/06D7TR
[  127.186804] RIP: 0010:[<ffffffff810ed6ce>]  [<ffffffff810ed6ce>] __delete_from_page_cache+0x15e/0x160
[  127.186809] RSP: 0000:ffff8804144b5c08  EFLAGS: 00010002
[  127.186810] RAX: 0000000000000001 RBX: ffffea000a5c9000 RCX: 00000000ffffffc0
[  127.186811] RDX: 0000000000000000 RSI: 0000000000000009 RDI: ffff88042dfdad00
[  127.186812] RBP: ffff8804144b5c18 R08: 0000000000000009 R09: 0000000000000003
[  127.186813] R10: 0000000000000000 R11: 000000000000002d R12: ffff880412ff83d8
[  127.186814] R13: ffff880412ff83d8 R14: 0000000000000000 R15: ffff880412ff83d8
[  127.186815] FS:  00007fe18ed2c700(0000) GS:ffff88042dce0000(0000) knlGS:0000000000000000
[  127.186816] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  127.186817] CR2: 00007fe340000503 CR3: 0000000417a14000 CR4: 00000000000407e0
[  127.186818] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  127.186819] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  127.186820] Process hugetlbfs-test (pid: 9017, threadinfo ffff8804144b4000, task ffff880417f803c0)
[  127.186821] Stack:
[  127.186822]  ffffea000a5c9000 0000000000000000 ffff8804144b5c48 ffffffff810ed83b
[  127.186824]  ffff8804144b5c48 000000000000138a 0000000000001387 ffff8804144b5c98
[  127.186825]  ffff8804144b5d48 ffffffff811bc925 ffff8804144b5cb8 0000000000000000
[  127.186827] Call Trace:
[  127.186829]  [<ffffffff810ed83b>] delete_from_page_cache+0x3b/0x80
[  127.186832]  [<ffffffff811bc925>] truncate_hugepages+0x115/0x220
[  127.186834]  [<ffffffff811bca43>] hugetlbfs_evict_inode+0x13/0x30
[  127.186837]  [<ffffffff811655c7>] evict+0xa7/0x1b0
[  127.186839]  [<ffffffff811657a3>] iput_final+0xd3/0x1f0
[  127.186840]  [<ffffffff811658f9>] iput+0x39/0x50
[  127.186842]  [<ffffffff81162708>] d_kill+0xf8/0x130
[  127.186843]  [<ffffffff81162812>] dput+0xd2/0x1a0
[  127.186845]  [<ffffffff8114e2d0>] __fput+0x170/0x230
[  127.186848]  [<ffffffff81236e0e>] ? rb_erase+0xce/0x150
[  127.186849]  [<ffffffff8114e3ad>] fput+0x1d/0x30
[  127.186851]  [<ffffffff81117db7>] remove_vma+0x37/0x80
[  127.186853]  [<ffffffff81119182>] do_munmap+0x2d2/0x360
[  127.186855]  [<ffffffff811cc639>] sys_shmdt+0xc9/0x170
[  127.186857]  [<ffffffff81410a39>] system_call_fastpath+0x16/0x1b
[  127.186858] Code: 0f 1f 44 00 00 48 8b 43 08 48 8b 00 48 8b 40 28 8b b0 40 03 00 00 85 f6 0f 88 df fe ff ff 48 89 df e8 e7 cb 05 00 e9 d2 fe ff ff <0f> 0b 55 83 e2 fd 48 89 e5 48 83 ec 30 48 89 5d d8 4c 89 65 e0
[  127.186868] RIP  [<ffffffff810ed6ce>] __delete_from_page_cache+0x15e/0x160
[  127.186870]  RSP <ffff8804144b5c08>
[  127.186871] ---[ end trace 7cbac5d1db69f426 ]---

The bug is a race and not always easy to reproduce.  To reproduce it I was
doing the following on a single socket I7-based machine with 16G of RAM.

$ hugeadm --pool-pages-max DEFAULT:13G
$ echo $((18*1048576*1024)) > /proc/sys/kernel/shmmax
$ echo $((18*1048576*1024)) > /proc/sys/kernel/shmall
$ for i in `seq 1 9000`; do ./hugetlbfs-test; done

On my particular machine, it usually triggers within 10 minutes but
enabling debug options can change the timing such that it never hits.
Once the bug is triggered, the machine is in trouble and needs to be
rebooted.  The machine will respond but processes accessing proc like "ps
aux" will hang due to the BUG_ON.  shutdown will also hang and needs a
hard reset or a sysrq-b.

The basic problem is a race between page table sharing and teardown.  For
the most part page table sharing depends on i_mmap_mutex.  In some cases,
it is also taking the mm->page_table_lock for the PTE updates but with
shared page tables, it is the i_mmap_mutex that is more important.

Unfortunately it appears to be also insufficient. Consider the following

Process A Process B
--------- ---------
hugetlb_fault shmdt
    huge_pmd_unshare/unmap tables <--- (1)
  huge_pte_alloc       ...
    Lock(i_mmap_mutex)       ...
    vma_prio_walk, find svma, spte       ...
    Lock(mm->page_table_lock)       ...
    share spte       ...
    Unlock(mm->page_table_lock)       ...
    Unlock(i_mmap_mutex)       ...
  hugetlb_no_page   <--- (2)

In this scenario, it is possible for Process A to share page tables with
Process B that is trying to tear them down.  The i_mmap_mutex on its own
does not prevent Process A walking Process B's page tables.  At (1) above,
the page tables are not shared yet so it unmaps the PMDs.  Process A sets
up page table sharing and at (2) faults a new entry.  Process B then trips
up on it in free_pgtables.

This patch fixes the problem by adding a new function
__unmap_hugepage_range_final that is only called when the VMA is about to
be destroyed.  This function clears VM_MAYSHARE during
unmap_hugepage_range() under the i_mmap_mutex.  This makes the VMA
ineligible for sharing and avoids the race.  Superficially this looks like
it would then be vunerable to truncate and madvise issues but hugetlbfs
has its own truncate handlers so does not use unmap_mapping_range() and
does not support madvise(DONTNEED).

This should be treated as a -stable candidate if it is merged.

Test program is as follows. The test case was mostly written by Michal
Hocko with a few minor changes to reproduce this bug.

==== CUT HERE ====

static size_t huge_page_size = (2UL << 20);
static size_t nr_huge_page_A = 512;
static size_t nr_huge_page_B = 5632;

unsigned int get_random(unsigned int max)
struct timeval tv;

gettimeofday(&tv, NULL);
return random() % max;

static void play(void *addr, size_t size)
unsigned char *start = addr,
      *end = start + size,
start += get_random(size/2);

/* we could itterate on huge pages but let's give it more time. */
for (a = start; a < end; a += 4096)
*a = 0;

int main(int argc, char **argv)
key_t key = IPC_PRIVATE;
size_t sizeA = nr_huge_page_A * huge_page_size;
size_t sizeB = nr_huge_page_B * huge_page_size;
int shmidA, shmidB;
void *addrA = NULL, *addrB = NULL;
int nr_children = 300, n = 0;

if ((shmidA = shmget(key, sizeA, IPC_CREAT|SHM_HUGETLB|0660)) == -1) {
return 1;

if ((addrA = shmat(shmidA, addrA, SHM_R|SHM_W)) == (void *)-1UL) {
return 1;
if ((shmidB = shmget(key, sizeB, IPC_CREAT|SHM_HUGETLB|0660)) == -1) {
return 1;

if ((addrB = shmat(shmidB, addrB, SHM_R|SHM_W)) == (void *)-1UL) {
return 1;

switch(fork()) {
case 0:
switch (n%3) {
case 0:
play(addrA, sizeA);
case 1:
play(addrB, sizeB);
case 2:
case -1:
if (++n < nr_children)
goto fork_child;
play(addrA, sizeA);
do {
} while (--n > 0);
shmctl(shmidA, IPC_RMID, NULL);
shmctl(shmidB, IPC_RMID, NULL);
return 0;

[ name the declaration's args, fix CONFIG_HUGETLBFS=n build]
Signed-off-by: Hugh Dickins <>
Reviewed-by: Michal Hocko <>
Signed-off-by: Mel Gorman <>
Signed-off-by: Andrew Morton <>
Signed-off-by: Linus Torvalds <>
[bwh: Backported to 3.2:
 - Adjust context
 - Drop the mmu_gather * parameters]
Signed-off-by: Ben Hutchings <>
9 years agomm: mmu_notifier: fix freed page still mapped in secondary MMU
Xiao Guangrong [Tue, 31 Jul 2012 23:45:52 +0000 (16:45 -0700)]
mm: mmu_notifier: fix freed page still mapped in secondary MMU

commit 3ad3d901bbcfb15a5e4690e55350db0899095a68 upstream.

mmu_notifier_release() is called when the process is exiting.  It will
delete all the mmu notifiers.  But at this time the page belonging to the
process is still present in page tables and is present on the LRU list, so
this race will happen:

      CPU 0                 CPU 1
mmu_notifier_release:    try_to_unmap:
                                  mmu nofifler not found
                            free page  !!!!!!
                             * At the point, the page has been
                             * freed, but it is still mapped in
                             * the secondary MMU.

  mn->ops->release(mn, mm);

Then the box is not stable and sometimes we can get this bug:

[  738.075923] BUG: Bad page state in process migrate-perf  pfn:03bec
[  738.075931] page:ffffea00000efb00 count:0 mapcount:0 mapping:          (null) index:0x8076
[  738.075936] page flags: 0x20000000000014(referenced|dirty)

The same issue is present in mmu_notifier_unregister().

We can call ->release before deleting the notifier to ensure the page has
been unmapped from the secondary MMU before it is freed.

Signed-off-by: Xiao Guangrong <>
Cc: Avi Kivity <>
Cc: Marcelo Tosatti <>
Cc: Paul Gortmaker <>
Cc: Andrea Arcangeli <>
Signed-off-by: Andrew Morton <>
Signed-off-by: Linus Torvalds <>
Signed-off-by: Ben Hutchings <>
9 years agomm: setup pageblock_order before it's used by sparsemem
Xishi Qiu [Tue, 31 Jul 2012 23:43:19 +0000 (16:43 -0700)]
mm: setup pageblock_order before it's used by sparsemem

commit ca57df79d4f64e1a4886606af4289d40636189c5 upstream.

On architectures with CONFIG_HUGETLB_PAGE_SIZE_VARIABLE set, such as
Itanium, pageblock_order is a variable with default value of 0.  It's set
to the right value by set_pageblock_order() in function

But pageblock_order may be used by sparse_init() before free_area_init_core()
is called along path:
->((1UL << (PFN_SECTION_SHIFT - pageblock_order)) *

The uninitialized pageblock_size will cause memory wasting because
usemap_size() returns a much bigger value then it's really needed.

For example, on an Itanium platform,
sparse_init() pageblock_order=0 usemap_size=24576
free_area_init_core() before pageblock_order=0, usemap_size=24576
free_area_init_core() after pageblock_order=12, usemap_size=8

That means 24K memory has been wasted for each section, so fix it by calling
set_pageblock_order() from sparse_init().

Signed-off-by: Xishi Qiu <>
Signed-off-by: Jiang Liu <>
Cc: Tony Luck <>
Cc: Yinghai Lu <>
Cc: KAMEZAWA Hiroyuki <>
Cc: Benjamin Herrenschmidt <>
Cc: KOSAKI Motohiro <>
Cc: David Rientjes <>
Cc: Keping Chen <>
Signed-off-by: Andrew Morton <>
Signed-off-by: Linus Torvalds <>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <>
9 years agomm/page_alloc.c: remove pageblock_default_order()
Andrew Morton [Tue, 29 May 2012 22:06:31 +0000 (15:06 -0700)]
mm/page_alloc.c: remove pageblock_default_order()

commit 955c1cd7401565671b064e499115344ec8067dfd upstream.

This has always been broken: one version takes an unsigned int and the
other version takes no arguments.  This bug was hidden because one
version of set_pageblock_order() was a macro which doesn't evaluate its

Simplify it all and remove pageblock_default_order() altogether.

Reported-by: rajman mekaco <>
Cc: Mel Gorman <>
Cc: KAMEZAWA Hiroyuki <>
Cc: Tejun Heo <>
Cc: Minchan Kim <>
Signed-off-by: Andrew Morton <>
Signed-off-by: Linus Torvalds <>
Signed-off-by: Ben Hutchings <>
9 years agoASoC: wm8962: Allow VMID time to fully ramp
Mark Brown [Mon, 30 Jul 2012 17:24:19 +0000 (18:24 +0100)]
ASoC: wm8962: Allow VMID time to fully ramp

commit 9d40e5582c9c4cfb6977ba2a0ca9c2ed82c56f21 upstream.

Required for reliable power up from cold.

Signed-off-by: Mark Brown <>
Signed-off-by: Ben Hutchings <>
9 years agoUSB: echi-dbgp: increase the controller wait time to come out of halt.
Colin Ian King [Mon, 30 Jul 2012 15:06:42 +0000 (16:06 +0100)]
USB: echi-dbgp: increase the controller wait time to come out of halt.

commit f96a4216e85050c0a9d41a41ecb0ae9d8e39b509 upstream.

The default 10 microsecond delay for the controller to come out of
halt in dbgp_ehci_startup is too short, so increase it to 1 millisecond.

This is based on emperical testing on various USB debug ports on
modern machines such as a Lenovo X220i and an Ivybridge development
platform that needed to wait ~450-950 microseconds.

Signed-off-by: Colin Ian King <>
Signed-off-by: Jason Wessel <>
Signed-off-by: Ben Hutchings <>
9 years agoARM: Fix undefined instruction exception handling
Russell King [Mon, 30 Jul 2012 18:42:10 +0000 (19:42 +0100)]
ARM: Fix undefined instruction exception handling

commit 15ac49b65024f55c4371a53214879a9c77c4fbf9 upstream.

While trying to get a v3.5 kernel booted on the cubox, I noticed that
VFP does not work correctly with VFP bounce handling.  This is because
of the confusion over 16-bit vs 32-bit instructions, and where PC is
supposed to point to.

The rule is that FP handlers are entered with regs->ARM_pc pointing at
the _next_ instruction to be executed.  However, if the exception is
not handled, regs->ARM_pc points at the faulting instruction.

This is easy for ARM mode, because we know that the next instruction and
previous instructions are separated by four bytes.  This is not true of
Thumb2 though.

Since all FP instructions are 32-bit in Thumb2, it makes things easy.
We just need to select the appropriate adjustment.  Do this by moving
the adjustment out of do_undefinstr() into the assembly code, as only
the assembly code knows whether it's dealing with a 32-bit or 16-bit

Acked-by: Will Deacon <>
Signed-off-by: Russell King <>
Signed-off-by: Ben Hutchings <>
9 years agoARM: 7478/1: errata: extend workaround for erratum #720789
Will Deacon [Fri, 20 Jul 2012 17:24:55 +0000 (18:24 +0100)]
ARM: 7478/1: errata: extend workaround for erratum #720789

commit 5a783cbc48367cfc7b65afc75430953dfe60098f upstream.

Commit cdf357f1 ("ARM: 6299/1: errata: TLBIASIDIS and TLBIMVAIS
operations can broadcast a faulty ASID") replaced by-ASID TLB flushing
operations with all-ASID variants to workaround A9 erratum #720789.

This patch extends the workaround to include the tlb_range operations,
which were overlooked by the original patch.

Tested-by: Steve Capper <>
Signed-off-by: Will Deacon <>
Signed-off-by: Russell King <>
Signed-off-by: Ben Hutchings <>
9 years agoARM: 7477/1: vfp: Always save VFP state in vfp_pm_suspend on UP
Colin Cross [Fri, 20 Jul 2012 01:03:42 +0000 (02:03 +0100)]
ARM: 7477/1: vfp: Always save VFP state in vfp_pm_suspend on UP

commit 24b35521b8ddf088531258f06f681bb7b227bf47 upstream.

vfp_pm_suspend should save the VFP state in suspend after
any lazy context switch.  If it only saves when the VFP is enabled,
the state can get lost when, on a UP system:
  Thread 1 uses the VFP
  Context switch occurs to thread 2, VFP is disabled but the
     VFP context is not saved
  Thread 2 initiates suspend
  vfp_pm_suspend is called with the VFP disabled, and the unsaved
     VFP context of Thread 1 in the registers

Modify vfp_pm_suspend to save the VFP context whenever
vfp_current_hw_state is not NULL.

Includes a fix from Ido Yariv <>, who pointed out that on
SMP systems, the state pointer can be pointing to a freed task struct if
a task exited on another cpu, fixed by using #ifndef CONFIG_SMP in the
new if clause.

Cc: Barry Song <>
Cc: Catalin Marinas <>
Cc: Ido Yariv <>
Cc: Daniel Drake <>
Cc: Will Deacon <>
Signed-off-by: Colin Cross <>
Signed-off-by: Russell King <>
Signed-off-by: Ben Hutchings <>
9 years agoARM: 7476/1: vfp: only clear vfp state for current cpu in vfp_pm_suspend
Colin Cross [Fri, 20 Jul 2012 01:03:43 +0000 (02:03 +0100)]
ARM: 7476/1: vfp: only clear vfp state for current cpu in vfp_pm_suspend

commit a84b895a2348f0dbff31b71ddf954f70a6cde368 upstream.

vfp_pm_suspend runs on each cpu, only clear the hardware state
pointer for the current cpu.  Prevents a possible crash if one
cpu clears the hw state pointer when another cpu has already
checked if it is valid.

Signed-off-by: Colin Cross <>
Signed-off-by: Russell King <>
Signed-off-by: Ben Hutchings <>
9 years agoARM: 7467/1: mutex: use generic xchg-based implementation for ARMv6+
Will Deacon [Fri, 13 Jul 2012 18:15:40 +0000 (19:15 +0100)]
ARM: 7467/1: mutex: use generic xchg-based implementation for ARMv6+

commit a76d7bd96d65fa5119adba97e1b58d95f2e78829 upstream.

The open-coded mutex implementation for ARMv6+ cores suffers from a
severe lack of barriers, so in the uncontended case we don't actually
protect any accesses performed during the critical section.

Furthermore, the code is largely a duplication of the ARMv6+ atomic_dec
code but optimised to remove a branch instruction, as the mutex fastpath
was previously inlined. Now that this is executed out-of-line, we can
reuse the atomic access code for the locking (in fact, we use the xchg
code as this produces shorter critical sections).

This patch uses the generic xchg based implementation for mutexes on
ARMv6+, which introduces barriers to the lock/unlock operations and also
has the benefit of removing a fair amount of inline assembly code.

Acked-by: Arnd Bergmann <>
Acked-by: Nicolas Pitre <>
Reported-by: Shan Kang <>
Signed-off-by: Will Deacon <>
Signed-off-by: Russell King <>
Signed-off-by: Ben Hutchings <>
9 years agoARM: 7466/1: disable interrupt before spinning endlessly
Shawn Guo [Fri, 13 Jul 2012 07:19:34 +0000 (08:19 +0100)]
ARM: 7466/1: disable interrupt before spinning endlessly

commit 98bd8b96b26db3399a48202318dca4aaa2515355 upstream.

The CPU will endlessly spin at the end of machine_halt and
machine_restart calls.  However, this will lead to a soft lockup
warning after about 20 seconds, if CONFIG_LOCKUP_DETECTOR is enabled,
as system timer is still alive.

Disable interrupt before going to spin endlessly, so that the lockup
warning will never be seen.

Reported-by: Marek Vasut <>
Signed-off-by: Shawn Guo <>
Signed-off-by: Russell King <>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <>
9 years agoSUNRPC: return negative value in case rpcbind client creation error
Stanislav Kinsbursky [Fri, 20 Jul 2012 11:57:48 +0000 (15:57 +0400)]
SUNRPC: return negative value in case rpcbind client creation error

commit caea33da898e4e14f0ba58173e3b7689981d2c0b upstream.

Without this patch kernel will panic on LockD start, because lockd_up() checks
lockd_up_net() result for negative value.
From my pow it's better to return negative value from rpcbind routines instead
of replacing all such checks like in lockd_up().

Signed-off-by: Stanislav Kinsbursky <>
Signed-off-by: Trond Myklebust <>
Signed-off-by: Ben Hutchings <>
9 years agonilfs2: fix deadlock issue between chcp and thaw ioctls
Ryusuke Konishi [Mon, 30 Jul 2012 21:42:07 +0000 (14:42 -0700)]
nilfs2: fix deadlock issue between chcp and thaw ioctls

commit 572d8b3945a31bee7c40d21556803e4807fd9141 upstream.

An fs-thaw ioctl causes deadlock with a chcp or mkcp -s command:

 chcp            D ffff88013870f3d0     0  1325   1324 0x00000004
 Call Trace:
   nilfs_transaction_begin+0x11c/0x1a0 [nilfs2]
   copy_from_user+0x18/0x30 [nilfs2]
   nilfs_ioctl_change_cpmode+0x7d/0xcf [nilfs2]
   nilfs_ioctl+0x252/0x61a [nilfs2]
 thaw            D ffff88013870d890     0  1352   1351 0x00000004
 Call Trace:

where the thaw ioctl deadlocked at thaw_super() when called while chcp was
waiting at nilfs_transaction_begin() called from
nilfs_ioctl_change_cpmode().  This deadlock is 100% reproducible.

This is because nilfs_ioctl_change_cpmode() first locks sb->s_umount in
read mode and then waits for unfreezing in nilfs_transaction_begin(),
whereas thaw_super() locks sb->s_umount in write mode.  The locking of
sb->s_umount here was intended to make snapshot mounts and the downgrade
of snapshots to checkpoints exclusive.

This fixes the deadlock issue by replacing the sb->s_umount usage in
nilfs_ioctl_change_cpmode() with a dedicated mutex which protects snapshot

Signed-off-by: Ryusuke Konishi <>
Cc: Fernando Luis Vazquez Cao <>
Tested-by: Ryusuke Konishi <>
Signed-off-by: Andrew Morton <>
Signed-off-by: Linus Torvalds <>
Signed-off-by: Ben Hutchings <>
9 years agolib/vsprintf.c: kptr_restrict: fix pK-error in SysRq show-all-timers(Q)
Dan Rosenberg [Mon, 30 Jul 2012 21:40:26 +0000 (14:40 -0700)]
lib/vsprintf.c: kptr_restrict: fix pK-error in SysRq show-all-timers(Q)

commit 3715c5309f6d175c3053672b73fd4f73be16fd07 upstream.

When using ALT+SysRq+Q all the pointers are replaced with "pK-error" like

[23153.208033]   .base:               pK-error

with echo h > /proc/sysrq-trigger it works:

[23107.776363]   .base:       ffff88023e60d540

The intent behind this behavior was to return "pK-error" in cases where
the %pK format specifier was used in interrupt context, because the
CAP_SYSLOG check wouldn't be meaningful.  Clearly this should only apply
when kptr_restrict is actually enabled though.

Reported-by: Stevie Trujillo <>
Signed-off-by: Dan Rosenberg <>
Signed-off-by: Andrew Morton <>
Signed-off-by: Linus Torvalds <>
Signed-off-by: Ben Hutchings <>
9 years agopcdp: use early_ioremap/early_iounmap to access pcdp table
Greg Pearson [Mon, 30 Jul 2012 21:39:05 +0000 (14:39 -0700)]
pcdp: use early_ioremap/early_iounmap to access pcdp table

commit 6c4088ac3a4d82779903433bcd5f048c58fb1aca upstream.

efi_setup_pcdp_console() is called during boot to parse the HCDP/PCDP
EFI system table and setup an early console for printk output.  The
routine uses ioremap/iounmap to setup access to the HCDP/PCDP table

The call to ioremap is happening early in the boot process which leads
to a panic on x86_64 systems:


This replaces the calls to ioremap/iounmap in efi_setup_pcdp_console()
with calls to early_ioremap/early_iounmap which can be called during
early boot.

This patch was tested on an x86_64 prototype system which uses the
HCDP/PCDP table for early console setup.

Signed-off-by: Greg Pearson <>
Acked-by: Khalid Aziz <>
Signed-off-by: Andrew Morton <>
Signed-off-by: Linus Torvalds <>
Signed-off-by: Ben Hutchings <>
9 years agomd/raid1: don't abort a resync on the first badblock.
NeilBrown [Tue, 31 Jul 2012 00:05:34 +0000 (10:05 +1000)]
md/raid1: don't abort a resync on the first badblock.

commit b7219ccb33aa0df9949a60c68b5e9f712615e56f upstream.

If a resync of a RAID1 array with 2 devices finds a known bad block
one device it will neither read from, or write to, that device for
this block offset.
So there will be one read_target (The other device) and zero write
This condition causes md/raid1 to abort the resync assuming that it
has finished - without known bad blocks this would be true.

When there are no write targets because of the presence of bad blocks
we should only skip over the area covered by the bad block.
RAID10 already gets this right, raid1 doesn't.  Or didn't.

As this can cause a 'sync' to abort early and appear to have succeeded
it could lead to some data corruption, so it suitable for -stable.

Reported-by: Alexander Lyakas <>
Signed-off-by: NeilBrown <>
Signed-off-by: Ben Hutchings <>
9 years agonfs: skip commit in releasepage if we're freeing memory for fs-related reasons
Jeff Layton [Mon, 23 Jul 2012 17:58:51 +0000 (13:58 -0400)]
nfs: skip commit in releasepage if we're freeing memory for fs-related reasons

commit 5cf02d09b50b1ee1c2d536c9cf64af5a7d433f56 upstream.

We've had some reports of a deadlock where rpciod ends up with a stack
trace like this:

    PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
     #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
     #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
     #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
     #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
     #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
     #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
     #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
     #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
     #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
     #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
    #10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
    #11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
    #12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
    #13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
    #14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
    #15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
    #16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
    #17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
    #18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
    #19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
    #20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
    #21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
    #22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
    #23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
    #24 [ffff8810343bfee8] kthread at ffffffff8108dd96
    #25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca

rpciod is trying to allocate memory for a new socket to talk to the
server. The VM ends up calling ->releasepage to get more memory, and it
tries to do a blocking commit. That commit can't succeed however without
a connected socket, so we deadlock.

Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
socket allocation, and having nfs_release_page check for that flag when
deciding whether to do a commit call. Also, set PF_FSTRANS
unconditionally in rpc_async_schedule since that function can also do
allocations sometimes.

Signed-off-by: Jeff Layton <>
Signed-off-by: Trond Myklebust <>
Signed-off-by: Ben Hutchings <>
9 years agos390/mm: fix fault handling for page table walk case
Heiko Carstens [Fri, 27 Jul 2012 07:45:39 +0000 (09:45 +0200)]
s390/mm: fix fault handling for page table walk case

commit 008c2e8f247f0a8db1e8e26139da12f3a3abcda0 upstream.

Make sure the kernel does not incorrectly create a SIGBUS signal during
user space accesses:

For user space accesses in the switched addressing mode case the kernel
may walk page tables and access user address space via the kernel
mapping. If a page table entry is invalid the function __handle_fault()
gets called in order to emulate a page fault and trigger all the usual
actions like paging in a missing page etc. by calling handle_mm_fault().

If handle_mm_fault() returns with an error fixup handling is necessary.
For the switched addressing mode case all errors need to be mapped to
-EFAULT, so that the calling uaccess function can return -EFAULT to
user space.

Unfortunately the __handle_fault() incorrectly calls do_sigbus() if
VM_FAULT_SIGBUS is set. This however should only happen if a page fault
was triggered by a user space instruction. For kernel mode uaccesses
the correct action is to only return -EFAULT.
So user space may incorrectly see SIGBUS signals because of this bug.

For current machines this would only be possible for the switched
addressing mode case in conjunction with futex operations.

Signed-off-by: Heiko Carstens <>
Signed-off-by: Martin Schwidefsky <>
[bwh: Backported to 3.2: do_exception() and do_sigbus() parameters differ]
Signed-off-by: Ben Hutchings <>
9 years agovirtio-blk: Use block layer provided spinlock
Asias He [Fri, 25 May 2012 08:03:27 +0000 (16:03 +0800)]
virtio-blk: Use block layer provided spinlock

commit 2c95a3290919541b846bee3e0fbaa75860929f53 upstream.

Block layer will allocate a spinlock for the queue if the driver does
not provide one in blk_init_queue().

The reason to use the internal spinlock is that blk_cleanup_queue() will
switch to use the internal spinlock in the cleanup code path.

        if (q->queue_lock != &q->__queue_lock)
                q->queue_lock = &q->__queue_lock;

However, processes which are in D state might have taken the driver
provided spinlock, when the processes wake up, they would release the
block provided spinlock.

[ BUG: bad unlock balance detected! ]
3.4.0-rc7+ #238 Not tainted
fio/3587 is trying to release lock (&(&q->__queue_lock)->rlock) at:
[<ffffffff813274d2>] blk_queue_bio+0x2a2/0x380
but there are no more locks to release!

other info that might help us debug this:
1 lock held by fio/3587:
 #0:  (&(&vblk->lock)->rlock){......}, at:
[<ffffffff8132661a>] get_request_wait+0x19a/0x250

Other drivers use block layer provided spinlock as well, e.g. SCSI.

Switching to the block layer provided spinlock saves a bit of memory and
does not increase lock contention. Performance test shows no real
difference is observed before and after this patch.

Changes in v2: Improve commit log as Michael suggested.

Signed-off-by: Asias He <>
Acked-by: Michael S. Tsirkin <>
Signed-off-by: Rusty Russell <>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <>
9 years agoasus-wmi: use ASUS_WMI_METHODID_DSTS2 as default DSTS ID.
Alex Hung [Wed, 20 Jun 2012 03:47:35 +0000 (11:47 +0800)]
asus-wmi: use ASUS_WMI_METHODID_DSTS2 as default DSTS ID.

commit 63a78bb1051b240417daad3a3fa9c1bb10646dca upstream.

According to responses from the BIOS team, ASUS_WMI_METHODID_DSTS2
(0x53545344) will be used as future DSTS ID. In addition, calling
asus_wmi_evaluate_method(ASUS_WMI_METHODID_DSTS2, 0, 0, NULL) returns
ASUS_WMI_UNSUPPORTED_METHOD in new ASUS laptop PCs. This patch fixes
no DSTS ID will be assigned in this case.

Signed-off-by: Alex Hung <>
Signed-off-by: Matthew Garrett <>
Signed-off-by: Ben Hutchings <>
9 years agorandom: mix in architectural randomness in extract_buf()
H. Peter Anvin [Sat, 28 Jul 2012 02:26:08 +0000 (22:26 -0400)]
random: mix in architectural randomness in extract_buf()

commit d2e7c96af1e54b507ae2a6a7dd2baf588417a7e5 upstream.

Mix in any architectural randomness in extract_buf() instead of
xfer_secondary_buf().  This allows us to mix in more architectural
randomness, and it also makes xfer_secondary_buf() faster, moving a
tiny bit of additional CPU overhead to process which is extracting the

[ Commit description modified by tytso to remove an extended
  advertisement for the RDRAND instruction. ]

Signed-off-by: H. Peter Anvin <>
Acked-by: Ingo Molnar <>
Cc: DJ Johnston <>
Signed-off-by: Theodore Ts'o <>
Signed-off-by: Ben Hutchings <>
9 years agodm thin: fix memory leak in process_prepared_mapping error paths
Joe Thornber [Fri, 27 Jul 2012 14:08:05 +0000 (15:08 +0100)]
dm thin: fix memory leak in process_prepared_mapping error paths

commit 905386f82d08f66726912f303f3e6605248c60a3 upstream.

Fix memory leak in process_prepared_mapping by always freeing
the dm_thin_new_mapping structs from the mapping_pool mempool on
the error paths.

Signed-off-by: Joe Thornber <>
Signed-off-by: Mike Snitzer <>
Signed-off-by: Alasdair G Kergon <>
Signed-off-by: Ben Hutchings <>
9 years agodm thin: reduce endio_hook pool size
Alasdair G Kergon [Fri, 27 Jul 2012 14:07:57 +0000 (15:07 +0100)]
dm thin: reduce endio_hook pool size

commit 7768ed33ccdc02801c4483fc5682dc66ace14aea upstream.

Reduce the slab size used for the dm_thin_endio_hook mempool.

Allocation has been seen to fail on machines with smaller amounts
of memory due to fragmentation.

  lvm: page allocation failure. order:5, mode:0xd0
  device-mapper: table: 253:38: thin-pool: Error creating pool's endio_hook mempool

Signed-off-by: Alasdair G Kergon <>
Signed-off-by: Ben Hutchings <>
9 years agoRedefine ATOMIC_INIT and ATOMIC64_INIT to drop the casts
Tony Luck [Thu, 26 Jul 2012 17:55:26 +0000 (10:55 -0700)]
Redefine ATOMIC_INIT and ATOMIC64_INIT to drop the casts

commit a119365586b0130dfea06457f584953e0ff6481d upstream.

The following build error occured during a ia64 build with
swap-over-NFS patches applied.

net/core/sock.c:274:36: error: initializer element is not constant
net/core/sock.c:274:36: error: (near initialization for 'memalloc_socks')
net/core/sock.c:274:36: error: initializer element is not constant

This is identical to a parisc build error. Fengguang Wu, Mel Gorman
and James Bottomley did all the legwork to track the root cause of
the problem. This fix and entire commit log is shamelessly copied
from them with one extra detail to change a dubious runtime use of
ATOMIC_INIT() to atomic_set() in drivers/char/mspec.c

Dave Anglin says:
> Here is the line in sock.i:
> struct static_key memalloc_socks = ((struct static_key) { .enabled =
> ((atomic_t) { (0) }) });

The above line contains two compound literals.  It also uses a designated
initializer to initialize the field enabled.  A compound literal is not a
constant expression.

The location of the above statement isn't fully clear, but if a compound
literal occurs outside the body of a function, the initializer list must
consist of constant expressions.

Signed-off-by: Tony Luck <>
Signed-off-by: Ben Hutchings <>
9 years agos390/mm: downgrade page table after fork of a 31 bit process
Martin Schwidefsky [Thu, 26 Jul 2012 06:53:06 +0000 (08:53 +0200)]
s390/mm: downgrade page table after fork of a 31 bit process

commit 0f6f281b731d20bfe75c13f85d33f3f05b440222 upstream.

The downgrade of the 4 level page table created by init_new_context is
currently done only in start_thread31. If a 31 bit process forks the
new mm uses a 4 level page table, including the task size of 2<<42
that goes along with it. This is incorrect as now a 31 bit process
can map memory beyond 2GB. Define arch_dup_mmap to do the downgrade
after fork.

Signed-off-by: Martin Schwidefsky <>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <>
9 years agox86, nops: Missing break resulting in incorrect selection on Intel
Alan Cox [Wed, 25 Jul 2012 15:28:19 +0000 (16:28 +0100)]
x86, nops: Missing break resulting in incorrect selection on Intel

commit d6250a3f12edb3a86db9598ffeca3de8b4a219e9 upstream.

The Intel case falls through into the generic case which then changes
the values.  For cases like the P6 it doesn't do the right thing so
this seems to be a screwup.

Signed-off-by: Alan Cox <>
Signed-off-by: H. Peter Anvin <>
Signed-off-by: Ben Hutchings <>
9 years agoALSA: mpu401: Fix missing initialization of irq field
Takashi Iwai [Mon, 23 Jul 2012 09:35:55 +0000 (11:35 +0200)]
ALSA: mpu401: Fix missing initialization of irq field

commit bc733d495267a23ef8660220d696c6e549ce30b3 upstream.

The irq field of struct snd_mpu401 is supposed to be initialized to -1.
Since it's set to zero as of now, a probing error before the irq
installation results in a kernel warning "Trying to free already-free
IRQ 0".

Signed-off-by: Takashi Iwai <>
Signed-off-by: Ben Hutchings <>
9 years agoALSA: hda - Fix invalid D3 of headphone DAC on VT202x codecs
Takashi Iwai [Wed, 25 Jul 2012 11:54:55 +0000 (13:54 +0200)]
ALSA: hda - Fix invalid D3 of headphone DAC on VT202x codecs

commit 6162552b0de6ba80937c3dd53e084967851cd199 upstream.

We've got a bug report about the silent output from the headphone on a
mobo with VT2021, and spotted out that this was because of the wrong
D3 state on the DAC for the headphone output.  The bug is triggered by
the incomplete check for this DAC in set_widgets_power_state_vt1718S().
It checks only the connectivity of the primary output (0x27) but
doesn't consider the path from the headphone pin (0x28).

Now this patch fixes the problem by checking both pins for DAC 0x0b.

Signed-off-by: Takashi Iwai <>
[bwh: Backported to 3.2: keep using snd_hda_codec_write() as
 update_power_state() is missing]
Signed-off-by: Ben Hutchings <>
9 years agoInput: synaptics - handle out of bounds values from the hardware
Seth Forshee [Wed, 25 Jul 2012 06:54:11 +0000 (23:54 -0700)]
Input: synaptics - handle out of bounds values from the hardware

commit c0394506e69b37c47d391c2a7bbea3ea236d8ec8 upstream.

The touchpad on the Acer Aspire One D250 will report out of range values
in the extreme lower portion of the touchpad. These appear as abrupt
changes in the values reported by the hardware from very low values to
very high values, which can cause unexpected vertical jumps in the
position of the mouse pointer.

What seems to be happening is that the value is wrapping to a two's
compliment negative value of higher resolution than the 13-bit value
reported by the hardware, with the high-order bits being truncated. This
patch adds handling for these values by converting them to the
appropriate negative values.

The only tricky part about this is deciding when to treat a number as
negative. It stands to reason that if out of range values can be
reported on the low end then it could also happen on the high end, so
not all out of range values should be treated as negative. The approach
taken here is to split the difference between the maximum legitimate
value for the axis and the maximum possible value that the hardware can
report, treating values greater than this number as negative and all
other values as positive. This can be tweaked later if hardware is found
that operates outside of these parameters.

Signed-off-by: Seth Forshee <>
Reviewed-by: Daniel Kurtz <>
Signed-off-by: Dmitry Torokhov <>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <>
9 years agovideo/smscufx: fix line counting in fb_write
Alexander Holler [Fri, 20 Apr 2012 22:11:07 +0000 (00:11 +0200)]
video/smscufx: fix line counting in fb_write

commit 2fe2d9f47cfe1a3e66e7d087368b3d7155b04c15 upstream.

Line 0 and 1 were both written to line 0 (on the display) and all subsequent
lines had an offset of -1. The result was that the last line on the display
was never overwritten by writes to /dev/fbN.

The origin of this bug seems to have been udlfb.

Signed-off-by: Alexander Holler <>
Signed-off-by: Florian Tobias Schandinat <>
Signed-off-by: Ben Hutchings <>
9 years agofutex: Forbid uaddr == uaddr2 in futex_wait_requeue_pi()
Darren Hart [Fri, 20 Jul 2012 18:53:31 +0000 (11:53 -0700)]
futex: Forbid uaddr == uaddr2 in futex_wait_requeue_pi()

commit 6f7b0a2a5c0fb03be7c25bd1745baa50582348ef upstream.

If uaddr == uaddr2, then we have broken the rule of only requeueing
from a non-pi futex to a pi futex with this call. If we attempt this,
as the trinity test suite manages to do, we miss early wakeups as
q.key is equal to key2 (because they are the same uaddr). We will then
attempt to dereference the pi_mutex (which would exist had the futex_q
been properly requeued to a pi futex) and trigger a NULL pointer

Signed-off-by: Darren Hart <>
Cc: Dave Jones <>
Signed-off-by: Thomas Gleixner <>
Signed-off-by: Ben Hutchings <>
9 years agofutex: Fix bug in WARN_ON for NULL q.pi_state
Darren Hart [Fri, 20 Jul 2012 18:53:30 +0000 (11:53 -0700)]
futex: Fix bug in WARN_ON for NULL q.pi_state

commit f27071cb7fe3e1d37a9dbe6c0dfc5395cd40fa43 upstream.

The WARN_ON in futex_wait_requeue_pi() for a NULL q.pi_state was testing
the address (&q.pi_state) of the pointer instead of the value
(q.pi_state) of the pointer. Correct it accordingly.

Signed-off-by: Darren Hart <>
Cc: Dave Jones <>
Signed-off-by: Thomas Gleixner <>
Signed-off-by: Ben Hutchings <>
9 years agofutex: Test for pi_mutex on fault in futex_wait_requeue_pi()
Darren Hart [Fri, 20 Jul 2012 18:53:29 +0000 (11:53 -0700)]
futex: Test for pi_mutex on fault in futex_wait_requeue_pi()

commit b6070a8d9853eda010a549fa9a09eb8d7269b929 upstream.

If fixup_pi_state_owner() faults, pi_mutex may be NULL. Test
for pi_mutex != NULL before testing the owner against current
and possibly unlocking it.

Signed-off-by: Darren Hart <>
Cc: Dave Jones <>
Cc: Dan Carpenter <>
Signed-off-by: Thomas Gleixner <>
Signed-off-by: Ben Hutchings <>
9 years agoASoC: wm8994: Ensure there are enough BCLKs for four channels
Mark Brown [Fri, 22 Jun 2012 16:21:17 +0000 (17:21 +0100)]
ASoC: wm8994: Ensure there are enough BCLKs for four channels

commit b8edf3e5522735c8ce78b81845f7a1a2d4a08626 upstream.

Otherwise if someone tries to use all four channels on AIF1 with the
device in master mode we won't be able to clock out all the data.

Signed-off-by: Mark Brown <>
Signed-off-by: Ben Hutchings <>
9 years agomfd: wm831x: Feed the device UUID into device_add_randomness()
Mark Brown [Thu, 5 Jul 2012 20:23:21 +0000 (20:23 +0000)]
mfd: wm831x: Feed the device UUID into device_add_randomness()

commit 27130f0cc3ab97560384da437e4621fc4e94f21c upstream.

wm831x devices contain a unique ID value. Feed this into the newly added
device_add_randomness() to add some per device seed data to the pool.

Signed-off-by: Mark Brown <>
Signed-off-by: Theodore Ts'o <>
Signed-off-by: Ben Hutchings <>
9 years agortc: wm831x: Feed the write counter into device_add_randomness()
Mark Brown [Thu, 5 Jul 2012 20:19:17 +0000 (20:19 +0000)]
rtc: wm831x: Feed the write counter into device_add_randomness()

commit 9dccf55f4cb011a7552a8a2749a580662f5ed8ed upstream.

The tamper evident features of the RTC include the "write counter" which
is a pseudo-random number regenerated whenever we set the RTC. Since this
value is unpredictable it should provide some useful seeding to the random
number generator.

Only do this on boot since the goal is to seed the pool rather than add
useful entropy.

Signed-off-by: Mark Brown <>
Signed-off-by: Theodore Ts'o <>
Signed-off-by: Ben Hutchings <>
9 years agorandom: add new get_random_bytes_arch() function
Theodore Ts'o [Thu, 5 Jul 2012 14:35:23 +0000 (10:35 -0400)]
random: add new get_random_bytes_arch() function

commit c2557a303ab6712bb6e09447df828c557c710ac9 upstream.

Create a new function, get_random_bytes_arch() which will use the
architecture-specific hardware random number generator if it is
present.  Change get_random_bytes() to not use the HW RNG, even if it
is avaiable.

The reason for this is that the hw random number generator is fast (if
it is present), but it requires that we trust the hardware
manufacturer to have not put in a back door.  (For example, an
increasing counter encrypted by an AES key known to the NSA.)

It's unlikely that Intel (for example) was paid off by the US
Government to do this, but it's impossible for them to prove otherwise

9 years agorandom: use the arch-specific rng in xfer_secondary_pool
Theodore Ts'o [Thu, 5 Jul 2012 14:21:01 +0000 (10:21 -0400)]
random: use the arch-specific rng in xfer_secondary_pool

commit e6d4947b12e8ad947add1032dd754803c6004824 upstream.

If the CPU supports a hardware random number generator, use it in
xfer_secondary_pool(), where it will significantly improve things and
where we can afford it.

Also, remove the use of the arch-specific rng in
add_timer_randomness(), since the call is significantly slower than
get_cycles(), and we're much better off using it in
xfer_secondary_pool() anyway.

Signed-off-by: "Theodore Ts'o" <>
Signed-off-by: Ben Hutchings <>
9 years agonet: feed /dev/random with the MAC address when registering a device
Theodore Ts'o [Thu, 5 Jul 2012 01:23:25 +0000 (21:23 -0400)]
net: feed /dev/random with the MAC address when registering a device

commit 7bf2357524408b97fec58344caf7397f8140c3fd upstream.

Cc: David Miller <>
Cc: Linus Torvalds <>
Signed-off-by: "Theodore Ts'o" <>
Signed-off-by: Ben Hutchings <>
9 years agousb: feed USB device information to the /dev/random driver
Theodore Ts'o [Wed, 4 Jul 2012 15:22:20 +0000 (11:22 -0400)]
usb: feed USB device information to the /dev/random driver

commit b04b3156a20d395a7faa8eed98698d1e17a36000 upstream.

Send the USB device's serial, product, and manufacturer strings to the
/dev/random driver to help seed its pools.

Cc: Linus Torvalds <>
Acked-by: Greg KH <>
Signed-off-by: "Theodore Ts'o" <>
Signed-off-by: Ben Hutchings <>
9 years agorandom: create add_device_randomness() interface
Linus Torvalds [Wed, 4 Jul 2012 15:16:01 +0000 (11:16 -0400)]
random: create add_device_randomness() interface

commit a2080a67abe9e314f9e9c2cc3a4a176e8a8f8793 upstream.

Add a new interface, add_device_randomness() for adding data to the
random pool that is likely to differ between two devices (or possibly
even per boot).  This would be things like MAC addresses or serial
numbers, or the read-out of the RTC. This does *not* add any actual
entropy to the pool, but it initializes the pool to different values
for devices that might otherwise be identical and have very little
entropy available to them (particularly common in the embedded world).

[ Modified by tytso to mix in a timestamp, since there may be some
  variability caused by the time needed to detect/configure the hardware
  in question. ]

Signed-off-by: Linus Torvalds <>
Signed-off-by: "Theodore Ts'o" <>
Signed-off-by: Ben Hutchings <>
9 years agorandom: use lockless techniques in the interrupt path
Theodore Ts'o [Wed, 4 Jul 2012 14:38:30 +0000 (10:38 -0400)]
random: use lockless techniques in the interrupt path

commit 902c098a3663de3fa18639efbb71b6080f0bcd3c upstream.

The real-time Linux folks don't like add_interrupt_randomness() taking
a spinlock since it is called in the low-level interrupt routine.
This also allows us to reduce the overhead in the fast path, for the
random driver, which is the interrupt collection path.

Signed-off-by: "Theodore Ts'o" <>
Signed-off-by: Ben Hutchings <>
9 years agorandom: make 'add_interrupt_randomness()' do something sane
Theodore Ts'o [Mon, 2 Jul 2012 11:52:16 +0000 (07:52 -0400)]
random: make 'add_interrupt_randomness()' do something sane

commit 775f4b297b780601e61787b766f306ed3e1d23eb upstream.

We've been moving away from add_interrupt_randomness() for various
reasons: it's too expensive to do on every interrupt, and flooding the
CPU with interrupts could theoretically cause bogus floods of entropy
from a somewhat externally controllable source.

This solves both problems by limiting the actual randomness addition
to just once a second or after 64 interrupts, whicever comes first.
During that time, the interrupt cycle data is buffered up in a per-cpu
pool.  Also, we make sure the the nonblocking pool used by urandom is
initialized before we start feeding the normal input pool.  This
assures that /dev/urandom is returning unpredictable data as soon as

(Based on an original patch by Linus, but significantly modified by

Tested-by: Eric Wustrow <>
Reported-by: Eric Wustrow <>
Reported-by: Nadia Heninger <>
Reported-by: Zakir Durumeric <>
Reported-by: J. Alex Halderman <>.
Signed-off-by: Linus Torvalds <>
Signed-off-by: "Theodore Ts'o" <>
Signed-off-by: Ben Hutchings <>
9 years agorandom: Adjust the number of loops when initializing
H. Peter Anvin [Mon, 16 Jan 2012 19:23:29 +0000 (11:23 -0800)]
random: Adjust the number of loops when initializing

commit 2dac8e54f988ab58525505d7ef982493374433c3 upstream.

When we are initializing using arch_get_random_long() we only need to
loop enough times to touch all the bytes in the buffer; using
poolwords for that does twice the number of operations necessary on a
64-bit machine, since in the random number generator code "word" means
32 bits.

Signed-off-by: H. Peter Anvin <>
Cc: "Theodore Ts'o" <>
Signed-off-by: Ben Hutchings <>
9 years agorandom: Use arch-specific RNG to initialize the entropy store
Theodore Ts'o [Thu, 22 Dec 2011 21:28:01 +0000 (16:28 -0500)]
random: Use arch-specific RNG to initialize the entropy store

commit 3e88bdff1c65145f7ba297ccec69c774afe4c785 upstream.

If there is an architecture-specific random number generator (such as
RDRAND for Intel architectures), use it to initialize /dev/random's
entropy stores.  Even in the worst case, if RDRAND is something like
AES(NSA_KEY, counter++), it won't hurt, and it will definitely help
against any other adversaries.

Signed-off-by: "Theodore Ts'o" <>
Signed-off-by: H. Peter Anvin <>
Signed-off-by: Ben Hutchings <>
9 years agorandom: Use arch_get_random_int instead of cycle counter if avail
Linus Torvalds [Thu, 22 Dec 2011 19:36:22 +0000 (11:36 -0800)]
random: Use arch_get_random_int instead of cycle counter if avail

commit cf833d0b9937874b50ef2867c4e8badfd64948ce upstream.

We still don't use rdrand in /dev/random, which just seems stupid. We
accept the *cycle*counter* as a random input, but we don't accept
rdrand? That's just broken.

Sure, people can do things in user space (write to /dev/random, use
rdrand in addition to /dev/random themselves etc etc), but that
*still* seems to be a particularly stupid reason for saying "we
shouldn't bother to try to do better in /dev/random".

And even if somebody really doesn't trust rdrand as a source of random
bytes, it seems singularly stupid to trust the cycle counter *more*.

So I'd suggest the attached patch. I'm not going to even bother
arguing that we should add more bits to the entropy estimate, because
that's not the point - I don't care if /dev/random fills up slowly or
not, I think it's just stupid to not use the bits we can get from
rdrand and mix them into the strong randomness pool.

Acked-by: "David S. Miller" <>
Acked-by: "Theodore Ts'o" <>
Acked-by: Herbert Xu <>
Cc: Matt Mackall <>
Cc: Tony Luck <>
Cc: Eric Dumazet <>
Signed-off-by: H. Peter Anvin <>
Signed-off-by: Ben Hutchings <>
9 years agonfsd4: our filesystems are normally case sensitive
J. Bruce Fields [Tue, 5 Jun 2012 20:52:06 +0000 (16:52 -0400)]
nfsd4: our filesystems are normally case sensitive

commit 2930d381d22b9c56f40dd4c63a8fa59719ca2c3c upstream.

Actually, xfs and jfs can optionally be case insensitive; we'll handle
that case in later patches.

Signed-off-by: J. Bruce Fields <>
Signed-off-by: Ben Hutchings <>