pandora-kernel.git
11 years agobnx2x: remove false warning regarding interrupt number
Ariel Elior [Thu, 20 Sep 2012 05:26:41 +0000 (05:26 +0000)]
bnx2x: remove false warning regarding interrupt number

Since version 7.4 the FW configures in the pci config space the max
number of interrupts available to the physical function, instead of
the exact number to use.
This causes a false warning in driver when comparing the number of
configured interrupts to the number about to be used.

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoARM: reserve syscall 378 for kcmp
Russell King [Fri, 21 Sep 2012 16:55:20 +0000 (17:55 +0100)]
ARM: reserve syscall 378 for kcmp

kcmp has appeared on x86, but has not been noticed because
checksyscalls.sh is broken at the moment.  Reserve ARM syscall 378
for this should we ever need it, and add an __IGNORE entry for this
unimplemented syscall.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
11 years agox86/kbuild: archscripts depends on scripts_basic
Jeff Mahoney [Thu, 20 Sep 2012 14:28:45 +0000 (10:28 -0400)]
x86/kbuild: archscripts depends on scripts_basic

While building the SUSE kernel packages, which build the scripts,
make clean, and then build everything, we have been running into spurious
build failures. We tracked them down to a simple dependency issue:

$ make mrproper
  CLEAN   arch/x86/tools
  CLEAN   scripts/basic
$ cp patches/config/x86_64/desktop .config
$ make archscripts
  HOSTCC  arch/x86/tools/relocs
/bin/sh: scripts/basic/fixdep: No such file or directory
make[3]: *** [arch/x86/tools/relocs] Error 1
make[2]: *** [archscripts] Error 2
make[1]: *** [sub-make] Error 2
make: *** [all] Error 2

This was introduced by commit
6520fe55 (x86, realmode: 16-bit real-mode code support for relocs),
which added the archscripts dependency to archprepare.

This patch adds the scripts_basic dependency to the x86 archscripts.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Michal Marek <mmarek@suse.cz>
11 years agofirmware: fix directory creation rule matching with make 3.80
Mark Asselstine [Wed, 19 Sep 2012 20:30:44 +0000 (16:30 -0400)]
firmware: fix directory creation rule matching with make 3.80

Since make 3.80 doesn't support secondary expansion it uses a fallback
rule to create firmware directories which is matched after primary
expansion of the $(installed-fw) rule's prerequisite. Commit
6c7080a61fc7 [firmware: fix directory creation rule matching with make
3.82] changed the expression generated after primary expansion such
that the fallback was not matched. Updating the fallback rule to match
the new look primary expansion is not an option for various reasons.

The trailing slash added here to $(INSTALL_FW_PATH)/. while defining
installed-fw-dirs fixes builds with make 3.82 since this will provide
a matching rule for $(INSTALL_FW_PATH)/$$(dir %) when % is in the base
firmware directory (ie. $(dir %) gives './'). Versions of make prior
to 3.82 will strip this trailing slash along with the one generated by
$(dir %) when % is in the base firmware directory and as such continue
to function as before.

Signed-off-by: Mark Asselstine <mark.asselstine@windriver.com>
Tested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Michal Marek <mmarek@suse.cz>
11 years agocan: ti_hecc: fix oops during rmmod
Marc Kleine-Budde [Wed, 19 Sep 2012 12:58:45 +0000 (14:58 +0200)]
can: ti_hecc: fix oops during rmmod

This patch fixes an oops which occurs when unloading the driver, while the
network interface is still up. The problem is that first the io mapping is
teared own, then the CAN device is unregistered, resulting in accessing the
hardware's iomem:

[  172.744232] Unable to handle kernel paging request at virtual address c88b0040
[  172.752441] pgd = c7be4000
[  172.755645] [c88b0040] *pgd=87821811, *pte=00000000, *ppte=00000000
[  172.762207] Internal error: Oops: 807 [#1] PREEMPT ARM
[  172.767517] Modules linked in: ti_hecc(-) can_dev
[  172.772430] CPU: 0    Not tainted  (3.5.0alpha-00037-g3554cc0 #126)
[  172.778961] PC is at ti_hecc_close+0xb0/0x100 [ti_hecc]
[  172.784423] LR is at __dev_close_many+0x90/0xc0
[  172.789123] pc : [<bf00c768>]    lr : [<c033be58>]    psr: 60000013
[  172.789123] sp : c5c1de68  ip : 00040081  fp : 00000000
[  172.801025] r10: 00000001  r9 : c5c1c000  r8 : 00100100
[  172.806457] r7 : c5d0a48c  r6 : c5d0a400  r5 : 00000000  r4 : c5d0a000
[  172.813232] r3 : c88b0000  r2 : 00000001  r1 : c5d0a000  r0 : c5d0a000
[  172.820037] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[  172.827423] Control: 10c5387d  Table: 87be4019  DAC: 00000015
[  172.833404] Process rmmod (pid: 600, stack limit = 0xc5c1c2f0)
[  172.839447] Stack: (0xc5c1de68 to 0xc5c1e000)
[  172.843994] de60:                   bf00c6b8 c5c1dec8 c5d0a000 c5d0a000 00200200 c033be58
[  172.852478] de80: c5c1de44 c5c1dec8 c5c1dec8 c033bf2c c5c1de90 c5c1de90 c5d0a084 c5c1de44
[  172.860992] dea0: c5c1dec8 c033c098 c061d3dc c5d0a000 00000000 c05edf28 c05edb34 c000d724
[  172.869476] dec0: 00000000 c033c2f8 c5d0a084 c5d0a084 00000000 c033c370 00000000 c5d0a000
[  172.877990] dee0: c05edb00 c033c3b8 c5d0a000 bf00d3ac c05edb00 bf00d7c8 bf00d7c8 c02842dc
[  172.886474] df00: c02842c8 c0282f90 c5c1c000 c05edb00 bf00d7c8 c0283668 bf00d7c8 00000000
[  172.894989] df20: c0611f98 befe2f80 c000d724 c0282d10 bf00d804 00000000 00000013 c0068a8c
[  172.903472] df40: c5c538e8 685f6974 00636365 c61571a8 c5cb9980 c61571a8 c6158a20 c00c9bc4
[  172.911987] df60: 00000000 00000000 c5cb9980 00000000 c5cb9980 00000000 c7823680 00000006
[  172.920471] df80: bf00d804 00000880 c5c1df8c 00000000 000d4267 befe2f80 00000001 b6d90068
[  172.928985] dfa0: 00000081 c000d5a0 befe2f80 00000001 befe2f80 00000880 b6d90008 00000008
[  172.937469] dfc0: befe2f80 00000001 b6d90068 00000081 00000001 00000000 befe2eac 00000000
[  172.945983] dfe0: 00000000 befe2b18 00023ba4 b6e6addc 60000010 befe2f80 a8e00190 86d2d344
[  172.954498] [<bf00c768>] (ti_hecc_close+0xb0/0x100 [ti_hecc]) from [<c033be58>] (__dev__registered_many+0xc0/0x2a0)
[  172.984161] [<c033c098>] (rollback_registered_many+0xc0/0x2a0) from [<c033c2f8>] (rollback_registered+0x20/0x30)
[  172.994750] [<c033c2f8>] (rollback_registered+0x20/0x30) from [<c033c370>] (unregister_netdevice_queue+0x68/0x98)
[  173.005401] [<c033c370>] (unregister_netdevice_queue+0x68/0x98) from [<c033c3b8>] (unregister_netdev+0x18/0x20)
[  173.015899] [<c033c3b8>] (unregister_netdev+0x18/0x20) from [<bf00d3ac>] (ti_hecc_remove+0x60/0x80 [ti_hecc])
[  173.026245] [<bf00d3ac>] (ti_hecc_remove+0x60/0x80 [ti_hecc]) from [<c02842dc>] (platform_drv_remove+0x14/0x18)
[  173.036712] [<c02842dc>] (platform_drv_remove+0x14/0x18) from [<c0282f90>] (__device_release_driver+0x7c/0xbc)

Cc: stable <stable@vger.kernel.org>
Cc: Anant Gole <anantgole@ti.com>
Tested-by: Jan Luebbe <jlu@pengutronix.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
11 years agocan: janz-ican3: fix support for older hardware revisions
Ira W. Snyder [Tue, 11 Sep 2012 22:58:15 +0000 (15:58 -0700)]
can: janz-ican3: fix support for older hardware revisions

The Revision 1.0 Janz CMOD-IO Carrier Board does not have support for
the reset registers. To support older hardware, the code is changed to
use the hardware reset register on the Janz VMOD-ICAN3 hardware itself.

Cc: stable <stable@vger.kernel.org>
Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
11 years agoMerge branch 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel...
Dave Airlie [Fri, 21 Sep 2012 10:46:01 +0000 (20:46 +1000)]
Merge branch 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel into drm-fixes

Daniel writes:
Essentially just flush my -fixes queue before I head off to xdc.
- gen2 regression fixer, we've enabled the lvds stuff too late. Not
  causing any known issues, but this restores the sequence before a
  refactor that landed in 3.5, and lvds is a fickle beast. And seriously,
  who runs gen2 still ...
- downgrade a BUG to a WARN - we haven't root-caused/fixed the underlying
  issue yet, but this should help bug reporters quite a bit.
- properly disable hdmi audio - we've lost track of this, which resulted
  in the alsa driver again losing track of the unplug event.

* 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel:
  drm/i915: HDMI - Clear Audio Enable bit for Hot Plug
  drm/i915: Reduce a pin-leak BUG into a WARN
  drm/i915: enable lvds pin pairs before dpll on gen2

11 years agodrm/nouveau: add dmi quirk for gpio reset
Dave Airlie [Fri, 21 Sep 2012 13:19:50 +0000 (09:19 -0400)]
drm/nouveau: add dmi quirk for gpio reset

This fixes the gpio reset problem so the Retina MBP works, but avoids
breaking the Dell systems. Ben will work on a better solution for 3.7.

Tested by me on retina MBP.

Signed-off-by: Dave Airlie <airlied@redhat.com>
11 years agonet: do not disable sg for packets requiring no checksum
Ed Cashin [Wed, 19 Sep 2012 15:49:00 +0000 (15:49 +0000)]
net: do not disable sg for packets requiring no checksum

A change in a series of VLAN-related changes appears to have
inadvertently disabled the use of the scatter gather feature of
network cards for transmission of non-IP ethernet protocols like ATA
over Ethernet (AoE).  Below is a reference to the commit that
introduces a "harmonize_features" function that turns off scatter
gather when the NIC does not support hardware checksumming for the
ethernet protocol of an sk buff.

  commit f01a5236bd4b140198fbcc550f085e8361fd73fa
  Author: Jesse Gross <jesse@nicira.com>
  Date:   Sun Jan 9 06:23:31 2011 +0000

      net offloading: Generalize netif_get_vlan_features().

The can_checksum_protocol function is not equipped to consider a
protocol that does not require checksumming.  Calling it for a
protocol that requires no checksum is inappropriate.

The patch below has harmonize_features call can_checksum_protocol when
the protocol needs a checksum, so that the network layer is not forced
to perform unnecessary skb linearization on the transmission of AoE
packets.  Unnecessary linearization results in decreased performance
and increased memory pressure, as reported here:

  http://www.spinics.net/lists/linux-mm/msg15184.html

The problem has probably not been widely experienced yet, because
only recently has the kernel.org-distributed aoe driver acquired the
ability to use payloads of over a page in size, with the patchset
recently included in the mm tree:

  https://lkml.org/lkml/2012/8/28/140

The coraid.com-distributed aoe driver already could use payloads of
greater than a page in size, but its users generally do not use the
newest kernels.

Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoaoe: assert AoE packets marked as requiring no checksum
Ed Cashin [Wed, 19 Sep 2012 15:46:39 +0000 (15:46 +0000)]
aoe: assert AoE packets marked as requiring no checksum

In order for the network layer to see that AoE requires
no checksumming in a generic way, the packets must be
marked as requiring no checksum, so we make this requirement
explicit with the assertion.

Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoat91ether: return PTR_ERR if call to clk_get fails
Devendra Naga [Wed, 19 Sep 2012 21:04:36 +0000 (21:04 +0000)]
at91ether: return PTR_ERR if call to clk_get fails

we are currently returning ENODEV, as the clk_get may give a exact
error code in its returned pointer, assign it to the ret by using the
PTR_ERR function, so that the subsequent goto label will jump to the
error path and clean the driver and return the error correctly.

Signed-off-by: Devendra Naga <devendra.aaru@gmail.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoxfrm_user: don't copy esn replay window twice for new states
Mathias Krause [Wed, 19 Sep 2012 11:33:43 +0000 (11:33 +0000)]
xfrm_user: don't copy esn replay window twice for new states

The ESN replay window was already fully initialized in
xfrm_alloc_replay_state_esn(). No need to copy it again.

Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoxfrm_user: ensure user supplied esn replay window is valid
Mathias Krause [Thu, 20 Sep 2012 10:01:49 +0000 (10:01 +0000)]
xfrm_user: ensure user supplied esn replay window is valid

The current code fails to ensure that the netlink message actually
contains as many bytes as the header indicates. If a user creates a new
state or updates an existing one but does not supply the bytes for the
whole ESN replay window, the kernel copies random heap bytes into the
replay bitmap, the ones happen to follow the XFRMA_REPLAY_ESN_VAL
netlink attribute. This leads to following issues:

1. The replay window has random bits set confusing the replay handling
   code later on.

2. A malicious user could use this flaw to leak up to ~3.5kB of heap
   memory when she has access to the XFRM netlink interface (requires
   CAP_NET_ADMIN).

Known users of the ESN replay window are strongSwan and Steffen's
iproute2 patch (<http://patchwork.ozlabs.org/patch/85962/>). The latter
uses the interface with a bitmap supplied while the former does not.
strongSwan is therefore prone to run into issue 1.

To fix both issues without breaking existing userland allow using the
XFRMA_REPLAY_ESN_VAL netlink attribute with either an empty bitmap or a
fully specified one. For the former case we initialize the in-kernel
bitmap with zero, for the latter we copy the user supplied bitmap. For
state updates the full bitmap must be supplied.

To prevent overflows in the bitmap length calculation the maximum size
of bmp_len is limited to 128 by this patch -- resulting in a maximum
replay window of 4096 packets. This should be sufficient for all real
life scenarios (RFC 4303 recommends a default replay window size of 64).

Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Martin Willi <martin@revosec.ch>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoxfrm_user: fix info leak in copy_to_user_tmpl()
Mathias Krause [Wed, 19 Sep 2012 11:33:41 +0000 (11:33 +0000)]
xfrm_user: fix info leak in copy_to_user_tmpl()

The memory used for the template copy is a local stack variable. As
struct xfrm_user_tmpl contains multiple holes added by the compiler for
alignment, not initializing the memory will lead to leaking stack bytes
to userland. Add an explicit memset(0) to avoid the info leak.

Initial version of the patch by Brad Spengler.

Cc: Brad Spengler <spender@grsecurity.net>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoxfrm_user: fix info leak in copy_to_user_policy()
Mathias Krause [Wed, 19 Sep 2012 11:33:40 +0000 (11:33 +0000)]
xfrm_user: fix info leak in copy_to_user_policy()

The memory reserved to dump the xfrm policy includes multiple padding
bytes added by the compiler for alignment (padding bytes in struct
xfrm_selector and struct xfrm_userpolicy_info). Add an explicit
memset(0) before filling the buffer to avoid the heap info leak.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoxfrm_user: fix info leak in copy_to_user_state()
Mathias Krause [Wed, 19 Sep 2012 11:33:39 +0000 (11:33 +0000)]
xfrm_user: fix info leak in copy_to_user_state()

The memory reserved to dump the xfrm state includes the padding bytes of
struct xfrm_usersa_info added by the compiler for alignment (7 for
amd64, 3 for i386). Add an explicit memset(0) before filling the buffer
to avoid the info leak.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoxfrm_user: fix info leak in copy_to_user_auth()
Mathias Krause [Wed, 19 Sep 2012 11:33:38 +0000 (11:33 +0000)]
xfrm_user: fix info leak in copy_to_user_auth()

copy_to_user_auth() fails to initialize the remainder of alg_name and
therefore discloses up to 54 bytes of heap memory via netlink to
userland.

Use strncpy() instead of strcpy() to fill the trailing bytes of alg_name
with null bytes.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet: qmi_wwan: adding Huawei E367, ZTE MF683 and Pantech P4200
Bjørn Mork [Wed, 19 Sep 2012 10:03:36 +0000 (10:03 +0000)]
net: qmi_wwan: adding Huawei E367, ZTE MF683 and Pantech P4200

One of the modes of Huawei E367 has this QMI/wwan interface:

 I:* If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=01 Prot=07 Driver=(none)
 E:  Ad=83(I) Atr=03(Int.) MxPS=  64 Ivl=2ms
 E:  Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
 E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=4ms

Huawei use subclass and protocol to identify vendor specific
functions, so adding a new vendor rule for this combination.

The Pantech devices UML290 (106c:3718) and P4200 (106c:3721) use
the same subclass to identify the QMI/wwan function.  Replace the
existing device specific UML290 entries with generic vendor matching,
adding support for the Pantech P4200.

The ZTE MF683 has 6 vendor specific interfaces, all using
ff/ff/ff for cls/sub/prot.  Adding a match on interface #5 which
is a QMI/wwan interface.

Cc: Fangxiaozhi (Franko) <fangxiaozhi@huawei.com>
Cc: Thomas Schäfer <tschaefer@t-online.de>
Cc: Dan Williams <dcbw@redhat.com>
Cc: Shawn J. Goff <shawn7400@gmail.com>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agotcp: restore rcv_wscale in a repair mode (v2)
Andrey Vagin [Wed, 19 Sep 2012 09:40:00 +0000 (09:40 +0000)]
tcp: restore rcv_wscale in a repair mode (v2)

rcv_wscale is a symetric parameter with snd_wscale.

Both this parameters are set on a connection handshake.

Without this value a remote window size can not be interpreted correctly,
because a value from a packet should be shifted on rcv_wscale.

And one more thing is that wscale_ok should be set too.

This patch doesn't break a backward compatibility.
If someone uses it in a old scheme, a rcv window
will be restored with the same bug (rcv_wscale = 0).

v2: Save backward compatibility on big-endian system. Before
    the first two bytes were snd_wscale and the second two bytes were
    rcv_wscale. Now snd_wscale is opt_val & 0xFFFF and rcv_wscale >> 16.
    This approach is independent on byte ordering.

Cc: David S. Miller <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
CC: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Andrew Vagin <avagin@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoMerge branch 'drm-fixes-3.6' of git://people.freedesktop.org/~agd5f/linux into drm...
Dave Airlie [Thu, 20 Sep 2012 20:50:40 +0000 (06:50 +1000)]
Merge branch 'drm-fixes-3.6' of git://people.freedesktop.org/~agd5f/linux into drm-fixes

fixes a resume regression on pre-r6xx asics.

* 'drm-fixes-3.6' of git://people.freedesktop.org/~agd5f/linux:
  drm/radeon: Prevent leak of scratch register on resume from suspend

11 years agotracing: Don't call page_to_pfn() if page is NULL
Wen Congyang [Thu, 20 Sep 2012 06:04:47 +0000 (14:04 +0800)]
tracing: Don't call page_to_pfn() if page is NULL

When allocating memory fails, page is NULL. page_to_pfn() will
cause the kernel panicked if we don't use sparsemem vmemmap.

Link: http://lkml.kernel.org/r/505AB1FF.8020104@cn.fujitsu.com
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: stable <stable@vger.kernel.org>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Reviewed-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
11 years agodrm/radeon: Prevent leak of scratch register on resume from suspend
Simon Kitching [Thu, 20 Sep 2012 16:59:16 +0000 (12:59 -0400)]
drm/radeon: Prevent leak of scratch register on resume from suspend

Cards typically have 5-7 scratch registers; one of these is reserved for
rdev->rptr_save_reg. Unfortunately the reservation is done in function
r100_cp_init, which is called by all drivers except r600 - and this
function is also invoked on resume from suspend. After several resumes,
no scratch registers are free and graphics acceleration is disabled.

Dmesg then reports either:
   *ERROR* radeon: cp failed to get scratch reg (-22).
   *ERROR* radeon: cp isn't working(-22).
   radeon 0000:01:00.0: failed initializing CP (-22).
or:
   *ERROR* radeon: failed to get scratch reg (-22).
   *ERROR* radeon: failed testing IB on GFX ring (-22).
   *ERROR* ib ring test failed (-22).

The chain of calls on boot for all except r600 is:
radeon_init -> ... -> (rXXX_init) -> rXXX_startup -> r100_cp_init

The chain of calls on resume for all except r600 is:
rXXX_resume -> rXXX_startup -> r100_cp_init.

R600 correctly allocates rptr_save_reg in r600_init (ie once only, not
in resume). However moving the code into the init functions for all
drivers means touching 4 drivers. So instead, this patch just adds a
test in r100_cp_init to avoid reallocating on resume. As the rdev
structure is allocated via kzalloc in radeon_driver_load_kms, and zero
is not a valid registerid, zero safely implies not-yet-allocated.

This issue appears to have been introduced in c7eff978 (3.6.0-rcN)

Signed-off-by: Simon Kitching <skitching@vonos.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 years agoRevert "drm/nv50-/gpio: initialise to vbios defaults during init"
Dave Airlie [Thu, 20 Sep 2012 11:00:15 +0000 (21:00 +1000)]
Revert "drm/nv50-/gpio: initialise to vbios defaults during init"

This reverts commit 991083ba60f89e717e3a4175be96d48a810e9eae.

We discovered this causes problem on some Dell eDP laptops, so Apple
lose out for now, I might try and whip up a dmi based workaround for 3.6
but I'm not sure I'll get time.

Signed-off-by: Dave Airlie <airlied@redhat.com>
11 years agoMerge branch 'drm-fixes-3.6' of git://people.freedesktop.org/~agd5f/linux into drm...
Dave Airlie [Thu, 20 Sep 2012 10:48:31 +0000 (20:48 +1000)]
Merge branch 'drm-fixes-3.6' of git://people.freedesktop.org/~agd5f/linux into drm-fixes

The pll fix ended up causing some regressions.  Drop it for 3.6.  I've
fixed it properly in 3.7, but the fix is too invasive for 3.6.

* 'drm-fixes-3.6' of git://people.freedesktop.org/~agd5f/linux:
  Revert "drm/radeon: rework pll selection (v3)"

11 years agoInput: edt-ft5x06 - return -EFAULT on copy_to_user() error
Axel Lin [Wed, 19 Sep 2012 22:56:23 +0000 (15:56 -0700)]
Input: edt-ft5x06 - return -EFAULT on copy_to_user() error

copy_to_user() returns the number of bytes remaining, but we want a
negative error code here.

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
11 years agoInput: sentelic - filter out erratic movement when lifting finger
Tai-hwa Liang [Wed, 19 Sep 2012 18:10:47 +0000 (11:10 -0700)]
Input: sentelic - filter out erratic movement when lifting finger

When lifing finger off the surface some versions of touchpad send movement
packets with very low coordinates, which cause cursor to jump to the upper
left corner of the screen. Let's ignore least significant bits of X and Y
coordinates if higher bits are all zeroes and consider finger not touching
the pad.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=43197
Reported-and-tested-by: Aleksey Spiridonov <leks13@leks13.ru>
Tested-by: Eddie Dunn <eddie.dunn@gmail.com>
Tested-by: Jakub Luzny <limoto94@gmail.com>
Tested-by: Olivier Goffart <olivier@woboq.com>
Signed-off-by: Tai-hwa Liang <avatar@sentelic.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
11 years agoInput: ambakmi - [un]prepare clocks when enabling amd disabling
Pawel Moll [Wed, 19 Sep 2012 18:10:47 +0000 (11:10 -0700)]
Input: ambakmi - [un]prepare clocks when enabling amd disabling

Clocks must be prepared before enabling and unprepared
after disabling. Use appropriate functions to do this
in one go.

Signed-off-by: Pawel Moll <pawel.moll@arm.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
11 years agoInput: i8042 - disable mux on Toshiba C850D
Anisse Astier [Wed, 19 Sep 2012 18:10:48 +0000 (11:10 -0700)]
Input: i8042 - disable mux on Toshiba C850D

On Toshiba Satellite C850D, the touchpad and the keyboard might randomly
not work at boot. Preventing MUX mode activation solves this issue.

Signed-off-by: Anisse Astier <anisse@astier.eu>
Cc: stable@kernel.org
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
11 years agonet/core: fix comment in skb_try_coalesce
Li RongQing [Tue, 18 Sep 2012 16:53:21 +0000 (16:53 +0000)]
net/core: fix comment in skb_try_coalesce

It should be the skb which is not cloned

Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoasix: Support DLink DUB-E100 H/W Ver C1
Søren holm [Mon, 17 Sep 2012 21:50:57 +0000 (21:50 +0000)]
asix: Support DLink DUB-E100 H/W Ver C1

Signed-off-by: Søren Holm <sgh@sgh.dk>
Cc: stable@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoMerge branch 'clkdev' into fixes
Russell King [Wed, 19 Sep 2012 21:04:48 +0000 (22:04 +0100)]
Merge branch 'clkdev' into fixes

11 years agoARM: 7535/1: Reprogram smp_twd based on new common clk framework notifiers
Mike Turquette [Thu, 13 Sep 2012 21:04:18 +0000 (22:04 +0100)]
ARM: 7535/1: Reprogram smp_twd based on new common clk framework notifiers

Running cpufreq driver on imx6q, the following warning is seen.

$ BUG: sleeping function called from invalid context at kernel/mutex.c:269

<snip>

stack backtrace:
Backtrace:
[<80011d64>] (dump_backtrace+0x0/0x10c) from [<803fc164>] (dump_stack+0x18/0x1c)
 r6:bf8142e0 r5:bf814000 r4:806ac794 r3:bf814000
[<803fc14c>] (dump_stack+0x0/0x1c) from [<803fd444>] (print_usage_bug+0x250/0x2b
8)
[<803fd1f4>] (print_usage_bug+0x0/0x2b8) from [<80060f90>] (mark_lock+0x56c/0x67
0)
[<80060a24>] (mark_lock+0x0/0x670) from [<80061a20>] (__lock_acquire+0x98c/0x19b
4)
[<80061094>] (__lock_acquire+0x0/0x19b4) from [<80062f14>] (lock_acquire+0x68/0x
7c)
[<80062eac>] (lock_acquire+0x0/0x7c) from [<80400f28>] (mutex_lock_nested+0x78/0
x344)
 r7:00000000 r6:bf872000 r5:805cc858 r4:805c2a04
[<80400eb0>] (mutex_lock_nested+0x0/0x344) from [<803089ac>] (clk_get_rate+0x1c/
0x58)
[<80308990>] (clk_get_rate+0x0/0x58) from [<80013c48>] (twd_update_frequency+0x1
8/0x50)
 r5:bf253d04 r4:805cadf4
[<80013c30>] (twd_update_frequency+0x0/0x50) from [<80068e20>] (generic_smp_call
_function_single_interrupt+0xd4/0x13c)
 r4:bf873ee0 r3:80013c30
[<80068d4c>] (generic_smp_call_function_single_interrupt+0x0/0x13c) from [<80013
34c>] (handle_IPI+0xc0/0x194)
 r8:00000001 r7:00000000 r6:80574e48 r5:bf872000 r4:80593958
[<8001328c>] (handle_IPI+0x0/0x194) from [<800084e8>] (gic_handle_irq+0x58/0x60)
 r8:00000000 r7:bf873f8c r6:bf873f58 r5:80593070 r4:f4000100
r3:00000005
[<80008490>] (gic_handle_irq+0x0/0x60) from [<8000e124>] (__irq_svc+0x44/0x60)
Exception stack(0xbf873f58 to 0xbf873fa0)
3f40:                                                       00000001 00000001
3f60: 00000000 bf814000 bf872000 805cab48 80405aa4 80597648 00000000 412fc09a
3f80: bf872000 bf873fac bf873f70 bf873fa0 80063844 8000f1f8 20000013 ffffffff
 r6:ffffffff r5:20000013 r4:8000f1f8 r3:bf814000
[<8000f1b8>] (default_idle+0x0/0x4c) from [<8000f428>] (cpu_idle+0x98/0x114)
[<8000f390>] (cpu_idle+0x0/0x114) from [<803f9834>] (secondary_start_kernel+0x11
c/0x140)
[<803f9718>] (secondary_start_kernel+0x0/0x140) from [<103f9234>] (0x103f9234)
 r6:10c03c7d r5:0000001f r4:4f86806a r3:803f921c

It looks that the warning is caused by that twd_update_frequency() gets
called from an atomic context while it calls clk_get_rate() where a
mutex gets held.

To fix the warning, let's convert common clk users over to clk notifiers
in place of CPUfreq notifiers.  This works out nicely for Cortex-A9
MPcore designs that scale all CPUs at the same frequency.

Platforms that have not been converted to the common clk framework and
support CPUfreq will rely on the old mechanism.  Once these platforms
are converted over fully then we can remove the CPUfreq-specific bits
for good.

Signed-off-by: Mike Turquette <mturquette@linaro.org>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
11 years agoARM: 7537/1: clk: Fix release in devm_clk_put()
Mark Brown [Wed, 19 Sep 2012 11:43:21 +0000 (12:43 +0100)]
ARM: 7537/1: clk: Fix release in devm_clk_put()

Surprisingly devres_destroy() doesn't call the destructor for the
resource it is destroying, use the newly added devres_release() instead
to fix this.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
11 years agopkt_sched: fix virtual-start-time update in QFQ
Paolo Valente [Sat, 15 Sep 2012 00:41:35 +0000 (00:41 +0000)]
pkt_sched: fix virtual-start-time update in QFQ

If the old timestamps of a class, say cl, are stale when the class
becomes active, then QFQ may assign to cl a much higher start time
than the maximum value allowed. This may happen when QFQ assigns to
the start time of cl the finish time of a group whose classes are
characterized by a higher value of the ratio
max_class_pkt/weight_of_the_class with respect to that of
cl. Inserting a class with a too high start time into the bucket list
corrupts the data structure and may eventually lead to crashes.
This patch limits the maximum start time assigned to a class.

Signed-off-by: Paolo Valente <paolo.valente@unimore.it>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agotcp: flush DMA queue before sk_wait_data if rcv_wnd is zero
Michal Kubeček [Fri, 14 Sep 2012 04:59:52 +0000 (04:59 +0000)]
tcp: flush DMA queue before sk_wait_data if rcv_wnd is zero

If recv() syscall is called for a TCP socket so that
  - IOAT DMA is used
  - MSG_WAITALL flag is used
  - requested length is bigger than sk_rcvbuf
  - enough data has already arrived to bring rcv_wnd to zero
then when tcp_recvmsg() gets to calling sk_wait_data(), receive
window can be still zero while sk_async_wait_queue exhausts
enough space to keep it zero. As this queue isn't cleaned until
the tcp_service_net_dma() call, sk_wait_data() cannot receive
any data and blocks forever.

If zero receive window and non-empty sk_async_wait_queue is
detected before calling sk_wait_data(), process the queue first.

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agobatman-adv: make batadv_test_bit() return 0 or 1 only
Linus Lüssing [Fri, 14 Sep 2012 00:40:54 +0000 (00:40 +0000)]
batman-adv: make batadv_test_bit() return 0 or 1 only

On some architectures test_bit() can return other values than 0 or 1:

With a generic x86 OpenWrt image in a kvm setup (batadv_)test_bit()
frequently returns -1 for me, leading to batadv_iv_ogm_update_seqnos()
wrongly signaling a protected seqno window.

This patch tries to fix this issue by making batadv_test_bit() return 0
or 1 only.

Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Acked-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoxen/boot: Disable BIOS SMP MP table search.
Konrad Rzeszutek Wilk [Wed, 19 Sep 2012 12:30:55 +0000 (08:30 -0400)]
xen/boot: Disable BIOS SMP MP table search.

As the initial domain we are able to search/map certain regions
of memory to harvest configuration data. For all low-level we
use ACPI tables - for interrupts we use exclusively ACPI _PRT
(so DSDT) and MADT for INT_SRC_OVR.

The SMP MP table is not used at all. As a matter of fact we do
not even support machines that only have SMP MP but no ACPI tables.

Lets follow how Moorestown does it and just disable searching
for BIOS SMP tables.

This also fixes an issue on HP Proliant BL680c G5 and DL380 G6:

9f->100 for 1:1 PTE
Freeing 9f-100 pfn range: 97 pages freed
1-1 mapping on 9f->100
.. snip..
e820: BIOS-provided physical RAM map:
Xen: [mem 0x0000000000000000-0x000000000009efff] usable
Xen: [mem 0x000000000009f400-0x00000000000fffff] reserved
Xen: [mem 0x0000000000100000-0x00000000cfd1dfff] usable
.. snip..
Scan for SMP in [mem 0x00000000-0x000003ff]
Scan for SMP in [mem 0x0009fc00-0x0009ffff]
Scan for SMP in [mem 0x000f0000-0x000fffff]
found SMP MP-table at [mem 0x000f4fa0-0x000f4faf] mapped at [ffff8800000f4fa0]
(XEN) mm.c:908:d0 Error getting mfn 100 (pfn 5555555555555555) from L1 entry 0000000000100461 for l1e_owner=0, pg_owner=0
(XEN) mm.c:4995:d0 ptwr_emulate: could not get_page_from_l1e()
BUG: unable to handle kernel NULL pointer dereference at           (null)
IP: [<ffffffff81ac07e2>] xen_set_pte_init+0x66/0x71
. snip..
Pid: 0, comm: swapper Not tainted 3.6.0-rc6upstream-00188-gb6fb969-dirty #2 HP ProLiant BL680c G5
.. snip..
Call Trace:
 [<ffffffff81ad31c6>] __early_ioremap+0x18a/0x248
 [<ffffffff81624731>] ? printk+0x48/0x4a
 [<ffffffff81ad32ac>] early_ioremap+0x13/0x15
 [<ffffffff81acc140>] get_mpc_size+0x2f/0x67
 [<ffffffff81acc284>] smp_scan_config+0x10c/0x136
 [<ffffffff81acc2e4>] default_find_smp_config+0x36/0x5a
 [<ffffffff81ac3085>] setup_arch+0x5b3/0xb5b
 [<ffffffff81624731>] ? printk+0x48/0x4a
 [<ffffffff81abca7f>] start_kernel+0x90/0x390
 [<ffffffff81abc356>] x86_64_start_reservations+0x131/0x136
 [<ffffffff81abfa83>] xen_start_kernel+0x65f/0x661
(XEN) Domain 0 crashed: 'noreboot' set - not rebooting.

which is that ioremap would end up mapping 0xff using _PAGE_IOMAP
(which is what early_ioremap sticks as a flag) - which meant
we would get MFN 0xFF (pte ff461, which is OK), and then it would
also map 0x100 (b/c ioremap tries to get page aligned request, and
it was trying to map 0xf4fa0 + PAGE_SIZE - so it mapped the next page)
as _PAGE_IOMAP. Since 0x100 is actually a RAM page, and the _PAGE_IOMAP
bypasses the P2M lookup we would happily set the PTE to 1000461.
Xen would deny the request since we do not have access to the
Machine Frame Number (MFN) of 0x100. The P2M[0x100] is for example
0x80140.

CC: stable@vger.kernel.org
Fixes-Oracle-Bugzilla: https://bugzilla.oracle.com/bugzilla/show_bug.cgi?id=13665
Acked-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
11 years agoMerge branch 'for-linus' of git://git.kernel.dk/linux-block
Linus Torvalds [Wed, 19 Sep 2012 18:04:34 +0000 (11:04 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
 "A small collection of driver fixes/updates and a core fix for 3.6.  It
  contains:

   - Bug fixes for mtip32xx, and support for new hardware (just addition
     of IDs).  They have been queued up for 3.7 for a few weeks as well.

   - rate-limit a failing command error message in block core.

   - A fix for an old cciss bug from Stephen.

   - Prevent overflow of partition count from Alan."

* 'for-linus' of git://git.kernel.dk/linux-block:
  cciss: fix handling of protocol error
  blk: add an upper sanity check on partition adding
  mtip32xx: fix user_buffer check in exec_drive_command
  mtip32xx: Remove dead code
  mtip32xx: Change printk to pr_xxxx
  mtip32xx: Proper reporting of write protect status on big-endian
  mtip32xx: Increase timeout for standby command
  mtip32xx: Handle NCQ commands during the security locked state
  mtip32xx: Add support for new devices
  block: rate-limit the error message from failing commands

11 years agoMerge tag 'sh-for-linus' of git://github.com/pmundt/linux-sh
Linus Torvalds [Wed, 19 Sep 2012 18:03:55 +0000 (11:03 -0700)]
Merge tag 'sh-for-linus' of git://github.com/pmundt/linux-sh

Pull SuperH fixes from Paul Mundt.

* tag 'sh-for-linus' of git://github.com/pmundt/linux-sh:
  sh: Fix up TIF_NOTIFY_RESUME sans TIF_SIGPENDING handling.
  sh: pfc: Release spinlock in sh_pfc_gpio_request_enable() error path
  sh: intc: Fix up multi-evt irq association.

11 years agoMerge tag 'rpmsg-3.6-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/ohad/rpmsg
Linus Torvalds [Wed, 19 Sep 2012 18:03:13 +0000 (11:03 -0700)]
Merge tag 'rpmsg-3.6-fix' of git://git./linux/kernel/git/ohad/rpmsg

Pull rpmsg fix from Ohad Ben-Cohen:
 "A quick rpmsg fix from Fernando, fixing two buggy invocations of
  dma_free_coherent"

* tag 'rpmsg-3.6-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/ohad/rpmsg:
  rpmsg: fix dma_free_coherent dev parameter

11 years agoMerge tag 'md-3.6-fixes' of git://neil.brown.name/md
Linus Torvalds [Wed, 19 Sep 2012 18:01:38 +0000 (11:01 -0700)]
Merge tag 'md-3.6-fixes' of git://neil.brown.name/md

Pull md fixes from NeilBrown:
 "3 fixes for md in 3.6.

  One reverts a recent patch which turns out to not be such a good idea.

  Other two fix minor bugs with the new (since 3.3) 'replacement' code
  and have been tagged for -stable."

* tag 'md-3.6-fixes' of git://neil.brown.name/md:
  md: make sure metadata is updated when spares are activated or removed.
  md/raid5: fix calculate of 'degraded' when a replacement becomes active.
  Revert "md/raid5: For odirect-write performance, do not set STRIPE_PREREAD_ACTIVE."

11 years agoMerge branch 'for-3.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Linus Torvalds [Wed, 19 Sep 2012 18:00:07 +0000 (11:00 -0700)]
Merge branch 'for-3.6-fixes' of git://git./linux/kernel/git/tj/wq

Pull workqueue / powernow-k8 fix from Tejun Heo:
 "This is the fix for the bug where cpufreq/powernow-k8 was tripping
  BUG_ON() in try_to_wake_up_local() by migrating workqueue worker to a
  different CPU.

    https://bugzilla.kernel.org/show_bug.cgi?id=47301

  As discussed, the fix is now two parts - one to reimplement
  work_on_cpu() so that it doesn't create a new kthread each time and
  the actual fix which makes powernow-k8 use work_on_cpu() instead of
  performing manual migration.

  While pretty late in the merge cycle, both changes are on the safer
  side.  Jiri and I verified two existing users of work_on_cpu() and
  Duncan confirmed that the powernow-k8 fix survived about 18 hours of
  testing."

* 'for-3.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  cpufreq/powernow-k8: workqueue user shouldn't migrate the kworker to another CPU
  workqueue: reimplement work_on_cpu() using system_wq

11 years agoRevert "input: ab8500-ponkey: Create AB8500 domain IRQ mapping"
Dmitry Torokhov [Wed, 19 Sep 2012 17:23:18 +0000 (10:23 -0700)]
Revert "input: ab8500-ponkey: Create AB8500 domain IRQ mapping"

This reverts commit ca3b3faf9bee4dc5df4f10eae2d1e48f7de0a8ad.

There was a plan to place ab8500_irq_get_virq() calls in each AB8500
child device prior to requesting an IRQ, but as we're no longer using
Device Tree to collect our IRQ numbers, it's actually better to allow
the core to do this during device registration time. So the IRQ number
we pull from its resource has already been converted to a virtual IRQ.

Signed-off-by: Lee Jones <lee.jones@linaro.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
11 years agoMerge tag 'v3.6-rc5' into for-linus
Dmitry Torokhov [Wed, 19 Sep 2012 17:21:21 +0000 (10:21 -0700)]
Merge tag 'v3.6-rc5' into for-linus

Sync with mainline so that I can revert an input patch that came in through
another subsystem tree.

11 years agocpufreq/powernow-k8: workqueue user shouldn't migrate the kworker to another CPU
Tejun Heo [Tue, 18 Sep 2012 21:24:59 +0000 (14:24 -0700)]
cpufreq/powernow-k8: workqueue user shouldn't migrate the kworker to another CPU

powernowk8_target() runs off a per-cpu work item and if the
cpufreq_policy->cpu is different from the current one, it migrates the
kworker to the target CPU by manipulating current->cpus_allowed.  The
function migrates the kworker back to the original CPU but this is
still broken.  Workqueue concurrency management requires the kworkers
to stay on the same CPU and powernowk8_target() ends up triggerring
BUG_ON(rq != this_rq()) in try_to_wake_up_local() if it contends on
fidvid_mutex and sleeps.

It is unclear why this bug is being reported now.  Duncan says it
appeared to be a regression of 3.6-rc1 and couldn't reproduce it on
3.5.  Bisection seemed to point to 63d95a91 "workqueue: use @pool
instead of @gcwq or @cpu where applicable" which is an non-functional
change.  Given that the reproduce case sometimes took upto days to
trigger, it's easy to be misled while bisecting.  Maybe something made
contention on fidvid_mutex more likely?  I don't know.

This patch fixes the bug by using work_on_cpu() instead if @pol->cpu
isn't the same as the current one.  The code assumes that
cpufreq_policy->cpu is kept online by the caller, which Rafael tells
me is the case.

stable: ed48ece27c ("workqueue: reimplement work_on_cpu() using
        system_wq") should be applied before this; otherwise, the
        behavior could be horrible.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Duncan <1i5t5.duncan@cox.net>
Tested-by: Duncan <1i5t5.duncan@cox.net>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: stable@vger.kernel.org
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=47301

11 years agoworkqueue: reimplement work_on_cpu() using system_wq
Tejun Heo [Tue, 18 Sep 2012 19:48:43 +0000 (12:48 -0700)]
workqueue: reimplement work_on_cpu() using system_wq

The existing work_on_cpu() implementation is hugely inefficient.  It
creates a new kthread, execute that single function and then let the
kthread die on each invocation.

Now that system_wq can handle concurrent executions, there's no
advantage of doing this.  Reimplement work_on_cpu() using system_wq
which makes it simpler and way more efficient.

stable: While this isn't a fix in itself, it's needed to fix a
        workqueue related bug in cpufreq/powernow-k8.  AFAICS, this
        shouldn't break other existing users.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: stable@vger.kernel.org
11 years agoperf/x86: Fix Intel Ivy Bridge support
Stephane Eranian [Mon, 10 Sep 2012 23:07:01 +0000 (01:07 +0200)]
perf/x86: Fix Intel Ivy Bridge support

This patch updates the existing Intel IvyBridge (model 58)
support with proper PEBS event constraints. It cannot reuse
the same as SandyBridge because some events (0xd3) are
specific to IvyBridge.

Also there is no UOPS_DISPATCHED.THREAD on IVB, so do not
populate the PERF_COUNT_HW_STALLED_CYCLES_BACKEND mapping.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: ak@linux.intel.com
Link: http://lkml.kernel.org/r/20120910230701.GA5898@quad
Signed-off-by: Ingo Molnar <mingo@kernel.org>
11 years agolinux/kernel.h: Fix warning seen with W=1 due to change in DIV_ROUND_CLOSEST
Guenter Roeck [Wed, 19 Sep 2012 03:42:31 +0000 (20:42 -0700)]
linux/kernel.h: Fix warning seen with W=1 due to change in DIV_ROUND_CLOSEST

After commit b6d86d3d (Fix DIV_ROUND_CLOSEST to support negative dividends),
the following warning is seen if the kernel is compiled with W=1 (-Wextra):

warning: comparison of unsigned expression >= 0 is always true

The warning is due to the test '((typeof(x))-1) >= 0', which is used to detect
if the variable type is unsigned. Research on the web suggests that the warning
disappears if '>' instead of '>=' is used for the comparison.

Tests after changing the macro along that line show that the warning is gone,
and that the result is still correct:

i=-4: DIV_ROUND_CLOSEST(i, 2)=-2
i=-3: DIV_ROUND_CLOSEST(i, 2)=-2
i=-2: DIV_ROUND_CLOSEST(i, 2)=-1
i=-1: DIV_ROUND_CLOSEST(i, 2)=-1
i=0: DIV_ROUND_CLOSEST(i, 2)=0
i=1: DIV_ROUND_CLOSEST(i, 2)=1
i=2: DIV_ROUND_CLOSEST(i, 2)=1
i=3: DIV_ROUND_CLOSEST(i, 2)=2
i=4: DIV_ROUND_CLOSEST(i, 2)=2

Code size is the same as before.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Acked-by: Jean Delvare <khali@linux-fr.org>
11 years agomd: make sure metadata is updated when spares are activated or removed.
NeilBrown [Wed, 19 Sep 2012 02:54:22 +0000 (12:54 +1000)]
md: make sure metadata is updated when spares are activated or removed.

It isn't always necessary to update the metadata when spares are
removed as the presence-or-not of a spare isn't really important to
the integrity of an array.
Also activating a spare doesn't always require updating the metadata
as the update on 'recovery-completed' is usually sufficient.

However the introduction of 'replacement' devices have made these
transitions sometimes more important.  For example the 'Replacement'
flag isn't cleared until the original device is removed, so we need
to ensure a metadata update after that 'spare' is removed.

So set MD_CHANGE_DEVS whenever a spare is activated or removed, to
complement the current situation where it is set when a spare is added
or a device is failed (or a number of other less common situations).

This is suitable for -stable as out-of-data metadata could lead
to data corruption.
This is only relevant for 3.3 and later 9when 'replacement' as
introduced.

Cc: stable@vger.kernel.org
Signed-off-by: NeilBrown <neilb@suse.de>
11 years agomd/raid5: fix calculate of 'degraded' when a replacement becomes active.
NeilBrown [Wed, 19 Sep 2012 02:52:30 +0000 (12:52 +1000)]
md/raid5: fix calculate of 'degraded' when a replacement becomes active.

When a replacement device becomes active, we mark the device that it
replaces as 'faulty' so that it can subsequently get removed.
However 'calc_degraded' only pays attention to the primary device, not
the replacement, so the array appears to become degraded, which is
wrong.

So teach 'calc_degraded' to consider any replacement if a primary
device is faulty.

This is suitable for -stable as an incorrect 'degraded' value can
confuse md and could lead to data corruption.
This is only relevant for 3.3 and later.

Cc: stable@vger.kernel.org
Reported-by: Robin Hill <robin@robinhill.me.uk>
Reported-by: John Drescher <drescherjm@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoRevert "md/raid5: For odirect-write performance, do not set STRIPE_PREREAD_ACTIVE."
NeilBrown [Wed, 19 Sep 2012 02:48:30 +0000 (12:48 +1000)]
Revert "md/raid5: For odirect-write performance, do not set STRIPE_PREREAD_ACTIVE."

This reverts commit 895e3c5c58a80bb9e4e05d9ac38b4f30e0f97d80.

While this patch seemed like a good idea and did help some workloads,
it hurts other workloads.
Large sequential O_DIRECT writes were faster,
Small random O_DIRECT writes were slower.

Other changes (batching RAID5 writes) have improved the sequential
writes using a different mechanism, so the net result of this patch
is definitely negative.  So revert it.

Reported-by: Shaohua Li <shli@kernel.org>
Tested-by: Jianpeng Ma <majianpeng@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
11 years agoxfs: stop the sync worker before xfs_unmountfs
Ben Myers [Thu, 13 Sep 2012 21:18:47 +0000 (16:18 -0500)]
xfs: stop the sync worker before xfs_unmountfs

Cancel work of the xfs_sync_worker before teardown of the log in
xfs_unmountfs.  This prevents occasional crashes on unmount like so:

PID: 21602  TASK: ee9df060  CPU: 0   COMMAND: "kworker/0:3"
 #0 [c5377d28] crash_kexec at c0292c94
 #1 [c5377d80] oops_end at c07090c2
 #2 [c5377d98] no_context at c06f614e
 #3 [c5377dbc] __bad_area_nosemaphore at c06f6281
 #4 [c5377df4] bad_area_nosemaphore at c06f629b
 #5 [c5377e00] do_page_fault at c070b0cb
 #6 [c5377e7c] error_code (via page_fault) at c070892c
    EAX: f300c6a8  EBX: f300c6a8  ECX: 000000c0  EDX: 000000c0  EBP: c5377ed0
    DS:  007b      ESI: 00000000  ES:  007b      EDI: 00000001  GS:  ffffad20
    CS:  0060      EIP: c0481ad0  ERR: ffffffff  EFLAGS: 00010246
 #7 [c5377eb0] atomic64_read_cx8 at c0481ad0
 #8 [c5377ebc] xlog_assign_tail_lsn_locked at f7cc7c6e [xfs]
 #9 [c5377ed4] xfs_trans_ail_delete_bulk at f7ccd520 [xfs]
#10 [c5377f0c] xfs_buf_iodone at f7ccb602 [xfs]
#11 [c5377f24] xfs_buf_do_callbacks at f7cca524 [xfs]
#12 [c5377f30] xfs_buf_iodone_callbacks at f7cca5da [xfs]
#13 [c5377f4c] xfs_buf_iodone_work at f7c718d0 [xfs]
#14 [c5377f58] process_one_work at c024ee4c
#15 [c5377f98] worker_thread at c024f43d
#16 [c5377fbc] kthread at c025326b
#17 [c5377fe8] kernel_thread_helper at c070e834

PID: 26653  TASK: e79143b0  CPU: 3   COMMAND: "umount"
 #0 [cde0fda0] __schedule at c0706595
 #1 [cde0fe28] schedule at c0706b89
 #2 [cde0fe30] schedule_timeout at c0705600
 #3 [cde0fe94] __down_common at c0706098
 #4 [cde0fec8] __down at c0706122
 #5 [cde0fed0] down at c025936f
 #6 [cde0fee0] xfs_buf_lock at f7c7131d [xfs]
 #7 [cde0ff00] xfs_freesb at f7cc2236 [xfs]
 #8 [cde0ff10] xfs_fs_put_super at f7c80f21 [xfs]
 #9 [cde0ff1c] generic_shutdown_super at c0333d7a
#10 [cde0ff38] kill_block_super at c0333e0f
#11 [cde0ff48] deactivate_locked_super at c0334218
#12 [cde0ff58] deactivate_super at c033495d
#13 [cde0ff68] mntput_no_expire at c034bc13
#14 [cde0ff7c] sys_umount at c034cc69
#15 [cde0ffa0] sys_oldumount at c034ccd4
#16 [cde0ffb0] system_call at c0707e66

commit 11159a05 added this to xfs_log_unmount and needs to be cleaned up
at a later date.

Signed-off-by: Ben Myers <bpm@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
11 years agocifs: fix return value in cifsConvertToUTF16
Jeff Layton [Tue, 18 Sep 2012 18:21:01 +0000 (14:21 -0400)]
cifs: fix return value in cifsConvertToUTF16

This function returns the wrong value, which causes the callers to get
the length of the resulting pathname wrong when it contains non-ASCII
characters.

This seems to fix https://bugzilla.samba.org/show_bug.cgi?id=6767

Cc: <stable@vger.kernel.org>
Reported-by: Baldvin Kovacs <baldvin.kovacs@gmail.com>
Reported-and-Tested-by: Nicolas Lefebvre <nico.lefebvre@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
11 years agoe1000: Small packets may get corrupted during padding by HW
Tushar Dave [Sat, 15 Sep 2012 10:16:57 +0000 (10:16 +0000)]
e1000: Small packets may get corrupted during padding by HW

On PCI/PCI-X HW, if packet size is less than ETH_ZLEN,
packets may get corrupted during padding by HW.
To WA this issue, pad all small packets manually.

Signed-off-by: Tushar Dave <tushar.n.dave@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoxfrm: fix a read lock imbalance in make_blackhole
Li RongQing [Mon, 17 Sep 2012 22:40:10 +0000 (22:40 +0000)]
xfrm: fix a read lock imbalance in make_blackhole

if xfrm_policy_get_afinfo returns 0, it has already released the read
lock, xfrm_policy_put_afinfo should not be called again.

Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agotcp: fix regression in urgent data handling
Eric Dumazet [Mon, 17 Sep 2012 12:51:39 +0000 (12:51 +0000)]
tcp: fix regression in urgent data handling

Stephan Springl found that commit 1402d366019fed "tcp: introduce
tcp_try_coalesce" introduced a regression for rlogin

It turns out problem comes from TCP urgent data handling and
a change in behavior in input path.

rlogin sends two one-byte packets with URG ptr set, and when next data
frame is coalesced, we lack sk_data_ready() calls to wakeup consumer.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Stephan Springl <springl-k@lar.bfw.de>
Cc: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet: fix memory leak on oom with zerocopy
Michael S. Tsirkin [Sat, 15 Sep 2012 22:44:16 +0000 (22:44 +0000)]
net: fix memory leak on oom with zerocopy

If orphan flags fails, we don't free the skb
on receive, which leaks the skb memory.

Return value was also wrong: netif_receive_skb
is supposed to return NET_RX_DROP, not ENOMEM.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonetxen: check for root bus in netxen_mask_aer_correctable
Nikolay Aleksandrov [Fri, 14 Sep 2012 05:50:03 +0000 (05:50 +0000)]
netxen: check for root bus in netxen_mask_aer_correctable

Add a check if pdev->bus->self == NULL (root bus). When attaching
a netxen NIC to a VM it can be on the root bus and the guest would
crash in netxen_mask_aer_correctable() because of a NULL pointer
dereference if CONFIG_PCIEAER is present.

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agohwmon: (applesmc) Bump max wait
Parag Warudkar [Mon, 17 Sep 2012 15:49:55 +0000 (17:49 +0200)]
hwmon: (applesmc) Bump max wait

A heavy-load test on a MacBookPro6,1 is still showing a substantial
amount of read errors.  Increasing the maximum wait time to 128 ms
resolves the issue.

Signed-off-by: Parag Warudkar <parag.lkml@gmail.com>
Signed-off-by: Henrik Rydberg <rydberg@euromail.se>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
11 years agohwmon: (ad7314) Add 'name' sysfs attribute
Guenter Roeck [Tue, 11 Sep 2012 20:43:17 +0000 (13:43 -0700)]
hwmon: (ad7314) Add 'name' sysfs attribute

The 'name' sysfs attribute is mandatory for hwmon devices, but was missing
in this driver.

Cc: Jonathan Cameron <jic23@cam.ac.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Jonathan Cameron <jic23@cam.ac.uk>
11 years agobnx2x: fix rx checksum validation for IPv6
Michal Schmidt [Thu, 13 Sep 2012 12:59:44 +0000 (12:59 +0000)]
bnx2x: fix rx checksum validation for IPv6

Commit d6cb3e41 "bnx2x: fix checksum validation" caused a performance
regression for IPv6. Rx checksum offload does not work. IPv6 packets
are passed to the stack with CHECKSUM_NONE.

The hardware obviously cannot perform IP checksum validation for IPv6,
because there is no checksum in the IPv6 header. This should not prevent
us from setting CHECKSUM_UNNECESSARY.

Tested on BCM57711.

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoxfrm_user: return error pointer instead of NULL #2
Mathias Krause [Fri, 14 Sep 2012 09:58:32 +0000 (09:58 +0000)]
xfrm_user: return error pointer instead of NULL #2

When dump_one_policy() returns an error, e.g. because of a too small
buffer to dump the whole xfrm policy, xfrm_policy_netlink() returns
NULL instead of an error pointer. But its caller expects an error
pointer and therefore continues to operate on a NULL skbuff.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoxfrm_user: return error pointer instead of NULL
Mathias Krause [Thu, 13 Sep 2012 11:41:26 +0000 (11:41 +0000)]
xfrm_user: return error pointer instead of NULL

When dump_one_state() returns an error, e.g. because of a too small
buffer to dump the whole xfrm state, xfrm_state_netlink() returns NULL
instead of an error pointer. But its callers expect an error pointer
and therefore continue to operate on a NULL skbuff.

This could lead to a privilege escalation (execution of user code in
kernel context) if the attacker has CAP_NET_ADMIN and is able to map
address 0.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoipv6: use DST_* macro to set obselete field
Nicolas Dichtel [Mon, 10 Sep 2012 22:09:47 +0000 (22:09 +0000)]
ipv6: use DST_* macro to set obselete field

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoipv6: use net->rt_genid to check dst validity
Nicolas Dichtel [Mon, 10 Sep 2012 22:09:46 +0000 (22:09 +0000)]
ipv6: use net->rt_genid to check dst validity

IPv6 dst should take care of rt_genid too. When a xfrm policy is inserted or
deleted, all dst should be invalidated.
To force the validation, dst entries should be created with ->obsolete set to
DST_OBSOLETE_FORCE_CHK. This was already the case for all functions calling
ip6_dst_alloc(), except for ip6_rt_copy().

As a consequence, we can remove the specific code in inet6_connection_sock.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoxfrm: invalidate dst on policy insertion/deletion
Nicolas Dichtel [Mon, 10 Sep 2012 22:09:45 +0000 (22:09 +0000)]
xfrm: invalidate dst on policy insertion/deletion

When a policy is inserted or deleted, all dst should be recalculated.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonetns: move net->ipv4.rt_genid to net->rt_genid
Nicolas Dichtel [Mon, 10 Sep 2012 22:09:44 +0000 (22:09 +0000)]
netns: move net->ipv4.rt_genid to net->rt_genid

This commit prepares the use of rt_genid by both IPv4 and IPv6.
Initialization is left in IPv4 part.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet: rt_cache_flush() cleanup
Eric Dumazet [Fri, 7 Sep 2012 20:27:11 +0000 (22:27 +0200)]
net: rt_cache_flush() cleanup

We dont use jhash anymore since route cache removal,
so we can get rid of get_random_bytes() calls for rt_genid
changes.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoipv4/route: arg delay is useless in rt_cache_flush()
Nicolas Dichtel [Fri, 7 Sep 2012 00:45:29 +0000 (00:45 +0000)]
ipv4/route: arg delay is useless in rt_cache_flush()

Since route cache deletion (89aef8921bfbac22f), delay is no
more used. Remove it.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoMerge tag 'hwspinlock-3.6-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/ohad...
Linus Torvalds [Tue, 18 Sep 2012 18:58:54 +0000 (11:58 -0700)]
Merge tag 'hwspinlock-3.6-fix' of git://git./linux/kernel/git/ohad/hwspinlock

Pull hwspinlock fix from Ohad Ben-Cohen:
 "A single hwspinlock fix by Wei Yongjun, which prevents potential NULL
  dereferences"

* tag 'hwspinlock-3.6-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/ohad/hwspinlock:
  hwspinlock/core: move the dereference below the NULL test

11 years agovfs: dcache: use DCACHE_DENTRY_KILLED instead of DCACHE_DISCONNECTED in d_kill()
Miklos Szeredi [Mon, 17 Sep 2012 20:31:38 +0000 (22:31 +0200)]
vfs: dcache: use DCACHE_DENTRY_KILLED instead of DCACHE_DISCONNECTED in d_kill()

IBM reported a soft lockup after applying the fix for the rename_lock
deadlock.  Commit c83ce989cb5f ("VFS: Fix the nfs sillyrename regression
in kernel 2.6.38") was found to be the culprit.

The nfs sillyrename fix used DCACHE_DISCONNECTED to indicate that the
dentry was killed.  This flag can be set on non-killed dentries too,
which results in infinite retries when trying to traverse the dentry
tree.

This patch introduces a separate flag: DCACHE_DENTRY_KILLED, which is
only set in d_kill() and makes try_to_ascend() test only this flag.

IBM reported successful test results with this patch.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoMerge tag 'imx-fixes' of git://git.pengutronix.de/git/imx/linux-2.6 into fixes
Olof Johansson [Tue, 18 Sep 2012 17:16:44 +0000 (10:16 -0700)]
Merge tag 'imx-fixes' of git://git.pengutronix.de/git/imx/linux-2.6 into fixes

From Sascha Hauer:

ARM i.MX: Two fixes for i.MX
- armadillo5x0 board broken since v3.5 (stable material)
- i.MX25 Architecture broken since v3.6-rc1

* tag 'imx-fixes' of git://git.pengutronix.de/git/imx/linux-2.6:
  ARM i.MX25: Make timer irq work again
  ARM: imx: armadillo5x0: Fix illegal register access

11 years agoARM i.MX25: Make timer irq work again
Sascha Hauer [Tue, 18 Sep 2012 08:05:31 +0000 (10:05 +0200)]
ARM i.MX25: Make timer irq work again

Since i.MX has SPARSE_IRQ enabled the i.MX25 timer is broken. This
is because the internal irqs now start at an offset of NR_IRQS_LEGACY.
The patch fixed this up, but missed the i.MX25 timer which used a
hardcoded value instead of a define. This patch introduces a define
for the timer irq and uses it.

This is broken since introduced with 3.6-rc1:

| commit 8842a9e2869cae14bbb8184004a42fc3070587fb
| Author: Shawn Guo <shawn.guo@linaro.org>
| Date:   Thu Jun 14 11:16:14 2012 +0800
|
|    ARM: imx: enable SPARSE_IRQ for imx platform

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
11 years agoARM: imx: armadillo5x0: Fix illegal register access
Fabio Estevam [Mon, 17 Sep 2012 01:28:35 +0000 (22:28 -0300)]
ARM: imx: armadillo5x0: Fix illegal register access

Since commit eb92044eb (ARM i.MX3: Make ccm base address a variable )
it is necessary to pass the CCM register base as a variable.

Fix the CCM register access in mach-armadillo5x0 by passing mx3_ccm_base and
avoid illegal accesses.

Also applies to v3.5

Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Cc: stable@vger.kernel.org
11 years agoMerge tag 'at91-fixes' of git://github.com/at91linux/linux-at91 into fixes
Olof Johansson [Tue, 18 Sep 2012 14:43:16 +0000 (07:43 -0700)]
Merge tag 'at91-fixes' of git://github.com/at91linux/linux-at91 into fixes

From Nicolas Ferre:

Modify AT91 device tree files for making the GPIO interrupts work.

* tag 'at91-fixes' of git://github.com/at91linux/linux-at91:
  ARM: at91: fix missing #interrupt-cells on gpio-controller

11 years agoMerge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas...
Olof Johansson [Tue, 18 Sep 2012 14:41:25 +0000 (07:41 -0700)]
Merge branch 'fixes' of git://git./linux/kernel/git/horms/renesas into fixes

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
  ARM: shmobile: kzm9g: bugfix: correct mmcif interrupt settings

11 years agoMerge branch 'v3.6-samsung-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel...
Olof Johansson [Tue, 18 Sep 2012 14:40:02 +0000 (07:40 -0700)]
Merge branch 'v3.6-samsung-fixes-3' of git://git./linux/kernel/git/kgene/linux-samsung into fixes

* 'v3.6-samsung-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung:
  ARM: SAMSUNG: Use spin_lock_{irqsave,irqrestore} in clk_set_rate
  ARM: SAMSUNG: use spin_lock_irqsave() in clk_set_parent

11 years ago[SCSI] hpsa: fix handling of protocol error
Stephen M. Cameron [Fri, 14 Sep 2012 21:34:25 +0000 (16:34 -0500)]
[SCSI] hpsa: fix handling of protocol error

If a command status of CMD_PROTOCOL_ERR is received, this
information should be conveyed to the SCSI mid layer, not
dropped on the floor.  CMD_PROTOCOL_ERR may be received
from the Smart Array for any commands destined for an external
RAID controller such as a P2000, or commands destined for tape
drives or CD/DVD-ROM drives, if for instance a cable is
disconnected.  This mostly affects multipath configurations, as
disconnecting a cable on a non-multipath configuration is not
going to do anything good regardless of whether CMD_PROTOCOL_ERR
is handled correctly or not.  Not handling CMD_PROTOCOL_ERR
correctly in a multipath configaration involving external RAID
controllers may cause data corruption, so this is quite a serious
bug.  This bug should not normally cause a problem for direct
attached disk storage.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
11 years agocciss: fix handling of protocol error
Stephen M. Cameron [Fri, 14 Sep 2012 21:35:10 +0000 (16:35 -0500)]
cciss: fix handling of protocol error

If a command completes with a status of CMD_PROTOCOL_ERR, this
information should be conveyed to the SCSI mid layer, not dropped
on the floor.  Unlike a similar bug in the hpsa driver, this bug
only affects tape drives and CD and DVD ROM drives in the cciss
driver, and to induce it, you have to disconnect (or damage) a
cable, so it is not a very likely scenario (which would explain
why the bug has gone undetected for the last 10 years.)

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 years agoblk: add an upper sanity check on partition adding
Alan Cox [Mon, 17 Sep 2012 10:47:13 +0000 (11:47 +0100)]
blk: add an upper sanity check on partition adding

65536 should be ludicrous anyway but without it we overflow the
memory computation doing the allocation and badness occurs.

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
11 years agosh: Fix up TIF_NOTIFY_RESUME sans TIF_SIGPENDING handling.
Al Viro [Tue, 18 Sep 2012 08:04:37 +0000 (17:04 +0900)]
sh: Fix up TIF_NOTIFY_RESUME sans TIF_SIGPENDING handling.

As Al notes, we missed a TIF_NOTIFY_RESUME check which caused any
handlers without TIF_SIGPENDING also set to skip the notification:

Looks like while it is in the relevant masks *and* checked in
do_notify_resume() both on 32bit and 64bit variants since commit
ab99c733ae73cce31f2a2434f7099564e5a73d95 ("sh: Make syscall tracer
use tracehook notifiers, add TIF_NOTIFY_RESUME.") they are
actually *not* reached without simulataneous SIGPENDING, since
the actual glue in the callers had not been updated back then and
still checks for _TIF_SIGPENDING alone when deciding whether to
hit do_notify_resume() or not.

Reported-by: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
11 years agosh: pfc: Release spinlock in sh_pfc_gpio_request_enable() error path
Laurent Pinchart [Fri, 14 Sep 2012 18:25:48 +0000 (20:25 +0200)]
sh: pfc: Release spinlock in sh_pfc_gpio_request_enable() error path

The sh_pfc_gpio_request_enable() function acquires a spinlock but fails
to release it before returning if the requested mux type is not
supported. Fix this.

Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
11 years agoDMA: PL330: Check the pointer returned by kzalloc
Sachin Kamat [Mon, 17 Sep 2012 09:50:23 +0000 (15:20 +0530)]
DMA: PL330: Check the pointer returned by kzalloc

kzalloc could return NULL. Hence add a check to avoid
NULL pointer dereference.

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Acked-by: Jassi Brar <jassisinghbrar@gmail.com>
Cc: Stable <stable@vger.kernel.org>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
11 years agoDMA: PL330: Fix potential NULL pointer dereference in pl330_submit_req()
Sachin Kamat [Mon, 17 Sep 2012 09:50:22 +0000 (15:20 +0530)]
DMA: PL330: Fix potential NULL pointer dereference in pl330_submit_req()

'r->cfg' is being checked for NULL. However, it is dereferenced
in the previous statements. Thus moving those statements within
the check.

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Acked-by: Jassi Brar <jassisinghbrar@gmail.com>
Cc: Stable <stable@vger.kernel.org>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
11 years agoARM: shmobile: kzm9g: bugfix: correct mmcif interrupt settings
Tetsuyuki Kobayashi [Thu, 6 Sep 2012 07:19:36 +0000 (07:19 +0000)]
ARM: shmobile: kzm9g: bugfix: correct mmcif interrupt settings

Correct interrupt settings of sh_mmc:int and sh_mmc:error in board-kzm9g.c.

Signed-off-by: Tetsuyuki Kobayashi <koba@kmckk.co.jp>
Acked-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
11 years agoARM: SAMSUNG: Use spin_lock_{irqsave,irqrestore} in clk_set_rate
Tushar Behera [Tue, 18 Sep 2012 01:05:34 +0000 (10:05 +0900)]
ARM: SAMSUNG: Use spin_lock_{irqsave,irqrestore} in clk_set_rate

The spinlock clocks_lock can be held during ISR, hence it is not safe to
hold that lock with disabling interrupts.

It fixes following potential deadlock.

=========================================================
[ INFO: possible irq lock inversion dependency detected ]
3.6.0-rc4+ #2 Not tainted
---------------------------------------------------------
swapper/0/1 just changed the state of lock:
 (&(&host->lock)->rlock){-.....}, at: [<c027fb0d>] sdhci_irq+0x15/0x564
but this lock took another, HARDIRQ-unsafe lock in the past:
 (clocks_lock){+.+...}

and interrupts could create inverse lock ordering between them.

other info that might help us debug this:
 Possible interrupt unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(clocks_lock);
                               local_irq_disable();
                               lock(&(&host->lock)->rlock);
                               lock(clocks_lock);
  <Interrupt>
    lock(&(&host->lock)->rlock);

 *** DEADLOCK ***

Signed-off-by: Tushar Behera <tushar.behera@linaro.org>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
11 years agoMerge branch 'for-3.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Linus Torvalds [Mon, 17 Sep 2012 23:05:23 +0000 (16:05 -0700)]
Merge branch 'for-3.6-fixes' of git://git./linux/kernel/git/tj/wq

Pull another workqueue fix from Tejun Heo:
 "Unfortunately, yet another late fix.  This too is discovered and fixed
  by Lai.  This bug was introduced during this merge window by commit
  25511a477657 ("workqueue: reimplement CPU online rebinding to handle
  idle workers") which started using WORKER_REBIND flag for idle rebind
  too.

  The bug is relatively easy to trigger if the CPU rapidly goes through
  off, on and then off (and stay off).  The fix is on the safer side.
  This hasn't been on linux-next yet but I'm pushing early so that it
  can get more exposure before v3.6 release."

* 'for-3.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: always clear WORKER_REBIND in busy_worker_rebind_fn()

11 years agoworkqueue: always clear WORKER_REBIND in busy_worker_rebind_fn()
Lai Jiangshan [Mon, 17 Sep 2012 22:42:31 +0000 (15:42 -0700)]
workqueue: always clear WORKER_REBIND in busy_worker_rebind_fn()

busy_worker_rebind_fn() didn't clear WORKER_REBIND if rebinding failed
(CPU is down again).  This used to be okay because the flag wasn't
used for anything else.

However, after 25511a477 "workqueue: reimplement CPU online rebinding
to handle idle workers", WORKER_REBIND is also used to command idle
workers to rebind.  If not cleared, the worker may confuse the next
CPU_UP cycle by having REBIND spuriously set or oops / get stuck by
prematurely calling idle_worker_rebind().

  WARNING: at /work/os/wq/kernel/workqueue.c:1323 worker_thread+0x4cd/0x5
 00()
  Hardware name: Bochs
  Modules linked in: test_wq(O-)
  Pid: 33, comm: kworker/1:1 Tainted: G           O 3.6.0-rc1-work+ #3
  Call Trace:
   [<ffffffff8109039f>] warn_slowpath_common+0x7f/0xc0
   [<ffffffff810903fa>] warn_slowpath_null+0x1a/0x20
   [<ffffffff810b3f1d>] worker_thread+0x4cd/0x500
   [<ffffffff810bc16e>] kthread+0xbe/0xd0
   [<ffffffff81bd2664>] kernel_thread_helper+0x4/0x10
  ---[ end trace e977cf20f4661968 ]---
  BUG: unable to handle kernel NULL pointer dereference at           (null)
  IP: [<ffffffff810b3db0>] worker_thread+0x360/0x500
  PGD 0
  Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
  Modules linked in: test_wq(O-)
  CPU 0
  Pid: 33, comm: kworker/1:1 Tainted: G        W  O 3.6.0-rc1-work+ #3 Bochs Bochs
  RIP: 0010:[<ffffffff810b3db0>]  [<ffffffff810b3db0>] worker_thread+0x360/0x500
  RSP: 0018:ffff88001e1c9de0  EFLAGS: 00010086
  RAX: 0000000000000000 RBX: ffff88001e633e00 RCX: 0000000000004140
  RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000009
  RBP: ffff88001e1c9ea0 R08: 0000000000000000 R09: 0000000000000001
  R10: 0000000000000002 R11: 0000000000000000 R12: ffff88001fc8d580
  R13: ffff88001fc8d590 R14: ffff88001e633e20 R15: ffff88001e1c6900
  FS:  0000000000000000(0000) GS:ffff88001fc00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
  CR2: 0000000000000000 CR3: 00000000130e8000 CR4: 00000000000006f0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
  Process kworker/1:1 (pid: 33, threadinfo ffff88001e1c8000, task ffff88001e1c6900)
  Stack:
   ffff880000000000 ffff88001e1c9e40 0000000000000001 ffff88001e1c8010
   ffff88001e519c78 ffff88001e1c9e58 ffff88001e1c6900 ffff88001e1c6900
   ffff88001e1c6900 ffff88001e1c6900 ffff88001fc8d340 ffff88001fc8d340
  Call Trace:
   [<ffffffff810bc16e>] kthread+0xbe/0xd0
   [<ffffffff81bd2664>] kernel_thread_helper+0x4/0x10
  Code: b1 00 f6 43 48 02 0f 85 91 01 00 00 48 8b 43 38 48 89 df 48 8b 00 48 89 45 90 e8 ac f0 ff ff 3c 01 0f 85 60 01 00 00 48 8b 53 50 <8b> 02 83 e8 01 85 c0 89 02 0f 84 3b 01 00 00 48 8b 43 38 48 8b
  RIP  [<ffffffff810b3db0>] worker_thread+0x360/0x500
   RSP <ffff88001e1c9de0>
  CR2: 0000000000000000

There was no reason to keep WORKER_REBIND on failure in the first
place - WORKER_UNBOUND is guaranteed to be set in such cases
preventing incorrectly activating concurrency management.  Always
clear WORKER_REBIND.

tj: Updated comment and description.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
11 years agoMerge branch 'akpm' (Andrew's patch-bomb)
Linus Torvalds [Mon, 17 Sep 2012 22:01:14 +0000 (15:01 -0700)]
Merge branch 'akpm' (Andrew's patch-bomb)

Merge fixes from Andrew Morton:
 "13 patches.  12 are fixes and one is a little preparatory thing for
  Andi."

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (13 commits)
  memory hotplug: fix section info double registration bug
  mm/page_alloc: fix the page address of higher page's buddy calculation
  drivers/rtc/rtc-twl.c: ensure all interrupts are disabled during probe
  compiler.h: add __visible
  pid-namespace: limit value of ns_last_pid to (0, max_pid)
  include/net/sock.h: squelch compiler warning in sk_rmem_schedule()
  slub: consider pfmemalloc_match() in get_partial_node()
  slab: fix starting index for finding another object
  slab: do ClearSlabPfmemalloc() for all pages of slab
  nbd: clear waiting_queue on shutdown
  MAINTAINERS: fix TXT maintainer list and source repo path
  mm/ia64: fix a memory block size bug
  memory hotplug: reset pgdat->kswapd to NULL if creating kernel thread fails

11 years agomemory hotplug: fix section info double registration bug
qiuxishi [Mon, 17 Sep 2012 21:09:24 +0000 (14:09 -0700)]
memory hotplug: fix section info double registration bug

There may be a bug when registering section info.  For example, on my
Itanium platform, the pfn range of node0 includes the other nodes, so
other nodes' section info will be double registered, and memmap's page
count will equal to 3.

  node0: start_pfn=0x100,    spanned_pfn=0x20fb00, present_pfn=0x7f8a3, => 0x000100-0x20fc00
  node1: start_pfn=0x80000,  spanned_pfn=0x80000,  present_pfn=0x80000, => 0x080000-0x100000
  node2: start_pfn=0x100000, spanned_pfn=0x80000,  present_pfn=0x80000, => 0x100000-0x180000
  node3: start_pfn=0x180000, spanned_pfn=0x80000,  present_pfn=0x80000, => 0x180000-0x200000

  free_all_bootmem_node()
register_page_bootmem_info_node()
register_page_bootmem_info_section()

When hot remove memory, we can't free the memmap's page because
page_count() is 2 after put_page_bootmem().

  sparse_remove_one_section()
free_section_usemap()
free_map_bootmem()
put_page_bootmem()

[akpm@linux-foundation.org: add code comment]
Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agomm/page_alloc: fix the page address of higher page's buddy calculation
Li Haifeng [Mon, 17 Sep 2012 21:09:21 +0000 (14:09 -0700)]
mm/page_alloc: fix the page address of higher page's buddy calculation

The heuristic method for buddy has been introduced since commit
43506fad21ca ("mm/page_alloc.c: simplify calculation of combined index
of adjacent buddy lists").  But the page address of higher page's buddy
was wrongly calculated, which will lead page_is_buddy to fail for ever.
IOW, the heuristic method would be disabled with the wrong page address
of higher page's buddy.

Calculating the page address of higher page's buddy should be based
higher_page with the offset between index of higher page and index of
higher page's buddy.

Signed-off-by: Haifeng Li <omycle@gmail.com>
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: KyongHo Cho <pullip.cho@samsung.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: <stable@vger.kernel.org> [2.6.38+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agodrivers/rtc/rtc-twl.c: ensure all interrupts are disabled during probe
Kevin Hilman [Mon, 17 Sep 2012 21:09:17 +0000 (14:09 -0700)]
drivers/rtc/rtc-twl.c: ensure all interrupts are disabled during probe

On some platforms, bootloaders are known to do some interesting RTC
programming.  Without going into the obscurities as to why this may be
the case, suffice it to say the the driver should not make any
assumptions about the state of the RTC when the driver loads.  In
particular, the driver probe should be sure that all interrupts are
disabled until otherwise programmed.

This was discovered when finding bursty I2C traffic every second on
Overo platforms.  This I2C overhead was keeping the SoC from hitting
deep power states.  The cause was found to be the RTC firing every
second on the I2C-connected TWL PMIC.

Special thanks to Felipe Balbi for suggesting to look for a rogue driver
as the source of the I2C traffic rather than the I2C driver itself.

Special thanks to Steve Sakoman for helping track down the source of the
continuous RTC interrups on the Overo boards.

Signed-off-by: Kevin Hilman <khilman@ti.com>
Cc: Felipe Balbi <balbi@ti.com>
Tested-by: Steve Sakoman <steve@sakoman.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Tested-by: Shubhrajyoti Datta <omaplinuxkernel@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agocompiler.h: add __visible
Andi Kleen [Mon, 17 Sep 2012 21:09:15 +0000 (14:09 -0700)]
compiler.h: add __visible

gcc 4.6+ has support for a externally_visible attribute that prevents the
optimizer from optimizing unused symbols away.  Add a __visible macro to
use it with that compiler version or later.

This is used (at least) by the "Link Time Optimization" patchset.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agopid-namespace: limit value of ns_last_pid to (0, max_pid)
Andrew Vagin [Mon, 17 Sep 2012 21:09:12 +0000 (14:09 -0700)]
pid-namespace: limit value of ns_last_pid to (0, max_pid)

The kernel doesn't check the pid for negative values, so if you try to
write -2 to /proc/sys/kernel/ns_last_pid, you will get a kernel panic.

The crash happens because the next pid is -1, and alloc_pidmap() will
try to access to a nonexistent pidmap.

  map = &pid_ns->pidmap[pid/BITS_PER_PAGE];

Signed-off-by: Andrew Vagin <avagin@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoinclude/net/sock.h: squelch compiler warning in sk_rmem_schedule()
Chuck Lever [Mon, 17 Sep 2012 21:09:11 +0000 (14:09 -0700)]
include/net/sock.h: squelch compiler warning in sk_rmem_schedule()

This warning:

  In file included from linux/include/linux/tcp.h:227:0,
                   from linux/include/linux/ipv6.h:221,
                   from linux/include/net/ipv6.h:16,
                   from linux/include/linux/sunrpc/clnt.h:26,
                   from linux/net/sunrpc/stats.c:22:
  linux/include/net/sock.h: In function `sk_rmem_schedule':
  linux/nfs-2.6/include/net/sock.h:1339:13: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]

is seen with gcc (GCC) 4.6.3 20120306 (Red Hat 4.6.3-2) using the
-Wextra option.

Commit c76562b6709f ("netvm: prevent a stream-specific deadlock")
accidentally replaced the "size" parameter of sk_rmem_schedule() with an
unsigned int.  This changes the semantics of the comparison in the
return statement.

In sk_wmem_schedule we have syntactically the same comparison, but
"size" is a signed integer.  In addition, __sk_mem_schedule() takes a
signed integer for its "size" parameter, so there is an implicit type
conversion in sk_rmem_schedule() anyway.

Revert the "size" parameter back to a signed integer so that the
semantics of the expressions in both sk_[rw]mem_schedule() are exactly
the same.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: David Miller <davem@davemloft.net>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoslub: consider pfmemalloc_match() in get_partial_node()
Joonsoo Kim [Mon, 17 Sep 2012 21:09:09 +0000 (14:09 -0700)]
slub: consider pfmemalloc_match() in get_partial_node()

get_partial() is currently not checking pfmemalloc_match() meaning that
it is possible for pfmemalloc pages to leak to non-pfmemalloc users.
This is a problem in the following situation.  Assume that there is a
request from normal allocation and there are no objects in the per-cpu
cache and no node-partial slab.

In this case, slab_alloc enters the slow path and new_slab_objects() is
called which may return a PFMEMALLOC page.  As the current user is not
allowed to access PFMEMALLOC page, deactivate_slab() is called
([5091b74a: mm: slub: optimise the SLUB fast path to avoid pfmemalloc
checks]) and returns an object from PFMEMALLOC page.

Next time, when we get another request from normal allocation,
slab_alloc() enters the slow-path and calls new_slab_objects().  In
new_slab_objects(), we call get_partial() and get a partial slab which
was just deactivated but is a pfmemalloc page.  We extract one object
from it and re-deactivate.

  "deactivate -> re-get in get_partial -> re-deactivate" occures repeatedly.

As a result, access to PFMEMALLOC page is not properly restricted and it
can cause a performance degradation due to frequent deactivation.
deactivation frequently.

This patch changes get_partial_node() to take pfmemalloc_match() into
account and prevents the "deactivate -> re-get in get_partial()
scenario.  Instead, new_slab() is called.

Signed-off-by: Joonsoo Kim <js1304@gmail.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: David Miller <davem@davemloft.net>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoslab: fix starting index for finding another object
Joonsoo Kim [Mon, 17 Sep 2012 21:09:06 +0000 (14:09 -0700)]
slab: fix starting index for finding another object

In array cache, there is a object at index 0, check it.

Signed-off-by: Joonsoo Kim <js1304@gmail.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: David Miller <davem@davemloft.net>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoslab: do ClearSlabPfmemalloc() for all pages of slab
Mel Gorman [Mon, 17 Sep 2012 21:09:03 +0000 (14:09 -0700)]
slab: do ClearSlabPfmemalloc() for all pages of slab

Right now, we call ClearSlabPfmemalloc() for first page of slab when we
clear SlabPfmemalloc flag.  This is fine for most swap-over-network use
cases as it is expected that order-0 pages are in use.  Unfortunately it
is possible that that __ac_put_obj() checks SlabPfmemalloc on a tail
page and while this is harmless, it is sloppy.  This patch ensures that
the head page is always used.

This problem was originally identified by Joonsoo Kim.

[js1304@gmail.com: Original implementation and problem identification]
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: David Miller <davem@davemloft.net>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agonbd: clear waiting_queue on shutdown
Paul Clements [Mon, 17 Sep 2012 21:09:02 +0000 (14:09 -0700)]
nbd: clear waiting_queue on shutdown

Fix a serious but uncommon bug in nbd which occurs when there is heavy
I/O going to the nbd device while, at the same time, a failure (server,
network) or manual disconnect of the nbd connection occurs.

There is a small window between the time that the nbd_thread is stopped
and the socket is shutdown where requests can continue to be queued to
nbd's internal waiting_queue.  When this happens, those requests are
never completed or freed.

The fix is to clear the waiting_queue on shutdown of the nbd device, in
the same way that the nbd request queue (queue_head) is already being
cleared.

Signed-off-by: Paul Clements <paul.clements@steeleye.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoMAINTAINERS: fix TXT maintainer list and source repo path
Gang Wei [Mon, 17 Sep 2012 21:08:59 +0000 (14:08 -0700)]
MAINTAINERS: fix TXT maintainer list and source repo path

Signed-off-by: Gang Wei <gang.wei@intel.com>
Cc: Richard L Maliszewski <richard.l.maliszewski@intel.com>
Cc: Gang Wei <gang.wei@intel.com>
Cc: Shane Wang <shane.wang@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>