pandora-kernel.git
11 years agosignal: Define __ARCH_HAS_SA_RESTORER so we know whether to clear sa_restorer
Ben Hutchings [Mon, 26 Nov 2012 03:24:19 +0000 (22:24 -0500)]
signal: Define __ARCH_HAS_SA_RESTORER so we know whether to clear sa_restorer

flush_signal_handlers() needs to know whether sigaction::sa_restorer
is defined, not whether SA_RESTORER is defined.  Define the
__ARCH_HAS_SA_RESTORER macro to indicate this.

Vaguely based on upstream commit 574c4866e33d 'consolidate kernel-side
struct sigaction declarations'.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Al Viro <viro@zeniv.linux.org.uk>
11 years agoefivars: pstore: Do not check size when erasing variable
Ben Hutchings [Sat, 23 Mar 2013 03:49:53 +0000 (03:49 +0000)]
efivars: pstore: Do not check size when erasing variable

In 3.2, unlike mainline, efi_pstore_erase() calls efi_pstore_write()
with a size of 0, as the underlying EFI interface treats a size of 0
as meaning deletion.

This was not taken into account in my backport of commit d80a361d779a
'efi_pstore: Check remaining space with QueryVariableInfo() before
writing data'.  The size check should be omitted when erasing.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agoKMS: fix EDID detailed timing frame rate
Torsten Duwe [Sat, 23 Mar 2013 14:39:34 +0000 (15:39 +0100)]
KMS: fix EDID detailed timing frame rate

commit c19b3b0f6eed552952845e4ad908dba2113d67b4 upstream.

When KMS has parsed an EDID "detailed timing", it leaves the frame rate
zeroed.  Consecutive (debug-) output of that mode thus yields 0 for
vsync.  This simple fix also speeds up future invocations of
drm_mode_vrefresh().

While it is debatable whether this qualifies as a -stable fix I'd apply
it for consistency's sake; drm_helper_probe_single_connector_modes()
does the same thing already for all probed modes.

Signed-off-by: Torsten Duwe <duwe@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agoKMS: fix EDID detailed timing vsync parsing
Torsten Duwe [Sat, 23 Mar 2013 14:38:22 +0000 (15:38 +0100)]
KMS: fix EDID detailed timing vsync parsing

commit 16dad1d743d31a104a849c8944e6b9eb479f6cd7 upstream.

EDID spreads some values across multiple bytes; bit-fiddling is needed
to retrieve these.  The current code to parse "detailed timings" has a
cut&paste error that results in a vsync offset of at most 15 lines
instead of 63.

See

   http://en.wikipedia.org/wiki/EDID

and in the "EDID Detailed Timing Descriptor" see bytes 10+11 show why
that needs to be a left shift.

Signed-off-by: Torsten Duwe <duwe@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agomm/hugetlb: fix total hugetlbfs pages count when using memory overcommit accouting
Wanpeng Li [Fri, 22 Mar 2013 22:04:40 +0000 (15:04 -0700)]
mm/hugetlb: fix total hugetlbfs pages count when using memory overcommit accouting

commit d00285884c0892bb1310df96bce6056e9ce9b9d9 upstream.

hugetlb_total_pages is used for overcommit calculations but the current
implementation considers only the default hugetlb page size (which is
either the first defined hugepage size or the one specified by
default_hugepagesz kernel boot parameter).

If the system is configured for more than one hugepage size, which is
possible since commit a137e1cc6d6e ("hugetlbfs: per mount huge page
sizes") then the overcommit estimation done by __vm_enough_memory()
(resp.  shown by meminfo_proc_show) is not precise - there is an
impression of more available/allowed memory.  This can lead to an
unexpected ENOMEM/EFAULT resp.  SIGSEGV when memory is accounted.

Testcase:
  boot: hugepagesz=1G hugepages=1
  the default overcommit ratio is 50
  before patch:

    egrep 'CommitLimit' /proc/meminfo
    CommitLimit:     55434168 kB

  after patch:

    egrep 'CommitLimit' /proc/meminfo
    CommitLimit:     54909880 kB

[akpm@linux-foundation.org: coding-style tweak]
Signed-off-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agovfs,proc: guarantee unique inodes in /proc
Linus Torvalds [Fri, 22 Mar 2013 18:44:04 +0000 (11:44 -0700)]
vfs,proc: guarantee unique inodes in /proc

commit 51f0885e5415b4cc6535e9cdcc5145bfbc134353 upstream.

Dave Jones found another /proc issue with his Trinity tool: thanks to
the namespace model, we can have multiple /proc dentries that point to
the same inode, aliasing directories in /proc/<pid>/net/ for example.

This ends up being a total disaster, because it acts like hardlinked
directories, and causes locking problems.  We rely on the topological
sort of the inodes pointed to by dentries, and if we have aliased
directories, that odering becomes unreliable.

In short: don't do this.  Multiple dentries with the same (directory)
inode is just a bad idea, and the namespace code should never have
exposed things this way.  But we're kind of stuck with it.

This solves things by just always allocating a new inode during /proc
dentry lookup, instead of using "iget_locked()" to look up existing
inodes by superblock and number.  That actually simplies the code a bit,
at the cost of potentially doing more inode [de]allocations.

That said, the inode lookup wasn't free either (and did a lot of locking
of inodes), so it is probably not that noticeable.  We could easily keep
the old lookup model for non-directory entries, but rather than try to
be excessively clever this just implements the minimal and simplest
workaround for the problem.

Reported-and-tested-by: Dave Jones <davej@redhat.com>
Analyzed-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[bwh: Backported to 3.2:
 - Adjust context
 - Never drop the pde reference in proc_get_inode(), as callers only
   expect this when we return an existing inode, and we never do that now]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agoi2c: tegra: check the clk_prepare_enable() return value
Laxman Dewangan [Fri, 15 Mar 2013 05:34:08 +0000 (05:34 +0000)]
i2c: tegra: check the clk_prepare_enable() return value

commit 132c803f7b70b17322579f6f4f3f65cf68e55135 upstream.

NVIDIA's Tegra SoC allows read/write of controller register only
if controller clock is enabled. System hangs if read/write happens
to registers without enabling clock.

clk_prepare_enable() can be fail due to unknown reason and hence
adding check for return value of this function. If this function
success then only access register otherwise return to caller with
error.

Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com>
Reviewed-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
[bwh: Backported to 3.2:
 - Adjust context
 - Keep calling clk_enable() directly]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agoUSB: serial: fix interface refcounting
Johan Hovold [Tue, 19 Mar 2013 08:21:09 +0000 (09:21 +0100)]
USB: serial: fix interface refcounting

commit d7971051e4df825e0bc11b995e87bfe86355b8e5 upstream.

Make sure the interface is not released before our serial device.

Note that drivers are still not allowed to access the interface in
any way that may interfere with another driver that may have gotten
bound to the same interface after disconnect returns.

Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agoUSB: io_ti: fix get_icount for two port adapters
Johan Hovold [Tue, 19 Mar 2013 08:21:08 +0000 (09:21 +0100)]
USB: io_ti: fix get_icount for two port adapters

commit 5492bf3d5655b4954164f69c02955a7fca267611 upstream.

Add missing get_icount field to two-port driver.

The two-port driver was not updated when switching to the new icount
interface in commit 0bca1b913aff ("tty: Convert the USB drivers to the
new icount interface").

Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agoUSB: garmin_gps: fix memory leak on disconnect
Johan Hovold [Tue, 19 Mar 2013 08:21:07 +0000 (09:21 +0100)]
USB: garmin_gps: fix memory leak on disconnect

commit 618aa1068df29c37a58045fe940f9106664153fd upstream.

Remove bogus disconnect test introduced by 95bef012e ("USB: more serial
drivers writing after disconnect") which prevented queued data from
being freed on disconnect.

The possible IO it was supposed to prevent is long gone.

Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agocifs: ignore everything in SPNEGO blob after mechTypes
Jeff Layton [Mon, 11 Mar 2013 13:52:19 +0000 (09:52 -0400)]
cifs: ignore everything in SPNEGO blob after mechTypes

commit f853c616883a8de966873a1dab283f1369e275a1 upstream.

We've had several reports of people attempting to mount Windows 8 shares
and getting failures with a return code of -EINVAL. The default sec=
mode changed recently to sec=ntlmssp. With that, we expect and parse a
SPNEGO blob from the server in the NEGOTIATE reply.

The current decode_negTokenInit function first parses all of the
mechTypes and then tries to parse the rest of the negTokenInit reply.
The parser however currently expects a mechListMIC or nothing to follow the
mechTypes, but Windows 8 puts a mechToken field there instead to carry
some info for the new NegoEx stuff.

In practice, we don't do anything with the fields after the mechTypes
anyway so I don't see any real benefit in continuing to parse them.
This patch just has the kernel ignore the fields after the mechTypes.
We'll probably need to reinstate some of this if we ever want to support
NegoEx.

Reported-by: Jason Burgess <jason@jacknife2.dns2go.com>
Reported-by: Yan Li <elliot.li.tech@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agoefivars: Handle duplicate names from get_next_variable()
Matt Fleming [Thu, 7 Mar 2013 11:59:14 +0000 (11:59 +0000)]
efivars: Handle duplicate names from get_next_variable()

commit e971318bbed610e28bb3fde9d548e6aaf0a6b02e upstream.

Some firmware exhibits a bug where the same VariableName and
VendorGuid values are returned on multiple invocations of
GetNextVariableName(). See,

    https://bugzilla.kernel.org/show_bug.cgi?id=47631

As a consequence of such a bug, Andre reports hitting the following
WARN_ON() in the sysfs code after updating the BIOS on his, "Gigabyte
Technology Co., Ltd. To be filled by O.E.M./Z77X-UD3H, BIOS F19e
11/21/2012)" machine,

[    0.581554] EFI Variables Facility v0.08 2004-May-17
[    0.584914] ------------[ cut here ]------------
[    0.585639] WARNING: at /home/andre/linux/fs/sysfs/dir.c:536 sysfs_add_one+0xd4/0x100()
[    0.586381] Hardware name: To be filled by O.E.M.
[    0.587123] sysfs: cannot create duplicate filename '/firmware/efi/vars/SbAslBufferPtrVar-01f33c25-764d-43ea-aeea-6b5a41f3f3e8'
[    0.588694] Modules linked in:
[    0.589484] Pid: 1, comm: swapper/0 Not tainted 3.8.0+ #7
[    0.590280] Call Trace:
[    0.591066]  [<ffffffff81208954>] ? sysfs_add_one+0xd4/0x100
[    0.591861]  [<ffffffff810587bf>] warn_slowpath_common+0x7f/0xc0
[    0.592650]  [<ffffffff810588bc>] warn_slowpath_fmt+0x4c/0x50
[    0.593429]  [<ffffffff8134dd85>] ? strlcat+0x65/0x80
[    0.594203]  [<ffffffff81208954>] sysfs_add_one+0xd4/0x100
[    0.594979]  [<ffffffff81208b78>] create_dir+0x78/0xd0
[    0.595753]  [<ffffffff81208ec6>] sysfs_create_dir+0x86/0xe0
[    0.596532]  [<ffffffff81347e4c>] kobject_add_internal+0x9c/0x220
[    0.597310]  [<ffffffff81348307>] kobject_init_and_add+0x67/0x90
[    0.598083]  [<ffffffff81584a71>] ? efivar_create_sysfs_entry+0x61/0x1c0
[    0.598859]  [<ffffffff81584b2b>] efivar_create_sysfs_entry+0x11b/0x1c0
[    0.599631]  [<ffffffff8158517e>] register_efivars+0xde/0x420
[    0.600395]  [<ffffffff81d430a7>] ? edd_init+0x2f5/0x2f5
[    0.601150]  [<ffffffff81d4315f>] efivars_init+0xb8/0x104
[    0.601903]  [<ffffffff8100215a>] do_one_initcall+0x12a/0x180
[    0.602659]  [<ffffffff81d05d80>] kernel_init_freeable+0x13e/0x1c6
[    0.603418]  [<ffffffff81d05586>] ? loglevel+0x31/0x31
[    0.604183]  [<ffffffff816a6530>] ? rest_init+0x80/0x80
[    0.604936]  [<ffffffff816a653e>] kernel_init+0xe/0xf0
[    0.605681]  [<ffffffff816ce7ec>] ret_from_fork+0x7c/0xb0
[    0.606414]  [<ffffffff816a6530>] ? rest_init+0x80/0x80
[    0.607143] ---[ end trace 1609741ab737eb29 ]---

There's not much we can do to work around and keep traversing the
variable list once we hit this firmware bug. Our only solution is to
terminate the loop because, as Lingzhu reports, some machines get
stuck when they encounter duplicate names,

  > I had an IBM System x3100 M4 and x3850 X5 on which kernel would
  > get stuck in infinite loop creating duplicate sysfs files because,
  > for some reason, there are several duplicate boot entries in nvram
  > getting GetNextVariableName into a circle of iteration (with
  > period > 2).

Also disable the workqueue, as efivar_update_sysfs_entries() uses
GetNextVariableName() to figure out which variables have been created
since the last iteration. That algorithm isn't going to work if
GetNextVariableName() returns duplicates. Note that we don't disable
EFI variable creation completely on the affected machines, it's just
that any pstore dump-* files won't appear in sysfs until the next
boot.

Reported-by: Andre Heider <a.heider@gmail.com>
Reported-by: Lingzhu Xiang <lxiang@redhat.com>
Tested-by: Lingzhu Xiang <lxiang@redhat.com>
Cc: Seiji Aguchi <seiji.aguchi@hds.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
[bwh: Backported to 3.2: reason is not checked in efi_pstore_write()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agoefivars: explicitly calculate length of VariableName
Matt Fleming [Fri, 1 Mar 2013 14:49:12 +0000 (14:49 +0000)]
efivars: explicitly calculate length of VariableName

commit ec50bd32f1672d38ddce10fb1841cbfda89cfe9a upstream.

It's not wise to assume VariableNameSize represents the length of
VariableName, as not all firmware updates VariableNameSize in the same
way (some don't update it at all if EFI_SUCCESS is returned). There
are even implementations out there that update VariableNameSize with
values that are both larger than the string returned in VariableName
and smaller than the buffer passed to GetNextVariableName(), which
resulted in the following bug report from Michael Schroeder,

  > On HP z220 system (firmware version 1.54), some EFI variables are
  > incorrectly named :
  >
  > ls -d /sys/firmware/efi/vars/*8be4d* | grep -v -- -8be returns
  > /sys/firmware/efi/vars/dbxDefault-pport8be4df61-93ca-11d2-aa0d-00e098032b8c
  > /sys/firmware/efi/vars/KEKDefault-pport8be4df61-93ca-11d2-aa0d-00e098032b8c
  > /sys/firmware/efi/vars/SecureBoot-pport8be4df61-93ca-11d2-aa0d-00e098032b8c
  > /sys/firmware/efi/vars/SetupMode-Information8be4df61-93ca-11d2-aa0d-00e098032b8c

The issue here is that because we blindly use VariableNameSize without
verifying its value, we can potentially read garbage values from the
buffer containing VariableName if VariableNameSize is larger than the
length of VariableName.

Since VariableName is a string, we can calculate its size by searching
for the terminating NULL character.

Reported-by: Frederic Crozat <fcrozat@suse.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Josh Boyer <jwboyer@redhat.com>
Cc: Michael Schroeder <mls@suse.com>
Cc: Lee, Chun-Yi <jlee@suse.com>
Cc: Lingzhu Xiang <lxiang@redhat.com>
Cc: Seiji Aguchi <seiji.aguchi@hds.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agoefi_pstore: Introducing workqueue updating sysfs
Seiji Aguchi [Tue, 12 Feb 2013 21:04:41 +0000 (13:04 -0800)]
efi_pstore: Introducing workqueue updating sysfs

commit a93bc0c6e07ed9bac44700280e65e2945d864fd4 upstream.

[Problem]
efi_pstore creates sysfs entries, which enable users to access to NVRAM,
in a write callback. If a kernel panic happens in an interrupt context,
it may fail because it could sleep due to dynamic memory allocations during
creating sysfs entries.

[Patch Description]
This patch removes sysfs operations from a write callback by introducing
a workqueue updating sysfs entries which is scheduled after the write
callback is called.

Also, the workqueue is kicked in a just oops case.
A system will go down in other cases such as panic, clean shutdown and emergency
restart. And we don't need to create sysfs entries because there is no chance for
users to access to them.

efi_pstore will be robust against a kernel panic in an interrupt context with this patch.

Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com>
Acked-by: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
[bwh: Backported to 3.2:
 - Adjust contest
 - Don't check reason in efi_pstore_write(), as it is not given as a
   parameter
 - Move up declaration of __efivars]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agoefivars: Fix check for CONFIG_EFI_VARS_PSTORE_DEFAULT_DISABLE
Ben Hutchings [Fri, 22 Mar 2013 19:56:51 +0000 (19:56 +0000)]
efivars: Fix check for CONFIG_EFI_VARS_PSTORE_DEFAULT_DISABLE

commit ca0ba26fbbd2d81c43085df49ce0abfe34535a90 upstream.

The 'CONFIG_' prefix is not implicit in IS_ENABLED().

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
11 years agoefivars: Add module parameter to disable use as a pstore backend
Seth Forshee [Mon, 11 Mar 2013 21:17:50 +0000 (16:17 -0500)]
efivars: Add module parameter to disable use as a pstore backend

commit ec0971ba5372a4dfa753f232449d23a8fd98490e upstream.

We know that with some firmware implementations writing too much data to
UEFI variables can lead to bricking machines. Recent changes attempt to
address this issue, but for some it may still be prudent to avoid
writing large amounts of data until the solution has been proven on a
wide variety of hardware.

Crash dumps or other data from pstore can potentially be a large data
source. Add a pstore_module parameter to efivars to allow disabling its
use as a backend for pstore. Also add a config option,
CONFIG_EFI_VARS_PSTORE_DEFAULT_DISABLE, to allow setting the default
value of this paramter to true (i.e. disabled by default).

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Cc: Josh Boyer <jwboyer@redhat.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Seiji Aguchi <seiji.aguchi@hds.com>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agoefivars: Allow disabling use as a pstore backend
Seth Forshee [Thu, 7 Mar 2013 17:40:17 +0000 (11:40 -0600)]
efivars: Allow disabling use as a pstore backend

commit ed9dc8ce7a1c8115dba9483a9b51df8b63a2e0ef upstream.

Add a new option, CONFIG_EFI_VARS_PSTORE, which can be set to N to
avoid using efivars as a backend to pstore, as some users may want to
compile out the code completely.

Set the default to Y to maintain backwards compatability, since this
feature has always been enabled until now.

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Cc: Josh Boyer <jwboyer@redhat.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Seiji Aguchi <seiji.aguchi@hds.com>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agodm thin: fix discard corruption
Joe Thornber [Wed, 20 Mar 2013 17:21:24 +0000 (17:21 +0000)]
dm thin: fix discard corruption

commit f046f89a99ccfd9408b94c653374ff3065c7edb3 upstream.

Fix a bug in dm_btree_remove that could leave leaf values with incorrect
reference counts.  The effect of this was that removal of a shared block
could result in the space maps thinking the block was no longer used.
More concretely, if you have a thin device and a snapshot of it, sending
a discard to a shared region of the thin could corrupt the snapshot.

Thinp uses a 2-level nested btree to store it's mappings.  This first
level is indexed by thin device, and the second level by logical
block.

Often when we're removing an entry in this mapping tree we need to
rebalance nodes, which can involve shadowing them, possibly creating a
copy if the block is shared.  If we do create a copy then children of
that node need to have their reference counts incremented.  In this
way reference counts percolate down the tree as shared trees diverge.

The rebalance functions were incrementing the children at the
appropriate time, but they were always assuming the children were
internal nodes.  This meant the leaf values (in our case packed
block/flags entries) were not being incremented.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
[bwh: Backported to 3.2: bump target version numbers from 1.0.1 to 1.0.2]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agousb: gadget: udc-core: fix a regression during gadget driver unbinding
Alan Stern [Fri, 15 Mar 2013 18:02:14 +0000 (14:02 -0400)]
usb: gadget: udc-core: fix a regression during gadget driver unbinding

commit 511f3c5326eabe1ece35202a404c24c0aeacc246 upstream.

This patch (as1666) fixes a regression in the UDC core.  The core
takes care of unbinding gadget drivers, and it does the unbinding
before telling the UDC driver to turn off the controller hardware.
When the call to the udc_stop callback is made, the gadget no longer
has a driver.  The callback routine should not be invoked with a
pointer to the old driver; doing so can cause problems (such as
use-after-free accesses in net2280).

This patch should be applied, with appropriate context changes, to all
the stable kernels going back to 3.1.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Felipe Balbi <balbi@ti.com>
[bwh: Backported to 3.2: adjust context, indentation]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agoALSA: hda - Fix typo in checking IEC958 emphasis bit
Takashi Iwai [Wed, 20 Mar 2013 14:42:00 +0000 (15:42 +0100)]
ALSA: hda - Fix typo in checking IEC958 emphasis bit

commit a686fd141e20244ad75f80ad54706da07d7bb90a upstream.

There is a typo in convert_to_spdif_status() about checking the
emphasis IEC958 status bit.  It should check the given value instead
of the resultant value.

Reported-by: Martin Weishart <martin.weishart@telosalliance.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agoext4: fix data=journal fast mount/umount hang
Theodore Ts'o [Wed, 20 Mar 2013 13:42:11 +0000 (09:42 -0400)]
ext4: fix data=journal fast mount/umount hang

commit 2b405bfa84063bfa35621d2d6879f52693c614b0 upstream.

In data=journal mode, if we unmount the file system before a
transaction has a chance to complete, when the journal inode is being
evicted, we can end up calling into jbd2_log_wait_commit() for the
last transaction, after the journalling machinery has been shut down.

Arguably we should adjust ext4_should_journal_data() to return FALSE
for the journal inode, but the only place it matters is
ext4_evict_inode(), and so to save a bit of CPU time, and to make the
patch much more obviously correct by inspection(tm), we'll fix it by
explicitly not trying to waiting for a journal commit when we are
evicting the journal inode, since it's guaranteed to never succeed in
this case.

This can be easily replicated via:

     mount -t ext4 -o data=journal /dev/vdb /vdb ; umount /vdb

------------[ cut here ]------------
WARNING: at /usr/projects/linux/ext4/fs/jbd2/journal.c:542 __jbd2_log_start_commit+0xba/0xcd()
Hardware name: Bochs
JBD2: bad log_start_commit: 3005630206 3005630206 0 0
Modules linked in:
Pid: 2909, comm: umount Not tainted 3.8.0-rc3 #1020
Call Trace:
 [<c015c0ef>] warn_slowpath_common+0x68/0x7d
 [<c02b7e7d>] ? __jbd2_log_start_commit+0xba/0xcd
 [<c015c177>] warn_slowpath_fmt+0x2b/0x2f
 [<c02b7e7d>] __jbd2_log_start_commit+0xba/0xcd
 [<c02b8075>] jbd2_log_start_commit+0x24/0x34
 [<c0279ed5>] ext4_evict_inode+0x71/0x2e3
 [<c021f0ec>] evict+0x94/0x135
 [<c021f9aa>] iput+0x10a/0x110
 [<c02b7836>] jbd2_journal_destroy+0x190/0x1ce
 [<c0175284>] ? bit_waitqueue+0x50/0x50
 [<c028d23f>] ext4_put_super+0x52/0x294
 [<c020efe3>] generic_shutdown_super+0x48/0xb4
 [<c020f071>] kill_block_super+0x22/0x60
 [<c020f3e0>] deactivate_locked_super+0x22/0x49
 [<c020f5d6>] deactivate_super+0x30/0x33
 [<c0222795>] mntput_no_expire+0x107/0x10c
 [<c02233a7>] sys_umount+0x2cf/0x2e0
 [<c02233ca>] sys_oldumount+0x12/0x14
 [<c08096b8>] syscall_call+0x7/0xb
---[ end trace 6a954cc790501c1f ]---
jbd2_log_wait_commit: error: j_commit_request=-1289337090, tid=0

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agoALSA: snd-usb: mixer: ignore -EINVAL in snd_usb_mixer_controls()
Daniel Mack [Tue, 19 Mar 2013 20:09:25 +0000 (21:09 +0100)]
ALSA: snd-usb: mixer: ignore -EINVAL in snd_usb_mixer_controls()

commit 83ea5d18d74f032a760fecde78c0210f66f7f70c upstream.

Creation of individual mixer controls may fail, but that shouldn't cause
the entire mixer creation to fail. Even worse, if the mixer creation
fails, that will error out the entire device probing.

All the functions called by parse_audio_unit() should return -EINVAL if
they find descriptors that are unsupported or believed to be malformed,
so we can safely handle this error code as a non-fatal condition in
snd_usb_mixer_controls().

That fixes a long standing bug which is commonly worked around by
adding quirks which make the driver ignore entire interfaces. Some of
them might now be unnecessary.

Signed-off-by: Daniel Mack <zonque@gmail.com>
Reported-and-tested-by: Rodolfo Thomazelli <pe.soberbo@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agoALSA: snd-usb: mixer: propagate errors up the call chain
Daniel Mack [Tue, 19 Mar 2013 20:09:24 +0000 (21:09 +0100)]
ALSA: snd-usb: mixer: propagate errors up the call chain

commit 4d7b86c98e445b075c2c4c3757eb6d3d6efbe72e upstream.

In check_input_term() and parse_audio_feature_unit(), propagate the
error value that has been returned by a failing function instead of
-EINVAL. That helps cleaning up the error pathes in the mixer.

Signed-off-by: Daniel Mack <zonque@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agox86-64: Fix the failure case in copy_user_handle_tail()
CQ Tang [Mon, 18 Mar 2013 15:02:21 +0000 (11:02 -0400)]
x86-64: Fix the failure case in copy_user_handle_tail()

commit 66db3feb486c01349f767b98ebb10b0c3d2d021b upstream.

The increment of "to" in copy_user_handle_tail() will have incremented
before a failure has been noted.  This causes us to skip a byte in the
failure case.

Only do the increment when assured there is no failure.

Signed-off-by: CQ Tang <cq.tang@intel.com>
Link: http://lkml.kernel.org/r/20130318150221.8439.993.stgit@phlsvslse11.ph.intel.com
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agoUSB: xhci - fix bit definitions for IMAN register
Dmitry Torokhov [Mon, 25 Feb 2013 18:56:01 +0000 (10:56 -0800)]
USB: xhci - fix bit definitions for IMAN register

commit f8264340e694604863255cc0276491d17c402390 upstream.

According to XHCI specification (5.5.2.1) the IP is bit 0 and IE is bit 1
of IMAN register. Previously their definitions were reversed.

Even though there are no ill effects being observed from the swapped
definitions (because IMAN_IP is RW1C and in legacy PCI case we come in
with it already set to 1 so it was clearing itself even though we were
setting IMAN_IE instead of IMAN_IP), we should still correct the values.

This patch should be backported to kernels as old as 2.6.36, that
contain the commit 4e833c0b87a30798e67f06120cecebef6ee9644c "xhci: don't
re-enable IE constantly".

Signed-off-by: Dmitry Torokhov <dtor@vmware.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agoALSA: hda/cirrus - Fix the digital beep registration
Takashi Iwai [Mon, 18 Mar 2013 10:00:44 +0000 (11:00 +0100)]
ALSA: hda/cirrus - Fix the digital beep registration

commit a86b1a2cd2f81f74e815e07f756edd7bc5b6f034 upstream.

The argument passed to snd_hda_attach_beep_device() is a widget NID
while spec->beep_amp holds the composed value for amp controls.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agodrm/radeon/benchmark: make sure bo blit copy exists before using it
Alex Deucher [Tue, 12 Mar 2013 16:53:13 +0000 (12:53 -0400)]
drm/radeon/benchmark: make sure bo blit copy exists before using it

commit fa8d387dc3f62062a6b4afbbb2a3438094fd8584 upstream.

Fixes a segfault on asics without a blit callback.

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=62239

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[bwh: Backported to 3.2: s/copy\.blit/copy_blit/]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agousb-storage: add unusual_devs entry for Samsung YP-Z3 mp3 player
Dmitry Artamonow [Sat, 9 Mar 2013 16:30:58 +0000 (20:30 +0400)]
usb-storage: add unusual_devs entry for Samsung YP-Z3 mp3 player

commit 29f86e66428ee083aec106cca1748dc63d98ce23 upstream.

Device stucks on filesystem writes, unless following quirk is passed:
  echo 04e8:5136:m > /sys/module/usb_storage/parameters/quirks

Add corresponding entry to unusual_devs.h

Signed-off-by: Dmitry Artamonow <mad_soft@inbox.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agoUSB: xhci: correctly enable interrupts
Hannes Reinecke [Mon, 4 Mar 2013 16:14:43 +0000 (17:14 +0100)]
USB: xhci: correctly enable interrupts

commit 00eed9c814cb8f281be6f0f5d8f45025dc0a97eb upstream.

xhci has its own interrupt enabling routine, which will try to
use MSI-X/MSI if present. So the usb core shouldn't try to enable
legacy interrupts; on some machines the xhci legacy IRQ setting
is invalid.

v3: Be careful to not break XHCI_BROKEN_MSI workaround (by trenn)

Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Oliver Neukum <oneukum@suse.de>
Cc: Thomas Renninger <trenn@suse.de>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Frederik Himpe <fhimpe@vub.ac.be>
Cc: David Haerdeman <david@hardeman.nu>
Cc: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Reviewed-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agotracing: Prevent buffer overwrite disabled for latency tracers
Steven Rostedt (Red Hat) [Thu, 14 Mar 2013 19:03:53 +0000 (15:03 -0400)]
tracing: Prevent buffer overwrite disabled for latency tracers

commit 613f04a0f51e6e68ac6fe571ab79da3c0a5eb4da upstream.

The latency tracers require the buffers to be in overwrite mode,
otherwise they get screwed up. Force the buffers to stay in overwrite
mode when latency tracers are enabled.

Added a flag_changed() method to the tracer structure to allow
the tracers to see what flags are being changed, and also be able
to prevent the change from happing.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
[bwh: Backported to 3.2:
 - Adjust context
 - Drop some changes that are not needed because trace_set_options() is not
   separate from tracing_trace_options_write()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agotracing: Keep overwrite in sync between regular and snapshot buffers
Steven Rostedt (Red Hat) [Thu, 14 Mar 2013 18:20:54 +0000 (14:20 -0400)]
tracing: Keep overwrite in sync between regular and snapshot buffers

commit 80902822658aab18330569587cdb69ac1dfdcea8 upstream.

Changing the overwrite mode for the ring buffer via the trace
option only sets the normal buffer. But the snapshot buffer could
swap with it, and then the snapshot would be in non overwrite mode
and the normal buffer would be in overwrite mode, even though the
option flag states otherwise.

Keep the two buffers overwrite modes in sync.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agotracing: Protect tracer flags with trace_types_lock
Steven Rostedt (Red Hat) [Thu, 14 Mar 2013 17:50:56 +0000 (13:50 -0400)]
tracing: Protect tracer flags with trace_types_lock

commit 69d34da2984c95b33ea21518227e1f9470f11d95 upstream.

Seems that the tracer flags have never been protected from
synchronous writes. Luckily, admins don't usually modify the
tracing flags via two different tasks. But if scripts were to
be used to modify them, then they could get corrupted.

Move the trace_types_lock that protects against tracers changing
to also protect the flags being set.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
[bwh: Backported to 3.2: also move failure return in
 tracing_trace_options_write() after unlocking]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agotracing: Fix free of probe entry by calling call_rcu_sched()
Steven Rostedt (Red Hat) [Wed, 13 Mar 2013 15:15:19 +0000 (11:15 -0400)]
tracing: Fix free of probe entry by calling call_rcu_sched()

commit 740466bc89ad8bd5afcc8de220f715f62b21e365 upstream.

Because function tracing is very invasive, and can even trace
calls to rcu_read_lock(), RCU access in function tracing is done
with preempt_disable_notrace(). This requires a synchronize_sched()
for updates and not a synchronize_rcu().

Function probes (traceon, traceoff, etc) must be freed after
a synchronize_sched() after its entry has been removed from the
hash. But call_rcu() is used. Fix this by using call_rcu_sched().

Also fix the usage to use hlist_del_rcu() instead of hlist_del().

Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agodrm/i915: bounds check execbuffer relocation count
Kees Cook [Tue, 12 Mar 2013 00:31:45 +0000 (17:31 -0700)]
drm/i915: bounds check execbuffer relocation count

commit 3118a4f652c7b12c752f3222af0447008f9b2368 upstream.

It is possible to wrap the counter used to allocate the buffer for
relocation copies. This could lead to heap writing overflows.

CVE-2013-0913

v3: collapse test, improve comment
v2: move check into validate_exec_list

Signed-off-by: Kees Cook <keescook@chromium.org>
Reported-by: Pinkie Pie
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agodrm/i915: restrict kernel address leak in debugfs
Kees Cook [Mon, 11 Mar 2013 19:25:19 +0000 (12:25 -0700)]
drm/i915: restrict kernel address leak in debugfs

commit 2563a4524febe8f4a98e717e02436d1aaf672aa2 upstream.

Masks kernel address info-leak in object dumps with the %pK suffix,
so they cannot be used to target kernel memory corruption attacks if
the kptr_restrict sysctl is set.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
[bwh: Backported to 3.2: the rest of the format string is different]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agocifs: delay super block destruction until all cifsFileInfo objects are gone
Mateusz Guzik [Fri, 8 Mar 2013 15:30:03 +0000 (16:30 +0100)]
cifs: delay super block destruction until all cifsFileInfo objects are gone

commit 24261fc23db950951760d00c188ba63cc756b932 upstream.

cifsFileInfo objects hold references to dentries and it is possible that
these will still be around in workqueues when VFS decides to kill super
block during unmount.

This results in panics like this one:
BUG: Dentry ffff88001f5e76c0{i=66b4a,n=1M-2} still in use (1) [unmount of cifs cifs]
------------[ cut here ]------------
kernel BUG at fs/dcache.c:943!
[..]
Process umount (pid: 1781, threadinfo ffff88003d6e8000, task ffff880035eeaec0)
[..]
Call Trace:
 [<ffffffff811b44f3>] shrink_dcache_for_umount+0x33/0x60
 [<ffffffff8119f7fc>] generic_shutdown_super+0x2c/0xe0
 [<ffffffff8119f946>] kill_anon_super+0x16/0x30
 [<ffffffffa036623a>] cifs_kill_sb+0x1a/0x30 [cifs]
 [<ffffffff8119fcc7>] deactivate_locked_super+0x57/0x80
 [<ffffffff811a085e>] deactivate_super+0x4e/0x70
 [<ffffffff811bb417>] mntput_no_expire+0xd7/0x130
 [<ffffffff811bc30c>] sys_umount+0x9c/0x3c0
 [<ffffffff81657c19>] system_call_fastpath+0x16/0x1b

Fix this by making each cifsFileInfo object hold a reference to cifs
super block, which implicitly keeps VFS super block around as well.

Signed-off-by: Mateusz Guzik <mguzik@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reported-and-Tested-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agotracing: Fix race in snapshot swapping
Steven Rostedt (Red Hat) [Tue, 12 Mar 2013 15:32:32 +0000 (11:32 -0400)]
tracing: Fix race in snapshot swapping

commit 2721e72dd10f71a3ba90f59781becf02638aa0d9 upstream.

Although the swap is wrapped with a spin_lock, the assignment
of the temp buffer used to swap is not within that lock.
It needs to be moved into that lock, otherwise two swaps
happening on two different CPUs, can end up using the wrong
temp buffer to assign in the swap.

Luckily, all current callers of the swap function appear to have
their own locks. But in case something is added that allows two
different callers to call the swap, then there's a chance that
this race can trigger and corrupt the buffers.

New code is coming soon that will allow for this race to trigger.

I've Cc'd stable, so this bug will not show up if someone backports
one of the changes that can trigger this bug.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agoext4: use atomic64_t for the per-flexbg free_clusters count
Theodore Ts'o [Tue, 12 Mar 2013 03:39:59 +0000 (23:39 -0400)]
ext4: use atomic64_t for the per-flexbg free_clusters count

commit 90ba983f6889e65a3b506b30dc606aa9d1d46cd2 upstream.

A user who was using a 8TB+ file system and with a very large flexbg
size (> 65536) could cause the atomic_t used in the struct flex_groups
to overflow.  This was detected by PaX security patchset:

http://forums.grsecurity.net/viewtopic.php?f=3&t=3289&p=12551#p12551

This bug was introduced in commit 9f24e4208f7e, so it's been around
since 2.6.30.  :-(

Fix this by using an atomic64_t for struct orlav_stats's
free_clusters.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agoext4: convert number of blocks to clusters properly
Lukas Czerner [Sat, 2 Mar 2013 22:18:58 +0000 (17:18 -0500)]
ext4: convert number of blocks to clusters properly

commit 810da240f221d64bf90020f25941b05b378186fe upstream.

We're using macro EXT4_B2C() to convert number of blocks to number of
clusters for bigalloc file systems.  However, we should be using
EXT4_NUM_B2C().

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
[bwh: Backported to 3.2:
 - Adjust context
 - Drop changes in ext4_setup_new_descs() and ext4_calculate_overhead()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agojbd2: fix use after free in jbd2_journal_dirty_metadata()
Jan Kara [Mon, 11 Mar 2013 17:24:56 +0000 (13:24 -0400)]
jbd2: fix use after free in jbd2_journal_dirty_metadata()

commit ad56edad089b56300fd13bb9eeb7d0424d978239 upstream.

jbd2_journal_dirty_metadata() didn't get a reference to journal_head it
was working with. This is OK in most of the cases since the journal head
should be attached to a transaction but in rare occasions when we are
journalling data, __ext4_journalled_writepage() can race with
jbd2_journal_invalidatepage() stripping buffers from a page and thus
journal head can be freed under hands of jbd2_journal_dirty_metadata().

Fix the problem by getting own journal head reference in
jbd2_journal_dirty_metadata() (and also in jbd2_journal_set_triggers()
which can possibly have the same issue).

Reported-by: Zheng Liu <gnehzuil.liu@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agoext4: fix the wrong number of the allocated blocks in ext4_split_extent()
Zheng Liu [Mon, 11 Mar 2013 01:20:23 +0000 (21:20 -0400)]
ext4: fix the wrong number of the allocated blocks in ext4_split_extent()

commit 3a2256702e47f68f921dfad41b1764d05c572329 upstream.

This commit fixes a wrong return value of the number of the allocated
blocks in ext4_split_extent.  When the length of blocks we want to
allocate is greater than the length of the current extent, we return a
wrong number.  Let's see what happens in the following case when we
call ext4_split_extent().

  map: [48, 72]
  ex:  [32, 64, u]

'ex' will be split into two parts:
  ex1: [32, 47, u]
  ex2: [48, 64, w]

'map->m_len' is returned from this function, and the value is 24.  But
the real length is 16.  So it should be fixed.

Meanwhile in this commit we use right length of the allocated blocks
when get_reserved_cluster_alloc in ext4_ext_handle_uninitialized_extents
is called.

Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agosfc: Only use TX push if a single descriptor is to be written
Ben Hutchings [Wed, 27 Feb 2013 16:50:38 +0000 (16:50 +0000)]
sfc: Only use TX push if a single descriptor is to be  written

[ Upstream commit fae8563b25f73dc584a07bcda7a82750ff4f7672 ]

Using TX push when notifying the NIC of multiple new descriptors in
the ring will very occasionally cause the TX DMA engine to re-use an
old descriptor.  This can result in a duplicated or partly duplicated
packet (new headers with old data), or an IOMMU page fault.  This does
not happen when the pushed descriptor is the only one written.

TX push also provides little latency benefit when a packet requires
more than one descriptor.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agosfc: Disable soft interrupt handling during efx_device_detach_sync()
Ben Hutchings [Tue, 5 Mar 2013 01:03:47 +0000 (01:03 +0000)]
sfc: Disable soft interrupt handling during  efx_device_detach_sync()

[ Upstream commit 35205b211c8d17a8a0b5e8926cb7c73e9a7ef1ad ]

efx_device_detach_sync() locks all TX queues before marking the device
detached and thus disabling further TX scheduling.  But it can still
be interrupted by TX completions which then result in TX scheduling in
soft interrupt context.  This will deadlock when it tries to acquire
a TX queue lock that efx_device_detach_sync() already acquired.

To avoid deadlock, we must use netif_tx_{,un}lock_bh().

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agosfc: Detach net device when stopping queues for reconfiguration
Ben Hutchings [Mon, 28 Jan 2013 19:01:06 +0000 (19:01 +0000)]
sfc: Detach net device when stopping queues for  reconfiguration

[ Upstream commit 29c69a4882641285a854d6d03ca5adbba68c0034 ]

We must only ever stop TX queues when they are full or the net device
is not 'ready' so far as the net core, and specifically the watchdog,
is concerned.  Otherwise, the watchdog may fire *immediately* if no
packets have been added to the queue in the last 5 seconds.

The device is ready if all the following are true:

(a) It has a qdisc
(b) It is marked present
(c) It is running
(d) The link is reported up

(a) and (c) are normally true, and must not be changed by a driver.
(d) is under our control, but fake link changes may disturb userland.
This leaves (b).  We already mark the device absent during reset
and self-test, but we need to do the same during MTU changes and ring
reallocation.  We don't need to do this when the device is brought
down because then (c) is already false.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agosfc: Fix efx_rx_buf_offset() in the presence of swiotlb
Ben Hutchings [Thu, 10 Jan 2013 23:51:54 +0000 (23:51 +0000)]
sfc: Fix efx_rx_buf_offset() in the presence of  swiotlb

[ Upstream commits 06e63c57acbb1df7c35ebe846ae416a8b88dfafa,
  b590ace09d51cd39744e0f7662c5e4a0d1b5d952 and
  c73e787a8db9117d59b5180baf83203a42ecadca ]

We assume that the mapping between DMA and virtual addresses is done
on whole pages, so we can find the page offset of an RX buffer using
the lower bits of the DMA address.  However, swiotlb maps in units of
2K, breaking this assumption.

Add an explicit page_offset field to struct efx_rx_buffer.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agosfc: Properly sync RX DMA buffer when it is not the last in the page
Ben Hutchings [Thu, 20 Dec 2012 18:48:20 +0000 (18:48 +0000)]
sfc: Properly sync RX DMA buffer when it is not the  last in the page

[ Upstream commit 3a68f19d7afb80f548d016effbc6ed52643a8085 ]

We may currently allocate two RX DMA buffers to a page, and only unmap
the page when the second is completed.  We do not sync the first RX
buffer to be completed; this can result in packet loss or corruption
if the last RX buffer completed in a NAPI poll is the first in a page
and is not DMA-coherent.  (In the middle of a NAPI poll, we will
handle the following RX completion and unmap the page *before* looking
at the content of the first buffer.)

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agosfc: Fix timekeeping in efx_mcdi_poll()
Ben Hutchings [Sat, 1 Dec 2012 02:21:17 +0000 (02:21 +0000)]
sfc: Fix timekeeping in efx_mcdi_poll()

[ Upstream commit ebf98e797b4e26ad52ace1511a0b503ee60a6cd4 ]

efx_mcdi_poll() uses get_seconds() to read the current time and to
implement a polling timeout.  The use of this function was chosen
partly because it could easily be replaced in a co-sim environment
with a macro that read the simulated time.

Unfortunately the real get_seconds() returns the system time (real
time) which is subject to adjustment by e.g. ntpd.  If the system time
is adjusted forward during a polled MCDI operation, the effective
timeout can be shorter than the intended 10 seconds, resulting in a
spurious failure.  It is also possible for a backward adjustment to
delay detection of a areal failure.

Use jiffies instead, and change MCDI_RPC_TIMEOUT to be denominated in
jiffies.  Also correct rounding of the timeout: check time > finish
(or rather time_after(time, finish)) and not time >= finish.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agosfc: lock TX queues when calling netif_device_detach()
Daniel Pieczko [Wed, 17 Oct 2012 12:21:23 +0000 (13:21 +0100)]
sfc: lock TX queues when calling netif_device_detach()

[ Upstream commit c2f3b8e3a44b6fe9e36704e30157ebe1a88c08b1 ]

The assertion of netif_device_present() at the top of
efx_hard_start_xmit() may fail if we don't do this.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agosfc: Fix two causes of flush failure
Ben Hutchings [Mon, 23 May 2011 11:18:45 +0000 (12:18 +0100)]
sfc: Fix two causes of flush failure

[ Upstream commits a606f4325dca6950996abbae452d33f2af095f39,
  d5e8cc6c946e0857826dcfbb3585068858445bfe,
  525d9e824018cd7cc8d8d44832ddcd363abfe6e1 ]

The TX DMA engine issues upstream read requests when there is room in
the TX FIFO for the completion. However, the fetches for the rest of
the packet might be delayed by any back pressure.  Since a flush must
wait for an EOP, the entire flush may be delayed by back pressure.

Mitigate this by disabling flow control before the flushes are
started.  Since PF and VF flushes run in parallel introduce
fc_disable, a reference count of the number of flushes outstanding.

The same principle could be applied to Falcon, but that
would bring with it its own testing.

We sometimes hit a "failed to flush" timeout on some TX queues, but the
flushes have completed and the flush completion events seem to go missing.
In this case, we can check the TX_DESC_PTR_TBL register and drain the
queues if the flushes had finished.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
[bwh: Backported to 3.2:
 - Call efx_nic_type::finish_flush() on both success and failure paths
 - Check the TX_DESC_PTR_TBL registers in the polling loop
 - Declare efx_mcdi_set_mac() extern]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agosfc: Convert firmware subtypes to native byte order in efx_mcdi_get_board_cfg()
Ben Hutchings [Thu, 6 Sep 2012 23:58:10 +0000 (00:58 +0100)]
sfc: Convert firmware subtypes to native byte order in  efx_mcdi_get_board_cfg()

[ Upstream commit bfeed902946a31692e7a24ed355b6d13ac37d014 ]

On big-endian systems the MTD partition names currently have mangled
subtype numbers and are not recognised by the firmware update tool
(sfupdate).

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
[bwh: Backported to 3.2: use old macros for length of firmware subtype array]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agosfc: Do not attempt to flush queues if DMA is disabled
Stuart Hodgson [Fri, 30 Mar 2012 12:04:51 +0000 (13:04 +0100)]
sfc: Do not attempt to flush queues if DMA is disabled

[ Upstream commit 3dca9d2dc285faf1910d405b65df845cab061356 ]

efx_nic_fatal_interrupt() disables DMA before scheduling a reset.
After this, we need not and *cannot* flush queues.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agoinet: limit length of fragment queue hash table bucket lists
Hannes Frederic Sowa [Fri, 15 Mar 2013 11:32:30 +0000 (11:32 +0000)]
inet: limit length of fragment queue hash table bucket  lists

[ Upstream commit 5a3da1fe9561828d0ca7eca664b16ec2b9bf0055 ]

This patch introduces a constant limit of the fragment queue hash
table bucket list lengths. Currently the limit 128 is choosen somewhat
arbitrary and just ensures that we can fill up the fragment cache with
empty packets up to the default ip_frag_high_thresh limits. It should
just protect from list iteration eating considerable amounts of cpu.

If we reach the maximum length in one hash bucket a warning is printed.
This is implemented on the caller side of inet_frag_find to distinguish
between the different users of inet_fragment.c.

I dropped the out of memory warning in the ipv4 fragment lookup path,
because we already get a warning by the slab allocator.

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Jesper Dangaard Brouer <jbrouer@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agortnetlink: Mask the rta_type when range checking
Vlad Yasevich [Wed, 13 Mar 2013 04:18:58 +0000 (04:18 +0000)]
rtnetlink: Mask the rta_type when range checking

[ Upstream commit a5b8db91442fce9c9713fcd656c3698f1adde1d6 ]

Range/validity checks on rta_type in rtnetlink_rcv_msg() do
not account for flags that may be set.  This causes the function
to return -EINVAL when flags are set on the type (for example
NLA_F_NESTED).

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agotcp: fix skb_availroom()
Eric Dumazet [Thu, 14 Mar 2013 05:40:32 +0000 (05:40 +0000)]
tcp: fix skb_availroom()

[ Upstream commit 16fad69cfe4adbbfa813de516757b87bcae36d93 ]

Chrome OS team reported a crash on a Pixel ChromeBook in TCP stack :

https://code.google.com/p/chromium/issues/detail?id=182056

commit a21d45726acac (tcp: avoid order-1 allocations on wifi and tx
path) did a poor choice adding an 'avail_size' field to skb, while
what we really needed was a 'reserved_tailroom' one.

It would have avoided commit 22b4a4f22da (tcp: fix retransmit of
partially acked frames) and this commit.

Crash occurs because skb_split() is not aware of the 'avail_size'
management (and should not be aware)

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Mukesh Agrawal <quiche@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agoipv4: fix definition of FIB_TABLE_HASHSZ
Denis V. Lunev [Wed, 13 Mar 2013 00:24:15 +0000 (00:24 +0000)]
ipv4: fix definition of FIB_TABLE_HASHSZ

[ Upstream commit 5b9e12dbf92b441b37136ea71dac59f05f2673a9 ]

a long time ago by the commit

  commit 93456b6d7753def8760b423ac6b986eb9d5a4a95
  Author: Denis V. Lunev <den@openvz.org>
  Date:   Thu Jan 10 03:23:38 2008 -0800

    [IPV4]: Unify access to the routing tables.

the defenition of FIB_HASH_TABLE size has obtained wrong dependency:
it should depend upon CONFIG_IP_MULTIPLE_TABLES (as was in the original
code) but it was depended from CONFIG_IP_ROUTE_MULTIPATH

This patch returns the situation to the original state.

The problem was spotted by Tingwei Liu.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Tingwei Liu <tingw.liu@gmail.com>
CC: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agosctp: don't break the loop while meeting the active_path so as to find the matched...
Xufeng Zhang [Thu, 7 Mar 2013 21:39:37 +0000 (21:39 +0000)]
sctp: don't break the loop while meeting the  active_path so as to find the matched transport

[ Upstream commit 2317f449af30073cfa6ec8352e4a65a89e357bdd ]

sctp_assoc_lookup_tsn() function searchs which transport a certain TSN
was sent on, if not found in the active_path transport, then go search
all the other transports in the peer's transport_addr_list, however, we
should continue to the next entry rather than break the loop when meet
the active_path transport.

Signed-off-by: Xufeng Zhang <xufeng.zhang@windriver.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agosctp: Use correct sideffect command in duplicate cookie handling
Vlad Yasevich [Tue, 12 Mar 2013 15:53:23 +0000 (15:53 +0000)]
sctp: Use correct sideffect command in duplicate  cookie handling

[ Upstream commit f2815633504b442ca0b0605c16bf3d88a3a0fcea ]

When SCTP is done processing a duplicate cookie chunk, it tries
to delete a newly created association.  For that, it has to set
the right association for the side-effect processing to work.
However, when it uses the SCTP_CMD_NEW_ASOC command, that performs
more work then really needed (like hashing the associationa and
assigning it an id) and there is no point to do that only to
delete the association as a next step.  In fact, it also creates
an impossible condition where an association may be found by
the getsockopt() call, and that association is empty.  This
causes a crash in some sctp getsockopts.

The solution is rather simple.  We simply use SCTP_CMD_SET_ASOC
command that doesn't have all the overhead and does exactly
what we need.

Reported-by: Karl Heiss <kheiss@gmail.com>
Tested-by: Karl Heiss <kheiss@gmail.com>
CC: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agobonding: don't call update_speed_duplex() under spinlocks
Veaceslav Falico [Tue, 12 Mar 2013 06:31:32 +0000 (06:31 +0000)]
bonding: don't call update_speed_duplex() under  spinlocks

[ Upstream commit 876254ae2758d50dcb08c7bd00caf6a806571178 ]

bond_update_speed_duplex() might sleep while calling underlying slave's
routines. Move it out of atomic context in bond_enslave() and remove it
from bond_miimon_commit() - it was introduced by commit 546add79, however
when the slave interfaces go up/change state it's their responsibility to
fire NETDEV_UP/NETDEV_CHANGE events so that bonding can properly update
their speed.

I've tested it on all combinations of ifup/ifdown, autoneg/speed/duplex
changes, remote-controlled and local, on (not) MII-based cards. All changes
are visible.

Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agonetconsole: don't call __netpoll_cleanup() while atomic
Veaceslav Falico [Mon, 11 Mar 2013 00:21:48 +0000 (00:21 +0000)]
netconsole: don't call __netpoll_cleanup() while  atomic

[ Upstream commit 3f315bef23075ea8a98a6fe4221a83b83456d970 ]

__netpoll_cleanup() is called in netconsole_netdev_event() while holding a
spinlock. Release/acquire the spinlock before/after it and restart the
loop. Also, disable the netconsole completely, because we won't have chance
after the restart of the loop, and might end up in a situation where
nt->enabled == 1 and nt->np.dev == NULL.

Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agonet/ipv4: Ensure that location of timestamp option is stored
David Ward [Mon, 11 Mar 2013 10:43:39 +0000 (10:43 +0000)]
net/ipv4: Ensure that location of timestamp option is  stored

[ Upstream commit 4660c7f498c07c43173142ea95145e9dac5a6d14 ]

This is needed in order to detect if the timestamp option appears
more than once in a packet, to remove the option if the packet is
fragmented, etc. My previous change neglected to store the option
location when the router addresses were prespecified and Pointer >
Length. But now the option location is also stored when Flag is an
unrecognized value, to ensure these option handling behaviors are
still performed.

Signed-off-by: David Ward <david.ward@ll.mit.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agosunsu: Fix panic in case of nonexistent port at "console=ttySY" cmdline option
Tkhai Kirill [Sat, 23 Feb 2013 23:01:15 +0000 (23:01 +0000)]
sunsu: Fix panic in case of nonexistent port at  "console=ttySY" cmdline option

[ Upstream commit cb29529ea0030e60ef1bbbf8399a43d397a51526 ]

If a machine has X (X < 4) sunsu ports and cmdline
option "console=ttySY" is passed, where X < Y <= 4,
than the following panic happens:

Unable to handle kernel NULL pointer dereference
TPC: <sunsu_console_setup+0x78/0xe0>
RPC: <sunsu_console_setup+0x74/0xe0>
I7: <register_console+0x378/0x3e0>
Call Trace:
 [0000000000453a38] register_console+0x378/0x3e0
 [0000000000576fa0] uart_add_one_port+0x2e0/0x340
 [000000000057af40] su_probe+0x160/0x2e0
 [00000000005b8a4c] platform_drv_probe+0xc/0x20
 [00000000005b6c2c] driver_probe_device+0x12c/0x220
 [00000000005b6da8] __driver_attach+0x88/0xa0
 [00000000005b4df4] bus_for_each_dev+0x54/0xa0
 [00000000005b5a54] bus_add_driver+0x154/0x260
 [00000000005b7190] driver_register+0x50/0x180
 [00000000006d250c] sunsu_init+0x18c/0x1e0
 [00000000006c2668] do_one_initcall+0xe8/0x160
 [00000000006c282c] kernel_init_freeable+0x12c/0x1e0
 [0000000000603764] kernel_init+0x4/0x100
 [0000000000405f64] ret_from_syscall+0x1c/0x2c
 [0000000000000000]           (null)

1)Fix the panic;
2)Increment registered port number every successful
probe.

Signed-off-by: Kirill Tkhai <tkhai@yandex.ru>
CC: David Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agoisofs: avoid info leak on export
Mathias Krause [Thu, 12 Jul 2012 06:46:54 +0000 (08:46 +0200)]
isofs: avoid info leak on export

commit fe685aabf7c8c9f138e5ea900954d295bf229175 upstream.

For type 1 the parent_offset member in struct isofs_fid gets copied
uninitialized to userland. Fix this by initializing it to 0.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agoudf: avoid info leak on export
Mathias Krause [Thu, 12 Jul 2012 06:46:55 +0000 (08:46 +0200)]
udf: avoid info leak on export

commit 0143fc5e9f6f5aad4764801015bc8d4b4a278200 upstream.

For type 0x51 the udf.parent_partref member in struct fid gets copied
uninitialized to userland. Fix this by initializing it to 0.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agotools: hv: Netlink source address validation allows DoS
Tomas Hozza [Thu, 8 Nov 2012 09:53:29 +0000 (10:53 +0100)]
tools: hv: Netlink source address validation allows DoS

commit 95a69adab9acfc3981c504737a2b6578e4d846ef upstream.

The source code without this patch caused hypervkvpd to exit when it processed
a spoofed Netlink packet which has been sent from an untrusted local user.
Now Netlink messages with a non-zero nl_pid source address are ignored
and a warning is printed into the syslog.

Signed-off-by: Tomas Hozza <thozza@redhat.com>
Acked-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agoselinux: use GFP_ATOMIC under spin_lock
Dan Carpenter [Sat, 16 Mar 2013 09:48:11 +0000 (12:48 +0300)]
selinux: use GFP_ATOMIC under spin_lock

commit 4502403dcf8f5c76abd4dbab8726c8e4ecb5cd34 upstream.

The call tree here is:

sk_clone_lock()              <- takes bh_lock_sock(newsk);
xfrm_sk_clone_policy()
__xfrm_sk_clone_policy()
clone_policy()               <- uses GFP_ATOMIC for allocations
security_xfrm_policy_clone()
security_ops->xfrm_policy_clone_security()
selinux_xfrm_policy_clone()

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agovhost/net: fix heads usage of ubuf_info
Michael S. Tsirkin [Sun, 17 Mar 2013 02:46:09 +0000 (02:46 +0000)]
vhost/net: fix heads usage of ubuf_info

commit 46aa92d1ba162b4b3d6b7102440e459d4e4ee255 upstream.

ubuf info allocator uses guest controlled head as an index,
so a malicious guest could put the same head entry in the ring twice,
and we will get two callbacks on the same value.
To fix use upend_idx which is guaranteed to be unique.

Reported-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agortlwifi: rtl8192cu: Fix problem that prevents reassociation
Larry Finger [Wed, 13 Mar 2013 15:28:13 +0000 (10:28 -0500)]
rtlwifi: rtl8192cu: Fix problem that prevents reassociation

commit 9437a248e7cac427c898bdb11bd1ac6844a1ead4 upstream.

The driver was failing to clear the BSSID when a disconnect happened. That
prevented a reconnection. This problem is reported at
https://bugzilla.redhat.com/show_bug.cgi?id=789605,
https://bugzilla.redhat.com/show_bug.cgi?id=866786,
https://bugzilla.redhat.com/show_bug.cgi?id=906734, and
https://bugzilla.kernel.org/show_bug.cgi?id=46171.

Thanks to Jussi Kivilinna for making the critical observation
that led to the solution.

Reported-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Tested-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Tested-by: Alessandro Lannocca <alessandro.lannocca@gmail.com>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agopowerpc: Fix cputable entry for 970MP rev 1.0
Benjamin Herrenschmidt [Tue, 12 Mar 2013 22:55:02 +0000 (09:55 +1100)]
powerpc: Fix cputable entry for 970MP rev 1.0

commit d63ac5f6cf31c8a83170a9509b350c1489a7262b upstream.

Commit 44ae3ab3358e962039c36ad4ae461ae9fb29596c forgot to update
the entry for the 970MP rev 1.0 processor when moving some CPU
features bits to the MMU feature bit mask. This breaks booting
on some rare G5 models using that chip revision.

Reported-by: Phileas Fogg <phileas-fogg@mail.ru>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agortlwifi: rtl8192cu: Fix schedule while atomic bug splat
Larry Finger [Wed, 27 Feb 2013 20:10:30 +0000 (14:10 -0600)]
rtlwifi: rtl8192cu: Fix schedule while atomic bug splat

commit 664899786cb49cb52f620e06ac19c0be524a7cfa upstream.

When run at debug 3 or higher, rtl8192cu reports a BUG as follows:

BUG: scheduling while atomic: kworker/u:0/5281/0x00000002
INFO: lockdep is turned off.
Modules linked in: rtl8192cu rtl8192c_common rtlwifi fuse af_packet bnep bluetooth b43 mac80211 cfg80211 ipv6 snd_hda_codec_conexant kvm_amd k
vm snd_hda_intel snd_hda_codec bcma rng_core snd_pcm ssb mmc_core snd_seq snd_timer snd_seq_device snd i2c_nforce2 sr_mod pcmcia forcedeth i2c_core soundcore
 cdrom sg serio_raw k8temp hwmon joydev ac battery pcmcia_core snd_page_alloc video button wmi autofs4 ext4 mbcache jbd2 crc16 thermal processor scsi_dh_alua
 scsi_dh_hp_sw scsi_dh_rdac scsi_dh_emc scsi_dh ata_generic pata_acpi pata_amd [last unloaded: rtlwifi]
Pid: 5281, comm: kworker/u:0 Tainted: G        W    3.8.0-wl+ #119
Call Trace:
 [<ffffffff814531e7>] __schedule_bug+0x62/0x70
 [<ffffffff81459af0>] __schedule+0x730/0xa30
 [<ffffffff81326e49>] ? usb_hcd_link_urb_to_ep+0x19/0xa0
 [<ffffffff8145a0d4>] schedule+0x24/0x70
 [<ffffffff814575ec>] schedule_timeout+0x18c/0x2f0
 [<ffffffff81459ec0>] ? wait_for_common+0x40/0x180
 [<ffffffff8133f461>] ? ehci_urb_enqueue+0xf1/0xee0
 [<ffffffff810a579d>] ? trace_hardirqs_on+0xd/0x10
 [<ffffffff81459f65>] wait_for_common+0xe5/0x180
 [<ffffffff8107d1c0>] ? try_to_wake_up+0x2d0/0x2d0
 [<ffffffff8145a08e>] wait_for_completion_timeout+0xe/0x10
 [<ffffffff8132ab1c>] usb_start_wait_urb+0x8c/0x100
 [<ffffffff8132adf9>] usb_control_msg+0xd9/0x130
 [<ffffffffa057dd8d>] _usb_read_sync+0xcd/0x140 [rtlwifi]
 [<ffffffffa057de0e>] _usb_read32_sync+0xe/0x10 [rtlwifi]
 [<ffffffffa04b0555>] rtl92cu_update_hal_rate_table+0x1a5/0x1f0 [rtl8192cu]

The cause is a synchronous read from routine rtl92cu_update_hal_rate_table().
The resulting output is not critical, thus the debug statement is
deleted.

Reported-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
[bwh: Backported to 3.2: the deleted code is slightly different]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agomwifiex: fix potential out-of-boundary access to ibss rate table
Bing Zhao [Fri, 8 Mar 2013 04:00:16 +0000 (20:00 -0800)]
mwifiex: fix potential out-of-boundary access to ibss rate table

commit 5f0fabf84d7b52f979dcbafa3d3c530c60d9a92c upstream.

smatch found this error:

CHECK   drivers/net/wireless/mwifiex/join.c
  drivers/net/wireless/mwifiex/join.c:1121
  mwifiex_cmd_802_11_ad_hoc_join()
  error: testing array offset 'i' after use.

Signed-off-by: Bing Zhao <bzhao@marvell.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agos390/mm: fix flush_tlb_kernel_range()
Heiko Carstens [Mon, 4 Mar 2013 13:14:11 +0000 (14:14 +0100)]
s390/mm: fix flush_tlb_kernel_range()

commit f6a70a07079518280022286a1dceb797d12e1edf upstream.

Our flush_tlb_kernel_range() implementation calls __tlb_flush_mm() with
&init_mm as argument. __tlb_flush_mm() however will only flush tlbs
for the passed in mm if its mm_cpumask is not empty.

For the init_mm however its mm_cpumask has never any bits set. Which in
turn means that our flush_tlb_kernel_range() implementation doesn't
work at all.

This can be easily verified with a vmalloc/vfree loop which allocates
a page, writes to it and then frees the page again. A crash will follow
almost instantly.

To fix this remove the cpumask_empty() check in __tlb_flush_mm() since
there shouldn't be too many mms with a zero mm_cpumask, besides the
init_mm of course.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agoi915: initialize CADL in opregion
Lekensteyn [Mon, 25 Jun 2012 22:36:24 +0000 (00:36 +0200)]
i915: initialize CADL in opregion

commit d627b62ff8d4d36761adbcd90ff143d79c94ab22 upstream.

This is rather a hack to fix brightness hotkeys on a Clevo laptop. CADL is not
used anywhere in the driver code at the moment, but it could be used in BIOS as
is the case with the Clevo laptop.

The Clevo B7130 requires the CADL field to contain at least the ID of
the LCD device. If this field is empty, the ACPI methods that are called
on pressing brightness / display switching hotkeys will not trigger a
notification. As a result, it appears as no hotkey has been pressed.

Reference: https://bugs.freedesktop.org/show_bug.cgi?id=45452
Tested-by: Peter Wu <lekensteyn@gmail.com>
Signed-off-by: Peter Wu <lekensteyn@gmail.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agoperf: Revert duplicated commit
Ben Hutchings [Thu, 21 Mar 2013 03:25:51 +0000 (03:25 +0000)]
perf: Revert duplicated commit

This reverts commit 923415295307845e614589c1cce62abedd4d1731
'perf: Fix parsing of __print_flags() in TP_printk()'.  The same
change was already included in 3.2 as commit
d06c27b22aa66e48e32f03f9387328a9af9b0625 but in 3.2.1 this change
was wrongly applied to similar code in a different function.

Thanks to Jiri for pointing this out in 3.0.y.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Andrew Vagin <avagin@openvz.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
11 years agodrm/i915: Increase the RC6p threshold.
Stéphane Marchesin [Wed, 30 Jan 2013 03:41:59 +0000 (19:41 -0800)]
drm/i915: Increase the RC6p threshold.

commit 0920a48719f1ceefc909387a64f97563848c7854 upstream.

This increases GEN6_RC6p_THRESHOLD from 100000 to 150000. For some
reason this avoids the gen6_gt_check_fifodbg.isra warnings and
associated GPU lockups, which makes my ivy bridge machine stable.

Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
[bwh: Backported to 3.2: adjust filename]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years ago6lowpan: Fix endianness issue in is_addr_link_local().
YOSHIFUJI Hideaki / 吉藤英明 [Sat, 9 Mar 2013 09:11:57 +0000 (09:11 +0000)]
6lowpan: Fix endianness issue in is_addr_link_local().

[ Upstream commit 9026c4927254f5bea695cc3ef2e255280e6a3011 ]

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agodcbnl: fix various netlink info leaks
Mathias Krause [Sat, 9 Mar 2013 05:52:21 +0000 (05:52 +0000)]
dcbnl: fix various netlink info leaks

[ Upstream commit 29cd8ae0e1a39e239a3a7b67da1986add1199fc0 ]

The dcb netlink interface leaks stack memory in various places:
* perm_addr[] buffer is only filled at max with 12 of the 32 bytes but
  copied completely,
* no in-kernel driver fills all fields of an IEEE 802.1Qaz subcommand,
  so we're leaking up to 58 bytes for ieee_ets structs, up to 136 bytes
  for ieee_pfc structs, etc.,
* the same is true for CEE -- no in-kernel driver fills the whole
  struct,

Prevent all of the above stack info leaks by properly initializing the
buffers/structures involved.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agortnl: fix info leak on RTM_GETLINK request for VF devices
Mathias Krause [Sat, 9 Mar 2013 05:52:20 +0000 (05:52 +0000)]
rtnl: fix info leak on RTM_GETLINK request for VF  devices

[ Upstream commit 84d73cd3fb142bf1298a8c13fd4ca50fd2432372 ]

Initialize the mac address buffer with 0 as the driver specific function
will probably not fill the whole buffer. In fact, all in-kernel drivers
fill only ETH_ALEN of the MAX_ADDR_LEN bytes, i.e. 6 of the 32 possible
bytes. Therefore we currently leak 26 bytes of stack memory to userland
via the netlink interface.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agoipv6: stop multicast forwarding to process interface scoped addresses
Hannes Frederic Sowa [Fri, 8 Mar 2013 02:07:23 +0000 (02:07 +0000)]
ipv6: stop multicast forwarding to process interface  scoped addresses

[ Upstream commit ddf64354af4a702ee0b85d0a285ba74c7278a460 ]

v2:
a) used struct ipv6_addr_props

v3:
a) reverted changes for ipv6_addr_props

v4:
a) do not use __ipv6_addr_needs_scope_id

Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agobridging: fix rx_handlers return code
Cristian Bercaru [Fri, 8 Mar 2013 07:03:38 +0000 (07:03 +0000)]
bridging: fix rx_handlers return code

[ Upstream commit 3bc1b1add7a8484cc4a261c3e128dbe1528ce01f ]

The frames for which rx_handlers return RX_HANDLER_CONSUMED are no longer
counted as dropped. They are counted as successfully received by
'netif_receive_skb'.

This allows network interface drivers to correctly update their RX-OK and
RX-DRP counters based on the result of 'netif_receive_skb'.

Signed-off-by: Cristian Bercaru <B43982@freescale.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agonetlabel: correctly list all the static label mappings
Paul Moore [Wed, 6 Mar 2013 11:45:24 +0000 (11:45 +0000)]
netlabel: correctly list all the static label mappings

[ Upstream commits 0c1233aba1e948c37f6dc7620cb7c253fcd71ce9 and
  a6a8fe950e1b8596bb06f2c89c3a1a4bf2011ba9 ]

When we have a large number of static label mappings that spill across
the netlink message boundary we fail to properly save our state in the
netlink_callback struct which causes us to repeat the same listings.
This patch fixes this problem by saving the state correctly between
calls to the NetLabel static label netlink "dumpit" routines.

Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agomacvlan: Set IFF_UNICAST_FLT flag to prevent unnecessary promisc mode.
Vlad Yasevich [Thu, 7 Mar 2013 10:21:48 +0000 (10:21 +0000)]
macvlan: Set IFF_UNICAST_FLT flag to prevent  unnecessary promisc mode.

[ Upstream commit 87ab7f6f2874f1115817e394a7ed2dea1c72549e ]

Macvlan already supports hw address filters.  Set the IFF_UNICAST_FLT
so that it doesn't needlesly enter PROMISC mode when macvlans are
stacked.

Signed-of-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agotun: add a missing nf_reset() in tun_net_xmit()
Eric Dumazet [Wed, 6 Mar 2013 11:02:37 +0000 (11:02 +0000)]
tun: add a missing nf_reset() in tun_net_xmit()

[ Upstream commit f8af75f3517a24838a36eb5797a1a3e60bf9e276 ]

Dave reported following crash :

general protection fault: 0000 [#1] SMP
CPU 2
Pid: 25407, comm: qemu-kvm Not tainted 3.7.9-205.fc18.x86_64 #1 Hewlett-Packard HP Z400 Workstation/0B4Ch
RIP: 0010:[<ffffffffa0399bd5>]  [<ffffffffa0399bd5>] destroy_conntrack+0x35/0x120 [nf_conntrack]
RSP: 0018:ffff880276913d78  EFLAGS: 00010206
RAX: 50626b6b7876376c RBX: ffff88026e530d68 RCX: ffff88028d158e00
RDX: ffff88026d0d5470 RSI: 0000000000000011 RDI: 0000000000000002
RBP: ffff880276913d88 R08: 0000000000000000 R09: ffff880295002900
R10: 0000000000000000 R11: 0000000000000003 R12: ffffffff81ca3b40
R13: ffffffff8151a8e0 R14: ffff880270875000 R15: 0000000000000002
FS:  00007ff3bce38a00(0000) GS:ffff88029fc40000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007fd1430bd000 CR3: 000000027042b000 CR4: 00000000000027e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process qemu-kvm (pid: 25407, threadinfo ffff880276912000, task ffff88028c369720)
Stack:
 ffff880156f59100 ffff880156f59100 ffff880276913d98 ffffffff815534f7
 ffff880276913db8 ffffffff8151a74b ffff880270875000 ffff880156f59100
 ffff880276913dd8 ffffffff8151a5a6 ffff880276913dd8 ffff88026d0d5470
Call Trace:
 [<ffffffff815534f7>] nf_conntrack_destroy+0x17/0x20
 [<ffffffff8151a74b>] skb_release_head_state+0x7b/0x100
 [<ffffffff8151a5a6>] __kfree_skb+0x16/0xa0
 [<ffffffff8151a666>] kfree_skb+0x36/0xa0
 [<ffffffff8151a8e0>] skb_queue_purge+0x20/0x40
 [<ffffffffa02205f7>] __tun_detach+0x117/0x140 [tun]
 [<ffffffffa022184c>] tun_chr_close+0x3c/0xd0 [tun]
 [<ffffffff8119669c>] __fput+0xec/0x240
 [<ffffffff811967fe>] ____fput+0xe/0x10
 [<ffffffff8107eb27>] task_work_run+0xa7/0xe0
 [<ffffffff810149e1>] do_notify_resume+0x71/0xb0
 [<ffffffff81640152>] int_signal+0x12/0x17
Code: 00 00 04 48 89 e5 41 54 53 48 89 fb 4c 8b a7 e8 00 00 00 0f 85 de 00 00 00 0f b6 73 3e 0f b7 7b 2a e8 10 40 00 00 48 85 c0 74 0e <48> 8b 40 28 48 85 c0 74 05 48 89 df ff d0 48 c7 c7 08 6a 3a a0
RIP  [<ffffffffa0399bd5>] destroy_conntrack+0x35/0x120 [nf_conntrack]
 RSP <ffff880276913d78>

This is because tun_net_xmit() needs to call nf_reset()
before queuing skb into receive_queue

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agotcp: fix double-counted receiver RTT when leaving receiver fast path
Neal Cardwell [Mon, 4 Mar 2013 06:23:05 +0000 (06:23 +0000)]
tcp: fix double-counted receiver RTT when leaving  receiver fast path

[ Upstream commit aab2b4bf224ef8358d262f95b568b8ad0cecf0a0 ]

We should not update ts_recent and call tcp_rcv_rtt_measure_ts() both
before and after going to step5. That wastes CPU and double-counts the
receiver-side RTT sample.

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agonet: ipv6: Don't purge default router if accept_ra=2
Lorenzo Colitti [Sun, 3 Mar 2013 20:46:46 +0000 (20:46 +0000)]
net: ipv6: Don't purge default router if accept_ra=2

[ Upstream commit 3e8b0ac3e41e3c882222a5522d5df7212438ab51 ]

Setting net.ipv6.conf.<interface>.accept_ra=2 causes the kernel
to accept RAs even when forwarding is enabled. However, enabling
forwarding purges all default routes on the system, breaking
connectivity until the next RA is received. Fix this by not
purging default routes on interfaces that have accept_ra=2.

Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agords: limit the size allocated by rds_message_alloc()
Cong Wang [Sun, 3 Mar 2013 16:18:11 +0000 (16:18 +0000)]
rds: limit the size allocated by rds_message_alloc()

[ Upstream commit ece6b0a2b25652d684a7ced4ae680a863af041e0 ]

Dave Jones reported the following bug:

"When fed mangled socket data, rds will trust what userspace gives it,
and tries to allocate enormous amounts of memory larger than what
kmalloc can satisfy."

WARNING: at mm/page_alloc.c:2393 __alloc_pages_nodemask+0xa0d/0xbe0()
Hardware name: GA-MA78GM-S2H
Modules linked in: vmw_vsock_vmci_transport vmw_vmci vsock fuse bnep dlci bridge 8021q garp stp mrp binfmt_misc l2tp_ppp l2tp_core rfcomm s
Pid: 24652, comm: trinity-child2 Not tainted 3.8.0+ #65
Call Trace:
 [<ffffffff81044155>] warn_slowpath_common+0x75/0xa0
 [<ffffffff8104419a>] warn_slowpath_null+0x1a/0x20
 [<ffffffff811444ad>] __alloc_pages_nodemask+0xa0d/0xbe0
 [<ffffffff8100a196>] ? native_sched_clock+0x26/0x90
 [<ffffffff810b2128>] ? trace_hardirqs_off_caller+0x28/0xc0
 [<ffffffff810b21cd>] ? trace_hardirqs_off+0xd/0x10
 [<ffffffff811861f8>] alloc_pages_current+0xb8/0x180
 [<ffffffff8113eaaa>] __get_free_pages+0x2a/0x80
 [<ffffffff811934fe>] kmalloc_order_trace+0x3e/0x1a0
 [<ffffffff81193955>] __kmalloc+0x2f5/0x3a0
 [<ffffffff8104df0c>] ? local_bh_enable_ip+0x7c/0xf0
 [<ffffffffa0401ab3>] rds_message_alloc+0x23/0xb0 [rds]
 [<ffffffffa04043a1>] rds_sendmsg+0x2b1/0x990 [rds]
 [<ffffffff810b21cd>] ? trace_hardirqs_off+0xd/0x10
 [<ffffffff81564620>] sock_sendmsg+0xb0/0xe0
 [<ffffffff810b2052>] ? get_lock_stats+0x22/0x70
 [<ffffffff810b24be>] ? put_lock_stats.isra.23+0xe/0x40
 [<ffffffff81567f30>] sys_sendto+0x130/0x180
 [<ffffffff810b872d>] ? trace_hardirqs_on+0xd/0x10
 [<ffffffff816c547b>] ? _raw_spin_unlock_irq+0x3b/0x60
 [<ffffffff816cd767>] ? sysret_check+0x1b/0x56
 [<ffffffff810b8695>] ? trace_hardirqs_on_caller+0x115/0x1a0
 [<ffffffff81341d8e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff816cd742>] system_call_fastpath+0x16/0x1b
---[ end trace eed6ae990d018c8b ]---

Reported-by: Dave Jones <davej@redhat.com>
Cc: Dave Jones <davej@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
Signed-off-by: Cong Wang <amwang@redhat.com>
Acked-by: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agol2tp: Restore socket refcount when sendmsg succeeds
Guillaume Nault [Fri, 1 Mar 2013 05:02:02 +0000 (05:02 +0000)]
l2tp: Restore socket refcount when sendmsg succeeds

[ Upstream commit 8b82547e33e85fc24d4d172a93c796de1fefa81a ]

The sendmsg() syscall handler for PPPoL2TP doesn't decrease the socket
reference counter after successful transmissions. Any successful
sendmsg() call from userspace will then increase the reference counter
forever, thus preventing the kernel's session and tunnel data from
being freed later on.

The problem only happens when writing directly on L2TP sockets.
PPP sockets attached to L2TP are unaffected as the PPP subsystem
uses pppol2tp_xmit() which symmetrically increase/decrease reference
counters.

This patch adds the missing call to sock_put() before returning from
pppol2tp_sendmsg().

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agoperf,x86: fix link failure for non-Intel configs
David Rientjes [Sun, 17 Mar 2013 22:49:10 +0000 (15:49 -0700)]
perf,x86: fix link failure for non-Intel configs

commit 6c4d3bc99b3341067775efd4d9d13cc8e655fd7c upstream.

Commit 1d9d8639c063 ("perf,x86: fix kernel crash with PEBS/BTS after
suspend/resume") introduces a link failure since
perf_restore_debug_store() is only defined for CONFIG_CPU_SUP_INTEL:

arch/x86/power/built-in.o: In function `restore_processor_state':
(.text+0x45c): undefined reference to `perf_restore_debug_store'

Fix it by defining the dummy function appropriately.

Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agoperf,x86: fix wrmsr_on_cpu() warning on suspend/resume
Linus Torvalds [Sun, 17 Mar 2013 22:44:43 +0000 (15:44 -0700)]
perf,x86: fix wrmsr_on_cpu() warning on suspend/resume

commit 2a6e06b2aed6995af401dcd4feb5e79a0c7ea554 upstream.

Commit 1d9d8639c063 ("perf,x86: fix kernel crash with PEBS/BTS after
suspend/resume") fixed a crash when doing PEBS performance profiling
after resuming, but in using init_debug_store_on_cpu() to restore the
DS_AREA mtrr it also resulted in a new WARN_ON() triggering.

init_debug_store_on_cpu() uses "wrmsr_on_cpu()", which in turn uses CPU
cross-calls to do the MSR update.  Which is not really valid at the
early resume stage, and the warning is quite reasonable.  Now, it all
happens to _work_, for the simple reason that smp_call_function_single()
ends up just doing the call directly on the CPU when the CPU number
matches, but we really should just do the wrmsr() directly instead.

This duplicates the wrmsr() logic, but hopefully we can just remove the
wrmsr_on_cpu() version eventually.

Reported-and-tested-by: Parag Warudkar <parag.lkml@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agoperf,x86: fix kernel crash with PEBS/BTS after suspend/resume
Stephane Eranian [Fri, 15 Mar 2013 13:26:07 +0000 (14:26 +0100)]
perf,x86: fix kernel crash with PEBS/BTS after suspend/resume

commit 1d9d8639c063caf6efc2447f5f26aa637f844ff6 upstream.

This patch fixes a kernel crash when using precise sampling (PEBS)
after a suspend/resume. Turns out the CPU notifier code is not invoked
on CPU0 (BP). Therefore, the DS_AREA (used by PEBS) is not restored properly
by the kernel and keeps it power-on/resume value of 0 causing any PEBS
measurement to crash when running on CPU0.

The workaround is to add a hook in the actual resume code to restore
the DS Area MSR value. It is invoked for all CPUS. So for all but CPU0,
the DS_AREA will be restored twice but this is harmless.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agoTTY: do not reset master's packet mode
Jiri Slaby [Tue, 15 Jan 2013 22:26:22 +0000 (23:26 +0100)]
TTY: do not reset master's packet mode

commit b81273a132177edd806476b953f6afeb17b786d5 upstream.

Now that login from util-linux is forced to drop all references to a
TTY which it wants to hangup (to reach reference count 1) we are
seeing issues with telnet. When login closes its last reference to the
slave PTY, it also resets packet mode on the *master* side. And we
have a race here.

What telnet does is fork+exec of `login'. Then there are two
scenarios:
* `login' closes the slave TTY and resets thus master's packet mode,
  but even now telnet properly sets the mode, or
* `telnetd' sets packet mode on the master, `login' closes the slave
  TTY and resets master's packet mode.

The former case is OK. However the latter happens in much more cases,
by the order of magnitude to be precise. So when one tries to login to
such a messed telnet setup, they see the following:
inux login:
            ogin incorrect

Note the missing first letters -- telnet thinks it is still in the
packet mode, so when it receives "linux login" from `login', it
considers "l" as the type of the packet and strips it.

SuS does not mention how the implementation should behave. Both BSDs I
checked (Free and Net) do not reset the flag upon the last close.

By this I am resurrecting an old bug, see References. We are hitting
it regularly now, i.e. with updated util-linux, ergo login.

Here, I am changing a behavior introduced back in 2.1 times. It would
better have a long time testing before goes upstream.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Mauro Carvalho Chehab <mchehab@redhat.com>
Cc: Bryan Mason <bmason@redhat.com>
References: https://lkml.org/lkml/2009/11/11/223
References: https://bugzilla.redhat.com/show_bug.cgi?id=504703
References: https://bugzilla.novell.com/show_bug.cgi?id=797042
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agoLinux 3.2.41 v3.2.41
Ben Hutchings [Wed, 20 Mar 2013 15:03:42 +0000 (15:03 +0000)]
Linux 3.2.41

11 years agoNLS: improve UTF8 -> UTF16 string conversion routine
Alan Stern [Thu, 17 Nov 2011 21:42:19 +0000 (16:42 -0500)]
NLS: improve UTF8 -> UTF16 string conversion routine

commit 0720a06a7518c9d0c0125bd5d1f3b6264c55c3dd upstream.

The utf8s_to_utf16s conversion routine needs to be improved.  Unlike
its utf16s_to_utf8s sibling, it doesn't accept arguments specifying
the maximum length of the output buffer or the endianness of its
16-bit output.

This patch (as1501) adds the two missing arguments, and adjusts the
only two places in the kernel where the function is called.  A
follow-on patch will add a third caller that does utilize the new
capabilities.

The two conversion routines are still annoyingly inconsistent in the
way they handle invalid byte combinations.  But that's a subject for a
different patch.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
CC: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agobtrfs: use rcu_barrier() to wait for bdev puts at unmount
Eric Sandeen [Sat, 9 Mar 2013 15:18:39 +0000 (15:18 +0000)]
btrfs: use rcu_barrier() to wait for bdev puts at unmount

commit bc178622d40d87e75abc131007342429c9b03351 upstream.

Doing this would reliably fail with -EBUSY for me:

# mount /dev/sdb2 /mnt/scratch; umount /mnt/scratch; mkfs.btrfs -f /dev/sdb2
...
unable to open /dev/sdb2: Device or resource busy

because mkfs.btrfs tries to open the device O_EXCL, and somebody still has it.

Using systemtap to track bdev gets & puts shows a kworker thread doing a
blkdev put after mkfs attempts a get; this is left over from the unmount
path:

btrfs_close_devices
__btrfs_close_devices
call_rcu(&device->rcu, free_device);
free_device
INIT_WORK(&device->rcu_work, __free_device);
schedule_work(&device->rcu_work);

so unmount might complete before __free_device fires & does its blkdev_put.

Adding an rcu_barrier() to btrfs_close_devices() causes unmount to wait
until all blkdev_put()s are done, and the device is truly free once
unmount completes.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agoloopdev: remove an user triggerable oops
Guo Chao [Thu, 21 Feb 2013 23:16:49 +0000 (15:16 -0800)]
loopdev: remove an user triggerable oops

commit b1a6650406875b9097a032eed89af50682fe1160 upstream.

When loopdev is built as module and we pass an invalid parameter,
loop_init() will return directly without deregister misc device, which
will cause an oops when insert loop module next time because we left some
garbage in the misc device list.

Test case:
sudo modprobe loop max_part=1024
(failed due to invalid parameter)
sudo modprobe loop
(oops)

Clean up nicely to avoid such oops.

Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Guo Chao <yan@linux.vnet.ibm.com>
Cc: M. Hindess <hindessm@uk.ibm.com>
Cc: Nikanth Karthikesan <knikanth@suse.de>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agoloopdev: fix a deadlock
Guo Chao [Thu, 21 Feb 2013 23:16:45 +0000 (15:16 -0800)]
loopdev: fix a deadlock

commit 5370019dc2d2c2ff90e95d181468071362934f3a upstream.

bd_mutex and lo_ctl_mutex can be held in different order.

Path #1:

blkdev_open
 blkdev_get
  __blkdev_get (hold bd_mutex)
   lo_open (hold lo_ctl_mutex)

Path #2:

blkdev_ioctl
 lo_ioctl (hold lo_ctl_mutex)
  lo_set_capacity (hold bd_mutex)

Lockdep does not report it, because path #2 actually holds a subclass of
lo_ctl_mutex.  This subclass seems creep into the code by mistake.  The
patch author actually just mentioned it in the changelog, see commit
f028f3b2 ("loop: fix circular locking in loop_clr_fd()"), also see:

http://marc.info/?l=linux-kernel&m=123806169129727&w=2

Path #2 hold bd_mutex to call bd_set_size(), I've protected it
with i_mutex in a previous patch, so drop bd_mutex at this site.

Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Guo Chao <yan@linux.vnet.ibm.com>
Cc: M. Hindess <hindessm@uk.ibm.com>
Cc: Nikanth Karthikesan <knikanth@suse.de>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agoblock: use i_size_write() in bd_set_size()
Guo Chao [Thu, 21 Feb 2013 23:16:42 +0000 (15:16 -0800)]
block: use i_size_write() in bd_set_size()

commit d646a02a9d44d1421f273ae3923d97b47b918176 upstream.

blkdev_ioctl(GETBLKSIZE) uses i_size_read() to read size of block device.
If we update block size directly, reader may see intermediate result in
some machines and configurations.  Use i_size_write() instead.

Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Guo Chao <yan@linux.vnet.ibm.com>
Cc: M. Hindess <hindessm@uk.ibm.com>
Cc: Nikanth Karthikesan <knikanth@suse.de>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agoxen-netfront: delay gARP until backend switches to Connected
Laszlo Ersek [Sun, 11 Dec 2011 01:48:59 +0000 (01:48 +0000)]
xen-netfront: delay gARP until backend switches to Connected

commit 08e34eb14fe4cfd934b5c169a7682a969457c4ea upstream.

After a guest is live migrated, the xen-netfront driver emits a gratuitous
ARP message, so that networking hardware on the target host's subnet can
take notice, and public routing to the guest is re-established. However,
if the packet appears on the backend interface before the backend is added
to the target host's bridge, the packet is lost, and the migrated guest's
peers become unable to talk to the guest.

A sufficient two-parts condition to prevent the above is:

(1) ensure that the backend only moves to Connected xenbus state after its
hotplug scripts completed, ie. the netback interface got added to the
bridge; and

(2) ensure the frontend only queues the gARP when it sees the backend move
to Connected.

These two together provide complete ordering. Sub-condition (1) is already
satisfied by commit f942dc2552b8 in Linus' tree, based on commit
6b0b80ca7165 from [1].

In general, the full condition is sufficient, not necessary, because,
according to [2], live migration has been working for a long time without
satisfying sub-condition (2). However, after 6b0b80ca7165 was backported
to the RHEL-5 host to ensure (1), (2) still proved necessary in the RHEL-6
guest. This patch intends to provide (2) for upstream.

The Reviewed-by line comes from [3].

[1] git://xenbits.xen.org/people/ianc/linux-2.6.git#upstream/dom0/backend/netback-history
[2] http://old-list-archives.xen.org/xen-devel/2011-06/msg01969.html
[3] http://old-list-archives.xen.org/xen-devel/2011-07/msg00484.html

Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agomm/hotplug: correctly add new zone to all other nodes' zone lists
Jiang Liu [Tue, 31 Jul 2012 23:43:30 +0000 (16:43 -0700)]
mm/hotplug: correctly add new zone to all other nodes' zone lists

commit 08dff7b7d629807dbb1f398c68dd9cd58dd657a1 upstream.

When online_pages() is called to add new memory to an empty zone, it
rebuilds all zone lists by calling build_all_zonelists().  But there's a
bug which prevents the new zone to be added to other nodes' zone lists.

online_pages() {
build_all_zonelists()
.....
node_set_state(zone_to_nid(zone), N_HIGH_MEMORY)
}

Here the node of the zone is put into N_HIGH_MEMORY state after calling
build_all_zonelists(), but build_all_zonelists() only adds zones from
nodes in N_HIGH_MEMORY state to the fallback zone lists.
build_all_zonelists()

    ->__build_all_zonelists()
->build_zonelists()
    ->find_next_best_node()
->for_each_node_state(n, N_HIGH_MEMORY)

So memory in the new zone will never be used by other nodes, and it may
cause strange behavor when system is under memory pressure.  So put node
into N_HIGH_MEMORY state before calling build_all_zonelists().

Signed-off-by: Jianguo Wu <wujianguo@huawei.com>
Signed-off-by: Jiang Liu <liuj97@gmail.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Keping Chen <chenkeping@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agobatman-adv: Only write requested number of byte to user buffer
Sven Eckelmann [Sat, 10 Dec 2011 14:28:36 +0000 (15:28 +0100)]
batman-adv: Only write requested number of byte to user buffer

commit b5a1eeef04cc7859f34dec9b72ea1b28e4aba07c upstream.

Don't write more than the requested number of bytes of an batman-adv icmp
packet to the userspace buffer. Otherwise unrelated userspace memory might get
overridden by the kernel.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
11 years agobatman-adv: bat_socket_read missing checks
Paul Kot [Sat, 10 Dec 2011 14:28:34 +0000 (15:28 +0100)]
batman-adv: bat_socket_read missing checks

commit c00b6856fc642b234895cfabd15b289e76726430 upstream.

Writing a icmp_packet_rr and then reading icmp_packet can lead to kernel
memory corruption, if __user *buf is just below TASK_SIZE.

Signed-off-by: Paul Kot <pawlkt@gmail.com>
[sven@narfation.org: made it checkpatch clean]
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>