pandora-kernel.git
7 years agokvm: iommu: fix the third parameter of kvm_iommu_put_pages (CVE-2014-3601)
Michael S. Tsirkin [Tue, 19 Aug 2014 11:14:50 +0000 (19:14 +0800)]
kvm: iommu: fix the third parameter of kvm_iommu_put_pages (CVE-2014-3601)

commit 350b8bdd689cd2ab2c67c8a86a0be86cfa0751a7 upstream.

The third parameter of kvm_iommu_put_pages is wrong,
It should be 'gfn - slot->base_gfn'.

By making gfn very large, malicious guest or userspace can cause kvm to
go to this error path, and subsequently to pass a huge value as size.
Alternatively if gfn is small, then pages would be pinned but never
unpinned, causing host memory leak and local DOS.

Passing a reasonable but large value could be the most dangerous case,
because it would unpin a page that should have stayed pinned, and thus
allow the device to DMA into arbitrary memory.  However, this cannot
happen because of the condition that can trigger the error:

- out of memory (where you can't allocate even a single page)
  should not be possible for the attacker to trigger

- when exceeding the iommu's address space, guest pages after gfn
  will also exceed the iommu's address space, and inside
  kvm_iommu_put_pages() the iommu_iova_to_phys() will fail.  The
  page thus would not be unpinned at all.

Reported-by: Jack Morgenstein <jackm@mellanox.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agopata_scc: propagate return value of scc_wait_after_reset
Arjun Sreedharan [Sun, 17 Aug 2014 14:30:09 +0000 (20:00 +0530)]
pata_scc: propagate return value of scc_wait_after_reset

commit 4dc7c76cd500fa78c64adfda4b070b870a2b993c upstream.

scc_bus_softreset not necessarily should return zero.
Propagate the error code.

Signed-off-by: Arjun Sreedharan <arjun024@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoiommu/amd: Fix cleanup_domain for mass device removal
Joerg Roedel [Tue, 5 Aug 2014 15:50:15 +0000 (17:50 +0200)]
iommu/amd: Fix cleanup_domain for mass device removal

commit 9b29d3c6510407d91786c1cf9183ff4debb3473a upstream.

When multiple devices are detached in __detach_device, they
are also removed from the domains dev_list. This makes it
unsafe to use list_for_each_entry_safe, as the next pointer
might also not be in the list anymore after __detach_device
returns. So just repeatedly remove the first element of the
list until it is empty.

Tested-by: Marti Raudsepp <marti@juffo.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoUSB: ftdi_sio: Added PID for new ekey device
Jaša Bartelj [Sat, 16 Aug 2014 10:44:27 +0000 (12:44 +0200)]
USB: ftdi_sio: Added PID for new ekey device

commit 646907f5bfb0782c731ae9ff6fb63471a3566132 upstream.

Added support to the ftdi_sio driver for ekey Converter USB which
uses an FT232BM chip.

Signed-off-by: Jaša Bartelj <jasa.bartelj@gmail.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoUSB: serial: pl2303: add device id for ztek device
Greg KH [Fri, 15 Aug 2014 07:22:21 +0000 (15:22 +0800)]
USB: serial: pl2303: add device id for ztek device

commit 91fcb1ce420e0a5f8d92d556d7008a78bc6ce1eb upstream.

This adds a new device id to the pl2303 driver for the ZTEK device.

Reported-by: Mike Chu <Mike-Chu@prolific.com.tw>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoUSB: ftdi_sio: add Basic Micro ATOM Nano USB2Serial PID
Johan Hovold [Wed, 13 Aug 2014 15:56:52 +0000 (17:56 +0200)]
USB: ftdi_sio: add Basic Micro ATOM Nano USB2Serial PID

commit 6552cc7f09261db2aeaae389aa2c05a74b3a93b4 upstream.

Add device id for Basic Micro ATOM Nano USB2Serial adapters.

Reported-by: Nicolas Alt <n.alt@mytum.de>
Tested-by: Nicolas Alt <n.alt@mytum.de>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoUSB: option: add VIA Telecom CDS7 chipset device id
Brennan Ashton [Wed, 6 Aug 2014 15:46:44 +0000 (08:46 -0700)]
USB: option: add VIA Telecom CDS7 chipset device id

commit d77302739d900bbca5e901a3b7ac48c907ee6c93 upstream.

This VIA Telecom baseband processor is used is used by by u-blox in both the
FW2770 and FW2760 products and may be used in others as well.

This patch has been tested on both of these modem versions.

Signed-off-by: Brennan Ashton <bashton@brennanashton.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agomd/raid6: avoid data corruption during recovery of double-degraded RAID6
NeilBrown [Tue, 12 Aug 2014 23:57:07 +0000 (09:57 +1000)]
md/raid6: avoid data corruption during recovery of double-degraded RAID6

commit 9c4bdf697c39805078392d5ddbbba5ae5680e0dd upstream.

During recovery of a double-degraded RAID6 it is possible for
some blocks not to be recovered properly, leading to corruption.

If a write happens to one block in a stripe that would be written to a
missing device, and at the same time that stripe is recovering data
to the other missing device, then that recovered data may not be written.

This patch skips, in the double-degraded case, an optimisation that is
only safe for single-degraded arrays.

Bug was introduced in 2.6.32 and fix is suitable for any kernel since
then.  In an older kernel with separate handle_stripe5() and
handle_stripe6() functions the patch must change handle_stripe6().

Fixes: 6c0069c0ae9659e3a91b68eaed06a5c6c37f45c8
Cc: Yuri Tikhonov <yur@emcraft.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Reported-by: "Manibalan P" <pmanibalan@amiindia.co.in>
Tested-by: "Manibalan P" <pmanibalan@amiindia.co.in>
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1090423
Signed-off-by: NeilBrown <neilb@suse.de>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoCIFS: Fix wrong directory attributes after rename
Pavel Shilovsky [Mon, 18 Aug 2014 16:49:58 +0000 (20:49 +0400)]
CIFS: Fix wrong directory attributes after rename

commit b46799a8f28c43c5264ac8d8ffa28b311b557e03 upstream.

When we requests rename we also need to update attributes
of both source and target parent directories. Not doing it
causes generic/309 xfstest to fail on SMB2 mounts. Fix this
by marking these directories for force revalidating.

Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoALSA: hda/realtek - Avoid setting wrong COEF on ALC269 & co
Takashi Iwai [Fri, 15 Aug 2014 15:35:00 +0000 (17:35 +0200)]
ALSA: hda/realtek - Avoid setting wrong COEF on ALC269 & co

commit f3ee07d8b6e061bf34a7167c3f564e8da4360a99 upstream.

ALC269 & co have many vendor-specific setups with COEF verbs.
However, some verbs seem specific to some codec versions and they
result in the codec stalling.  Typically, such a case can be avoided
by checking the return value from reading a COEF.  If the return value
is -1, it implies that the COEF is invalid, thus it shouldn't be
written.

This patch adds the invalid COEF checks in appropriate places
accessing ALC269 and its variants.  The patch actually fixes the
resume problem on Acer AO725 laptop.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=52181
Tested-by: Francesco Muzio <muziofg@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoBtrfs: fix csum tree corruption, duplicate and outdated checksums
Filipe Manana [Sat, 9 Aug 2014 20:22:27 +0000 (21:22 +0100)]
Btrfs: fix csum tree corruption, duplicate and outdated checksums

commit 27b9a8122ff71a8cadfbffb9c4f0694300464f3b upstream.

Under rare circumstances we can end up leaving 2 versions of a checksum
for the same file extent range.

The reason for this is that after calling btrfs_next_leaf we process
slot 0 of the leaf it returns, instead of processing the slot set in
path->slots[0]. Most of the time (by far) path->slots[0] is 0, but after
btrfs_next_leaf() releases the path and before it searches for the next
leaf, another task might cause a split of the next leaf, which migrates
some of its keys to the leaf we were processing before calling
btrfs_next_leaf(). In this case btrfs_next_leaf() returns again the
same leaf but with path->slots[0] having a slot number corresponding
to the first new key it got, that is, a slot number that didn't exist
before calling btrfs_next_leaf(), as the leaf now has more keys than
it had before. So we must really process the returned leaf starting at
path->slots[0] always, as it isn't always 0, and the key at slot 0 can
have an offset much lower than our search offset/bytenr.

For example, consider the following scenario, where we have:

sums->bytenr: 40157184, sums->len: 16384, sums end: 40173568
four 4kb file data blocks with offsets 40157184401612804016537640169472

  Leaf N:

    slot = 0                           slot = btrfs_header_nritems() - 1
  |-------------------------------------------------------------------|
  | [(CSUM CSUM 39239680), size 8] ... [(CSUM CSUM 40116224), size 4] |
  |-------------------------------------------------------------------|

  Leaf N + 1:

      slot = 0                          slot = btrfs_header_nritems() - 1
  |--------------------------------------------------------------------|
  | [(CSUM CSUM 40161280), size 32] ... [((CSUM CSUM 40615936), size 8 |
  |--------------------------------------------------------------------|

Because we are at the last slot of leaf N, we call btrfs_next_leaf() to
find the next highest key, which releases the current path and then searches
for that next key. However after releasing the path and before finding that
next key, the item at slot 0 of leaf N + 1 gets moved to leaf N, due to a call
to ctree.c:push_leaf_left() (via ctree.c:split_leaf()), and therefore
btrfs_next_leaf() will returns us a path again with leaf N but with the slot
pointing to its new last key (CSUM CSUM 40161280). This new version of leaf N
is then:

    slot = 0                        slot = btrfs_header_nritems() - 2  slot = btrfs_header_nritems() - 1
  |----------------------------------------------------------------------------------------------------|
  | [(CSUM CSUM 39239680), size 8] ... [(CSUM CSUM 40116224), size 4]  [(CSUM CSUM 40161280), size 32] |
  |----------------------------------------------------------------------------------------------------|

And incorrecly using slot 0, makes us set next_offset to 39239680 and we jump
into the "insert:" label, which will set tmp to:

    tmp = min((sums->len - total_bytes) >> blocksize_bits,
        (next_offset - file_key.offset) >> blocksize_bits) =
    min((16384 - 0) >> 12, (39239680 - 40157184) >> 12) =
    min(4, (u64)-917504 = 18446744073708634112 >> 12) = 4

and

   ins_size = csum_size * tmp = 4 * 4 = 16 bytes.

In other words, we insert a new csum item in the tree with key
(CSUM_OBJECTID CSUM_KEY 40157184 = sums->bytenr) that contains the checksums
for all the data (4 blocks of 4096 bytes each = sums->len). Which is wrong,
because the item with key (CSUM CSUM 40161280) (the one that was moved from
leaf N + 1 to the end of leaf N) contains the old checksums of the last 12288
bytes of our data and won't get those old checksums removed.

So this leaves us 2 different checksums for 3 4kb blocks of data in the tree,
and breaks the logical rule:

   Key_N+1.offset >= Key_N.offset + length_of_data_its_checksums_cover

An obvious bad effect of this is that a subsequent csum tree lookup to get
the checksum of any of the blocks with logical offset of 4016128040165376
or 40169472 (the last 3 4kb blocks of file data), will get the old checksums.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoASoC: pxa-ssp: drop SNDRV_PCM_FMTBIT_S24_LE
Daniel Mack [Wed, 13 Aug 2014 19:51:06 +0000 (21:51 +0200)]
ASoC: pxa-ssp: drop SNDRV_PCM_FMTBIT_S24_LE

commit 9301503af016eb537ccce76adec0c1bb5c84871e upstream.

This mode is unsupported, as the DMA controller can't do zero-padding
of samples.

Signed-off-by: Daniel Mack <zonque@gmail.com>
Reported-by: Johannes Stezenbach <js@sig21.net>
Signed-off-by: Mark Brown <broonie@linaro.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agopowerpc/mm: Use read barrier when creating real_pte
Aneesh Kumar K.V [Wed, 13 Aug 2014 07:02:03 +0000 (12:32 +0530)]
powerpc/mm: Use read barrier when creating real_pte

commit 85c1fafd7262e68ad821ee1808686b1392b1167d upstream.

On ppc64 we support 4K hash pte with 64K page size. That requires
us to track the hash pte slot information on a per 4k basis. We do that
by storing the slot details in the second half of pte page. The pte bit
_PAGE_COMBO is used to indicate whether the second half need to be
looked while building real_pte. We need to use read memory barrier while
doing that so that load of hidx is not reordered w.r.t _PAGE_COMBO
check. On the store side we already do a lwsync in __hash_page_4K

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[bwh: Backported to 3.2: include <asm/system.h> to ensure smp_rmb()
 is defined; cell_defconfig fails to build without this]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agopowerpc: Fix build errors STRICT_MM_TYPECHECKS
Aneesh Kumar K.V [Mon, 6 May 2013 10:51:00 +0000 (10:51 +0000)]
powerpc: Fix build errors STRICT_MM_TYPECHECKS

commit 83d5e64b7efa7f39b10ff5e92792e807a720289c upstream.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoreiserfs: Fix use after free in journal teardown
Jan Kara [Wed, 6 Aug 2014 17:43:56 +0000 (19:43 +0200)]
reiserfs: Fix use after free in journal teardown

commit 01777836c87081e4f68c4a43c9abe6114805f91e upstream.

If do_journal_release() races with do_journal_end() which requeues
delayed works for transaction flushing, we can leave work items for
flushing outstanding transactions queued while freeing them. That
results in use after free and possible crash in run_timers_softirq().

Fix the problem by not requeueing works if superblock is being shut down
(MS_ACTIVE not set) and using cancel_delayed_work_sync() in
do_journal_release().

Signed-off-by: Jan Kara <jack@suse.cz>
[bwh: Backported to 3.2:
 - Adjust context
 - commit_wq is global, not per-superblock
 - Change comment about 'these works'; we only have one work item
 - Drop inapplicable changes to reiserfs_schedule_old_flush()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agocarl9170: fix sending URBs with wrong type when using full-speed
Ronald Wahl [Thu, 7 Aug 2014 12:15:50 +0000 (14:15 +0200)]
carl9170: fix sending URBs with wrong type when using full-speed

commit 671796dd96b6cd85b75fba9d3007bcf7e5f7c309 upstream.

The driver assumes that endpoint 4 is always an interrupt endpoint.
Unfortunately the type differs between high-speed and full-speed
configurations while in the former case it is indeed an interrupt
endpoint this is not true for the latter case - here it is a bulk
endpoint. When sending URBs with the wrong type the kernel will
generate a warning message including backtrace. In this specific
case there will be a huge amount of warnings which can bring the system
to freeze.

To fix this we are now sending URBs to endpoint 4 using the type
found in the endpoint descriptor.

A side note: The carl9170 firmware currently specifies endpoint 4 as
interrupt endpoint even in the full-speed configuration but this has
no relevance because before this firmware is loaded the endpoint type
is as described above and after the firmware is running the stick is not
reenumerated and so the old descriptor is used.

Signed-off-by: Ronald Wahl <ronald.wahl@raritan.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agox86/xen: resume timer irqs early
David Vrabel [Thu, 7 Aug 2014 16:06:06 +0000 (17:06 +0100)]
x86/xen: resume timer irqs early

commit 8d5999df35314607c38fbd6bdd709e25c3a4eeab upstream.

If the timer irqs are resumed during device resume it is possible in
certain circumstances for the resume to hang early on, before device
interrupts are resumed.  For an Ubuntu 14.04 PVHVM guest this would
occur in ~0.5% of resume attempts.

It is not entirely clear what is occuring the point of the hang but I
think a task necessary for the resume calls schedule_timeout(),
waiting for a timer interrupt (which never arrives).  This failure may
require specific tasks to be running on the other VCPUs to trigger
(processes are not frozen during a suspend/resume if PREEMPT is
disabled).

Add IRQF_EARLY_RESUME to the timer interrupts so they are resumed in
syscore_resume().

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoring-buffer: Always reset iterator to reader page
Steven Rostedt (Red Hat) [Wed, 6 Aug 2014 18:11:33 +0000 (14:11 -0400)]
ring-buffer: Always reset iterator to reader page

commit 651e22f2701b4113989237c3048d17337dd2185c upstream.

When performing a consuming read, the ring buffer swaps out a
page from the ring buffer with a empty page and this page that
was swapped out becomes the new reader page. The reader page
is owned by the reader and since it was swapped out of the ring
buffer, writers do not have access to it (there's an exception
to that rule, but it's out of scope for this commit).

When reading the "trace" file, it is a non consuming read, which
means that the data in the ring buffer will not be modified.
When the trace file is opened, a ring buffer iterator is allocated
and writes to the ring buffer are disabled, such that the iterator
will not have issues iterating over the data.

Although the ring buffer disabled writes, it does not disable other
reads, or even consuming reads. If a consuming read happens, then
the iterator is reset and starts reading from the beginning again.

My tests would sometimes trigger this bug on my i386 box:

WARNING: CPU: 0 PID: 5175 at kernel/trace/trace.c:1527 __trace_find_cmdline+0x66/0xaa()
Modules linked in:
CPU: 0 PID: 5175 Comm: grep Not tainted 3.16.0-rc3-test+ #8
Hardware name:                  /DG965MQ, BIOS MQ96510J.86A.0372.2006.0605.1717 06/05/2006
 00000000 00000000 f09c9e1c c18796b3 c1b5d74c f09c9e4c c103a0e3 c1b5154b
 f09c9e78 00001437 c1b5d74c 000005f7 c10bd85a c10bd85a c1cac57c f09c9eb0
 ed0e0000 f09c9e64 c103a185 00000009 f09c9e5c c1b5154b f09c9e78 f09c9e80^M
Call Trace:
 [<c18796b3>] dump_stack+0x4b/0x75
 [<c103a0e3>] warn_slowpath_common+0x7e/0x95
 [<c10bd85a>] ? __trace_find_cmdline+0x66/0xaa
 [<c10bd85a>] ? __trace_find_cmdline+0x66/0xaa
 [<c103a185>] warn_slowpath_fmt+0x33/0x35
 [<c10bd85a>] __trace_find_cmdline+0x66/0xaa^M
 [<c10bed04>] trace_find_cmdline+0x40/0x64
 [<c10c3c16>] trace_print_context+0x27/0xec
 [<c10c4360>] ? trace_seq_printf+0x37/0x5b
 [<c10c0b15>] print_trace_line+0x319/0x39b
 [<c10ba3fb>] ? ring_buffer_read+0x47/0x50
 [<c10c13b1>] s_show+0x192/0x1ab
 [<c10bfd9a>] ? s_next+0x5a/0x7c
 [<c112e76e>] seq_read+0x267/0x34c
 [<c1115a25>] vfs_read+0x8c/0xef
 [<c112e507>] ? seq_lseek+0x154/0x154
 [<c1115ba2>] SyS_read+0x54/0x7f
 [<c188488e>] syscall_call+0x7/0xb
---[ end trace 3f507febd6b4cc83 ]---
>>>> ##### CPU 1 buffer started ####

Which was the __trace_find_cmdline() function complaining about the pid
in the event record being negative.

After adding more test cases, this would trigger more often. Strangely
enough, it would never trigger on a single test, but instead would trigger
only when running all the tests. I believe that was the case because it
required one of the tests to be shutting down via delayed instances while
a new test started up.

After spending several days debugging this, I found that it was caused by
the iterator becoming corrupted. Debugging further, I found out why
the iterator became corrupted. It happened with the rb_iter_reset().

As consuming reads may not read the full reader page, and only part
of it, there's a "read" field to know where the last read took place.
The iterator, must also start at the read position. In the rb_iter_reset()
code, if the reader page was disconnected from the ring buffer, the iterator
would start at the head page within the ring buffer (where writes still
happen). But the mistake there was that it still used the "read" field
to start the iterator on the head page, where it should always start
at zero because readers never read from within the ring buffer where
writes occur.

I originally wrote a patch to have it set the iter->head to 0 instead
of iter->head_page->read, but then I questioned why it wasn't always
setting the iter to point to the reader page, as the reader page is
still valid.  The list_empty(reader_page->list) just means that it was
successful in swapping out. But the reader_page may still have data.

There was a bug report a long time ago that was not reproducible that
had something about trace_pipe (consuming read) not matching trace
(iterator read). This may explain why that happened.

Anyway, the correct answer to this bug is to always use the reader page
an not reset the iterator to inside the writable ring buffer.

Fixes: d769041f8653 "ring_buffer: implement new locking"
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoring-buffer: Up rb_iter_peek() loop count to 3
Steven Rostedt (Red Hat) [Wed, 6 Aug 2014 19:36:31 +0000 (15:36 -0400)]
ring-buffer: Up rb_iter_peek() loop count to 3

commit 021de3d904b88b1771a3a2cfc5b75023c391e646 upstream.

After writting a test to try to trigger the bug that caused the
ring buffer iterator to become corrupted, I hit another bug:

 WARNING: CPU: 1 PID: 5281 at kernel/trace/ring_buffer.c:3766 rb_iter_peek+0x113/0x238()
 Modules linked in: ipt_MASQUERADE sunrpc [...]
 CPU: 1 PID: 5281 Comm: grep Tainted: G        W     3.16.0-rc3-test+ #143
 Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M., BIOS SDBLI944.86P 05/08/2007
  0000000000000000 ffffffff81809a80 ffffffff81503fb0 0000000000000000
  ffffffff81040ca1 ffff8800796d6010 ffffffff810c138d ffff8800796d6010
  ffff880077438c80 ffff8800796d6010 ffff88007abbe600 0000000000000003
 Call Trace:
  [<ffffffff81503fb0>] ? dump_stack+0x4a/0x75
  [<ffffffff81040ca1>] ? warn_slowpath_common+0x7e/0x97
  [<ffffffff810c138d>] ? rb_iter_peek+0x113/0x238
  [<ffffffff810c138d>] ? rb_iter_peek+0x113/0x238
  [<ffffffff810c14df>] ? ring_buffer_iter_peek+0x2d/0x5c
  [<ffffffff810c6f73>] ? tracing_iter_reset+0x6e/0x96
  [<ffffffff810c74a3>] ? s_start+0xd7/0x17b
  [<ffffffff8112b13e>] ? kmem_cache_alloc_trace+0xda/0xea
  [<ffffffff8114cf94>] ? seq_read+0x148/0x361
  [<ffffffff81132d98>] ? vfs_read+0x93/0xf1
  [<ffffffff81132f1b>] ? SyS_read+0x60/0x8e
  [<ffffffff8150bf9f>] ? tracesys+0xdd/0xe2

Debugging this bug, which triggers when the rb_iter_peek() loops too
many times (more than 2 times), I discovered there's a case that can
cause that function to legitimately loop 3 times!

rb_iter_peek() is different than rb_buffer_peek() as the rb_buffer_peek()
only deals with the reader page (it's for consuming reads). The
rb_iter_peek() is for traversing the buffer without consuming it, and as
such, it can loop for one more reason. That is, if we hit the end of
the reader page or any page, it will go to the next page and try again.

That is, we have this:

 1. iter->head > iter->head_page->page->commit
    (rb_inc_iter() which moves the iter to the next page)
    try again

 2. event = rb_iter_head_event()
    event->type_len == RINGBUF_TYPE_TIME_EXTEND
    rb_advance_iter()
    try again

 3. read the event.

But we never get to 3, because the count is greater than 2 and we
cause the WARNING and return NULL.

Up the counter to 3.

Fixes: 69d1b839f7ee "ring-buffer: Bind time extend and data events together"
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
[bwh: Backported to 3.2: drop inapplicable spelling correction]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agos390/locking: Reenable optimistic spinning
Christian Borntraeger [Tue, 5 Aug 2014 07:57:51 +0000 (09:57 +0200)]
s390/locking: Reenable optimistic spinning

commit 36e7fdaa1a04fcf65b864232e1af56a51c7814d6 upstream.

commit 4badad352a6bb202ec68afa7a574c0bb961e5ebc (locking/mutex: Disable
optimistic spinning on some architectures) fenced spinning for
architectures without proper cmpxchg.
There is no need to disable mutex spinning on s390, though:
The instructions CS,CSG and friends provide the proper guarantees.
(We dont implement cmpxchg with locks).

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agohwmon: (ads1015) Fix out-of-bounds array access
Axel Lin [Tue, 5 Aug 2014 01:59:49 +0000 (09:59 +0800)]
hwmon: (ads1015) Fix out-of-bounds array access

commit e981429557cbe10c780fab1c1a237cb832757652 upstream.

Current code uses data_rate as array index in ads1015_read_adc() and uses pga
as array index in ads1015_reg_to_mv, so we must make sure both data_rate and
pga settings are in valid value range.
Return -EINVAL if the setting is out-of-range.

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agohwmon: (lm92) Prevent overflow problem when writing large limits
Axel Lin [Tue, 5 Aug 2014 02:08:31 +0000 (10:08 +0800)]
hwmon: (lm92) Prevent overflow problem when writing large limits

commit 5b963089161b8fb244889c972edf553b9d737545 upstream.

On platforms with sizeof(int) < sizeof(long), writing a temperature
limit larger than MAXINT will result in unpredictable limit values
written to the chip. Avoid auto-conversion from long to int to fix
the problem.

The hysteresis temperature range depends on the value of
data->temp[attr->index], since val is subtracted from it.
Use a wider clamp, [-120000, 220000] should do to cover the
possible range. Also add missing TEMP_TO_REG() on writes into
cached hysteresis value.

Also uses clamp_val to simplify the code a bit.

Signed-off-by: Axel Lin <axel.lin@ingics.com>
[Guenter Roeck: Fixed double TEMP_TO_REG on hysteresis updates]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
[bwh: Backported to 3.2:
 - s/temp\[attr->index\]/temp1_crit/
 - s/temp\[t_hyst\]/temp1_hyst/]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoRDMA/iwcm: Use a default listen backlog if needed
Steve Wise [Fri, 25 Jul 2014 14:11:33 +0000 (09:11 -0500)]
RDMA/iwcm: Use a default listen backlog if needed

commit 2f0304d21867476394cd51a54e97f7273d112261 upstream.

If the user creates a listening cm_id with backlog of 0 the IWCM ends
up not allowing any connection requests at all.  The correct behavior
is for the IWCM to pick a default value if the user backlog parameter
is zero.

Lustre from version 1.8.8 onward uses a backlog of 0, which breaks
iwarp support without this fix.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
[bwh: Backported to 3.2: use register_net_sysctl_table()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agodrm/radeon: load the lm63 driver for an lm64 thermal chip.
Alex Deucher [Mon, 28 Jul 2014 03:21:50 +0000 (23:21 -0400)]
drm/radeon: load the lm63 driver for an lm64 thermal chip.

commit 5dc355325b648dc9b4cf3bea4d968de46fd59215 upstream.

Looks like the lm63 driver supports the lm64 as well.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agopowerpc/mm/numa: Fix break placement
Andrey Utkin [Mon, 4 Aug 2014 20:13:10 +0000 (23:13 +0300)]
powerpc/mm/numa: Fix break placement

commit b00fc6ec1f24f9d7af9b8988b6a198186eb3408c upstream.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=81631
Reported-by: David Binderman <dcb314@hotmail.com>
Signed-off-by: Andrey Utkin <andrey.krieger.utkin@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agodrm/ttm: Fix possible stack overflow by recursive shrinker calls.
Tetsuo Handa [Sun, 3 Aug 2014 11:02:03 +0000 (20:02 +0900)]
drm/ttm: Fix possible stack overflow by recursive shrinker calls.

commit 71336e011d1d2312bcbcaa8fcec7365024f3a95d upstream.

While ttm_dma_pool_shrink_scan() tries to take mutex before doing GFP_KERNEL
allocation, ttm_pool_shrink_scan() does not do it. This can result in stack
overflow if kmalloc() in ttm_page_pool_free() triggered recursion due to
memory pressure.

  shrink_slab()
  => ttm_pool_shrink_scan()
     => ttm_page_pool_free()
        => kmalloc(GFP_KERNEL)
           => shrink_slab()
              => ttm_pool_shrink_scan()
                 => ttm_page_pool_free()
                    => kmalloc(GFP_KERNEL)

Change ttm_pool_shrink_scan() to do like ttm_dma_pool_shrink_scan() does.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Dave Airlie <airlied@redhat.com>
[bwh: Backported to 3.2:
 - Adjust context
 - Change return value in the contended case to follow the old shrinker
   API]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agohwmon: (sis5595) Prevent overflow problem when writing large limits
Axel Lin [Thu, 31 Jul 2014 14:27:04 +0000 (22:27 +0800)]
hwmon: (sis5595) Prevent overflow problem when writing large limits

commit cc336546ddca8c22de83720632431c16a5f9fe9a upstream.

On platforms with sizeof(int) < sizeof(long), writing a temperature
limit larger than MAXINT will result in unpredictable limit values
written to the chip. Avoid auto-conversion from long to int to fix
the problem.

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agohwmon: (gpio-fan) Prevent overflow problem when writing large limits
Axel Lin [Sat, 2 Aug 2014 05:36:38 +0000 (13:36 +0800)]
hwmon: (gpio-fan) Prevent overflow problem when writing large limits

commit 2565fb05d1e9fc0831f7b1c083bcfcb1cba1f020 upstream.

On platforms with sizeof(int) < sizeof(unsigned long), writing a rpm value
larger than MAXINT will result in unpredictable limit values written to the
chip. Avoid auto-conversion from unsigned long to int to fix the problem.

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoALSA: virtuoso: add Xonar Essence STX II support
Clemens Ladisch [Mon, 4 Aug 2014 13:17:55 +0000 (15:17 +0200)]
ALSA: virtuoso: add Xonar Essence STX II support

commit f42bb22243d2ae264d721b055f836059fe35321f upstream.

Just add the PCI ID for the STX II.  It appears to work the same as the
STX, except for the addition of the not-yet-supported daughterboard.

Tested-by: Mario <fugazzi99@gmail.com>
Tested-by: corubba <corubba@gmx.de>
Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoALSA: virtuoso: Xonar DSX support
Sergiu Giurgiu [Sun, 9 Sep 2012 09:14:15 +0000 (05:14 -0400)]
ALSA: virtuoso: Xonar DSX support

commit 4492363251235c4499a2d073c5f09121ea23d39d upstream.

This patch adds support for ASUS - Xonar DSX sound cards. Tested on
openSUSE 12.2 with kernel:
Linux 3.4.6-2.10-desktop #1 SMP PREEMPT Thu Jul 26 09:36:26 UTC 2012 (641c197) x86_64 x86_64 x86_64 GNU/Linux
Works:
 - play sounds
 - adjust volume on master channel.
 - mute .

Since Xonar DS uses the same chip, everything that works for DS should
work for DSX as well.

Signed-off-by: Sergiu Giurgiu <sgiurgiu11@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoUSB: serial: ftdi_sio: Add support for new Xsens devices
Patrick Riphagen [Thu, 24 Jul 2014 07:09:50 +0000 (09:09 +0200)]
USB: serial: ftdi_sio: Add support for new Xsens devices

commit 4bdcde358b4bda74e356841d351945ca3f2245dd upstream.

This adds support for new Xsens devices, using Xsens' own Vendor ID.

Signed-off-by: Patrick Riphagen <patrick.riphagen@xsens.com>
Signed-off-by: Frans Klaver <frans.klaver@xsens.com>
Cc: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoUSB: serial: ftdi_sio: Annotate the current Xsens PID assignments
Patrick Riphagen [Thu, 24 Jul 2014 07:12:52 +0000 (09:12 +0200)]
USB: serial: ftdi_sio: Annotate the current Xsens PID assignments

commit 9273b8a270878906540349422ab24558b9d65716 upstream.

The converters are used in specific products. It can be useful to know
which they are exactly.

Signed-off-by: Patrick Riphagen <patrick.riphagen@xsens.com>
Signed-off-by: Frans Klaver <frans.klaver@xsens.com>
Cc: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agonetlabel: fix a problem when setting bits below the previously lowest bit
Paul Moore [Fri, 1 Aug 2014 15:17:03 +0000 (11:17 -0400)]
netlabel: fix a problem when setting bits below the previously lowest bit

commit 41c3bd2039e0d7b3dc32313141773f20716ec524 upstream.

The NetLabel category (catmap) functions have a problem in that they
assume categories will be set in an increasing manner, e.g. the next
category set will always be larger than the last.  Unfortunately, this
is not a valid assumption and could result in problems when attempting
to set categories less than the startbit in the lowest catmap node.
In some cases kernel panics and other nasties can result.

This patch corrects the problem by checking for this and allocating a
new catmap node instance and placing it at the front of the list.

Reported-by: Christian Evans <frodox@zoho.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Tested-by: Casey Schaufler <casey@schaufler-ca.com>
[bwh: Backported to 3.2: adjust filename for SMACK]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agonetlabel: use GFP flags from caller instead of GFP_ATOMIC
Dan Carpenter [Wed, 21 Mar 2012 20:41:01 +0000 (20:41 +0000)]
netlabel: use GFP flags from caller instead of GFP_ATOMIC

commit 64b5fad526f63e9b56752a7e8e153b99ec0ddecd upstream.

This function takes a GFP flags as a parameter, but they are never used.
We don't take a lock in this function so there is no reason to prefer
GFP_ATOMIC over the caller's GFP flags.

There is only one caller, cipso_v4_map_cat_rng_ntoh(), and it passes
GFP_ATOMIC as the GFP flags so this doesn't change how the code works.
It's just a cleanup.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoARM: OMAP3: Fix choice of omap3_restore_es function in OMAP34XX rev3.1.2 case.
Jeremy Vial [Thu, 31 Jul 2014 13:10:33 +0000 (15:10 +0200)]
ARM: OMAP3: Fix choice of omap3_restore_es function in OMAP34XX rev3.1.2 case.

commit 9b5f7428f8b16bd8980213f2b70baf1dd0b9e36c upstream.

According to the comment “restore_es3: applies to 34xx >= ES3.0" in
"arch/arm/mach-omap2/sleep34xx.S”, omap3_restore_es3 should be used
if the revision of an OMAP34xx is ES3.1.2.

Signed-off-by: Jeremy Vial <jvial@adeneo-embedded.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agomnt: Change the default remount atime from relatime to the existing value
Eric W. Biederman [Tue, 29 Jul 2014 00:36:04 +0000 (17:36 -0700)]
mnt: Change the default remount atime from relatime to the existing value

commit ffbc6f0ead47fa5a1dc9642b0331cb75c20a640e upstream.

Since March 2009 the kernel has treated the state that if no
MS_..ATIME flags are passed then the kernel defaults to relatime.

Defaulting to relatime instead of the existing atime state during a
remount is silly, and causes problems in practice for people who don't
specify any MS_...ATIME flags and to get the default filesystem atime
setting.  Those users may encounter a permission error because the
default atime setting does not work.

A default that does not work and causes permission problems is
ridiculous, so preserve the existing value to have a default
atime setting that is always guaranteed to work.

Using the default atime setting in this way is particularly
interesting for applications built to run in restricted userspace
environments without /proc mounted, as the existing atime mount
options of a filesystem can not be read from /proc/mounts.

In practice this fixes user space that uses the default atime
setting on remount that are broken by the permission checks
keeping less privileged users from changing more privileged users
atime settings.

Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
[bwh: Backported to 3.2: add definition of MNT_ATIME_MASK, as we don't
 need the fix that introduced that definition upstream]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agocrypto: af_alg - properly label AF_ALG socket
Milan Broz [Tue, 29 Jul 2014 18:41:09 +0000 (18:41 +0000)]
crypto: af_alg - properly label AF_ALG socket

commit 4c63f83c2c2e16a13ce274ee678e28246bd33645 upstream.

Th AF_ALG socket was missing a security label (e.g. SELinux)
which means that socket was in "unlabeled" state.

This was recently demonstrated in the cryptsetup package
(cryptsetup v1.6.5 and later.)
See https://bugzilla.redhat.com/show_bug.cgi?id=1115120

This patch clones the sock's label from the parent sock
and resolves the issue (similar to AF_BLUETOOTH protocol family).

Signed-off-by: Milan Broz <gmazyland@gmail.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoMIPS: GIC: Prevent array overrun
Jeffrey Deans [Thu, 17 Jul 2014 08:20:56 +0000 (09:20 +0100)]
MIPS: GIC: Prevent array overrun

commit ffc8415afab20bd97754efae6aad1f67b531132b upstream.

A GIC interrupt which is declared as having a GIC_MAP_TO_NMI_MSK
mapping causes the cpu parameter to gic_setup_intr() to be increased
to 32, causing memory corruption when pcpu_masks[] is written to again
later in the function.

Signed-off-by: Jeffrey Deans <jeffrey.deans@imgtec.com>
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7375/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agohwmon: (amc6821) Fix possible race condition bug
Axel Lin [Thu, 31 Jul 2014 01:43:19 +0000 (09:43 +0800)]
hwmon: (amc6821) Fix possible race condition bug

commit cf44819c98db11163f58f08b822d626c7a8f5188 upstream.

Ensure mutex lock protects the read-modify-write period to prevent possible
race condition bug.
In additional, update data->valid should also be protected by the mutex lock.

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agohwmon: (amc6821) Fix return value
Sachin Kamat [Wed, 11 Sep 2013 04:19:50 +0000 (09:49 +0530)]
hwmon: (amc6821) Fix return value

commit 3499e5b2e14b792fe411302fea3b6fcc4ba40ef2 upstream.

Propagate return value obtained from i2c_smbus_read_byte_data()
instead of hardcoding.

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Cc: T. Mertelj <tomaz.mertelj@guest.arnes.si>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agohwmon: (lm78) Fix overflow problems seen when writing large temperature limits
Guenter Roeck [Wed, 30 Jul 2014 03:48:59 +0000 (20:48 -0700)]
hwmon: (lm78) Fix overflow problems seen when writing large temperature limits

commit 1074d683a51f1aded3562add9ef313e75d557327 upstream.

On platforms with sizeof(int) < sizeof(long), writing a temperature
limit larger than MAXINT will result in unpredictable limit values
written to the chip. Avoid auto-conversion from long to int to fix
the problem.

Cc: Axel Lin <axel.lin@ingics.com>
Reviewed-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agohwmon: (lm85) Fix various errors on attribute writes
Guenter Roeck [Wed, 30 Jul 2014 05:23:12 +0000 (22:23 -0700)]
hwmon: (lm85) Fix various errors on attribute writes

commit 3248c3b771ddd9d31695da17ba350eb6e1b80a53 upstream.

Temperature limit register writes did not account for negative numbers.
As a result, writing -127000 resulted in -126000 written into the
temperature limit register. This problem affected temp[1-3]_min,
temp[1-3]_max, temp[1-3]_auto_temp_crit, and temp[1-3]_auto_temp_min.

When writing pwm[1-3]_freq, a long variable was auto-converted into an int
without range check. Wiring values larger than MAXINT resulted in unexpected
register values.

When writing temp[1-3]_auto_temp_max, an unsigned long variable was
auto-converted into an int without range check. Writing values larger than
MAXINT resulted in unexpected register values.

vrm is an u8, so the written value needs to be limited to [0, 255].

Cc: Axel Lin <axel.lin@ingics.com>
Reviewed-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
[bwh: Backported to 3.2:
 - Driver is not using clamp_val(); keep using SENSORS_LIMIT() for consistency
 - Driver is not using kstrtoul(); make the minimum change to store_vrm_reg()
   so we can validate the value before assigning]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoext4: fix ext4_discard_allocated_blocks() if we can't allocate the pa struct
Theodore Ts'o [Thu, 31 Jul 2014 02:17:17 +0000 (22:17 -0400)]
ext4: fix ext4_discard_allocated_blocks() if we can't allocate the pa struct

commit 86f0afd463215fc3e58020493482faa4ac3a4d69 upstream.

If there is a failure while allocating the preallocation structure, a
number of blocks can end up getting marked in the in-memory buddy
bitmap, and then not getting released.  This can result in the
following corruption getting reported by the kernel:

EXT4-fs error (device sda3): ext4_mb_generate_buddy:758: group 1126,
12793 clusters in bitmap, 12729 in gd

In that case, we need to release the blocks using mb_free_blocks().

Tested: fs smoke test; also demonstrated that with injected errors,
the file system is no longer getting corrupted

Google-Bug-Id: 16657874

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoext4: cleanup in ext4_discard_allocated_blocks()
Zheng Liu [Mon, 28 May 2012 21:53:53 +0000 (17:53 -0400)]
ext4: cleanup in ext4_discard_allocated_blocks()

commit 400db9d30146dc062aaba97a6301b425eb6015bc upstream.

remove 'len' variable in ext4_discard_allocated_blocks() because it is
useless.

Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agomd/raid1,raid10: always abort recover on write error.
NeilBrown [Thu, 31 Jul 2014 00:16:29 +0000 (10:16 +1000)]
md/raid1,raid10: always abort recover on write error.

commit 2446dba03f9dabe0b477a126cbeb377854785b47 upstream.

Currently we don't abort recovery on a write error if the write error
to the recovering device was triggerd by normal IO (as opposed to
recovery IO).

This means that for one bitmap region, the recovery might write to the
recovering device for a few sectors, then not bother for subsequent
sectors (as it never writes to failed devices).  In this case
the bitmap bit will be cleared, but it really shouldn't.

The result is that if the recovering device fails and is then re-added
(after fixing whatever hardware problem triggerred the failure),
the second recovery won't redo the region it was in the middle of,
so some of the device will not be recovered properly.

If we abort the recovery, the region being processes will be cancelled
(bit not cleared) and the whole region will be retried.

As the bug can result in data corruption the patch is suitable for
-stable.  For kernels prior to 3.11 there is a conflict in raid10.c
which will require care.

Original-from: jiao hui <jiaohui@bwstor.com.cn>
Reported-and-tested-by: jiao hui <jiaohui@bwstor.com.cn>
Signed-off-by: NeilBrown <neilb@suse.de>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agomm, thp: do not allow thp faults to avoid cpuset restrictions
David Rientjes [Wed, 30 Jul 2014 23:08:24 +0000 (16:08 -0700)]
mm, thp: do not allow thp faults to avoid cpuset restrictions

commit b104a35d32025ca740539db2808aa3385d0f30eb upstream.

The page allocator relies on __GFP_WAIT to determine if ALLOC_CPUSET
should be set in allocflags.  ALLOC_CPUSET controls if a page allocation
should be restricted only to the set of allowed cpuset mems.

Transparent hugepages clears __GFP_WAIT when defrag is disabled to prevent
the fault path from using memory compaction or direct reclaim.  Thus, it
is unfairly able to allocate outside of its cpuset mems restriction as a
side-effect.

This patch ensures that ALLOC_CPUSET is only cleared when the gfp mask is
truly GFP_ATOMIC by verifying it is also not a thp allocation.

Signed-off-by: David Rientjes <rientjes@google.com>
Reported-by: Alex Thorlton <athorlton@sgi.com>
Tested-by: Alex Thorlton <athorlton@sgi.com>
Cc: Bob Liu <lliubbo@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Hedi Berriche <hedi@sgi.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoMIPS: Prevent user from setting FCSR cause bits
Paul Burton [Tue, 22 Jul 2014 13:21:21 +0000 (14:21 +0100)]
MIPS: Prevent user from setting FCSR cause bits

commit b1442d39fac2fcfbe6a4814979020e993ca59c9e upstream.

If one or more matching FCSR cause & enable bits are set in saved thread
context then when that context is restored the kernel will take an FP
exception. This is of course undesirable and considered an oops, leading
to the kernel writing a backtrace to the console and potentially
rebooting depending upon the configuration. Thus the kernel avoids this
situation by clearing the cause bits of the FCSR register when handling
FP exceptions and after emulating FP instructions.

However the kernel does not prevent userland from setting arbitrary FCSR
cause & enable bits via ptrace, using either the PTRACE_POKEUSR or
PTRACE_SETFPREGS requests. This means userland can trivially cause the
kernel to oops on any system with an FPU. Prevent this from happening
by clearing the cause bits when writing to the saved FCSR context via
ptrace.

This problem appears to exist at least back to the beginning of the git
era in the PTRACE_POKEUSR case.

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: Paul Burton <paul.burton@imgtec.com>
Patchwork: https://patchwork.linux-mips.org/patch/7438/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoMIPS: tlbex: Fix a missing statement for HUGETLB
Huacai Chen [Tue, 29 Jul 2014 06:54:40 +0000 (14:54 +0800)]
MIPS: tlbex: Fix a missing statement for HUGETLB

commit 8393c524a25609a30129e4a8975cf3b91f6c16a5 upstream.

In commit 2c8c53e28f1 (MIPS: Optimize TLB handlers for Octeon CPUs)
build_r4000_tlb_refill_handler() is modified. But it doesn't compatible
with the original code in HUGETLB case. Because there is a copy & paste
error and one line of code is missing. It is very easy to produce a bug
with LTP's hugemmap05 test.

Signed-off-by: Huacai Chen <chenhc@lemote.com>
Signed-off-by: Binbin Zhou <zhoubb@lemote.com>
Cc: John Crispin <john@phrozen.org>
Cc: Steven J. Hill <Steven.Hill@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: Fuxin Zhang <zhangfx@lemote.com>
Cc: Zhangjin Wu <wuzhangjin@gmail.com>
Patchwork: https://patchwork.linux-mips.org/patch/7496/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agohwmon: (ads1015) Fix off-by-one for valid channel index checking
Axel Lin [Wed, 30 Jul 2014 03:13:52 +0000 (11:13 +0800)]
hwmon: (ads1015) Fix off-by-one for valid channel index checking

commit 56de1377ad92f72ee4e5cb0faf7a9b6048fdf0bf upstream.

Current code uses channel as array index, so the valid channel value is
0 .. ADS1015_CHANNELS - 1.

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agotpm: Provide a generic means to override the chip returned timeouts
Jason Gunthorpe [Thu, 22 May 2014 00:26:44 +0000 (18:26 -0600)]
tpm: Provide a generic means to override the chip returned timeouts

commit 8e54caf407b98efa05409e1fee0e5381abd2b088 upstream.

Some Atmel TPMs provide completely wrong timeouts from their
TPM_CAP_PROP_TIS_TIMEOUT query. This patch detects that and returns
new correct values via a DID/VID table in the TIS driver.

Tested on ARM using an AT97SC3204T FW version 37.16

[PHuewe: without this fix these 'broken' Atmel TPMs won't function on
older kernels]
Signed-off-by: "Berg, Christopher" <Christopher.Berg@atmel.com>
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
[bwh: Backported to 3.2:
 - Adjust filename, context
 - s/chip->ops->/chip->vendor./]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agonet: sendmsg: fix NULL pointer dereference
Andrey Ryabinin [Sat, 26 Jul 2014 17:26:58 +0000 (21:26 +0400)]
net: sendmsg: fix NULL pointer dereference

commit 40eea803c6b2cfaab092f053248cbeab3f368412 upstream.

Sasha's report:
> While fuzzing with trinity inside a KVM tools guest running the latest -next
> kernel with the KASAN patchset, I've stumbled on the following spew:
>
> [ 4448.949424] ==================================================================
> [ 4448.951737] AddressSanitizer: user-memory-access on address 0
> [ 4448.952988] Read of size 2 by thread T19638:
> [ 4448.954510] CPU: 28 PID: 19638 Comm: trinity-c76 Not tainted 3.16.0-rc4-next-20140711-sasha-00046-g07d3099-dirty #813
> [ 4448.956823]  ffff88046d86ca40 0000000000000000 ffff880082f37e78 ffff880082f37a40
> [ 4448.958233]  ffffffffb6e47068 ffff880082f37a68 ffff880082f37a58 ffffffffb242708d
> [ 4448.959552]  0000000000000000 ffff880082f37a88 ffffffffb24255b1 0000000000000000
> [ 4448.961266] Call Trace:
> [ 4448.963158] dump_stack (lib/dump_stack.c:52)
> [ 4448.964244] kasan_report_user_access (mm/kasan/report.c:184)
> [ 4448.965507] __asan_load2 (mm/kasan/kasan.c:352)
> [ 4448.966482] ? netlink_sendmsg (net/netlink/af_netlink.c:2339)
> [ 4448.967541] netlink_sendmsg (net/netlink/af_netlink.c:2339)
> [ 4448.968537] ? get_parent_ip (kernel/sched/core.c:2555)
> [ 4448.970103] sock_sendmsg (net/socket.c:654)
> [ 4448.971584] ? might_fault (mm/memory.c:3741)
> [ 4448.972526] ? might_fault (./arch/x86/include/asm/current.h:14 mm/memory.c:3740)
> [ 4448.973596] ? verify_iovec (net/core/iovec.c:64)
> [ 4448.974522] ___sys_sendmsg (net/socket.c:2096)
> [ 4448.975797] ? put_lock_stats.isra.13 (./arch/x86/include/asm/preempt.h:98 kernel/locking/lockdep.c:254)
> [ 4448.977030] ? lock_release_holdtime (kernel/locking/lockdep.c:273)
> [ 4448.978197] ? lock_release_non_nested (kernel/locking/lockdep.c:3434 (discriminator 1))
> [ 4448.979346] ? check_chain_key (kernel/locking/lockdep.c:2188)
> [ 4448.980535] __sys_sendmmsg (net/socket.c:2181)
> [ 4448.981592] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2600)
> [ 4448.982773] ? trace_hardirqs_on (kernel/locking/lockdep.c:2607)
> [ 4448.984458] ? syscall_trace_enter (arch/x86/kernel/ptrace.c:1500 (discriminator 2))
> [ 4448.985621] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2600)
> [ 4448.986754] SyS_sendmmsg (net/socket.c:2201)
> [ 4448.987708] tracesys (arch/x86/kernel/entry_64.S:542)
> [ 4448.988929] ==================================================================

This reports means that we've come to netlink_sendmsg() with msg->msg_name == NULL and msg->msg_namelen > 0.

After this report there was no usual "Unable to handle kernel NULL pointer dereference"
and this gave me a clue that address 0 is mapped and contains valid socket address structure in it.

This bug was introduced in f3d3342602f8bcbf37d7c46641cb9bca7618eb1c
(net: rework recvmsg handler msg_name and msg_namelen logic).
Commit message states that:
"Set msg->msg_name = NULL if user specified a NULL in msg_name but had a
 non-null msg_namelen in verify_iovec/verify_compat_iovec. This doesn't
 affect sendto as it would bail out earlier while trying to copy-in the
 address."
But in fact this affects sendto when address 0 is mapped and contains
socket address structure in it. In such case copy-in address will succeed,
verify_iovec() function will successfully exit with msg->msg_namelen > 0
and msg->msg_name == NULL.

This patch fixes it by setting msg_namelen to 0 if msg_name == NULL.

Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Eric Dumazet <edumazet@google.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoiommu/vt-d: Exclude devices using RMRRs from IOMMU API domains
Alex Williamson [Thu, 3 Jul 2014 15:57:02 +0000 (09:57 -0600)]
iommu/vt-d: Exclude devices using RMRRs from IOMMU API domains

commit c875d2c1b8083cd627ea0463e20bf22c2d7421ee upstream.

The user of the IOMMU API domain expects to have full control of
the IOVA space for the domain.  RMRRs are fundamentally incompatible
with that idea.  We can neither map the RMRR into the IOMMU API
domain, nor can we guarantee that the device won't continue DMA with
the area described by the RMRR as part of the new domain.  Therefore
we must prevent such devices from being used by the IOMMU API.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
[bwh: Backported to 3.2: driver only operates on PCI devices]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoFix gcc-4.9.0 miscompilation of load_balance() in scheduler
Linus Torvalds [Sat, 26 Jul 2014 21:52:01 +0000 (14:52 -0700)]
Fix gcc-4.9.0 miscompilation of load_balance() in scheduler

commit 2062afb4f804afef61cbe62a30cac9a46e58e067 upstream.

Michel Dänzer and a couple of other people reported inexplicable random
oopses in the scheduler, and the cause turns out to be gcc mis-compiling
the load_balance() function when debugging is enabled.  The gcc bug
apparently goes back to gcc-4.5, but slight optimization changes means
that it now showed up as a problem in 4.9.0 and 4.9.1.

The instruction scheduling problem causes gcc to schedule a spill
operation to before the stack frame has been created, which in turn can
corrupt the spilled value if an interrupt comes in.  There may be other
effects of this bug too, but that's the code generation problem seen in
Michel's case.

This is fixed in current gcc HEAD, but the workaround as suggested by
Markus Trippelsdorf is pretty simple: use -fno-var-tracking-assignments
when compiling the kernel, which disables the gcc code that causes the
problem.  This can result in slightly worse debug information for
variable accesses, but that is infinitely preferable to actual code
generation problems.

Doing this unconditionally (not just for CONFIG_DEBUG_INFO) also allows
non-debug builds to verify that the debug build would be identical: we
can do

    export GCC_COMPARE_DEBUG=1

to make gcc internally verify that the result of the build is
independent of the "-g" flag (it will make the compiler build everything
twice, toggling the debug flag, and compare the results).

Without the "-fno-var-tracking-assignments" option, the build would fail
(even with 4.8.3 that didn't show the actual stack frame bug) with a gcc
compare failure.

See also gcc bugzilla:

  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61801

Reported-by: Michel Dänzer <michel@daenzer.net>
Suggested-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: Jakub Jelinek <jakub@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoDrivers: scsi: storvsc: Implement a eh_timed_out handler
K. Y. Srinivasan [Sat, 12 Jul 2014 16:48:30 +0000 (09:48 -0700)]
Drivers: scsi: storvsc: Implement a eh_timed_out handler

commit 56b26e69c8283121febedd12b3cc193384af46b9 upstream.

On Azure, we have seen instances of unbounded I/O latencies. To deal with
this issue, implement handler that can reset the timeout. Note that the
host gaurantees that it will respond to each command that has been issued.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
[hch: added a better comment explaining the issue]
Signed-off-by: Christoph Hellwig <hch@lst.de>
[bwh: Backported to 3.2: adjust filename, context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agohpsa: fix bad -ENOMEM return value in hpsa_big_passthru_ioctl
Stephen M. Cameron [Thu, 3 Jul 2014 15:18:03 +0000 (10:18 -0500)]
hpsa: fix bad -ENOMEM return value in hpsa_big_passthru_ioctl

commit 0758f4f732b08b6ef07f2e5f735655cf69fea477 upstream.

When copy_from_user fails, return -EFAULT, not -ENOMEM

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Reported-by: Robert Elliott <elliott@hp.com>
Reviewed-by: Joe Handzik <joseph.t.handzik@hp.com>
Reviewed-by: Scott Teel <scott.teel@hp.com>
Reviewed by: Mike MIller <michael.miller@canonical.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agobfa: Fix undefined bit shift on big-endian architectures with 32-bit DMA address
Ben Hutchings [Sun, 8 Jun 2014 22:33:25 +0000 (23:33 +0100)]
bfa: Fix undefined bit shift on big-endian architectures with 32-bit DMA address

commit 03a6c3ff3282ee9fa893089304d951e0be93a144 upstream.

bfa_swap_words() shifts its argument (assumed to be 64-bit) by 32 bits
each way.  In two places the argument type is dma_addr_t, which may be
32-bit, in which case the effect of the bit shift is undefined:

drivers/scsi/bfa/bfa_fcpim.c: In function 'bfa_ioim_send_ioreq':
drivers/scsi/bfa/bfa_fcpim.c:2497:4: warning: left shift count >= width of type [enabled by default]
    addr = bfa_sgaddr_le(sg_dma_address(sg));
    ^
drivers/scsi/bfa/bfa_fcpim.c:2497:4: warning: right shift count >= width of type [enabled by default]
drivers/scsi/bfa/bfa_fcpim.c:2509:4: warning: left shift count >= width of type [enabled by default]
    addr = bfa_sgaddr_le(sg_dma_address(sg));
    ^
drivers/scsi/bfa/bfa_fcpim.c:2509:4: warning: right shift count >= width of type [enabled by default]

Avoid this by adding casts to u64 in bfa_swap_words().

Compile-tested only.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Anil Gurumurthy <anil.gurumurthy@qlogic.com>
Fixes: f16a17507b09 ('[SCSI] bfa: remove all OS wrappers')
Signed-off-by: Christoph Hellwig <hch@lst.de>
7 years agostaging: vt6655: Fix disassociated messages every 10 seconds
Malcolm Priestley [Wed, 23 Jul 2014 20:35:12 +0000 (21:35 +0100)]
staging: vt6655: Fix disassociated messages every 10 seconds

commit 4aa0abed3a2a11b7d71ad560c1a3e7631c5a31cd upstream.

byReAssocCount is incremented every second resulting in
disassociated message being send every 10 seconds whether
connection or not.

byReAssocCount should only advance while eCommandState
is in WLAN_ASSOCIATE_WAIT

Change existing scope to if condition.

Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.2: adjust context, indentation]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agostaging: vt6655: Fix Warning on boot handle_irq_event_percpu.
Malcolm Priestley [Wed, 23 Jul 2014 20:35:11 +0000 (21:35 +0100)]
staging: vt6655: Fix Warning on boot handle_irq_event_percpu.

commit 6cff1f6ad4c615319c1a146b2aa0af1043c5e9f5 upstream.

WARNING: CPU: 0 PID: 929 at /home/apw/COD/linux/kernel/irq/handle.c:147 handle_irq_event_percpu+0x1d1/0x1e0()
irq 17 handler device_intr+0x0/0xa80 [vt6655_stage] enabled interrupts

Using spin_lock_irqsave appears to fix this.

Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.2: adjust context, indentation]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agohwmon: (smsc47m192) Fix temperature limit and vrm write operations
Guenter Roeck [Fri, 18 Jul 2014 14:31:18 +0000 (07:31 -0700)]
hwmon: (smsc47m192) Fix temperature limit and vrm write operations

commit 043572d5444116b9d9ad8ae763cf069e7accbc30 upstream.

Temperature limit clamps are applied after converting the temperature
from milli-degrees C to degrees C, so either the clamp limit needs
to be specified in degrees C, not milli-degrees C, or clamping must
happen before converting to degrees C. Use the latter method to avoid
overflows.

vrm is an u8, so the written value needs to be limited to [0, 255].

Cc: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
[bwh: Backported to 3.2:
 - Driver is not using clamp_val(); keep using SENSORS_LIMIT() for consistency
 - Driver is not using kstrtoul(); make the minimum change to set_vrm() so
   we can validate the value before assigning]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agodrm/radeon: fix irq ring buffer overflow handling
Christian König [Wed, 23 Jul 2014 07:47:58 +0000 (09:47 +0200)]
drm/radeon: fix irq ring buffer overflow handling

commit e8c214d22e76dd0ead38f97f8d2dc09aac70d651 upstream.

We must mask out the overflow bit as well, otherwise
the wptr will never match the rptr again and the interrupt
handler will loop forever.

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
[bwh: Backported to 3.2: drop changes for unsupported GPUs]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoUSB: Fix persist resume of some SS USB devices
Pratyush Anand [Fri, 18 Jul 2014 07:07:10 +0000 (12:37 +0530)]
USB: Fix persist resume of some SS USB devices

commit a40178b2fa6ad87670fb1e5fa4024db00c149629 upstream.

Problem Summary: Problem has been observed generally with PM states
where VBUS goes off during suspend. There are some SS USB devices which
take longer time for link training compared to many others.  Such
devices fail to reconnect with same old address which was associated
with it before suspend.

When system resumes, at some point of time (dpm_run_callback->
usb_dev_resume->usb_resume->usb_resume_both->usb_resume_device->
usb_port_resume) SW reads hub status. If device is present,
then it finishes port resume and re-enumerates device with same
address. If device is not present then, SW thinks that device was
removed during suspend and therefore does logical disconnection
and removes all the resource allocated for this device.

Now, if I put sufficient delay just before root hub status read in
usb_resume_device then, SW sees always that device is present. In normal
course(without any delay) SW sees that no device is present and then SW
removes all resource associated with the device at this port.  In the
latter case, after sometime, device says that hey I am here, now host
enumerates it, but with new address.

Problem had been reproduced when I connect verbatim USB3.0 hard disc
with my STiH407 XHCI host running with 3.10 kernel.

I see that similar problem has been reported here.
https://bugzilla.kernel.org/show_bug.cgi?id=53211
Reading above it seems that bug was not in 3.6.6 and was present in 3.8
and again it was not present for some in 3.12.6, while it was present
for few others. I tested with 3.13-FC19 running at i686 desktop, problem
was still there. However, I was failed to reproduce it with 3.16-RC4
running at same i686 machine. I would say it is just a random
observation. Problem for few devices is always there, as I am unable to
find a proper fix for the issue.

So, now question is what should be the amount of delay so that host is
always able to recognize suspended device after resume.

XHCI specs 4.19.4 says that when Link training is successful, port sets
CSC bit to 1. So if SW reads port status before successful link
training, then it will not find device to be present.  USB Analyzer log
with such buggy devices show that in some cases device switch on the
RX termination after long delay of host enabling the VBUS. In few other
cases it has been seen that device fails to negotiate link training in
first attempt. It has been reported till now that few devices take as
long as 2000 ms to train the link after host enabling its VBUS and
RX termination. This patch implements a 2000 ms timeout for CSC bit to set
ie for link training. If in a case link trains before timeout, loop will
exit earlier.

This patch implements above delay, but only for SS device and when
persist is enabled.

So, for the good device overhead is almost none. While for the bad
devices penalty could be the time which it take for link training.
But, If a device was connected before suspend, and was removed
while system was asleep, then the penalty would be the timeout ie
2000 ms.

Results:

Verbatim USB SS hard disk connected with STiH407 USB host running 3.10
Kernel resumes in 461 msecs without this patch, but hard disk is
assigned a new device address. Same system resumes in 790 msecs with
this patch, but with old device address.

Signed-off-by: Pratyush Anand <pratyush.anand@st.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agousbcore: don't log on consecutive debounce failures of the same port
Oliver Neukum [Mon, 14 Jul 2014 13:39:49 +0000 (15:39 +0200)]
usbcore: don't log on consecutive debounce failures of the same port

commit 5ee0f803cc3a0738a63288e4a2f453c85889fbda upstream.

Some laptops have an internal port for a BT device which picks
up noise when the kill switch is used, but not enough to trigger
printk_rlimit(). So we shouldn't log consecutive faults of this kind.

Signed-off-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.2:
 - Adjust context
 - Error message already includes the port number]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoahci: add support for the Promise FastTrak TX8660 SATA HBA (ahci mode)
Romain Degez [Fri, 11 Jul 2014 16:08:13 +0000 (18:08 +0200)]
ahci: add support for the Promise FastTrak TX8660 SATA HBA (ahci mode)

commit b32bfc06aefab61acc872dec3222624e6cd867ed upstream.

Add support of the Promise FastTrak TX8660 SATA HBA in ahci mode by
registering the board in the ahci_pci_tbl[].

Note: this HBA also provide a hardware RAID mode when activated in
BIOS but specific drivers from the manufacturer are required in this
case.

Signed-off-by: Romain Degez <romain.degez@gmail.com>
Tested-by: Romain Degez <romain.degez@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoUSB: OHCI: don't lose track of EDs when a controller dies
Alan Stern [Thu, 17 Jul 2014 20:34:29 +0000 (16:34 -0400)]
USB: OHCI: don't lose track of EDs when a controller dies

commit 977dcfdc60311e7aa571cabf6f39c36dde13339e upstream.

This patch fixes a bug in ohci-hcd.  When an URB is unlinked, the
corresponding Endpoint Descriptor is added to the ed_rm_list and taken
off the hardware schedule.  Once the ED is no longer visible to the
hardware, finish_unlinks() handles the URBs that were unlinked or have
completed.  If any URBs remain attached to the ED, the ED is added
back to the hardware schedule -- but only if the controller is
running.

This fails when a controller dies.  A non-empty ED does not get added
back to the hardware schedule and does not remain on the ed_rm_list;
ohci-hcd loses track of it.  The remaining URBs cannot be unlinked,
which causes the USB stack to hang.

The patch changes finish_unlinks() so that non-empty EDs remain on
the ed_rm_list if the controller isn't running.  This requires moving
some of the existing code around, to avoid modifying the ED's hardware
fields more than once.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.2: keep using HC_IS_RUNNING()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoscsi: handle flush errors properly
James Bottomley [Thu, 3 Jul 2014 17:17:34 +0000 (19:17 +0200)]
scsi: handle flush errors properly

commit 89fb4cd1f717a871ef79fa7debbe840e3225cd54 upstream.

Flush commands don't transfer data and thus need to be special cased
in the I/O completion handler so that we can propagate errors to
the block layer and filesystem.

Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Reported-by: Steven Haber <steven@qumulo.com>
Tested-by: Steven Haber <steven@qumulo.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoBluetooth: never linger on process exit
Vladimir Davydov [Tue, 15 Jul 2014 08:25:28 +0000 (12:25 +0400)]
Bluetooth: never linger on process exit

commit 093facf3634da1b0c2cc7ed106f1983da901bbab upstream.

If the current process is exiting, lingering on socket close will make
it unkillable, so we should avoid it.

Reproducer:

  #include <sys/types.h>
  #include <sys/socket.h>

  #define BTPROTO_L2CAP   0
  #define BTPROTO_SCO     2
  #define BTPROTO_RFCOMM  3

  int main()
  {
          int fd;
          struct linger ling;

          fd = socket(PF_BLUETOOTH, SOCK_STREAM, BTPROTO_RFCOMM);
          //or: fd = socket(PF_BLUETOOTH, SOCK_DGRAM, BTPROTO_L2CAP);
          //or: fd = socket(PF_BLUETOOTH, SOCK_SEQPACKET, BTPROTO_SCO);

          ling.l_onoff = 1;
          ling.l_linger = 1000000000;
          setsockopt(fd, SOL_SOCKET, SO_LINGER, &ling, sizeof(ling));

          return 0;
  }

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agox86: don't exclude low BIOS area when allocating address space for non-PCI cards
Christoph Schulz [Wed, 16 Jul 2014 08:00:57 +0000 (10:00 +0200)]
x86: don't exclude low BIOS area when allocating address space for non-PCI cards

commit cbace46a9710a480cae51e4611697df5de41713e upstream.

Commit 30919b0bf356 ("x86: avoid low BIOS area when allocating address
space") moved the test for resource allocations that fall within the first
1MB of address space from the PCI-specific path to a generic path, such
that all resource allocations will avoid this area.  However, this breaks
ISA cards which need to allocate a memory region within the first 1MB.  An
example is the i82365 PCMCIA controller and derivatives like the Ricoh
RF5C296/396 which map part of the PCMCIA socket memory address space into
the first 1MB of system memory address space.  They do not work anymore as
no usable memory region exists due to this change:

  Intel ISA PCIC probe: Ricoh RF5C296/396 ISA-to-PCMCIA at port 0x3e0 ofs 0x00, 2 sockets
  host opts [0]: none
  host opts [1]: none
  ISA irqs (scanned) = 3,4,5,9,10 status change on irq 10
  pcmcia_socket pcmcia_socket1: pccard: PCMCIA card inserted into slot 1
  pcmcia_socket pcmcia_socket0: cs: IO port probe 0xc00-0xcff: excluding 0xcf8-0xcff
  pcmcia_socket pcmcia_socket0: cs: IO port probe 0xa00-0xaff: clean.
  pcmcia_socket pcmcia_socket0: cs: IO port probe 0x100-0x3ff: excluding 0x170-0x177 0x1f0-0x1f7 0x2f8-0x2ff 0x370-0x37f 0x3c0-0x3e7 0x3f0-0x3ff
  pcmcia_socket pcmcia_socket0: cs: memory probe 0x0a0000-0x0affff: excluding 0xa0000-0xaffff
  pcmcia_socket pcmcia_socket0: cs: memory probe 0x0b0000-0x0bffff: excluding 0xb0000-0xbffff
  pcmcia_socket pcmcia_socket0: cs: memory probe 0x0c0000-0x0cffff: excluding 0xc0000-0xcbfff
  pcmcia_socket pcmcia_socket0: cs: memory probe 0x0d0000-0x0dffff: clean.
  pcmcia_socket pcmcia_socket0: cs: memory probe 0x0e0000-0x0effff: clean.
  pcmcia_socket pcmcia_socket0: cs: memory probe 0x60000000-0x60ffffff: clean.
  pcmcia_socket pcmcia_socket0: cs: memory probe 0xa0000000-0xa0ffffff: clean.
  pcmcia_socket pcmcia_socket1: cs: IO port probe 0xc00-0xcff: excluding 0xcf8-0xcff
  pcmcia_socket pcmcia_socket1: cs: IO port probe 0xa00-0xaff: clean.
  pcmcia_socket pcmcia_socket1: cs: IO port probe 0x100-0x3ff: excluding 0x170-0x177 0x1f0-0x1f7 0x2f8-0x2ff 0x370-0x37f 0x3c0-0x3e7 0x3f0-0x3ff
  pcmcia_socket pcmcia_socket1: cs: memory probe 0x0a0000-0x0affff: excluding 0xa0000-0xaffff
  pcmcia_socket pcmcia_socket1: cs: memory probe 0x0b0000-0x0bffff: excluding 0xb0000-0xbffff
  pcmcia_socket pcmcia_socket1: cs: memory probe 0x0c0000-0x0cffff: excluding 0xc0000-0xcbfff
  pcmcia_socket pcmcia_socket1: cs: memory probe 0x0d0000-0x0dffff: clean.
  pcmcia_socket pcmcia_socket1: cs: memory probe 0x0e0000-0x0effff: clean.
  pcmcia_socket pcmcia_socket1: cs: memory probe 0x60000000-0x60ffffff: clean.
  pcmcia_socket pcmcia_socket1: cs: memory probe 0xa0000000-0xa0ffffff: clean.
  pcmcia_socket pcmcia_socket1: cs: memory probe 0x0cc000-0x0effff: excluding 0xe0000-0xeffff
  pcmcia_socket pcmcia_socket1: cs: unable to map card memory!

If filtering out the first 1MB is reverted, everything works as expected.

Tested-by: Robert Resch <fli4l@robert.reschpara.de>
Signed-off-by: Christoph Schulz <develop@kristov.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agomtd/ftl: fix the double free of the buffers allocated in build_maps()
Kevin Hao [Thu, 3 Jul 2014 02:35:26 +0000 (10:35 +0800)]
mtd/ftl: fix the double free of the buffers allocated in build_maps()

commit a152056c912db82860a8b4c23d0bd3a5aa89e363 upstream.

I got the following panic on my fsl p5020ds board.

  Unable to handle kernel paging request for data at address 0x7375627379737465
  Faulting instruction address: 0xc000000000100778
  Oops: Kernel access of bad area, sig: 11 [#1]
  SMP NR_CPUS=24 CoreNet Generic
  Modules linked in:
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.15.0-next-20140613 #145
  task: c0000000fe080000 ti: c0000000fe088000 task.ti: c0000000fe088000
  NIP: c000000000100778 LR: c00000000010073c CTR: 0000000000000000
  REGS: c0000000fe08aa00 TRAP: 0300   Not tainted  (3.15.0-next-20140613)
  MSR: 0000000080029000 <CE,EE,ME>  CR: 24ad2e24  XER: 00000000
  DEAR: 7375627379737465 ESR: 0000000000000000 SOFTE: 1
  GPR00: c0000000000c99b0 c0000000fe08ac80 c0000000009598e0 c0000000fe001d80
  GPR04: 00000000000000d0 0000000000000913 c000000007902b20 0000000000000000
  GPR08: c0000000feaae888 0000000000000000 0000000007091000 0000000000200200
  GPR12: 0000000028ad2e28 c00000000fff4000 c0000000007abe08 0000000000000000
  GPR16: c0000000007ab160 c0000000007aaf98 c00000000060ba68 c0000000007abda8
  GPR20: c0000000007abde8 c0000000feaea6f8 c0000000feaea708 c0000000007abd10
  GPR24: c000000000989370 c0000000008c6228 00000000000041ed c0000000fe00a400
  GPR28: c00000000017c1cc 00000000000000d0 7375627379737465 c0000000fe001d80
  NIP [c000000000100778] .__kmalloc_track_caller+0x70/0x168
  LR [c00000000010073c] .__kmalloc_track_caller+0x34/0x168
  Call Trace:
  [c0000000fe08ac80] [c00000000087e6b8] uevent_sock_list+0x0/0x10 (unreliable)
  [c0000000fe08ad20] [c0000000000c99b0] .kstrdup+0x44/0x90
  [c0000000fe08adc0] [c00000000017c1cc] .__kernfs_new_node+0x4c/0x130
  [c0000000fe08ae70] [c00000000017d7e4] .kernfs_new_node+0x2c/0x64
  [c0000000fe08aef0] [c00000000017db00] .kernfs_create_dir_ns+0x34/0xc8
  [c0000000fe08af80] [c00000000018067c] .sysfs_create_dir_ns+0x58/0xcc
  [c0000000fe08b010] [c0000000002c711c] .kobject_add_internal+0xc8/0x384
  [c0000000fe08b0b0] [c0000000002c7644] .kobject_add+0x64/0xc8
  [c0000000fe08b140] [c000000000355ebc] .device_add+0x11c/0x654
  [c0000000fe08b200] [c0000000002b5988] .add_disk+0x20c/0x4b4
  [c0000000fe08b2c0] [c0000000003a21d4] .add_mtd_blktrans_dev+0x340/0x514
  [c0000000fe08b350] [c0000000003a3410] .mtdblock_add_mtd+0x74/0xb4
  [c0000000fe08b3e0] [c0000000003a32cc] .blktrans_notify_add+0x64/0x94
  [c0000000fe08b470] [c00000000039b5b4] .add_mtd_device+0x1d4/0x368
  [c0000000fe08b520] [c00000000039b830] .mtd_device_parse_register+0xe8/0x104
  [c0000000fe08b5c0] [c0000000003b8408] .of_flash_probe+0x72c/0x734
  [c0000000fe08b750] [c00000000035ba40] .platform_drv_probe+0x38/0x84
  [c0000000fe08b7d0] [c0000000003599a4] .really_probe+0xa4/0x29c
  [c0000000fe08b870] [c000000000359d3c] .__driver_attach+0x100/0x104
  [c0000000fe08b900] [c00000000035746c] .bus_for_each_dev+0x84/0xe4
  [c0000000fe08b9a0] [c0000000003593c0] .driver_attach+0x24/0x38
  [c0000000fe08ba10] [c000000000358f24] .bus_add_driver+0x1c8/0x2ac
  [c0000000fe08bab0] [c00000000035a3a4] .driver_register+0x8c/0x158
  [c0000000fe08bb30] [c00000000035b9f4] .__platform_driver_register+0x6c/0x80
  [c0000000fe08bba0] [c00000000084e080] .of_flash_driver_init+0x1c/0x30
  [c0000000fe08bc10] [c000000000001864] .do_one_initcall+0xbc/0x238
  [c0000000fe08bd00] [c00000000082cdc0] .kernel_init_freeable+0x188/0x268
  [c0000000fe08bdb0] [c0000000000020a0] .kernel_init+0x1c/0xf7c
  [c0000000fe08be30] [c000000000000884] .ret_from_kernel_thread+0x58/0xd4
  Instruction dump:
  41bd0010 480000c8 4bf04eb5 60000000 e94d0028 e93f0000 7cc95214 e8a60008
  7fc9502a 2fbe0000 419e00c8 e93f0022 <7f7e482a39200000 88ed06b2 992d06b2
  ---[ end trace b4c9a94804a42d40 ]---

It seems that the corrupted partition header on my mtd device triggers
a bug in the ftl. In function build_maps() it will allocate the buffers
needed by the mtd partition, but if something goes wrong such as kmalloc
failure, mtd read error or invalid partition header parameter, it will
free all allocated buffers and then return non-zero. In my case, it
seems that partition header parameter 'NumTransferUnits' is invalid.

And the ftl_freepart() is a function which free all the partition
buffers allocated by build_maps(). Given the build_maps() is a self
cleaning function, so there is no need to invoke this function even
if build_maps() return with error. Otherwise it will causes the
buffers to be freed twice and then weird things would happen.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agogspca_pac7302: Add new usb-id for Genius i-Look 317
Hans de Goede [Wed, 9 Jul 2014 09:20:44 +0000 (06:20 -0300)]
gspca_pac7302: Add new usb-id for Genius i-Look 317

commit 242841d3d71191348f98310e2d2001e1001d8630 upstream.

Tested-and-reported-by: yullaw <yullaw@mageia.cz>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
[bwh: Backported to 3.2: adjust filename]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agotda10071: force modulation to QPSK on DVB-S
Antti Palosaari [Fri, 4 Jul 2014 08:44:39 +0000 (05:44 -0300)]
tda10071: force modulation to QPSK on DVB-S

commit db4175ae2095634dbecd4c847da439f9c83e1b3b upstream.

Only supported modulation for DVB-S is QPSK. Modulation parameter
contains invalid value for DVB-S on some cases, which leads driver
refusing tuning attempt. Due to that, hard code modulation to QPSK
in case of DVB-S.

Signed-off-by: Antti Palosaari <crope@iki.fi>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
[bwh: Backported to 3.2: adjust filename]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoserial: core: Preserve termios c_cflag for console resume
Peter Hurley [Wed, 9 Jul 2014 13:21:14 +0000 (09:21 -0400)]
serial: core: Preserve termios c_cflag for console resume

commit ae84db9661cafc63d179e1d985a2c5b841ff0ac4 upstream.

When a tty is opened for the serial console, the termios c_cflag
settings are inherited from the console line settings.
However, if the tty is subsequently closed, the termios settings
are lost. This results in a garbled console if the console is later
suspended and resumed.

Preserve the termios c_cflag for the serial console when the tty
is shutdown; this reflects the most recent line settings.

Fixes: Bugzilla #69751, 'serial console does not wake from S3'
Reported-by: Valerio Vanni <valerio.vanni@inwind.it>
Acked-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.2: tty_struct::termios is a pointer]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agodebugfs: Fix corrupted loop in debugfs_remove_recursive
Steven Rostedt [Mon, 9 Jun 2014 18:06:07 +0000 (14:06 -0400)]
debugfs: Fix corrupted loop in debugfs_remove_recursive

commit 485d44022a152c0254dd63445fdb81c4194cbf0e upstream.

[ I'm currently running my tests on it now, and so far, after a few
 hours it has yet to blow up. I'll run it for 24 hours which it never
 succeeded in the past. ]

The tracing code has a way to make directories within the debugfs file
system as well as deleting them using mkdir/rmdir in the instance
directory. This is very limited in functionality, such as there is
no renames, and the parent directory "instance" can not be modified.
The tracing code creates the instance directory from the debugfs code
and then replaces the dentry->d_inode->i_op with its own to allow
for mkdir/rmdir to work.

When these are called, the d_entry and inode locks need to be released
to call the instance creation and deletion code. That code has its own
accounting and locking to serialize everything to prevent multiple
users from causing harm. As the parent "instance" directory can not
be modified this simplifies things.

I created a stress test that creates several threads that randomly
creates and deletes directories thousands of times a second. The code
stood up to this test and I submitted it a while ago.

Recently I added a new test that adds readers to the mix. While the
instance directories were being added and deleted, readers would read
from these directories and even enable tracing within them. This test
was able to trigger a bug:

 general protection fault: 0000 [#1] PREEMPT SMP
 Modules linked in: ...
 CPU: 3 PID: 17789 Comm: rmdir Tainted: G        W     3.15.0-rc2-test+ #41
 Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M., BIOS SDBLI944.86P 05/08/2007
 task: ffff88003786ca60 ti: ffff880077018000 task.ti: ffff880077018000
 RIP: 0010:[<ffffffff811ed5eb>]  [<ffffffff811ed5eb>] debugfs_remove_recursive+0x1bd/0x367
 RSP: 0018:ffff880077019df8  EFLAGS: 00010246
 RAX: 0000000000000002 RBX: ffff88006f0fe490 RCX: 0000000000000000
 RDX: dead000000100058 RSI: 0000000000000246 RDI: ffff88003786d454
 RBP: ffff88006f0fe640 R08: 0000000000000628 R09: 0000000000000000
 R10: 0000000000000628 R11: ffff8800795110a0 R12: ffff88006f0fe640
 R13: ffff88006f0fe640 R14: ffffffff81817d0b R15: ffffffff818188b7
 FS:  00007ff13ae24700(0000) GS:ffff88007d580000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: 0000003054ec7be0 CR3: 0000000076d51000 CR4: 00000000000007e0
 Stack:
  ffff88007a41ebe0 dead000000100058 00000000fffffffe ffff88006f0fe640
  0000000000000000 ffff88006f0fe678 ffff88007a41ebe0 ffff88003793a000
  00000000fffffffe ffffffff810bde82 ffff88006f0fe640 ffff88007a41eb28
 Call Trace:
  [<ffffffff810bde82>] ? instance_rmdir+0x15b/0x1de
  [<ffffffff81132e2d>] ? vfs_rmdir+0x80/0xd3
  [<ffffffff81132f51>] ? do_rmdir+0xd1/0x139
  [<ffffffff8124ad9e>] ? trace_hardirqs_on_thunk+0x3a/0x3c
  [<ffffffff814fea62>] ? system_call_fastpath+0x16/0x1b
 Code: fe ff ff 48 8d 75 30 48 89 df e8 c9 fd ff ff 85 c0 75 13 48 c7 c6 b8 cc d2 81 48 c7 c7 b0 cc d2 81 e8 8c 7a f5 ff 48 8b 54 24 08 <48> 8b 82 a8 00 00 00 48 89 d3 48 2d a8 00 00 00 48 89 44 24 08
 RIP  [<ffffffff811ed5eb>] debugfs_remove_recursive+0x1bd/0x367
  RSP <ffff880077019df8>

It took a while, but every time it triggered, it was always in the
same place:

list_for_each_entry_safe(child, next, &parent->d_subdirs, d_u.d_child) {

Where the child->d_u.d_child seemed to be corrupted.  I added lots of
trace_printk()s to see what was wrong, and sure enough, it was always
the child's d_u.d_child field. I looked around to see what touches
it and noticed that in __dentry_kill() which calls dentry_free():

static void dentry_free(struct dentry *dentry)
{
/* if dentry was never visible to RCU, immediate free is OK */
if (!(dentry->d_flags & DCACHE_RCUACCESS))
__d_free(&dentry->d_u.d_rcu);
else
call_rcu(&dentry->d_u.d_rcu, __d_free);
}

I also noticed that __dentry_kill() unlinks the child->d_u.child
under the parent->d_lock spin_lock.

Looking back at the loop in debugfs_remove_recursive() it never takes the
parent->d_lock to do the list walk. Adding more tracing, I was able to
prove this was the issue:

 ftrace-t-15385   1.... 246662024us : dentry_kill <ffffffff81138b91>: free ffff88006d573600
    rmdir-15409   2.... 246662024us : debugfs_remove_recursive <ffffffff811ec7e5>: child=ffff88006d573600 next=dead000000100058

The dentry_kill freed ffff88006d573600 just as the remove recursive was walking
it.

In order to fix this, the list walk needs to be modified a bit to take
the parent->d_lock. The safe version is no longer necessary, as every
time we remove a child, the parent->d_lock must be released and the
list walk must start over. Each time a child is removed, even though it
may still be on the list, it should be skipped by the first check
in the loop:

if (!debugfs_positive(child))
continue;

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.2: deleted code is slightly different; we don't
 have list_next_entry()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agostable_kernel_rules: Add pointer to netdev-FAQ for network patches
Dave Chiluk [Tue, 24 Jun 2014 15:11:26 +0000 (10:11 -0500)]
stable_kernel_rules: Add pointer to netdev-FAQ for network patches

commit b76fc285337b6b256e9ba20a40cfd043f70c27af upstream.

Stable_kernel_rules should point submitters of network stable patches to the
netdev_FAQ.txt as requests for stable network patches should go to netdev
first.

Signed-off-by: Dave Chiluk <chiluk@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoblock: don't assume last put of shared tags is for the host
Christoph Hellwig [Tue, 8 Jul 2014 10:25:28 +0000 (12:25 +0200)]
block: don't assume last put of shared tags is for the host

commit d45b3279a5a2252cafcd665bbf2db8c9b31ef783 upstream.

There is no inherent reason why the last put of a tag structure must be
the one for the Scsi_Host, as device model objects can be held for
arbitrary periods.  Merge blk_free_tags and __blk_free_tags into a single
funtion that just release a references and get rid of the BUG() when the
host reference wasn't the last.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoASoC: samsung: Correct I2S DAI suspend/resume ops
Sylwester Nawrocki [Fri, 4 Jul 2014 14:05:45 +0000 (16:05 +0200)]
ASoC: samsung: Correct I2S DAI suspend/resume ops

commit d3d4e5247b013008a39e4d5f69ce4c60ed57f997 upstream.

We should save/restore relevant I2S registers regardless of
the dai->active flag, otherwise some settings are being lost
after system suspend/resume cycle. E.g. I2S slave mode set only
during dai initialization is not preserved and the device ends
up in master mode after system resume.

Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoKVM: x86: Inter-privilege level ret emulation is not implemeneted
Nadav Amit [Sun, 15 Jun 2014 13:12:59 +0000 (16:12 +0300)]
KVM: x86: Inter-privilege level ret emulation is not implemeneted

commit 9e8919ae793f4edfaa29694a70f71a515ae9942a upstream.

Return unhandlable error on inter-privilege level ret instruction.  This is
since the current emulation does not check the privilege level correctly when
loading the CS, and does not pop RSP/SS as needed.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoLinux 3.2.62
Ben Hutchings [Wed, 6 Aug 2014 17:07:43 +0000 (18:07 +0100)]
Linux 3.2.62

7 years agoiommu/vt-d: Disable translation if already enabled
Takao Indoh [Tue, 23 Apr 2013 08:35:03 +0000 (17:35 +0900)]
iommu/vt-d: Disable translation if already enabled

commit 3a93c841c2b3b14824f7728dd74bd00a1cedb806 upstream.

This patch disables translation(dma-remapping) before its initialization
if it is already enabled.

This is needed for kexec/kdump boot. If dma-remapping is enabled in the
first kernel, it need to be disabled before initializing its page table
during second kernel boot. Wei Hu also reported that this is needed
when second kernel boots with intel_iommu=off.

Basically iommu->gcmd is used to know whether translation is enabled or
disabled, but it is always zero at boot time even when translation is
enabled since iommu->gcmd is initialized without considering such a
case. Therefor this patch synchronizes iommu->gcmd value with global
command register when iommu structure is allocated.

Signed-off-by: Takao Indoh <indou.takao@jp.fujitsu.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
[wyj: Backported to 3.4: adjust context]
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agox86_32, entry: Store badsys error code in %eax
Sven Wegener [Tue, 22 Jul 2014 08:26:06 +0000 (10:26 +0200)]
x86_32, entry: Store badsys error code in %eax

commit 8142b215501f8b291a108a202b3a053a265b03dd upstream.

Commit 554086d ("x86_32, entry: Do syscall exit work on badsys
(CVE-2014-4508)") introduced a regression in the x86_32 syscall entry
code, resulting in syscall() not returning proper errors for undefined
syscalls on CPUs supporting the sysenter feature.

The following code:

> int result = syscall(666);
> printf("result=%d errno=%d error=%s\n", result, errno, strerror(errno));

results in:

> result=666 errno=0 error=Success

Obviously, the syscall return value is the called syscall number, but it
should have been an ENOSYS error. When run under ptrace it behaves
correctly, which makes it hard to debug in the wild:

> result=-1 errno=38 error=Function not implemented

The %eax register is the return value register. For debugging via ptrace
the syscall entry code stores the complete register context on the
stack. The badsys handlers only store the ENOSYS error code in the
ptrace register set and do not set %eax like a regular syscall handler
would. The old resume_userspace call chain contains code that clobbers
%eax and it restores %eax from the ptrace registers afterwards. The same
goes for the ptrace-enabled call chain. When ptrace is not used, the
syscall return value is the passed-in syscall number from the untouched
%eax register.

Use %eax as the return value register in syscall_badsys and
sysenter_badsys, like a real syscall handler does, and have the caller
push the value onto the stack for ptrace access.

Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
Link: http://lkml.kernel.org/r/alpine.LNX.2.11.1407221022380.31021@titan.int.lan.stealer.net
Reviewed-and-tested-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agolibata: introduce ata_host->n_tags to avoid oops on SAS controllers
Tejun Heo [Wed, 23 Jul 2014 13:05:27 +0000 (09:05 -0400)]
libata: introduce ata_host->n_tags to avoid oops on SAS controllers

commit 1a112d10f03e83fb3a2fdc4c9165865dec8a3ca6 upstream.

1871ee134b73 ("libata: support the ata host which implements a queue
depth less than 32") directly used ata_port->scsi_host->can_queue from
ata_qc_new() to determine the number of tags supported by the host;
unfortunately, SAS controllers doing SATA don't initialize ->scsi_host
leading to the following oops.

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000058
 IP: [<ffffffff814e0618>] ata_qc_new_init+0x188/0x1b0
 PGD 0
 Oops: 0002 [#1] SMP
 Modules linked in: isci libsas scsi_transport_sas mgag200 drm_kms_helper ttm
 CPU: 1 PID: 518 Comm: udevd Not tainted 3.16.0-rc6+ #62
 Hardware name: Intel Corporation S2600CO/S2600CO, BIOS SE5C600.86B.02.02.0002.122320131210 12/23/2013
 task: ffff880c1a00b280 ti: ffff88061a000000 task.ti: ffff88061a000000
 RIP: 0010:[<ffffffff814e0618>]  [<ffffffff814e0618>] ata_qc_new_init+0x188/0x1b0
 RSP: 0018:ffff88061a003ae8  EFLAGS: 00010012
 RAX: 0000000000000001 RBX: ffff88000241ca80 RCX: 00000000000000fa
 RDX: 0000000000000020 RSI: 0000000000000020 RDI: ffff8806194aa298
 RBP: ffff88061a003ae8 R08: ffff8806194a8000 R09: 0000000000000000
 R10: 0000000000000000 R11: ffff88000241ca80 R12: ffff88061ad58200
 R13: ffff8806194aa298 R14: ffffffff814e67a0 R15: ffff8806194a8000
 FS:  00007f3ad7fe3840(0000) GS:ffff880627620000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000058 CR3: 000000061a118000 CR4: 00000000001407e0
 Stack:
  ffff88061a003b20 ffffffff814e96e1 ffff88000241ca80 ffff88061ad58200
  ffff8800b6bf6000 ffff880c1c988000 ffff880619903850 ffff88061a003b68
  ffffffffa0056ce1 ffff88061a003b48 0000000013d6e6f8 ffff88000241ca80
 Call Trace:
  [<ffffffff814e96e1>] ata_sas_queuecmd+0xa1/0x430
  [<ffffffffa0056ce1>] sas_queuecommand+0x191/0x220 [libsas]
  [<ffffffff8149afee>] scsi_dispatch_cmd+0x10e/0x300
  [<ffffffff814a3bc5>] scsi_request_fn+0x2f5/0x550
  [<ffffffff81317613>] __blk_run_queue+0x33/0x40
  [<ffffffff8131781a>] queue_unplugged+0x2a/0x90
  [<ffffffff8131ceb4>] blk_flush_plug_list+0x1b4/0x210
  [<ffffffff8131d274>] blk_finish_plug+0x14/0x50
  [<ffffffff8117eaa8>] __do_page_cache_readahead+0x198/0x1f0
  [<ffffffff8117ee21>] force_page_cache_readahead+0x31/0x50
  [<ffffffff8117ee7e>] page_cache_sync_readahead+0x3e/0x50
  [<ffffffff81172ac6>] generic_file_read_iter+0x496/0x5a0
  [<ffffffff81219897>] blkdev_read_iter+0x37/0x40
  [<ffffffff811e307e>] new_sync_read+0x7e/0xb0
  [<ffffffff811e3734>] vfs_read+0x94/0x170
  [<ffffffff811e43c6>] SyS_read+0x46/0xb0
  [<ffffffff811e33d1>] ? SyS_lseek+0x91/0xb0
  [<ffffffff8171ee29>] system_call_fastpath+0x16/0x1b
 Code: 00 00 00 88 50 29 83 7f 08 01 19 d2 83 e2 f0 83 ea 50 88 50 34 c6 81 1d 02 00 00 40 c6 81 17 02 00 00 00 5d c3 66 0f 1f 44 00 00 <89> 14 25 58 00 00 00

Fix it by introducing ata_host->n_tags which is initialized to
ATA_MAX_QUEUE - 1 in ata_host_init() for SAS controllers and set to
scsi_host_template->can_queue in ata_host_register() for !SAS ones.
As SAS hosts are never registered, this will give them the same
ATA_MAX_QUEUE - 1 as before.  Note that we can't use
scsi_host->can_queue directly for SAS hosts anyway as they can go
higher than the libata maximum.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Mike Qiu <qiudayu@linux.vnet.ibm.com>
Reported-by: Jesse Brandeburg <jesse.brandeburg@gmail.com>
Reported-by: Peter Hurley <peter@hurleysoftware.com>
Reported-by: Peter Zijlstra <peterz@infradead.org>
Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Fixes: 1871ee134b73 ("libata: support the ata host which implements a queue depth less than 32")
Cc: Kevin Hao <haokexin@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agolibata: support the ata host which implements a queue depth less than 32
Kevin Hao [Sat, 12 Jul 2014 04:08:24 +0000 (12:08 +0800)]
libata: support the ata host which implements a queue depth less than 32

commit 1871ee134b73fb4cadab75752a7152ed2813c751 upstream.

The sata on fsl mpc8315e is broken after the commit 8a4aeec8d2d6
("libata/ahci: accommodate tag ordered controllers"). The reason is
that the ata controller on this SoC only implement a queue depth of
16. When issuing the commands in tag order, all the commands in tag
16 ~ 31 are mapped to tag 0 unconditionally and then causes the sata
malfunction. It makes no senses to use a 32 queue in software while
the hardware has less queue depth. So consider the queue depth
implemented by the hardware when requesting a command tag.

Fixes: 8a4aeec8d2d6 ("libata/ahci: accommodate tag ordered controllers")
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agomm: kmemleak: avoid false negatives on vmalloc'ed objects
Catalin Marinas [Tue, 12 Nov 2013 23:07:45 +0000 (15:07 -0800)]
mm: kmemleak: avoid false negatives on vmalloc'ed objects

commit 7f88f88f83ed609650a01b18572e605ea50cd163 upstream.

Commit 248ac0e1943a ("mm/vmalloc: remove guard page from between vmap
blocks") had the side effect of making vmap_area.va_end member point to
the next vmap_area.va_start.  This was creating an artificial reference
to vmalloc'ed objects and kmemleak was rarely reporting vmalloc() leaks.

This patch marks the vmap_area containing pointers explicitly and
reduces the min ref_count to 2 as vm_struct still contains a reference
to the vmalloc'ed object.  The kmemleak add_scan_area() function has
been improved to allow a SIZE_MAX argument covering the rest of the
object (for simpler calling sites).

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agointroduce SIZE_MAX
Xi Wang [Thu, 31 May 2012 23:26:04 +0000 (16:26 -0700)]
introduce SIZE_MAX

commit a3860c1c5dd1137db23d7786d284939c5761d517 upstream.

ULONG_MAX is often used to check for integer overflow when calculating
allocation size.  While ULONG_MAX happens to work on most systems, there
is no guarantee that `size_t' must be the same size as `long'.

This patch introduces SIZE_MAX, the maximum value of `size_t', to improve
portability and readability for allocation size validation.

Signed-off-by: Xi Wang <xi.wang@gmail.com>
Acked-by: Alex Elder <elder@dreamhost.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoceph: fix overflow check in build_snap_context()
Xi Wang [Thu, 16 Feb 2012 16:56:29 +0000 (11:56 -0500)]
ceph: fix overflow check in build_snap_context()

commit 80834312a4da1405a9bc788313c67643de6fcb4c upstream.

The overflow check for a + n * b should be (n > (ULONG_MAX - a) / b),
rather than (n > ULONG_MAX / b - a).

Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoARM: 7670/1: fix the memset fix
Nicolas Pitre [Tue, 12 Mar 2013 12:00:42 +0000 (13:00 +0100)]
ARM: 7670/1: fix the memset fix

commit 418df63adac56841ef6b0f1fcf435bc64d4ed177 upstream.

Commit 455bd4c430b0 ("ARM: 7668/1: fix memset-related crashes caused by
recent GCC (4.7.2) optimizations") attempted to fix a compliance issue
with the memset return value.  However the memset itself became broken
by that patch for misaligned pointers.

This fixes the above by branching over the entry code from the
misaligned fixup code to avoid reloading the original pointer.

Also, because the function entry alignment is wrong in the Thumb mode
compilation, that fixup code is moved to the end.

While at it, the entry instructions are slightly reworked to help dual
issue pipelines.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Alexander Holler <holler@ahsoftware.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoARM: 7668/1: fix memset-related crashes caused by recent GCC (4.7.2) optimizations
Ivan Djelic [Wed, 6 Mar 2013 19:09:27 +0000 (20:09 +0100)]
ARM: 7668/1: fix memset-related crashes caused by recent GCC (4.7.2) optimizations

commit 455bd4c430b0c0a361f38e8658a0d6cb469942b5 upstream.

Recent GCC versions (e.g. GCC-4.7.2) perform optimizations based on
assumptions about the implementation of memset and similar functions.
The current ARM optimized memset code does not return the value of
its first argument, as is usually expected from standard implementations.

For instance in the following function:

void debug_mutex_lock_common(struct mutex *lock, struct mutex_waiter *waiter)
{
memset(waiter, MUTEX_DEBUG_INIT, sizeof(*waiter));
waiter->magic = waiter;
INIT_LIST_HEAD(&waiter->list);
}

compiled as:

800554d0 <debug_mutex_lock_common>:
800554d0:       e92d4008        push    {r3, lr}
800554d4:       e1a00001        mov     r0, r1
800554d8:       e3a02010        mov     r2, #16 ; 0x10
800554dc:       e3a01011        mov     r1, #17 ; 0x11
800554e0:       eb04426e        bl      80165ea0 <memset>
800554e4:       e1a03000        mov     r3, r0
800554e8:       e583000c        str     r0, [r3, #12]
800554ec:       e5830000        str     r0, [r3]
800554f0:       e5830004        str     r0, [r3, #4]
800554f4:       e8bd8008        pop     {r3, pc}

GCC assumes memset returns the value of pointer 'waiter' in register r0; causing
register/memory corruptions.

This patch fixes the return value of the assembly version of memset.
It adds a 'mov' instruction and merges an additional load+store into
existing load/store instructions.
For ease of review, here is a breakdown of the patch into 4 simple steps:

Step 1
======
Perform the following substitutions:
ip -> r8, then
r0 -> ip,
and insert 'mov ip, r0' as the first statement of the function.
At this point, we have a memset() implementation returning the proper result,
but corrupting r8 on some paths (the ones that were using ip).

Step 2
======
Make sure r8 is saved and restored when (! CALGN(1)+0) == 1:

save r8:
-       str     lr, [sp, #-4]!
+       stmfd   sp!, {r8, lr}

and restore r8 on both exit paths:
-       ldmeqfd sp!, {pc}               @ Now <64 bytes to go.
+       ldmeqfd sp!, {r8, pc}           @ Now <64 bytes to go.
(...)
        tst     r2, #16
        stmneia ip!, {r1, r3, r8, lr}
-       ldr     lr, [sp], #4
+       ldmfd   sp!, {r8, lr}

Step 3
======
Make sure r8 is saved and restored when (! CALGN(1)+0) == 0:

save r8:
-       stmfd   sp!, {r4-r7, lr}
+       stmfd   sp!, {r4-r8, lr}

and restore r8 on both exit paths:
        bgt     3b
-       ldmeqfd sp!, {r4-r7, pc}
+       ldmeqfd sp!, {r4-r8, pc}
(...)
        tst     r2, #16
        stmneia ip!, {r4-r7}
-       ldmfd   sp!, {r4-r7, lr}
+       ldmfd   sp!, {r4-r8, lr}

Step 4
======
Rewrite register list "r4-r7, r8" as "r4-r8".

Signed-off-by: Ivan Djelic <ivan.djelic@parrot.com>
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Dirk Behme <dirk.behme@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agomm: hugetlb: fix copy_hugetlb_page_range()
Naoya Horiguchi [Wed, 23 Jul 2014 21:00:19 +0000 (14:00 -0700)]
mm: hugetlb: fix copy_hugetlb_page_range()

commit 0253d634e0803a8376a0d88efee0bf523d8673f9 upstream.

Commit 4a705fef9862 ("hugetlb: fix copy_hugetlb_page_range() to handle
migration/hwpoisoned entry") changed the order of
huge_ptep_set_wrprotect() and huge_ptep_get(), which leads to breakage
in some workloads like hugepage-backed heap allocation via libhugetlbfs.
This patch fixes it.

The test program for the problem is shown below:

  $ cat heap.c
  #include <unistd.h>
  #include <stdlib.h>
  #include <string.h>

  #define HPS 0x200000

  int main() {
   int i;
   char *p = malloc(HPS);
   memset(p, '1', HPS);
   for (i = 0; i < 5; i++) {
   if (!fork()) {
   memset(p, '2', HPS);
   p = malloc(HPS);
   memset(p, '3', HPS);
   free(p);
   return 0;
   }
   }
   sleep(1);
   free(p);
   return 0;
  }

  $ export HUGETLB_MORECORE=yes ; export HUGETLB_NO_PREFAULT= ; hugectl --heap ./heap

Fixes 4a705fef9862 ("hugetlb: fix copy_hugetlb_page_range() to handle
migration/hwpoisoned entry"), so is applicable to -stable kernels which
include it.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reported-by: Guillaume Morin <guillaume@morinfr.org>
Suggested-by: Guillaume Morin <guillaume@morinfr.org>
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agocrypto: testmgr - update LZO compression test vectors
Markus F.X.J. Oberhumer [Sun, 14 Oct 2012 13:39:04 +0000 (15:39 +0200)]
crypto: testmgr - update LZO compression test vectors

commit 0ec7382036922be063b515b2a3f1d6f7a607392c upstream.

Update the LZO compression test vectors according to the latest compressor
version.

Signed-off-by: Markus F.X.J. Oberhumer <markus@oberhumer.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoipvs: stop tot_stats estimator only under CONFIG_SYSCTL
Julian Anastasov [Wed, 16 Jul 2014 08:26:47 +0000 (10:26 +0200)]
ipvs: stop tot_stats estimator only under CONFIG_SYSCTL

[ Upstream commit 9802d21e7a0b0d2167ef745edc1f4ea7a0fc6ea3 ]

The tot_stats estimator is started only when CONFIG_SYSCTL
is defined. But it is stopped without checking CONFIG_SYSCTL.
Fix the crash by moving ip_vs_stop_estimator into
ip_vs_control_net_cleanup_sysctl.

The change is needed after commit 14e405461e664b
("IPVS: Add __ip_vs_control_{init,cleanup}_sysctl()") from 2.6.39.

Reported-by: Jet Chen <jet.chen@intel.com>
Tested-by: Jet Chen <jet.chen@intel.com>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agox86, ioremap: Speed up check for RAM pages
Roland Dreier [Fri, 2 May 2014 18:18:41 +0000 (11:18 -0700)]
x86, ioremap: Speed up check for RAM pages

commit c81c8a1eeede61e92a15103748c23d100880cc8a upstream.

In __ioremap_caller() (the guts of ioremap), we loop over the range of
pfns being remapped and checks each one individually with page_is_ram().
For large ioremaps, this can be very slow.  For example, we have a
device with a 256 GiB PCI BAR, and ioremapping this BAR can take 20+
seconds -- sometimes long enough to trigger the soft lockup detector!

Internally, page_is_ram() calls walk_system_ram_range() on a single
page.  Instead, we can make a single call to walk_system_ram_range()
from __ioremap_caller(), and do our further checks only for any RAM
pages that we find.  For the common case of MMIO, this saves an enormous
amount of work, since the range being ioremapped doesn't intersect
system RAM at all.

With this change, ioremap on our 256 GiB BAR takes less than 1 second.

Signed-off-by: Roland Dreier <roland@purestorage.com>
Link: http://lkml.kernel.org/r/1399054721-1331-1-git-send-email-roland@kernel.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agosym53c8xx_2: Set DID_REQUEUE return code when aborting squeue
Mikulas Patocka [Wed, 9 Apr 2014 01:52:05 +0000 (21:52 -0400)]
sym53c8xx_2: Set DID_REQUEUE return code when aborting squeue

commit fd1232b214af43a973443aec6a2808f16ee5bf70 upstream.

This patch fixes I/O errors with the sym53c8xx_2 driver when the disk
returns QUEUE FULL status.

When the controller encounters an error (including QUEUE FULL or BUSY
status), it aborts all not yet submitted requests in the function
sym_dequeue_from_squeue.

This function aborts them with DID_SOFT_ERROR.

If the disk has full tag queue, the request that caused the overflow is
aborted with QUEUE FULL status (and the scsi midlayer properly retries
it until it is accepted by the disk), but the sym53c8xx_2 driver aborts
the following requests with DID_SOFT_ERROR --- for them, the midlayer
does just a few retries and then signals the error up to sd.

The result is that disk returning QUEUE FULL causes request failures.

The error was reproduced on 53c895 with COMPAQ BD03685A24 disk
(rebranded ST336607LC) with command queue 48 or 64 tags.  The disk has
64 tags, but under some access patterns it return QUEUE FULL when there
are less than 64 pending tags.  The SCSI specification allows returning
QUEUE FULL anytime and it is up to the host to retry.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agoapplicom: dereferencing NULL on error path
Dan Carpenter [Fri, 9 May 2014 11:59:16 +0000 (14:59 +0300)]
applicom: dereferencing NULL on error path

commit 8bab797c6e5724a43b7666ad70860712365cdb71 upstream.

This is a static checker fix.  The "dev" variable is always NULL after
the while statement so we would be dereferencing a NULL pointer here.

Fixes: 819a3eba4233 ('[PATCH] applicom: fix error handling')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agox86-32, espfix: Remove filter for espfix32 due to race
H. Peter Anvin [Wed, 30 Apr 2014 21:03:25 +0000 (14:03 -0700)]
x86-32, espfix: Remove filter for espfix32 due to race

commit 246f2d2ee1d715e1077fc47d61c394569c8ee692 upstream.

It is not safe to use LAR to filter when to go down the espfix path,
because the LDT is per-process (rather than per-thread) and another
thread might change the descriptors behind our back.  Fortunately it
is always *safe* (if a bit slow) to go down the espfix path, and a
32-bit LDT stack segment is extremely rare.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1398816946-3351-1-git-send-email-hpa@linux.intel.com
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoscore: normalize global variables exported by vmlinux.lds
Jiang Liu [Wed, 3 Jul 2013 22:03:37 +0000 (15:03 -0700)]
score: normalize global variables exported by vmlinux.lds

commit ae49b83dcacfb69e22092cab688c415c2f2d870c upstream.

Generate mandatory global variables _sdata in file vmlinux.lds.

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Chen Liqin <liqin.chen@sunplusct.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoalpha: add io{read,write}{16,32}be functions
Michael Cree [Wed, 30 Nov 2011 13:01:40 +0000 (08:01 -0500)]
alpha: add io{read,write}{16,32}be functions

commit 25534eb7707821b796fd84f7115367e02f36aa60 upstream.

These functions are used in some PCI drivers with big-endian
MMIO space.

Admittedly it is almost certain that no one this side of the
Moon would use such a card in an Alpha but it does get us
closer to being able to build allyesconfig or allmodconfig,
and it enables the Debian default generic config to build.

Tested-by: Raúl Porcel <armin76@gentoo.org>
Signed-off-by: Michael Cree <mcree@orcon.net.nz>
Signed-off-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoscore: Add missing #include <linux/export.h>
Ben Hutchings [Sun, 3 Aug 2014 16:45:10 +0000 (17:45 +0100)]
score: Add missing #include <linux/export.h>

There is no upstream commit for this, as arch/score/kernel/init_task.c
has been replaced by generic code and <linux/export.h> is included
indirectly by arch/score/mm/init.c.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoScore: The commit is for compiling successfully. The modifications include: 1. Kconfi...
Lennox Wu [Sat, 14 Sep 2013 05:48:37 +0000 (13:48 +0800)]
Score: The commit is for compiling successfully. The modifications include: 1. Kconfig of Score: we don't support ioremap 2. Missed headfile including 3. There are some errors in other people's commit not checked by us, we fix it now 3.1 arch/score/kernel/entry.S: wrong instructions 3.2 arch/score/kernel/process.c : just some typos

commit 5fbbf8a1a93452b26e7791cf32cefce62b0a480b upstream.

Signed-off-by: Lennox Wu <lennox.wu@gmail.com>
[bwh: Backported to 3.2:
 - Drop addition of 'select HAVE_GENERIC_HARDIRQS' which was not removed here
 - Drop inapplicale change to copy_thread()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agounicore32: select generic atomic64_t support
Fengguang Wu [Fri, 5 Oct 2012 00:11:23 +0000 (17:11 -0700)]
unicore32: select generic atomic64_t support

commit 82e54a6aaf8aec971fb16afa3a4404e238a1b98b upstream.

It's required for the core fs/namespace.c and many other basic features.

Signed-off-by: Guan Xuetao <gxt@mprc.pku.edu.cn>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agounicore32: add ioremap_nocache definition
Guan Xuetao [Thu, 18 Aug 2011 07:38:05 +0000 (15:38 +0800)]
unicore32: add ioremap_nocache definition

commit a50e4213e71adc7dde0d514aabd8af7275fee39f upstream.

Bugfix for following error messages:
lib/iomap.c: In function 'pci_iomap':
lib/iomap.c:274: error: implicit declaration of function 'ioremap_nocache'
lib/iomap.c:274: warning: return makes pointer from integer without a cast

Also see commit <f1ecc69838a2d7c8a3e1909f637d4083c071777d>
  it will hide the ioremap_nocache function for systems with an MMU

Signed-off-by: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Jonas Bonn <jonas@southpole.se>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoshmem: fix splicing from a hole while it's punched
Hugh Dickins [Wed, 23 Jul 2014 21:00:13 +0000 (14:00 -0700)]
shmem: fix splicing from a hole while it's punched

commit b1a366500bd537b50c3aad26dc7df083ec03a448 upstream.

shmem_fault() is the actual culprit in trinity's hole-punch starvation,
and the most significant cause of such problems: since a page faulted is
one that then appears page_mapped(), needing unmap_mapping_range() and
i_mmap_mutex to be unmapped again.

But it is not the only way in which a page can be brought into a hole in
the radix_tree while that hole is being punched; and Vlastimil's testing
implies that if enough other processors are busy filling in the hole,
then shmem_undo_range() can be kept from completing indefinitely.

shmem_file_splice_read() is the main other user of SGP_CACHE, which can
instantiate shmem pagecache pages in the read-only case (without holding
i_mutex, so perhaps concurrently with a hole-punch).  Probably it's
silly not to use SGP_READ already (using the ZERO_PAGE for holes): which
ought to be safe, but might bring surprises - not a change to be rushed.

shmem_read_mapping_page_gfp() is an internal interface used by
drivers/gpu/drm GEM (and next by uprobes): it should be okay.  And
shmem_file_read_iter() uses the SGP_DIRTY variant of SGP_CACHE, when
called internally by the kernel (perhaps for a stacking filesystem,
which might rely on holes to be reserved): it's unclear whether it could
be provoked to keep hole-punch busy or not.

We could apply the same umbrella as now used in shmem_fault() to
shmem_file_splice_read() and the others; but it looks ugly, and use over
a range raises questions - should it actually be per page? can these get
starved themselves?

The origin of this part of the problem is my v3.1 commit d0823576bf4b
("mm: pincer in truncate_inode_pages_range"), once it was duplicated
into shmem.c.  It seemed like a nice idea at the time, to ensure
(barring RCU lookup fuzziness) that there's an instant when the entire
hole is empty; but the indefinitely repeated scans to ensure that make
it vulnerable.

Revert that "enhancement" to hole-punch from shmem_undo_range(), but
retain the unproblematic rescanning when it's truncating; add a couple
of comments there.

Remove the "indices[0] >= end" test: that is now handled satisfactorily
by the inner loop, and mem_cgroup_uncharge_start()/end() are too light
to be worth avoiding here.

But if we do not always loop indefinitely, we do need to handle the case
of swap swizzled back to page before shmem_free_swap() gets it: add a
retry for that case, as suggested by Konstantin Khlebnikov; and for the
case of page swizzled back to swap, as suggested by Johannes Weiner.

Signed-off-by: Hugh Dickins <hughd@google.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Lukas Czerner <lczerner@redhat.com>
Cc: Dave Jones <davej@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>