pandora-kernel.git
5 years agoPCI: Supply CPU physical address (not bus address) to iomem_is_exclusive()
Bjorn Helgaas [Fri, 8 Apr 2016 00:15:14 +0000 (17:15 -0700)]
PCI: Supply CPU physical address (not bus address) to iomem_is_exclusive()

commit ca620723d4ff9ea7ed484eab46264c3af871b9ae upstream.

iomem_is_exclusive() requires a CPU physical address, but on some arches we
supplied a PCI bus address instead.

On most arches, pci_resource_to_user(res) returns "res->start", which is a
CPU physical address.  But on microblaze, mips, powerpc, and sparc, it
returns the PCI bus address corresponding to "res->start".

The result is that pci_mmap_resource() may fail when it shouldn't (if the
bus address happens to match an existing resource), or it may succeed when
it should fail (if the resource is exclusive but the bus address doesn't
match it).

Call iomem_is_exclusive() with "res->start", which is always a CPU physical
address, not the result of pci_resource_to_user().

Fixes: e8de1481fd71 ("resource: allow MMIO exclusivity for device drivers")
Suggested-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agocrypto: s5p-sss - Fix missed interrupts when working with 8 kB blocks
Krzysztof Kozlowski [Fri, 22 Apr 2016 12:15:23 +0000 (14:15 +0200)]
crypto: s5p-sss - Fix missed interrupts when working with 8 kB blocks

commit 79152e8d085fd64484afd473ef6830b45518acba upstream.

The tcrypt testing module on Exynos5422-based Odroid XU3/4 board failed on
testing 8 kB size blocks:

$ sudo modprobe tcrypt sec=1 mode=500
testing speed of async ecb(aes) (ecb-aes-s5p) encryption
test 0 (128 bit key, 16 byte blocks): 21971 operations in 1 seconds (351536 bytes)
test 1 (128 bit key, 64 byte blocks): 21731 operations in 1 seconds (1390784 bytes)
test 2 (128 bit key, 256 byte blocks): 21932 operations in 1 seconds (5614592 bytes)
test 3 (128 bit key, 1024 byte blocks): 21685 operations in 1 seconds (22205440 bytes)
test 4 (128 bit key, 8192 byte blocks):

This was caused by a race issue of missed BRDMA_DONE ("Block cipher
Receiving DMA") interrupt. Device starts processing the data in DMA mode
immediately after setting length of DMA block: receiving (FCBRDMAL) or
transmitting (FCBTDMAL). The driver sets these lengths from interrupt
handler through s5p_set_dma_indata() function (or xxx_setdata()).

However the interrupt handler was first dealing with receive buffer
(dma-unmap old, dma-map new, set receive block length which starts the
operation), then with transmit buffer and finally was clearing pending
interrupts (FCINTPEND). Because of the time window between setting
receive buffer length and clearing pending interrupts, the operation on
receive buffer could end already and driver would miss new interrupt.

User manual for Exynos5422 confirms in example code that setting DMA
block lengths should be the last operation.

The tcrypt hang could be also observed in following blocked-task dmesg:

INFO: task modprobe:258 blocked for more than 120 seconds.
      Not tainted 4.6.0-rc4-next-20160419-00005-g9eac8b7b7753-dirty #42
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
modprobe        D c06b09d8     0   258    256 0x00000000
[<c06b09d8>] (__schedule) from [<c06b0f24>] (schedule+0x40/0xac)
[<c06b0f24>] (schedule) from [<c06b49f8>] (schedule_timeout+0x124/0x178)
[<c06b49f8>] (schedule_timeout) from [<c06b17fc>] (wait_for_common+0xb8/0x144)
[<c06b17fc>] (wait_for_common) from [<bf0013b8>] (test_acipher_speed+0x49c/0x740 [tcrypt])
[<bf0013b8>] (test_acipher_speed [tcrypt]) from [<bf003e8c>] (do_test+0x2240/0x30ec [tcrypt])
[<bf003e8c>] (do_test [tcrypt]) from [<bf008048>] (tcrypt_mod_init+0x48/0xa4 [tcrypt])
[<bf008048>] (tcrypt_mod_init [tcrypt]) from [<c010177c>] (do_one_initcall+0x3c/0x16c)
[<c010177c>] (do_one_initcall) from [<c0191ff0>] (do_init_module+0x5c/0x1ac)
[<c0191ff0>] (do_init_module) from [<c0185610>] (load_module+0x1a30/0x1d08)
[<c0185610>] (load_module) from [<c0185ab0>] (SyS_finit_module+0x8c/0x98)
[<c0185ab0>] (SyS_finit_module) from [<c01078c0>] (ret_fast_syscall+0x0/0x3c)

Fixes: a49e490c7a8a ("crypto: s5p-sss - add S5PV210 advanced crypto engine support")
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoath5k: Change led pin configuration for compaq c700 laptop
Joseph Salisbury [Mon, 14 Mar 2016 18:51:48 +0000 (14:51 -0400)]
ath5k: Change led pin configuration for compaq c700 laptop

commit 7b9bc799a445aea95f64f15e0083cb19b5789abe upstream.

BugLink: http://bugs.launchpad.net/bugs/972604
Commit 09c9bae26b0d3c9472cb6ae45010460a2cee8b8d ("ath5k: add led pin
configuration for compaq c700 laptop") added a pin configuration for the Compaq
c700 laptop.  However, the polarity of the led pin is reversed.  It should be
red for wifi off and blue for wifi on, but it is the opposite.  This bug was
reported in the following bug report:
http://pad.lv/972604

Fixes: 09c9bae26b0d3c9472cb6ae45010460a2cee8b8d ("ath5k: add led pin configuration for compaq c700 laptop")
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoregmap: cache: Fix typo in cache_bypass parameter description
Andrew F. Davis [Wed, 23 Mar 2016 14:26:33 +0000 (09:26 -0500)]
regmap: cache: Fix typo in cache_bypass parameter description

commit 267c85860308d36bc163c5573308cd024f659d7c upstream.

Setting the flag 'cache_bypass' will bypass the cache not the hardware.
Fix this comment here.

Fixes: 0eef6b0415f5 ("regmap: Fix doc comment")
Signed-off-by: Andrew F. Davis <afd@ti.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoLinux 3.2.81 v3.2.81
Ben Hutchings [Wed, 15 Jun 2016 20:28:15 +0000 (21:28 +0100)]
Linux 3.2.81

5 years agonet: fix a kernel infoleak in x25 module
Kangjie Lu [Sun, 8 May 2016 16:10:14 +0000 (12:10 -0400)]
net: fix a kernel infoleak in x25 module

commit 79e48650320e6fba48369fccf13fd045315b19b8 upstream.

Stack object "dte_facilities" is allocated in x25_rx_call_request(),
which is supposed to be initialized in x25_negotiate_facilities.
However, 5 fields (8 bytes in total) are not initialized. This
object is then copied to userland via copy_to_user, thus infoleak
occurs.

Signed-off-by: Kangjie Lu <kjlu@gatech.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agonet: fix infoleak in rtnetlink
Kangjie Lu [Tue, 3 May 2016 20:46:24 +0000 (16:46 -0400)]
net: fix infoleak in rtnetlink

commit 5f8e44741f9f216e33736ea4ec65ca9ac03036e6 upstream.

The stack object “map” has a total size of 32 bytes. Its last 4
bytes are padding generated by compiler. These padding bytes are
not initialized and sent out via “nla_put”.

Signed-off-by: Kangjie Lu <kjlu@gatech.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context, indentation]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agonet: fix infoleak in llc
Kangjie Lu [Tue, 3 May 2016 20:35:05 +0000 (16:35 -0400)]
net: fix infoleak in llc

commit b8670c09f37bdf2847cc44f36511a53afc6161fd upstream.

The stack object “info” has a total size of 12 bytes. Its last byte
is padding which is not initialized and leaked via “put_cmsg”.

Signed-off-by: Kangjie Lu <kjlu@gatech.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agonf_conntrack: avoid kernel pointer value leak in slab name
Linus Torvalds [Sat, 14 May 2016 18:11:44 +0000 (11:11 -0700)]
nf_conntrack: avoid kernel pointer value leak in slab name

commit 31b0b385f69d8d5491a4bca288e25e63f1d945d0 upstream.

The slab name ends up being visible in the directory structure under
/sys, and even if you don't have access rights to the file you can see
the filenames.

Just use a 64-bit counter instead of the pointer to the 'net' structure
to generate a unique name.

This code will go away in 4.7 when the conntrack code moves to a single
kmemcache, but this is the backportable simple solution to avoiding
leaking kernel pointers to user space.

Fixes: 5b3501faa874 ("netfilter: nf_conntrack: per netns nf_conntrack_cachep")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoget_rock_ridge_filename(): handle malformed NM entries
Al Viro [Thu, 5 May 2016 20:25:35 +0000 (16:25 -0400)]
get_rock_ridge_filename(): handle malformed NM entries

commit 99d825822eade8d827a1817357cbf3f889a552d6 upstream.

Payloads of NM entries are not supposed to contain NUL.  When we run
into such, only the part prior to the first NUL goes into the
concatenation (i.e. the directory entry name being encoded by a bunch
of NM entries).  We do stop when the amount collected so far + the
claimed amount in the current NM entry exceed 254.  So far, so good,
but what we return as the total length is the sum of *claimed*
sizes, not the actual amount collected.  And that can grow pretty
large - not unlimited, since you'd need to put CE entries in
between to be able to get more than the maximum that could be
contained in one isofs directory entry / continuation chunk and
we are stop once we'd encountered 32 CEs, but you can get about 8Kb
easily.  And that's what will be passed to readdir callback as the
name length.  8Kb __copy_to_user() from a buffer allocated by
__get_free_page()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoparisc: fix a bug when syscall number of tracee is __NR_Linux_syscalls
Dmitry V. Levin [Wed, 27 Apr 2016 01:56:11 +0000 (04:56 +0300)]
parisc: fix a bug when syscall number of tracee is __NR_Linux_syscalls

commit f0b22d1bb2a37a665a969e95785c75a4f49d1499 upstream.

Do not load one entry beyond the end of the syscall table when the
syscall number of a traced process equals to __NR_Linux_syscalls.
Similar bug with regular processes was fixed by commit 3bb457af4fa8
("[PARISC] Fix bug when syscall nr is __NR_Linux_syscalls").

This bug was found by strace test suite.

Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Helge Deller <deller@gmx.de>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoproc: prevent accessing /proc/<PID>/environ until it's ready
Mathias Krause [Thu, 5 May 2016 23:22:26 +0000 (16:22 -0700)]
proc: prevent accessing /proc/<PID>/environ until it's ready

commit 8148a73c9901a8794a50f950083c00ccf97d43b3 upstream.

If /proc/<PID>/environ gets read before the envp[] array is fully set up
in create_{aout,elf,elf_fdpic,flat}_tables(), we might end up trying to
read more bytes than are actually written, as env_start will already be
set but env_end will still be zero, making the range calculation
underflow, allowing to read beyond the end of what has been written.

Fix this as it is done for /proc/<PID>/cmdline by testing env_end for
zero.  It is, apparently, intentionally set last in create_*_tables().

This bug was found by the PaX size_overflow plugin that detected the
arithmetic underflow of 'this_len = env_end - (env_start + src)' when
env_end is still zero.

The expected consequence is that userland trying to access
/proc/<PID>/environ of a not yet fully set up process may get
inconsistent data as we're in the middle of copying in the environment
variables.

Fixes: https://forums.grsecurity.net/viewtopic.php?f=3&t=4363
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=116461
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Emese Revfy <re.emese@gmail.com>
Cc: Pax Team <pageexec@freemail.hu>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Mateusz Guzik <mguzik@redhat.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agocrypto: hash - Fix page length clamping in hash walk
Herbert Xu [Wed, 4 May 2016 09:52:56 +0000 (17:52 +0800)]
crypto: hash - Fix page length clamping in hash walk

commit 13f4bb78cf6a312bbdec367ba3da044b09bf0e29 upstream.

The crypto hash walk code is broken when supplied with an offset
greater than or equal to PAGE_SIZE.  This patch fixes it by adjusting
walk->pg and walk->offset when this happens.

Reported-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoACPICA: Dispatcher: Update thread ID for recursive method calls
Prarit Bhargava [Wed, 4 May 2016 05:48:56 +0000 (13:48 +0800)]
ACPICA: Dispatcher: Update thread ID for recursive method calls

commit 93d68841a23a5779cef6fb9aa0ef32e7c5bd00da upstream.

ACPICA commit 7a3bd2d962f221809f25ddb826c9e551b916eb25

Set the mutex owner thread ID.
Original patch from: Prarit Bhargava <prarit@redhat.com>

Link: https://bugzilla.kernel.org/show_bug.cgi?id=115121
Link: https://github.com/acpica/acpica/commit/7a3bd2d9
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Tested-by: Andy Lutomirski <luto@kernel.org> # On a Dell XPS 13 9350
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agodrm/radeon: make sure vertical front porch is at least 1
Alex Deucher [Mon, 2 May 2016 22:53:27 +0000 (18:53 -0400)]
drm/radeon: make sure vertical front porch is at least 1

commit 3104b8128d4d646a574ed9d5b17c7d10752cd70b upstream.

hw doesn't like a 0 value.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoMinimal fix-up of bad hashing behavior of hash_64()
Linus Torvalds [Mon, 2 May 2016 19:46:42 +0000 (12:46 -0700)]
Minimal fix-up of bad hashing behavior of hash_64()

commit 689de1d6ca95b3b5bd8ee446863bf81a4883ea25 upstream.

This is a fairly minimal fixup to the horribly bad behavior of hash_64()
with certain input patterns.

In particular, because the multiplicative value used for the 64-bit hash
was intentionally bit-sparse (so that the multiply could be done with
shifts and adds on architectures without hardware multipliers), some
bits did not get spread out very much.  In particular, certain fairly
common bit ranges in the input (roughly bits 12-20: commonly with the
most information in them when you hash things like byte offsets in files
or memory that have block factors that mean that the low bits are often
zero) would not necessarily show up much in the result.

There's a bigger patch-series brewing to fix up things more completely,
but this is the fairly minimal fix for the 64-bit hashing problem.  It
simply picks a much better constant multiplier, spreading the bits out a
lot better.

NOTE! For 32-bit architectures, the bad old hash_64() remains the same
for now, since 64-bit multiplies are expensive.  The bigger hashing
cleanup will replace the 32-bit case with something better.

The new constants were picked by George Spelvin who wrote that bigger
cleanup series.  I just picked out the constants and part of the comment
from that series.

Cc: George Spelvin <linux@horizon.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoMake hash_64() use a 64-bit multiply when appropriate
Linus Torvalds [Sat, 13 Sep 2014 18:24:03 +0000 (11:24 -0700)]
Make hash_64() use a 64-bit multiply when appropriate

commit 23d0db76ffa13ffb95229946e4648568c3c29db5 upstream.

The hash_64() function historically does the multiply by the
GOLDEN_RATIO_PRIME_64 number with explicit shifts and adds, because
unlike the 32-bit case, gcc seems unable to turn the constant multiply
into the more appropriate shift and adds when required.

However, that means that we generate those shifts and adds even when the
architecture has a fast multiplier, and could just do it better in
hardware.

Use the now-cleaned-up CONFIG_ARCH_HAS_FAST_MULTIPLIER (together with
"is it a 64-bit architecture") to decide whether to use an integer
multiply or the explicit sequence of shift/add instructions.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
[bwh: This has no immediate effect in 3.2 because nothing defines
 CONFIG_ARCH_HAS_FAST_MULTIPLIER. However the following fix removes
 that condition.]

5 years agoEDAC: i7core, sb_edac: Don't return NOTIFY_BAD from mce_decoder callback
Tony Luck [Fri, 29 Apr 2016 13:42:25 +0000 (15:42 +0200)]
EDAC: i7core, sb_edac: Don't return NOTIFY_BAD from mce_decoder callback

commit c4fc1956fa31003bfbe4f597e359d751568e2954 upstream.

Both of these drivers can return NOTIFY_BAD, but this terminates
processing other callbacks that were registered later on the chain.
Since the driver did nothing to log the error it seems wrong to prevent
other interested parties from seeing it. E.g. neither of them had even
bothered to check the type of the error to see if it was a memory error
before the return NOTIFY_BAD.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Acked-by: Aristeu Rozanski <aris@redhat.com>
Acked-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/72937355dd92318d2630979666063f8a2853495b.1461864507.git.tony.luck@intel.com
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agomm/huge_memory: replace VM_NO_THP VM_BUG_ON with actual VMA check
Konstantin Khlebnikov [Thu, 28 Apr 2016 23:18:32 +0000 (16:18 -0700)]
mm/huge_memory: replace VM_NO_THP VM_BUG_ON with actual VMA check

commit 3486b85a29c1741db99d0c522211c82d2b7a56d0 upstream.

Khugepaged detects own VMAs by checking vm_file and vm_ops but this way
it cannot distinguish private /dev/zero mappings from other special
mappings like /dev/hpet which has no vm_ops and popultes PTEs in mmap.

This fixes false-positive VM_BUG_ON and prevents installing THP where
they are not expected.

Link: http://lkml.kernel.org/r/CACT4Y+ZmuZMV5CjSFOeXviwQdABAgT7T+StKfTqan9YDtgEi5g@mail.gmail.com
Fixes: 78f11a255749 ("mm: thp: fix /dev/zero MAP_PRIVATE and vm_flags cleanups")
Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[bwh: Backported to 3.2:
 - The assertions use VM_BUG_ON() and also check is_linear_pfn_mapping();
   keep that check
 - Adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agothp: introduce hugepage_vma_check()
Bob Liu [Wed, 12 Dec 2012 00:00:39 +0000 (16:00 -0800)]
thp: introduce hugepage_vma_check()

commit fa475e517adb422cb3492e636195f9b2c0d009c8 upstream.

Multiple places do the same check.

Signed-off-by: Bob Liu <lliubbo@gmail.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Ni zhan Chen <nizhan.chen@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[bwh: Backported to 3.2:
 - Also move the is_linear_pfn_mapping() test
 - Adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoIB/security: Restrict use of the write() interface
Jason Gunthorpe [Mon, 11 Apr 2016 01:13:13 +0000 (19:13 -0600)]
IB/security: Restrict use of the write() interface

commit e6bd18f57aad1a2d1ef40e646d03ed0f2515c9e3 upstream.

The drivers/infiniband stack uses write() as a replacement for
bi-directional ioctl().  This is not safe. There are ways to
trigger write calls that result in the return structure that
is normally written to user space being shunted off to user
specified kernel memory instead.

For the immediate repair, detect and deny suspicious accesses to
the write API.

For long term, update the user space libraries and the kernel API
to something that doesn't present the same security vulnerabilities
(likely a structured ioctl() interface).

The impacted uAPI interfaces are generally only available if
hardware from drivers/infiniband is installed in the system.

Reported-by: Jann Horn <jann@thejh.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
[ Expanded check to all known write() entry points ]
Signed-off-by: Doug Ledford <dledford@redhat.com>
[bwh: Backported to 3.2:
 - Drop changes to hfi1
 - include/rdma/ib.h didn't exist, so create it with the usual header guard
   and include it in drivers/infiniband/core/ucma.c
 - ipath_write() has the same problem, so add the same restriction there]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agodrm/i915: Fix system resume if PCI device remained enabled
Imre Deak [Mon, 18 Apr 2016 11:45:54 +0000 (14:45 +0300)]
drm/i915: Fix system resume if PCI device remained enabled

commit dab9a2663f4e688106c041f7cd2797a721382f0a upstream.

During system resume we depended on pci_enable_device() also putting the
device into PCI D0 state. This won't work if the PCI device was already
enabled but still in D3 state. This is because pci_enable_device() is
refcounted and will not change the HW state if called with a non-zero
refcount. Leaving the device in D3 will make all subsequent device
accesses fail.

This didn't cause a problem most of the time, since we resumed with an
enable refcount of 0. But it fails at least after module reload because
after that we also happen to leak a PCI device enable reference: During
probing we call drm_get_pci_dev() which will enable the PCI device, but
during device removal drm_put_dev() won't disable it. This is a bug of
its own in DRM core, but without much harm as it only leaves the PCI
device enabled. Fixing it is also a bit more involved, due to DRM
mid-layering and because it affects non-i915 drivers too. The fix in
this patch is valid regardless of the problem in DRM core.

v2:
- Add a code comment about the relation of this fix to the freeze/thaw
  vs. the suspend/resume phases. (Ville)
- Add a code comment about the inconsistent ordering of set power state
  and device enable calls. (Chris)

CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
CC: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460979954-14503-1-git-send-email-imre.deak@intel.com
(cherry picked from commit 44410cd0bfb26bde9288da34c190cc9267d42a20)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
[bwh: Backported to 3.2:
 - Return error code directly
 - Adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoUSB: serial: cp210x: add Straizona Focusers device ids
Jasem Mutlaq [Tue, 19 Apr 2016 07:38:27 +0000 (10:38 +0300)]
USB: serial: cp210x: add Straizona Focusers device ids

commit 613ac23a46e10d4d4339febdd534fafadd68e059 upstream.

Adding VID:PID for Straizona Focusers to cp210x driver.

Signed-off-by: Jasem Mutlaq <mutlaqja@ikarustech.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoUSB: serial: cp210x: add ID for Link ECU
Mike Manning [Mon, 18 Apr 2016 12:13:23 +0000 (12:13 +0000)]
USB: serial: cp210x: add ID for Link ECU

commit 1d377f4d690637a0121eac8701f84a0aa1e69a69 upstream.

The Link ECU is an aftermarket ECU computer for vehicles that provides
full tuning abilities as well as datalogging and displaying capabilities
via the USB to Serial adapter built into the device.

Signed-off-by: Mike Manning <michael@bsch.com.au>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agobatman-adv: Fix broadcast/ogm queue limit on a removed interface
Linus Lüssing [Fri, 11 Mar 2016 13:04:49 +0000 (14:04 +0100)]
batman-adv: Fix broadcast/ogm queue limit on a removed interface

commit c4fdb6cff2aa0ae740c5f19b6f745cbbe786d42f upstream.

When removing a single interface while a broadcast or ogm packet is
still pending then we will free the forward packet without releasing the
queue slots again.

This patch is supposed to fix this issue.

Fixes: 6d5808d4ae1b ("batman-adv: Add missing hardif_free_ref in forw_packet_free")
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
[sven@narfation.org: fix conflicts with current version]
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <a@unstable.cc>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agobatman-adv: Reduce refcnt of removed router when updating route
Sven Eckelmann [Sun, 20 Mar 2016 11:27:53 +0000 (12:27 +0100)]
batman-adv: Reduce refcnt of removed router when updating route

commit d1a65f1741bfd9c69f9e4e2ad447a89b6810427d upstream.

_batadv_update_route rcu_derefences orig_ifinfo->router outside of a
spinlock protected region to print some information messages to the debug
log. But this pointer is not checked again when the new pointer is assigned
in the spinlock protected region. Thus is can happen that the value of
orig_ifinfo->router changed in the meantime and thus the reference counter
of the wrong router gets reduced after the spinlock protected region.

Just rcu_dereferencing the value of orig_ifinfo->router inside the spinlock
protected region (which also set the new pointer) is enough to get the
correct old router object.

Fixes: e1a5382f978b ("batman-adv: Make orig_node->router an rcu protected pointer")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <a@unstable.cc>
[bwh: Backported to 3.2: s/orig_ifinfo/orig_node/]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agobatman-adv: Check skb size before using encapsulated ETH+VLAN header
Sven Eckelmann [Fri, 26 Feb 2016 16:56:13 +0000 (17:56 +0100)]
batman-adv: Check skb size before using encapsulated ETH+VLAN header

commit c78296665c3d81f040117432ab9e1cb125521b0c upstream.

The encapsulated ethernet and VLAN header may be outside the received
ethernet frame. Thus the skb buffer size has to be checked before it can be
parsed to find out if it encapsulates another batman-adv packet.

Fixes: 420193573f11 ("batman-adv: softif bridge loop avoidance")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <a@unstable.cc>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agox86/mm/xen: Suppress hugetlbfs in PV guests
Jan Beulich [Thu, 21 Apr 2016 06:27:04 +0000 (00:27 -0600)]
x86/mm/xen: Suppress hugetlbfs in PV guests

commit 103f6112f253017d7062cd74d17f4a514ed4485c upstream.

Huge pages are not normally available to PV guests. Not suppressing
hugetlbfs use results in an endless loop of page faults when user mode
code tries to access a hugetlbfs mapped area (since the hypervisor
denies such PTEs to be created, but error indications can't be
propagated out of xen_set_pte_at(), just like for various of its
siblings), and - once killed in an oops like this:

  kernel BUG at .../fs/hugetlbfs/inode.c:428!
  invalid opcode: 0000 [#1] SMP
  ...
  RIP: e030:[<ffffffff811c333b>]  [<ffffffff811c333b>] remove_inode_hugepages+0x25b/0x320
  ...
  Call Trace:
   [<ffffffff811c3415>] hugetlbfs_evict_inode+0x15/0x40
   [<ffffffff81167b3d>] evict+0xbd/0x1b0
   [<ffffffff8116514a>] __dentry_kill+0x19a/0x1f0
   [<ffffffff81165b0e>] dput+0x1fe/0x220
   [<ffffffff81150535>] __fput+0x155/0x200
   [<ffffffff81079fc0>] task_work_run+0x60/0xa0
   [<ffffffff81063510>] do_exit+0x160/0x400
   [<ffffffff810637eb>] do_group_exit+0x3b/0xa0
   [<ffffffff8106e8bd>] get_signal+0x1ed/0x470
   [<ffffffff8100f854>] do_signal+0x14/0x110
   [<ffffffff810030e9>] prepare_exit_to_usermode+0xe9/0xf0
   [<ffffffff814178a5>] retint_user+0x8/0x13

This is CVE-2016-3961 / XSA-174.

Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Juergen Gross <JGross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: xen-devel <xen-devel@lists.xenproject.org>
Link: http://lkml.kernel.org/r/57188ED802000078000E431C@prv-mh.provo.novell.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agos390/hugetlb: add hugepages_supported define
Dominik Dingel [Fri, 17 Jul 2015 23:23:39 +0000 (16:23 -0700)]
s390/hugetlb: add hugepages_supported define

commit 7f9be77555bb2e52de84e9dddf7b4eb20cc6e171 upstream.

On s390 we only can enable hugepages if the underlying hardware/hypervisor
also does support this.  Common code now would assume this to be
signaled by setting HPAGE_SHIFT to 0.  But on s390, where we only
support one hugepage size, there is a link between HPAGE_SHIFT and
pageblock_order.

So instead of setting HPAGE_SHIFT to 0, we will implement the check for
the hardware capability.

Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agomm: hugetlb: allow hugepages_supported to be architecture specific
Dominik Dingel [Fri, 17 Jul 2015 23:23:37 +0000 (16:23 -0700)]
mm: hugetlb: allow hugepages_supported to be architecture specific

commit 2531c8cf56a640cd7d17057df8484e570716a450 upstream.

s390 has a constant hugepage size, by setting HPAGE_SHIFT we also change
e.g. the pageblock_order, which should be independent in respect to
hugepage support.

With this patch every architecture is free to define how to check
for hugepage support.

Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agohugetlb: ensure hugepage access is denied if hugepages are not supported
Nishanth Aravamudan [Tue, 6 May 2014 19:50:00 +0000 (12:50 -0700)]
hugetlb: ensure hugepage access is denied if hugepages are not supported

commit 457c1b27ed56ec472d202731b12417bff023594a upstream.

Currently, I am seeing the following when I `mount -t hugetlbfs /none
/dev/hugetlbfs`, and then simply do a `ls /dev/hugetlbfs`.  I think it's
related to the fact that hugetlbfs is properly not correctly setting
itself up in this state?:

  Unable to handle kernel paging request for data at address 0x00000031
  Faulting instruction address: 0xc000000000245710
  Oops: Kernel access of bad area, sig: 11 [#1]
  SMP NR_CPUS=2048 NUMA pSeries
  ....

In KVM guests on Power, in a guest not backed by hugepages, we see the
following:

  AnonHugePages:         0 kB
  HugePages_Total:       0
  HugePages_Free:        0
  HugePages_Rsvd:        0
  HugePages_Surp:        0
  Hugepagesize:         64 kB

HPAGE_SHIFT == 0 in this configuration, which indicates that hugepages
are not supported at boot-time, but this is only checked in
hugetlb_init().  Extract the check to a helper function, and use it in a
few relevant places.

This does make hugetlbfs not supported (not registered at all) in this
environment.  I believe this is fine, as there are no valid hugepages
and that won't change at runtime.

[akpm@linux-foundation.org: use pr_info(), per Mel]
[akpm@linux-foundation.org: fix build when HPAGE_SHIFT is undefined]
Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[bwh: Backported to 3.2:
 - Drop changes to hugetlb_show_meminfo()
 - Adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoatl2: Disable unimplemented scatter/gather feature
Ben Hutchings [Wed, 20 Apr 2016 22:23:08 +0000 (23:23 +0100)]
atl2: Disable unimplemented scatter/gather feature

commit f43bfaeddc79effbf3d0fcb53ca477cca66f3db8 upstream.

atl2 includes NETIF_F_SG in hw_features even though it has no support
for non-linear skbs.  This bug was originally harmless since the
driver does not claim to implement checksum offload and that used to
be a requirement for SG.

Now that SG and checksum offload are independent features, if you
explicitly enable SG *and* use one of the rare protocols that can use
SG without checkusm offload, this potentially leaks sensitive
information (before you notice that it just isn't working).  Therefore
this obscure bug has been designated CVE-2016-2117.

Reported-by: Justin Yackoski <jyackoski@crypto-nite.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Fixes: ec5f06156423 ("net: Kill link between CSUM and SG features.")
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context]

5 years agopowerpc: scan_features() updates incorrect bits for REAL_LE
Anton Blanchard [Fri, 15 Apr 2016 02:06:13 +0000 (12:06 +1000)]
powerpc: scan_features() updates incorrect bits for REAL_LE

commit 6997e57d693b07289694239e52a10d2f02c3a46f upstream.

The REAL_LE feature entry in the ibm_pa_feature struct is missing an MMU
feature value, meaning all the remaining elements initialise the wrong
values.

This means instead of checking for byte 5, bit 0, we check for byte 0,
bit 0, and then we incorrectly set the CPU feature bit as well as MMU
feature bit 1 and CPU user feature bits 0 and 2 (5).

Checking byte 0 bit 0 (IBM numbering), means we're looking at the
"Memory Management Unit (MMU)" feature - ie. does the CPU have an MMU.
In practice that bit is set on all platforms which have the property.

This means we set CPU_FTR_REAL_LE always. In practice that seems not to
matter because all the modern cpus which have this property also
implement REAL_LE, and we've never needed to disable it.

We're also incorrectly setting MMU feature bit 1, which is:

  #define MMU_FTR_TYPE_8xx 0x00000002

Luckily the only place that looks for MMU_FTR_TYPE_8xx is in Book3E
code, which can't run on the same cpus as scan_features(). So this also
doesn't matter in practice.

Finally in the CPU user feature mask, we're setting bits 0 and 2. Bit 2
is not currently used, and bit 0 is:

  #define PPC_FEATURE_PPC_LE 0x00000001

Which says the CPU supports the old style "PPC Little Endian" mode.
Again this should be harmless in practice as no 64-bit CPUs implement
that mode.

Fix the code by adding the missing initialisation of the MMU feature.

Also add a comment marking CPU user feature bit 2 (0x4) as reserved. It
would be unsafe to start using it as old kernels incorrectly set it.

Fixes: 44ae3ab3358e ("powerpc: Free up some CPU feature bits by moving out MMU-related features")
Signed-off-by: Anton Blanchard <anton@samba.org>
[mpe: Flesh out changelog, add comment reserving 0x4]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
[bwh: Backported to 3.2: adjust filename, context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoInput: pmic8xxx-pwrkey - fix algorithm for converting trigger delay
Stephen Boyd [Sun, 17 Apr 2016 12:21:42 +0000 (05:21 -0700)]
Input: pmic8xxx-pwrkey - fix algorithm for converting trigger delay

commit eda5ecc0a6b865561997e177c393f0b0136fe3b7 upstream.

The trigger delay algorithm that converts from microseconds to
the register value looks incorrect. According to most of the PMIC
documentation, the equation is

delay (Seconds) = (1 / 1024) * 2 ^ (x + 4)

except for one case where the documentation looks to have a
formatting issue and the equation looks like

delay (Seconds) = (1 / 1024) * 2 x + 4

Most likely this driver was written with the improper
documentation to begin with. According to the downstream sources
the valid delays are from 2 seconds to 1/64 second, and the
latter equation just doesn't make sense for that. Let's fix the
algorithm and the range check to match the documentation and the
downstream sources.

Reported-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Fixes: 92d57a73e410 ("input: Add support for Qualcomm PMIC8XXX power key")
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Tested-by: John Stultz <john.stultz@linaro.org>
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
[bwh: Backported to 3.2: use pdata->kpd_trigger_delay_us not kpd_delay]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agousb: hcd: out of bounds access in for_each_companion
Robert Dobrowolski [Thu, 24 Mar 2016 10:30:07 +0000 (03:30 -0700)]
usb: hcd: out of bounds access in for_each_companion

commit e86103a75705c7c530768f4ffaba74cf382910f2 upstream.

On BXT platform Host Controller and Device Controller figure as
same PCI device but with different device function. HCD should
not pass data to Device Controller but only to Host Controllers.
Checking if companion device is Host Controller, otherwise skip.

Signed-off-by: Robert Dobrowolski <robert.dobrowolski@linux.intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoUSB: uas: Add a new NO_REPORT_LUNS quirk
Hans de Goede [Tue, 12 Apr 2016 10:27:09 +0000 (12:27 +0200)]
USB: uas: Add a new NO_REPORT_LUNS quirk

commit 1363074667a6b7d0507527742ccd7bbed5e3ceaa upstream.

Add a new NO_REPORT_LUNS quirk and set it for Seagate drives with
an usb-id of: 0bc2:331a, as these will fail to respond to a
REPORT_LUNS command.

Reported-and-tested-by: David Webb <djw@noc.ac.uk>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.2:
 - Adjust context
 - Drop the UAS changes]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agousb: xhci: fix wild pointers in xhci_mem_cleanup
Lu Baolu [Fri, 8 Apr 2016 13:25:09 +0000 (16:25 +0300)]
usb: xhci: fix wild pointers in xhci_mem_cleanup

commit 71504062a7c34838c3fccd92c447f399d3cb5797 upstream.

This patch fixes some wild pointers produced by xhci_mem_cleanup.
These wild pointers will cause system crash if xhci_mem_cleanup()
is called twice.

Reported-and-tested-by: Pengcheng Li <lpc.li@hisilicon.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.2: there's no xhci_hcd::ext_caps field to clear]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agousb: xhci: applying XHCI_PME_STUCK_QUIRK to Intel BXT B0 host
Rafal Redzimski [Fri, 8 Apr 2016 13:25:05 +0000 (16:25 +0300)]
usb: xhci: applying XHCI_PME_STUCK_QUIRK to Intel BXT B0 host

commit 0d46faca6f887a849efb07c1655b5a9f7c288b45 upstream.

Broxton B0 also requires XHCI_PME_STUCK_QUIRK.
Adding PCI device ID for Broxton B and adding to quirk.

Signed-off-by: Rafal Redzimski <rafal.f.redzimski@intel.com>
Signed-off-by: Robert Dobrowolski <robert.dobrowolski@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agonl80211: check netlink protocol in socket release notification
Dmitry Ivanov [Wed, 6 Apr 2016 14:23:18 +0000 (17:23 +0300)]
nl80211: check netlink protocol in socket release notification

commit 8f815cdde3e550e10c2736990d791f60c2ce43eb upstream.

A non-privileged user can create a netlink socket with the same port_id as
used by an existing open nl80211 netlink socket (e.g. as used by a hostapd
process) with a different protocol number.

Closing this socket will then lead to the notification going to nl80211's
socket release notification handler, and possibly cause an action such as
removing a virtual interface.

Fix this issue by checking that the netlink protocol is NETLINK_GENERIC.
Since generic netlink has no notifier chain of its own, we can't fix the
problem more generically.

Fixes: 026331c4d9b5 ("cfg80211/mac80211: allow registering for and sending action frames")
Signed-off-by: Dmitry Ivanov <dima@ubnt.com>
[rewrite commit message]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agokvm: x86: do not leak guest xcr0 into host interrupt handlers
David Matlack [Wed, 30 Mar 2016 19:24:47 +0000 (12:24 -0700)]
kvm: x86: do not leak guest xcr0 into host interrupt handlers

commit fc5b7f3bf1e1414bd4e91db6918c85ace0c873a5 upstream.

An interrupt handler that uses the fpu can kill a KVM VM, if it runs
under the following conditions:
 - the guest's xcr0 register is loaded on the cpu
 - the guest's fpu context is not loaded
 - the host is using eagerfpu

Note that the guest's xcr0 register and fpu context are not loaded as
part of the atomic world switch into "guest mode". They are loaded by
KVM while the cpu is still in "host mode".

Usage of the fpu in interrupt context is gated by irq_fpu_usable(). The
interrupt handler will look something like this:

if (irq_fpu_usable()) {
        kernel_fpu_begin();

        [... code that uses the fpu ...]

        kernel_fpu_end();
}

As long as the guest's fpu is not loaded and the host is using eager
fpu, irq_fpu_usable() returns true (interrupted_kernel_fpu_idle()
returns true). The interrupt handler proceeds to use the fpu with
the guest's xcr0 live.

kernel_fpu_begin() saves the current fpu context. If this uses
XSAVE[OPT], it may leave the xsave area in an undesirable state.
According to the SDM, during XSAVE bit i of XSTATE_BV is not modified
if bit i is 0 in xcr0. So it's possible that XSTATE_BV[i] == 1 and
xcr0[i] == 0 following an XSAVE.

kernel_fpu_end() restores the fpu context. Now if any bit i in
XSTATE_BV == 1 while xcr0[i] == 0, XRSTOR generates a #GP. The
fault is trapped and SIGSEGV is delivered to the current process.

Only pre-4.2 kernels appear to be vulnerable to this sequence of
events. Commit 653f52c ("kvm,x86: load guest FPU context more eagerly")
from 4.2 forces the guest's fpu to always be loaded on eagerfpu hosts.

This patch fixes the bug by keeping the host's xcr0 loaded outside
of the interrupts-disabled region where KVM switches into guest mode.

Suggested-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: David Matlack <dmatlack@google.com>
[Move load after goto cancel_injection. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[bwh: Backported to 3.2:
 - Adjust context
 - Drop change in__kvm_set_xcr()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agolibahci: save port map for forced port map
Srinivas Kandagatla [Fri, 1 Apr 2016 07:52:56 +0000 (08:52 +0100)]
libahci: save port map for forced port map

commit 2fd0f46cb1b82587c7ae4a616d69057fb9bd0af7 upstream.

In usecases where force_port_map is used saved_port_map is never set,
resulting in not programming the PORTS_IMPL register as part of initial
config. This patch fixes this by setting it to port_map even in case
where force_port_map is used, making it more inline with other parts of
the code.

Fixes: 566d1827df2e ("libata: disable forced PORTS_IMPL for >= AHCI 1.3")
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Acked-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Andy Gross <andy.gross@linaro.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoInput: gtco - fix crash on detecting device without endpoints
Vladis Dronov [Thu, 31 Mar 2016 17:53:42 +0000 (10:53 -0700)]
Input: gtco - fix crash on detecting device without endpoints

commit 162f98dea487206d9ab79fc12ed64700667a894d upstream.

The gtco driver expects at least one valid endpoint. If given malicious
descriptors that specify 0 for the number of endpoints, it will crash in
the probe function. Ensure there is at least one endpoint on the interface
before using it.

Also let's fix a minor coding style issue.

The full correct report of this issue can be found in the public
Red Hat Bugzilla:

https://bugzilla.redhat.com/show_bug.cgi?id=1283385

Reported-by: Ralf Spenneberg <ralf@spenneberg.net>
Signed-off-by: Vladis Dronov <vdronov@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoipmi: fix timeout calculation when bmc is disconnected
Xie XiuQi [Fri, 24 Jan 2014 20:00:52 +0000 (14:00 -0600)]
ipmi: fix timeout calculation when bmc is disconnected

commit e21404dc0ac7ac971c1e36274b48bb460463f4e5 upstream.

Loading ipmi_si module while bmc is disconnected, we found the timeout
is longer than 5 secs.  Actually it takes about 3 mins and 20
secs.(HZ=250)

error message as below:
  Dec 12 19:08:59 linux kernel: IPMI BT: timeout in RD_WAIT [ ] 1 retries left
  Dec 12 19:08:59 linux kernel: BT: write 4 bytes seq=0x01 03 18 00 01
  [...]
  Dec 12 19:12:19 linux kernel: IPMI BT: timeout in RD_WAIT [ ]
  Dec 12 19:12:19 linux kernel: failed 2 retries, sending error response
  Dec 12 19:12:19 linux kernel: IPMI: BT reset (takes 5 secs)
  Dec 12 19:12:19 linux kernel: IPMI BT: flag reset [ ]

Function wait_for_msg_done() use schedule_timeout_uninterruptible(1) to
sleep 1 tick, so we should subtract jiffies_to_usecs(1) instead of 100
usecs from timeout.

Reported-by: Hu Shiyuan <hushiyuan@huawei.com>
Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: sebastian.riemer@profitbricks.com
Cc: cminyard@mvista.com
Cc: openipmi-developer@lists.sourceforge.net
5 years agox86, sparse: Do not force removal of __user when calling copy_to/from_user_nocheck()
Steven Rostedt [Fri, 3 Jan 2014 21:45:00 +0000 (16:45 -0500)]
x86, sparse: Do not force removal of __user when calling copy_to/from_user_nocheck()

commit df90ca969035d3f6c95044e272f75bf417b14245 upstream.

Commit ff47ab4ff3cdd "x86: Add 1/2/4/8 byte optimization to 64bit
__copy_{from,to}_user_inatomic" added a "_nocheck" call in between
the copy_to/from_user() and copy_user_generic(). As both the
normal and nocheck versions of theses calls use the proper __user
annotation, a typecast to remove it should not be added.
This causes sparse to spin out the following warnings:

arch/x86/include/asm/uaccess_64.h:207:47: warning: incorrect type in argument 2 (different address spaces)
arch/x86/include/asm/uaccess_64.h:207:47:    expected void const [noderef] <asn:1>*src
arch/x86/include/asm/uaccess_64.h:207:47:    got void const *<noident>
arch/x86/include/asm/uaccess_64.h:207:47: warning: incorrect type in argument 2 (different address spaces)
arch/x86/include/asm/uaccess_64.h:207:47:    expected void const [noderef] <asn:1>*src
arch/x86/include/asm/uaccess_64.h:207:47:    got void const *<noident>
arch/x86/include/asm/uaccess_64.h:207:47: warning: incorrect type in argument 2 (different address spaces)
arch/x86/include/asm/uaccess_64.h:207:47:    expected void const [noderef] <asn:1>*src
arch/x86/include/asm/uaccess_64.h:207:47:    got void const *<noident>
arch/x86/include/asm/uaccess_64.h:207:47: warning: incorrect type in argument 2 (different address spaces)
arch/x86/include/asm/uaccess_64.h:207:47:    expected void const [noderef] <asn:1>*src
arch/x86/include/asm/uaccess_64.h:207:47:    got void const *<noident>

Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/20140103164500.5f6478f5@gandalf.local.home
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Jaccon Bastiaansen <jaccon.bastiaansen@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: mingo@redhat.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: h.zuidam@computer.org
5 years agox86: Add 1/2/4/8 byte optimization to 64bit __copy_{from,to}_user_inatomic
Andi Kleen [Fri, 16 Aug 2013 21:17:19 +0000 (14:17 -0700)]
x86: Add 1/2/4/8 byte optimization to 64bit __copy_{from,to}_user_inatomic

commit ff47ab4ff3cddfa7bc1b25b990e24abe2ae474ff upstream.

The 64bit __copy_{from,to}_user_inatomic always called
copy_from_user_generic, but skipped the special optimizations for 1/2/4/8
byte accesses.

This especially hurts the futex call, which accesses the 4 byte futex
user value with a complicated fast string operation in a function call,
instead of a single movl.

Use __copy_{from,to}_user for _inatomic instead to get the same
optimizations. The only problem was the might_fault() in those functions.
So move that to new wrapper and call __copy_{f,t}_user_nocheck()
from *_inatomic directly.

32bit already did this correctly by duplicating the code.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1376687844-19857-2-git-send-email-andi@firstfloor.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Jaccon Bastiaansen <jaccon.bastiaansen@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: mingo@redhat.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: h.zuidam@computer.org
5 years agocrypto: gcm - Fix rfc4543 decryption crash
Herbert Xu [Fri, 18 Mar 2016 14:42:40 +0000 (22:42 +0800)]
crypto: gcm - Fix rfc4543 decryption crash

This bug has already bee fixed upstream since 4.2.  However, it
was fixed during the AEAD conversion so no fix was backported to
the older kernels.

When we do an RFC 4543 decryption, we will end up writing the
ICV beyond the end of the dst buffer.  This should lead to a
crash but for some reason it was never noticed.

This patch fixes it by only writing back the ICV for encryption.

Fixes: d733ac90f9fe ("crypto: gcm - fix rfc4543 to handle async...")
Reported-by: Patrick Meyer <patrick.meyer@vasgard.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agocrypto: gcm - fix rfc4543 to handle async crypto correctly
Jussi Kivilinna [Sun, 7 Apr 2013 13:43:46 +0000 (16:43 +0300)]
crypto: gcm - fix rfc4543 to handle async crypto correctly

commit d733ac90f9fe8ac284e523f9920b507555b12f6d upstream.

If the gcm cipher used by rfc4543 does not complete request immediately,
the authentication tag is not copied to destination buffer. Patch adds
correct async logic for this case.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agox86/microcode/amd: Do not overwrite final patch levels
Borislav Petkov [Mon, 12 Oct 2015 09:22:42 +0000 (11:22 +0200)]
x86/microcode/amd: Do not overwrite final patch levels

commit 0399f73299f1b7e04de329050f7111b362b7eeb5 upstream.

A certain number of patch levels of applied microcode should not
be overwritten by the microcode loader, otherwise bad things
will happen.

Check those and abort update if the current core has one of
those final patch levels applied by the BIOS. 32-bit needs
special handling, of course.

See https://bugzilla.suse.com/show_bug.cgi?id=913996 for more
info.

Tested-by: Peter Kirchgeßner <pkirchgessner@t-online.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/1444641762-9437-7-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
5 years agox86/microcode/amd: Extract current patch level read to a function
Borislav Petkov [Mon, 12 Oct 2015 09:22:41 +0000 (11:22 +0200)]
x86/microcode/amd: Extract current patch level read to a function

commit 2eff73c0a11f19ff082a566e3429fbaaca7b8e7b upstream.

Pave the way for checking the current patch level of the
microcode in a core. We want to be able to do stuff depending on
the patch level - in this case decide whether to update or not.
But that will be added in a later patch.

Drop unused local var uci assignment, while at it.

Integrate a fix for 32-bit and CONFIG_PARAVIRT from Takashi Iwai:

 Use native_rdmsr() in check_current_patch_level() because with
 CONFIG_PARAVIRT enabled and on 32-bit, where we run before
 paging has been enabled, we cannot deref pv_info yet. Or we
 could, but we'd need to access its physical address. This way of
 fixing it is simpler. See:

   https://bugzilla.suse.com/show_bug.cgi?id=943179 for the background.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Takashi Iwai <tiwai@suse.com>:
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/1444641762-9437-6-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
5 years agoRevert "net: validate variable length ll headers"
Ben Hutchings [Sun, 1 May 2016 16:00:43 +0000 (18:00 +0200)]
Revert "net: validate variable length ll headers"

This reverts commit b5518429e70cd783b8ca52335456172c1a0589f6, which was
commit 2793a23aacbd754dbbb5cb75093deb7e4103bace upstream.  It is
pointless unless af_packet calls the new function.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoRevert "ax25: add link layer header validation function"
Ben Hutchings [Sun, 1 May 2016 15:59:44 +0000 (17:59 +0200)]
Revert "ax25: add link layer header validation function"

This reverts commit 0954b59d9f4b2dcc59f28d1f64c3a21062a64372, which was
commit ea47781c26510e5d97f80f9aceafe9065bd5e3aa upstream.  It is
pointless unless af_packet calls the new function.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoLinux 3.2.80 v3.2.80
Ben Hutchings [Sat, 30 Apr 2016 22:05:28 +0000 (00:05 +0200)]
Linux 3.2.80

5 years agonetfilter: x_tables: fix unconditional helper
Florian Westphal [Tue, 22 Mar 2016 17:02:52 +0000 (18:02 +0100)]
netfilter: x_tables: fix unconditional helper

commit 54d83fc74aa9ec72794373cb47432c5f7fb1a309 upstream.

Ben Hawkes says:

 In the mark_source_chains function (net/ipv4/netfilter/ip_tables.c) it
 is possible for a user-supplied ipt_entry structure to have a large
 next_offset field. This field is not bounds checked prior to writing a
 counter value at the supplied offset.

Problem is that mark_source_chains should not have been called --
the rule doesn't have a next entry, so its supposed to return
an absolute verdict of either ACCEPT or DROP.

However, the function conditional() doesn't work as the name implies.
It only checks that the rule is using wildcard address matching.

However, an unconditional rule must also not be using any matches
(no -m args).

The underflow validator only checked the addresses, therefore
passing the 'unconditional absolute verdict' test, while
mark_source_chains also tested for presence of matches, and thus
proceeeded to the next (not-existent) rule.

Unify this so that all the callers have same idea of 'unconditional rule'.

Reported-by: Ben Hawkes <hawkes@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agojme: Fix device PM wakeup API usage
Guo-Fu Tseng [Sat, 5 Mar 2016 00:11:56 +0000 (08:11 +0800)]
jme: Fix device PM wakeup API usage

commit 81422e672f8181d7ad1ee6c60c723aac649f538f upstream.

According to Documentation/power/devices.txt

The driver should not use device_set_wakeup_enable() which is the policy
for user to decide.

Using device_init_wakeup() to initialize dev->power.should_wakeup and
dev->power.can_wakeup on driver initialization.

And use device_may_wakeup() on suspend to decide if WoL function should
be enabled on NIC.

Reported-by: Diego Viola <diego.viola@gmail.com>
Signed-off-by: Guo-Fu Tseng <cooldavid@cooldavid.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agojme: Do not enable NIC WoL functions on S0
Guo-Fu Tseng [Sat, 5 Mar 2016 00:11:55 +0000 (08:11 +0800)]
jme: Do not enable NIC WoL functions on S0

commit 0772a99b818079e628a1da122ac7ee023faed83e upstream.

Otherwise it might be back on resume right after going to suspend in
some hardware.

Reported-by: Diego Viola <diego.viola@gmail.com>
Signed-off-by: Guo-Fu Tseng <cooldavid@cooldavid.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoipv6: Count in extension headers in skb->network_header
Jakub Sitnicki [Tue, 5 Apr 2016 16:41:08 +0000 (18:41 +0200)]
ipv6: Count in extension headers in skb->network_header

[ Upstream commit 3ba3458fb9c050718b95275a3310b74415e767e2 ]

When sending a UDPv6 message longer than MTU, account for the length
of fragmentable IPv6 extension headers in skb->network_header offset.
Same as we do in alloc_new_skb path in __ip6_append_data().

This ensures that later on __ip6_make_skb() will make space in
headroom for fragmentable extension headers:

/* move skb->data to ip header from ext header */
if (skb->data < skb_network_header(skb))
__skb_pull(skb, skb_network_offset(skb));

Prevents a splat due to skb_under_panic:

skbuff: skb_under_panic: text:ffffffff8143397b len:2126 put:14 \
head:ffff880005bacf50 data:ffff880005bacf4a tail:0x48 end:0xc0 dev:lo
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:104!
invalid opcode: 0000 [#1] KASAN
CPU: 0 PID: 160 Comm: reproducer Not tainted 4.6.0-rc2 #65
[...]
Call Trace:
 [<ffffffff813eb7b9>] skb_push+0x79/0x80
 [<ffffffff8143397b>] eth_header+0x2b/0x100
 [<ffffffff8141e0d0>] neigh_resolve_output+0x210/0x310
 [<ffffffff814eab77>] ip6_finish_output2+0x4a7/0x7c0
 [<ffffffff814efe3a>] ip6_output+0x16a/0x280
 [<ffffffff815440c1>] ip6_local_out+0xb1/0xf0
 [<ffffffff814f1115>] ip6_send_skb+0x45/0xd0
 [<ffffffff81518836>] udp_v6_send_skb+0x246/0x5d0
 [<ffffffff8151985e>] udpv6_sendmsg+0xa6e/0x1090
[...]

Reported-by: Ji Jianwen <jiji@redhat.com>
Signed-off-by: Jakub Sitnicki <jkbs@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoipv4: l2tp: fix a potential issue in l2tp_ip_recv
Haishuang Yan [Sun, 3 Apr 2016 14:09:23 +0000 (22:09 +0800)]
ipv4: l2tp: fix a potential issue in l2tp_ip_recv

[ Upstream commit 5745b8232e942abd5e16e85fa9b27cc21324acf0 ]

pskb_may_pull() can change skb->data, so we have to load ptr/optr at the
right place.

Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoxfrm: Fix crash observed during device unregistration and decryption
subashab@codeaurora.org [Thu, 24 Mar 2016 04:39:50 +0000 (22:39 -0600)]
xfrm: Fix crash observed during device unregistration and decryption

[ Upstream commit 071d36bf21bcc837be00cea55bcef8d129e7f609 ]

A crash is observed when a decrypted packet is processed in receive
path. get_rps_cpus() tries to dereference the skb->dev fields but it
appears that the device is freed from the poison pattern.

[<ffffffc000af58ec>] get_rps_cpu+0x94/0x2f0
[<ffffffc000af5f94>] netif_rx_internal+0x140/0x1cc
[<ffffffc000af6094>] netif_rx+0x74/0x94
[<ffffffc000bc0b6c>] xfrm_input+0x754/0x7d0
[<ffffffc000bc0bf8>] xfrm_input_resume+0x10/0x1c
[<ffffffc000ba6eb8>] esp_input_done+0x20/0x30
[<ffffffc0000b64c8>] process_one_work+0x244/0x3fc
[<ffffffc0000b7324>] worker_thread+0x2f8/0x418
[<ffffffc0000bb40c>] kthread+0xe0/0xec

-013|get_rps_cpu(
     |    dev = 0xFFFFFFC08B688000,
     |    skb = 0xFFFFFFC0C76AAC00 -> (
     |      dev = 0xFFFFFFC08B688000 -> (
     |        name =
"......................................................
     |        name_hlist = (next = 0xAAAAAAAAAAAAAAAA, pprev =
0xAAAAAAAAAAA

Following are the sequence of events observed -

- Encrypted packet in receive path from netdevice is queued
- Encrypted packet queued for decryption (asynchronous)
- Netdevice brought down and freed
- Packet is decrypted and returned through callback in esp_input_done
- Packet is queued again for process in network stack using netif_rx

Since the device appears to have been freed, the dereference of
skb->dev in get_rps_cpus() leads to an unhandled page fault
exception.

Fix this by holding on to device reference when queueing packets
asynchronously and releasing the reference on call back return.

v2: Make the change generic to xfrm as mentioned by Steffen and
update the title to xfrm

Suggested-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Jerome Stanislaus <jeromes@codeaurora.org>
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoqlge: Fix receive packets drop.
Manish Chopra [Tue, 15 Mar 2016 11:13:45 +0000 (07:13 -0400)]
qlge: Fix receive packets drop.

[ Upstream commit 2c9a266afefe137bff06bbe0fc48b4d3b3cb348c ]

When running small packets [length < 256 bytes] traffic, packets were
being dropped due to invalid data in those packets which were
delivered by the driver upto the stack. Using pci_dma_sync_single_for_cpu
ensures copying latest and updated data into skb from the receive buffer.

Signed-off-by: Sony Chacko <sony.chacko@qlogic.com>
Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agofarsync: fix off-by-one bug in fst_add_one
Arnd Bergmann [Mon, 14 Mar 2016 14:18:35 +0000 (15:18 +0100)]
farsync: fix off-by-one bug in fst_add_one

[ Upstream commit e725a66c0202b5f36c2f9d59d26a65c53bbf21f7 ]

gcc-6 finds an out of bounds access in the fst_add_one function
when calculating the end of the mmio area:

drivers/net/wan/farsync.c: In function 'fst_add_one':
drivers/net/wan/farsync.c:418:53: error: index 2 denotes an offset greater than size of 'u8[2][8192] {aka unsigned char[2][8192]}' [-Werror=array-bounds]
 #define BUF_OFFSET(X)   (BFM_BASE + offsetof(struct buf_window, X))
                                                     ^
include/linux/compiler-gcc.h:158:21: note: in definition of macro '__compiler_offsetof'
  __builtin_offsetof(a, b)
                     ^
drivers/net/wan/farsync.c:418:37: note: in expansion of macro 'offsetof'
 #define BUF_OFFSET(X)   (BFM_BASE + offsetof(struct buf_window, X))
                                     ^~~~~~~~
drivers/net/wan/farsync.c:2519:36: note: in expansion of macro 'BUF_OFFSET'
                                  + BUF_OFFSET ( txBuffer[i][NUM_TX_BUFFER][0]);
                                    ^~~~~~~~~~

The warning is correct, but not critical because this appears
to be a write-only variable that is set by each WAN driver but
never accessed afterwards.

I'm taking the minimal fix here, using the correct pointer by
pointing 'mem_end' to the last byte inside of the register area
as all other WAN drivers do, rather than the first byte outside of
it. An alternative would be to just remove the mem_end member
entirely.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agomacvtap: always pass ethernet header in linear
Willem de Bruijn [Tue, 8 Mar 2016 20:18:54 +0000 (15:18 -0500)]
macvtap: always pass ethernet header in linear

[ Upstream commit 8e2ad4113ce4671686740f808ff2795395c39eef ]

The stack expects link layer headers in the skb linear section.
Macvtap can create skbs with llheader in frags in edge cases:
when (IFF_VNET_HDR is off or vnet_hdr.hdr_len < ETH_HLEN) and
prepad + len > PAGE_SIZE and vnet_hdr.flags has no or bad csum.

Add checks to ensure linear is always at least ETH_HLEN.
At this point, len is already ensured to be >= ETH_HLEN.

For backwards compatiblity, rounds up short vnet_hdr.hdr_len.
This differs from tap and packet, which return an error.

Fixes b9fb9ee07e67 ("macvtap: add GSO/csum offload support")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: don't use macvtap16_to_cpu()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agosh_eth: fix NULL pointer dereference in sh_eth_ring_format()
Sergei Shtylyov [Mon, 7 Mar 2016 22:36:28 +0000 (01:36 +0300)]
sh_eth: fix NULL pointer dereference in sh_eth_ring_format()

[ Upstream commit c1b7fca65070bfadca94dd53a4e6b71cd4f69715 ]

In a low memory situation, if netdev_alloc_skb() fails on a first RX ring
loop iteration  in sh_eth_ring_format(), 'rxdesc' is still NULL.  Avoid
kernel oops by adding the 'rxdesc' check after the loop.

Reported-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoax25: add link layer header validation function
Willem de Bruijn [Thu, 10 Mar 2016 02:58:33 +0000 (21:58 -0500)]
ax25: add link layer header validation function

[ Upstream commit ea47781c26510e5d97f80f9aceafe9065bd5e3aa ]

As variable length protocol, AX25 fails link layer header validation
tests based on a minimum length. header_ops.validate allows protocols
to validate headers that are shorter than hard_header_len. Implement
this callback for AX25.

See also http://comments.gmane.org/gmane.linux.network/401064

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agonet: validate variable length ll headers
Willem de Bruijn [Thu, 10 Mar 2016 02:58:32 +0000 (21:58 -0500)]
net: validate variable length ll headers

[ Upstream commit 2793a23aacbd754dbbb5cb75093deb7e4103bace ]

Netdevice parameter hard_header_len is variously interpreted both as
an upper and lower bound on link layer header length. The field is
used as upper bound when reserving room at allocation, as lower bound
when validating user input in PF_PACKET.

Clarify the definition to be maximum header length. For validation
of untrusted headers, add an optional validate member to header_ops.

Allow bypassing of validation by passing CAP_SYS_RAWIO, for instance
for deliberate testing of corrupt input. In this case, pad trailing
bytes, as some device drivers expect completely initialized headers.

See also http://comments.gmane.org/gmane.linux.network/401064

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: net_device has inline comments instead of kernel-doc]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agocdc_ncm: toggle altsetting to force reset before setup
Bjørn Mork [Thu, 3 Mar 2016 21:20:53 +0000 (22:20 +0100)]
cdc_ncm: toggle altsetting to force reset before setup

[ Upstream commit 48906f62c96cc2cd35753e59310cb70eb08cc6a5 ]

Some devices will silently fail setup unless they are reset first.
This is necessary even if the data interface is already in
altsetting 0, which it will be when the device is probed for the
first time.  Briefly toggling the altsetting forces a function
reset regardless of the initial state.

This fixes a setup problem observed on a number of Huawei devices,
appearing to operate in NTB-32 mode even if we explicitly set them
to NTB-16 mode.

Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: hard-code 1 for data_altsetting]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agosctp: lack the check for ports in sctp_v6_cmp_addr
Xin Long [Sun, 28 Feb 2016 02:03:51 +0000 (10:03 +0800)]
sctp: lack the check for ports in sctp_v6_cmp_addr

[ Upstream commit 40b4f0fd74e46c017814618d67ec9127ff20f157 ]

As the member .cmp_addr of sctp_af_inet6, sctp_v6_cmp_addr should also check
the port of addresses, just like sctp_v4_cmp_addr, cause it's invoked by
sctp_cmp_addr_exact().

Now sctp_v6_cmp_addr just check the port when two addresses have different
family, and lack the port check for two ipv6 addresses. that will make
sctp_hash_cmp() cannot work well.

so fix it by adding ports comparison in sctp_v6_cmp_addr().

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agonet: jme: fix suspend/resume on JMC260
Diego Viola [Tue, 23 Feb 2016 15:04:04 +0000 (12:04 -0300)]
net: jme: fix suspend/resume on JMC260

[ Upstream commit ee50c130c82175eaa0820c96b6d3763928af2241 ]

The JMC260 network card fails to suspend/resume because the call to
jme_start_irq() was too early, moving the call to jme_start_irq() after
the call to jme_reset_link() makes it work.

Prior this change suspend/resume would fail unless /sys/power/pm_async=0
was explicitly specified.

Relevant bug report: https://bugzilla.kernel.org/show_bug.cgi?id=112351

Signed-off-by: Diego Viola <diego.viola@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoserial: sh-sci: Remove cpufreq notifier to fix crash/deadlock
Geert Uytterhoeven [Mon, 4 Apr 2016 14:36:25 +0000 (16:36 +0200)]
serial: sh-sci: Remove cpufreq notifier to fix crash/deadlock

commit ff1cab374ad98f4b9f408525ca9c08992b4ed784 upstream.

The BSP team noticed that there is spin/mutex lock issue on sh-sci when
CPUFREQ is used.  The issue is that the notifier function may call
mutex_lock() while the spinlock is held, which can lead to a BUG().
This may happen if CPUFREQ is changed while another CPU calls
clk_get_rate().

Taking the spinlock was added to the notifier function in commit
e552de2413edad1a ("sh-sci: add platform device private data"), to
protect the list of serial ports against modification during traversal.
At that time the Common Clock Framework didn't exist yet, and
clk_get_rate() just returned clk->rate without taking a mutex.
Note that since commit d535a2305facf9b4 ("serial: sh-sci: Require a
device per port mapping."), there's no longer a list of serial ports to
traverse, and taking the spinlock became superfluous.

To fix the issue, just remove the cpufreq notifier:
  1. The notifier doesn't work correctly: all it does is update the
     stored clock rate; it does not update the divider in the hardware.
     The divider will only be updated when calling sci_set_termios().
     I believe this was broken back in 2004, when the old
     drivers/char/sh-sci.c driver (where the notifier did update the
     divider) was replaced by drivers/serial/sh-sci.c (where the
     notifier just updated port->uartclk).
     Cfr. full-history-linux commits 6f8deaef2e9675d9 ("[PATCH] sh: port
     sh-sci driver to the new API") and 3f73fe878dc9210a ("[PATCH]
     Remove old sh-sci driver").
  2. On modern SoCs, the sh-sci parent clock rate is no longer related
     to the CPU clock rate anyway, so using a cpufreq notifier is
     futile.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agobio: return EINTR if copying to user space got interrupted
Hannes Reinecke [Fri, 12 Feb 2016 08:39:15 +0000 (09:39 +0100)]
bio: return EINTR if copying to user space got interrupted

commit 2d99b55d378c996b9692a0c93dd25f4ed5d58934 upstream.

Commit 35dc248383bbab0a7203fca4d722875bc81ef091 introduced a check for
current->mm to see if we have a user space context and only copies data
if we do. Now if an IO gets interrupted by a signal data isn't copied
into user space any more (as we don't have a user space context) but
user space isn't notified about it.

This patch modifies the behaviour to return -EINTR from bio_uncopy_user()
to notify userland that a signal has interrupted the syscall, otherwise
it could lead to a situation where the caller may get a buffer with
no data returned.

This can be reproduced by issuing SG_IO ioctl()s in one thread while
constantly sending signals to it.

Fixes: 35dc248 [SCSI] sg: Fix user memory corruption when SG_IO is interrupted by a signal
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
[bwh: Backported to 3.2:
 - Adjust filename
 - Put the assignment in the existing 'else' block]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agosctp: Fix port hash table size computation
Neil Horman [Thu, 18 Feb 2016 21:10:57 +0000 (16:10 -0500)]
sctp: Fix port hash table size computation

[ Upstream commit d9749fb5942f51555dc9ce1ac0dbb1806960a975 ]

Dmitry Vyukov noted recently that the sctp_port_hashtable had an error in
its size computation, observing that the current method never guaranteed
that the hashsize (measured in number of entries) would be a power of two,
which the input hash function for that table requires.  The root cause of
the problem is that two values need to be computed (one, the allocation
order of the storage requries, as passed to __get_free_pages, and two the
number of entries for the hash table).  Both need to be ^2, but for
different reasons, and the existing code is simply computing one order
value, and using it as the basis for both, which is wrong (i.e. it assumes
that ((1<<order)*PAGE_SIZE)/sizeof(bucket) is still ^2 when its not).

To fix this, we change the logic slightly.  We start by computing a goal
allocation order (which is limited by the maximum size hash table we want
to support.  Then we attempt to allocate that size table, decreasing the
order until a successful allocation is made.  Then, with the resultant
successful order we compute the number of buckets that hash table supports,
which we then round down to the nearest power of two, giving us the number
of entries the table actually supports.

I've tested this locally here, using non-debug and spinlock-debug kernels,
and the number of entries in the hashtable consistently work out to be
powers of two in all cases.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
CC: Dmitry Vyukov <dvyukov@google.com>
CC: Vladislav Yasevich <vyasevich@gmail.com>
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agopppoe: fix reference counting in PPPoE proxy
Guillaume Nault [Mon, 15 Feb 2016 16:01:10 +0000 (17:01 +0100)]
pppoe: fix reference counting in PPPoE proxy

[ Upstream commit 29e73269aa4d36f92b35610c25f8b01c789b0dc8 ]

Drop reference on the relay_po socket when __pppoe_xmit() succeeds.
This is already handled correctly in the error path.

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoipv4: fix memory leaks in ip_cmsg_send() callers
Eric Dumazet [Thu, 4 Feb 2016 14:23:28 +0000 (06:23 -0800)]
ipv4: fix memory leaks in ip_cmsg_send() callers

[ Upstream commit 919483096bfe75dda338e98d56da91a263746a0a ]

Dmitry reported memory leaks of IP options allocated in
ip_cmsg_send() when/if this function returns an error.

Callers are responsible for the freeing.

Many thanks to Dmitry for the report and diagnostic.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agonet/ipv6: add sysctl option accept_ra_min_hop_limit
Hangbin Liu [Thu, 30 Jul 2015 06:28:42 +0000 (14:28 +0800)]
net/ipv6: add sysctl option accept_ra_min_hop_limit

[ Upstream commit 8013d1d7eafb0589ca766db6b74026f76b7f5cb4 ]

Commit 6fd99094de2b ("ipv6: Don't reduce hop limit for an interface")
disabled accept hop limit from RA if it is smaller than the current hop
limit for security stuff. But this behavior kind of break the RFC definition.

RFC 4861, 6.3.4.  Processing Received Router Advertisements
   A Router Advertisement field (e.g., Cur Hop Limit, Reachable Time,
   and Retrans Timer) may contain a value denoting that it is
   unspecified.  In such cases, the parameter should be ignored and the
   host should continue using whatever value it is already using.

   If the received Cur Hop Limit value is non-zero, the host SHOULD set
   its CurHopLimit variable to the received value.

So add sysctl option accept_ra_min_hop_limit to let user choose the minimum
hop limit value they can accept from RA. And set default to 1 to meet RFC
standards.

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: YOSHIFUJI Hideaki <hideaki.yoshifuji@miraclelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2:
 - Adjust filename, context
 - Number DEVCONF enumerators explicitly to match upstream]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoipv6/udp: use sticky pktinfo egress ifindex on connect()
Paolo Abeni [Fri, 29 Jan 2016 11:30:20 +0000 (12:30 +0100)]
ipv6/udp: use sticky pktinfo egress ifindex on connect()

[ Upstream commit 1cdda91871470f15e79375991bd2eddc6e86ddb1 ]

Currently, the egress interface index specified via IPV6_PKTINFO
is ignored by __ip6_datagram_connect(), so that RFC 3542 section 6.7
can be subverted when the user space application calls connect()
before sendmsg().
Fix it by initializing properly flowi6_oif in connect() before
performing the route lookup.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agonet: dp83640: Fix tx timestamp overflow handling.
Manfred Rudigier [Wed, 20 Jan 2016 10:22:28 +0000 (11:22 +0100)]
net: dp83640: Fix tx timestamp overflow handling.

[ Upstream commit 81e8f2e930fe76b9814c71b9d87c30760b5eb705 ]

PHY status frames are not reliable, the PHY may not be able to send them
during heavy receive traffic. This overflow condition is signaled by the
PHY in the next status frame, but the driver did not make use of it.
Instead it always reported wrong tx timestamps to user space after an
overflow happened because it assigned newly received tx timestamps to old
packets in the queue.

This commit fixes this issue by clearing the tx timestamp queue every time
an overflow happens, so that no timestamps are delivered for overflow
packets. This way time stamping will continue correctly after an overflow.

Signed-off-by: Manfred Rudigier <manfred.rudigier@omicron.at>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoaf_iucv: Validate socket address length in iucv_sock_bind()
Ursula Braun [Tue, 19 Jan 2016 09:41:33 +0000 (10:41 +0100)]
af_iucv: Validate socket address length in iucv_sock_bind()

[ Upstream commit 52a82e23b9f2a9e1d429c5207f8575784290d008 ]

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Evgeny Cherkashin <Eugene.Crosser@ru.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoipv6: update skb->csum when CE mark is propagated
Eric Dumazet [Fri, 15 Jan 2016 12:56:56 +0000 (04:56 -0800)]
ipv6: update skb->csum when CE mark is propagated

[ Upstream commit 34ae6a1aa0540f0f781dd265366036355fdc8930 ]

When a tunnel decapsulates the outer header, it has to comply
with RFC 6080 and eventually propagate CE mark into inner header.

It turns out IP6_ECN_set_ce() does not correctly update skb->csum
for CHECKSUM_COMPLETE packets, triggering infamous "hw csum failure"
messages and stack traces.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2:
 - Adjust context
 - Add skb argument to other callers of IP6_ECN_set_ce()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agophonet: properly unshare skbs in phonet_rcv()
Eric Dumazet [Tue, 12 Jan 2016 16:58:00 +0000 (08:58 -0800)]
phonet: properly unshare skbs in phonet_rcv()

[ Upstream commit 7aaed57c5c2890634cfadf725173c7c68ea4cb4f ]

Ivaylo Dimitrov reported a regression caused by commit 7866a621043f
("dev: add per net_device packet type chains").

skb->dev becomes NULL and we crash in __netif_receive_skb_core().

Before above commit, different kind of bugs or corruptions could happen
without major crash.

But the root cause is that phonet_rcv() can queue skb without checking
if skb is shared or not.

Many thanks to Ivaylo Dimitrov for his help, diagnosis and tests.

Reported-by: Ivaylo Dimitrov <ivo.g.dimitrov.75@gmail.com>
Tested-by: Ivaylo Dimitrov <ivo.g.dimitrov.75@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Remi Denis-Courmont <courmisch@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agotcp_yeah: don't set ssthresh below 2
Neal Cardwell [Mon, 11 Jan 2016 18:42:43 +0000 (13:42 -0500)]
tcp_yeah: don't set ssthresh below 2

[ Upstream commit 83d15e70c4d8909d722c0d64747d8fb42e38a48f ]

For tcp_yeah, use an ssthresh floor of 2, the same floor used by Reno
and CUBIC, per RFC 5681 (equation 4).

tcp_yeah_ssthresh() was sometimes returning a 0 or negative ssthresh
value if the intended reduction is as big or bigger than the current
cwnd. Congestion control modules should never return a zero or
negative ssthresh. A zero ssthresh generally results in a zero cwnd,
causing the connection to stall. A negative ssthresh value will be
interpreted as a u32 and will set a target cwnd for PRR near 4
billion.

Oleksandr Natalenko reported that a system using tcp_yeah with ECN
could see a warning about a prior_cwnd of 0 in
tcp_cwnd_reduction(). Testing verified that this was due to
tcp_yeah_ssthresh() misbehaving in this way.

Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agobridge: Only call /sbin/bridge-stp for the initial network namespace
Hannes Frederic Sowa [Tue, 5 Jan 2016 09:46:00 +0000 (10:46 +0100)]
bridge: Only call /sbin/bridge-stp for the initial network namespace

[ Upstream commit ff62198553e43cdffa9d539f6165d3e83f8a42bc ]

[I stole this patch from Eric Biederman. He wrote:]

> There is no defined mechanism to pass network namespace information
> into /sbin/bridge-stp therefore don't even try to invoke it except
> for bridge devices in the initial network namespace.
>
> It is possible for unprivileged users to cause /sbin/bridge-stp to be
> invoked for any network device name which if /sbin/bridge-stp does not
> guard against unreasonable arguments or being invoked twice on the
> same network device could cause problems.

[Hannes: changed patch using netns_eq]

Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoconnector: bump skb->users before callback invocation
Florian Westphal [Thu, 31 Dec 2015 13:26:33 +0000 (14:26 +0100)]
connector: bump skb->users before callback invocation

[ Upstream commit 55285bf09427c5abf43ee1d54e892f352092b1f1 ]

Dmitry reports memleak with syskaller program.
Problem is that connector bumps skb usecount but might not invoke callback.

So move skb_get to where we invoke the callback.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agosctp: sctp should release assoc when sctp_make_abort_user return NULL in sctp_close
Xin Long [Tue, 29 Dec 2015 09:49:25 +0000 (17:49 +0800)]
sctp: sctp should release assoc when sctp_make_abort_user return NULL in sctp_close

[ Upstream commit 068d8bd338e855286aea54e70d1c101569284b21 ]

In sctp_close, sctp_make_abort_user may return NULL because of memory
allocation failure. If this happens, it will bypass any state change
and never free the assoc. The assoc has no chance to be freed and it
will be kept in memory with the state it had even after the socket is
closed by sctp_close().

So if sctp_make_abort_user fails to allocate memory, we should abort
the asoc via sctp_primitive_ABORT as well. Just like the annotation in
sctp_sf_cookie_wait_prm_abort and sctp_sf_do_9_1_prm_abort said,
"Even if we can't send the ABORT due to low memory delete the TCB.
This is a departure from our typical NOMEM handling".

But then the chunk is NULL (low memory) and the SCTP_CMD_REPLY cmd would
dereference the chunk pointer, and system crash. So we should add
SCTP_CMD_REPLY cmd only when the chunk is not NULL, just like other
places where it adds SCTP_CMD_REPLY cmd.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoext4: fix NULL pointer dereference in ext4_mark_inode_dirty()
Eryu Guan [Sun, 13 Mar 2016 02:40:32 +0000 (21:40 -0500)]
ext4: fix NULL pointer dereference in ext4_mark_inode_dirty()

commit 5e1021f2b6dff1a86a468a1424d59faae2bc63c1 upstream.

ext4_reserve_inode_write() in ext4_mark_inode_dirty() could fail on
error (e.g. EIO) and iloc.bh can be NULL in this case. But the error is
ignored in the following "if" condition and ext4_expand_extra_isize()
might be called with NULL iloc.bh set, which triggers NULL pointer
dereference.

This is uncovered by commit 8b4953e13f4c ("ext4: reserve code points for
the project quota feature"), which enlarges the ext4_inode size, and
run the following script on new kernel but with old mke2fs:

  #/bin/bash
  mnt=/mnt/ext4
  devname=ext4-error
  dev=/dev/mapper/$devname
  fsimg=/home/fs.img

  trap cleanup 0 1 2 3 9 15

  cleanup()
  {
          umount $mnt >/dev/null 2>&1
          dmsetup remove $devname
          losetup -d $backend_dev
          rm -f $fsimg
          exit 0
  }

  rm -f $fsimg
  fallocate -l 1g $fsimg
  backend_dev=`losetup -f --show $fsimg`
  devsize=`blockdev --getsz $backend_dev`

  good_tab="0 $devsize linear $backend_dev 0"
  error_tab="0 $devsize error $backend_dev 0"

  dmsetup create $devname --table "$good_tab"

  mkfs -t ext4 $dev
  mount -t ext4 -o errors=continue,strictatime $dev $mnt

  dmsetup load $devname --table "$error_tab" && dmsetup resume $devname
  echo 3 > /proc/sys/vm/drop_caches
  ls -l $mnt
  exit 0

[ Patch changed to simplify the function a tiny bit. -- Ted ]

Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoipv4: Don't do expensive useless work during inetdev destroy.
David S. Miller [Mon, 14 Mar 2016 03:28:00 +0000 (23:28 -0400)]
ipv4: Don't do expensive useless work during inetdev destroy.

commit fbd40ea0180a2d328c5adc61414dc8bab9335ce2 upstream.

When an inetdev is destroyed, every address assigned to the interface
is removed.  And in this scenerio we do two pointless things which can
be very expensive if the number of assigned interfaces is large:

1) Address promotion.  We are deleting all addresses, so there is no
   point in doing this.

2) A full nf conntrack table purge for every address.  We only need to
   do this once, as is already caught by the existing
   masq_dev_notifier so masq_inet_event() can skip this.

Reported-by: Solar Designer <solar@openwall.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tested-by: Cyrill Gorcunov <gorcunov@openvz.org>
[bwh: Backported to 3.2: adjust filename, context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoUSB: usbip: fix potential out-of-bounds write
Ignat Korchagin [Thu, 17 Mar 2016 18:00:29 +0000 (18:00 +0000)]
USB: usbip: fix potential out-of-bounds write

commit b348d7dddb6c4fbfc810b7a0626e8ec9e29f7cbb upstream.

Fix potential out-of-bounds write to urb->transfer_buffer
usbip handles network communication directly in the kernel. When receiving a
packet from its peer, usbip code parses headers according to protocol. As
part of this parsing urb->actual_length is filled. Since the input for
urb->actual_length comes from the network, it should be treated as untrusted.
Any entity controlling the network may put any value in the input and the
preallocated urb->transfer_buffer may not be large enough to hold the data.
Thus, the malicious entity is able to write arbitrary data to kernel memory.

Signed-off-by: Ignat Korchagin <ignat.korchagin@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.2: adjust filename, context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agousbnet: cleanup after bind() in probe()
Oliver Neukum [Mon, 7 Mar 2016 10:31:10 +0000 (11:31 +0100)]
usbnet: cleanup after bind() in probe()

commit 1666984c8625b3db19a9abc298931d35ab7bc64b upstream.

In case bind() works, but a later error forces bailing
in probe() in error cases work and a timer may be scheduled.
They must be killed. This fixes an error case related to
the double free reported in
http://www.spinics.net/lists/netdev/msg367669.html
and needs to go on top of Linus' fix to cdc-ncm.

Signed-off-by: Oliver Neukum <ONeukum@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agox86/mm/32: Enable full randomization on i386 and X86_32
Hector Marco-Gisbert [Thu, 10 Mar 2016 19:51:00 +0000 (20:51 +0100)]
x86/mm/32: Enable full randomization on i386 and X86_32

commit 8b8addf891de8a00e4d39fc32f93f7c5eb8feceb upstream.

Currently on i386 and on X86_64 when emulating X86_32 in legacy mode, only
the stack and the executable are randomized but not other mmapped files
(libraries, vDSO, etc.). This patch enables randomization for the
libraries, vDSO and mmap requests on i386 and in X86_32 in legacy mode.

By default on i386 there are 8 bits for the randomization of the libraries,
vDSO and mmaps which only uses 1MB of VA.

This patch preserves the original randomness, using 1MB of VA out of 3GB or
4GB. We think that 1MB out of 3GB is not a big cost for having the ASLR.

The first obvious security benefit is that all objects are randomized (not
only the stack and the executable) in legacy mode which highly increases
the ASLR effectiveness, otherwise the attackers may use these
non-randomized areas. But also sensitive setuid/setgid applications are
more secure because currently, attackers can disable the randomization of
these applications by setting the ulimit stack to "unlimited". This is a
very old and widely known trick to disable the ASLR in i386 which has been
allowed for too long.

Another trick used to disable the ASLR was to set the ADDR_NO_RANDOMIZE
personality flag, but fortunately this doesn't work on setuid/setgid
applications because there is security checks which clear Security-relevant
flags.

This patch always randomizes the mmap_legacy_base address, removing the
possibility to disable the ASLR by setting the stack to "unlimited".

Signed-off-by: Hector Marco-Gisbert <hecmargi@upv.es>
Acked-by: Ismael Ripoll Ripoll <iripoll@upv.es>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: akpm@linux-foundation.org
Cc: kees Cook <keescook@chromium.org>
Link: http://lkml.kernel.org/r/1457639460-5242-1-git-send-email-hecmargi@upv.es
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agox86: standardize mmap_rnd() usage
Kees Cook [Tue, 14 Apr 2015 22:47:45 +0000 (15:47 -0700)]
x86: standardize mmap_rnd() usage

commit 82168140bc4cec7ec9bad39705518541149ff8b7 upstream.

In preparation for splitting out ET_DYN ASLR, this refactors the use of
mmap_rnd() to be used similarly to arm, and extracts the checking of
PF_RANDOMIZE.

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agonetfilter: x_tables: make sure e->next_offset covers remaining blob size
Florian Westphal [Tue, 22 Mar 2016 17:02:50 +0000 (18:02 +0100)]
netfilter: x_tables: make sure e->next_offset covers remaining blob size

commit 6e94e0cfb0887e4013b3b930fa6ab1fe6bb6ba91 upstream.

Otherwise this function may read data beyond the ruleset blob.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agonetfilter: x_tables: validate e->target_offset early
Florian Westphal [Tue, 22 Mar 2016 17:02:49 +0000 (18:02 +0100)]
netfilter: x_tables: validate e->target_offset early

commit bdf533de6968e9686df777dc178486f600c6e617 upstream.

We should check that e->target_offset is sane before
mark_source_chains gets called since it will fetch the target entry
for loop detection.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoparisc: Unbreak handling exceptions from kernel modules
Helge Deller [Fri, 8 Apr 2016 16:32:52 +0000 (18:32 +0200)]
parisc: Unbreak handling exceptions from kernel modules

commit 2ef4dfd9d9f288943e249b78365a69e3ea3ec072 upstream.

Handling exceptions from modules never worked on parisc.
It was just masked by the fact that exceptions from modules
don't happen during normal use.

When a module triggers an exception in get_user() we need to load the
main kernel dp value before accessing the exception_data structure, and
afterwards restore the original dp value of the module on exit.

Noticed-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Helge Deller <deller@gmx.de>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoparisc: Fix kernel crash with reversed copy_from_user()
Helge Deller [Fri, 8 Apr 2016 16:18:48 +0000 (18:18 +0200)]
parisc: Fix kernel crash with reversed copy_from_user()

commit ef72f3110d8b19f4c098a0bff7ed7d11945e70c6 upstream.

The kernel module testcase (lib/test_user_copy.c) exhibited a kernel
crash on parisc if the parameters for copy_from_user were reversed
("illegal reversed copy_to_user" testcase).

Fix this potential crash by checking the fault handler if the faulting
address is in the exception table.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoparisc: Avoid function pointers for kernel exception routines
Helge Deller [Fri, 8 Apr 2016 16:11:33 +0000 (18:11 +0200)]
parisc: Avoid function pointers for kernel exception routines

commit e3893027a300927049efc1572f852201eb785142 upstream.

We want to avoid the kernel module loader to create function pointers
for the kernel fixup routines of get_user() and put_user(). Changing
the external reference from function type to int type fixes this.

This unbreaks exception handling for get_user() and put_user() when
called from a kernel module.

Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoUSB: serial: cp210x: Adding GE Healthcare Device ID
Martyn Welch [Tue, 29 Mar 2016 16:47:29 +0000 (17:47 +0100)]
USB: serial: cp210x: Adding GE Healthcare Device ID

commit cddc9434e3dcc37a85c4412fb8e277d3a582e456 upstream.

The CP2105 is used in the GE Healthcare Remote Alarm Box, with the
Manufacturer ID of 0x1901 and Product ID of 0x0194.

Signed-off-by: Martyn Welch <martyn.welch@collabora.co.uk>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoUSB: serial: ftdi_sio: Add support for ICP DAS I-756xU devices
Josh Boyer [Thu, 10 Mar 2016 14:48:52 +0000 (09:48 -0500)]
USB: serial: ftdi_sio: Add support for ICP DAS I-756xU devices

commit ea6db90e750328068837bed34cb1302b7a177339 upstream.

A Fedora user reports that the ftdi_sio driver works properly for the
ICP DAS I-7561U device.  Further, the user manual for these devices
instructs users to load the driver and add the ids using the sysfs
interface.

Add support for these in the driver directly so that the devices work
out of the box instead of needing manual configuration.

Reported-by: <thesource@mail.ru>
Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoxen/events: Mask a moving irq
Boris Ostrovsky [Fri, 18 Mar 2016 14:11:07 +0000 (10:11 -0400)]
xen/events: Mask a moving irq

commit ff1e22e7a638a0782f54f81a6c9cb139aca2da35 upstream.

Moving an unmasked irq may result in irq handler being invoked on both
source and target CPUs.

With 2-level this can happen as follows:

On source CPU:
        evtchn_2l_handle_events() ->
            generic_handle_irq() ->
                handle_edge_irq() ->
                   eoi_pirq():
                       irq_move_irq(data);

                       /***** WE ARE HERE *****/

                       if (VALID_EVTCHN(evtchn))
                           clear_evtchn(evtchn);

If at this moment target processor is handling an unrelated event in
evtchn_2l_handle_events()'s loop it may pick up our event since target's
cpu_evtchn_mask claims that this event belongs to it *and* the event is
unmasked and still pending. At the same time, source CPU will continue
executing its own handle_edge_irq().

With FIFO interrupt the scenario is similar: irq_move_irq() may result
in a EVTCHNOP_unmask hypercall which, in turn, may make the event
pending on the target CPU.

We can avoid this situation by moving and clearing the event while
keeping event masked.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
[bwh: Backported to 3.2:
 - Adjust filename
 - Add a suitable definition of test_and_set_mask()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoALSA: timer: Use mod_timer() for rearming the system timer
Takashi Iwai [Fri, 1 Apr 2016 10:28:16 +0000 (12:28 +0200)]
ALSA: timer: Use mod_timer() for rearming the system timer

commit 4a07083ed613644c96c34a7dd2853dc5d7c70902 upstream.

ALSA system timer backend stops the timer via del_timer() without sync
and leaves del_timer_sync() at the close instead.  This is because of
the restriction by the design of ALSA timer: namely, the stop callback
may be called from the timer handler, and calling the sync shall lead
to a hangup.  However, this also triggers a kernel BUG() when the
timer is rearmed immediately after stopping without sync:
 kernel BUG at kernel/time/timer.c:966!
 Call Trace:
  <IRQ>
  [<ffffffff8239c94e>] snd_timer_s_start+0x13e/0x1a0
  [<ffffffff8239e1f4>] snd_timer_interrupt+0x504/0xec0
  [<ffffffff8122fca0>] ? debug_check_no_locks_freed+0x290/0x290
  [<ffffffff8239ec64>] snd_timer_s_function+0xb4/0x120
  [<ffffffff81296b72>] call_timer_fn+0x162/0x520
  [<ffffffff81296add>] ? call_timer_fn+0xcd/0x520
  [<ffffffff8239ebb0>] ? snd_timer_interrupt+0xec0/0xec0
  ....

It's the place where add_timer() checks the pending timer.  It's clear
that this may happen after the immediate restart without sync in our
cases.

So, the workaround here is just to use mod_timer() instead of
add_timer().  This looks like a band-aid fix, but it's a right move,
as snd_timer_interrupt() takes care of the continuous rearm of timer.

Reported-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoKVM: x86: Inject pending interrupt even if pending nmi exist
Yuki Shibuya [Thu, 24 Mar 2016 05:17:03 +0000 (05:17 +0000)]
KVM: x86: Inject pending interrupt even if pending nmi exist

commit 321c5658c5e9192dea0d58ab67cf1791e45b2b26 upstream.

Non maskable interrupts (NMI) are preferred to interrupts in current
implementation. If a NMI is pending and NMI is blocked by the result
of nmi_allowed(), pending interrupt is not injected and
enable_irq_window() is not executed, even if interrupts injection is
allowed.

In old kernel (e.g. 2.6.32), schedule() is often called in NMI context.
In this case, interrupts are needed to execute iret that intends end
of NMI. The flag of blocking new NMI is not cleared until the guest
execute the iret, and interrupts are blocked by pending NMI. Due to
this, iret can't be invoked in the guest, and the guest is starved
until block is cleared by some events (e.g. canceling injection).

This patch injects pending interrupts, when it's allowed, even if NMI
is blocked. And, If an interrupts is pending after executing
inject_pending_event(), enable_irq_window() is executed regardless of
NMI pending counter.

Signed-off-by: Yuki Shibuya <shibuya.yk@ncos.nec.co.jp>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[bwh: Backported to 3.2:
 - vcpu_enter_guest() is simpler because inject_pending_event() can't fail
 - Adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agosd: Fix excessive capacity printing on devices with blocks bigger than 512 bytes
Martin K. Petersen [Tue, 29 Mar 2016 01:18:56 +0000 (21:18 -0400)]
sd: Fix excessive capacity printing on devices with blocks bigger than 512 bytes

commit f08bb1e0dbdd0297258d0b8cd4dbfcc057e57b2a upstream.

During revalidate we check whether device capacity has changed before we
decide whether to output disk information or not.

The check for old capacity failed to take into account that we scaled
sdkp->capacity based on the reported logical block size. And therefore
the capacity test would always fail for devices with sectors bigger than
512 bytes and we would print several copies of the same discovery
information.

Avoid scaling sdkp->capacity and instead adjust the value on the fly
when setting the block device capacity and generating fake C/H/S
geometry.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reported-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Hannes Reinicke <hare@suse.de>
Reviewed-by: Ewan Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[bwh: Backported to 3.2:
 - logical_to_sectors() is a new function
 - Adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
5 years agoUSB: digi_acceleport: do sanity checking for the number of ports
Oliver Neukum [Thu, 31 Mar 2016 16:04:26 +0000 (12:04 -0400)]
USB: digi_acceleport: do sanity checking for the number of ports

commit 5a07975ad0a36708c6b0a5b9fea1ff811d0b0c1f upstream.

The driver can be crashed with devices that expose crafted descriptors
with too few endpoints.

See: http://seclists.org/bugtraq/2016/Mar/61

Signed-off-by: Oliver Neukum <ONeukum@suse.com>
[johan: fix OOB endpoint check and add error messages ]
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>