7 years agosparc64: Fix userspace FPU register corruptions.
David S. Miller [Fri, 7 Aug 2015 02:13:25 +0000 (19:13 -0700)]
sparc64: Fix userspace FPU register corruptions.

commit 44922150d87cef616fd183220d43d8fde4d41390 upstream.

If we have a series of events from userpsace, with %fprs=FPRS_FEF,
like follows:

RTRAP (kernel FPU restore with fpu_saved=0x4)

We will not restore the user registers that were clobbered by the FPU
using kernel code in the inner-most trap.

Traps allocate FPU save slots in the thread struct, and FPU using
sequences save the "dirty" FPU registers only.

This works at the initial trap level because all of the registers
get recorded into the top-level FPU save area, and we'll return
to userspace with the FPU disabled so that any FPU use by the user
will take an FPU disabled trap wherein we'll load the registers
back up properly.

But this is not how trap returns from kernel to kernel operate.

The simplest fix for this bug is to always save all FPU register state
for anything other than the top-most FPU save area.

Getting rid of the optimized inner-slot FPU saving code ends up
making VISEntryHalf degenerate into plain VISEntry.

Longer term we need to do something smarter to reinstate the partial
save optimizations.  Perhaps the fundament error is having trap entry
and exit allocate FPU save slots and restore register state.  Instead,
the VISEntry et al. calls should be doing that work.

This bug is about two decades old.

Reported-by: James Y Knight <>
Signed-off-by: David S. Miller <>
[bwh: Backported to 3.2:
 - Adjust context
 - Drop changes to NG4memcpy.S and ksyms.c]
Signed-off-by: Ben Hutchings <>
7 years agosctp: donot reset the overall_error_count in SHUTDOWN_RECEIVE state
lucien [Wed, 26 Aug 2015 20:52:20 +0000 (04:52 +0800)]
sctp: donot reset the overall_error_count in SHUTDOWN_RECEIVE state

commit f648f807f61e64d247d26611e34cc97e4ed03401 upstream.

Commit f8d960524328 ("sctp: Enforce retransmission limit during shutdown")
fixed a problem with excessive retransmissions in the SHUTDOWN_PENDING by not
resetting the association overall_error_count.  This allowed the association
to better enforce assoc.max_retrans limit.

However, the same issue still exists when the association is in SHUTDOWN_RECEIVED
state.  In this state, HB-ACKs will continue to reset the overall_error_count
for the association would extend the lifetime of association unnecessarily.

This patch solves this by resetting the overall_error_count whenever the current
state is small then SCTP_STATE_SHUTDOWN_PENDING.  As a small side-effect, we
states, but they are not really impacted because we disable Heartbeats in those

Fixes: Commit f8d960524328 ("sctp: Enforce retransmission limit during shutdown")
Signed-off-by: Xin Long <>
Acked-by: Marcelo Ricardo Leitner <>
Acked-by: Vlad Yasevich <>
Signed-off-by: David S. Miller <>
Signed-off-by: Ben Hutchings <>
7 years agonet: Fix RCU splat in af_key
David Ahern [Mon, 24 Aug 2015 21:17:17 +0000 (15:17 -0600)]
net: Fix RCU splat in af_key

commit ba51b6be38c122f7dab40965b4397aaf6188a464 upstream.

Hit the following splat testing VRF change for ipsec:

[  113.475692] ===============================
[  113.476194] [ INFO: suspicious RCU usage. ]
[  113.476667] 4.2.0-rc6-1+deb7u2+clUNRELEASED #3.2.65-1+deb7u2+clUNRELEASED Not tainted
[  113.477545] -------------------------------
[  113.478013] /work/monster-14/dsa/kernel.git/include/linux/rcupdate.h:568 Illegal context switch in RCU read-side critical section!
[  113.479288]
[  113.479288] other info that might help us debug this:
[  113.479288]
[  113.480207]
[  113.480207] rcu_scheduler_active = 1, debug_locks = 1
[  113.480931] 2 locks held by setkey/6829:
[  113.481371]  #0:  (&net->xfrm.xfrm_cfg_mutex){+.+.+.}, at: [<ffffffff814e9887>] pfkey_sendmsg+0xfb/0x213
[  113.482509]  #1:  (rcu_read_lock){......}, at: [<ffffffff814e767f>] rcu_read_lock+0x0/0x6e
[  113.483509]
[  113.483509] stack backtrace:
[  113.484041] CPU: 0 PID: 6829 Comm: setkey Not tainted 4.2.0-rc6-1+deb7u2+clUNRELEASED #3.2.65-1+deb7u2+clUNRELEASED
[  113.485422] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 04/01/2014
[  113.486845]  0000000000000001 ffff88001d4c7a98 ffffffff81518af2 ffffffff81086962
[  113.487732]  ffff88001d538480 ffff88001d4c7ac8 ffffffff8107ae75 ffffffff8180a154
[  113.488628]  0000000000000b30 0000000000000000 00000000000000d0 ffff88001d4c7ad8
[  113.489525] Call Trace:
[  113.489813]  [<ffffffff81518af2>] dump_stack+0x4c/0x65
[  113.490389]  [<ffffffff81086962>] ? console_unlock+0x3d6/0x405
[  113.491039]  [<ffffffff8107ae75>] lockdep_rcu_suspicious+0xfa/0x103
[  113.491735]  [<ffffffff81064032>] rcu_preempt_sleep_check+0x45/0x47
[  113.492442]  [<ffffffff8106404d>] ___might_sleep+0x19/0x1c8
[  113.493077]  [<ffffffff81064268>] __might_sleep+0x6c/0x82
[  113.493681]  [<ffffffff81133190>] cache_alloc_debugcheck_before.isra.50+0x1d/0x24
[  113.494508]  [<ffffffff81134876>] kmem_cache_alloc+0x31/0x18f
[  113.495149]  [<ffffffff814012b5>] skb_clone+0x64/0x80
[  113.495712]  [<ffffffff814e6f71>] pfkey_broadcast_one+0x3d/0xff
[  113.496380]  [<ffffffff814e7b84>] pfkey_broadcast+0xb5/0x11e
[  113.497024]  [<ffffffff814e82d1>] pfkey_register+0x191/0x1b1
[  113.497653]  [<ffffffff814e9770>] pfkey_process+0x162/0x17e
[  113.498274]  [<ffffffff814e9895>] pfkey_sendmsg+0x109/0x213

In pfkey_sendmsg the net mutex is taken and then pfkey_broadcast takes
the RCU lock.

Since pfkey_broadcast takes the RCU lock the allocation argument is
pointless since GFP_ATOMIC must be used between the rcu_read_{,un}lock.
The one call outside of rcu can be done with GFP_KERNEL.

Fixes: 7f6b9dbd5afbd ("af_key: locking change")
Signed-off-by: David Ahern <>
Acked-by: Eric Dumazet <>
Signed-off-by: David S. Miller <>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <>
7 years agox86/ldt: Further fix FPU emulation
Andy Lutomirski [Fri, 14 Aug 2015 22:02:55 +0000 (15:02 -0700)]
x86/ldt: Further fix FPU emulation

commit 12e244f4b550498bbaf654a52f93633f7dde2dc7 upstream.

The previous fix confused a selector with a segment prefix.  Fix it.

Compile-tested only.

Cc: Juergen Gross <>
Reported-by: Linus Torvalds <>
Fixes: 4809146b86c3 ("x86/ldt: Correct FPU emulation access to LDT")
Signed-off-by: Andy Lutomirski <>
Signed-off-by: Linus Torvalds <>
Signed-off-by: Ben Hutchings <>
7 years agoipc,sem: fix use after free on IPC_RMID after a task using same semaphore set exits
Herton R. Krzesinski [Fri, 14 Aug 2015 22:35:02 +0000 (15:35 -0700)]
ipc,sem: fix use after free on IPC_RMID after a task using same semaphore set exits

commit 602b8593d2b4138c10e922eeaafe306f6b51817b upstream.

The current semaphore code allows a potential use after free: in
exit_sem we may free the task's sem_undo_list while there is still
another task looping through the same semaphore set and cleaning the
sem_undo list at freeary function (the task called IPC_RMID for the same
semaphore set).

For example, with a test program [1] running which keeps forking a lot
of processes (which then do a semop call with SEM_UNDO flag), and with
the parent right after removing the semaphore set with IPC_RMID, and a
kernel built with CONFIG_SLAB, CONFIG_SLAB_DEBUG and
CONFIG_DEBUG_SPINLOCK, you can easily see something like the following
in the kernel log:

   Slab corruption (Not tainted): kmalloc-64 start=ffff88003b45c1c0, len=64
   000: 6b 6b 6b 6b 6b 6b 6b 6b 00 6b 6b 6b 6b 6b 6b 6b  kkkkkkkk.kkkkkkk
   010: ff ff ff ff 6b 6b 6b 6b ff ff ff ff ff ff ff ff  ....kkkk........
   Prev obj: start=ffff88003b45c180, len=64
   000: 00 00 00 00 ad 4e ad de ff ff ff ff 5a 5a 5a 5a  .....N......ZZZZ
   010: ff ff ff ff ff ff ff ff c0 fb 01 37 00 88 ff ff  ...........7....
   Next obj: start=ffff88003b45c200, len=64
   000: 00 00 00 00 ad 4e ad de ff ff ff ff 5a 5a 5a 5a  .....N......ZZZZ
   010: ff ff ff ff ff ff ff ff 68 29 a7 3c 00 88 ff ff  ........h).<....
   BUG: spinlock wrong CPU on CPU#2, test/18028
   general protection fault: 0000 [#1] SMP
   Modules linked in: 8021q mrp garp stp llc nf_conntrack_ipv4 nf_defrag_ipv4 ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables binfmt_misc ppdev input_leds joydev parport_pc parport floppy serio_raw virtio_balloon virtio_rng virtio_console virtio_net iosf_mbi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcspkr qxl ttm drm_kms_helper drm snd_hda_codec_generic i2c_piix4 snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore crc32c_intel virtio_pci virtio_ring virtio pata_acpi ata_generic [last unloaded: speedstep_lib]
   CPU: 2 PID: 18028 Comm: test Not tainted 4.2.0-rc5+ #1
   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014
   RIP: spin_dump+0x53/0xc0
   Call Trace:
     ? _raw_spin_lock+0xe/0x10
     ? __do_page_fault+0x19a/0x430
     ? __audit_syscall_entry+0xa8/0x100
     ? syscall_trace_leave+0xde/0x130
   Code: 8b 80 88 03 00 00 48 8d 88 60 05 00 00 48 c7 c7 a0 2c a4 81 31 c0 65 8b 15 eb 40 f3 7e e8 08 31 68 00 4d 85 e4 44 8b 4b 08 74 5e <45> 8b 84 24 88 03 00 00 49 8d 8c 24 60 05 00 00 8b 53 04 48 89
   RIP  [<ffffffff810d6053>] spin_dump+0x53/0xc0
    RSP <ffff88003750fd68>
   ---[ end trace 783ebb76612867a0 ]---
   NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [test:18053]
   Modules linked in: 8021q mrp garp stp llc nf_conntrack_ipv4 nf_defrag_ipv4 ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables binfmt_misc ppdev input_leds joydev parport_pc parport floppy serio_raw virtio_balloon virtio_rng virtio_console virtio_net iosf_mbi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcspkr qxl ttm drm_kms_helper drm snd_hda_codec_generic i2c_piix4 snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore crc32c_intel virtio_pci virtio_ring virtio pata_acpi ata_generic [last unloaded: speedstep_lib]
   CPU: 3 PID: 18053 Comm: test Tainted: G      D         4.2.0-rc5+ #1
   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014
   RIP: native_read_tsc+0x0/0x20
   Call Trace:
     ? delay_tsc+0x40/0x70
     ? handle_mm_fault+0xbf6/0x1880
     ? dequeue_task_fair+0x79/0x4a0
     ? __do_page_fault+0x19a/0x430
     ? kfree_debugcheck+0x16/0x40
     ? __do_page_fault+0x19a/0x430
     ? __audit_syscall_entry+0xa8/0x100
     ? do_audit_syscall_entry+0x66/0x70
     ? syscall_trace_enter_phase1+0x139/0x160
   Code: 47 10 83 e8 01 85 c0 89 47 10 75 08 65 48 89 3d 1f 74 ff 7e c9 c3 0f 1f 44 00 00 55 48 89 e5 e8 87 17 04 00 66 90 c9 c3 0f 1f 00 <55> 48 89 e5 0f 31 89 c1 48 89 d0 48 c1 e0 20 89 c9 48 09 c8 c9
   Kernel panic - not syncing: softlockup: hung tasks

I wasn't able to trigger any badness on a recent kernel without the
proper config debugs enabled, however I have softlockup reports on some
kernel versions, in the semaphore code, which are similar as above (the
scenario is seen on some servers running IBM DB2 which uses semaphore

The patch here fixes the race against freeary, by acquiring or waiting
on the sem_undo_list lock as necessary (exit_sem can race with freeary,
while freeary sets un->semid to -1 and removes the same sem_undo from
list_proc or when it removes the last sem_undo).

After the patch I'm unable to reproduce the problem using the test case

[1] Test case used below:

    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/ipc.h>
    #include <sys/sem.h>
    #include <sys/wait.h>
    #include <stdlib.h>
    #include <time.h>
    #include <unistd.h>
    #include <errno.h>

    #define NSEM 1
    #define NSET 5

    int sid[NSET];

    void thread()
            struct sembuf op;
            int s;
            uid_t pid = getuid();

            s = rand() % NSET;
            op.sem_num = pid % NSEM;
            op.sem_op = 1;
            op.sem_flg = SEM_UNDO;

            semop(sid[s], &op, 1);

    void create_set()
            int i, j;
            pid_t p;
            union {
                    int val;
                    struct semid_ds *buf;
                    unsigned short int *array;
                    struct seminfo *__buf;
            } un;

            /* Create and initialize semaphore set */
            for (i = 0; i < NSET; i++) {
                    sid[i] = semget(IPC_PRIVATE , NSEM, 0644 | IPC_CREAT);
                    if (sid[i] < 0) {
            un.val = 0;
            for (i = 0; i < NSET; i++) {
                    for (j = 0; j < NSEM; j++) {
                            if (semctl(sid[i], j, SETVAL, un) < 0)

            /* Launch threads that operate on semaphore set */
            for (i = 0; i < NSEM * NSET * NSET; i++) {
                    p = fork();
                    if (p < 0)
                    if (p == 0)

            /* Free semaphore set */
            for (i = 0; i < NSET; i++) {
                    if (semctl(sid[i], NSEM, IPC_RMID))

            /* Wait for forked processes to exit */
            while (wait(NULL)) {
                    if (errno == ECHILD)

    int main(int argc, char **argv)
            pid_t p;


            while (1) {
                    p = fork();
                    if (p < 0) {
                    if (p == 0) {
                            goto end;

                    /* Wait for forked processes to exit */
                    while (wait(NULL)) {
                            if (errno == ECHILD)
            return 0;

[ use normal comment layout]
Signed-off-by: Herton R. Krzesinski <>
Acked-by: Manfred Spraul <>
Cc: Davidlohr Bueso <>
Cc: Rafael Aquini <>
CC: Aristeu Rozanski <>
Cc: David Jeffery <>
Signed-off-by: Andrew Morton <>
Signed-off-by: Linus Torvalds <>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <>
7 years agolibfc: Fix fc_fcp_cleanup_each_cmd()
Bart Van Assche [Fri, 5 Jun 2015 21:20:51 +0000 (14:20 -0700)]
libfc: Fix fc_fcp_cleanup_each_cmd()

commit 8f2777f53e3d5ad8ef2a176a4463a5c8e1a16431 upstream.

Since fc_fcp_cleanup_cmd() can sleep this function must not
be called while holding a spinlock. This patch avoids that
fc_fcp_cleanup_each_cmd() triggers the following bug:

BUG: scheduling while atomic: sg_reset/1512/0x00000202
1 lock held by sg_reset/1512:
 #0:  (&(&fsp->scsi_pkt_lock)->rlock){+.-...}, at: [<ffffffffc0225cd5>] fc_fcp_cleanup_each_cmd.isra.21+0xa5/0x150 [libfc]
Preemption disabled at:[<ffffffffc0225cd5>] fc_fcp_cleanup_each_cmd.isra.21+0xa5/0x150 [libfc]
Call Trace:
 [<ffffffff816c612c>] dump_stack+0x4f/0x7b
 [<ffffffff810828bc>] __schedule_bug+0x6c/0xd0
 [<ffffffff816c87aa>] __schedule+0x71a/0xa10
 [<ffffffff816c8ad2>] schedule+0x32/0x80
 [<ffffffffc0217eac>] fc_seq_set_resp+0xac/0x100 [libfc]
 [<ffffffffc0218b11>] fc_exch_done+0x41/0x60 [libfc]
 [<ffffffffc0225cff>] fc_fcp_cleanup_each_cmd.isra.21+0xcf/0x150 [libfc]
 [<ffffffffc0225f43>] fc_eh_device_reset+0x1c3/0x270 [libfc]
 [<ffffffff814a2cc9>] scsi_try_bus_device_reset+0x29/0x60
 [<ffffffff814a3908>] scsi_ioctl_reset+0x258/0x2d0
 [<ffffffff814a2650>] scsi_ioctl+0x150/0x440
 [<ffffffff814b3a9d>] sd_ioctl+0xad/0x120
 [<ffffffff8132f266>] blkdev_ioctl+0x1b6/0x810
 [<ffffffff811da608>] block_ioctl+0x38/0x40
 [<ffffffff811b4e08>] do_vfs_ioctl+0x2f8/0x530
 [<ffffffff811b50c1>] SyS_ioctl+0x81/0xa0
 [<ffffffff816cf8b2>] system_call_fastpath+0x16/0x7a

Signed-off-by: Bart Van Assche <>
Signed-off-by: Vasu Dev <>
Signed-off-by: James Bottomley <>
Signed-off-by: Ben Hutchings <>
7 years agolibiscsi: Fix host busy blocking during connection teardown
John Soni Jose [Wed, 24 Jun 2015 01:11:58 +0000 (06:41 +0530)]
libiscsi: Fix host busy blocking during connection teardown

commit 660d0831d1494a6837b2f810d08b5be092c1f31d upstream.

In case of hw iscsi offload, an host can have N-number of active
connections. There can be IO's running on some connections which
make host->host_busy always TRUE. Now if logout from a connection
is tried then the code gets into an infinite loop as host->host_busy
is always TRUE.

     * Block until all in-progress commands for this connection
     * time out or fail.
     for (;;) {
      spin_lock_irqsave(session->host->host_lock, flags);
      if (!atomic_read(&session->host->host_busy)) { /* OK for ERL == 0 */
      spin_unlock_irqrestore(session->host->host_lock, flags);
     spin_unlock_irqrestore(session->host->host_lock, flags);
     iscsi_conn_printk(KERN_INFO, conn, "iscsi conn_destroy(): "
                 "host_busy %d host_failed %d\n",


This is not an issue with software-iscsi/iser as each cxn is a separate

Acquiring eh_mutex in iscsi_conn_teardown() before setting
session->state = ISCSI_STATE_TERMINATE.

Signed-off-by: John Soni Jose <>
Reviewed-by: Mike Christie <>
Reviewed-by: Chris Leech <>
Signed-off-by: James Bottomley <>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <>
7 years agodm btree: add ref counting ops for the leaves of top level btrees
Joe Thornber [Wed, 12 Aug 2015 14:12:09 +0000 (15:12 +0100)]
dm btree: add ref counting ops for the leaves of top level btrees

commit b0dc3c8bc157c60b1d470163882be8c13e1950af upstream.

When using nested btrees, the top leaves of the top levels contain
block addresses for the root of the next tree down.  If we shadow a
shared leaf node the leaf values (sub tree roots) should be incremented

This is only an issue if there is metadata sharing in the top levels.
Which only occurs if metadata snapshots are being used (as is possible
with dm-thinp).  And could result in a block from the thinp metadata
snap being reused early, thus corrupting the thinp metadata snap.

Signed-off-by: Joe Thornber <>
Signed-off-by: Mike Snitzer <>
[bwh: Backported to 3.2:
 - Drop change to remove_one()
 - Remove const pointer qualifications]
Signed-off-by: Ben Hutchings <>
7 years agolocalmodconfig: Use Kbuild files too
Richard Weinberger [Sun, 26 Jul 2015 22:06:55 +0000 (00:06 +0200)]
localmodconfig: Use Kbuild files too

commit c0ddc8c745b7f89c50385fd7aa03c78dc543fa7a upstream.

In kbuild it is allowed to define objects in files named "Makefile"
and "Kbuild".
Currently localmodconfig reads objects only from "Makefile"s and misses
modules like nouveau.

Reported-and-tested-by: Leonidas Spyropoulos <>
Signed-off-by: Richard Weinberger <>
Signed-off-by: Steven Rostedt <>
Signed-off-by: Ben Hutchings <>
7 years agox86/ldt: Correct FPU emulation access to LDT
Juergen Gross [Thu, 6 Aug 2015 17:54:34 +0000 (19:54 +0200)]
x86/ldt: Correct FPU emulation access to LDT

commit 4809146b86c3d41ce588fdb767d021e2a80600dd upstream.

Commit 37868fe113ff ("x86/ldt: Make modify_ldt synchronous")
introduced a new struct ldt_struct anchored at mm->context.ldt.

Adapt the x86 fpu emulation code to use that new structure.

Signed-off-by: Juergen Gross <>
Reviewed-by: Andy Lutomirski <>
Cc: Linus Torvalds <>
Cc: Peter Zijlstra <>
Cc: Thomas Gleixner <>
Signed-off-by: Ingo Molnar <>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <>
7 years agox86/ldt: Correct LDT access in single stepping logic
Juergen Gross [Thu, 6 Aug 2015 08:04:38 +0000 (10:04 +0200)]
x86/ldt: Correct LDT access in single stepping logic

commit 136d9d83c07c5e30ac49fc83b27e8c4842f108fc upstream.

Commit 37868fe113ff ("x86/ldt: Make modify_ldt synchronous")
introduced a new struct ldt_struct anchored at mm->context.ldt.

convert_ip_to_linear() was changed to reflect this, but indexing
into the ldt has to be changed as the pointer is no longer void *.

Signed-off-by: Juergen Gross <>
Reviewed-by: Andy Lutomirski <>
Cc: Linus Torvalds <>
Cc: Peter Zijlstra <>
Cc: Thomas Gleixner <>
Signed-off-by: Ingo Molnar <>
Signed-off-by: Ben Hutchings <>
7 years agox86/ldt: Make modify_ldt synchronous
Andy Lutomirski [Thu, 30 Jul 2015 21:31:32 +0000 (14:31 -0700)]
x86/ldt: Make modify_ldt synchronous

commit 37868fe113ff2ba814b3b4eb12df214df555f8dc upstream.

modify_ldt() has questionable locking and does not synchronize
threads.  Improve it: redesign the locking and synchronize all
threads' LDTs using an IPI on all modifications.

This will dramatically slow down modify_ldt in multithreaded
programs, but there shouldn't be any multithreaded programs that
care about modify_ldt's performance in the first place.

This fixes some fallout from the CVE-2015-5157 fixes.

Signed-off-by: Andy Lutomirski <>
Reviewed-by: Borislav Petkov <>
Cc: Andrew Cooper <>
Cc: Andy Lutomirski <>
Cc: Boris Ostrovsky <>
Cc: Borislav Petkov <>
Cc: Brian Gerst <>
Cc: Denys Vlasenko <>
Cc: H. Peter Anvin <>
Cc: Jan Beulich <>
Cc: Konrad Rzeszutek Wilk <>
Cc: Linus Torvalds <>
Cc: Peter Zijlstra <>
Cc: Sasha Levin <>
Cc: Steven Rostedt <>
Cc: Thomas Gleixner <>
Cc: <>
Cc: xen-devel <>
Signed-off-by: Ingo Molnar <>
[bwh: Backported to 3.2:
 - Adjust context
 - Drop comment changes in switch_mm()
 - Drop changes to get_segment_base() in arch/x86/kernel/cpu/perf_event.c
 - Open-code lockless_dereference(), smp_store_release(), on_each_cpu_mask()]
Signed-off-by: Ben Hutchings <>
7 years agonet: Fix skb_set_peeked use-after-free bug
Herbert Xu [Tue, 4 Aug 2015 07:42:47 +0000 (15:42 +0800)]
net: Fix skb_set_peeked use-after-free bug

commit a0a2a6602496a45ae838a96db8b8173794b5d398 upstream.

The commit 738ac1ebb96d02e0d23bc320302a6ea94c612dec ("net: Clone
skb before setting peeked flag") introduced a use-after-free bug
in skb_recv_datagram.  This is because skb_set_peeked may create
a new skb and free the existing one.  As it stands the caller will
continue to use the old freed skb.

This patch fixes it by making skb_set_peeked return the new skb
(or the old one if unchanged).

Fixes: 738ac1ebb96d ("net: Clone skb before setting peeked flag")
Reported-by: Brenden Blanco <>
Signed-off-by: Herbert Xu <>
Tested-by: Brenden Blanco <>
Reviewed-by: Konstantin Khlebnikov <>
Signed-off-by: David S. Miller <>
Signed-off-by: Ben Hutchings <>
7 years agonet: Clone skb before setting peeked flag
Herbert Xu [Mon, 13 Jul 2015 08:04:13 +0000 (16:04 +0800)]
net: Clone skb before setting peeked flag

commit 738ac1ebb96d02e0d23bc320302a6ea94c612dec upstream.

Shared skbs must not be modified and this is crucial for broadcast
and/or multicast paths where we use it as an optimisation to avoid
unnecessary cloning.

The function skb_recv_datagram breaks this rule by setting peeked
without cloning the skb first.  This causes funky races which leads
to double-free.

This patch fixes this by cloning the skb and replacing the skb
in the list when setting skb->peeked.

Fixes: a59322be07c9 ("[UDP]: Only increment counter on first peek/recv")
Reported-by: Konstantin Khlebnikov <>
Signed-off-by: Herbert Xu <>
Signed-off-by: David S. Miller <>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <>
7 years agoocfs2: fix BUG in ocfs2_downconvert_thread_do_work()
Joseph Qi [Thu, 6 Aug 2015 22:46:23 +0000 (15:46 -0700)]
ocfs2: fix BUG in ocfs2_downconvert_thread_do_work()

commit 209f7512d007980fd111a74a064d70a3656079cf upstream.

The "BUG_ON(list_empty(&osb->blocked_lock_list))" in
ocfs2_downconvert_thread_do_work can be triggered in the following case:

ocfs2dc has firstly saved osb->blocked_lock_count to local varibale
processed, and then processes the dentry lockres.  During the dentry
put, it calls iput and then deletes rw, inode and open lockres from
blocked list in ocfs2_mark_lockres_freeing.  And this causes the
variable `processed' to not reflect the number of blocked lockres to be
processed, which triggers the BUG.

Signed-off-by: Joseph Qi <>
Cc: Mark Fasheh <>
Cc: Joel Becker <>
Signed-off-by: Andrew Morton <>
Signed-off-by: Linus Torvalds <>
Signed-off-by: Ben Hutchings <>
7 years agoMIPS: Make set_pte() SMP safe.
David Daney [Tue, 4 Aug 2015 00:48:43 +0000 (17:48 -0700)]
MIPS: Make set_pte() SMP safe.

commit 46011e6ea39235e4aca656673c500eac81a07a17 upstream.

On MIPS the GLOBAL bit of the PTE must have the same value in any
aligned pair of PTEs.  These pairs of PTEs are referred to as
"buddies".  In a SMP system is is possible for two CPUs to be calling
set_pte() on adjacent PTEs at the same time.  There is a race between
setting the PTE and a different CPU setting the GLOBAL bit in its
buddy PTE.

This race can be observed when multiple CPUs are executing
vmap()/vfree() at the same time.

Make setting the buddy PTE's GLOBAL bit an atomic operation to close
the race condition.

The case of CONFIG_64BIT_PHYS_ADDR && CONFIG_CPU_MIPS32 is *not*

Signed-off-by: David Daney <>
Signed-off-by: Ralf Baechle <>
Signed-off-by: Ben Hutchings <>
7 years agoperf: Fix fasync handling on inherited events
Peter Zijlstra [Thu, 11 Jun 2015 08:32:01 +0000 (10:32 +0200)]
perf: Fix fasync handling on inherited events

commit fed66e2cdd4f127a43fd11b8d92a99bdd429528c upstream.

Vince reported that the fasync signal stuff doesn't work proper for
inherited events. So fix that.

Installing fasync allocates memory and sets filp->f_flags |= FASYNC,
which upon the demise of the file descriptor ensures the allocation is
freed and state is updated.

Now for perf, we can have the events stick around for a while after the
original FD is dead because of references from child events. So we
cannot copy the fasync pointer around. We can however consistently use
the parent's fasync, as that will be updated.

Reported-and-Tested-by: Vince Weaver <>
Signed-off-by: Peter Zijlstra (Intel) <>
Cc: Arnaldo Carvalho deMelo <>
Cc: Linus Torvalds <>
Cc: Peter Zijlstra <>
Cc: Thomas Gleixner <>
Signed-off-by: Ingo Molnar <>
Signed-off-by: Ben Hutchings <>
7 years agords: fix an integer overflow test in rds_info_getsockopt()
Dan Carpenter [Sat, 1 Aug 2015 12:33:26 +0000 (15:33 +0300)]
rds: fix an integer overflow test in rds_info_getsockopt()

commit 468b732b6f76b138c0926eadf38ac88467dcd271 upstream.

"len" is a signed integer.  We check that len is not negative, so it
goes from zero to INT_MAX.  PAGE_SIZE is unsigned long so the comparison
is type promoted to unsigned long.  ULONG_MAX - 4095 is a higher than
INT_MAX so the condition can never be true.

I don't know if this is harmful but it seems safe to limit "len" to
INT_MAX - 4095.

Fixes: a8c879a7ee98 ('RDS: Info and stats')
Signed-off-by: Dan Carpenter <>
Signed-off-by: David S. Miller <>
Signed-off-by: Ben Hutchings <>
7 years agoxhci: fix off by one error in TRB DMA address boundary check
Mathias Nyman [Mon, 3 Aug 2015 13:07:48 +0000 (16:07 +0300)]
xhci: fix off by one error in TRB DMA address boundary check

commit 7895086afde2a05fa24a0e410d8e6b75ca7c8fdd upstream.

We need to check that a TRB is part of the current segment
before calculating its DMA address.

Previously a ring segment didn't use a full memory page, and every
new ring segment got a new memory page, so the off by one
error in checking the upper bound was never seen.

Now that we use a full memory page, 256 TRBs (4096 bytes), the off by one
didn't catch the case when a TRB was the first element of the next segment.

This is triggered if the virtual memory pages for a ring segment are
next to each in increasing order where the ring buffer wraps around and
causes errors like:

[  106.398223] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 0 comp_code 1
[  106.398230] xhci_hcd 0000:00:14.0: Looking for event-dma fffd3000 trb-start fffd4fd0 trb-end fffd5000 seg-start fffd4000 seg-end fffd4ff0

The trb-end address is one outside the end-seg address.

Tested-by: Arkadiusz Miśkiewicz <>
Signed-off-by: Mathias Nyman <>
Signed-off-by: Greg Kroah-Hartman <>
Signed-off-by: Ben Hutchings <>
7 years agoMIPS: Fix sched_getaffinity with MT FPAFF enabled
Felix Fietkau [Sat, 18 Jul 2015 22:38:41 +0000 (00:38 +0200)]
MIPS: Fix sched_getaffinity with MT FPAFF enabled

commit 1d62d737555e1378eb62a8bba26644f7d97139d2 upstream.

p->thread.user_cpus_allowed is zero-initialized and is only filled on
the first sched_setaffinity call.

To avoid adding overhead in the task initialization codepath, simply OR
the returned mask in sched_getaffinity with p->cpus_allowed.

Signed-off-by: Felix Fietkau <>
Signed-off-by: Ralf Baechle <>
[bwh: Backported to 3.2: also convert from obsolete cpumask API]
Signed-off-by: Ben Hutchings <>
7 years agotarget: REPORT LUNS should return LUN 0 even for dynamic ACLs
Roland Dreier [Fri, 24 Jul 2015 19:11:46 +0000 (12:11 -0700)]
target: REPORT LUNS should return LUN 0 even for dynamic ACLs

commit 9c395170a559d3b23dad100b01fc4a89d661c698 upstream.

If an initiator doesn't have any real LUNs assigned, we should report
LUN 0 and a LUN list length of 1.  Some versions of Solaris at least
go beserk if we report a LUN list length of 0.

Signed-off-by: Roland Dreier <>
Signed-off-by: Nicholas Bellinger <>
[bwh: Backported to 3.2: adjust filename, context]
Signed-off-by: Ben Hutchings <>
7 years agomd/raid1: extend spinlock to protect raid1_end_read_request against inconsistencies
NeilBrown [Mon, 27 Jul 2015 01:48:52 +0000 (11:48 +1000)]
md/raid1: extend spinlock to protect raid1_end_read_request against inconsistencies

commit 423f04d63cf421ea436bcc5be02543d549ce4b28 upstream.

raid1_end_read_request() assumes that the In_sync bits are consistent
with the ->degaded count.
raid1_spare_active updates the In_sync bit before the ->degraded count
and so exposes an inconsistency, as does error()
So extend the spinlock in raid1_spare_active() and error() to hide those

This should probably be part of
  Commit: 34cab6f42003 ("md/raid1: fix test for 'was read error from
  last working device'.")
as it addresses the same issue.  It fixes the same bug and should go
to -stable for same reasons.

Fixes: 76073054c95b ("md/raid1: clean up read_balance.")
Signed-off-by: NeilBrown <>
Signed-off-by: Ben Hutchings <>
7 years agotarget/iscsi: Fix double free of a TUR followed by a solicited NOPOUT
Alexei Potashnik [Tue, 21 Jul 2015 22:07:56 +0000 (15:07 -0700)]
target/iscsi: Fix double free of a TUR followed by a solicited NOPOUT

commit 9547308bda296b6f69876c840a0291fcfbeddbb8 upstream.

Make sure all non-READ SCSI commands get targ_xfer_tag initialized
to 0xffffffff, not just WRITEs.

Double-free of a TUR cmd object occurs under the following scenario:

1. TUR received (targ_xfer_tag is uninitialized and left at 0)
2. TUR status sent
3. First unsolicited NOPIN is sent to initiator (gets targ_xfer_tag of 0)
4. NOPOUT for NOPIN (with TTT=0) arrives
 - its ExpStatSN acks TUR status, TUR is queued for removal
 - LIO tries to find NOPIN with TTT=0, but finds the same TUR instead,
   TUR is queued for removal for the 2nd time

(Drop unbalanced conditional bracket usage - nab)

Signed-off-by: Alexei Potashnik <>
Signed-off-by: Spencer Baugh <>
Signed-off-by: Nicholas Bellinger <>
[bwh: Backported to 3.2:
 - Adjust context
 - Keep the braces around the if-block]
Signed-off-by: Ben Hutchings <>
7 years agoUSB: sierra: add 1199:68AB device ID
Dirk Behme [Mon, 27 Jul 2015 06:56:05 +0000 (08:56 +0200)]
USB: sierra: add 1199:68AB device ID

commit 74472233233f577eaa0ca6d6e17d9017b6e53150 upstream.

Add support for the Sierra Wireless AR8550 device with
USB descriptor 0x1199, 0x68AB.

It is common with MC879x modules 1199:683c/683d which
also are composite devices with 7 interfaces (0..6)
and also MDM62xx based as the AR8550.

The major difference are only the interface attributes
02/02/01 on interfaces 3 and 4 on the AR8550. They are
vendor specific ff/ff/ff on MC879x modules.

lsusb reports:

Bus 001 Device 004: ID 1199:68ab Sierra Wireless, Inc.
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            0 (Defined at Interface level)
  bDeviceSubClass         0
  bDeviceProtocol         0
  bMaxPacketSize0        64
  idVendor           0x1199 Sierra Wireless, Inc.
  idProduct          0x68ab
  bcdDevice            0.06
  iManufacturer           3 Sierra Wireless, Incorporated
  iProduct                2 AR8550
  iSerial                 0
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength          198
    bNumInterfaces          7
    bConfigurationValue     1
    iConfiguration          1 Sierra Configuration
    bmAttributes         0xe0
      Self Powered
      Remote Wakeup
    MaxPower                0mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           2
      bInterfaceClass       255 Vendor Specific Class
      bInterfaceSubClass    255 Vendor Specific Subclass
      bInterfaceProtocol    255 Vendor Specific Protocol
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval              32
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x01  EP 1 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval              32
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        1
      bAlternateSetting       0
      bNumEndpoints           2
      bInterfaceClass       255 Vendor Specific Class
      bInterfaceSubClass    255 Vendor Specific Subclass
      bInterfaceProtocol    255 Vendor Specific Protocol
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x82  EP 2 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval              32
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x02  EP 2 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval              32
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        2
      bAlternateSetting       0
      bNumEndpoints           2
      bInterfaceClass       255 Vendor Specific Class
      bInterfaceSubClass    255 Vendor Specific Subclass
      bInterfaceProtocol    255 Vendor Specific Protocol
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x83  EP 3 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval              32
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x03  EP 3 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval              32
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        3
      bAlternateSetting       0
      bNumEndpoints           3
      bInterfaceClass         2 Communications
      bInterfaceSubClass      2 Abstract (modem)
      bInterfaceProtocol      1 AT-commands (v.25ter)
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x84  EP 4 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0040  1x 64 bytes
        bInterval               5
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x85  EP 5 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval              32
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x04  EP 4 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval              32
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        4
      bAlternateSetting       0
      bNumEndpoints           3
      bInterfaceClass         2 Communications
      bInterfaceSubClass      2 Abstract (modem)
      bInterfaceProtocol      1 AT-commands (v.25ter)
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x86  EP 6 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0040  1x 64 bytes
        bInterval               5
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x87  EP 7 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval              32
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x05  EP 5 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval              32
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        5
      bAlternateSetting       0
      bNumEndpoints           3
      bInterfaceClass       255 Vendor Specific Class
      bInterfaceSubClass    255 Vendor Specific Subclass
      bInterfaceProtocol    255 Vendor Specific Protocol
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x88  EP 8 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0040  1x 64 bytes
        bInterval               5
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x89  EP 9 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval              32
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x06  EP 6 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval              32
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        6
      bAlternateSetting       0
      bNumEndpoints           3
      bInterfaceClass       255 Vendor Specific Class
      bInterfaceSubClass    255 Vendor Specific Subclass
      bInterfaceProtocol    255 Vendor Specific Protocol
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x8a  EP 10 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0040  1x 64 bytes
        bInterval               5
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x8b  EP 11 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval              32
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x07  EP 7 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval              32
Device Qualifier (for other device speed):
  bLength                10
  bDescriptorType         6
  bcdUSB               2.00
  bDeviceClass            0 (Defined at Interface level)
  bDeviceSubClass         0
  bDeviceProtocol         0
  bMaxPacketSize0        64
  bNumConfigurations      1
Device Status:     0x0001
  Self Powered

Signed-off-by: Dirk Behme <>
Cc: Lars Melin <>
Signed-off-by: Johan Hovold <>
Signed-off-by: Ben Hutchings <>
7 years agocrypto: ixp4xx - Remove bogus BUG_ON on scattered dst buffer
Herbert Xu [Wed, 22 Jul 2015 10:05:35 +0000 (18:05 +0800)]
crypto: ixp4xx - Remove bogus BUG_ON on scattered dst buffer

commit f898c522f0e9ac9f3177d0762b76e2ab2d2cf9c0 upstream.

This patch removes a bogus BUG_ON in the ablkcipher path that
triggers when the destination buffer is different from the source
buffer and is scattered.

Signed-off-by: Herbert Xu <>
Signed-off-by: Ben Hutchings <>
7 years agoxen/gntdevt: Fix race condition in gntdev_release()
Marek Marczykowski-Górecki [Fri, 26 Jun 2015 01:28:24 +0000 (03:28 +0200)]
xen/gntdevt: Fix race condition in gntdev_release()

commit 30b03d05e07467b8c6ec683ea96b5bffcbcd3931 upstream.

While gntdev_release() is called the MMU notifier is still registered
and can traverse priv->maps list even if no pages are mapped (which is
the case -- gntdev_release() is called after all). But
gntdev_release() will clear that list, so make sure that only one of
those things happens at the same time.

Signed-off-by: Marek Marczykowski-Górecki <>
Signed-off-by: David Vrabel <>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <>
7 years agoxen/gntdev: convert priv->lock to a mutex
David Vrabel [Fri, 9 Jan 2015 18:06:12 +0000 (18:06 +0000)]
xen/gntdev: convert priv->lock to a mutex

commit 1401c00e59ea021c575f74612fe2dbba36d6a4ee upstream.

Unmapping may require sleeping and we unmap while holding priv->lock, so
convert it to a mutex.

Signed-off-by: David Vrabel <>
Reviewed-by: Stefano Stabellini <>
[bwh: Backported to 3.2:
 - Adjust context
 - Drop changes to functions we don't have]
Signed-off-by: Ben Hutchings <>
7 years agojbd2: protect all log tail updates with j_checkpoint_mutex
Jan Kara [Tue, 13 Mar 2012 19:43:04 +0000 (15:43 -0400)]
jbd2: protect all log tail updates with j_checkpoint_mutex

commit a78bb11d7acd525623c6a0c2ff4e213d527573fa upstream.

There are some log tail updates that are not protected by j_checkpoint_mutex.
Some of these are harmless because they happen during startup or shutdown but
updates in jbd2_journal_commit_transaction() and jbd2_journal_flush() can
really race with other log tail updates (e.g. someone doing
jbd2_journal_flush() with someone running jbd2_cleanup_journal_tail()). So
protect all log tail updates with j_checkpoint_mutex.

Signed-off-by: Jan Kara <>
Signed-off-by: "Theodore Ts'o" <>
[bwh: Backported to 3.2:
 - Adjust context
 - Add unlock on the error path in jbd2_journal_flush()]
Signed-off-by: Ben Hutchings <>
Cc: Bartosz Kwitniewski <>
7 years agopktgen: Require CONFIG_INET due to use of IPv4 checksum function
Thomas Graf [Mon, 29 Jul 2013 11:44:15 +0000 (13:44 +0200)]
pktgen: Require CONFIG_INET due to use of IPv4 checksum function

commit ffd756b3174e496cf6f3c5458c434e31d2cd48b0 upstream.

Unlike for IPv6, the IPv4 checksum functions are only available
if CONFIG_INET is set.

Reported-by: kbuild test robot <>
Signed-off-by: Thomas Graf <>
Signed-off-by: David S. Miller <>
Signed-off-by: Ben Hutchings <>
7 years agoipv6: Fix build failure when CONFIG_INET disabled
Ben Hutchings [Tue, 18 Aug 2015 18:31:23 +0000 (20:31 +0200)]
ipv6: Fix build failure when CONFIG_INET disabled

output_core.c, added in 3.2.66, is only needed and can only be
compiled when CONFIG_INET is enabled.

The condition in the Makefile is already correct upstream.

Reported-by: kbuild test robot <>
Signed-off-by: Ben Hutchings <>
7 years agoLinux 3.2.71 v3.2.71
Ben Hutchings [Wed, 12 Aug 2015 14:33:24 +0000 (16:33 +0200)]
Linux 3.2.71

7 years agox86/xen: Probe target addresses in set_aliased_prot() before the hypercall
Andy Lutomirski [Thu, 30 Jul 2015 21:31:31 +0000 (14:31 -0700)]
x86/xen: Probe target addresses in set_aliased_prot() before the hypercall

commit aa1acff356bbedfd03b544051f5b371746735d89 upstream.

The update_va_mapping hypercall can fail if the VA isn't present
in the guest's page tables.  Under certain loads, this can
result in an OOPS when the target address is in unpopulated vmap

While we're at it, add comments to help explain what's going on.

This isn't a great long-term fix.  This code should probably be
changed to use something like set_memory_ro.

Signed-off-by: Andy Lutomirski <>
Cc: Andrew Cooper <>
Cc: Andy Lutomirski <>
Cc: Boris Ostrovsky <>
Cc: Borislav Petkov <>
Cc: Brian Gerst <>
Cc: David Vrabel <>
Cc: Denys Vlasenko <>
Cc: H. Peter Anvin <>
Cc: Jan Beulich <>
Cc: Konrad Rzeszutek Wilk <>
Cc: Linus Torvalds <>
Cc: Peter Zijlstra <>
Cc: Sasha Levin <>
Cc: Steven Rostedt <>
Cc: Thomas Gleixner <>
Cc: <>
Cc: xen-devel <>
Signed-off-by: Ingo Molnar <>
Signed-off-by: Ben Hutchings <>
7 years agodrm/radeon/combios: add some validation of lvds values
Alex Deucher [Mon, 27 Jul 2015 23:24:31 +0000 (19:24 -0400)]
drm/radeon/combios: add some validation of lvds values

commit 0a90a0cff9f429f886f423967ae053150dce9259 upstream.

Fixes a broken hsync start value uncovered by:
(drm: Perform basic sanity checks on probed modes)

The driver handled the bad hsync start elsewhere, but
the above commit prevented it from getting added.


Signed-off-by: Alex Deucher <>
Signed-off-by: Ben Hutchings <>
7 years agoALSA: usb-audio: add dB range mapping for some devices
Yao-Wen Mao [Wed, 29 Jul 2015 07:13:54 +0000 (15:13 +0800)]
ALSA: usb-audio: add dB range mapping for some devices

commit 2d1cb7f658fb9c3ba8f9dab8aca297d4dfdec835 upstream.

Add the correct dB ranges of Bose Companion 5 and Drangonfly DAC 1.2.

Signed-off-by: Yao-Wen Mao <>
Signed-off-by: Takashi Iwai <>
Signed-off-by: Ben Hutchings <>
7 years agovhost: actually track log eventfd file
Marc-André Lureau [Fri, 17 Jul 2015 13:32:03 +0000 (15:32 +0200)]
vhost: actually track log eventfd file

commit 7932c0bd7740f4cd2aa168d3ce0199e7af7d72d5 upstream.

While reviewing vhost log code, I found out that log_file is never
set. Note: I haven't tested the change (QEMU doesn't use LOG_FD yet).

Signed-off-by: Marc-André Lureau <>
Signed-off-by: Michael S. Tsirkin <>
Signed-off-by: Ben Hutchings <>
7 years agoniu: don't count tx error twice in case of headroom realloc fails
Jiri Pirko [Thu, 23 Jul 2015 10:20:37 +0000 (12:20 +0200)]
niu: don't count tx error twice in case of headroom realloc fails

commit 42288830494cd51873ca745a7a229023df061226 upstream.

Fixes: a3138df9 ("[NIU]: Add Sun Neptune ethernet driver.")
Signed-off-by: Jiri Pirko <>
Signed-off-by: David S. Miller <>
Signed-off-by: Ben Hutchings <>
7 years agoiscsi-target: Fix use-after-free during TPG session shutdown
Nicholas Bellinger [Wed, 22 Jul 2015 07:24:09 +0000 (00:24 -0700)]
iscsi-target: Fix use-after-free during TPG session shutdown

commit 417c20a9bdd1e876384127cf096d8ae8b559066c upstream.

This patch fixes a use-after-free bug in iscsit_release_sessions_for_tpg()
where se_portal_group->session_lock was incorrectly released/re-acquired
while walking the active se_portal_group->tpg_sess_list.

The can result in a NULL pointer dereference when iscsit_close_session()
shutdown happens in the normal path asynchronously to this code, causing
a bogus dereference of an already freed list entry to occur.

To address this bug, walk the session list checking for the same state
as before, but move entries to a local list to avoid dropping the lock
while walking the active list.

As before, signal using iscsi_session->session_restatement=1 for those
list entries to be released locally by iscsit_free_session() code.

Reported-by: Sunilkumar Nadumuttlu <>
Cc: Sunilkumar Nadumuttlu <>
Signed-off-by: Nicholas Bellinger <>
Signed-off-by: Ben Hutchings <>
7 years agomd/raid1: fix test for 'was read error from last working device'.
NeilBrown [Thu, 23 Jul 2015 23:22:16 +0000 (09:22 +1000)]
md/raid1: fix test for 'was read error from last working device'.

commit 34cab6f42003cb06f48f86a86652984dec338ae9 upstream.

When we get a read error from the last working device, we don't
try to repair it, and don't fail the device.  We simple report a
read error to the caller.

However the current test for 'is this the last working device' is
When there is only one fully working device, it assumes that a
non-faulty device is that device.  However a spare which is rebuilding
would be non-faulty but so not the only working device.

So change the test from "!Faulty" to "In_sync".  If ->degraded says
there is only one fully working device and this device is in_sync,
this must be the one.

This bug has existed since we allowed read_balance to read from
a recovering spare in v3.0

Reported-and-tested-by: Alexander Lyakas <>
Fixes: 76073054c95b ("md/raid1: clean up read_balance.")
Signed-off-by: NeilBrown <>
Signed-off-by: Ben Hutchings <>
7 years agoInput: usbtouchscreen - avoid unresponsive TSC-30 touch screen
Bernhard Bender [Thu, 23 Jul 2015 20:58:08 +0000 (13:58 -0700)]
Input: usbtouchscreen - avoid unresponsive TSC-30 touch screen

commit 968491709e5b1aaf429428814fff3d932fa90b60 upstream.

This patch fixes a problem in the usbtouchscreen driver for DMC TSC-30
touch screen.  Due to a missing delay between the RESET and SET_RATE
commands, the touch screen may become unresponsive during system startup or
driver loading.

According to the DMC documentation, a delay is needed after the RESET
command to allow the chip to complete its internal initialization. As this
delay is not guaranteed, we had a system where the touch screen
occasionally did not send any touch data. There was no other indication of
the problem.

The patch fixes the problem by adding a 150ms delay between the RESET and
SET_RATE commands.

Suggested-by: Jakob Mustafa <>
Signed-off-by: Bernhard Bender <>
Signed-off-by: Dmitry Torokhov <>
Signed-off-by: Ben Hutchings <>
7 years agotile: use free_bootmem_late() for initrd
Chris Metcalf [Thu, 23 Jul 2015 18:11:09 +0000 (14:11 -0400)]
tile: use free_bootmem_late() for initrd

commit 3f81d2447b37ac697b3c600039f2c6b628c06e21 upstream.

We were previously using free_bootmem() and just getting lucky
that nothing too bad happened.

Signed-off-by: Chris Metcalf <>
Signed-off-by: Ben Hutchings <>
7 years agousb-storage: ignore ZTE MF 823 card reader in mode 0x1225
Oliver Neukum [Mon, 6 Jul 2015 11:12:32 +0000 (13:12 +0200)]
usb-storage: ignore ZTE MF 823 card reader in mode 0x1225

commit 5fb2c782f451a4fb9c19c076e2c442839faf0f76 upstream.

This device automatically switches itself to another mode (0x1405)
unless the specific access pattern of Windows is followed in its
initial mode. That makes a dirty unmount of the internal storage
devices inevitable if they are mounted. So the card reader of
such a device should be ignored, lest an unclean removal become

This replaces an earlier patch that ignored all LUNs of this device.
That patch was overly broad.

Signed-off-by: Oliver Neukum <>
Reviewed-by: Lars Melin <>
Signed-off-by: Greg Kroah-Hartman <>
Signed-off-by: Ben Hutchings <>
7 years agoxhci: do not report PLC when link is in internal resume state
Zhuang Jin Can [Tue, 21 Jul 2015 14:20:31 +0000 (17:20 +0300)]
xhci: do not report PLC when link is in internal resume state

commit aca3a0489ac019b58cf32794d5362bb284cb9b94 upstream.

Port link change with port in resume state should not be
reported to usbcore, as this is an internal state to be
handled by xhci driver. Reporting PLC to usbcore may
cause usbcore clearing PLC first and port change event irq
won't be generated.

Signed-off-by: Zhuang Jin Can <>
Signed-off-by: Mathias Nyman <>
Signed-off-by: Greg Kroah-Hartman <>
[bwh: Backported to 3.2:
 - Adjust indentation
 - s/raw_port_status/temp/]
Signed-off-by: Ben Hutchings <>
7 years agoxhci: report U3 when link is in resume state
Zhuang Jin Can [Tue, 21 Jul 2015 14:20:29 +0000 (17:20 +0300)]
xhci: report U3 when link is in resume state

commit 243292a2ad3dc365849b820a64868927168894ac upstream.

xhci_hub_report_usb3_link_state() returns pls as U0 when the link
is in resume state, and this causes usb core to think the link is in
U0 while actually it's in resume state. When usb core transfers
control request on the link, it fails with TRB error as the link
is not ready for transfer.

To fix the issue, report U3 when the link is in resume state, thus
usb core knows the link it's not ready for transfer.

Signed-off-by: Zhuang Jin Can <>
Signed-off-by: Mathias Nyman <>
Signed-off-by: Greg Kroah-Hartman <>
Signed-off-by: Ben Hutchings <>
7 years agoxhci: Calculate old endpoints correctly on device reset
Brian Campbell [Tue, 21 Jul 2015 14:20:28 +0000 (17:20 +0300)]
xhci: Calculate old endpoints correctly on device reset

commit 326124a027abc9a7f43f72dc94f6f0f7a55b02b3 upstream.

When resetting a device the number of active TTs may need to be
corrected by xhci_update_tt_active_eps, but the number of old active
endpoints supplied to it was always zero, so the number of TTs and the
bandwidth reserved for them was not updated, and could rise

This affected systems using Intel's Patherpoint chipset, which rely on
software bandwidth checking.  For example, a Lenovo X230 would lose the
ability to use ports on the docking station after enough suspend/resume
cycles because the bandwidth calculated would rise with every cycle when
a suitable device is attached.

The correct number of active endpoints is calculated in the same way as
in xhci_reserve_bandwidth.

Signed-off-by: Brian Campbell <>
Signed-off-by: Mathias Nyman <>
Signed-off-by: Greg Kroah-Hartman <>
Signed-off-by: Ben Hutchings <>
7 years agousb: xhci: Bugfix for NULL pointer deference in xhci_endpoint_init() function
AMAN DEEP [Tue, 21 Jul 2015 14:20:27 +0000 (17:20 +0300)]
usb: xhci: Bugfix for NULL pointer deference in xhci_endpoint_init() function

commit 3496810663922617d4b706ef2780c279252ddd6a upstream.

virt_dev->num_cached_rings counts on freed ring and is not updated
correctly. In xhci_free_or_cache_endpoint_ring() function, the free ring
is added into cache and then num_rings_cache is incremented as below:
virt_dev->ring_cache[rings_cached] =
here, free ring pointer is added to a current index and then
index is incremented.
So current index always points to empty location in the ring cache.
For getting available free ring, current index should be decremented
first and then corresponding ring buffer value should be taken from ring

But In function xhci_endpoint_init(), the num_rings_cached index is
accessed before decrement.
virt_dev->eps[ep_index].new_ring =
virt_dev->ring_cache[virt_dev->num_rings_cached] = NULL;
This is bug in manipulating the index of ring cache.
And it should be as below:
virt_dev->eps[ep_index].new_ring =
virt_dev->ring_cache[virt_dev->num_rings_cached] = NULL;

Signed-off-by: Aman Deep <>
Signed-off-by: Mathias Nyman <>
Signed-off-by: Greg Kroah-Hartman <>
Signed-off-by: Ben Hutchings <>
7 years agonetfilter: nf_conntrack: Support expectations in different zones
Joe Stringer [Wed, 22 Jul 2015 04:37:31 +0000 (21:37 -0700)]
netfilter: nf_conntrack: Support expectations in different zones

commit 4b31814d20cbe5cd4ccf18089751e77a04afe4f2 upstream.

When zones were originally introduced, the expectation functions were
all extended to perform lookup using the zone. However, insertion was
not modified to check the zone. This means that two expectations which
are intended to apply for different connections that have the same tuple
but exist in different zones cannot both be tracked.

Fixes: 5d0aa2ccd4 (netfilter: nf_conntrack: add support for "conntrack zones")
Signed-off-by: Joe Stringer <>
Signed-off-by: Pablo Neira Ayuso <>
Signed-off-by: Ben Hutchings <>
7 years agousb: dwc3: Reset the transfer resource index on SET_INTERFACE
John Youn [Mon, 17 Sep 2001 07:00:00 +0000 (00:00 -0700)]
usb: dwc3: Reset the transfer resource index on SET_INTERFACE

commit aebda618718157a69c0dc0adb978d69bc2b8723c upstream.

This fixes an issue introduced in commit b23c843992b6 (usb: dwc3:
gadget: fix DEPSTARTCFG for non-EP0 EPs) that made sure we would
only use DEPSTARTCFG once per SetConfig.

The trick is that we should use one DEPSTARTCFG per SetConfig *OR*
SetInterface. SetInterface was completely missed from the original

This problem became aparent after commit 76e838c9f776 (usb: dwc3:
gadget: return error if command sent to DEPCMD register fails)
added checking of the return status of device endpoint commands.

'Set Endpoint Transfer Resource' command was caught failing
occasionally. This is because the Transfer Resource
Index was not getting reset during a SET_INTERFACE request.

Finally, to fix the issue, was we have to do is make sure that
our start_config_issued flag gets reset whenever we receive a
SetInterface request.

To verify the problem (and its fix), all we have to do is run
test 9 from testusb with 'testusb -t 9 -s 2048 -a -c 5000'.

Tested-by: Huang Rui <>
Tested-by: Subbaraya Sundeep Bhatta <>
Fixes: b23c843992b6 (usb: dwc3: gadget: fix DEPSTARTCFG for non-EP0 EPs)
Signed-off-by: John Youn <>
Signed-off-by: Felipe Balbi <>
[bwh: Backported to 3.2: use dev_vdbg() instead of dwc3_trace()]
Signed-off-by: Ben Hutchings <>
7 years agoinet: frags: fix defragmented packet's IP header for af_packet
Edward Hyunkoo Jee [Tue, 21 Jul 2015 07:43:59 +0000 (09:43 +0200)]
inet: frags: fix defragmented packet's IP header for af_packet

commit 0848f6428ba3a2e42db124d41ac6f548655735bf upstream.

When ip_frag_queue() computes positions, it assumes that the passed
sk_buff does not contain L2 headers.

However, when PACKET_FANOUT_FLAG_DEFRAG is used, IP reassembly
functions can be called on outgoing packets that contain L2 headers.

Also, IPv4 checksum is not corrected after reassembly.

Fixes: 7736d33f4262 ("packet: Add pre-defragmentation support for ipv4 fanouts.")
Signed-off-by: Edward Hyunkoo Jee <>
Signed-off-by: Eric Dumazet <>
Cc: Willem de Bruijn <>
Cc: Jerry Chu <>
Signed-off-by: David S. Miller <>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <>
7 years agomac80211: clear subdir_stations when removing debugfs
Tom Hughes [Mon, 29 Jun 2015 18:41:49 +0000 (19:41 +0100)]
mac80211: clear subdir_stations when removing debugfs

commit 4479004e6409087d1b4986881dc98c6c15dffb28 upstream.

If we don't do this, and we then fail to recreate the debugfs
directory during a mode change, then we will fail later trying
to add stations to this now bogus directory:

BUG: unable to handle kernel NULL pointer dereference at 0000006c
IP: [<c0a92202>] mutex_lock+0x12/0x30
Call Trace:
[<c0678ab4>] start_creating+0x44/0xc0
[<c0679203>] debugfs_create_dir+0x13/0xf0
[<f8a938ae>] ieee80211_sta_debugfs_add+0x6e/0x490 [mac80211]

Signed-off-by: Tom Hughes <>
Signed-off-by: Johannes Berg <>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <>
7 years agodrm/radeon: Don't flush the GART TLB if rdev->gart.ptr == NULL
Michel Dänzer [Fri, 3 Jul 2015 01:02:27 +0000 (10:02 +0900)]
drm/radeon: Don't flush the GART TLB if rdev->gart.ptr == NULL

commit 233709d2cd6bbaaeda0aeb8d11f6ca7f98563b39 upstream.

This can be the case when the GPU is powered off, e.g. via vgaswitcheroo
or runpm. When the GPU is powered up again, radeon_gart_table_vram_pin
flushes the TLB after setting rdev->gart.ptr to non-NULL.

Fixes panic on powering off R7xx GPUs.

Reviewed-by: Christian König <>
Signed-off-by: Michel Dänzer <>
Signed-off-by: Alex Deucher <>
Signed-off-by: Ben Hutchings <>
7 years agodatagram: Factor out sk queue referencing
Pavel Emelyanov [Tue, 21 Feb 2012 07:30:33 +0000 (07:30 +0000)]
datagram: Factor out sk queue referencing

commit 4934b0329f7150dcb5f90506860e2db32274c755 upstream.

This makes lines shorter and simplifies further patching.

Signed-off-by: Pavel Emelyanov <>
Acked-by: Eric Dumazet <>
Signed-off-by: David S. Miller <>
[bwh: Prerequisite of "net: Clone skb before setting peeked flag"]
Signed-off-by: Ben Hutchings <>
7 years agolibata: increase the timeout when setting transfer mode
Mikulas Patocka [Wed, 8 Jul 2015 17:06:12 +0000 (13:06 -0400)]
libata: increase the timeout when setting transfer mode

commit d531be2ca2f27cca5f041b6a140504999144a617 upstream.

I have a ST4000DM000 disk. If Linux is booted while the disk is spun down,
the command that sets transfer mode causes the disk to spin up. The
spin-up takes longer than the default 5s timeout, so the command fails and
timeout is reported.

Fix this by increasing the timeout to 15s, which is enough for the disk to
spin up.

Signed-off-by: Mikulas Patocka <>
Signed-off-by: Tejun Heo <>
Signed-off-by: Ben Hutchings <>
7 years agolibata: force disable trim for SuperSSpeed S238
Arne Fitzenreiter [Wed, 15 Jul 2015 11:54:37 +0000 (13:54 +0200)]
libata: force disable trim for SuperSSpeed S238

commit cda57b1b05cf7b8b99ab4b732bea0b05b6c015cc upstream.

This device loses blocks, often the partition table area, on trim.
Disable TRIM.

Signed-off-by: Arne Fitzenreiter <>
Signed-off-by: Tejun Heo <>
Signed-off-by: Ben Hutchings <>
7 years agolibata: add ATA_HORKAGE_NOTRIM
Arne Fitzenreiter [Wed, 15 Jul 2015 11:54:36 +0000 (13:54 +0200)]

commit 71d126fd28de2d4d9b7b2088dbccd7ca62fad6e0 upstream.

Some devices lose data on TRIM whether queued or not.  This patch adds
a horkage to disable TRIM.

tj: Collapsed unnecessary if() nesting.

Signed-off-by: Arne Fitzenreiter <>
Signed-off-by: Tejun Heo <>
[bwh: Backported to 3.2:
 - Adjust context
 - Drop change to show_ata_dev_trim()]
Signed-off-by: Ben Hutchings <>
7 years agolibata: add ATA_HORKAGE_BROKEN_FPDMA_AA quirk for HP 250GB SATA disk VB0250EAVER
Aleksei Mamlin [Wed, 1 Jul 2015 10:48:30 +0000 (13:48 +0300)]
libata: add ATA_HORKAGE_BROKEN_FPDMA_AA quirk for HP 250GB SATA disk VB0250EAVER

commit 08c85d2a599d967ede38a847f5594447b6100642 upstream.

Enabling AA on HP 250GB SATA disk VB0250EAVER causes errors:

[    3.788362] ata3.00: failed to enable AA (error_mask=0x1)
[    3.789243] ata3.00: failed to enable AA (error_mask=0x1)

Add the ATA_HORKAGE_BROKEN_FPDMA_AA for this specific harddisk.

tj: Collected FPDMA_AA entries and updated comment.

Signed-off-by: Aleksei Mamlin <>
Signed-off-by: Tejun Heo <>
Signed-off-by: Ben Hutchings <>
7 years agoata: pmp: add quirk for Marvell 4140 SATA PMP
Lior Amsalem [Tue, 30 Jun 2015 14:09:49 +0000 (16:09 +0200)]
ata: pmp: add quirk for Marvell 4140 SATA PMP

commit 945b47441d83d2392ac9f984e0267ad521f24268 upstream.

This commit adds the necessary quirk to make the Marvell 4140 SATA PMP
work properly. This PMP doesn't like SRST on port number 4 (the host
port) so this commit marks this port as not supporting SRST.

Signed-off-by: Lior Amsalem <>
Reviewed-by: Nadav Haklai <>
Signed-off-by: Thomas Petazzoni <>
Signed-off-by: Tejun Heo <>
Signed-off-by: Ben Hutchings <>
7 years agords: rds_ib_device.refcount overflow
Wengang Wang [Mon, 6 Jul 2015 06:35:11 +0000 (14:35 +0800)]
rds: rds_ib_device.refcount overflow

commit 4fabb59449aa44a585b3603ffdadd4c5f4d0c033 upstream.

Fixes: 3e0249f9c05c ("RDS/IB: add refcount tracking to struct rds_ib_device")

There lacks a dropping on rds_ib_device.refcount in case rds_ib_alloc_fmr
failed(mr pool running out). this lead to the refcount overflow.

A complain in line 117(see following) is seen. From vmcore:
s_ib_rdma_mr_pool_depleted is 2147485544 and rds_ibdev->refcount is -2147475448.
That is the evidence the mr pool is used up. so rds_ib_alloc_fmr is very likely
to return ERR_PTR(-EAGAIN).

115 void rds_ib_dev_put(struct rds_ib_device *rds_ibdev)
116 {
117         BUG_ON(atomic_read(&rds_ibdev->refcount) <= 0);
118         if (atomic_dec_and_test(&rds_ibdev->refcount))
119                 queue_work(rds_wq, &rds_ibdev->free_work);
120 }

fix is to drop refcount when rds_ib_alloc_fmr failed.

Signed-off-by: Wengang Wang <>
Reviewed-by: Haggai Eran <>
Signed-off-by: Doug Ledford <>
Signed-off-by: Ben Hutchings <>
7 years agoBtrfs: fix file corruption after cloning inline extents
Filipe Manana [Tue, 14 Jul 2015 15:09:39 +0000 (16:09 +0100)]
Btrfs: fix file corruption after cloning inline extents

commit ed958762644b404654a6f5d23e869f496fe127c6 upstream.

Using the clone ioctl (or extent_same ioctl, which calls the same extent
cloning function as well) we end up allowing copy an inline extent from
the source file into a non-zero offset of the destination file. This is
something not expected and that the btrfs code is not prepared to deal
with - all inline extents must be at a file offset equals to 0.

For example, the following excerpt of a test case for fstests triggers
a crash/BUG_ON() on a write operation after an inline extent is cloned
into a non-zero offset:

  _scratch_mkfs >>$seqres.full 2>&1

  # Create our test files. File foo has the same 2K of data at offset 4K
  # as file bar has at its offset 0.
  $XFS_IO_PROG -f -s -c "pwrite -S 0xaa 0 4K" \
      -c "pwrite -S 0xbb 4k 2K" \
      -c "pwrite -S 0xcc 8K 4K" \
      $SCRATCH_MNT/foo | _filter_xfs_io

  # File bar consists of a single inline extent (2K size).
  $XFS_IO_PROG -f -s -c "pwrite -S 0xbb 0 2K" \
     $SCRATCH_MNT/bar | _filter_xfs_io

  # Now call the clone ioctl to clone the extent of file bar into file
  # foo at its offset 4K. This made file foo have an inline extent at
  # offset 4K, something which the btrfs code can not deal with in future
  # IO operations because all inline extents are supposed to start at an
  # offset of 0, resulting in all sorts of chaos.
  # So here we validate that clone ioctl returns an EOPNOTSUPP, which is
  # what it returns for other cases dealing with inlined extents.
  $CLONER_PROG -s 0 -d $((4 * 1024)) -l $((2 * 1024)) \

  # Because of the inline extent at offset 4K, the following write made
  # the kernel crash with a BUG_ON().
  $XFS_IO_PROG -c "pwrite -S 0xdd 6K 2K" $SCRATCH_MNT/foo | _filter_xfs_io


The stack trace of the BUG_ON() triggered by the last write is:

  [152154.035903] ------------[ cut here ]------------
  [152154.036424] kernel BUG at mm/page-writeback.c:2286!
  [152154.036424] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
  [152154.036424] Modules linked in: btrfs dm_flakey dm_mod crc32c_generic xor raid6_pq nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscache sunrpc loop fuse parport_pc acpi_cpu$
  [152154.036424] CPU: 2 PID: 17873 Comm: xfs_io Tainted: G        W       4.1.0-rc6-btrfs-next-11+ #2
  [152154.036424] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 04/01/2014
  [152154.036424] task: ffff880429f70990 ti: ffff880429efc000 task.ti: ffff880429efc000
  [152154.036424] RIP: 0010:[<ffffffff8111a9d5>]  [<ffffffff8111a9d5>] clear_page_dirty_for_io+0x1e/0x90
  [152154.036424] RSP: 0018:ffff880429effc68  EFLAGS: 00010246
  [152154.036424] RAX: 0200000000000806 RBX: ffffea0006a6d8f0 RCX: 0000000000000001
  [152154.036424] RDX: 0000000000000000 RSI: ffffffff81155d1b RDI: ffffea0006a6d8f0
  [152154.036424] RBP: ffff880429effc78 R08: ffff8801ce389fe0 R09: 0000000000000001
  [152154.036424] R10: 0000000000002000 R11: ffffffffffffffff R12: ffff8800200dce68
  [152154.036424] R13: 0000000000000000 R14: ffff8800200dcc88 R15: ffff8803d5736d80
  [152154.036424] FS:  00007fbf119f6700(0000) GS:ffff88043d280000(0000) knlGS:0000000000000000
  [152154.036424] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [152154.036424] CR2: 0000000001bdc000 CR3: 00000003aa555000 CR4: 00000000000006e0
  [152154.036424] Stack:
  [152154.036424]  ffff8803d5736d80 0000000000000001 ffff880429effcd8 ffffffffa04e97c1
  [152154.036424]  ffff880429effd68 ffff880429effd60 0000000000000001 ffff8800200dc9c8
  [152154.036424]  0000000000000001 ffff8800200dcc88 0000000000000000 0000000000001000
  [152154.036424] Call Trace:
  [152154.036424]  [<ffffffffa04e97c1>] lock_and_cleanup_extent_if_need+0x147/0x18d [btrfs]
  [152154.036424]  [<ffffffffa04ea82c>] __btrfs_buffered_write+0x245/0x4c8 [btrfs]
  [152154.036424]  [<ffffffffa04ed14b>] ? btrfs_file_write_iter+0x150/0x3e0 [btrfs]
  [152154.036424]  [<ffffffffa04ed15a>] ? btrfs_file_write_iter+0x15f/0x3e0 [btrfs]
  [152154.036424]  [<ffffffffa04ed2c7>] btrfs_file_write_iter+0x2cc/0x3e0 [btrfs]
  [152154.036424]  [<ffffffff81165a4a>] __vfs_write+0x7c/0xa5
  [152154.036424]  [<ffffffff81165f89>] vfs_write+0xa0/0xe4
  [152154.036424]  [<ffffffff81166855>] SyS_pwrite64+0x64/0x82
  [152154.036424]  [<ffffffff81465197>] system_call_fastpath+0x12/0x6f
  [152154.036424] Code: 48 89 c7 e8 0f ff ff ff 5b 41 5c 5d c3 0f 1f 44 00 00 55 48 89 e5 41 54 53 48 89 fb e8 ae ef 00 00 49 89 c4 48 8b 03 a8 01 75 02 <0f> 0b 4d 85 e4 74 59 49 8b 3c 2$
  [152154.036424] RIP  [<ffffffff8111a9d5>] clear_page_dirty_for_io+0x1e/0x90
  [152154.036424]  RSP <ffff880429effc68>
  [152154.242621] ---[ end trace e3d3376b23a57041 ]---

Fix this by returning the error EOPNOTSUPP if an attempt to copy an
inline extent into a non-zero offset happens, just like what is done for
other scenarios that would require copying/splitting inline extents,
which were introduced by the following commits:

   00fdf13a2e9f ("Btrfs: fix a crash of clone with inline extents's split")
   3f9e3df8da3c ("btrfs: replace error code from btrfs_drop_extents")

Signed-off-by: Filipe Manana <>
[bwh: Backported to 3.2: test new_key.offset as last_dest_end isn't defined
 in this function]
Signed-off-by: Ben Hutchings <>
7 years agos390/process: fix sfpc inline assembly
Heiko Carstens [Mon, 6 Jul 2015 13:02:37 +0000 (15:02 +0200)]
s390/process: fix sfpc inline assembly

commit e47994dd44bcb4a77b4152bd0eada585934703c0 upstream.

The sfpc inline assembly within execve_tail() may incorrectly set bits
28-31 of the sfpc instruction to a value which is not zero.
These bits however are currently unused and therefore should be zero
so we won't get surprised if these bits will be used in the future.

Therefore remove the second operand from the inline assembly.

Signed-off-by: Heiko Carstens <>
Signed-off-by: Martin Schwidefsky <>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <>
7 years ago9p: don't leave a half-initialized inode sitting around
Al Viro [Sun, 12 Jul 2015 14:34:29 +0000 (10:34 -0400)]
9p: don't leave a half-initialized inode sitting around

commit 0a73d0a204a4a04a1e110539c5a524ae51f91d6d upstream.

Signed-off-by: Al Viro <>
Signed-off-by: Ben Hutchings <>
7 years agonet: call rcu_read_lock early in process_backlog
Julian Anastasov [Thu, 9 Jul 2015 06:59:10 +0000 (09:59 +0300)]
net: call rcu_read_lock early in process_backlog

commit 2c17d27c36dcce2b6bf689f41a46b9e909877c21 upstream.

Incoming packet should be either in backlog queue or
in RCU read-side section. Otherwise, the final sequence of
flush_backlog() and synchronize_net() may miss packets
that can run without device reference:

CPU 1                  CPU 2
                       skb->dev: no reference

on_each_cpu for
flush_backlog =>       IPI(hardirq): flush_backlog
                       - packet not found in backlog

                       CPU delayed ...
- no ongoing RCU
read-side sections

rcu_barrier: no
ongoing callbacks
                       - too late
free dev
                       process packet for freed dev

Fixes: 6e583ce5242f ("net: eliminate refcounting in backlog queue")
Cc: Eric W. Biederman <>
Cc: Stephen Hemminger <>
Signed-off-by: Julian Anastasov <>
Signed-off-by: David S. Miller <>
[bwh: Backported to 3.2:
 - Adjust context
 - No need to rename the label in __netif_receive_skb()]
Signed-off-by: Ben Hutchings <>
7 years agonet: do not process device backlog during unregistration
Julian Anastasov [Thu, 9 Jul 2015 06:59:09 +0000 (09:59 +0300)]
net: do not process device backlog during unregistration

commit e9e4dd3267d0c5234c5c0f47440456b10875dec9 upstream.

commit 381c759d9916 ("ipv4: Avoid crashing in ip_error")
fixes a problem where processed packet comes from device
with destroyed inetdev (dev->ip_ptr). This is not expected
because inetdev_destroy is called in NETDEV_UNREGISTER
phase and packets should not be processed after
dev_close_many() and synchronize_net(). Above fix is still
required because inetdev_destroy can be called for other
reasons. But it shows the real problem: backlog can keep
packets for long time and they do not hold reference to
device. Such packets are then delivered to upper levels
at the same time when device is unregistered.
Calling flush_backlog after NETDEV_UNREGISTER_FINAL still
accounts all packets from backlog but before that some packets
continue to be delivered to upper levels long after the
synchronize_net call which is supposed to wait the last
ones. Also, as Eric pointed out, processed packets, mostly
from other devices, can continue to add new packets to backlog.

Fix the problem by moving flush_backlog early, after the
device driver is stopped and before the synchronize_net() call.
Then use netif_running check to make sure we do not add more
packets to backlog. We have to do it in enqueue_to_backlog
context when the local IRQ is disabled. As result, after the
flush_backlog and synchronize_net sequence all packets
should be accounted.

Thanks to Eric W. Biederman for the test script and his
valuable feedback!

Reported-by: Vittorio Gambaletta <>
Fixes: 6e583ce5242f ("net: eliminate refcounting in backlog queue")
Cc: Eric W. Biederman <>
Cc: Stephen Hemminger <>
Signed-off-by: Julian Anastasov <>
Signed-off-by: David S. Miller <>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <>
7 years agomm: avoid setting up anonymous pages into file mapping
Kirill A. Shutemov [Mon, 6 Jul 2015 20:18:37 +0000 (23:18 +0300)]
mm: avoid setting up anonymous pages into file mapping

commit 6b7339f4c31ad69c8e9c0b2859276e22cf72176d upstream.

Reading page fault handler code I've noticed that under right
circumstances kernel would map anonymous pages into file mappings: if
the VMA doesn't have vm_ops->fault() and the VMA wasn't fully populated
on ->mmap(), kernel would handle page fault to not populated pte with

Let's change page fault handler to use do_anonymous_page() only on
anonymous VMA (->vm_ops == NULL) and make sure that the VMA is not

For file mappings without vm_ops->fault() or shred VMA without vm_ops,
page fault on pte_none() entry would lead to SIGBUS.

Signed-off-by: Kirill A. Shutemov <>
Acked-by: Oleg Nesterov <>
Cc: Andrew Morton <>
Cc: Willy Tarreau <>
Signed-off-by: Linus Torvalds <>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <>
7 years agortnetlink: verify IFLA_VF_INFO attributes before passing them to driver
Daniel Borkmann [Mon, 6 Jul 2015 22:07:52 +0000 (00:07 +0200)]
rtnetlink: verify IFLA_VF_INFO attributes before passing them to driver

commit 4f7d2cdfdde71ffe962399b7020c674050329423 upstream.

Jason Gunthorpe reported that since commit c02db8c6290b ("rtnetlink: make
SR-IOV VF interface symmetric"), we don't verify IFLA_VF_INFO attributes
anymore with respect to their policy, that is, ifla_vfinfo_policy[].

Before, they were part of ifla_policy[], but they have been nested since
placed under IFLA_VFINFO_LIST, that contains the attribute IFLA_VF_INFO,
which is another nested attribute for the actual VF attributes such as

Despite the policy being split out from ifla_policy[] in this commit,
it's never applied anywhere. nla_for_each_nested() only does basic nla_ok()
testing for struct nlattr, but it doesn't know about the data context and
their requirements.

Fix, on top of Jason's initial work, does 1) parsing of the attributes
with the right policy, and 2) using the resulting parsed attribute table
from 1) instead of the nla_for_each_nested() loop (just like we used to
do when still part of ifla_policy[]).

Fixes: c02db8c6290b ("rtnetlink: make SR-IOV VF interface symmetric")
Reported-by: Jason Gunthorpe <>
Cc: Chris Wright <>
Cc: Sucheta Chakraborty <>
Cc: Greg Rose <>
Cc: Jeff Kirsher <>
Cc: Rony Efraim <>
Cc: Vlad Zolotarov <>
Cc: Nicolas Dichtel <>
Cc: Thomas Graf <>
Signed-off-by: Jason Gunthorpe <>
Signed-off-by: Daniel Borkmann <>
Acked-by: Vlad Zolotarov <>
Signed-off-by: David S. Miller <>
[bwh: Backported to 3.2:
 - Drop unsupported attributes
 - Use ndo_set_vf_tx_rate operation, not ndo_set_vf_rate]
Signed-off-by: Ben Hutchings <>
7 years agodrm: add a check for x/y in drm_mode_setcrtc
Zhao Junwang [Tue, 7 Jul 2015 09:08:35 +0000 (17:08 +0800)]
drm: add a check for x/y in drm_mode_setcrtc

commit 01447e9f04ba1c49a9534ae6a5a6f26c2bb05226 upstream.

legacy setcrtc ioctl does take a 32 bit value which might indeed

the checks of crtc_req->x > INT_MAX and crtc_req->y > INT_MAX aren't
needed any more with this

v2: -polish the annotation according to Daniel's comment

Cc: Daniel Vetter <>
Signed-off-by: Zhao Junwang <>
Signed-off-by: Daniel Vetter <>
Signed-off-by: Ben Hutchings <>
7 years agodrm: Check crtc x and y coordinates
Ville Syrjälä [Tue, 13 Mar 2012 10:35:41 +0000 (12:35 +0200)]
drm: Check crtc x and y coordinates

commit 1d97e9154821d52a5ebc226176d4839c7b86b116 upstream.

The crtc x/y panning coordinates are stored as signed integers
internally. The user provides them as unsigned, so we should check
that the user provided values actually fit in the internal datatypes.

Signed-off-by: Ville Syrjälä <>
Reviewed-by: Alex Deucher <>
Signed-off-by: Dave Airlie <>
Signed-off-by: Ben Hutchings <>
7 years agos390/sclp: clear upper register halves in _sclp_print_early
Martin Schwidefsky [Mon, 6 Jul 2015 15:58:19 +0000 (17:58 +0200)]
s390/sclp: clear upper register halves in _sclp_print_early

commit f9c87a6f46d508eae0d9ae640be98d50f237f827 upstream.

If the kernel is compiled with gcc 5.1 and the XZ compression option
the decompress_kernel function calls _sclp_print_early in 64-bit mode
while the content of the upper register half of %r6 is non-zero.
This causes a specification exception on the servc instruction in

The _sclp_print_early function saves and restores the upper registers
halves but it fails to clear them for the 31-bit code of the mini sclp

Signed-off-by: Martin Schwidefsky <>
Signed-off-by: Ben Hutchings <>
7 years agodm btree: silence lockdep lock inversion in dm_btree_del()
Joe Thornber [Fri, 3 Jul 2015 13:51:32 +0000 (14:51 +0100)]
dm btree: silence lockdep lock inversion in dm_btree_del()

commit 1c7518794a3647eb345d59ee52844e8a40405198 upstream.

Allocate memory using GFP_NOIO when deleting a btree.  dm_btree_del()
can be called via an ioctl and we don't want to recurse into the FS or
block layer.

Signed-off-by: Joe Thornber <>
Signed-off-by: Mike Snitzer <>
Signed-off-by: Ben Hutchings <>
7 years agoUSB: cp210x: add ID for Aruba Networks controllers
Peter Sanford [Fri, 26 Jun 2015 00:40:05 +0000 (17:40 -0700)]
USB: cp210x: add ID for Aruba Networks controllers

commit f98a7aa81eeeadcad25665c3501c236d531d4382 upstream.

Add the USB serial console device ID for Aruba Networks 7xxx series
controllers which have a USB port for their serial console.

Signed-off-by: Peter Sanford <>
Signed-off-by: Johan Hovold <>
Signed-off-by: Ben Hutchings <>
7 years agodm thin: allocate the cell_sort_array dynamically
Joe Thornber [Fri, 3 Jul 2015 09:22:42 +0000 (10:22 +0100)]
dm thin: allocate the cell_sort_array dynamically

commit a822c83e47d97cdef38c4352e1ef62d9f46cfe98 upstream.

Given the pool's cell_sort_array holds 8192 pointers it triggers an
order 5 allocation via kmalloc.  This order 5 allocation is prone to
failure as system memory gets more fragmented over time.

Fix this by allocating the cell_sort_array using vmalloc.

Signed-off-by: Joe Thornber <>
Signed-off-by: Mike Snitzer <>
[bwh: Backported to 3.2: make a similar change in prison_{create,destroy}()]
Signed-off-by: Ben Hutchings <>
7 years agodm btree remove: fix bug in redistribute3
Dennis Yang [Fri, 26 Jun 2015 14:25:48 +0000 (15:25 +0100)]
dm btree remove: fix bug in redistribute3

commit 4c7e309340ff85072e96f529582d159002c36734 upstream.

redistribute3() shares entries out across 3 nodes.  Some entries were
being moved the wrong way, breaking the ordering.  This manifested as a
BUG() in dm-btree-remove.c:shift() when entries were removed from the

For additional context see:

Signed-off-by: Dennis Yang <>
Signed-off-by: Joe Thornber <>
Signed-off-by: Mike Snitzer <>
Signed-off-by: Ben Hutchings <>
7 years agoext4: replace open coded nofail allocation in ext4_free_blocks()
Michal Hocko [Sun, 5 Jul 2015 16:33:44 +0000 (12:33 -0400)]
ext4: replace open coded nofail allocation in ext4_free_blocks()

commit 7444a072c387a93ebee7066e8aee776954ab0e41 upstream.

ext4_free_blocks is looping around the allocation request and mimics
__GFP_NOFAIL behavior without any allocation fallback strategy. Let's
remove the open coded loop and replace it with __GFP_NOFAIL. Without the
flag the allocator has no way to find out never-fail requirement and
cannot help in any way.

Signed-off-by: Michal Hocko <>
Signed-off-by: Theodore Ts'o <>
[bwh: Backported to 3.2:
 - Adjust context
 - s/ext4_free_data_cachep/ext4_free_ext_cachep/]
Signed-off-by: Ben Hutchings <>
7 years ago9p: forgetting to cancel request on interrupted zero-copy RPC
Al Viro [Sat, 4 Jul 2015 20:04:19 +0000 (16:04 -0400)]
9p: forgetting to cancel request on interrupted zero-copy RPC

commit a84b69cb6e0a41e86bc593904faa6def3b957343 upstream.

If we'd already sent a request and decide to abort it, we *must*
issue TFLUSH properly and not just blindly reuse the tag, or
we'll get seriously screwed when response eventually arrives
and we confuse it for response to later request that had reused
the same tag.

Signed-off-by: Al Viro <>
Signed-off-by: Ben Hutchings <>
7 years agoKVM: x86: properly restore LVT0
Radim Krčmář [Tue, 30 Jun 2015 20:19:17 +0000 (22:19 +0200)]
KVM: x86: properly restore LVT0

commit db1385624c686fe99fe2d1b61a36e1537b915d08 upstream.

Legacy NMI watchdog didn't work after migration/resume, because
vapics_in_nmi_mode was left at 0.

Signed-off-by: Radim Krčmář <>
Signed-off-by: Paolo Bonzini <>
[bwh: Backported to 3.2:
 - Adjust context
 - s/kvm_apic_get_reg/apic_get_reg/]
Signed-off-by: Ben Hutchings <>
7 years agoKVM: x86: make vapics_in_nmi_mode atomic
Radim Krčmář [Wed, 1 Jul 2015 13:31:49 +0000 (15:31 +0200)]
KVM: x86: make vapics_in_nmi_mode atomic

commit 42720138b06301cc8a7ee8a495a6d021c4b6a9bc upstream.

Writes were a bit racy, but hard to turn into a bug at the same time.
(Particularly because modern Linux doesn't use this feature anymore.)

Signed-off-by: Radim Krčmář <>
[Actually the next patch makes it much, much easier to trigger the race
 so I'm including this one for stable@ as well. - Paolo]
Signed-off-by: Paolo Bonzini <>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <>
7 years agonetfilter: bridge: don't leak skb in error paths
Florian Westphal [Tue, 30 Jun 2015 20:27:51 +0000 (22:27 +0200)]
netfilter: bridge: don't leak skb in error paths

commit dd302b59bde0149c20df7278c0d36c765e66afbd upstream.

br_nf_dev_queue_xmit must free skb in its error path.
NF_DROP is misleading -- its an okfn, not a netfilter hook.

Fixes: 462fb2af9788a ("bridge : Sanitize skb before it enters the IP stack")
Fixes: efb6de9b4ba00 ("netfilter: bridge: forward IPv6 fragmented packets")
Signed-off-by: Florian Westphal <>
Signed-off-by: Pablo Neira Ayuso <>
[bwh: Backported to 3.2:
 - Adjust filename
 - Drop IPv6 changes]
Signed-off-by: Ben Hutchings <>
7 years agoext4: avoid deadlocks in the writeback path by using sb_getblk_gfp
Nikolay Borisov [Thu, 2 Jul 2015 05:34:07 +0000 (01:34 -0400)]
ext4: avoid deadlocks in the writeback path by using sb_getblk_gfp

commit c45653c341f5c8a0ce19c8f0ad4678640849cb86 upstream.

Switch ext4 to using sb_getblk_gfp with GFP_NOFS added to fix possible
deadlocks in the page writeback path.

Signed-off-by: Nikolay Borisov <>
Signed-off-by: Theodore Ts'o <>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <>
7 years agobufferhead: Add _gfp version for sb_getblk()
Nikolay Borisov [Thu, 2 Jul 2015 05:32:44 +0000 (01:32 -0400)]
bufferhead: Add _gfp version for sb_getblk()

commit bd7ade3cd9b0850264306f5c2b79024a417b6396 upstream.

sb_getblk() is used during ext4 (and possibly other FSes) writeback
paths. Sometimes such path require allocating memory and guaranteeing
that such allocation won't block. Currently, however, there is no way
to provide user flags for sb_getblk which could lead to deadlocks.

This patch implements a sb_getblk_gfp with the only difference it can
accept user-provided GFP flags.

Signed-off-by: Nikolay Borisov <>
Signed-off-by: Theodore Ts'o <>
Signed-off-by: Ben Hutchings <>
7 years agofs/buffer.c: support buffer cache allocations with gfp modifiers
Gioh Kim [Fri, 5 Sep 2014 02:04:42 +0000 (22:04 -0400)]
fs/buffer.c: support buffer cache allocations with gfp modifiers

commit 3b5e6454aaf6b4439b19400d8365e2ec2d24e411 upstream.

A buffer cache is allocated from movable area because it is referred
for a while and released soon.  But some filesystems are taking buffer
cache for a long time and it can disturb page migration.

New APIs are introduced to allocate buffer cache with user specific
flag.  *_gfp APIs are for user want to set page allocation flag for
page cache allocation.  And *_unmovable APIs are for the user wants to
allocate page cache from non-movable area.

Signed-off-by: Gioh Kim <>
Signed-off-by: Theodore Ts'o <>
Reviewed-by: Jan Kara <>
[bwh: Prerequisite for "bufferhead: Add _gfp version for sb_getblk()".
 Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <>
7 years agoACPICA: Tables: Fix an issue that FACS initialization is performed twice
Lv Zheng [Wed, 1 Jul 2015 06:43:26 +0000 (14:43 +0800)]
ACPICA: Tables: Fix an issue that FACS initialization is performed twice

commit c04be18448355441a0c424362df65b6422e27bda upstream.

ACPICA commit 90f5332a15e9d9ba83831ca700b2b9f708274658

This patch adds a new FACS initialization flag for acpi_tb_initialize().
acpi_enable_subsystem() might be invoked several times in OS bootup process,
and we don't want FACS initialization to be invoked twice. Lv Zheng.

Signed-off-by: Lv Zheng <>
Signed-off-by: Bob Moore <>
Signed-off-by: Rafael J. Wysocki <>
[bwh: Backported to 3.2: adjust filename]
Signed-off-by: Ben Hutchings <>
7 years agoALSA: usb-audio: Add MIDI support for Steinberg MI2/MI4
Dominic Sacré [Tue, 30 Jun 2015 15:41:33 +0000 (17:41 +0200)]
ALSA: usb-audio: Add MIDI support for Steinberg MI2/MI4

commit 0689a86ae814f39af94a9736a0a5426dd82eb107 upstream.

The Steinberg MI2 and MI4 interfaces are compatible with the USB class
audio spec, but the MIDI part of the devices is reported as a vendor
specific interface.

This patch adds entries to quirks-table.h to recognize the MIDI
endpoints. Audio functionality was already working and is unaffected by
this change.

Signed-off-by: Dominic Sacré <>
Signed-off-by: Albert Huitsing <>
Acked-by: Clemens Ladisch <>
Signed-off-by: Takashi Iwai <>
Signed-off-by: Ben Hutchings <>
7 years agofuse: initialize fc->release before calling it
Miklos Szeredi [Wed, 1 Jul 2015 14:25:55 +0000 (16:25 +0200)]
fuse: initialize fc->release before calling it

commit 0ad0b3255a08020eaf50e34ef0d6df5bdf5e09ed upstream.

fc->release is called from fuse_conn_put() which was used in the error
cleanup before fc->release was initialized.

[Jeremiah Mahler <>: assign fc->release after calling
fuse_conn_init(fc) instead of before.]

Signed-off-by: Miklos Szeredi <>
Fixes: a325f9b92273 ("fuse: update fuse_conn_init() and separate out fuse_conn_kill()")
Signed-off-by: Ben Hutchings <>
7 years agocrush: fix a bug in tree bucket decode
Ilya Dryomov [Mon, 29 Jun 2015 16:30:23 +0000 (19:30 +0300)]
crush: fix a bug in tree bucket decode

commit 82cd003a77173c91b9acad8033fb7931dac8d751 upstream.

struct crush_bucket_tree::num_nodes is u8, so ceph_decode_8_safe()
should be used.  -Wconversion catches this, but I guess it went
unnoticed in all the noise it spews.  The actual problem (at least for
common crushmaps) isn't the u32 -> u8 truncation though - it's the
advancement by 4 bytes instead of 1 in the crushmap buffer.


Signed-off-by: Ilya Dryomov <>
Reviewed-by: Josh Durgin <>
Signed-off-by: Ben Hutchings <>
7 years agoBtrfs: fix race between caching kthread and returning inode to inode cache
Filipe Manana [Sat, 13 Jun 2015 05:52:57 +0000 (06:52 +0100)]
Btrfs: fix race between caching kthread and returning inode to inode cache

commit ae9d8f17118551bedd797406a6768b87c2146234 upstream.

While the inode cache caching kthread is calling btrfs_unpin_free_ino(),
we could have a concurrent call to btrfs_return_ino() that adds a new
entry to the root's free space cache of pinned inodes. This concurrent
call does not acquire the fs_info->commit_root_sem before adding a new
entry if the caching state is BTRFS_CACHE_FINISHED, which is a problem
because the caching kthread calls btrfs_unpin_free_ino() after setting
the caching state to BTRFS_CACHE_FINISHED and therefore races with
the task calling btrfs_return_ino(), which is adding a new entry, while
the former (caching kthread) is navigating the cache's rbtree, removing
and freeing nodes from the cache's rbtree without acquiring the spinlock
that protects the rbtree.

This race resulted in memory corruption due to double free of struct
btrfs_free_space objects because both tasks can end up doing freeing the
same objects. Note that adding a new entry can result in merging it with
other entries in the cache, in which case those entries are freed.
This is particularly important as btrfs_free_space structures are also
used for the block group free space caches.

This memory corruption can be detected by a debugging kernel, which
reports it with the following trace:

[132408.501148] slab error in verify_redzone_free(): cache `btrfs_free_space': double free detected
[132408.505075] CPU: 15 PID: 12248 Comm: btrfs-ino-cache Tainted: G        W       4.1.0-rc5-btrfs-next-10+ #1
[132408.505075] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 04/01/2014
[132408.505075]  ffff880023e7d320 ffff880163d73cd8 ffffffff8145eec7 ffffffff81095dce
[132408.505075]  ffff880009735d40 ffff880163d73ce8 ffffffff81154e1e ffff880163d73d68
[132408.505075]  ffffffff81155733 ffffffffa054a95a ffff8801b6099f00 ffffffffa0505b5f
[132408.505075] Call Trace:
[132408.505075]  [<ffffffff8145eec7>] dump_stack+0x4f/0x7b
[132408.505075]  [<ffffffff81095dce>] ? console_unlock+0x356/0x3a2
[132408.505075]  [<ffffffff81154e1e>] __slab_error.isra.28+0x25/0x36
[132408.505075]  [<ffffffff81155733>] __cache_free+0xe2/0x4b6
[132408.505075]  [<ffffffffa054a95a>] ? __btrfs_add_free_space+0x2f0/0x343 [btrfs]
[132408.505075]  [<ffffffffa0505b5f>] ? btrfs_unpin_free_ino+0x8e/0x99 [btrfs]
[132408.505075]  [<ffffffff810f3b30>] ? time_hardirqs_off+0x15/0x28
[132408.505075]  [<ffffffff81084d42>] ? trace_hardirqs_off+0xd/0xf
[132408.505075]  [<ffffffff811563a1>] ? kfree+0xb6/0x14e
[132408.505075]  [<ffffffff811563d0>] kfree+0xe5/0x14e
[132408.505075]  [<ffffffffa0505b5f>] btrfs_unpin_free_ino+0x8e/0x99 [btrfs]
[132408.505075]  [<ffffffffa0505e08>] caching_kthread+0x29e/0x2d9 [btrfs]
[132408.505075]  [<ffffffffa0505b6a>] ? btrfs_unpin_free_ino+0x99/0x99 [btrfs]
[132408.505075]  [<ffffffff8106698f>] kthread+0xef/0xf7
[132408.505075]  [<ffffffff810f3b08>] ? time_hardirqs_on+0x15/0x28
[132408.505075]  [<ffffffff810668a0>] ? __kthread_parkme+0xad/0xad
[132408.505075]  [<ffffffff814653d2>] ret_from_fork+0x42/0x70
[132408.505075]  [<ffffffff810668a0>] ? __kthread_parkme+0xad/0xad
[132408.505075] ffff880023e7d320: redzone 1:0x9f911029d74e35b, redzone 2:0x9f911029d74e35b.
[132409.501654] slab: double free detected in cache 'btrfs_free_space', objp ffff880023e7d320
[132409.503355] ------------[ cut here ]------------
[132409.504241] kernel BUG at mm/slab.c:2571!

Therefore fix this by having btrfs_unpin_free_ino() acquire the lock
that protects the rbtree while doing the searches and removing entries.

Fixes: 1c70d8fb4dfa ("Btrfs: fix inode caching vs tree log")
Signed-off-by: Filipe Manana <>
Signed-off-by: Chris Mason <>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <>
7 years agoBtrfs: use kmem_cache_free when freeing entry in inode cache
Filipe Manana [Sat, 13 Jun 2015 05:52:56 +0000 (06:52 +0100)]
Btrfs: use kmem_cache_free when freeing entry in inode cache

commit c3f4a1685bb87e59c886ee68f7967eae07d4dffa upstream.

The free space entries are allocated using kmem_cache_zalloc(),
through __btrfs_add_free_space(), therefore we should use
kmem_cache_free() and not kfree() to avoid any confusion and
any potential problem. Looking at the kfree() definition at
mm/slab.c it has the following comment:

   * (...)
   * Don't free memory not originally allocated by kmalloc()
   * or you will run into trouble.

So better be safe and use kmem_cache_free().

Signed-off-by: Filipe Manana <>
Reviewed-by: David Sterba <>
Signed-off-by: Chris Mason <>
Signed-off-by: Ben Hutchings <>
7 years agoagp/intel: Fix typo in needs_ilk_vtd_wa()
Chris Wilson [Sun, 28 Jun 2015 13:18:16 +0000 (14:18 +0100)]
agp/intel: Fix typo in needs_ilk_vtd_wa()

commit 8b572a4200828b4e75cc22ed2f494b58d5372d65 upstream.

In needs_ilk_vtd_wa(), we pass in the GPU device but compared it against
the ids for the mobile GPU and the mobile host bridge. That latter is
impossible and so likely was just a typo for the desktop GPU device id
(which is also buggy).

Fixes commit da88a5f7f7d434e2cde1b3e19d952e6d84533662
Author: Chris Wilson <>
Date:   Wed Feb 13 09:31:53 2013 +0000

    drm/i915: Disable WC PTE updates to w/a buggy IOMMU on ILK

Reported-by: Ting-Wei Lan <>
Signed-off-by: Chris Wilson <>
Cc: Daniel Vetter <>
Reviewed-by: Daniel Vetter <>
Signed-off-by: Jani Nikula <>
Signed-off-by: Ben Hutchings <>
7 years ago__bitmap_parselist: fix bug in empty string handling
Chris Metcalf [Thu, 25 Jun 2015 22:02:08 +0000 (15:02 -0700)]
__bitmap_parselist: fix bug in empty string handling

commit 2528a8b8f457d7432552d0e2b6f0f4046bb702f4 upstream.

bitmap_parselist("", &mask, nmaskbits) will erroneously set bit zero in
the mask.  The same bug is visible in cpumask_parselist() since it is
layered on top of the bitmask code, e.g.  if you boot with "isolcpus=",
you will actually end up with cpu zero isolated.

The bug was introduced in commit 4b060420a596 ("bitmap, irq: add
smp_affinity_list interface to /proc/irq") when bitmap_parselist() was
generalized to support userspace as well as kernelspace.

Fixes: 4b060420a596 ("bitmap, irq: add smp_affinity_list interface to /proc/irq")
Signed-off-by: Chris Metcalf <>
Cc: Rasmus Villemoes <>
Signed-off-by: Andrew Morton <>
Signed-off-by: Linus Torvalds <>
Signed-off-by: Ben Hutchings <>
7 years agotracing/filter: Do not allow infix to exceed end of string
Steven Rostedt (Red Hat) [Thu, 25 Jun 2015 22:10:09 +0000 (18:10 -0400)]
tracing/filter: Do not allow infix to exceed end of string

commit 6b88f44e161b9ee2a803e5b2b1fbcf4e20e8b980 upstream.

While debugging a WARN_ON() for filtering, I found that it is possible
for the filter string to be referenced after its end. With the filter:

 # echo '>' > /sys/kernel/debug/events/ext4/ext4_truncate_exit/filter

The filter_parse() function can call infix_get_op() which calls
infix_advance() that updates the infix filter pointers for the cnt
and tail without checking if the filter is already at the end, which
will put the cnt to zero and the tail beyond the end. The loop then calls
infix_next() that has

return ps->infix.string[ps->infix.tail++];

The cnt will now be below zero, and the tail that is returned is
already passed the end of the filter string. So far the allocation
of the filter string usually has some buffer that is zeroed out, but
if the filter string is of the exact size of the allocated buffer
there's no guarantee that the charater after the nul terminating
character will be zero.

Luckily, only root can write to the filter.

Signed-off-by: Steven Rostedt <>
Signed-off-by: Ben Hutchings <>
7 years agotracing/filter: Do not WARN on operand count going below zero
Steven Rostedt (Red Hat) [Thu, 25 Jun 2015 22:02:29 +0000 (18:02 -0400)]
tracing/filter: Do not WARN on operand count going below zero

commit b4875bbe7e68f139bd3383828ae8e994a0df6d28 upstream.

When testing the fix for the trace filter, I could not come up with
a scenario where the operand count goes below zero, so I added a
WARN_ON_ONCE(cnt < 0) to the logic. But there is legitimate case
that it can happen (although the filter would be wrong).

 # echo '>' > /sys/kernel/debug/events/ext4/ext4_truncate_exit/filter

That is, a single operation without any operands will hit the path
where the WARN_ON_ONCE() can trigger. Although this is harmless,
and the filter is reported as a error. But instead of spitting out
a warning to the kernel dmesg, just fail nicely and report it via
the proper channels.

Reported-by: Vince Weaver <>
Reported-by: Sasha Levin <>
Signed-off-by: Steven Rostedt <>
Signed-off-by: Ben Hutchings <>
7 years agodell-laptop: Fix allocating & freeing SMI buffer page
Pali Rohár [Tue, 23 Jun 2015 08:11:19 +0000 (10:11 +0200)]
dell-laptop: Fix allocating & freeing SMI buffer page

commit b8830a4e71b15d0364ac8e6c55301eea73f211da upstream.

This commit fix kernel crash when probing for rfkill devices in dell-laptop
driver failed. Function free_page() was incorrectly used on struct page *
instead of virtual address of SMI buffer.

This commit also simplify allocating page for SMI buffer by using
__get_free_page() function instead of sequential call of functions
alloc_page() and page_address().

Signed-off-by: Pali Rohár <>
Acked-by: Michal Hocko <>
Signed-off-by: Darren Hart <>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <>
7 years agomm: kmemleak: allow safe memory scanning during kmemleak disabling
Catalin Marinas [Wed, 24 Jun 2015 23:58:26 +0000 (16:58 -0700)]
mm: kmemleak: allow safe memory scanning during kmemleak disabling

commit c5f3b1a51a591c18c8b33983908e7fdda6ae417e upstream.

The kmemleak scanning thread can run for minutes.  Callbacks like
kmemleak_free() are allowed during this time, the race being taken care
of by the object->lock spinlock.  Such lock also prevents a memory block
from being freed or unmapped while it is being scanned by blocking the
kmemleak_free() -> ...  -> __delete_object() function until the lock is
released in scan_object().

When a kmemleak error occurs (e.g.  it fails to allocate its metadata),
kmemleak_enabled is set and __delete_object() is no longer called on
freed objects.  If kmemleak_scan is running at the same time,
kmemleak_free() no longer waits for the object scanning to complete,
allowing the corresponding memory block to be freed or unmapped (in the
case of vfree()).  This leads to kmemleak_scan potentially triggering a
page fault.

This patch separates the kmemleak_free() enabling/disabling from the
overall kmemleak_enabled nob so that we can defer the disabling of the
object freeing tracking until the scanning thread completed.  The
kmemleak_free_part() is deliberately ignored by this patch since this is
only called during boot before the scanning thread started.

Signed-off-by: Catalin Marinas <>
Reported-by: Vignesh Radhakrishnan <>
Tested-by: Vignesh Radhakrishnan <>
Signed-off-by: Andrew Morton <>
Signed-off-by: Linus Torvalds <>
[bwh: Backported to 3.2:
 - Adjust context
 - Drop changes to kmemleak_free_percpu()]
Signed-off-by: Ben Hutchings <>
7 years agostmmac: troubleshoot unexpected bits in des0 & des1
Alexey Brodkin [Wed, 24 Jun 2015 08:47:41 +0000 (11:47 +0300)]
stmmac: troubleshoot unexpected bits in des0 & des1

commit f1590670ce069eefeb93916391a67643e6ad1630 upstream.

Current implementation of descriptor init procedure only takes
care about setting/clearing ownership flag in "des0"/"des1"
fields while it is perfectly possible to get unexpected bits
set because of the following factors:

 [1] On driver probe underlying memory allocated with
     dma_alloc_coherent() might not be zeroed and so
     it will be filled with garbage.

 [2] During driver operation some bits could be set by SD/MMC
     controller (for example error flags etc).

And unexpected and/or randomly set flags in "des0"/"des1"
fields may lead to unpredictable behavior of GMAC DMA block.

This change addresses both items above with:

 [1] Use of dma_zalloc_coherent() instead of simple
     dma_alloc_coherent() to make sure allocated memory is
     zeroed. That shouldn't affect performance because
     this allocation only happens once on driver probe.

 [2] Do explicit zeroing of both "des0" and "des1" fields
     of all buffer descriptors during initialization of
     DMA transfer.

And while at it fixed identation of dma_free_coherent()
counterpart as well.

Signed-off-by: Alexey Brodkin <>
Cc: Giuseppe Cavallaro <>
Cc: David Miller <>
Signed-off-by: David S. Miller <>
[bwh: Backported to 3.2:
 - Adjust context, indentation
 - Normal and extended descriptors are allocated in the same place here]
Signed-off-by: Ben Hutchings <>
Acked-by: Alexey Brodkin <>
7 years agofs: Fix S_NOSEC handling
Jan Kara [Thu, 21 May 2015 14:05:52 +0000 (16:05 +0200)]
fs: Fix S_NOSEC handling

commit 2426f3910069ed47c0cc58559a6d088af7920201 upstream.

file_remove_suid() could mistakenly set S_NOSEC inode bit when root was
modifying the file. As a result following writes to the file by ordinary
user would avoid clearing suid or sgid bits.

Fix the bug by checking actual mode bits before setting S_NOSEC.

Signed-off-by: Jan Kara <>
Signed-off-by: Al Viro <>
[bwh: Backported to 3.2: adjust filename]
Signed-off-by: Ben Hutchings <>
7 years agobridge: multicast: restore router configuration on port link down/up
Satish Ashok [Fri, 19 Jun 2015 08:22:57 +0000 (01:22 -0700)]
bridge: multicast: restore router configuration on port link down/up

commit 754bc547f0a79f7568b5b81c7fc0a8d044a6571a upstream.

When a port goes through a link down/up the multicast router configuration
is not restored.

Signed-off-by: Satish Ashok <>
Signed-off-by: Nikolay Aleksandrov <>
Fixes: 0909e11758bd ("bridge: Add multicast_router sysfs entries")
Acked-by: Herbert Xu <>
Signed-off-by: David S. Miller <>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <>
7 years agoNET: ROSE: Don't dereference NULL neighbour pointer.
Ralf Baechle [Thu, 18 Jun 2015 22:46:53 +0000 (00:46 +0200)]
NET: ROSE: Don't dereference NULL neighbour pointer.

commit d496f7842aada20c61e6044b3395383fa972872c upstream.

A ROSE socket doesn't necessarily always have a neighbour pointer so check
if the neighbour pointer is valid before dereferencing it.

Signed-off-by: Ralf Baechle <>
Tested-by: Bernard Pidoux <>
Signed-off-by: David S. Miller <>
Signed-off-by: Ben Hutchings <>
7 years agowatchdog: omap: assert the counter being stopped before reprogramming
Uwe Kleine-König [Wed, 29 Apr 2015 18:38:46 +0000 (20:38 +0200)]
watchdog: omap: assert the counter being stopped before reprogramming

commit 530c11d432727c697629ad5f9d00ee8e2864d453 upstream.

The omap watchdog has the annoying behaviour that writes to most
registers don't have any effect when the watchdog is already running.
Quoting the AM335x reference manual:

To modify the timer counter value (the WDT_WCRR register),
prescaler ratio (the WDT_WCLR[4:2] PTV bit field), delay
configuration value (the WDT_WDLY[31:0] DLY_VALUE bit field), or
the load value (the WDT_WLDR[31:0] TIMER_LOAD bit field), the
watchdog timer must be disabled by using the start/stop sequence
(the WDT_WSPR register).

Currently the timer is stopped in the .probe callback but still there
are possibilities that yield to a situation where omap_wdt_start is
entered with the timer running (e.g. when /dev/watchdog is closed
without stopping and then reopened). In such a case programming the
timeout silently fails!

To circumvent this stop the timer before reprogramming.

Assuming one of the first things the watchdog user does is setting the
timeout explicitly nothing too bad should happen because this explicit
setting works fine.

Fixes: 7768a13c252a ("[PATCH] OMAP: Add Watchdog driver support")
Signed-off-by: Uwe Kleine-König <>
Reviewed-by: Guenter Roeck <>
Signed-off-by: Wim Van Sebroeck <>
Signed-off-by: Ben Hutchings <>
7 years agoext4: don't retry file block mapping on bigalloc fs with non-extent file
Darrick J. Wong [Mon, 22 Jun 2015 01:10:51 +0000 (21:10 -0400)]
ext4: don't retry file block mapping on bigalloc fs with non-extent file

commit 292db1bc6c105d86111e858859456bcb11f90f91 upstream.

ext4 isn't willing to map clusters to a non-extent file.  Don't signal
this with an out of space error, since the FS will retry the
allocation (which didn't fail) forever.  Instead, return EUCLEAN so
that the operation will fail immediately all the way back to userspace.

(The fix is either to run e2fsck -E bmap2extent, or to chattr +e the file.)

Signed-off-by: Darrick J. Wong <>
Signed-off-by: Theodore Ts'o <>
Signed-off-by: Ben Hutchings <>
7 years agoiio: DAC: ad5624r_spi: fix bit shift of output data value
JM Friedt [Fri, 19 Jun 2015 12:48:06 +0000 (14:48 +0200)]
iio: DAC: ad5624r_spi: fix bit shift of output data value

commit adfa969850ae93beca57f7527f0e4dc10cbe1309 upstream.

The value sent on the SPI bus is shifted by an erroneous number of bits.
The shift value was already computed in the iio_chan_spec structure and
hence subtracting this argument to 16 yields an erroneous data position
in the SPI stream.

Signed-off-by: JM Friedt <>
Acked-by: Lars-Peter Clausen <>
Signed-off-by: Jonathan Cameron <>
[bwh: Backported to 3.2: adjust filename]
Signed-off-by: Ben Hutchings <>
7 years agoext4: call sync_blockdev() before invalidate_bdev() in put_super()
Theodore Ts'o [Sun, 21 Jun 2015 02:50:33 +0000 (22:50 -0400)]
ext4: call sync_blockdev() before invalidate_bdev() in put_super()

commit 89d96a6f8e6491f24fc8f99fd6ae66820e85c6c1 upstream.

Normally all of the buffers will have been forced out to disk before
we call invalidate_bdev(), but there will be some cases, where a file
system operation was aborted due to an ext4_error(), where there may
still be some dirty buffers in the buffer cache for the device.  So
try to force them out to memory before calling invalidate_bdev().

This fixes a warning triggered by generic/081:

WARNING: CPU: 1 PID: 3473 at /usr/projects/linux/ext4/fs/block_dev.c:56 __blkdev_put+0xb5/0x16f()

Signed-off-by: Theodore Ts'o <>
Signed-off-by: Ben Hutchings <>
7 years agoBluetooth: ath3k: add support of 04ca:300f AR3012 device
Dmitry Tunin [Sat, 2 May 2015 10:36:58 +0000 (13:36 +0300)]
Bluetooth: ath3k: add support of 04ca:300f AR3012 device

commit ec0810d2ac1c932dad48f45da67e3adc5c5449a1 upstream.

T:  Bus=01 Lev=01 Prnt=01 Port=04 Cnt=02 Dev#=  3 Spd=12  MxCh= 0
D:  Ver= 1.10 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=04ca ProdID=300f Rev=00.01
C:  #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA
I:  If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
I:  If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb

Signed-off-by: Dmitry Tunin <>
Signed-off-by: Marcel Holtmann <>
Signed-off-by: Ben Hutchings <>