pandora-kernel.git
8 years agovfs: Test for and handle paths that are unreachable from their mnt_root
Eric W. Biederman [Sun, 16 Aug 2015 01:27:13 +0000 (20:27 -0500)]
vfs: Test for and handle paths that are unreachable from their mnt_root

In rare cases a directory can be renamed out from under a bind mount.
In those cases without special handling it becomes possible to walk up
the directory tree to the root dentry of the filesystem and down
from the root dentry to every other file or directory on the filesystem.

Like division by zero .. from an unconnected path can not be given
a useful semantic as there is no predicting at which path component
the code will realize it is unconnected.  We certainly can not match
the current behavior as the current behavior is a security hole.

Therefore when encounting .. when following an unconnected path
return -ENOENT.

- Add a function path_connected to verify path->dentry is reachable
  from path->mnt.mnt_root.  AKA to validate that rename did not do
  something nasty to the bind mount.

  To avoid races path_connected must be called after following a path
  component to it's next path component.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
8 years agodcache: Reduce the scope of i_lock in d_splice_alias
Eric W. Biederman [Sat, 15 Aug 2015 18:36:41 +0000 (13:36 -0500)]
dcache: Reduce the scope of i_lock in d_splice_alias

i_lock is only needed until __d_find_any_alias calls dget on the alias
dentry.  After that the reference to new ensures that dentry_kill and
d_delete will not remove the inode from the dentry, and remove the
dentry from the inode->d_entry list.

The inode i_lock came to be held over the the __d_move calls in
d_splice_alias through a series of introduction of locks with
increasing smaller scope.  First it was the dcache_lock, then
it was the dcache_inode_lock, and finally inode->i_lock.

Furthermore inode->i_lock is not held over any other calls
to d_move or __d_move so it can not provide any meaningful
rename protection.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
8 years agodcache: Handle escaped paths in prepend_path
Eric W. Biederman [Sat, 15 Aug 2015 18:36:12 +0000 (13:36 -0500)]
dcache: Handle escaped paths in prepend_path

A rename can result in a dentry that by walking up d_parent
will never reach it's mnt_root.  For lack of a better term
I call this an escaped path.

prepend_path is called by four different functions __d_path,
d_absolute_path, d_path, and getcwd.

__d_path only wants to see paths are connected to the root it passes
in.  So __d_path needs prepend_path to return an error.

d_absolute_path similarly wants to see paths that are connected to
some root.  Escaped paths are not connected to any mnt_root so
d_absolute_path needs prepend_path to return an error greater
than 1.  So escaped paths will be treated like paths on lazily
unmounted mounts.

getcwd needs to prepend "(unreachable)" so getcwd also needs
prepend_path to return an error.

d_path is the interesting hold out.  d_path just wants to print
something, and does not care about the weird cases.  Which raises
the question what should be printed?

Given that <escaped_path>/<anything> should result in -ENOENT I
believe it is desirable for escaped paths to be printed as empty
paths.  As there are not really any meaninful path components when
considered from the perspective of a mount tree.

So tweak prepend_path to return an empty path with an new error
code of 3 when it encounters an escaped path.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
8 years agomm: fix potential data race in SyS_swapon
Hugh Dickins [Tue, 18 Aug 2015 00:34:27 +0000 (17:34 -0700)]
mm: fix potential data race in SyS_swapon

While running KernelThreadSanitizer (ktsan) on upstream kernel with
trinity, we got a few reports from SyS_swapon, here is one of them:

Read of size 8 by thread T307 (K7621):
 [<     inlined    >] SyS_swapon+0x3c0/0x1850 SYSC_swapon mm/swapfile.c:2395
 [<ffffffff812242c0>] SyS_swapon+0x3c0/0x1850 mm/swapfile.c:2345
 [<ffffffff81e97c8a>] ia32_do_call+0x1b/0x25

Looks like the swap_lock should be taken when iterating through the
swap_info array on lines 2392 - 2401: q->swap_file may be reset to
NULL by another thread before it is dereferenced for f_mapping.

But why is that iteration needed at all?  Doesn't the claim_swapfile()
which follows do all that is needed to check for a duplicate entry -
FMODE_EXCL on a bdev, testing IS_SWAPFILE under i_mutex on a regfile?

Well, not quite: bd_may_claim() allows the same "holder" to claim the
bdev again, so we do need to use a different holder than "sys_swapon";
and we should not replace appropriate -EBUSY by inappropriate -EINVAL.

Index i was reused in a cpu loop further down: renamed cpu there.

Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
8 years agoMerge branch 'superblock-scaling' of git://git.kernel.org/pub/scm/linux/kernel/git...
Al Viro [Fri, 21 Aug 2015 06:31:20 +0000 (02:31 -0400)]
Merge branch 'superblock-scaling' of git://git./linux/kernel/git/josef/btrfs-next into for-next

Conflicts:
include/linux/fs.h

8 years agoMerge branch 'ufs' into for-next
Al Viro [Wed, 19 Aug 2015 03:52:47 +0000 (23:52 -0400)]
Merge branch 'ufs' into for-next

8 years agoMerge branch 'sb_writers_pcpu_rwsem' of git://git.kernel.org/pub/scm/linux/kernel...
Al Viro [Wed, 19 Aug 2015 03:43:29 +0000 (23:43 -0400)]
Merge branch 'sb_writers_pcpu_rwsem' of git://git./linux/kernel/git/oleg/misc into for-next

8 years agoinode: don't softlockup when evicting inodes
Josef Bacik [Wed, 4 Mar 2015 21:52:52 +0000 (16:52 -0500)]
inode: don't softlockup when evicting inodes

On a box with a lot of ram (148gb) I can make the box softlockup after running
an fs_mark job that creates hundreds of millions of empty files.  This is
because we never generate enough memory pressure to keep the number of inodes on
our unused list low, so when we go to unmount we have to evict ~100 million
inodes.  This makes one processor a very unhappy person, so add a cond_resched()
in dispose_list() and if we need a resched when processing the s_inodes list do
that and run dispose_list() on what we've currently culled.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: Jan Kara <jack@suse.cz>
8 years agoinode: rename i_wb_list to i_io_list
Dave Chinner [Wed, 4 Mar 2015 19:07:22 +0000 (14:07 -0500)]
inode: rename i_wb_list to i_io_list

There's a small consistency problem between the inode and writeback
naming. Writeback calls the "for IO" inode queues b_io and
b_more_io, but the inode calls these the "writeback list" or
i_wb_list. This makes it hard to an new "under writeback" list to
the inode, or call it an "under IO" list on the bdi because either
way we'll have writeback on IO and IO on writeback and it'll just be
confusing. I'm getting confused just writing this!

So, rename the inode "for IO" list variable to i_io_list so we can
add a new "writeback list" in a subsequent patch.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Dave Chinner <dchinner@redhat.com>
8 years agosync: serialise per-superblock sync operations
Dave Chinner [Wed, 4 Mar 2015 18:40:00 +0000 (13:40 -0500)]
sync: serialise per-superblock sync operations

When competing sync(2) calls walk the same filesystem, they need to
walk the list of inodes on the superblock to find all the inodes
that we need to wait for IO completion on. However, when multiple
wait_sb_inodes() calls do this at the same time, they contend on the
the inode_sb_list_lock and the contention causes system wide
slowdowns. In effect, concurrent sync(2) calls can take longer and
burn more CPU than if they were serialised.

Stop the worst of the contention by adding a per-sb mutex to wrap
around wait_sb_inodes() so that we only execute one sync(2) IO
completion walk per superblock superblock at a time and hence avoid
contention being triggered by concurrent sync(2) calls.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Dave Chinner <dchinner@redhat.com>
8 years agoinode: convert inode_sb_list_lock to per-sb
Dave Chinner [Wed, 4 Mar 2015 17:37:22 +0000 (12:37 -0500)]
inode: convert inode_sb_list_lock to per-sb

The process of reducing contention on per-superblock inode lists
starts with moving the locking to match the per-superblock inode
list. This takes the global lock out of the picture and reduces the
contention problems to within a single filesystem. This doesn't get
rid of contention as the locks still have global CPU scope, but it
does isolate operations on different superblocks form each other.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Dave Chinner <dchinner@redhat.com>
8 years agoinode: add hlist_fake to avoid the inode hash lock in evict
Josef Bacik [Thu, 12 Mar 2015 12:19:11 +0000 (08:19 -0400)]
inode: add hlist_fake to avoid the inode hash lock in evict

Some filesystems don't use the VFS inode hash and fake the fact they
are hashed so that all the writeback code works correctly. However,
this means the evict() path still tries to remove the inode from the
hash, meaning that the inode_hash_lock() needs to be taken
unnecessarily. Hence under certain workloads the inode_hash_lock can
be contended even if the inode is never actually hashed.

To avoid this add hlist_fake to test if the inode isn't actually
hashed to avoid taking the hash lock on inodes that have never been
hashed.  Based on Dave Chinner's

inode: add IOP_NOTHASHED to avoid inode hash lock in evict

basd on Al's suggestions.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Dave Chinner <dchinner@redhat.com>
8 years agowriteback: plug writeback at a high level
Dave Chinner [Wed, 4 Mar 2015 16:16:36 +0000 (11:16 -0500)]
writeback: plug writeback at a high level

Doing writeback on lots of little files causes terrible IOPS storms
because of the per-mapping writeback plugging we do. This
essentially causes imeediate dispatch of IO for each mapping,
regardless of the context in which writeback is occurring.

IOWs, running a concurrent write-lots-of-small 4k files using fsmark
on XFS results in a huge number of IOPS being issued for data
writes.  Metadata writes are sorted and plugged at a high level by
XFS, so aggregate nicely into large IOs. However, data writeback IOs
are dispatched in individual 4k IOs, even when the blocks of two
consecutively written files are adjacent.

Test VM: 8p, 8GB RAM, 4xSSD in RAID0, 100TB sparse XFS filesystem,
metadata CRCs enabled.

Kernel: 3.10-rc5 + xfsdev + my 3.11 xfs queue (~70 patches)

Test:

$ ./fs_mark  -D  10000  -S0  -n  10000  -s  4096  -L  120  -d
/mnt/scratch/0  -d  /mnt/scratch/1  -d  /mnt/scratch/2  -d
/mnt/scratch/3  -d  /mnt/scratch/4  -d  /mnt/scratch/5  -d
/mnt/scratch/6  -d  /mnt/scratch/7

Result:

wall sys create rate Physical write IO
time CPU (avg files/s)  IOPS Bandwidth
----- ----- ------------ ------ ---------
unpatched 6m56s 15m47s 24,000+/-500 26,000 130MB/s
patched 5m06s 13m28s 32,800+/-600  1,500 180MB/s
improvement -26.44% -14.68%   +36.67% -94.23% +38.46%

If I use zero length files, this workload at about 500 IOPS, so
plugging drops the data IOs from roughly 25,500/s to 1000/s.
3 lines of code, 35% better throughput for 15% less CPU.

The benefits of plugging at this layer are likely to be higher for
spinning media as the IO patterns for this workload are going make a
much bigger difference on high IO latency devices.....

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Tested-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
8 years agoLinux 4.2-rc7
Linus Torvalds [Sun, 16 Aug 2015 23:34:13 +0000 (16:34 -0700)]
Linux 4.2-rc7

8 years agoMerge tag 'armsoc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm...
Linus Torvalds [Sun, 16 Aug 2015 22:44:33 +0000 (15:44 -0700)]
Merge tag 'armsoc-for-linus' of git://git./linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Olof Johansson:
 "A smallish batch of fixes, a little more than expected this late, but
  all fixes are contained to their platforms and seem reasonably low
  risk:

   - a somewhat large SMP fix for ux500 that still seemed warranted to
     include here
   - OMAP DT fixes for pbias regulator specification that broke due to
     some DT reshuffling
   - PCIe IRQ routing bugfix for i.MX
   - networking fixes for keystone
   - runtime PM for OMAP GPMC
   - a couple of error path bug fixes for exynos"

* tag 'armsoc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: dts: keystone: Fix the mdio bindings by moving it to soc specific file
  ARM: dts: keystone: fix the clock node for mdio
  memory: omap-gpmc: Don't try to save uninitialized GPMC context
  ARM: imx6: correct i.MX6 PCIe interrupt routing
  ARM: ux500: add an SMP enablement type and move cpu nodes
  ARM: dts: dra7: Fix broken pbias device creation
  ARM: dts: OMAP5: Fix broken pbias device creation
  ARM: dts: OMAP4: Fix broken pbias device creation
  ARM: dts: omap243x: Fix broken pbias device creation
  ARM: EXYNOS: fix double of_node_put() on error path
  ARM: EXYNOS: Fix potentian kfree() of ro memory

8 years agoMerge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Linus Torvalds [Sun, 16 Aug 2015 22:39:31 +0000 (15:39 -0700)]
Merge branch 'upstream' of git://git.linux-mips.org/ralf/upstream-linus

Pull MIPS bugfix from Ralf Baechle:
 "Only a single MIPS fix - the math when invoking syscall_trace_enter
  was wrong"

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  MIPS: Fix seccomp syscall argument for MIPS64

8 years agoMerge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 16 Aug 2015 22:11:25 +0000 (15:11 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Merge x86 fixes from Ingo Molnar:
 "Two followup fixes related to the previous LDT fix"

Also applied a further FPU emulation fix from Andy Lutomirski to the
branch before actually merging it.

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
  x86/ldt: Further fix FPU emulation
  x86/ldt: Correct FPU emulation access to LDT
  x86/ldt: Correct LDT access in single stepping logic

8 years agox86/ldt: Further fix FPU emulation
Andy Lutomirski [Fri, 14 Aug 2015 22:02:55 +0000 (15:02 -0700)]
x86/ldt: Further fix FPU emulation

The previous fix confused a selector with a segment prefix.  Fix it.

Compile-tested only.

Cc: stable@vger.kernel.org
Cc: Juergen Gross <jgross@suse.com>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Fixes: 4809146b86c3 ("x86/ldt: Correct FPU emulation access to LDT")
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agofs/fuse: fix ioctl type confusion
Jann Horn [Sun, 16 Aug 2015 18:27:01 +0000 (20:27 +0200)]
fs/fuse: fix ioctl type confusion

fuse_dev_ioctl() performed fuse_get_dev() on a user-supplied fd,
leading to a type confusion issue. Fix it by checking file->f_op.

Signed-off-by: Jann Horn <jann@thejh.net>
Acked-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoMerge tag 'keystone-dts-late-fixes-v2' of git://git.kernel.org/pub/scm/linux/kernel...
Olof Johansson [Sun, 16 Aug 2015 19:29:57 +0000 (21:29 +0200)]
Merge tag 'keystone-dts-late-fixes-v2' of git://git./linux/kernel/git/ssantosh/linux-keystone into fixes

ARM: Couple of Keysyone MDIO DTS fixes for 4.2-rc6+

These are necessary to get the NIC card working on all Keystone
EVMs. Couple of boards are broken without these two fixes.

* tag 'keystone-dts-late-fixes-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux-keystone:
  ARM: dts: keystone: Fix the mdio bindings by moving it to soc specific file
  ARM: dts: keystone: fix the clock node for mdio

Signed-off-by: Olof Johansson <olof@lixom.net>
8 years agoMIPS: Fix seccomp syscall argument for MIPS64
Markos Chandras [Thu, 13 Aug 2015 07:47:59 +0000 (08:47 +0100)]
MIPS: Fix seccomp syscall argument for MIPS64

Commit 4c21b8fd8f14 ("MIPS: seccomp: Handle indirect system calls (o32)")
fixed indirect system calls on O32 but it also introduced a bug for MIPS64
where it erroneously modified the v0 (syscall) register with the assumption
that the sycall offset hasn't been taken into consideration. This breaks
seccomp on MIPS64 n64 and n32 ABIs. We fix this by replacing the addition
with a move instruction.

Fixes: 4c21b8fd8f14 ("MIPS: seccomp: Handle indirect system calls (o32)")
Cc: <stable@vger.kernel.org> # 3.15+
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/10951/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
8 years agoMerge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Sat, 15 Aug 2015 20:54:53 +0000 (13:54 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "This has two libfc fixes for bugs causing rare crashes, one iscsi fix
  for a potential hang on shutdown, and a fix for an I/O blocksize issue
  which caused a regression"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  sd: Fix maximum I/O size for BLOCK_PC requests
  libfc: Fix fc_fcp_cleanup_each_cmd()
  libfc: Fix fc_exch_recv_req() error path
  libiscsi: Fix host busy blocking during connection teardown

8 years agochange sb_writers to use percpu_rw_semaphore
Oleg Nesterov [Tue, 11 Aug 2015 15:05:04 +0000 (17:05 +0200)]
change sb_writers to use percpu_rw_semaphore

We can remove everything from struct sb_writers except frozen
and add the array of percpu_rw_semaphore's instead.

This patch doesn't remove sb_writers->wait_unfrozen yet, we keep
it for get_super_thawed(). We will probably remove it later.

This change tries to address the following problems:

- Firstly, __sb_start_write() looks simply buggy. It does
  __sb_end_write() if it sees ->frozen, but if it migrates
  to another CPU before percpu_counter_dec(), sb_wait_write()
  can wrongly succeed if there is another task which holds
  the same "semaphore": sb_wait_write() can miss the result
  of the previous percpu_counter_inc() but see the result
  of this percpu_counter_dec().

- As Dave Hansen reports, it is suboptimal. The trivial
  microbenchmark that writes to a tmpfs file in a loop runs
  12% faster if we change this code to rely on RCU and kill
  the memory barriers.

- This code doesn't look simple. It would be better to rely
  on the generic locking code.

  According to Dave, this change adds the same performance
  improvement.

Note: with this change both freeze_super() and thaw_super() will do
synchronize_sched_expedited() 3 times. This is just ugly. But:

- This will be "fixed" by the rcu_sync changes we are going
  to merge. After that freeze_super()->percpu_down_write()
  will use synchronize_sched(), and thaw_super() won't use
  synchronize() at all.

  This doesn't need any changes in fs/super.c.

- Once we merge rcu_sync changes, we can also change super.c
  so that all wb_write->rw_sem's will share the single ->rss
  in struct sb_writes, then freeze_super() will need only one
  synchronize_sched().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Jan Kara <jack@suse.com>
8 years agoshift percpu_counter_destroy() into destroy_super_work()
Oleg Nesterov [Wed, 22 Jul 2015 18:21:13 +0000 (20:21 +0200)]
shift percpu_counter_destroy() into destroy_super_work()

Of course, this patch is ugly as hell. It will be (partially)
reverted later. We add it to ensure that other WIP changes in
percpu_rw_semaphore won't break fs/super.c.

We do not even need this change right now, percpu_free_rwsem()
is fine in atomic context. But we are going to change this, it
will be might_sleep() after we merge the rcu_sync() patches.

And even after that we do not really need destroy_super_work(),
we will kill it in any case. Instead, destroy_super_rcu() should
just check that rss->cb_state == CB_IDLE and do call_rcu() again
in the (very unlikely) case this is not true.

So this is just the temporary kludge which helps us to avoid the
conflicts with the changes which will be (hopefully) routed via
rcu tree.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Jan Kara <jack@suse.com>
8 years agopercpu-rwsem: kill CONFIG_PERCPU_RWSEM
Oleg Nesterov [Tue, 11 Aug 2015 15:26:29 +0000 (17:26 +0200)]
percpu-rwsem: kill CONFIG_PERCPU_RWSEM

Remove CONFIG_PERCPU_RWSEM, the next patch adds the unconditional
user of percpu_rw_semaphore.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
8 years agopercpu-rwsem: introduce percpu_rwsem_release() and percpu_rwsem_acquire()
Oleg Nesterov [Tue, 21 Jul 2015 18:26:44 +0000 (20:26 +0200)]
percpu-rwsem: introduce percpu_rwsem_release() and percpu_rwsem_acquire()

Add percpu_rwsem_release() and percpu_rwsem_acquire() for the users
which need to return to userspace with percpu-rwsem lock held and/or
pass the ownership to another thread.

TODO: change percpu_rwsem_release() to use rwsem_clear_owner(). We can
either fold kernel/locking/rwsem.h into include/linux/rwsem.h, or add
the non-inline percpu_rwsem_clear_owner().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
8 years agopercpu-rwsem: introduce percpu_down_read_trylock()
Oleg Nesterov [Tue, 21 Jul 2015 15:45:57 +0000 (17:45 +0200)]
percpu-rwsem: introduce percpu_down_read_trylock()

Add percpu_down_read_trylock(), it will have the user soon.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
8 years agodocument rwsem_release() in sb_wait_write()
Oleg Nesterov [Tue, 11 Aug 2015 14:28:29 +0000 (16:28 +0200)]
document rwsem_release() in sb_wait_write()

Not only we need to avoid the warning from lockdep_sys_exit(), the
caller of freeze_super() can never release this lock. Another thread
can do this, so there is another reason for rwsem_release().

Plus the comment should explain why we have to fool lockdep.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Jan Kara <jack@suse.com>
8 years agofix the broken lockdep logic in __sb_start_write()
Oleg Nesterov [Sun, 19 Jul 2015 22:50:55 +0000 (00:50 +0200)]
fix the broken lockdep logic in __sb_start_write()

1. wait_event(frozen < level) without rwsem_acquire_read() is just
   wrong from lockdep perspective. If we are going to deadlock
   because the caller is buggy, lockdep can't detect this problem.

2. __sb_start_write() can race with thaw_super() + freeze_super(),
   and after "goto retry" the 2nd  acquire_freeze_lock() is wrong.

3. The "tell lockdep we are doing trylock" hack doesn't look nice.

   I think this is correct, but this logic should be more explicit.
   Yes, the recursive read_lock() is fine if we hold the lock on a
   higher level. But we do not need to fool lockdep. If we can not
   deadlock in this case then try-lock must not fail and we can use
   use wait == F throughout this code.

Note: as Dave Chinner explains, the "trylock" hack and the fat comment
can be probably removed. But this needs a separate change and it will
be trivial: just kill __sb_start_write() and rename do_sb_start_write()
back to __sb_start_write().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Jan Kara <jack@suse.com>
8 years agointroduce __sb_writers_{acquired,release}() helpers
Oleg Nesterov [Sun, 19 Jul 2015 21:48:20 +0000 (23:48 +0200)]
introduce __sb_writers_{acquired,release}() helpers

Preparation to hide the sb->s_writers internals from xfs and btrfs.
Add 2 trivial define's they can use rather than play with ->s_writers
directly. No changes in btrfs/transaction.o and xfs/xfs_aops.o.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Jan Kara <jack@suse.com>
8 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Sat, 15 Aug 2015 00:27:52 +0000 (17:27 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
 "Just two very small & simple patches"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: Use adjustment in guest cycles when handling MSR_IA32_TSC_ADJUST
  KVM: x86: zero IDT limit on entry to SMM

8 years agoMerge branch 'akpm' (patches from Andrew)
Linus Torvalds [Sat, 15 Aug 2015 00:05:26 +0000 (17:05 -0700)]
Merge branch 'akpm' (patches from Andrew)

Merge fixes from Andrew Morton:
 "11 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  Update maintainers for DRM STI driver
  mm: cma: mark cma_bitmap_maxno() inline in header
  zram: fix pool name truncation
  memory-hotplug: fix wrong edge when hot add a new node
  .mailmap: Andrey Ryabinin has moved
  ipc/sem.c: update/correct memory barriers
  mm/hwpoison: fix panic due to split huge zero page
  ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()
  ipc,sem: fix use after free on IPC_RMID after a task using same semaphore set exits
  mm/hwpoison: fix fail isolate hugetlbfs page w/ refcount held
  mm/hwpoison: fix page refcount of unknown non LRU page

8 years agoMerge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 14 Aug 2015 23:10:04 +0000 (16:10 -0700)]
Merge tag 'clk-fixes-for-linus' of git://git./linux/kernel/git/clk/linux

Pull clock fix from Stephen Boyd:
 "A one-liner for a regression found in the PXA clock driver"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: pxa: pxa3xx: fix CKEN register access

8 years agoUpdate maintainers for DRM STI driver
Benjamin Gaignard [Fri, 14 Aug 2015 22:35:24 +0000 (15:35 -0700)]
Update maintainers for DRM STI driver

Add Vincent Abriou and myself as maintainers.

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Cc: Vincent Abriou <vincent.abriou@st.com>
Cc: Dave Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agomm: cma: mark cma_bitmap_maxno() inline in header
Gregory Fong [Fri, 14 Aug 2015 22:35:21 +0000 (15:35 -0700)]
mm: cma: mark cma_bitmap_maxno() inline in header

cma_bitmap_maxno() was marked as static and not static inline, which can
cause warnings about this function not being used if this file is included
in a file that does not call that function, and violates the conventions
used elsewhere.  The two options are to move the function implementation
back to mm/cma.c or make it inline here, and it's simple enough for the
latter to make sense.

Signed-off-by: Gregory Fong <gregory.0xf0@gmail.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agozram: fix pool name truncation
Sergey Senozhatsky [Fri, 14 Aug 2015 22:35:19 +0000 (15:35 -0700)]
zram: fix pool name truncation

zram_meta_alloc() constructs a pool name for zs_create_pool() call as

    snprintf(pool_name, sizeof(pool_name), "zram%d", device_id);

However, it defines pool name buffer to be only 8 bytes long (minus
trailing zero), which means that we can have only 1000 pool names: zram0
-- zram999.

With CONFIG_ZSMALLOC_STAT enabled an attempt to create a device zram1000
can fail if device zram100 already exists, because snprintf() will
truncate new pool name to zram100 and pass it debugfs_create_dir(),
causing:

  debugfs dir <zram100> creation failed
  zram: Error creating memory pool

... and so on.

Fix it by passing zram->disk->disk_name to zram_meta_alloc() instead of
divice_id.  We construct zram%d name earlier and keep it as a ->disk_name,
no need to snprintf() it again.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agomemory-hotplug: fix wrong edge when hot add a new node
Xishi Qiu [Fri, 14 Aug 2015 22:35:16 +0000 (15:35 -0700)]
memory-hotplug: fix wrong edge when hot add a new node

When we add a new node, the edge of memory may be wrong.

e.g. system has 4 nodes, and node3 is movable, node3 mem:[24G-32G],

1. hotremove the node3,
2. then hotadd node3 with a part of memory, mem:[26G-30G],
3. call hotadd_new_pgdat()
        free_area_init_node()
                get_pfn_range_for_nid()
4. it will return wrong start_pfn and end_pfn, because we have not
update the memblock.

This patch also fixes a BUG_ON during hot-addition, please see
http://marc.info/?l=linux-kernel&m=142961156129456&w=2

Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years ago.mailmap: Andrey Ryabinin has moved
Andrey Ryabinin [Fri, 14 Aug 2015 22:35:13 +0000 (15:35 -0700)]
.mailmap: Andrey Ryabinin has moved

Update my email address.

Signed-off-by: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoipc/sem.c: update/correct memory barriers
Manfred Spraul [Fri, 14 Aug 2015 22:35:10 +0000 (15:35 -0700)]
ipc/sem.c: update/correct memory barriers

sem_lock() did not properly pair memory barriers:

!spin_is_locked() and spin_unlock_wait() are both only control barriers.
The code needs an acquire barrier, otherwise the cpu might perform read
operations before the lock test.

As no primitive exists inside <include/spinlock.h> and since it seems
noone wants another primitive, the code creates a local primitive within
ipc/sem.c.

With regards to -stable:

The change of sem_wait_array() is a bugfix, the change to sem_lock() is a
nop (just a preprocessor redefinition to improve the readability).  The
bugfix is necessary for all kernels that use sem_wait_array() (i.e.:
starting from 3.10).

Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Reported-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Kirill Tkhai <ktkhai@parallels.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: <stable@vger.kernel.org> [3.10+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agomm/hwpoison: fix panic due to split huge zero page
Wanpeng Li [Fri, 14 Aug 2015 22:35:08 +0000 (15:35 -0700)]
mm/hwpoison: fix panic due to split huge zero page

Bug:

  ------------[ cut here ]------------
  kernel BUG at mm/huge_memory.c:1957!
  invalid opcode: 0000 [#1] SMP
  Modules linked in: snd_hda_codec_hdmi i915 rpcsec_gss_krb5 snd_hda_codec_realtek snd_hda_codec_generic nfsv4 dns_re
  CPU: 2 PID: 2576 Comm: test_huge Not tainted 4.2.0-rc5-mm1+ #27
  Hardware name: Dell Inc. OptiPlex 7020/0F5C5X, BIOS A03 01/08/2015
  task: ffff880204e3d600 ti: ffff8800db16c000 task.ti: ffff8800db16c000
  RIP: split_huge_page_to_list+0xdb/0x120
  Call Trace:
    memory_failure+0x32e/0x7c0
    madvise_hwpoison+0x8b/0x160
    SyS_madvise+0x40/0x240
    ? do_page_fault+0x37/0x90
    entry_SYSCALL_64_fastpath+0x12/0x71
  Code: ff f0 41 ff 4c 24 30 74 0d 31 c0 48 83 c4 08 5b 41 5c 41 5d c9 c3 4c 89 e7 e8 e2 58 fd ff 48 83 c4 08 31 c0
  RIP  split_huge_page_to_list+0xdb/0x120
   RSP <ffff8800db16fde8>
  ---[ end trace aee7ce0df8e44076 ]---

Testcase:

    #define _GNU_SOURCE
    #include <stdlib.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <sys/types.h>
    #include <errno.h>
    #include <string.h>

    #define MB 1024*1024

    int main(void)
    {
            char *mem;

            posix_memalign((void **)&mem, 2 * MB, 200 * MB);

            madvise(mem, 200 * MB, MADV_HWPOISON);

            free(mem);

            return 0;
    }

Huge zero page is allocated if page fault w/o FAULT_FLAG_WRITE flag.
The get_user_pages_fast() which called in madvise_hwpoison() will get
huge zero page if the page is not allocated before.  Huge zero page is a
tranparent huge page, however, it is not an anonymous page.
memory_failure will split the huge zero page and trigger
BUG_ON(is_huge_zero_page(page));

After commit 98ed2b0052e6 ("mm/memory-failure: give up error handling
for non-tail-refcounted thp"), memory_failure will not catch non anon
thp from madvise_hwpoison path and this bug occur.

Fix it by catching non anon thp in memory_failure in order to not split
huge zero page in madvise_hwpoison path.

After this patch:

  Injecting memory failure for page 0x202800 at 0x7fd8ae800000
  MCE: 0x202800: non anonymous thp
  [...]

[akpm@linux-foundation.org: remove second split, per Wanpeng]
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()
Herton R. Krzesinski [Fri, 14 Aug 2015 22:35:05 +0000 (15:35 -0700)]
ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()

After we acquire the sma->sem_perm lock in exit_sem(), we are protected
against a racing IPC_RMID operation.  Also at that point, we are the last
user of sem_undo_list.  Therefore it isn't required that we acquire or use
ulp->lock.

Signed-off-by: Herton R. Krzesinski <herton@redhat.com>
Acked-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Rafael Aquini <aquini@redhat.com>
CC: Aristeu Rozanski <aris@redhat.com>
Cc: David Jeffery <djeffery@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoipc,sem: fix use after free on IPC_RMID after a task using same semaphore set exits
Herton R. Krzesinski [Fri, 14 Aug 2015 22:35:02 +0000 (15:35 -0700)]
ipc,sem: fix use after free on IPC_RMID after a task using same semaphore set exits

The current semaphore code allows a potential use after free: in
exit_sem we may free the task's sem_undo_list while there is still
another task looping through the same semaphore set and cleaning the
sem_undo list at freeary function (the task called IPC_RMID for the same
semaphore set).

For example, with a test program [1] running which keeps forking a lot
of processes (which then do a semop call with SEM_UNDO flag), and with
the parent right after removing the semaphore set with IPC_RMID, and a
kernel built with CONFIG_SLAB, CONFIG_SLAB_DEBUG and
CONFIG_DEBUG_SPINLOCK, you can easily see something like the following
in the kernel log:

   Slab corruption (Not tainted): kmalloc-64 start=ffff88003b45c1c0, len=64
   000: 6b 6b 6b 6b 6b 6b 6b 6b 00 6b 6b 6b 6b 6b 6b 6b  kkkkkkkk.kkkkkkk
   010: ff ff ff ff 6b 6b 6b 6b ff ff ff ff ff ff ff ff  ....kkkk........
   Prev obj: start=ffff88003b45c180, len=64
   000: 00 00 00 00 ad 4e ad de ff ff ff ff 5a 5a 5a 5a  .....N......ZZZZ
   010: ff ff ff ff ff ff ff ff c0 fb 01 37 00 88 ff ff  ...........7....
   Next obj: start=ffff88003b45c200, len=64
   000: 00 00 00 00 ad 4e ad de ff ff ff ff 5a 5a 5a 5a  .....N......ZZZZ
   010: ff ff ff ff ff ff ff ff 68 29 a7 3c 00 88 ff ff  ........h).<....
   BUG: spinlock wrong CPU on CPU#2, test/18028
   general protection fault: 0000 [#1] SMP
   Modules linked in: 8021q mrp garp stp llc nf_conntrack_ipv4 nf_defrag_ipv4 ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables binfmt_misc ppdev input_leds joydev parport_pc parport floppy serio_raw virtio_balloon virtio_rng virtio_console virtio_net iosf_mbi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcspkr qxl ttm drm_kms_helper drm snd_hda_codec_generic i2c_piix4 snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore crc32c_intel virtio_pci virtio_ring virtio pata_acpi ata_generic [last unloaded: speedstep_lib]
   CPU: 2 PID: 18028 Comm: test Not tainted 4.2.0-rc5+ #1
   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014
   RIP: spin_dump+0x53/0xc0
   Call Trace:
     spin_bug+0x30/0x40
     do_raw_spin_unlock+0x71/0xa0
     _raw_spin_unlock+0xe/0x10
     freeary+0x82/0x2a0
     ? _raw_spin_lock+0xe/0x10
     semctl_down.clone.0+0xce/0x160
     ? __do_page_fault+0x19a/0x430
     ? __audit_syscall_entry+0xa8/0x100
     SyS_semctl+0x236/0x2c0
     ? syscall_trace_leave+0xde/0x130
     entry_SYSCALL_64_fastpath+0x12/0x71
   Code: 8b 80 88 03 00 00 48 8d 88 60 05 00 00 48 c7 c7 a0 2c a4 81 31 c0 65 8b 15 eb 40 f3 7e e8 08 31 68 00 4d 85 e4 44 8b 4b 08 74 5e <45> 8b 84 24 88 03 00 00 49 8d 8c 24 60 05 00 00 8b 53 04 48 89
   RIP  [<ffffffff810d6053>] spin_dump+0x53/0xc0
    RSP <ffff88003750fd68>
   ---[ end trace 783ebb76612867a0 ]---
   NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [test:18053]
   Modules linked in: 8021q mrp garp stp llc nf_conntrack_ipv4 nf_defrag_ipv4 ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables binfmt_misc ppdev input_leds joydev parport_pc parport floppy serio_raw virtio_balloon virtio_rng virtio_console virtio_net iosf_mbi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcspkr qxl ttm drm_kms_helper drm snd_hda_codec_generic i2c_piix4 snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore crc32c_intel virtio_pci virtio_ring virtio pata_acpi ata_generic [last unloaded: speedstep_lib]
   CPU: 3 PID: 18053 Comm: test Tainted: G      D         4.2.0-rc5+ #1
   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014
   RIP: native_read_tsc+0x0/0x20
   Call Trace:
     ? delay_tsc+0x40/0x70
     __delay+0xf/0x20
     do_raw_spin_lock+0x96/0x140
     _raw_spin_lock+0xe/0x10
     sem_lock_and_putref+0x11/0x70
     SYSC_semtimedop+0x7bf/0x960
     ? handle_mm_fault+0xbf6/0x1880
     ? dequeue_task_fair+0x79/0x4a0
     ? __do_page_fault+0x19a/0x430
     ? kfree_debugcheck+0x16/0x40
     ? __do_page_fault+0x19a/0x430
     ? __audit_syscall_entry+0xa8/0x100
     ? do_audit_syscall_entry+0x66/0x70
     ? syscall_trace_enter_phase1+0x139/0x160
     SyS_semtimedop+0xe/0x10
     SyS_semop+0x10/0x20
     entry_SYSCALL_64_fastpath+0x12/0x71
   Code: 47 10 83 e8 01 85 c0 89 47 10 75 08 65 48 89 3d 1f 74 ff 7e c9 c3 0f 1f 44 00 00 55 48 89 e5 e8 87 17 04 00 66 90 c9 c3 0f 1f 00 <55> 48 89 e5 0f 31 89 c1 48 89 d0 48 c1 e0 20 89 c9 48 09 c8 c9
   Kernel panic - not syncing: softlockup: hung tasks

I wasn't able to trigger any badness on a recent kernel without the
proper config debugs enabled, however I have softlockup reports on some
kernel versions, in the semaphore code, which are similar as above (the
scenario is seen on some servers running IBM DB2 which uses semaphore
syscalls).

The patch here fixes the race against freeary, by acquiring or waiting
on the sem_undo_list lock as necessary (exit_sem can race with freeary,
while freeary sets un->semid to -1 and removes the same sem_undo from
list_proc or when it removes the last sem_undo).

After the patch I'm unable to reproduce the problem using the test case
[1].

[1] Test case used below:

    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/ipc.h>
    #include <sys/sem.h>
    #include <sys/wait.h>
    #include <stdlib.h>
    #include <time.h>
    #include <unistd.h>
    #include <errno.h>

    #define NSEM 1
    #define NSET 5

    int sid[NSET];

    void thread()
    {
            struct sembuf op;
            int s;
            uid_t pid = getuid();

            s = rand() % NSET;
            op.sem_num = pid % NSEM;
            op.sem_op = 1;
            op.sem_flg = SEM_UNDO;

            semop(sid[s], &op, 1);
            exit(EXIT_SUCCESS);
    }

    void create_set()
    {
            int i, j;
            pid_t p;
            union {
                    int val;
                    struct semid_ds *buf;
                    unsigned short int *array;
                    struct seminfo *__buf;
            } un;

            /* Create and initialize semaphore set */
            for (i = 0; i < NSET; i++) {
                    sid[i] = semget(IPC_PRIVATE , NSEM, 0644 | IPC_CREAT);
                    if (sid[i] < 0) {
                            perror("semget");
                            exit(EXIT_FAILURE);
                    }
            }
            un.val = 0;
            for (i = 0; i < NSET; i++) {
                    for (j = 0; j < NSEM; j++) {
                            if (semctl(sid[i], j, SETVAL, un) < 0)
                                    perror("semctl");
                    }
            }

            /* Launch threads that operate on semaphore set */
            for (i = 0; i < NSEM * NSET * NSET; i++) {
                    p = fork();
                    if (p < 0)
                            perror("fork");
                    if (p == 0)
                            thread();
            }

            /* Free semaphore set */
            for (i = 0; i < NSET; i++) {
                    if (semctl(sid[i], NSEM, IPC_RMID))
                            perror("IPC_RMID");
            }

            /* Wait for forked processes to exit */
            while (wait(NULL)) {
                    if (errno == ECHILD)
                            break;
            };
    }

    int main(int argc, char **argv)
    {
            pid_t p;

            srand(time(NULL));

            while (1) {
                    p = fork();
                    if (p < 0) {
                            perror("fork");
                            exit(EXIT_FAILURE);
                    }
                    if (p == 0) {
                            create_set();
                            goto end;
                    }

                    /* Wait for forked processes to exit */
                    while (wait(NULL)) {
                            if (errno == ECHILD)
                                    break;
                    };
            }
    end:
            return 0;
    }

[akpm@linux-foundation.org: use normal comment layout]
Signed-off-by: Herton R. Krzesinski <herton@redhat.com>
Acked-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Rafael Aquini <aquini@redhat.com>
CC: Aristeu Rozanski <aris@redhat.com>
Cc: David Jeffery <djeffery@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agomm/hwpoison: fix fail isolate hugetlbfs page w/ refcount held
Wanpeng Li [Fri, 14 Aug 2015 22:34:59 +0000 (15:34 -0700)]
mm/hwpoison: fix fail isolate hugetlbfs page w/ refcount held

Hugetlbfs pages will get a refcount in get_any_page() or
madvise_hwpoison() if soft offlining through madvise.  The refcount which
is held by the soft offline path should be released if we fail to isolate
hugetlbfs pages.

Fix it by reducing the refcount for both isolation success and failure.

Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: <stable@vger.kernel.org> [3.9+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agomm/hwpoison: fix page refcount of unknown non LRU page
Wanpeng Li [Fri, 14 Aug 2015 22:34:56 +0000 (15:34 -0700)]
mm/hwpoison: fix page refcount of unknown non LRU page

After trying to drain pages from pagevec/pageset, we try to get reference
count of the page again, however, the reference count of the page is not
reduced if the page is still not on LRU list.

Fix it by adding the put_page() to drop the page reference which is from
__get_any_page().

Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: <stable@vger.kernel.org> [3.9+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoMerge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 14 Aug 2015 18:06:43 +0000 (11:06 -0700)]
Merge branch 'timers-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull timer fix from Ingo Molnar:
 "A single clocksource driver suspend/resume fix"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clockevents/drivers/sh_cmt: Only perform clocksource suspend/resume if enabled

8 years agoMerge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 14 Aug 2015 17:57:16 +0000 (10:57 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:
 "Misc fixes: PMU driver corner cases, tooling fixes, and an 'AUX'
  (Intel PT) race related core fix"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel/cqm: Do not access cpu_data() from CPU_UP_PREPARE handler
  perf/x86/intel: Fix memory leak on hot-plug allocation fail
  perf: Fix PERF_EVENT_IOC_PERIOD migration race
  perf: Fix double-free of the AUX buffer
  perf: Fix fasync handling on inherited events
  perf tools: Fix test build error when bindir contains double slash
  perf stat: Fix transaction lenght metrics
  perf: Fix running time accounting

8 years agoMerge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 14 Aug 2015 17:45:23 +0000 (10:45 -0700)]
Merge branch 'locking-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull locking fix from Ingo Molnar:
 "A single fix for a locking self-test crash"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/pvqspinlock: Fix kernel panic in locking-selftest

8 years agoMerge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Linus Torvalds [Fri, 14 Aug 2015 17:39:32 +0000 (10:39 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
 "Back from holidays, found these in the cracks: one nouveau revert, one
  vmwgfx locking fix and a bunch of exynos fixes"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  Revert "drm/nouveau/fifo/gk104: kick channels when deactivating them"
  drm/vmwgfx: Fix execbuf locking issues
  drm/exynos/fimc: fix runtime pm support
  drm/exynos/mixer: always update INT_EN cache
  drm/exynos/mixer: correct vsync configuration sequence
  drm/exynos/mixer: fix interrupt clearing
  drm/exynos/hdmi: fix edid memory leak
  drm/exynos: gsc: fix wrong bitwise operation for swap detection

8 years agoRevert "drm/nouveau/fifo/gk104: kick channels when deactivating them"
Alexandre Courbot [Wed, 12 Aug 2015 04:17:38 +0000 (13:17 +0900)]
Revert "drm/nouveau/fifo/gk104: kick channels when deactivating them"

This reverts commit 1addc1264852

This commit seems to cause crashes in gk104_fifo_intr_runlist() by
returning 0xbad0da00 when register 0x2a00 is read. Since this commit was
intended for GM20B which is not completely supported yet, let's revert
it for the time being.

Reported-by: Eric Biggers <ebiggers3@gmail.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Tested-by: Afzal Mohammed <afzal.mohd.ma@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
8 years agodrm/vmwgfx: Fix execbuf locking issues
Thomas Hellstrom [Wed, 12 Aug 2015 05:31:17 +0000 (22:31 -0700)]
drm/vmwgfx: Fix execbuf locking issues

This addresses two issues that cause problems with viewperf maya-03 in
situation with memory pressure.

The first issue causes attempts to unreserve buffers if batched
reservation fails due to, for example, a signal pending. While previously
the ttm_eu api was resistant against this type of error, it is no longer
and the lockdep code will complain about attempting to unreserve buffers
that are not reserved. The issue is resolved by avoid calling
ttm_eu_backoff_reservation in the buffer reserve error path.

The second issue is that the binding_mutex may be held when user-space
fence objects are created and hence during memory reclaims. This may cause
recursive attempts to grab the binding mutex. The issue is resolved by not
holding the binding mutex across fence creation and submission.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
8 years agoMerge branch 'exynos-drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git...
Dave Airlie [Thu, 13 Aug 2015 23:47:07 +0000 (09:47 +1000)]
Merge branch 'exynos-drm-fixes' of git://git./linux/kernel/git/daeinki/drm-exynos into drm-fixes

   This pull request fixes memory leak and some issues related to
   mixer and gscaler driver issues.

* 'exynos-drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos:
  drm/exynos/fimc: fix runtime pm support
  drm/exynos/mixer: always update INT_EN cache
  drm/exynos/mixer: correct vsync configuration sequence
  drm/exynos/mixer: fix interrupt clearing
  drm/exynos/hdmi: fix edid memory leak
  drm/exynos: gsc: fix wrong bitwise operation for swap detection

8 years agoMerge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Linus Torvalds [Thu, 13 Aug 2015 23:34:56 +0000 (16:34 -0700)]
Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm

Pull ARM fixes from Russell King:
 "Another few small ARM fixes, mostly addressing some VDSO issues"

* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
  ARM: 8410/1: VDSO: fix coarse clock monotonicity regression
  ARM: 8409/1: Mark ret_fast_syscall as a function
  ARM: 8408/1: Fix the secondary_startup function in Big Endian case
  ARM: 8405/1: VDSO: fix regression with toolchains lacking ld.bfd executable

8 years agox86: fix error handling for 32-bit compat out-of-range system call numbers
Linus Torvalds [Thu, 13 Aug 2015 23:19:44 +0000 (16:19 -0700)]
x86: fix error handling for 32-bit compat out-of-range system call numbers

Commit 3f5159a9221f ("x86/asm/entry/32: Update -ENOSYS handling to match
the 64-bit logic") broke the ENOSYS handling for the 32-bit compat case.
The proper error return value was never loaded into %rax, except if
things just happened to go through the audit paths, which ended up
reloading the return value.

This moves the loading or %rax into the normal system call path, just to
make sure the error case triggers it.  It's kind of sad, since it adds a
useless instruction to reload the register to the fast path, but it's
not like that single load from the stack is going to be noticeable.

Reported-by: David Drysdale <drysdale@google.com>
Tested-by: Kees Cook <keescook@chromium.org>
Acked-by: Andy Lutomirski <luto@amacapital.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoMerge tag 'dm-4.2-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/device...
Linus Torvalds [Thu, 13 Aug 2015 20:52:46 +0000 (13:52 -0700)]
Merge tag 'dm-4.2-fixes-5' of git://git./linux/kernel/git/device-mapper/linux-dm

Pull device mapper fixes from Mike Snitzer:

 - two stable fixes for corruption seen in a snapshot of thinp metadata;
   metadata snapshots aren't widely used but help provide a consistent
   view of the metadata associated with an active thin-pool.

 - a dm-cache fix for the 4.2 "default" policy switch from "mq" to "smq"

* tag 'dm-4.2-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm cache policy smq: move 'dm-cache-default' module alias to SMQ
  dm btree: add ref counting ops for the leaves of top level btrees
  dm thin metadata: delete btrees when releasing metadata snapshot

8 years agoMerge branch 'for-linus' of git://git.kernel.dk/linux-block
Linus Torvalds [Thu, 13 Aug 2015 20:44:32 +0000 (13:44 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block

Pull xen block driver fixes from Jens Axboe:
 "A few small bug fixes for xen-blk{front,back} that have been sitting
  over my vacation"

* 'for-linus' of git://git.kernel.dk/linux-block:
  xen-blkback: replace work_pending with work_busy in purge_persistent_gnt()
  xen-blkfront: don't add indirect pages to list when !feature_persistent
  xen-blkfront: introduce blkfront_gather_backend_features()

8 years agoMerge tag 'for-linus-4.2-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 13 Aug 2015 20:36:22 +0000 (13:36 -0700)]
Merge tag 'for-linus-4.2-rc6-tag' of git://git./linux/kernel/git/xen/tip

Pull xen bug fixes from David Vrabel:

 - revert a fix from 4.2-rc5 that was causing lots of WARNING spam.

 - fix a memory leak affecting backends in HVM guests.

 - fix PV domU hang with certain configurations.

* tag 'for-linus-4.2-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/xenbus: Don't leak memory when unmapping the ring on HVM backend
  Revert "xen/events/fifo: Handle linked events when closing a port"
  x86/xen: build "Xen PV" APIC driver for domU as well

8 years agoRevert x86 sigcontext cleanups
Linus Torvalds [Thu, 13 Aug 2015 15:25:20 +0000 (08:25 -0700)]
Revert x86 sigcontext cleanups

This reverts commits 9a036b93a344 ("x86/signal/64: Remove 'fs' and 'gs'
from sigcontext") and c6f2062935c8 ("x86/signal/64: Fix SS handling for
signals delivered to 64-bit programs").

They were cleanups, but they break dosemu by changing the signal return
behavior (and removing 'fs' and 'gs' from the sigcontext struct - while
not actually changing any behavior - causes build problems).

Reported-and-tested-by: Stas Sergeev <stsp@list.ru>
Acked-by: Andy Lutomirski <luto@amacapital.net>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Thu, 13 Aug 2015 17:46:39 +0000 (10:46 -0700)]
Merge git://git./linux/kernel/git/davem/net

Pull networking fixes from David Miller:

 1) Workaround hw bug when acquiring PCI bos ownership of iwlwifi
    devices, from Emmanuel Grumbach.

 2) Falling back to vmalloc in conntrack should not emit a warning, from
    Pablo Neira Ayuso.

 3) Fix NULL deref when rtlwifi driver is used as an AP, from Luis
    Felipe Dominguez Vega.

 4) Rocker doesn't free netdev on device removal, from Ido Schimmel.

 5) UDP multicast early sock demux has route handling races, from Eric
    Dumazet.

 6) Fix L4 checksum handling in openvswitch, from Glenn Griffin.

 7) Fix use-after-free in skb_set_peeked, from Herbert Xu.

 8) Don't advertize NETIF_F_FRAGLIST in virtio_net driver, this can lead
    to fraglists longer than the driver can support.  From Jason Wang.

 9) Fix mlx5 on non-4k-pagesize systems, from Carol L Soto.

10) Fix interrupt storm in bna driver, from Ivan Vecera.

11) Don't propagate -EBUSY from netlink_insert(), from Daniel Borkmann.

12) Fix inet request sock leak, from Eric Dumazet.

13) Fix TX interrupt masking and marking in TX descriptors of fs_enet
    driver, from LEROY Christophe.

14) Get rid of rule optimizer in gianfar driver, it's buggy and unlikely
    to get fixed any time soon.  From Jakub Kicinski

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (61 commits)
  cosa: missing error code on failure in probe()
  gianfar: remove faulty filer optimizer
  gianfar: correct list membership accounting
  gianfar: correct filer table writing
  bonding: Gratuitous ARP gets dropped when first slave added
  net: dsa: Do not override PHY interface if already configured
  net: fs_enet: mask interrupts for TX partial frames.
  net: fs_enet: explicitly remove I flag on TX partial frames
  inet: fix possible request socket leak
  inet: fix races with reqsk timers
  mkiss: Fix error handling in mkiss_open()
  bnx2x: Free NVRAM lock at end of each page
  bnx2x: Prevent null pointer dereference on SKB release
  cxgb4: missing curly braces in t4_setup_debugfs()
  net-timestamp: Update skb_complete_tx_timestamp comment
  ipv6: don't reject link-local nexthop on other interface
  netlink: make sure -EBUSY won't escape from netlink_insert
  bna: fix interrupts storm caused by erroneous packets
  net: mvpp2: replace TX coalescing interrupts with hrtimer
  net: mvpp2: enable proper per-CPU TX buffers unmapping
  ...

8 years agoMerge tag 'edac_fix_for_4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp
Linus Torvalds [Thu, 13 Aug 2015 17:22:11 +0000 (10:22 -0700)]
Merge tag 'edac_fix_for_4.2' of git://git./linux/kernel/git/bp/bp

Pull EDAC fix from Borislav Petkov:
 "A ppc4xx_edac fix for accessing ->csrows properly.  This driver was
  missed during the conversion a couple of years ago"

* tag 'edac_fix_for_4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
  EDAC, ppc4xx: Access mci->csrows array elements properly

8 years agoARM: dts: keystone: Fix the mdio bindings by moving it to soc specific file
Murali Karicheri [Mon, 10 Aug 2015 03:06:27 +0000 (20:06 -0700)]
ARM: dts: keystone: Fix the mdio bindings by moving it to soc specific file

Currently mdio bindings are defined in keystone.dtsi and this results
in incorrect unit address for the node on K2E and K2L SoCs. Fix this
by moving them to SoC specific DTS file.

Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org>
8 years agoARM: dts: keystone: fix the clock node for mdio
Murali Karicheri [Mon, 10 Aug 2015 03:06:27 +0000 (20:06 -0700)]
ARM: dts: keystone: fix the clock node for mdio

Currently the MDIO clock is pointing to clkpa instead of clkcpgmac.
MDIO is part of the ethss and the clock should be clkcpgmac.

Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org>
8 years agoMerge tag 'omap-for-v4.2/fixes-rc6' of git://git.kernel.org/pub/scm/linux/kernel...
Olof Johansson [Thu, 13 Aug 2015 12:27:12 +0000 (14:27 +0200)]
Merge tag 'omap-for-v4.2/fixes-rc6' of git://git./linux/kernel/git/tmlind/linux-omap into fixes

Fix a NULL pointer exception for omap GPMC bus code if probe fails.

* tag 'omap-for-v4.2/fixes-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
  memory: omap-gpmc: Don't try to save uninitialized GPMC context

Signed-off-by: Olof Johansson <olof@lixom.net>
8 years agoMerge tag 'imx-fixes-4.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo...
Olof Johansson [Thu, 13 Aug 2015 10:24:55 +0000 (12:24 +0200)]
Merge tag 'imx-fixes-4.2-3' of git://git./linux/kernel/git/shawnguo/linux into fixes

The i.MX fixes for 4.2, 3rd round:
 - Fix i.MX6 PCIe interrupt routing which gets missed from stacked IRQ
   domain conversion.  The PCIe wakeup support is currently broken
   because of this.

* tag 'imx-fixes-4.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
  ARM: imx6: correct i.MX6 PCIe interrupt routing

Signed-off-by: Olof Johansson <olof@lixom.net>
8 years agoMerge tag 'samsung-mach-fixes-4.2' of https://github.com/krzk/linux into fixes
Olof Johansson [Thu, 13 Aug 2015 10:18:45 +0000 (12:18 +0200)]
Merge tag 'samsung-mach-fixes-4.2' of https://github.com/krzk/linux into fixes

Two fixes for bugs in Exynos power domain error exit path:
1. kfree() of read-only memory (name of power domain returned
   by kstrdup_const()),
2. Doubled of_node_put() leading to invalid ref count for OF node.

* tag 'samsung-mach-fixes-4.2' of https://github.com/krzk/linux:
  ARM: EXYNOS: fix double of_node_put() on error path
  ARM: EXYNOS: Fix potentian kfree() of ro memory

Signed-off-by: Olof Johansson <olof@lixom.net>
8 years agoEDAC, ppc4xx: Access mci->csrows array elements properly
Michael Walle [Tue, 21 Jul 2015 09:00:53 +0000 (11:00 +0200)]
EDAC, ppc4xx: Access mci->csrows array elements properly

The commit

  de3910eb79ac ("edac: change the mem allocation scheme to
 make Documentation/kobject.txt happy")

changed the memory allocation for the csrows member. But ppc4xx_edac was
forgotten in the patch. Fix it.

Signed-off-by: Michael Walle <michael@walle.cc>
Cc: <stable@vger.kernel.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Link: http://lkml.kernel.org/r/1437469253-8611-1-git-send-email-michael@walle.cc
Signed-off-by: Borislav Petkov <bp@suse.de>
8 years agocosa: missing error code on failure in probe()
Dan Carpenter [Wed, 12 Aug 2015 21:08:01 +0000 (00:08 +0300)]
cosa: missing error code on failure in probe()

If register_hdlc_device() fails, the current code returns 0 but we
should return an error code instead.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoMerge branch 'gianfar-fixes'
David S. Miller [Wed, 12 Aug 2015 21:47:06 +0000 (14:47 -0700)]
Merge branch 'gianfar-fixes'

Jakub Kicinski says:

====================
gianfar: filer changes

respinning with examples as requested.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agogianfar: remove faulty filer optimizer
Jakub Kicinski [Wed, 12 Aug 2015 00:41:57 +0000 (02:41 +0200)]
gianfar: remove faulty filer optimizer

Current filer rule optimization is broken in several ways:
 (1) Can perform reads/writes beyond end of allocated tables.
     (gianfar_ethtool.c:1326).

(2) It breaks badly for rules with more than 2 specifiers
     (e.g. matching ip, port, tos).

Example:
# ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.1 dst-port 1 tos 1 action 1
Added rule with ID 254
# ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.2 dst-port 2 tos 2 action 9
Added rule with ID 253
# ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.3 dst-port 3 tos 3 action 17
Added rule with ID 252
# ./filer_decode /sys/kernel/debug/gfar1/filer_raw
00: MASK == 00000210 AND         Q:00           ctrl:00000080 prop:00000210
01: FPR  == 00000210 AND CLE     Q:00           ctrl:00000281 prop:00000210
02: MASK == ffffffff AND         Q:00           ctrl:00000080 prop:ffffffff
03: DPT  == 00000003 AND         Q:00           ctrl:0000008e prop:00000003
04: TOS  == 00000003 AND         Q:00           ctrl:0000008a prop:00000003
05: DIA  == 0a000003 AND         Q:11           ctrl:0000448c prop:0a000003
06: DPT  == 00000002 AND         Q:00           ctrl:0000008e prop:00000002
07: TOS  == 00000002 AND         Q:00           ctrl:0000008a prop:00000002
08: DIA  == 0a000002 AND         Q:09           ctrl:0000248c prop:0a000002
09: DIA  == 0a000001 AND         Q:00           ctrl:0000008c prop:0a000001
0a: DPT  == 00000001 AND         Q:00           ctrl:0000008e prop:00000001
0b: TOS  == 00000001     CLE     Q:01           ctrl:0000060a prop:00000001
ff: MASK >= 00000000             Q:00           ctrl:00000020 prop:00000000

(Entire cluster gets AND-ed together).

 (3) We observed that the masking rules it generates do not
     play well with clustering on P2020.  Only first rule
     of the cluster would ever fire.  Given that optimizer
     relies heavily on masking this is very hard to fix.

Example:
# ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.1 dst-port 1  action 1
Added rule with ID 254
# ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.2 dst-port 2  action 9
Added rule with ID 253
# ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.3 dst-port 3  action 17
Added rule with ID 252
# ./filer_decode /sys/kernel/debug/gfar1/filer_raw
00: MASK == 00000210 AND         Q:00           ctrl:00000080 prop:00000210
01: FPR  == 00000210 AND CLE     Q:00           ctrl:00000281 prop:00000210
02: MASK == ffffffff AND         Q:00           ctrl:00000080 prop:ffffffff
03: DPT  == 00000003 AND         Q:00           ctrl:0000008e prop:00000003
04: DIA  == 0a000003             Q:11           ctrl:0000440c prop:0a000003
05: DPT  == 00000002 AND         Q:00           ctrl:0000008e prop:00000002
06: DIA  == 0a000002             Q:09           ctrl:0000240c prop:0a000002
07: DIA  == 0a000001 AND         Q:00           ctrl:0000008c prop:0a000001
08: DPT  == 00000001     CLE     Q:01           ctrl:0000060e prop:00000001
ff: MASK >= 00000000             Q:00           ctrl:00000020 prop:00000000

Which looks correct according to the spec but only the first
(eth id 252)/last added rule for 10.0.0.3 will ever trigger.
As if filer did not treat the AND CLE as cluster start but
also kept AND-ing the rules.  We found no errata covering this.

The fact that nobody noticed (2) or (3) makes me think
that this feature is not very widely used and we should just
remove it.

Reported-by: Aleksander Dutkowski <adutkowski@gmail.com>
Signed-off-by: Jakub Kicinski <kubakici@wp.pl>
Acked-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agogianfar: correct list membership accounting
Jakub Kicinski [Wed, 12 Aug 2015 00:41:56 +0000 (02:41 +0200)]
gianfar: correct list membership accounting

At a cost of one line let's make sure .count is correct
when calling gfar_process_filer_changes().

Signed-off-by: Jakub Kicinski <kubakici@wp.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agogianfar: correct filer table writing
Jakub Kicinski [Wed, 12 Aug 2015 00:41:55 +0000 (02:41 +0200)]
gianfar: correct filer table writing

MAX_FILER_IDX is the last usable index.  Using less-than
will already guarantee that one entry for catch-all rule
will be left, no need to subtract 1 here.

Signed-off-by: Jakub Kicinski <kubakici@wp.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agobonding: Gratuitous ARP gets dropped when first slave added
Venkat Venkatsubra [Tue, 11 Aug 2015 14:57:23 +0000 (07:57 -0700)]
bonding: Gratuitous ARP gets dropped when first slave added

When the first slave is added (such as during bootup) the first
gratuitous ARP gets dropped. We don't see this drop during a failover.
The packet gets dropped in qdisc (noop_enqueue).

The fix is to delay the sending of gratuitous ARPs till the bond dev's
carrier is present.

It can also be worked around by setting num_grat_arp to more than 1.

Signed-off-by: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: dsa: Do not override PHY interface if already configured
Florian Fainelli [Sat, 8 Aug 2015 19:58:57 +0000 (12:58 -0700)]
net: dsa: Do not override PHY interface if already configured

In case we need to divert reads/writes using the slave MII bus, we may have
already fetched a valid PHY interface property from Device Tree, and that
mode is used by the PHY driver to make configuration decisions.

If we could not fetch the "phy-mode" property, we will assign p->phy_interface
to PHY_INTERFACE_MODE_NA, such that we can actually check for that condition as
to whether or not we should override the interface value.

Fixes: 19334920eaf7 ("net: dsa: Set valid phy interface type")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agosd: Fix maximum I/O size for BLOCK_PC requests
Martin K. Petersen [Tue, 23 Jun 2015 16:13:59 +0000 (12:13 -0400)]
sd: Fix maximum I/O size for BLOCK_PC requests

Commit bcdb247c6b6a ("sd: Limit transfer length") clamped the maximum
size of an I/O request to the MAXIMUM TRANSFER LENGTH field in the BLOCK
LIMITS VPD. This had the unfortunate effect of also limiting the maximum
size of non-filesystem requests sent to the device through sg/bsg.

Avoid using blk_queue_max_hw_sectors() and set the max_sectors queue
limit directly.

Also update the comment in blk_limits_max_hw_sectors() to clarify that
max_hw_sectors defines the limit for the I/O controller only.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reported-by: Brian King <brking@linux.vnet.ibm.com>
Tested-by: Brian King <brking@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org # 3.17+
Signed-off-by: James Bottomley <JBottomley@Odin.com>
8 years agoMerge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Linus Torvalds [Wed, 12 Aug 2015 18:25:01 +0000 (11:25 -0700)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux

Pull arm64 fix from Catalin Marinas:
 "Fix coarse clock monotonicity (VDSO timestamp off by one jiffy
  compared to the syscall one)"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: VDSO: fix coarse clock monotonicity regression

8 years agolibfc: Fix fc_fcp_cleanup_each_cmd()
Bart Van Assche [Fri, 5 Jun 2015 21:20:51 +0000 (14:20 -0700)]
libfc: Fix fc_fcp_cleanup_each_cmd()

Since fc_fcp_cleanup_cmd() can sleep this function must not
be called while holding a spinlock. This patch avoids that
fc_fcp_cleanup_each_cmd() triggers the following bug:

BUG: scheduling while atomic: sg_reset/1512/0x00000202
1 lock held by sg_reset/1512:
 #0:  (&(&fsp->scsi_pkt_lock)->rlock){+.-...}, at: [<ffffffffc0225cd5>] fc_fcp_cleanup_each_cmd.isra.21+0xa5/0x150 [libfc]
Preemption disabled at:[<ffffffffc0225cd5>] fc_fcp_cleanup_each_cmd.isra.21+0xa5/0x150 [libfc]
Call Trace:
 [<ffffffff816c612c>] dump_stack+0x4f/0x7b
 [<ffffffff810828bc>] __schedule_bug+0x6c/0xd0
 [<ffffffff816c87aa>] __schedule+0x71a/0xa10
 [<ffffffff816c8ad2>] schedule+0x32/0x80
 [<ffffffffc0217eac>] fc_seq_set_resp+0xac/0x100 [libfc]
 [<ffffffffc0218b11>] fc_exch_done+0x41/0x60 [libfc]
 [<ffffffffc0225cff>] fc_fcp_cleanup_each_cmd.isra.21+0xcf/0x150 [libfc]
 [<ffffffffc0225f43>] fc_eh_device_reset+0x1c3/0x270 [libfc]
 [<ffffffff814a2cc9>] scsi_try_bus_device_reset+0x29/0x60
 [<ffffffff814a3908>] scsi_ioctl_reset+0x258/0x2d0
 [<ffffffff814a2650>] scsi_ioctl+0x150/0x440
 [<ffffffff814b3a9d>] sd_ioctl+0xad/0x120
 [<ffffffff8132f266>] blkdev_ioctl+0x1b6/0x810
 [<ffffffff811da608>] block_ioctl+0x38/0x40
 [<ffffffff811b4e08>] do_vfs_ioctl+0x2f8/0x530
 [<ffffffff811b50c1>] SyS_ioctl+0x81/0xa0
 [<ffffffff816cf8b2>] system_call_fastpath+0x16/0x7a

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
8 years agolibfc: Fix fc_exch_recv_req() error path
Bart Van Assche [Fri, 5 Jun 2015 21:20:46 +0000 (14:20 -0700)]
libfc: Fix fc_exch_recv_req() error path

Due to patch "libfc: Do not invoke the response handler after
fc_exch_done()" (commit ID 7030fd62) the lport_recv() call
in fc_exch_recv_req() is passed a dangling pointer. Avoid this
by moving the fc_frame_free() call from fc_invoke_resp() to its
callers. This patch fixes the following crash:

general protection fault: 0000 [#3] PREEMPT SMP
RIP: fc_lport_recv_req+0x72/0x280 [libfc]
Call Trace:
 fc_exch_recv+0x642/0xde0 [libfc]
 fcoe_percpu_receive_thread+0x46a/0x5ed [fcoe]
 kthread+0x10a/0x120
 ret_from_fork+0x42/0x70

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
8 years agoMerge branch 'drm-fixes-4.2' of git://people.freedesktop.org/~agd5f/linux
Linus Torvalds [Wed, 12 Aug 2015 18:13:54 +0000 (11:13 -0700)]
Merge branch 'drm-fixes-4.2' of git://people.freedesktop.org/~agd5f/linux

Pull amd drm fixes from Alex Deucher:
 "Dave is on vacation at the moment, so please pull these radeon and
  amdgpu fixes directly.

  Just a few minor things for 4.2:

   - add a new radeon pci id
   - fix a power management regression in amdgpu
   - fix HEVC command buffer validation in amdgpu"

* 'drm-fixes-4.2' of git://people.freedesktop.org/~agd5f/linux:
  drm/radeon: add new OLAND pci id
  Revert "drm/amdgpu: Configure doorbell to maximum slots"
  drm/amdgpu: add context buffer size check for HEVC

8 years agolibiscsi: Fix host busy blocking during connection teardown
John Soni Jose [Wed, 24 Jun 2015 01:11:58 +0000 (06:41 +0530)]
libiscsi: Fix host busy blocking during connection teardown

In case of hw iscsi offload, an host can have N-number of active
connections. There can be IO's running on some connections which
make host->host_busy always TRUE. Now if logout from a connection
is tried then the code gets into an infinite loop as host->host_busy
is always TRUE.

 iscsi_conn_teardown(....)
 {
   .........
    /*
     * Block until all in-progress commands for this connection
     * time out or fail.
     */
     for (;;) {
      spin_lock_irqsave(session->host->host_lock, flags);
      if (!atomic_read(&session->host->host_busy)) { /* OK for ERL == 0 */
      spin_unlock_irqrestore(session->host->host_lock, flags);
              break;
      }
     spin_unlock_irqrestore(session->host->host_lock, flags);
     msleep_interruptible(500);
     iscsi_conn_printk(KERN_INFO, conn, "iscsi conn_destroy(): "
                 "host_busy %d host_failed %d\n",
          atomic_read(&session->host->host_busy),
          session->host->host_failed);

................
...............
     }
  }

This is not an issue with software-iscsi/iser as each cxn is a separate
host.

Fix:
Acquiring eh_mutex in iscsi_conn_teardown() before setting
session->state = ISCSI_STATE_TERMINATE.

Signed-off-by: John Soni Jose <sony.john@avagotech.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Chris Leech <cleech@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <JBottomley@Odin.com>
8 years agodrm/radeon: add new OLAND pci id
Alex Deucher [Mon, 10 Aug 2015 19:28:49 +0000 (15:28 -0400)]
drm/radeon: add new OLAND pci id

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
8 years agoRevert "drm/amdgpu: Configure doorbell to maximum slots"
Alex Deucher [Mon, 10 Aug 2015 15:08:31 +0000 (11:08 -0400)]
Revert "drm/amdgpu: Configure doorbell to maximum slots"

This reverts commit 78ad5cdd21f0d614983fc397338944e797ec70b9.
This commit breaks dpm and suspend/resume on CZ.

8 years agodrm/amdgpu: add context buffer size check for HEVC
Boyuan Zhang [Wed, 5 Aug 2015 18:03:48 +0000 (14:03 -0400)]
drm/amdgpu: add context buffer size check for HEVC

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
8 years agoMerge tag 'regmap-fix-v4.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Wed, 12 Aug 2015 16:06:39 +0000 (09:06 -0700)]
Merge tag 'regmap-fix-v4.2-rc6' of git://git./linux/kernel/git/broonie/regmap

Pull regmap fix from Mark Brown:
 "regmap: Fix handling of present bits on rbtree cache block resize

  When expanding a cache block we use krealloc() to resize the register
  present bitmap without initialising the newly allocated data (the
  original code was written for kzalloc()).  Add an appropraite memset()
  to fix that"

* tag 'regmap-fix-v4.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
  regmap: regcache-rbtree: Clean new present bits on present bitmap resize

8 years agodm cache policy smq: move 'dm-cache-default' module alias to SMQ
Yi Zhang [Wed, 12 Aug 2015 11:22:43 +0000 (19:22 +0800)]
dm cache policy smq: move 'dm-cache-default' module alias to SMQ

When creating dm-cache with the default policy, it will call
request_module("dm-cache-default") to register the default policy.
But the "dm-cache-default" alias was left referring to the MQ policy.
Fix this by moving the module alias to SMQ.

Fixes: bccab6a0 (dm cache: switch the "default" cache replacement policy from mq to smq)
Signed-off-by: Yi Zhang <yizhan@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
8 years agodm btree: add ref counting ops for the leaves of top level btrees
Joe Thornber [Wed, 12 Aug 2015 14:12:09 +0000 (15:12 +0100)]
dm btree: add ref counting ops for the leaves of top level btrees

When using nested btrees, the top leaves of the top levels contain
block addresses for the root of the next tree down.  If we shadow a
shared leaf node the leaf values (sub tree roots) should be incremented
accordingly.

This is only an issue if there is metadata sharing in the top levels.
Which only occurs if metadata snapshots are being used (as is possible
with dm-thinp).  And could result in a block from the thinp metadata
snap being reused early, thus corrupting the thinp metadata snap.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
8 years agodm thin metadata: delete btrees when releasing metadata snapshot
Joe Thornber [Wed, 12 Aug 2015 14:10:21 +0000 (15:10 +0100)]
dm thin metadata: delete btrees when releasing metadata snapshot

The device details and mapping trees were just being decremented
before.  Now btree_del() is called to do a deep delete.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
8 years agoperf/x86/intel/cqm: Do not access cpu_data() from CPU_UP_PREPARE handler
Matt Fleming [Thu, 6 Aug 2015 12:12:43 +0000 (13:12 +0100)]
perf/x86/intel/cqm: Do not access cpu_data() from CPU_UP_PREPARE handler

Tony reports that booting his 144-cpu machine with maxcpus=10 triggers
the following WARN_ON():

[   21.045727] WARNING: CPU: 8 PID: 647 at arch/x86/kernel/cpu/perf_event_intel_cqm.c:1267 intel_cqm_cpu_prepare+0x75/0x90()
[   21.045744] CPU: 8 PID: 647 Comm: systemd-udevd Not tainted 4.2.0-rc4 #1
[   21.045745] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRHSXSD1.86B.0066.R00.1506021730 06/02/2015
[   21.045747]  0000000000000000 0000000082771b09 ffff880856333ba8 ffffffff81669b67
[   21.045748]  0000000000000000 0000000000000000 ffff880856333be8 ffffffff8107b02a
[   21.045750]  ffff88085b789800 ffff88085f68a020 ffffffff819e2470 000000000000000a
[   21.045750] Call Trace:
[   21.045757]  [<ffffffff81669b67>] dump_stack+0x45/0x57
[   21.045759]  [<ffffffff8107b02a>] warn_slowpath_common+0x8a/0xc0
[   21.045761]  [<ffffffff8107b15a>] warn_slowpath_null+0x1a/0x20
[   21.045762]  [<ffffffff81036725>] intel_cqm_cpu_prepare+0x75/0x90
[   21.045764]  [<ffffffff81036872>] intel_cqm_cpu_notifier+0x42/0x160
[   21.045767]  [<ffffffff8109a33d>] notifier_call_chain+0x4d/0x80
[   21.045769]  [<ffffffff8109a44e>] __raw_notifier_call_chain+0xe/0x10
[   21.045770]  [<ffffffff8107b538>] _cpu_up+0xe8/0x190
[   21.045771]  [<ffffffff8107b65a>] cpu_up+0x7a/0xa0
[   21.045774]  [<ffffffff8165e920>] cpu_subsys_online+0x40/0x90
[   21.045777]  [<ffffffff81433b37>] device_online+0x67/0x90
[   21.045778]  [<ffffffff81433bea>] online_store+0x8a/0xa0
[   21.045782]  [<ffffffff81430e78>] dev_attr_store+0x18/0x30
[   21.045785]  [<ffffffff8126b6ba>] sysfs_kf_write+0x3a/0x50
[   21.045786]  [<ffffffff8126ad40>] kernfs_fop_write+0x120/0x170
[   21.045789]  [<ffffffff811f0b77>] __vfs_write+0x37/0x100
[   21.045791]  [<ffffffff811f38b8>] ? __sb_start_write+0x58/0x110
[   21.045795]  [<ffffffff81296d2d>] ? security_file_permission+0x3d/0xc0
[   21.045796]  [<ffffffff811f1279>] vfs_write+0xa9/0x190
[   21.045797]  [<ffffffff811f2075>] SyS_write+0x55/0xc0
[   21.045800]  [<ffffffff81067300>] ? do_page_fault+0x30/0x80
[   21.045804]  [<ffffffff816709ae>] entry_SYSCALL_64_fastpath+0x12/0x71
[   21.045805] ---[ end trace fe228b836d8af405 ]---

The root cause is that CPU_UP_PREPARE is completely the wrong notifier
action from which to access cpu_data(), because smp_store_cpu_info()
won't have been executed by the target CPU at that point, which in turn
means that ->x86_cache_max_rmid and ->x86_cache_occ_scale haven't been
filled out.

Instead let's invoke our handler from CPU_STARTING and rename it
appropriately.

Reported-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Kanaka Juvva <kanaka.d.juvva@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vikas Shivappa <vikas.shivappa@intel.com>
Link: http://lkml.kernel.org/r/1438863163-14083-1-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoperf/x86/intel: Fix memory leak on hot-plug allocation fail
Peter Zijlstra [Mon, 10 Aug 2015 12:17:34 +0000 (14:17 +0200)]
perf/x86/intel: Fix memory leak on hot-plug allocation fail

We fail to free the shared_regs allocation if the constraint_list
allocation fails.

Cure this and be more consistent in NULL-ing the pointers after free.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoperf: Fix PERF_EVENT_IOC_PERIOD migration race
Peter Zijlstra [Tue, 4 Aug 2015 17:22:49 +0000 (19:22 +0200)]
perf: Fix PERF_EVENT_IOC_PERIOD migration race

I ran the perf fuzzer, which triggered some WARN()s which are due to
trying to stop/restart an event on the wrong CPU.

Use the normal IPI pattern to ensure we run the code on the correct CPU.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: bad7192b842c ("perf: Fix PERF_EVENT_IOC_PERIOD to force-reset the period")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoperf: Fix double-free of the AUX buffer
Ben Hutchings [Sun, 26 Jul 2015 23:31:08 +0000 (00:31 +0100)]
perf: Fix double-free of the AUX buffer

If rb->aux_refcount is decremented to zero before rb->refcount,
__rb_free_aux() may be called twice resulting in a double free of
rb->aux_pages.  Fix this by adding a check to __rb_free_aux().

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Fixes: 57ffc5ca679f ("perf: Fix AUX buffer refcounting")
Link: http://lkml.kernel.org/r/1437953468.12842.17.camel@decadent.org.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agomemory: omap-gpmc: Don't try to save uninitialized GPMC context omap-for-v4.2/fixes-rc6
Tomeu Vizoso [Wed, 5 Aug 2015 12:24:15 +0000 (14:24 +0200)]
memory: omap-gpmc: Don't try to save uninitialized GPMC context

If for some reason the GPMC device hasn't been probed yet, gpmc_base is
going to be NULL. Because there's no context yet to be saved, just turn
these functions into no-ops until that device gets probed.

Unable to handle kernel NULL pointer dereference at virtual address 00000010
pgd = c0204000
[00000010] *pgd=00000000
Internal error: Oops: 5 [#1] SMP ARM
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.2.0-rc5-next-20150804-05947-g23f38fe8eda9 #1
Hardware name: Generic OMAP3-GP (Flattened Device Tree)
task: c0e623e8 ti: c0e5c000 task.ti: c0e5c000
PC is at omap3_gpmc_save_context+0x8/0xc4
LR is at omap_sram_idle+0x154/0x23c
pc : [<c087c7ac>]    lr : [<c023262c>]    psr: 60000193
sp : c0e5df40  ip : c0f92a80  fp : c0999eb0
r10: c0e57364  r9 : c0e66f14  r8 : 00000003
r7 : 00000000  r6 : 00000003  r5 : 00000000  r4 : c0f5f174
r3 : c0fa4fe8  r2 : 00000000  r1 : 00000000  r0 : fa200280
Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
Control: 10c5387d  Table: 80204019  DAC: 00000015
Process swapper/0 (pid: 0, stack limit = 0xc0e5c220)
Stack: (0xc0e5df40 to 0xc0e5e000)
df40: 00000000 c0e66ef8 c0f5f1a4 00000000 00000003 c02333a4 c3813822 00000000
df60: 00000000 c0e5a5c8 cfb8a5d0 c07f0c44 0e4f1d7e 00000000 00000000 00000000
df80: c3813822 00000000 cfb8a5d0 c0e5e4e4 cfb8a5d0 c0e66f14 c0e5a5c8 c0e5e54c
dfa0: c0e5e544 c0e57364 c0999eb0 c0277758 000000fa c0f5d000 00000000 c0d61c18
dfc0: ffffffff ffffffff 00000000 c0d61674 00000000 c0df7a48 00000000 c0f5d5d4
dfe0: c0e5e4c0 c0df7a44 c0e634f8 80204059 00000000 8020807c 00000000 00000000
[<c087c7ac>] (omap3_gpmc_save_context) from [<c023262c>] (omap_sram_idle+0x154/0x23c)
[<c023262c>] (omap_sram_idle) from [<c02333a4>] (omap3_enter_idle_bm+0xec/0x1a8)
[<c02333a4>] (omap3_enter_idle_bm) from [<c07f0c44>] (cpuidle_enter_state+0xbc/0x284)
[<c07f0c44>] (cpuidle_enter_state) from [<c0277758>] (cpu_startup_entry+0x174/0x24c)
[<c0277758>] (cpu_startup_entry) from [<c0d61c18>] (start_kernel+0x358/0x3c0)
[<c0d61c18>] (start_kernel) from [<8020807c>] (0x8020807c)
Code: c0ccace8 c0ccacc0 e59f30b4 e5932000 (e5921010)

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Suggested-by: Javier Martinez Canillas <javier@dowhile0.org>
Reviewed-by: Javier Martinez Canillas <javier@osg.samsung.com>
Acked-by: Roger Quadros <rogerq@ti.com>
[tony@atomide.com: updated description as suggested by Javier]
Signed-off-by: Tony Lindgren <tony@atomide.com>
8 years agoMerge tag 'localmodconfig-v4.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Tue, 11 Aug 2015 22:13:41 +0000 (15:13 -0700)]
Merge tag 'localmodconfig-v4.2-rc6' of git://git./linux/kernel/git/rostedt/linux-kconfig

Pull localmodconfig fix from Steven Rostedt:
 "Leonidas Spyropoulos found that modules like nouveau were being
  unselected by make localmodconfig even though their configs were set
  and the module was loaded and visible by lsmod.

  The reason for this was because streamline-config.pl only looks at
  Makefiles, and not Kbuild files.  As these modules use Kbuild for
  their names, they too need to be checked by localmodconfig.  This was
  fixed by Richard Weinberger"

* tag 'localmodconfig-v4.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-kconfig:
  localmodconfig: Use Kbuild files too

8 years agolocalmodconfig: Use Kbuild files too
Richard Weinberger [Sun, 26 Jul 2015 22:06:55 +0000 (00:06 +0200)]
localmodconfig: Use Kbuild files too

In kbuild it is allowed to define objects in files named "Makefile"
and "Kbuild".
Currently localmodconfig reads objects only from "Makefile"s and misses
modules like nouveau.

Link: http://lkml.kernel.org/r/1437948415-16290-1-git-send-email-richard@nod.at
Cc: stable@vger.kernel.org
Reported-and-tested-by: Leonidas Spyropoulos <artafinde@gmail.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
8 years agoMerge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetoot...
David S. Miller [Tue, 11 Aug 2015 21:16:07 +0000 (14:16 -0700)]
Merge branch 'for-upstream' of git://git./linux/kernel/git/bluetooth/bluetooth

Johan Hedberg says:

====================
pull request: bluetooth 2015-08-11

Here's an important regression fix for the 4.2-rc series that ensures
user space isn't given invalid LTK values. The bug essentially prevents
the encryption of subsequent LE connections, i.e. makes it impossible to
pair devices over LE.

Let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: fs_enet: mask interrupts for TX partial frames.
LEROY Christophe [Tue, 11 Aug 2015 10:11:03 +0000 (12:11 +0200)]
net: fs_enet: mask interrupts for TX partial frames.

We are not interested in interrupts for partially transmitted frames.
Unlike SCC and FCC, the FEC doesn't handle the I bit in buffer
descriptors, instead it defines two interrupt bits, TXB and TXF.

We have to mask TXB in order to only get interrupts once the
frame is fully transmitted.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: fs_enet: explicitly remove I flag on TX partial frames
LEROY Christophe [Tue, 11 Aug 2015 10:11:00 +0000 (12:11 +0200)]
net: fs_enet: explicitly remove I flag on TX partial frames

We are not interested in interrupts for partially transmitted frames,
we have to clear BD_ENET_TX_INTR explicitly otherwise it may remain
from a previously used descriptor.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoMerge tag 'fbdev-fixes-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba...
Linus Torvalds [Tue, 11 Aug 2015 17:23:59 +0000 (10:23 -0700)]
Merge tag 'fbdev-fixes-4.2' of git://git./linux/kernel/git/tomba/linux

Pull fbdev fixes from Tomi Valkeinen:
 - fix display regression on Versatile boards
 - fix OF node refcount bugs on omapdss
 - fix WARN about clock prepare on pxa3xx_gcu
 - fix mem leak in videomode helpers
 - fix fbconsole related boot problem on sun7i-a20-olinuxino-micro

* tag 'fbdev-fixes-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux:
  fbcon: unconditionally initialize cursor blink interval
  video: Fix possible leak in of_get_videomode()
  video: fbdev: pxa3xx_gcu: prepare the clocks
  OMAPDSS: Fix omap_dss_find_output_by_port_node() port refcount decrement
  OMAPDSS: Fix node refcount leak in omapdss_of_get_next_port()
  fbdev: select versatile helpers for the integrator

8 years agoMerge tag 'omap-for-v4.2/fixes-rc5' of git://git.kernel.org/pub/scm/linux/kernel...
Olof Johansson [Tue, 11 Aug 2015 13:22:10 +0000 (15:22 +0200)]
Merge tag 'omap-for-v4.2/fixes-rc5' of git://git./linux/kernel/git/tmlind/linux-omap into fixes

Few trivial omap MMC regression fixes for card voltages where
the syscon areas for PBIAS regulator were missing "simple-bus"
that prevents probing of the children in the mapped region.

This probably was not noticed earlier as the bootloader has
already configured the regulator for the card in the slot.

* tag 'omap-for-v4.2/fixes-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
  ARM: dts: dra7: Fix broken pbias device creation
  ARM: dts: OMAP5: Fix broken pbias device creation
  ARM: dts: OMAP4: Fix broken pbias device creation
  ARM: dts: omap243x: Fix broken pbias device creation

Signed-off-by: Olof Johansson <olof@lixom.net>
8 years agoARM: 8410/1: VDSO: fix coarse clock monotonicity regression
Nathan Lynch [Mon, 10 Aug 2015 16:36:06 +0000 (17:36 +0100)]
ARM: 8410/1: VDSO: fix coarse clock monotonicity regression

Since 906c55579a63 ("timekeeping: Copy the shadow-timekeeper over the
real timekeeper last") it has become possible on ARM to:

- Obtain a CLOCK_MONOTONIC_COARSE or CLOCK_REALTIME_COARSE timestamp
  via syscall.
- Subsequently obtain a timestamp for the same clock ID via VDSO which
  predates the first timestamp (by one jiffy).

This is because ARM's update_vsyscall is deriving the coarse time
using the __current_kernel_time interface, when it should really be
using the timekeeper object provided to it by the timekeeping core.
It happened to work before only because __current_kernel_time would
access the same timekeeper object which had been passed to
update_vsyscall.  This is no longer the case.

Cc: stable@vger.kernel.org
Fixes: 906c55579a63 ("timekeeping: Copy the shadow-timekeeper over the real timekeeper last")
Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
8 years agoxen/xenbus: Don't leak memory when unmapping the ring on HVM backend
Julien Grall [Mon, 10 Aug 2015 18:10:38 +0000 (19:10 +0100)]
xen/xenbus: Don't leak memory when unmapping the ring on HVM backend

The commit ccc9d90a9a8b5c4ad7e9708ec41f75ff9e98d61d "xenbus_client:
Extend interface to support multi-page ring" removes the call to
free_xenballooned_pages() in xenbus_unmap_ring_vfree_hvm(), leaking a
page for every shared ring.

Only with backends running in HVM domains were affected.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Cc: <stable@vger.kernel.org>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
8 years agoRevert "xen/events/fifo: Handle linked events when closing a port"
David Vrabel [Mon, 10 Aug 2015 17:11:06 +0000 (18:11 +0100)]
Revert "xen/events/fifo: Handle linked events when closing a port"

This reverts commit fcdf31a7c162de0c93a2bee51df4688ab0a348f8.

This was causing a WARNING whenever a PIRQ was closed since
shutdown_pirq() is called with irqs disabled.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Cc: <stable@vger.kernel.org>