pandora-kernel.git
12 years agoext4: replace cut'n'pasted llseek code with generic_file_llseek_size
Andi Kleen [Thu, 15 Sep 2011 23:06:51 +0000 (16:06 -0700)]
ext4: replace cut'n'pasted llseek code with generic_file_llseek_size

This gives ext4 the benefits of unlocked llseek.

Cc: tytso@mit.edu
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
12 years agovfs: add generic_file_llseek_size
Andi Kleen [Thu, 15 Sep 2011 23:06:50 +0000 (16:06 -0700)]
vfs: add generic_file_llseek_size

Add a generic_file_llseek variant to the VFS that allows passing in
the maximum file size of the file system, instead of always
using maxbytes from the superblock.

This can be used to eliminate some cut'n'paste seek code in ext4.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
12 years agovfs: do (nearly) lockless generic_file_llseek
Andi Kleen [Thu, 15 Sep 2011 23:06:48 +0000 (16:06 -0700)]
vfs: do (nearly) lockless generic_file_llseek

The i_mutex lock use of generic _file_llseek hurts.  Independent processes
accessing the same file synchronize over a single lock, even though
they have no need for synchronization at all.

Under high utilization this can cause llseek to scale very poorly on larger
systems.

This patch does some rethinking of the llseek locking model:

First the 64bit f_pos is not necessarily atomic without locks
on 32bit systems. This can already cause races with read() today.
This was discussed on linux-kernel in the past and deemed acceptable.
The patch does not change that.

Let's look at the different seek variants:

SEEK_SET: Doesn't really need any locking.
If there's a race one writer wins, the other loses.

For 32bit the non atomic update races against read()
stay the same. Without a lock they can also happen
against write() now.  The read() race was deemed
acceptable in past discussions, and I think if it's
ok for read it's ok for write too.

=> Don't need a lock.

SEEK_END: This behaves like SEEK_SET plus it reads
the maximum size too. Reading the maximum size would have the
32bit atomic problem. But luckily we already have a way to read
the maximum size without locking (i_size_read), so we
can just use that instead.

Without i_mutex there is no synchronization with write() anymore,
however since the write() update is atomic on 64bit it just behaves
like another racy SEEK_SET.  On non atomic 32bit it's the same
as SEEK_SET.

=> Don't need a lock, but need to use i_size_read()

SEEK_CUR: This has a read-modify-write race window
on the same file. One could argue that any application
doing unsynchronized seeks on the same file is already broken.
But for the sake of not adding a regression here I'm
using the file->f_lock to synchronize this. Using this
lock is much better than the inode mutex because it doesn't
synchronize between processes.

=> So still need a lock, but can use a f_lock.

This patch implements this new scheme in generic_file_llseek.
I dropped generic_file_llseek_unlocked and changed all callers.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
12 years agodirect-io: merge direct_io_walker into __blockdev_direct_IO
Andi Kleen [Tue, 2 Aug 2011 04:38:09 +0000 (21:38 -0700)]
direct-io: merge direct_io_walker into __blockdev_direct_IO

This doesn't change anything for the compiler, but hch thought it would
make the code clearer.

I moved the reference counting into its own little inline.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
12 years agodirect-io: inline the complete submission path
Andi Kleen [Tue, 2 Aug 2011 04:38:08 +0000 (21:38 -0700)]
direct-io: inline the complete submission path

Add inlines to all the submission path functions. While this increases
code size it also gives gcc a lot of optimization opportunities
in this critical hotpath.

In particular -- together with some other changes -- this
allows gcc to get rid of the unnecessary clearing of
sdio at the beginning and optimize the messy parameter passing.
Any non inlining of a function which takes a sdio parameter
would break this optimization because they cannot be done if the
address of a structure is taken.

Note that benefits are only seen with CONFIG_OPTIMIZE_INLINING
and CONFIG_CC_OPTIMIZE_FOR_SIZE both set to off.

This gives about 2.2% improvement on a large database benchmark
with a high IOPS rate.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
12 years agodirect-io: separate map_bh from dio
Andi Kleen [Tue, 2 Aug 2011 04:38:07 +0000 (21:38 -0700)]
direct-io: separate map_bh from dio

Only a single b_private field in the map_bh buffer head is needed after
the submission path. Move map_bh separately to avoid storing
this information in the long term slab.

This avoids the weird 104 byte hole in struct dio_submit which also needed
to be memseted early.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
12 years agodirect-io: use a slab cache for struct dio
Andi Kleen [Tue, 2 Aug 2011 04:38:06 +0000 (21:38 -0700)]
direct-io: use a slab cache for struct dio

A direct slab call is slightly faster than kmalloc and can be better cached
per CPU. It also avoids rounding to the next kmalloc slab.

In addition this enforces cache line alignment for struct dio to avoid
any false sharing.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
12 years agodirect-io: rearrange fields in dio/dio_submit to avoid holes
Andi Kleen [Tue, 2 Aug 2011 04:38:05 +0000 (21:38 -0700)]
direct-io: rearrange fields in dio/dio_submit to avoid holes

Fix most problems reported by pahole.

There is still a weird 104 byte hole after map_bh. I'm not sure what
causes this.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
12 years agodirect-io: fix a wrong comment
Andi Kleen [Tue, 2 Aug 2011 04:38:04 +0000 (21:38 -0700)]
direct-io: fix a wrong comment

There's nothing on the stack, even before my changes.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
12 years agodirect-io: separate fields only used in the submission path from struct dio
Andi Kleen [Tue, 2 Aug 2011 04:38:03 +0000 (21:38 -0700)]
direct-io: separate fields only used in the submission path from struct dio

This large, but largely mechanic, patch moves all fields in struct dio
that are only used in the submission path into a separate on stack
data structure. This has the advantage that the memory is very likely
cache hot, which is not guaranteed for memory fresh out of kmalloc.

This also gives gcc more optimization potential because it can easier
determine that there are no external aliases for these variables.

The sdio initialization is a initialization now instead of memset.
This allows gcc to break sdio into individual fields and optimize
away unnecessary zeroing (after all the functions are inlined)

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
12 years agovfs: fix spinning prevention in prune_icache_sb
Christoph Hellwig [Fri, 28 Oct 2011 08:03:41 +0000 (10:03 +0200)]
vfs: fix spinning prevention in prune_icache_sb

We need to move the inode to the end of the list to actually make the
spinning prevention explained in the comment above it work.  With a
plain list_move it will simply stay in place as we're always reclaiming
from the head of the list.

Signed-off-by: Christoph Hellwig <hch@lst.de>
12 years agovfs: add a comment to inode_permission()
Andreas Gruenbacher [Sun, 23 Oct 2011 17:43:33 +0000 (23:13 +0530)]
vfs: add a comment to inode_permission()

Acked-by: J. Bruce Fields <bfields@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruen@kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
12 years agovfs: pass all mask flags check_acl and posix_acl_permission
Andreas Gruenbacher [Sun, 23 Oct 2011 17:43:32 +0000 (23:13 +0530)]
vfs: pass all mask flags check_acl and posix_acl_permission

Acked-by: J. Bruce Fields <bfields@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruen@kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
12 years agovfs: add hex format for MAY_* flag values
Aneesh Kumar K.V [Sun, 23 Oct 2011 17:43:31 +0000 (23:13 +0530)]
vfs: add hex format for MAY_* flag values

We are going to add more flags and having them in hex format
make it simpler

Acked-by: J. Bruce Fields <bfields@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
12 years agovfs: indicate that the permission functions take all the MAY_* flags
Andreas Gruenbacher [Sun, 23 Oct 2011 17:43:30 +0000 (23:13 +0530)]
vfs: indicate that the permission functions take all the MAY_* flags

Acked-by: J. Bruce Fields <bfields@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruen@kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
12 years agocompat: sync compat_stats with statfs.
Eric W. Biederman [Mon, 17 Oct 2011 20:40:02 +0000 (13:40 -0700)]
compat: sync compat_stats with statfs.

This was found by inspection while tracking a similar
bug in compat_statfs64, that has been fixed in mainline
since decemeber.

- This fixes a bug where not all of the f_spare fields
  were cleared on mips and s390.
- Add the f_flags field to struct compat_statfs
- Copy f_flags to userspace in case someone cares.
- Use __clear_user to copy the f_spare field to userspace
  to ensure that all of the elements of f_spare are cleared.
  On some architectures f_spare is has 5 ints and on some
  architectures f_spare only has 4 ints.  Which makes
  the previous technique of clearing each int individually
  broken.

I don't expect anyone actually uses the old statfs system
call anymore but if they do let them benefit from having
the compat and the native version working the same.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
12 years agovfs: add "device" tag to /proc/self/mountstats
Bryan Schumaker [Fri, 7 Oct 2011 17:41:15 +0000 (13:41 -0400)]
vfs: add "device" tag to /proc/self/mountstats

nfsiostat was failing to find mounted filesystems on kernels after
2.6.38 because of changes to show_vfsstat() by commit
c7f404b40a3665d9f4e9a927cc5c1ee0479ed8f9.  This patch adds back the
"device" tag before the nfs server entry so scripts can parse the
mountstats file correctly.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
CC: stable@kernel.org [>=2.6.39]
Signed-off-by: Christoph Hellwig <hch@lst.de>
12 years agocleanup: vfs: small comment fix for block_invalidatepage
Wang Sheng-Hui [Thu, 1 Sep 2011 00:22:57 +0000 (08:22 +0800)]
cleanup: vfs: small comment fix for block_invalidatepage

The patch is aganist 3.1-rc3.

Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
12 years agovfs: iov_iter: have iov_iter_advance decrement nr_segs appropriately
Jeff Layton [Thu, 27 Oct 2011 21:53:08 +0000 (23:53 +0200)]
vfs: iov_iter: have iov_iter_advance decrement nr_segs appropriately

Currently, when you call iov_iter_advance, then the pointer to the iovec
array can be incremented, but it does not decrement the nr_segs value in
the iov_iter struct. The result is a iov_iter struct with a nr_segs
value that goes beyond the end of the array.

While I'm not aware of anything that's specifically broken by this, it
seems odd and a bit dangerous not to decrement that value. If someone
were to trust the nr_segs value to be correct, then they could end up
walking off the end of the array.

Changing this might also provide some micro-optimization when dealing
with the last iovec in an array. Many of the other routines that deal
with iov_iter have optimized codepaths when nr_segs == 1.

Cc: Nick Piggin <npiggin@suse.de>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
12 years agoLinux 3.1 v3.1
Linus Torvalds [Mon, 24 Oct 2011 07:10:05 +0000 (09:10 +0200)]
Linux 3.1

12 years agoMerge git://git.infradead.org/iommu-2.6
Linus Torvalds [Mon, 24 Oct 2011 05:08:24 +0000 (07:08 +0200)]
Merge git://git.infradead.org/iommu-2.6

* git://git.infradead.org/iommu-2.6:
  intel-iommu: fix superpage support in pfn_to_dma_pte()
  intel-iommu: set iommu_superpage on VM domains to lowest common denominator
  intel-iommu: fix return value of iommu_unmap() API
  MAINTAINERS: Update VT-d entry for drivers/pci -> drivers/iommu move
  intel-iommu: Export a flag indicating that the IOMMU is used for iGFX.
  intel-iommu: Workaround IOTLB hang on Ironlake GPU
  intel-iommu: Fix AB-BA lockdep report

12 years agoMerge branch 'for-linus' of http://people.redhat.com/agk/git/linux-dm
Linus Torvalds [Mon, 24 Oct 2011 05:05:38 +0000 (07:05 +0200)]
Merge branch 'for-linus' of people.redhat.com/agk/git/linux-dm

* 'for-linus' of http://people.redhat.com/agk/git/linux-dm:
  dm kcopyd: fix job_pool leak

12 years agox86: Fix S4 regression
Takashi Iwai [Sun, 23 Oct 2011 21:19:12 +0000 (23:19 +0200)]
x86: Fix S4 regression

Commit 4b239f458 ("x86-64, mm: Put early page table high") causes a S4
regression since 2.6.39, namely the machine reboots occasionally at S4
resume.  It doesn't happen always, overall rate is about 1/20.  But,
like other bugs, once when this happens, it continues to happen.

This patch fixes the problem by essentially reverting the memory
assignment in the older way.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Cc: <stable@kernel.org>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Yinghai Lu <yinghai.lu@oracle.com>
[ We'll hopefully find the real fix, but that's too late for 3.1 now ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agodm kcopyd: fix job_pool leak
Alasdair G Kergon [Sun, 23 Oct 2011 19:55:17 +0000 (20:55 +0100)]
dm kcopyd: fix job_pool leak

Fix memory leak introduced by commit a6e50b409d3f9e0833e69c3c9cca822e8fa4adbb
(dm snapshot: skip reading origin when overwriting complete chunk).

When allocating a set of jobs from kc->job_pool, job->master_job must be
set (to point to itself) so that the mempool item gets freed when the
master_job completes.

master_job was introduced by commit c6ea41fbbe08f270a8edef99dc369faf809d1bd6
(dm kcopyd: preallocate sub jobs to avoid deadlock)

Reported-by: Michael Leun <ml@newton.leun.net>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
12 years agoMerge branch 'samsung-fixes-4' of git://github.com/kgene/linux-samsung
Linus Torvalds [Sun, 23 Oct 2011 07:44:40 +0000 (10:44 +0300)]
Merge branch 'samsung-fixes-4' of git://github.com/kgene/linux-samsung

* 'samsung-fixes-4' of git://github.com/kgene/linux-samsung:
  ARM: S3C24XX: Fix s3c24xx build errors if !CONFIG_PM
  ARM: S5P: fix offset calculation on gpio-interrupt

12 years agoMerge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groec...
Linus Torvalds [Sun, 23 Oct 2011 07:43:31 +0000 (10:43 +0300)]
Merge branch 'hwmon-for-linus' of git://git./linux/kernel/git/groeck/linux-staging

* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (w83627ehf) Fix negative 8-bit temperature values

12 years agoARM: S3C24XX: Fix s3c24xx build errors if !CONFIG_PM
Domenico Andreoli [Fri, 21 Oct 2011 19:00:53 +0000 (04:00 +0900)]
ARM: S3C24XX: Fix s3c24xx build errors if !CONFIG_PM

v2:
- register_syscore_ops(&s3c24xx_irq_syscore_ops) does not need to be
  conditionally compiled out, it is already optimized out on !CONFIG_PM
- fix also s3c2412 and s3c2416 affected by the same build issue

v1:
s3c2440.c fails to build if !CONFIG_PM because in such case
s3c2410_pm_syscore_ops is not defined. Same error should happen also
in s3c2410.c and s3c2442.c

Signed-off-by: Domenico Andreoli <cavokz@gmail.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
12 years agoMerge git://github.com/herbertx/crypto
Linus Torvalds [Fri, 21 Oct 2011 14:02:18 +0000 (17:02 +0300)]
Merge git://github.com/herbertx/crypto

* git://github.com/herbertx/crypto:
  crypto: ghash - Avoid null pointer dereference if no key is set

12 years agoMerge branch 'fix/hda' of git://github.com/tiwai/sound
Linus Torvalds [Fri, 21 Oct 2011 14:01:21 +0000 (17:01 +0300)]
Merge branch 'fix/hda' of git://github.com/tiwai/sound

* 'fix/hda' of git://github.com/tiwai/sound:
  ALSA: HDA: conexant support for Lenovo T520/W520
  ALSA: hda - Add position_fix quirk for Dell Inspiron 1010

12 years agocrypto: ghash - Avoid null pointer dereference if no key is set
Nick Bowler [Thu, 20 Oct 2011 12:16:55 +0000 (14:16 +0200)]
crypto: ghash - Avoid null pointer dereference if no key is set

The ghash_update function passes a pointer to gf128mul_4k_lle which will
be NULL if ghash_setkey is not called or if the most recent call to
ghash_setkey failed to allocate memory.  This causes an oops.  Fix this
up by returning an error code in the null case.

This is trivially triggered from unprivileged userspace through the
AF_ALG interface by simply writing to the socket without setting a key.

The ghash_final function has a similar issue, but triggering it requires
a memory allocation failure in ghash_setkey _after_ at least one
successful call to ghash_update.

  BUG: unable to handle kernel NULL pointer dereference at 00000670
  IP: [<d88c92d4>] gf128mul_4k_lle+0x23/0x60 [gf128mul]
  *pde = 00000000
  Oops: 0000 [#1] PREEMPT SMP
  Modules linked in: ghash_generic gf128mul algif_hash af_alg nfs lockd nfs_acl sunrpc bridge ipv6 stp llc

  Pid: 1502, comm: hashatron Tainted: G        W   3.1.0-rc9-00085-ge9308cf #32 Bochs Bochs
  EIP: 0060:[<d88c92d4>] EFLAGS: 00000202 CPU: 0
  EIP is at gf128mul_4k_lle+0x23/0x60 [gf128mul]
  EAX: d69db1f0 EBX: d6b8ddac ECX: 00000004 EDX: 00000000
  ESI: 00000670 EDI: d6b8ddac EBP: d6b8ddc8 ESP: d6b8dda4
   DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
  Process hashatron (pid: 1502, ti=d6b8c000 task=d6810000 task.ti=d6b8c000)
  Stack:
   00000000 d69db1f0 00000163 00000000 d6b8ddc8 c101a520 d69db1f0 d52aa000
   00000ff0 d6b8dde8 d88d310f d6b8a3f8 d52aa000 00001000 d88d502c d6b8ddfc
   00001000 d6b8ddf4 c11676ed d69db1e8 d6b8de24 c11679ad d52aa000 00000000
  Call Trace:
   [<c101a520>] ? kmap_atomic_prot+0x37/0xa6
   [<d88d310f>] ghash_update+0x85/0xbe [ghash_generic]
   [<c11676ed>] crypto_shash_update+0x18/0x1b
   [<c11679ad>] shash_ahash_update+0x22/0x36
   [<c11679cc>] shash_async_update+0xb/0xd
   [<d88ce0ba>] hash_sendpage+0xba/0xf2 [algif_hash]
   [<c121b24c>] kernel_sendpage+0x39/0x4e
   [<d88ce000>] ? 0xd88cdfff
   [<c121b298>] sock_sendpage+0x37/0x3e
   [<c121b261>] ? kernel_sendpage+0x4e/0x4e
   [<c10b4dbc>] pipe_to_sendpage+0x56/0x61
   [<c10b4e1f>] splice_from_pipe_feed+0x58/0xcd
   [<c10b4d66>] ? splice_from_pipe_begin+0x10/0x10
   [<c10b51f5>] __splice_from_pipe+0x36/0x55
   [<c10b4d66>] ? splice_from_pipe_begin+0x10/0x10
   [<c10b6383>] splice_from_pipe+0x51/0x64
   [<c10b63c2>] ? default_file_splice_write+0x2c/0x2c
   [<c10b63d5>] generic_splice_sendpage+0x13/0x15
   [<c10b4d66>] ? splice_from_pipe_begin+0x10/0x10
   [<c10b527f>] do_splice_from+0x5d/0x67
   [<c10b6865>] sys_splice+0x2bf/0x363
   [<c129373b>] ? sysenter_exit+0xf/0x16
   [<c104dc1e>] ? trace_hardirqs_on_caller+0x10e/0x13f
   [<c129370c>] sysenter_do_call+0x12/0x32
  Code: 83 c4 0c 5b 5e 5f c9 c3 55 b9 04 00 00 00 89 e5 57 8d 7d e4 56 53 8d 5d e4 83 ec 18 89 45 e0 89 55 dc 0f b6 70 0f c1 e6 04 01 d6 <f3> a5 be 0f 00 00 00 4e 89 d8 e8 48 ff ff ff 8b 45 e0 89 da 0f
  EIP: [<d88c92d4>] gf128mul_4k_lle+0x23/0x60 [gf128mul] SS:ESP 0068:d6b8dda4
  CR2: 0000000000000670
  ---[ end trace 4eaa2a86a8e2da24 ]---
  note: hashatron[1502] exited with preempt_count 1
  BUG: scheduling while atomic: hashatron/1502/0x10000002
  INFO: lockdep is turned off.
  [...]

Signed-off-by: Nick Bowler <nbowler@elliptictech.com>
Cc: stable@kernel.org [2.6.37+]
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
12 years agoARM: S5P: fix offset calculation on gpio-interrupt
Marek Szyprowski [Fri, 21 Oct 2011 09:04:54 +0000 (18:04 +0900)]
ARM: S5P: fix offset calculation on gpio-interrupt

Offsets of the irq controller registers were calculated
correctly only for first GPIO bank. This patch fixes
calculation of the register offsets for all GPIO banks.

Reported-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
12 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
Linus Torvalds [Thu, 20 Oct 2011 19:16:28 +0000 (22:16 +0300)]
Merge git://git./linux/kernel/git/davem/sparc

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc: Add alignment flag to PCI expansion resources
  sparc: Avoid calling sigprocmask()
  sparc: Use set_current_blocked()
  sparc32,leon: SRMMU MMU Table probe fix

12 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Thu, 20 Oct 2011 19:15:20 +0000 (22:15 +0300)]
Merge git://git./linux/kernel/git/davem/net

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  fib_rules: fix unresolved_rules counting
  r8169: fix wrong eee setting for rlt8111evl
  r8169: fix driver shutdown WoL regression.
  ehea: Change maintainer to me
  pptp: pptp_rcv_core() misses pskb_may_pull() call
  tproxy: copy transparent flag when creating a time wait
  pptp: fix skb leak in pptp_xmit()
  bonding: use local function pointer of bond->recv_probe in bond_handle_frame
  smsc911x: Add support for SMSC LAN89218
  tg3: negate USE_PHYLIB flag check
  netconsole: enable netconsole can make net_device refcnt incorrent
  bluetooth: Properly clone LSM attributes to newly created child connections
  l2tp: fix a potential skb leak in l2tp_xmit_skb()
  bridge: fix hang on removal of bridge via netlink
  x25: Prevent skb overreads when checking call user data
  x25: Handle undersized/fragmented skbs
  x25: Validate incoming call user data lengths
  udplite: fast-path computation of checksum coverage
  IPVS netns shutdown/startup dead-lock
  netfilter: nf_conntrack: fix event flooding in GRE protocol tracker

12 years agohwmon: (w83627ehf) Fix negative 8-bit temperature values
Jean Delvare [Thu, 20 Oct 2011 07:06:45 +0000 (03:06 -0400)]
hwmon: (w83627ehf) Fix negative 8-bit temperature values

Since 8-bit temperature values are now handled in 16-bit struct
members, values have to be cast to s8 for negative temperatures to be
properly handled. This is broken since kernel version 2.6.39
(commit bce26c58df86599c9570cee83eac58bdaae760e4.)

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Guenter Roeck <guenter.roeck@ericsson.com>
Cc: stable@kernel.org # 2.6.39+
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
12 years agomm: fix race between mremap and removing migration entry
Hugh Dickins [Wed, 19 Oct 2011 19:50:35 +0000 (12:50 -0700)]
mm: fix race between mremap and removing migration entry

I don't usually pay much attention to the stale "? " addresses in
stack backtraces, but this lucky report from Pawel Sikora hints that
mremap's move_ptes() has inadequate locking against page migration.

 3.0 BUG_ON(!PageLocked(p)) in migration_entry_to_page():
 kernel BUG at include/linux/swapops.h:105!
 RIP: 0010:[<ffffffff81127b76>]  [<ffffffff81127b76>]
                       migration_entry_wait+0x156/0x160
  [<ffffffff811016a1>] handle_pte_fault+0xae1/0xaf0
  [<ffffffff810feee2>] ? __pte_alloc+0x42/0x120
  [<ffffffff8112c26b>] ? do_huge_pmd_anonymous_page+0xab/0x310
  [<ffffffff81102a31>] handle_mm_fault+0x181/0x310
  [<ffffffff81106097>] ? vma_adjust+0x537/0x570
  [<ffffffff81424bed>] do_page_fault+0x11d/0x4e0
  [<ffffffff81109a05>] ? do_mremap+0x2d5/0x570
  [<ffffffff81421d5f>] page_fault+0x1f/0x30

mremap's down_write of mmap_sem, together with i_mmap_mutex or lock,
and pagetable locks, were good enough before page migration (with its
requirement that every migration entry be found) came in, and enough
while migration always held mmap_sem; but not enough nowadays, when
there's memory hotremove and compaction.

The danger is that move_ptes() lets a migration entry dodge around
behind remove_migration_pte()'s back, so it's in the old location when
looking at the new, then in the new location when looking at the old.

Either mremap's move_ptes() must additionally take anon_vma lock(), or
migration's remove_migration_pte() must stop peeking for is_swap_entry()
before it takes pagetable lock.

Consensus chooses the latter: we prefer to add overhead to migration
than to mremapping, which gets used by JVMs and by exec stack setup.

Reported-and-tested-by: Paweł Sikora <pluto@agmk.net>
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agosparc: Add alignment flag to PCI expansion resources
Kjetil Oftedal [Wed, 19 Oct 2011 23:20:50 +0000 (16:20 -0700)]
sparc: Add alignment flag to PCI expansion resources

Currently no type of alignment is specified for PCI expansion roms while
parsing the openfirmware tree. This causes calls to pci_map_rom() to fail.
IORESOURCE_SIZEALIGN is the default alignment used for rom resouces in
pci/probe.c, and has been verified to work with various cards on a ultra 10.

Signed-off-By: Kjetil Oftedal <oftedal@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agofib_rules: fix unresolved_rules counting
Yan, Zheng [Mon, 17 Oct 2011 15:20:28 +0000 (15:20 +0000)]
fib_rules: fix unresolved_rules counting

we should decrease ops->unresolved_rules when deleting a unresolved rule.

Signed-off-by: Zheng Yan <zheng.z.yan@intel.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agor8169: fix wrong eee setting for rlt8111evl
hayeswang [Thu, 13 Oct 2011 20:14:37 +0000 (20:14 +0000)]
r8169: fix wrong eee setting for rlt8111evl

Correct the wrong parameter for setting EEE for RTL8111E-VL.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agor8169: fix driver shutdown WoL regression.
françois romieu [Fri, 14 Oct 2011 00:57:45 +0000 (00:57 +0000)]
r8169: fix driver shutdown WoL regression.

Due to commit 92fc43b4159b518f5baae57301f26d770b0834c9 ("r8169: modify the
flow of the hw reset."), rtl8169_hw_reset stomps during driver shutdown on
RxConfig bits which are needed for WOL on some versions of the hardware.

As these bits were formerly set from the r81{0x, 68}_pll_power_down methods,
factor them out for use in the driver shutdown (rtl_shutdown) handler.

I favored __rtl8169_get_wol() -hardware state indication- over
RTL_FEATURE_WOL as the latter has become a good candidate for removal.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Hayes <hayeswang@realtek.com>
Tested-by: Marc Ballarin <ballarin.marc@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoehea: Change maintainer to me
Thadeu Lima de Souza Cascardo [Thu, 13 Oct 2011 09:56:19 +0000 (09:56 +0000)]
ehea: Change maintainer to me

Breno Leitao has passed the maintainership to me.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Cc: Breno Leitao <leitao@linux.vnet.ibm.com>
Acked-by: Breno Leitão <leitao@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoMerge branch 'v4l_for_linus' of git://linuxtv.org/mchehab/for_linus
Linus Torvalds [Wed, 19 Oct 2011 13:44:11 +0000 (06:44 -0700)]
Merge branch 'v4l_for_linus' of git://linuxtv.org/mchehab/for_linus

* 'v4l_for_linus' of git://linuxtv.org/mchehab/for_linus:
  [media] videodev: fix a NULL pointer dereference in v4l2_device_release()

12 years agoMerge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Linus Torvalds [Wed, 19 Oct 2011 13:43:24 +0000 (06:43 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/radeon/kms/atom: fix handling of FB scratch indices
  drm/radeon/kms/DCE4.1: fix Select_CrtcSource EncodeMode setting for DP bridges (v2)
  drm/radeon/kms/DCE4.1: ss is not supported on the internal pplls
  drm/radeon/kms/DCE4.1: fix dig encoder to transmitter mapping
  ttm: Fix error-path using an uninitialized value

12 years ago[media] videodev: fix a NULL pointer dereference in v4l2_device_release()
Antonio Ospite [Wed, 12 Oct 2011 20:59:26 +0000 (17:59 -0300)]
[media] videodev: fix a NULL pointer dereference in v4l2_device_release()

The change in 8280b66 does not cover the case when v4l2_dev is already
NULL, fix that.

With a Kinect sensor, seen as an USB camera using GSPCA in this context,
a NULL pointer dereference BUG can be triggered by just unplugging the
device after the camera driver has been loaded.

Signed-off-by: Antonio Ospite <ospite@studenti.unina.it>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
12 years agointel-iommu: fix superpage support in pfn_to_dma_pte()
Allen Kay [Fri, 14 Oct 2011 19:32:46 +0000 (12:32 -0700)]
intel-iommu: fix superpage support in pfn_to_dma_pte()

If target_level == 0, current code breaks out of the while-loop if
SUPERPAGE bit is set. We should also break out if PTE is not present.
If we don't do this, KVM calls to iommu_iova_to_phys() will cause
pfn_to_dma_pte() to create mapping for 4KiB pages.

Signed-off-by: Allen Kay <allen.m.kay@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
12 years agointel-iommu: set iommu_superpage on VM domains to lowest common denominator
Allen Kay [Fri, 14 Oct 2011 19:32:17 +0000 (12:32 -0700)]
intel-iommu: set iommu_superpage on VM domains to lowest common denominator

set dmar->iommu_superpage field to the smallest common denominator
of super page sizes supported by all active VT-d engines.  Initialize
this field in intel_iommu_domain_init() API so intel_iommu_map() API
will be able to use iommu_superpage field to determine the appropriate
super page size to use.

Signed-off-by: Allen Kay <allen.m.kay@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
12 years agointel-iommu: fix return value of iommu_unmap() API
Allen Kay [Fri, 14 Oct 2011 19:31:54 +0000 (12:31 -0700)]
intel-iommu: fix return value of iommu_unmap() API

iommu_unmap() API expects IOMMU drivers to return the actual page order
of the address being unmapped.  Previous code was just returning page
order passed in from the caller.  This patch fixes this problem.

Signed-off-by: Allen Kay <allen.m.kay@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
12 years agoMAINTAINERS: Update VT-d entry for drivers/pci -> drivers/iommu move
Roland Dreier [Tue, 11 Oct 2011 00:07:15 +0000 (17:07 -0700)]
MAINTAINERS: Update VT-d entry for drivers/pci -> drivers/iommu move

Commit 166e9278a3f9 ("x86/ia64: intel-iommu: move to drivers/iommu/")
moved the VT-d driver to drivers/iommu, but left the "F:" line in
MAINTAINERS pointing to drivers/pci, which breaks scripts/get_maintainer.pl.

Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
12 years agodrm/radeon/kms/atom: fix handling of FB scratch indices
Alex Deucher [Wed, 19 Oct 2011 00:10:05 +0000 (20:10 -0400)]
drm/radeon/kms/atom: fix handling of FB scratch indices

FB scratch indices are dword indices, but we were treating
them as byte indices.  As such, we were getting the wrong
FB scratch data for non-0 indices.  Fix the indices and
guard the indexing against indices larger than the scratch
allocation.

Fixes memory corruption on some boards if data was written
past the end of the FB scratch array.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reported-by: Dave Airlie <airlied@redhat.com>
Tested-by: Dave Airlie <airlied@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
12 years agopptp: pptp_rcv_core() misses pskb_may_pull() call
Eric Dumazet [Mon, 17 Oct 2011 17:59:53 +0000 (17:59 +0000)]
pptp: pptp_rcv_core() misses pskb_may_pull() call

e1000e uses paged frags, so any layer incorrectly pulling bytes from skb
can trigger a BUG in skb_pull()

[951.142737]  [<ffffffff813d2f36>] skb_pull+0x15/0x17
[951.142737]  [<ffffffffa0286824>] pptp_rcv_core+0x126/0x19a [pptp]
[951.152725]  [<ffffffff813d17c4>] sk_receive_skb+0x69/0x105
[951.163558]  [<ffffffffa0286993>] pptp_rcv+0xc8/0xdc [pptp]
[951.165092]  [<ffffffffa02800a3>] gre_rcv+0x62/0x75 [gre]
[951.165092]  [<ffffffff81410784>] ip_local_deliver_finish+0x150/0x1c1
[951.177599]  [<ffffffff81410634>] ? ip_local_deliver_finish+0x0/0x1c1
[951.177599]  [<ffffffff81410846>] NF_HOOK.clone.7+0x51/0x58
[951.177599]  [<ffffffff81410996>] ip_local_deliver+0x51/0x55
[951.177599]  [<ffffffff814105b9>] ip_rcv_finish+0x31a/0x33e
[951.177599]  [<ffffffff8141029f>] ? ip_rcv_finish+0x0/0x33e
[951.204898]  [<ffffffff81410846>] NF_HOOK.clone.7+0x51/0x58
[951.214651]  [<ffffffff81410bb5>] ip_rcv+0x21b/0x246

pptp_rcv_core() is a nice example of a function assuming everything it
needs is available in skb head.

Reported-by: Bradley Peterson <despite@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agotproxy: copy transparent flag when creating a time wait
KOVACS Krisztian [Tue, 18 Oct 2011 10:17:35 +0000 (10:17 +0000)]
tproxy: copy transparent flag when creating a time wait

The transparent socket option setting was not copied to the time wait
socket when an inet socket was being replaced by a time wait socket. This
broke the --transparent option of the socket match and may have caused
that FIN packets belonging to sockets in FIN_WAIT2 or TIME_WAIT state
were being dropped by the packet filter.

Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agopptp: fix skb leak in pptp_xmit()
Eric Dumazet [Mon, 17 Oct 2011 17:01:47 +0000 (17:01 +0000)]
pptp: fix skb leak in pptp_xmit()

In case we cant transmit skb, we must free it

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Dmitry Kozlov <xeb@mail.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobonding: use local function pointer of bond->recv_probe in bond_handle_frame
Mitsuo Hayasaka [Wed, 12 Oct 2011 16:04:29 +0000 (16:04 +0000)]
bonding: use local function pointer of bond->recv_probe in bond_handle_frame

The bond->recv_probe is called in bond_handle_frame() when
a packet is received, but bond_close() sets it to NULL. So,
a panic occurs when both functions work in parallel.

Why this happen:
After null pointer check of bond->recv_probe, an sk_buff is
duplicated and bond->recv_probe is called in bond_handle_frame.
So, a panic occurs when bond_close() is called between the
check and call of bond->recv_probe.

Patch:
This patch uses a local function pointer of bond->recv_probe
in bond_handle_frame(). So, it can avoid the null pointer
dereference.

Signed-off-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
Cc: Jay Vosburgh <fubar@us.ibm.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: WANG Cong <xiyou.wangcong@gmail.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agosmsc911x: Add support for SMSC LAN89218
Phil Edworthy [Wed, 12 Oct 2011 02:29:39 +0000 (02:29 +0000)]
smsc911x: Add support for SMSC LAN89218

LAN89218 is register compatible with LAN911x.

Signed-off-by: Phil Edworthy <phil.edworthy@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agotg3: negate USE_PHYLIB flag check
Jiri Pirko [Tue, 11 Oct 2011 23:00:41 +0000 (23:00 +0000)]
tg3: negate USE_PHYLIB flag check

USE_PHYLIB flag in tg3_remove_one() is being checked incorrectly. This
results tg3_phy_fini->phy_disconnect is never called and when tg3 module
is removed.

In my case this resulted in panics in phy_state_machine calling function
phydev->adjust_link.

So correct this check.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Acked-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonetconsole: enable netconsole can make net_device refcnt incorrent
Gao feng [Tue, 11 Oct 2011 16:08:11 +0000 (16:08 +0000)]
netconsole: enable netconsole can make net_device refcnt incorrent

There is no check if netconsole is enabled current.
so when exec echo 1 > enabled;
the reference of net_device will increment always.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Acked-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobluetooth: Properly clone LSM attributes to newly created child connections
Paul Moore [Fri, 7 Oct 2011 09:40:59 +0000 (09:40 +0000)]
bluetooth: Properly clone LSM attributes to newly created child connections

The Bluetooth stack has internal connection handlers for all of the various
Bluetooth protocols, and unfortunately, they are currently lacking the LSM
hooks found in the core network stack's connection handlers.  I say
unfortunately, because this can cause problems for users who have have an
LSM enabled and are using certain Bluetooth devices.  See one problem
report below:

 * http://bugzilla.redhat.com/show_bug.cgi?id=741703

In order to keep things simple at this point in time, this patch fixes the
problem by cloning the parent socket's LSM attributes to the newly created
child socket.  If we decide we need a more elaborate LSM marking mechanism
for Bluetooth (I somewhat doubt this) we can always revisit this decision
in the future.

Reported-by: James M. Cape <jcape@ignore-your.tv>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agol2tp: fix a potential skb leak in l2tp_xmit_skb()
Eric Dumazet [Fri, 7 Oct 2011 05:35:46 +0000 (05:35 +0000)]
l2tp: fix a potential skb leak in l2tp_xmit_skb()

l2tp_xmit_skb() can leak one skb if skb_cow_head() returns an error.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobridge: fix hang on removal of bridge via netlink
stephen hemminger [Thu, 6 Oct 2011 11:19:41 +0000 (11:19 +0000)]
bridge: fix hang on removal of bridge via netlink

Need to cleanup bridge device timers and ports when being bridge
device is being removed via netlink.

This fixes the problem of observed when doing:
 ip link add br0 type bridge
 ip link set dev eth1 master br0
 ip link set br0 up
 ip link del br0

which would cause br0 to hang in unregister_netdev because
of leftover reference count.

Reported-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agocputimer: Cure lock inversion
Peter Zijlstra [Mon, 17 Oct 2011 09:50:30 +0000 (11:50 +0200)]
cputimer: Cure lock inversion

There's a lock inversion between the cputimer->lock and rq->lock;
notably the two callchains involved are:

 update_rlimit_cpu()
   sighand->siglock
   set_process_cpu_timer()
     cpu_timer_sample_group()
       thread_group_cputimer()
         cputimer->lock
         thread_group_cputime()
           task_sched_runtime()
             ->pi_lock
             rq->lock

 scheduler_tick()
   rq->lock
   task_tick_fair()
     update_curr()
       account_group_exec()
         cputimer->lock

Where the first one is enabling a CLOCK_PROCESS_CPUTIME_ID timer, and
the second one is keeping up-to-date.

This problem was introduced by e8abccb7193 ("posix-cpu-timers: Cure
SMP accounting oddities").

Cure the problem by removing the cputimer->lock and rq->lock nesting,
this leaves concurrent enablers doing duplicate work, but the time
wasted should be on the same order otherwise wasted spinning on the
lock and the greater-than assignment filter should ensure we preserve
monotonicity.

Reported-by: Dave Jones <davej@redhat.com>
Reported-by: Simon Kirby <sim@hostway.ca>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: stable@kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/1318928713.21167.4.camel@twins
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
12 years agodrm/radeon/kms/DCE4.1: fix Select_CrtcSource EncodeMode setting for DP bridges (v2)
Alex Deucher [Wed, 12 Oct 2011 22:49:53 +0000 (18:49 -0400)]
drm/radeon/kms/DCE4.1: fix Select_CrtcSource EncodeMode setting for DP bridges (v2)

Settings in this table reflect the physical panel/connector rather
than the internal dig encoding.

v2: fix typo for DRM_MODE_CONNECTOR_VGA case.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
12 years agodrm/radeon/kms/DCE4.1: ss is not supported on the internal pplls
Alex Deucher [Wed, 12 Oct 2011 22:44:33 +0000 (18:44 -0400)]
drm/radeon/kms/DCE4.1: ss is not supported on the internal pplls

It's handled via external clock.  It should already be protected
by the external ss flag, but add an explicit check just in case.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
12 years agodrm/radeon/kms/DCE4.1: fix dig encoder to transmitter mapping
Alex Deucher [Wed, 12 Oct 2011 22:44:32 +0000 (18:44 -0400)]
drm/radeon/kms/DCE4.1: fix dig encoder to transmitter mapping

llano has fully routeable dig encoders similar to DCE3.2 while
ontario has a hardcoded mapping similar to DCE4.0.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
12 years agoALSA: HDA: conexant support for Lenovo T520/W520
Daniel Suchy [Tue, 18 Oct 2011 09:09:44 +0000 (11:09 +0200)]
ALSA: HDA: conexant support for Lenovo T520/W520

This is patch for Conexant codec of Intel HDA driver, adding new quirk
for Lenovo Thinkpad T520 and W520. Conexant autodetection works fine for
T520 (similar subsystem ID is used also in W520 model) and detects more
mixer features compared to generic (fallback) Lenovo quirk with
hardcoded options in Conexant codec.

Patch was activelly tested with Linux 3.0.4, 3.0.6 and 3.0.7 without any
problems.

Signed-off-by: Daniel Suchy <danny@danysek.cz>
Cc: <stable@kernel.org> [3.0+]
Signed-off-by: Takashi Iwai <tiwai@suse.de>
12 years agoALSA: hda - Add position_fix quirk for Dell Inspiron 1010
Takashi Iwai [Tue, 18 Oct 2011 08:44:05 +0000 (10:44 +0200)]
ALSA: hda - Add position_fix quirk for Dell Inspiron 1010

The previous fix for the position-buffer check gives yet another
regression on a Dell laptop.  The safest fix right now is to add a
static quirk for this device (and better to apply it for stable
kernels too).

Reported-by: Éric Piel <Eric.Piel@tremplin-utc.net>
Cc: <stable@kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
12 years agottm: Fix error-path using an uninitialized value
Thomas Hellstrom [Mon, 17 Oct 2011 11:27:34 +0000 (13:27 +0200)]
ttm: Fix error-path using an uninitialized value

Pointed out by Michel Daenzer.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
12 years agoLinux 3.1-rc10 v3.1-rc10
Linus Torvalds [Tue, 18 Oct 2011 04:06:23 +0000 (21:06 -0700)]
Linux 3.1-rc10

12 years agoMerge branch 'nf' of git://1984.lsi.us.es/net
David S. Miller [Mon, 17 Oct 2011 23:38:03 +0000 (19:38 -0400)]
Merge branch 'nf' of git://1984.lsi.us.es/net

12 years agox25: Prevent skb overreads when checking call user data
Matthew Daley [Fri, 14 Oct 2011 18:45:05 +0000 (18:45 +0000)]
x25: Prevent skb overreads when checking call user data

x25_find_listener does not check that the amount of call user data given
in the skb is big enough in per-socket comparisons, hence buffer
overreads may occur.  Fix this by adding a check.

Signed-off-by: Matthew Daley <mattjd@gmail.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Andrew Hendry <andrew.hendry@gmail.com>
Cc: stable <stable@kernel.org>
Acked-by: Andrew Hendry <andrew.hendry@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agox25: Handle undersized/fragmented skbs
Matthew Daley [Fri, 14 Oct 2011 18:45:04 +0000 (18:45 +0000)]
x25: Handle undersized/fragmented skbs

There are multiple locations in the X.25 packet layer where a skb is
assumed to be of at least a certain size and that all its data is
currently available at skb->data.  These assumptions are not checked,
hence buffer overreads may occur.  Use pskb_may_pull to check these
minimal size assumptions and ensure that data is available at skb->data
when necessary, as well as use skb_copy_bits where needed.

Signed-off-by: Matthew Daley <mattjd@gmail.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Andrew Hendry <andrew.hendry@gmail.com>
Cc: stable <stable@kernel.org>
Acked-by: Andrew Hendry <andrew.hendry@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agox25: Validate incoming call user data lengths
Matthew Daley [Fri, 14 Oct 2011 18:45:03 +0000 (18:45 +0000)]
x25: Validate incoming call user data lengths

X.25 call user data is being copied in its entirety from incoming messages
without consideration to the size of the destination buffers, leading to
possible buffer overflows. Validate incoming call user data lengths before
these copies are performed.

It appears this issue was noticed some time ago, however nothing seemed to
come of it: see http://www.spinics.net/lists/linux-x25/msg00043.html and
commit 8db09f26f912f7c90c764806e804b558da520d4f.

Signed-off-by: Matthew Daley <mattjd@gmail.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Tested-by: Andrew Hendry <andrew.hendry@gmail.com>
Cc: stable <stable@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoudplite: fast-path computation of checksum coverage
Gerrit Renker [Mon, 17 Oct 2011 23:07:30 +0000 (19:07 -0400)]
udplite: fast-path computation of checksum coverage

Commit 903ab86d195cca295379699299c5fc10beba31c7 of 1 March this year ("udp: Add
lockless transmit path") introduced a new fast TX path that broke the checksum
coverage computation of UDP-lite, which so far depended on up->len (only set
if the socket is locked and 0 in the fast path).

Fixed by providing both fast- and slow-path computation of checksum coverage.
The latter can be removed when UDP(-lite)v6 also uses a lockless transmit path.

Reported-by: Thomas Volkert <thomas@homer-conferencing.com>
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoAvoid using variable-length arrays in kernel/sys.c
Linus Torvalds [Mon, 17 Oct 2011 15:24:24 +0000 (08:24 -0700)]
Avoid using variable-length arrays in kernel/sys.c

The size is always valid, but variable-length arrays generate worse code
for no good reason (unless the function happens to be inlined and the
compiler sees the length for the simple constant it is).

Also, there seems to be some code generation problem on POWER, where
Henrik Bakken reports that register r28 can get corrupted under some
subtle circumstances (interrupt happening at the wrong time?).  That all
indicates some seriously broken compiler issues, but since variable
length arrays are bad regardless, there's little point in trying to
chase it down.

"Just don't do that, then".

Reported-by: Henrik Grindal Bakken <henribak@cisco.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoMerge branch 'fixes' of http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur...
Linus Torvalds [Sun, 16 Oct 2011 20:08:27 +0000 (13:08 -0700)]
Merge branch 'fixes' of ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm

* 'fixes' of http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm:
  ARM: 7128/1: vic: Don't write to the read-only register VIC_IRQ_STATUS
  ARM: 7122/1: localtimer: add header linux/errno.h explicitly
  ARM: 7117/1: perf: fix HW_CACHE_* events on Cortex-A9
  ARM: 7113/1: mm: Align bank start to MAX_ORDER_NR_PAGES

12 years agoARM: 7128/1: vic: Don't write to the read-only register VIC_IRQ_STATUS
Zoltan Devai [Mon, 10 Oct 2011 13:54:12 +0000 (14:54 +0100)]
ARM: 7128/1: vic: Don't write to the read-only register VIC_IRQ_STATUS

This is unneeded and causes an abort on the SPMP8000 platform.

Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Zoltan Devai <zoss@devai.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
12 years agoARM: 7122/1: localtimer: add header linux/errno.h explicitly
Shawn Guo [Thu, 6 Oct 2011 13:57:24 +0000 (14:57 +0100)]
ARM: 7122/1: localtimer: add header linux/errno.h explicitly

Per the text in  Documentation/SubmitChecklist as below, we should
explicitly have header linux/errno.h in localtimer.h for ENXIO
reference.

1: If you use a facility then #include the file that defines/declares
   that facility.  Don't depend on other header files pulling in ones
   that you use.

Otherwise, we may run into some compiling error like the following one,
if any file includes localtimer.h without CONFIG_LOCAL_TIMERS defined.

  arch/arm/include/asm/localtimer.h: In function â€˜local_timer_setup’:
  arch/arm/include/asm/localtimer.h:53:10: error: â€˜ENXIO’ undeclared (first use in this function)

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
12 years agoARM: 7117/1: perf: fix HW_CACHE_* events on Cortex-A9
Will Deacon [Mon, 3 Oct 2011 17:30:53 +0000 (18:30 +0100)]
ARM: 7117/1: perf: fix HW_CACHE_* events on Cortex-A9

Using COHERENT_LINE_{MISS,HIT} for cache misses and references
respectively is completely wrong. Instead, use the L1D events which
are a better and more useful approximation despite ignoring instruction
traffic.

Reported-by: Alasdair Grant <alasdair.grant@arm.com>
Reported-by: Matt Horsnell <matt.horsnell@arm.com>
Reported-by: Michael Williams <michael.williams@arm.com>
Cc: stable@kernel.org
Cc: Jean Pihet <j-pihet@ti.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
12 years agoMerge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groec...
Linus Torvalds [Fri, 14 Oct 2011 20:29:09 +0000 (08:29 +1200)]
Merge branch 'hwmon-for-linus' of git://git./linux/kernel/git/groeck/linux-staging

* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (w83627ehf) Properly report thermal diode sensors

12 years agointel-iommu: Export a flag indicating that the IOMMU is used for iGFX.
David Woodhouse [Fri, 14 Oct 2011 19:59:46 +0000 (20:59 +0100)]
intel-iommu: Export a flag indicating that the IOMMU is used for iGFX.

We really don't want this to work in the general case; device drivers
*shouldn't* care whether they are behind an IOMMU or not. But the
integrated graphics is a special case, because the IOMMU and the GTT are
all kind of smashed into one and generally horrifically buggy, so it's
reasonable for the graphics driver to want to know when the IOMMU is
active for the graphics hardware.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
12 years agointel-iommu: Workaround IOTLB hang on Ironlake GPU
David Woodhouse [Mon, 26 Sep 2011 02:11:14 +0000 (19:11 -0700)]
intel-iommu: Workaround IOTLB hang on Ironlake GPU

To work around a hardware issue, we have to submit IOTLB flushes while
the graphics engine is idle. The graphics driver will (we hope) go to
great lengths to ensure that it gets that right on the affected
chipset(s)... so let's not screw it over by deferring the unmap and
doing it later. That wouldn't be very helpful.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
12 years agoMerge branch 'gpio/merge' of git://git.secretlab.ca/git/linux-2.6
Linus Torvalds [Fri, 14 Oct 2011 05:07:52 +0000 (17:07 +1200)]
Merge branch 'gpio/merge' of git://git.secretlab.ca/git/linux-2.6

* 'gpio/merge' of git://git.secretlab.ca/git/linux-2.6:
  gpio-pca953x: fix gpio_base
  gpio/omap: fix build error with certain OMAP1 configs

12 years agoMerge branch 'for-linus' of git://oss.sgi.com/xfs/xfs
Linus Torvalds [Fri, 14 Oct 2011 05:06:39 +0000 (17:06 +1200)]
Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs

* 'for-linus' of git://oss.sgi.com/xfs/xfs:
  xfs: revert to using a kthread for AIL pushing
  xfs: force the log if we encounter pinned buffers in .iop_pushbuf
  xfs: do not update xa_last_pushed_lsn for locked items

12 years agoMerge branch 'stable' of git://github.com/cmetcalf-tilera/linux-tile
Linus Torvalds [Fri, 14 Oct 2011 04:59:11 +0000 (16:59 +1200)]
Merge branch 'stable' of git://github.com/cmetcalf-tilera/linux-tile

* 'stable' of git://github.com/cmetcalf-tilera/linux-tile:
  tile: revert change from <asm/atomic.h> to <linux/atomic.h> in asm files

12 years agoMerge branch 'x86-urgent-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip
Linus Torvalds [Fri, 14 Oct 2011 04:54:56 +0000 (16:54 +1200)]
Merge branch 'x86-urgent-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip

* 'x86-urgent-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip:
  x86: Default to vsyscall=native for now

12 years agox86, mrst: use a temporary variable for SFI irq
Mika Westerberg [Thu, 13 Oct 2011 09:04:20 +0000 (12:04 +0300)]
x86, mrst: use a temporary variable for SFI irq

SFI tables reside in RAM and should not be modified once they are
written.  Current code went to set pentry->irq to zero which causes
subsequent reads to fail with invalid SFI table checksum.  This will
break kexec as the second kernel fails to validate SFI tables.

To fix this we use temporary variable for irq number.

Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agohwmon: (w83627ehf) Properly report thermal diode sensors
Jean Delvare [Thu, 13 Oct 2011 19:49:08 +0000 (15:49 -0400)]
hwmon: (w83627ehf) Properly report thermal diode sensors

The w83627ehf driver is improperly reporting thermal diode sensors as
type 2, instead of 3. This caused "sensors" and possibly other
monitoring tools to report these sensors as "transistor" instead of
"thermal diode".

Furthermore, diode subtype selection (CPU vs. external) is only
supported by the original W83627EHF/EHG. All later models only support
CPU diode type, and some (NCT6776F) don't even have the register in
question so we should avoid reading from it.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: stable@kernel.org
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
12 years agogpio-pca953x: fix gpio_base
Hartmut Knaack [Mon, 10 Oct 2011 22:22:45 +0000 (00:22 +0200)]
gpio-pca953x: fix gpio_base

gpio_base was set to 0 if no system platform data or open firmware
platform data was provided. This led to conflicts, if any other gpiochip
with a gpiobase of 0 was instantiated already. Setting it to -1 will
automatically use the first one available.

Signed-off-by: Hartmut Knaack <knaack.h@gmx.de>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
12 years agogpio/omap: fix build error with certain OMAP1 configs
Janusz Krzysztofik [Tue, 23 Aug 2011 11:42:24 +0000 (13:42 +0200)]
gpio/omap: fix build error with certain OMAP1 configs

With commit f64ad1a0e21a, "gpio/omap: cleanup _set_gpio_wakeup(), remove
ifdefs", access to build time conditionally omitted 'suspend_wakeup'
member of the 'gpio_bank' structure has been placed unconditionally in
function _set_gpio_wakeup(), which is always built. This resulted in the
driver compilation broken for certain OMAP1, i.e., non-OMAP16xx,
configurations.

Really required or not in previously excluded cases, define this
structure member unconditionally as a fix.

Tested with a custom OMAP1510 only configuration.

Signed-off-by: Janusz Krzysztofik <jkrzyszt@tis.icnet.pl>
Acked-by: Kevin Hilman <khilman@ti.com>
Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
12 years agotile: revert change from <asm/atomic.h> to <linux/atomic.h> in asm files
Chris Metcalf [Wed, 5 Oct 2011 21:09:29 +0000 (17:09 -0400)]
tile: revert change from <asm/atomic.h> to <linux/atomic.h> in asm files

The 32-bit TILEPro support uses some #defines in <asm/atomic_32.h>
for atomic support routines in assembly.  To make this more explicit,
I've turned those includes into includes of <asm/atomic_32.h>, which
should hopefully make it clear that they shouldn't be bombed into
<linux/atomic.h> in any cleanups.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
12 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Thu, 13 Oct 2011 06:25:45 +0000 (18:25 +1200)]
Merge git://git./linux/kernel/git/davem/net

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  mscan: too much data copied to CAN frame due to 16 bit accesses
  gro: refetch inet6_protos[] after pulling ext headers
  bnx2x: fix cl_id allocation for non-eth clients for NPAR mode
  mlx4_en: fix endianness with blue frame support

12 years agoide: Fix file references in drivers/ide/
Johann Felix Soden [Mon, 10 Oct 2011 18:37:00 +0000 (11:37 -0700)]
ide: Fix file references in drivers/ide/

Fix file references in drivers/ide/

There are a lot of file references to now moved or deleted files in the
whole tree, especially in documentation and Kconfig files.  This patch
fixes the references in drivers/ide/.

Signed-off-by: Johann Felix Soden <johfel@users.sourceforge.net>
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoMerge branch 'btrfs-3.0' of git://github.com/chrismason/linux
Linus Torvalds [Thu, 13 Oct 2011 06:20:40 +0000 (18:20 +1200)]
Merge branch 'btrfs-3.0' of git://github.com/chrismason/linux

* 'btrfs-3.0' of git://github.com/chrismason/linux:
  Btrfs: make sure not to defrag extents past i_size
  Btrfs: fix recursive auto-defrag

12 years agosparc: Avoid calling sigprocmask()
David S. Miller [Wed, 12 Oct 2011 19:27:35 +0000 (12:27 -0700)]
sparc: Avoid calling sigprocmask()

Use set_current_blocked() instead.

Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agosparc: Use set_current_blocked()
Matt Fleming [Thu, 11 Aug 2011 13:57:02 +0000 (14:57 +0100)]
sparc: Use set_current_blocked()

As described in e6fa16ab ("signal: sigprocmask() should do
retarget_shared_pending()") the modification of current->blocked is
incorrect as we need to check whether the signal we're about to block
is pending in the shared queue.

Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoIPVS netns shutdown/startup dead-lock
Hans Schillstrom [Tue, 11 Oct 2011 01:54:35 +0000 (10:54 +0900)]
IPVS netns shutdown/startup dead-lock

ip_vs_mutext is used by both netns shutdown code and startup
and both implicit uses sk_lock-AF_INET mutex.

cleanup CPU-1         startup CPU-2
ip_vs_dst_event()     ip_vs_genl_set_cmd()
 sk_lock-AF_INET     __ip_vs_mutex
                     sk_lock-AF_INET
__ip_vs_mutex
* DEAD LOCK *

A new mutex placed in ip_vs netns struct called sync_mutex is added.

Comments from Julian and Simon added.
This patch has been running for more than 3 month now and it seems to work.

Ver. 3
    IP_VS_SO_GET_DAEMON in do_ip_vs_get_ctl protected by sync_mutex
    instead of __ip_vs_mutex as sugested by Julian.

Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
12 years agoxfs: revert to using a kthread for AIL pushing
Christoph Hellwig [Tue, 11 Oct 2011 15:14:10 +0000 (11:14 -0400)]
xfs: revert to using a kthread for AIL pushing

Currently we have a few issues with the way the workqueue code is used to
implement AIL pushing:

 - it accidentally uses the same workqueue as the syncer action, and thus
   can be prevented from running if there are enough sync actions active
   in the system.
 - it doesn't use the HIGHPRI flag to queue at the head of the queue of
   work items

At this point I'm not confident enough in getting all the workqueue flags and
tweaks right to provide a perfectly reliable execution context for AIL
pushing, which is the most important piece in XFS to make forward progress
when the log fills.

Revert back to use a kthread per filesystem which fixes all the above issues
at the cost of having a task struct and stack around for each mounted
filesystem.  In addition this also gives us much better ways to diagnose
any issues involving hung AIL pushing and removes a small amount of code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Stefan Priebe <s.priebe@profihost.ag>
Tested-by: Stefan Priebe <s.priebe@profihost.ag>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
12 years agoxfs: force the log if we encounter pinned buffers in .iop_pushbuf
Christoph Hellwig [Tue, 11 Oct 2011 15:14:09 +0000 (15:14 +0000)]
xfs: force the log if we encounter pinned buffers in .iop_pushbuf

We need to check for pinned buffers even in .iop_pushbuf given that inode
items flush into the same buffers that may be pinned directly due operations
on the unlinked inode list operating directly on buffers.  To do this add a
return value to .iop_pushbuf that tells the AIL push about this and use
the existing log force mechanisms to unpin it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Stefan Priebe <s.priebe@profihost.ag>
Tested-by: Stefan Priebe <s.priebe@profihost.ag>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
12 years agoxfs: do not update xa_last_pushed_lsn for locked items
Christoph Hellwig [Tue, 11 Oct 2011 15:14:08 +0000 (15:14 +0000)]
xfs: do not update xa_last_pushed_lsn for locked items

If an item was locked we should not update xa_last_pushed_lsn and thus skip
it when restarting the AIL scan as we need to be able to lock and write it
out as soon as possible.  Otherwise heavy lock contention might starve AIL
pushing too easily, especially given the larger backoff once we moved
xa_last_pushed_lsn all the way to the target lsn.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Stefan Priebe <s.priebe@profihost.ag>
Tested-by: Stefan Priebe <s.priebe@profihost.ag>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
12 years agoBtrfs: make sure not to defrag extents past i_size
Chris Mason [Tue, 11 Oct 2011 15:41:40 +0000 (11:41 -0400)]
Btrfs: make sure not to defrag extents past i_size

The btrfs file defrag code will loop through the extents and
force COW on them.  But there is a concurrent truncate in the middle of
the defrag, it might end up defragging the same range over and over
again.

The problem is that writepage won't go through and do anything on pages
past i_size, so the cow won't happen, so the file will appear to still
be fragmented.  defrag will end up hitting the same extents again and
again.

In the worst case, the truncate can actually live lock with the defrag
because the defrag keeps creating new ordered extents which the truncate
code keeps waiting on.

The fix here is to make defrag check for i_size inside the main loop,
instead of just once before the looping starts.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
12 years agox86: Default to vsyscall=native for now
Adrian Bunk [Wed, 5 Oct 2011 21:40:47 +0000 (00:40 +0300)]
x86: Default to vsyscall=native for now

This UML breakage:

  linux-2.6.30.1[3800] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb9c498 ax:ffffffffff600000 si:0 di:606790
  linux-2.6.30.1[3856] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb13168 ax:ffffffffff600000 si:0 di:606790

Is caused by commit 3ae36655 ("x86-64: Rework vsyscall emulation and add
vsyscall= parameter") - the vsyscall emulation code is not fully cooked
yet as UML relies on some rather fragile SIGSEGV semantics.

Linus suggested in https://lkml.org/lkml/2011/8/9/376 to default
to vsyscall=native for now, this patch implements that.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Andrew Lutomirski <luto@mit.edu>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/20111005214047.GE14406@localhost.pp.htv.fi
Signed-off-by: Ingo Molnar <mingo@elte.hu>
12 years agointel-iommu: Fix AB-BA lockdep report
Roland Dreier [Wed, 20 Jul 2011 13:22:21 +0000 (06:22 -0700)]
intel-iommu: Fix AB-BA lockdep report

When unbinding a device so that I could pass it through to a KVM VM, I
got the lockdep report below.  It looks like a legitimate lock
ordering problem:

 - domain_context_mapping_one() takes iommu->lock and calls
   iommu_support_dev_iotlb(), which takes device_domain_lock (inside
   iommu->lock).

 - domain_remove_one_dev_info() starts by taking device_domain_lock
   then takes iommu->lock inside it (near the end of the function).

So this is the classic AB-BA deadlock.  It looks like a safe fix is to
simply release device_domain_lock a bit earlier, since as far as I can
tell, it doesn't protect any of the stuff accessed at the end of
domain_remove_one_dev_info() anyway.

BTW, the use of device_domain_lock looks a bit unsafe to me... it's
at least not obvious to me why we aren't vulnerable to the race below:

  iommu_support_dev_iotlb()
                                          domain_remove_dev_info()

  lock device_domain_lock
    find info
  unlock device_domain_lock

                                          lock device_domain_lock
                                            find same info
                                          unlock device_domain_lock

                                          free_devinfo_mem(info)

  do stuff with info after it's free

However I don't understand the locking here well enough to know if
this is a real problem, let alone what the best fix is.

Anyway here's the full lockdep output that prompted all of this:

     =======================================================
     [ INFO: possible circular locking dependency detected ]
     2.6.39.1+ #1
     -------------------------------------------------------
     bash/13954 is trying to acquire lock:
      (&(&iommu->lock)->rlock){......}, at: [<ffffffff812f6421>] domain_remove_one_dev_info+0x121/0x230

     but task is already holding lock:
      (device_domain_lock){-.-...}, at: [<ffffffff812f6508>] domain_remove_one_dev_info+0x208/0x230

     which lock already depends on the new lock.

     the existing dependency chain (in reverse order) is:

     -> #1 (device_domain_lock){-.-...}:
            [<ffffffff8109ca9d>] lock_acquire+0x9d/0x130
            [<ffffffff81571475>] _raw_spin_lock_irqsave+0x55/0xa0
            [<ffffffff812f8350>] domain_context_mapping_one+0x600/0x750
            [<ffffffff812f84df>] domain_context_mapping+0x3f/0x120
            [<ffffffff812f9175>] iommu_prepare_identity_map+0x1c5/0x1e0
            [<ffffffff81ccf1ca>] intel_iommu_init+0x88e/0xb5e
            [<ffffffff81cab204>] pci_iommu_init+0x16/0x41
            [<ffffffff81002165>] do_one_initcall+0x45/0x190
            [<ffffffff81ca3d3f>] kernel_init+0xe3/0x168
            [<ffffffff8157ac24>] kernel_thread_helper+0x4/0x10

     -> #0 (&(&iommu->lock)->rlock){......}:
            [<ffffffff8109bf3e>] __lock_acquire+0x195e/0x1e10
            [<ffffffff8109ca9d>] lock_acquire+0x9d/0x130
            [<ffffffff81571475>] _raw_spin_lock_irqsave+0x55/0xa0
            [<ffffffff812f6421>] domain_remove_one_dev_info+0x121/0x230
            [<ffffffff812f8b42>] device_notifier+0x72/0x90
            [<ffffffff8157555c>] notifier_call_chain+0x8c/0xc0
            [<ffffffff81089768>] __blocking_notifier_call_chain+0x78/0xb0
            [<ffffffff810897b6>] blocking_notifier_call_chain+0x16/0x20
            [<ffffffff81373a5c>] __device_release_driver+0xbc/0xe0
            [<ffffffff81373ccf>] device_release_driver+0x2f/0x50
            [<ffffffff81372ee3>] driver_unbind+0xa3/0xc0
            [<ffffffff813724ac>] drv_attr_store+0x2c/0x30
            [<ffffffff811e4506>] sysfs_write_file+0xe6/0x170
            [<ffffffff8117569e>] vfs_write+0xce/0x190
            [<ffffffff811759e4>] sys_write+0x54/0xa0
            [<ffffffff81579a82>] system_call_fastpath+0x16/0x1b

     other info that might help us debug this:

     6 locks held by bash/13954:
      #0:  (&buffer->mutex){+.+.+.}, at: [<ffffffff811e4464>] sysfs_write_file+0x44/0x170
      #1:  (s_active#3){++++.+}, at: [<ffffffff811e44ed>] sysfs_write_file+0xcd/0x170
      #2:  (&__lockdep_no_validate__){+.+.+.}, at: [<ffffffff81372edb>] driver_unbind+0x9b/0xc0
      #3:  (&__lockdep_no_validate__){+.+.+.}, at: [<ffffffff81373cc7>] device_release_driver+0x27/0x50
      #4:  (&(&priv->bus_notifier)->rwsem){.+.+.+}, at: [<ffffffff8108974f>] __blocking_notifier_call_chain+0x5f/0xb0
      #5:  (device_domain_lock){-.-...}, at: [<ffffffff812f6508>] domain_remove_one_dev_info+0x208/0x230

     stack backtrace:
     Pid: 13954, comm: bash Not tainted 2.6.39.1+ #1
     Call Trace:
      [<ffffffff810993a7>] print_circular_bug+0xf7/0x100
      [<ffffffff8109bf3e>] __lock_acquire+0x195e/0x1e10
      [<ffffffff810972bd>] ? trace_hardirqs_off+0xd/0x10
      [<ffffffff8109d57d>] ? trace_hardirqs_on_caller+0x13d/0x180
      [<ffffffff8109ca9d>] lock_acquire+0x9d/0x130
      [<ffffffff812f6421>] ? domain_remove_one_dev_info+0x121/0x230
      [<ffffffff81571475>] _raw_spin_lock_irqsave+0x55/0xa0
      [<ffffffff812f6421>] ? domain_remove_one_dev_info+0x121/0x230
      [<ffffffff810972bd>] ? trace_hardirqs_off+0xd/0x10
      [<ffffffff812f6421>] domain_remove_one_dev_info+0x121/0x230
      [<ffffffff812f8b42>] device_notifier+0x72/0x90
      [<ffffffff8157555c>] notifier_call_chain+0x8c/0xc0
      [<ffffffff81089768>] __blocking_notifier_call_chain+0x78/0xb0
      [<ffffffff810897b6>] blocking_notifier_call_chain+0x16/0x20
      [<ffffffff81373a5c>] __device_release_driver+0xbc/0xe0
      [<ffffffff81373ccf>] device_release_driver+0x2f/0x50
      [<ffffffff81372ee3>] driver_unbind+0xa3/0xc0
      [<ffffffff813724ac>] drv_attr_store+0x2c/0x30
      [<ffffffff811e4506>] sysfs_write_file+0xe6/0x170
      [<ffffffff8117569e>] vfs_write+0xce/0x190
      [<ffffffff811759e4>] sys_write+0x54/0xa0
      [<ffffffff81579a82>] system_call_fastpath+0x16/0x1b

Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>