Russell King [Thu, 30 Jun 2011 07:45:49 +0000 (08:45 +0100)]
 
ARM: pm: omap3: move saving of the auxiliary control registers to C
Move the saving of the auxiliary control registers into C; there's
no need for this to be in assembly code.  This results in less
assembly code to deal with in OMAP.
Kevin tested full-chip retention and off on 3430/n900, 3530/Overo and
3630/Zoom3.
Tested-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Jean Pihet [Wed, 29 Jun 2011 16:40:23 +0000 (18:40 +0200)]
 
ARM: pm: omap3: run the ASM sleep code from DDR
Most of the ASM sleep code (in arch/arm/mach-omap2/sleep34xx.S)
is copied to internal SRAM at boot and after wake-up from CORE OFF
mode.  However only a small part of the code really needs to run from
internal SRAM.
This fix lets most of the ASM idle code run from the DDR in order to
minimize the SRAM usage and the overhead in the code copy.
The only pieces of code that are mandatory in SRAM are:
- the i443 erratum WA,
- the i581 erratum WA,
- the security extension code.
SRAM usage:
- original code:
  . 560 bytes for omap3_sram_configure_core_dpll (used by DVFS),
  . 852 bytes for omap_sram_idle (used by suspend/resume in RETention),
  . 124 bytes for es3_sdrc_fix (used by suspend/resume in OFF mode on ES3.x),
  . 108 bytes for save_secure_ram_context (used on HS parts only).
With this fix the usage for suspend/resume in RETention goes down 288
bytes, so the gain in SRAM usage for suspend/resume is 564 bytes.
Also fixed the SRAM initialization sequence to avoid an unnecessary
copy to SRAM at boot time and for readability.
Tested on Beagleboard (ES2.x) in idle with full RET and OFF modes.
Kevin Hilman tested retention and off on 3430/n900, 3530/Overo and
3630/Zoom3
Signed-off-by: Jean Pihet <j-pihet@ti.com>
Reviewed-by: Kevin Hilman <khilman@ti.com>
Tested-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Thu, 23 Jun 2011 13:24:09 +0000 (14:24 +0100)]
 
ARM: pm: ensure our temporary page table entry is removed from the TLB
Ensure that our temporary page table entry is flushed from the TLB
before we resume normal operations.  This ensures that userspace
won't trip over the stale TLB entry.
Tested-by: Kevin Hilman <khilman@ti.com>
Acked-by: Jean Pihet <j-pihet@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Wed, 22 Jun 2011 16:41:48 +0000 (17:41 +0100)]
 
ARM: pm: hide 1st and 2nd arguments to cpu_suspend from platform code
The first and second arguments shouldn't concern platform code, so
hide them from each platforms caller.
Tested-by: Kevin Hilman <khilman@ti.com>
Acked-by: Jean Pihet <j-pihet@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Kevin Hilman [Fri, 24 Jun 2011 00:16:14 +0000 (17:16 -0700)]
 
ARM: pm: omap34xx: remove get_*_restore_pointer functions, directly use entry points
Upon return from off-mode, the ROM code jumps to a restore function
saved in the scratchpad.  Based on SoC revision or errata, this
restore entry point is different.  Current code uses some helper
functions in sleep34xx.S (get_*_restore_pointer) to get the restore
function entry point.
When returning from off-mode, this code is executed from SDRAM, so
there's no reason to use these helper functions when using the SDRAM
entry points directly would work just fine.
This patch uses ENTRY/ENDPROC to create "real" entry points for these
functions, and uses those values directly when writing the scratchpad.
Tested all three entry points
- restore_es3: 3430/n900
- restore_3630: 3630/Zoom3
- restore: 3530/Overo
Suggested-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Jean Pihet <j-pihet@ti.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Wed, 22 Jun 2011 14:42:54 +0000 (15:42 +0100)]
 
ARM: pm: omap34xx: convert to generic suspend/resume support
Convert omap34xx to use the generic CPU suspend/resume support, rather
than implementing its own version.  Tested on 3430 LDP.
Reviewed-by: Kevin Hilman <khilman@ti.com>
Tested-by: Kevin Hilman <khilman@ti.com>
Acked-by: Jean Pihet <j-pihet@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Wed, 22 Jun 2011 11:54:41 +0000 (12:54 +0100)]
 
ARM: pm: omap34xx: remove misleading comment and use of r9
The code alludes to r9 being used to indicate what was lost over the
suspend/resume transition.  However, although r9 is set, it is never
actually used.
Also, the comments before the code (which refer to the value of r9)
and the comments against the assignment of r9 contradict each other,
so just remove them to avoid confusion.
Reviewed-by: Kevin Hilman <khilman@ti.com>
Tested-by: Kevin Hilman <khilman@ti.com>
Acked-by: Jean Pihet <j-pihet@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Wed, 22 Jun 2011 11:44:32 +0000 (12:44 +0100)]
 
ARM: pm: omap34xx: no need to save all registers in sleep34xx.S
The ABI allows called functions to corrupt r0-r3 and ip (r12).  So
its pointless saving these registers in the suspend code - the
calling function will expect them to be corrupted and so won't rely
on their contents after resume.
Reviewed-by: Kevin Hilman <khilman@ti.com>
Tested-by: Kevin Hilman <khilman@ti.com>
Acked-by: Jean Pihet <j-pihet@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Tue, 21 Jun 2011 15:29:30 +0000 (16:29 +0100)]
 
ARM: pm: pxa: move cpu_suspend into C code
We don't need a veneer for cpu_suspend, it can be called directly from
C code now.  Move it into the PXA CPU suspend functions, along with
the accumulator register saving/restoring.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Tue, 21 Jun 2011 18:34:34 +0000 (19:34 +0100)]
 
ARM: pm: samsung: no need to call flush_cache_all()
The core suspend code calls flush_cache_all() immediately prior to
calling the suspend finisher function, so remove these needless calls
from the finisher functions.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Tue, 21 Jun 2011 18:29:26 +0000 (19:29 +0100)]
 
ARM: pm: samsung: move cpu_suspend into C code
Move the call to cpu_suspend into C code, and noticing that all the
s3c_cpu_save implementations are now identical, we can move this
into the common samsung code.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Mon, 13 Jun 2011 15:33:30 +0000 (16:33 +0100)]
 
ARM: pm: mach-s3c64xx: cleanup s3c_cpu_save
s3c_cpu_save does not need to save any registers with the new
cpu_suspend calling convention.  Remove these redundant instructions.
Acked-by: Frank Hofmann <frank.hofmann@tomtom.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Mon, 13 Jun 2011 15:12:27 +0000 (16:12 +0100)]
 
ARM: pm: mach-exynos4: cleanup s3c_cpu_save
s3c_cpu_save does not need to save any registers with the new
cpu_suspend calling convention.  Remove these redundant instructions.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Mon, 13 Jun 2011 15:10:16 +0000 (16:10 +0100)]
 
ARM: pm: mach-s5pv210: cleanup s3c_cpu_save
s3c_cpu_save does not need to save any registers with the new
cpu_suspend calling convention.  Remove these redundant instructions.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Mon, 13 Jun 2011 15:03:19 +0000 (16:03 +0100)]
 
ARM: pm: plat-s3c24xx: cleanup s3c_cpu_save
s3c_cpu_save does not need to save any registers with the new
cpu_suspend calling convention.  Remove these redundant instructions.
Tested-by: Kukjin Kim <kgene.kim@samsung.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Tue, 21 Jun 2011 15:26:29 +0000 (16:26 +0100)]
 
ARM: pm: sa1100: move cpu_suspend into C code
We don't need a veneer for cpu_suspend, it can be called directly from
C code now.  Move it into sa11x0_pm_enter() along with the re-enabling
of clock switching.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Tue, 21 Jun 2011 15:32:58 +0000 (16:32 +0100)]
 
ARM: pm: move cpu_init() call into core code
As we have core code dealing with CPU suspend/resume, we can
re-initialize the CPUs exception banked registers via that code rather
than having platforms deal with that level of detail.  So, move the
call to cpu_init() out of platform code into core code.
Tested-by: Kevin Hilman <khilman@ti.com>
Acked-by: Jean Pihet <j-pihet@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Mon, 13 Jun 2011 14:58:34 +0000 (15:58 +0100)]
 
ARM: pm: convert cpu_suspend() to a normal function
cpu_suspend() has a weird calling method which makes it only possible to
call from assembly code: it returns with a modified stack pointer to
finish the suspend, but on resume, it 'returns' via a provided pointer.
We can make cpu_suspend() appear to be a normal function merely by
swapping the resume pointer argument and the link register.
Do so, and update all callers to take account of this more traditional
behaviour.
Acked-by: Frank Hofmann <frank.hofmann@tomtom.com>
Tested-by: Kevin Hilman <khilman@ti.com>
Acked-by: Jean Pihet <j-pihet@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Mon, 13 Jun 2011 14:52:47 +0000 (15:52 +0100)]
 
ARM: pm: move sa1100 to use proper suspend func arg0
In the previous commit, we introduced an official way to supply an
argument to the suspend function.  Convert the sa1100 suspend code
to use this method.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Mon, 13 Jun 2011 14:28:40 +0000 (15:28 +0100)]
 
ARM: pm: rejig suspend follow-on function calling convention
Save the suspend function pointer onto the stack for use when returning.
Allocate r2 to pass an argument to the suspend function.
Acked-by: Frank Hofmann <frank.hofmann@tomtom.com>
Tested-by: Kevin Hilman <khilman@ti.com>
Acked-by: Jean Pihet <j-pihet@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Mon, 13 Jun 2011 14:25:11 +0000 (15:25 +0100)]
 
ARM: pm: reallocate registers to avoid r2, r3
Avoid using r2 and r3 in the suspend code, allowing these to be
passed further into the function as arguments.
Acked-by: Frank Hofmann <frank.hofmann@tomtom.com>
Tested-by: Kevin Hilman <khilman@ti.com>
Acked-by: Jean Pihet <j-pihet@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Mon, 13 Jun 2011 14:04:14 +0000 (15:04 +0100)]
 
ARM: pm: preserve r4 - r11 across a suspend
Make cpu_suspend()..return function preserve r4 to r11 across a suspend
cycle.  This is in preparation of relieving platform support code from
this task.
Acked-by: Frank Hofmann <frank.hofmann@tomtom.com>
Tested-by: Kevin Hilman <khilman@ti.com>
Acked-by: Jean Pihet <j-pihet@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Mon, 13 Jun 2011 12:53:06 +0000 (13:53 +0100)]
 
ARM: pm: extract common code from MULTI_CPU/!MULTI_CPU paths
Very little code is different between these two paths now, so extract
the common code.
Acked-by: Frank Hofmann <frank.hofmann@tomtom.com>
Tested-by: Kevin Hilman <khilman@ti.com>
Acked-by: Jean Pihet <j-pihet@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Mon, 13 Jun 2011 12:45:34 +0000 (13:45 +0100)]
 
ARM: pm: move return address (for cpu_resume) to top of stack
Move the return address for cpu_resume to the top of stack so that
cpu_resume looks more like a normal function.
Acked-by: Frank Hofmann <frank.hofmann@tomtom.com>
Tested-by: Kevin Hilman <khilman@ti.com>
Acked-by: Jean Pihet <j-pihet@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Mon, 13 Jun 2011 12:39:44 +0000 (13:39 +0100)]
 
ARM: pm: make MULTI_CPU and !MULTI_CPU resume paths the same
Eliminate the differences between MULTI_CPU and non-MULTI_CPU resume
paths, making the saved structure identical irrespective of the way
the kernel was configured.
Acked-by: Frank Hofmann <frank.hofmann@tomtom.com>
Tested-by: Kevin Hilman <khilman@ti.com>
Acked-by: Jean Pihet <j-pihet@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Tue, 21 Jun 2011 17:59:33 +0000 (18:59 +0100)]
 
ARM: pm: sa1100: no need to re-enable clock switching
This is now taken care of by calling cpu_proc_init() in the resume
path, so eliminate this unnecessary call.
Tested-by: Kevin Hilman <khilman@ti.com>
Acked-by: Jean Pihet <j-pihet@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Tue, 21 Jun 2011 17:57:31 +0000 (18:57 +0100)]
 
ARM: pm: arrange for cpu_proc_init() to be called on resume
cpu_proc_init() does processor specific initialization, which we do
at boot time.  We have been omitting to do this on resume, which
causes some of this initialization to be skipped.  We've also been
skipping this on SMP initialization too.
Ensure that cpu_proc_init() is always called appropriately by
moving it into cpu_init(), and move cpu_init() to a more appropriate
point in the boot initialization.
Tested-by: Kevin Hilman <khilman@ti.com>
Acked-by: Jean Pihet <j-pihet@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Wed, 22 Jun 2011 14:41:58 +0000 (15:41 +0100)]
 
ARM: pm: ensure ARMv7 CPUs save and restore the TLS register
Ensure that the TLS register is saved and restored over a suspend
cycle, so that userspace programs don't see a corrupted TLS value.
Tested-by: Kevin Hilman <khilman@ti.com>
Acked-by: Jean Pihet <j-pihet@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Thu, 23 Jun 2011 21:00:20 +0000 (22:00 +0100)]
 
ARM: pm: proc-v7: fix missing struct processor pointers for suspend code
Add the missing suspend/resume pointers for the suspend code.  This
is needed when building for multiple CPUs.
Tested-by: Kevin Hilman <khilman@ti.com>
Acked-by: Jean Pihet <j-pihet@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Linus Walleij [Mon, 20 Jun 2011 07:13:00 +0000 (08:13 +0100)]
 
ARM: 6969/1: plat-iop: fix build error
The iop13xx_defconfig didn't build since the platform code uses
defines from <asm/ptrace.h>. Simply add the include so it
compiles.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Dave Martin [Tue, 14 Jun 2011 13:20:44 +0000 (14:20 +0100)]
 
ARM: 6961/1: zImage: Add build-time check for correctly-sized proc_type entries
It is easy to mis-maintain the proc_types table such that the
entries become wrongly-sized and misaligned when the kernel is
built in Thumb-2.
This patch adds an assembly-time check which will turn most common
size/alignment mistakes in this table into build failures, to avoid
having to debug the boot-time kernel hang which would happen if the
resulting kernel were actually booted.
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Mon, 20 Jun 2011 15:46:01 +0000 (16:46 +0100)]
 
ARM: SMP: wait for CPU to be marked active
When we bring a CPU online, we should wait for it to become active
before entering the idle thread, so we know that the scheduler and
thread migration is going to work.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Dave Martin [Thu, 16 Jun 2011 11:09:37 +0000 (12:09 +0100)]
 
ARM: 6963/1: Thumb-2: Relax relocation requirements for non-function symbols
The "Thumb bit" of a symbol is only really meaningful for function
symbols (STT_FUNC).
However, sometimes a branch is relocated against a non-function
symbol; for example, PC-relative branches to anonymous assembler
local symbols are typically fixed up against the start-of-section
symbol, which is not a function symbol.  Some inline assembler
generates references of this type, such as fixup code generated by
macros in <asm/uaccess.h>.
The existing relocation code for R_ARM_THM_CALL/R_ARM_THM_JUMP24
interprets this case as an error, because the target symbol appears
to be an ARM symbol; but this is really not the case, since the
target symbol is just a base in these cases.  The addend defines
the precise offset to the target location, but since the addend is
encoded in a non-interworking Thumb branch instruction, there is no
explicit Thumb bit in the addend.  Because these instructions never
interwork, the implied Thumb bit in the addend is 1, and the
destination is Thumb by definition.
This patch removes the extraneous Thumb bit check for non-function
symbols, enabling modules containing the affected relocation types
to be loaded.  No modification to the actual relocation code is
required, since this code does not take bit[0] of the
location->destination offset into account in any case.
Function symbols are always checked for interworking conflicts, as
before.
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Linus Walleij [Thu, 16 Jun 2011 07:26:06 +0000 (08:26 +0100)]
 
ARM: 6962/1: mach-h720x: fix build error
The h7201/h7202 machines did not build since they define
ARM_DMA_ZONE_OFFSET but do not select ZONE_DMA. Fix it up by
selecting ZONE_DMA in their Kconfig.
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Magnus Damm [Mon, 13 Jun 2011 05:46:44 +0000 (06:46 +0100)]
 
ARM: 6959/1: SMP build fix for entry-macro-multi.S
The assembly code in entry-macro-multi.S does not build without
the include asm/assembler.h in the case of CONFIG_SMP=y.
Fixes the rather theoretical SMP build of mach-shmobile/entry-intc.c:
arch/arm/include/asm/entry-macro-multi.S: Assembler messages:
arch/arm/include/asm/entry-macro-multi.S:20: Error: bad instruction `alt_smp(test_for_ipi r0,r6,r5,lr)'
arch/arm/include/asm/entry-macro-multi.S:20: Error: bad instruction `alt_up_b(9997f)'
make[1]: *** [arch/arm/mach-shmobile/entry-intc.o] Error 1
make: *** [arch/arm/mach-shmobile] Error 2
make: *** Waiting for unfinished jobs....
Signed-off-by: Magnus Damm <damm@opensource.se>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Linus Torvalds [Fri, 17 Jun 2011 00:54:41 +0000 (17:54 -0700)]
 
Merge branch 'drm-fixes' of git://git./linux/kernel/git/airlied/drm-2.6
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
  drm/radeon/kms: use helper functions for fence read/write
  drm/radeon/kms: set DP link config properly for DP bridges
  drm/radeon/kms/atom: AdjustPixelClock fixes for DP bridges
  drm/radeon/kms: fix handling of DP to LVDS bridges
  drm/radeon/kms: issue blank/unblank commands for ext encoders
  drm/radeon/kms: fix support for DDC on dp bridges
  drm/radeon/kms: add support for load detection on dp bridges
  drm/radeon/kms: add missing external encoder action
  drm/radeon/kms: rework atombios_get_encoder_mode()
  drm/radeon/kms: fix num crtcs for Cedar and Caicos
  Revert "drm/i915: Enable GMBUS for post-gen2 chipsets"
  drivers/gpu/drm: use printk_ratelimited instead of printk_ratelimit
  drm/radeon: workaround a hw bug on some radeon chipsets with all-0 EDIDs.
  drm: make debug levels match in edid failure code.
  drm/radeon/kms: clear wb memory by default
  drm/radeon/kms: be more pedantic about the g5 quirk (v2)
  drm/radeon/kms: signed fix for evergreen thermal
  drm: populate irq_by_busid-member for pci
Alex Deucher [Mon, 13 Jun 2011 21:39:06 +0000 (17:39 -0400)]
 
drm/radeon/kms: use helper functions for fence read/write
The existing code assumed scratch registers in a number
of places while in most cases we are be using writeback
and events rather than scratch registers.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Alex Deucher [Thu, 16 Jun 2011 14:06:17 +0000 (10:06 -0400)]
 
drm/radeon/kms: set DP link config properly for DP bridges
DP clock and lanes were not set properly for DP bridges.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Alex Deucher [Thu, 16 Jun 2011 14:06:16 +0000 (10:06 -0400)]
 
drm/radeon/kms/atom: AdjustPixelClock fixes for DP bridges
Need to set the external transmitter type properly in
AdjustPixelClock to get the properly output.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Alex Deucher [Mon, 13 Jun 2011 21:13:35 +0000 (17:13 -0400)]
 
drm/radeon/kms: fix handling of DP to LVDS bridges
They need to be treated like eDP rather than DP.
May fix:
https://bugzilla.kernel.org/show_bug.cgi?id=34822
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Alex Deucher [Mon, 13 Jun 2011 21:13:36 +0000 (17:13 -0400)]
 
drm/radeon/kms: issue blank/unblank commands for ext encoders
Required for DPMS on some systems.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Alex Deucher [Mon, 13 Jun 2011 21:13:34 +0000 (17:13 -0400)]
 
drm/radeon/kms: fix support for DDC on dp bridges
Need to set up the bridge for DDC prior to the
i2c over aux transaction.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Alex Deucher [Mon, 13 Jun 2011 21:13:33 +0000 (17:13 -0400)]
 
drm/radeon/kms: add support for load detection on dp bridges
dp to vga bridges for example.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Alex Deucher [Mon, 13 Jun 2011 21:13:32 +0000 (17:13 -0400)]
 
drm/radeon/kms: add missing external encoder action
required for ddc.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Alex Deucher [Mon, 13 Jun 2011 21:13:31 +0000 (17:13 -0400)]
 
drm/radeon/kms: rework atombios_get_encoder_mode()
This should give us more reliable results if the table
is called before an active device is set.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Alex Deucher [Thu, 16 Jun 2011 18:14:22 +0000 (18:14 +0000)]
 
drm/radeon/kms: fix num crtcs for Cedar and Caicos
Only support 4 rather than 6.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Jean Delvare [Sat, 4 Jun 2011 19:34:56 +0000 (19:34 +0000)]
 
Revert "drm/i915: Enable GMBUS for post-gen2 chipsets"
Revert commit 
8f9a3f9b63b8cd3f03be9dc53533f90bd4120e5f. This fixes a
hang when loading the eeprom driver (see bug #35572.) GMBUS will be
re-enabled later, differently.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Reported-by: Marek Otahal <markotahal@gmail.com>
Tested-by: Yermandu Patapitafious <yermandu.dev@gmail.com>
Tested-by: Andrew Lutomirski <luto@mit.edu>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Linus Torvalds [Thu, 16 Jun 2011 22:02:20 +0000 (15:02 -0700)]
 
Merge git://git./linux/kernel/git/ebiederm/linux-2.6-nsfd
* git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/linux-2.6-nsfd:
  proc: Fix Oops on stat of /proc/<zombie pid>/ns/net
Andrea Arcangeli [Thu, 16 Jun 2011 19:56:19 +0000 (12:56 -0700)]
 
migrate: don't account swapcache as shmem
swapcache will reach the below code path in migrate_page_move_mapping,
and swapcache is accounted as NR_FILE_PAGES but it's not accounted as
NR_SHMEM.
Hugh pointed out we must use PageSwapCache instead of comparing
mapping to &swapper_space, to avoid build failure with CONFIG_SWAP=n.
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 16 Jun 2011 17:26:58 +0000 (10:26 -0700)]
 
Merge branch 'rc-fixes' of git://git./linux/kernel/git/mmarek/kbuild-2.6
* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6:
  kbuild: Call depmod.sh via shell
  perf: clear out make flags when calling kernel make kernelver
Linus Torvalds [Thu, 16 Jun 2011 17:21:59 +0000 (10:21 -0700)]
 
Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  AFS: Use i_generation not i_version for the vnode uniquifier
  AFS: Set s_id in the superblock to the volume name
  vfs: Fix data corruption after failed write in __block_write_begin()
  afs: afs_fill_page reads too much, or wrong data
  VFS: Fix vfsmount overput on simultaneous automount
  fix wrong iput on d_inode introduced by 
e6bc45d65d
  Delay struct net freeing while there's a sysfs instance refering to it
  afs: fix sget() races, close leak on umount
  ubifs: fix sget races
  ubifs: split allocation of ubifs_info into a separate function
  fix leak in proc_set_super()
Linus Torvalds [Thu, 16 Jun 2011 16:46:24 +0000 (09:46 -0700)]
 
Merge branch 'sh-fixes-for-linus' of git://git./linux/kernel/git/lethal/sh-3.x
* 'sh-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-3.x:
  sh: sh7724: Add USBHS DMAEngine support
  sh: ecovec: Add renesas_usbhs support
  sh, exec: remove redundant set_fs(USER_DS)
  drivers: sh: resume enabled clocks fix
  dmaengine: shdma: SH_DMAC_MAX_CHANNELS message fix
  sh: Fix up xchg/cmpxchg corruption with gUSA RB.
  sh: Remove compressed kernel libgcc dependency.
  sh: fix wrong icache/dcache address-array start addr in cache-debugfs.
Linus Torvalds [Thu, 16 Jun 2011 16:46:08 +0000 (09:46 -0700)]
 
Merge branch 'rmobile-fixes-for-linus' of git://git./linux/kernel/git/lethal/sh-3.x
* 'rmobile-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-3.x:
  ARM: mach-shmobile: mackerel: tidyup usbhs driver settings
  ARM: mach-shmobile: Correct SCIF port types for SH7367.
  ARM: mach-shmobile: sh73a0 gic_arch_extn.irq_set_wake() fix
  ARM: mach-shmobile: Mackerel USB platform data update
  ARM: mach-shmobile: AG5EVM SDHI1 platform data update
Linus Torvalds [Thu, 16 Jun 2011 16:45:47 +0000 (09:45 -0700)]
 
Merge branch 'fbdev-fixes-for-linus' of git://git./linux/kernel/git/lethal/fbdev-3.x
* 'fbdev-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/fbdev-3.x:
  fbdev: sh_mobile_hdmi: fix regression: statically enable RTPM
  fbdev/atyfb: Fix 2 defined-but-not-used warnings
  efifb: Fix call to wrong unregister function
  video: s3c-fb: move enabling channel for window
  video: s3c-fb: fix virtual resolution checking
  video: s3c-fb: fix misleading kfree in remove function
Linus Torvalds [Thu, 16 Jun 2011 16:44:20 +0000 (09:44 -0700)]
 
Merge branch 'for-linus' of git://git./linux/kernel/git/jmorris/security-testing-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
  SELinux: skip file_name_trans_write() when policy downgraded.
  selinux: fix case of names with whitespace/multibytes on /selinux/create
David Howells [Mon, 13 Jun 2011 23:45:44 +0000 (00:45 +0100)]
 
AFS: Use i_generation not i_version for the vnode uniquifier
Store the AFS vnode uniquifier in the i_generation field, not the i_version
field of the inode struct.  i_version can then be given the AFS data version
number.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
David Howells [Mon, 13 Jun 2011 23:38:44 +0000 (00:38 +0100)]
 
AFS: Set s_id in the superblock to the volume name
Set s_id in the superblock to the name of the AFS volume that this superblock
corresponds to.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Jan Kara [Mon, 13 Jun 2011 22:58:27 +0000 (00:58 +0200)]
 
vfs: Fix data corruption after failed write in __block_write_begin()
I've got a report of a file corruption from fsxlinux on ext3. The important
operations to the page were:
mapwrite to a hole
partial write to the page
read - found the page zeroed from the end of the normal write
The culprit seems to be that if get_block() fails in __block_write_begin()
(e.g. transient ENOSPC in ext3), the function does ClearPageUptodate(page).
Thus when we retry the write, the logic in __block_write_begin() thinks zeroing
of the page is needed and overwrites old data.  In fact, I don't see why we
should ever need to zero the uptodate bit here - either the page was uptodate
when we entered __block_write_begin() and it should stay so when we leave it,
or it was not uptodate and noone had right to set it uptodate during
__block_write_begin() so it remains !uptodate when we leave as well. So just
remove clearing of the bit.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Anton Blanchard [Mon, 13 Jun 2011 21:31:12 +0000 (22:31 +0100)]
 
afs: afs_fill_page reads too much, or wrong data
afs_fill_page should read the page that is about to be written but
the current implementation has a number of issues. If we aren't
extending the file we always read PAGE_CACHE_SIZE at offset 0. If we
are extending the file we try to read the entire file.
Change afs_fill_page to read PAGE_CACHE_SIZE at the right offset,
clamped to i_size.
While here, avoid calling afs_fill_page when we are doing a
PAGE_CACHE_SIZE write.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Randy Dunlap [Tue, 14 Jun 2011 22:00:18 +0000 (15:00 -0700)]
 
staging: fix iio builds when IIO_RING_BUFFER is not enabled
Fix build by moving enum list outside of
#ifdef CONFIG_IIO_RING_BUFFER.
  drivers/staging/iio/accel/adis16201_core.c:413: error: 'ADIS16201_SCAN_SUPPLY' undeclared here (not in a function)
  drivers/staging/iio/accel/adis16201_core.c:417: error: 'ADIS16201_SCAN_TEMP' undeclared here (not in a function)
  ..
  drivers/staging/iio/accel/adis16203_core.c:374: error: 'ADIS16203_SCAN_SUPPLY' undeclared here (not in a function)
  drivers/staging/iio/accel/adis16203_core.c:378: error: 'ADIS16203_SCAN_AUX_ADC' undeclared here (not in a function)
  ..
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Jonathan Cameron <jic23@cam.ac.uk>
Cc: linux-iio@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Al Viro [Thu, 16 Jun 2011 14:10:06 +0000 (15:10 +0100)]
 
VFS: Fix vfsmount overput on simultaneous automount
[Kudos to dhowells for tracking that crap down]
If two processes attempt to cause automounting on the same mountpoint at the
same time, the vfsmount holding the mountpoint will be left with one too few
references on it, causing a BUG when the kernel tries to clean up.
The problem is that lock_mount() drops the caller's reference to the
mountpoint's vfsmount in the case where it finds something already mounted on
the mountpoint as it transits to the mounted filesystem and replaces path->mnt
with the new mountpoint vfsmount.
During a pathwalk, however, we don't take a reference on the vfsmount if it is
the same as the one in the nameidata struct, but do_add_mount() doesn't know
this.
The fix is to make sure we have a ref on the vfsmount of the mountpoint before
calling do_add_mount().  However, if lock_mount() doesn't transit, we're then
left with an extra ref on the mountpoint vfsmount which needs releasing.
We can handle that in follow_managed() by not making assumptions about what
we can and what we cannot get from lookup_mnt() as the current code does.
The callers of follow_managed() expect that reference to path->mnt will be
grabbed iff path->mnt has been changed.  follow_managed() and follow_automount()
keep track of whether such reference has been grabbed and assume that it'll
happen in those and only those cases that'll have us return with changed
path->mnt.  That assumption is almost correct - it breaks in case of
racing automounts and in even harder to hit race between following a mountpoint
and a couple of mount --move.  The thing is, we don't need to make that
assumption at all - after the end of loop in follow_manage() we can check
if path->mnt has ended up unchanged and do mntput() if needed.
The BUG can be reproduced with the following test program:
	#include <stdio.h>
	#include <sys/types.h>
	#include <sys/stat.h>
	#include <unistd.h>
	#include <sys/wait.h>
	int main(int argc, char **argv)
	{
		int pid, ws;
		struct stat buf;
		pid = fork();
		stat(argv[1], &buf);
		if (pid > 0) wait(&ws);
		return 0;
	}
and the following procedure:
 (1) Mount an NFS volume that on the server has something else mounted on a
     subdirectory.  For instance, I can mount / from my server:
	mount warthog:/ /mnt -t nfs4 -r
     On the server /data has another filesystem mounted on it, so NFS will see
     a change in FSID as it walks down the path, and will mark /mnt/data as
     being a mountpoint.  This will cause the automount code to be triggered.
     !!! Do not look inside the mounted fs at this point !!!
 (2) Run the above program on a file within the submount to generate two
     simultaneous automount requests:
	/tmp/forkstat /mnt/data/testfile
 (3) Unmount the automounted submount:
	umount /mnt/data
 (4) Unmount the original mount:
	umount /mnt
     At this point the kernel should throw a BUG with something like the
     following:
	BUG: Dentry 
ffff880032e3c5c0{i=2,n=} still in use (1) [unmount of nfs4 0:12]
Note that the bug appears on the root dentry of the original mount, not the
mountpoint and not the submount because sys_umount() hasn't got to its final
mntput_no_expire() yet, but this isn't so obvious from the call trace:
 [<
ffffffff8117cd82>] shrink_dcache_for_umount+0x69/0x82
 [<
ffffffff8116160e>] generic_shutdown_super+0x37/0x15b
 [<
ffffffffa00fae56>] ? nfs_super_return_all_delegations+0x2e/0x1b1 [nfs]
 [<
ffffffff811617f3>] kill_anon_super+0x1d/0x7e
 [<
ffffffffa00d0be1>] nfs4_kill_super+0x60/0xb6 [nfs]
 [<
ffffffff81161c17>] deactivate_locked_super+0x34/0x83
 [<
ffffffff811629ff>] deactivate_super+0x6f/0x7b
 [<
ffffffff81186261>] mntput_no_expire+0x18d/0x199
 [<
ffffffff811862a8>] mntput+0x3b/0x44
 [<
ffffffff81186d87>] release_mounts+0xa2/0xbf
 [<
ffffffff811876af>] sys_umount+0x47a/0x4ba
 [<
ffffffff8109e1ca>] ? trace_hardirqs_on_caller+0x1fd/0x22f
 [<
ffffffff816ea86b>] system_call_fastpath+0x16/0x1b
as do_umount() is inlined.  However, you can see release_mounts() in there.
Note also that it may be necessary to have multiple CPU cores to be able to
trigger this bug.
Tested-by: Jeff Layton <jlayton@redhat.com>
Tested-by: Ian Kent <raven@themaw.net>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Török Edwin [Wed, 15 Jun 2011 21:06:14 +0000 (00:06 +0300)]
 
fix wrong iput on d_inode introduced by 
e6bc45d65d
Git bisection shows that commit 
e6bc45d65df8599fdbae73be9cec4ceed274db53 causes
BUG_ONs under high I/O load:
kernel BUG at fs/inode.c:1368!
[ 2862.501007] Call Trace:
[ 2862.501007]  [<
ffffffff811691d8>] d_kill+0xf8/0x140
[ 2862.501007]  [<
ffffffff81169c19>] dput+0xc9/0x190
[ 2862.501007]  [<
ffffffff8115577f>] fput+0x15f/0x210
[ 2862.501007]  [<
ffffffff81152171>] filp_close+0x61/0x90
[ 2862.501007]  [<
ffffffff81152251>] sys_close+0xb1/0x110
[ 2862.501007]  [<
ffffffff814c14fb>] system_call_fastpath+0x16/0x1b
A reliable way to reproduce this bug is:
Login to KDE, run 'rsnapshot sync', and apt-get install openjdk-6-jdk,
and apt-get remove openjdk-6-jdk.
The buggy part of the patch is this:
	struct inode *inode = NULL;
.....
-               if (nd.last.name[nd.last.len])
-                       goto slashes;
                inode = dentry->d_inode;
-               if (inode)
-                       ihold(inode);
+               if (nd.last.name[nd.last.len] || !inode)
+                       goto slashes;
+               ihold(inode)
...
	if (inode)
		iput(inode);	/* truncate the inode here */
If nd.last.name[nd.last.len] is nonzero (and thus goto slashes branch is taken),
and dentry->d_inode is non-NULL, then this code now does an additional iput on
the inode, which is wrong.
Fix this by only setting the inode variable if nd.last.name[nd.last.len] is 0.
Reference: https://lkml.org/lkml/2011/6/15/50
Reported-by: Norbert Preining <preining@logic.at>
Reported-by: Török Edwin <edwintorok@gmail.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Török Edwin <edwintorok@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Linus Torvalds [Thu, 16 Jun 2011 07:35:09 +0000 (00:35 -0700)]
 
mm: get rid of the most spurious find_vma_prev() users
We have some users of this function that date back to before the vma
list was doubly linked, and just are silly.  These days, you can find
the previous vma by just following the vma->vm_prev pointer.
In some cases you don't need any find_vma() lookup at all, and in other
cases you're better off with the regular "find_vma()" that uses the vma
cache front-end lookup.
Some "find_vma_prev()" users are still valid, though.  For example, in
the case of a stack that grows up, it can be the case that we don't find
any 'vma' at all (because we're looking up an address that is past the
last vma), and that the stack that we want to grow is the 'prev' vma.
But that kind of special case aside, we generally should prefer to use
'find_vma()'.
Noticed due to a totally unrelated POWER memory corruption bug that just
happened to hit in 'find_vma_prev()' and made me go "Hmm - why are we
using that function here?".
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christian Dietrich [Sat, 4 Jun 2011 15:36:43 +0000 (15:36 +0000)]
 
drivers/gpu/drm: use printk_ratelimited instead of printk_ratelimit
Since printk_ratelimit() shouldn't be used anymore (see comment in
include/linux/printk.h), replace it with printk_ratelimited.
Signed-off-by: Christian Dietrich <christian.dietrich@informatik.uni-erlangen.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Tue, 14 Jun 2011 06:13:55 +0000 (06:13 +0000)]
 
drm/radeon: workaround a hw bug on some radeon chipsets with all-0 EDIDs.
Some RS690 chipsets seem to end up with floating connectors, either
a DVI connector isn't actually populated, or an add-in HDMI card
is available but not installed. In this case we seem to get a NULL byte
response for each byte of the i2c transaction, so we detect this
case and if we see it we don't do anymore DDC transactions on this
connector.
I've tested this on my RS690 without the HDMI card installed and
it seems to work fine.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexdeucher@gmail.com>
Dave Airlie [Tue, 14 Jun 2011 06:13:54 +0000 (06:13 +0000)]
 
drm: make debug levels match in edid failure code.
this puts the header and followup at the same loglevel as the
hex dump code.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexdeucher@gmail.com>
Alex Deucher [Mon, 13 Jun 2011 22:02:51 +0000 (22:02 +0000)]
 
drm/radeon/kms: clear wb memory by default
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Alex Deucher [Tue, 14 Jun 2011 15:27:38 +0000 (15:27 +0000)]
 
drm/radeon/kms: be more pedantic about the g5 quirk (v2)
I don't think Apple offered any other cards for
this mac, so I doubt this will be an issue, but just
to be on the safe side, check the pci ids as well.
v2: fix spelling in commit message
Reviewed-by: Michel Dänzer <michel@daenzer.net>
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Cc: Joachim Henke <j-o@users.sourceforge.net>
Cc: Michel Dänzer <michel@daenzer.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Alex Deucher [Tue, 14 Jun 2011 19:15:53 +0000 (19:15 +0000)]
 
drm/radeon/kms: signed fix for evergreen thermal
temperature is signed.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Wolfram Sang [Wed, 15 Jun 2011 09:26:47 +0000 (11:26 +0200)]
 
drm: populate irq_by_busid-member for pci
Commit 8410ea (drm: rework PCI/platform driver interface) implemented
drm_pci_irq_by_busid() but forgot to make it available in the
drm_pci_bus-struct.
This caused a freeze on my Radeon9600-equipped laptop when executing glxgears.
Thanks to Michel for noticing the flaw.
[airlied: made function static also]
Reported-by: Michel Dänzer <daenzer@vmware.com>
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Cc: stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
Kuninori Morimoto [Wed, 15 Jun 2011 06:08:28 +0000 (06:08 +0000)]
 
sh: sh7724: Add USBHS DMAEngine support
Signed-off-by: Kuninori Morimoto <morimoto.kuninori@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Kuninori Morimoto [Wed, 15 Jun 2011 06:08:18 +0000 (06:08 +0000)]
 
sh: ecovec: Add renesas_usbhs support
Signed-off-by: Kuninori Morimoto <morimoto.kuninori@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Linus Torvalds [Thu, 16 Jun 2011 05:01:36 +0000 (22:01 -0700)]
 
Merge branch 'fixes' of /home/rmk/linux-2.6-arm
* 'fixes' of master.kernel.org:/home/rmk/linux-2.6-arm:
  ARM: footbridge: fix clock event support
  ARM: footbridge: fix debug macros
  ARM: initrd: disable initrds outside of memory
  ARM: extend Code: line by one 16-bit quantity for Thumb instructions
  ARM: 6955/1: cmpxchg syscall should data abort if page not write
  ARM: 6954/1: zImage: fix Thumb2 breakage
  ARM: 6953/1: DT: don't try to access physical address zero
  ARM: 6949/2: mach-u300: fix compilaton warning in IO accessors
  Revert "ARM: 6944/1: mm: allow ASID 0 to be allocated to tasks"
  Revert "ARM: 6943/1: mm: use TTBR1 instead of reserved context ID"
  davinci: make PCM platform devices static
  arm: davinci: Fix fallout from generic irq chip conversion
  ARM: 6894/1: mmci: trigger card detect IRQs on falling and rising edges
  ARM: 6952/1: fix lockdep warning of "unannotated irqs-off"
  ARM: 6951/1: include .bss in memory layout information
  ARM: 6948/1: Fix .size directives for __arm{7,9}tdmi_proc_info
  ARM: 6947/2: mach-u300: fix compilation error in timer
  ARM: 6946/1: vexpress: move v2m clock init to init_early
  ARM: mx51/sdma: Check the chip revision in run-time
  arm: mxs: include asm/processor.h for cpu_relax()
Linus Torvalds [Thu, 16 Jun 2011 04:53:52 +0000 (21:53 -0700)]
 
Revert "fs/exec.c: use BUILD_BUG_ON for VM_STACK_FLAGS & VM_STACK_INCOMPLETE_SETUP"
This reverts commit 
7f81c8890c15a10f5220bebae3b6dfae4961962a.
It turns out that it's not actually a build-time check on x86-64 UML,
which does some seriously crazy stuff with VM_STACK_FLAGS.
The VM_STACK_FLAGS define depends on the arch-supplied
VM_STACK_DEFAULT_FLAGS value, and on x86-64 UML we have
  arch/um/sys-x86_64/shared/sysdep/vm-flags.h:
	#define VM_STACK_DEFAULT_FLAGS \
		(test_thread_flag(TIF_IA32) ? vm_stack_flags32 : vm_stack_flags)
	#define VM_STACK_DEFAULT_FLAGS vm_stack_flags
(yes, seriously: two different #define's for that thing, with the first
one being inside an "#ifdef TIF_IA32")
It's possible that it is UML that should just be fixed in this area, but
for now let's just undo the (very small) optimization.
Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Richard Weinberger <richard@nod.at>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jörg Sommer [Wed, 15 Jun 2011 20:00:47 +0000 (13:00 -0700)]
 
Documentation: fix cgroup typos and formatting
Fix format and spelling.
Signed-off-by: Jörg Sommer <joerg@alea.gnuu.de>
Acked-by: Paul Menage <menage@google.com>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jörg Sommer [Wed, 15 Jun 2011 19:59:45 +0000 (12:59 -0700)]
 
Documentation: update cgroupfs mount point
According to commit 
676db4af0430 ("cgroupfs: create /sys/fs/cgroup to
mount cgroupfs on") the canonical mountpoint for the cgroup filesystem
is /sys/fs/cgroup.  Hence, this should be used in the documentation.
Signed-off-by: Jörg Sommer <joerg@alea.gnuu.de>
Acked-by: Paul Menage <menage@google.com>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Maxin B. John [Wed, 15 Jun 2011 19:58:29 +0000 (12:58 -0700)]
 
Documentation: update kmemleak supported archs
Instead of listing the architectures that are supported by
kmemleak in Documentation/kmemleak.txt, just refer people to
the list of supported architecutures in lib/Kconfig.debug so
that Documentation/kmemleak.txt does not need more updates
for this.
Signed-off-by: Maxin B. John <maxin.john@gmail.com>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew Murray [Wed, 15 Jun 2011 19:57:09 +0000 (12:57 -0700)]
 
Documentation: update printk-formats.txt
This patch updates the incomplete documentation concerning the printk
extended format specifiers.
Signed-off-by: Andrew Murray <amurray@mpc-data.co.uk>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 16 Jun 2011 04:45:18 +0000 (21:45 -0700)]
 
Merge branch 'sched-urgent-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: Check if lowest_mask is initialized in find_lowest_rq()
  sched: Fix need_resched() when checking peempt
Dan Rosenberg [Wed, 15 Jun 2011 22:09:01 +0000 (15:09 -0700)]
 
alpha: fix several security issues
Fix several security issues in Alpha-specific syscalls.  Untested, but
mostly trivial.
1. Signedness issue in osf_getdomainname allows copying out-of-bounds
kernel memory to userland.
2. Signedness issue in osf_sysinfo allows copying large amounts of
kernel memory to userland.
3. Typo (?) in osf_getsysinfo bounds minimum instead of maximum copy
size, allowing copying large amounts of kernel memory to userland.
4. Usage of user pointer in osf_wait4 while under KERNEL_DS allows
privilege escalation via writing return value of sys_wait4 to kernel
memory.
Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Geert Uytterhoeven [Wed, 15 Jun 2011 22:08:59 +0000 (15:08 -0700)]
 
drivers/misc/apds990x.c: apds990x_chip_on() should depend on CONFIG_PM || CONFIG_PM_RUNTIME
Fixes this warning:
  drivers/misc/apds990x.c: At top level:
  drivers/misc/apds990x.c:613: warning: `apds990x_chip_on' defined but not used
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Samu Onkalo <samu.p.onkalo@nokia.com>
Cc: Jonathan Cameron <jic23@cam.ac.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hugh Dickins [Wed, 15 Jun 2011 22:08:58 +0000 (15:08 -0700)]
 
ksm: fix NULL pointer dereference in scan_get_next_rmap_item()
Andrea Righi reported a case where an exiting task can race against
ksmd::scan_get_next_rmap_item (http://lkml.org/lkml/2011/6/1/742) easily
triggering a NULL pointer dereference in ksmd.
ksm_scan.mm_slot == &ksm_mm_head with only one registered mm
CPU 1 (__ksm_exit)		CPU 2 (scan_get_next_rmap_item)
 				list_empty() is false
lock				slot == &ksm_mm_head
list_del(slot->mm_list)
(list now empty)
unlock
				lock
				slot = list_entry(slot->mm_list.next)
				(list is empty, so slot is still ksm_mm_head)
				unlock
				slot->mm == NULL ... Oops
Close this race by revalidating that the new slot is not simply the list
head again.
Andrea's test case:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>
#define BUFSIZE getpagesize()
int main(int argc, char **argv)
{
	void *ptr;
	if (posix_memalign(&ptr, getpagesize(), BUFSIZE) < 0) {
		perror("posix_memalign");
		exit(1);
	}
	if (madvise(ptr, BUFSIZE, MADV_MERGEABLE) < 0) {
		perror("madvise");
		exit(1);
	}
	*(char *)NULL = 0;
	return 0;
}
Reported-by: Andrea Righi <andrea@betterlinux.com>
Tested-by: Andrea Righi <andrea@betterlinux.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Wanlong Gao [Wed, 15 Jun 2011 22:08:56 +0000 (15:08 -0700)]
 
rtc: fix build warnings in defconfigs
RTC_CLASS is changed to bool, so 'm' is invalid.
Signed-off-by: Wanlong Gao <wanlong.gao@gmail.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Acked-by: Wolfram Sang <w.sang@pengutronix.de>
Acked-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexander Stein [Wed, 15 Jun 2011 22:08:55 +0000 (15:08 -0700)]
 
drivers/tty/serial/pch_uart.c: don't oops if dmi_get_system_info returns NULL
If dmi_get_system_info() returns NULL, pch_uart_init_port() will
dereferencea a zero pointer.
This oops was observed on an Atom based board which has no BIOS, but
a bootloder which doesn't provide DMI data.
Signed-off-by: Alexander Stein <alexander.stein@systec-electronic.com>
Cc: Greg KH <gregkh@suse.de>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nils Carlson [Wed, 15 Jun 2011 22:08:54 +0000 (15:08 -0700)]
 
drivers/char/hpet.c: fix periodic-emulation for delayed interrupts
When interrupts are delayed due to interrupt masking or due to other
interrupts being serviced the HPET periodic-emuation would fail.  This
happened because given an interval t and a time for the current interrupt
m we would compute the next time as t + m.  This works until we are
delayed for > t, in which case we would be writing a new value which is in
fact in the past.
This can be solved by computing the next time instead as (k * t) + m where
k is large enough to be in the future.  The exact computation of k is
described in a comment to the code.
More detail:
Assuming an interval of 5 between each expected interrupt we have a normal
case of
t0: interrupt, read t0 from comparator, set next interrupt t0 + 5
t5: interrupt, read t5 from comparator, set next interrupt t5 + 5
t10: interrupt, read t10 from comparator, set next interrupt t10 + 5
...
So, what happens when the interrupt is serviced too late?
t0: interrupt, read t0 from comparator, set next interrupt t0 + 5
t11: delayed interrupt serviced, read t5 from comparator, set next
interrupt t5 + 5, which is in the past!
... counter loops ...
t10: Much much later, get the next interrupt.
This can happen either because we have interrupts masked for too long
(some stupid driver goes on a printk rampage) or just because we are
pushing the limits of the interval (too small a period), or both most
probably.
My solution is to read the main counter as well and set the next interrupt
to occur at the right interval, for example:
t0: interrupt, read t0 from comparator, set next interrupt t0 + 5
t11: delayed interrupt serviced, read t5 from comparator, set next
interrupt t15 as t10 has been missed.
t15: back on track.
Signed-off-by: Nils Carlson <nils.carlson@ericsson.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
akpm@linux-foundation.org [Wed, 15 Jun 2011 22:08:52 +0000 (15:08 -0700)]
 
Documentation/feature-removal-schedule.txt: remove ns_cgroup from feature-removal-schedule.txt
Commit 
a77aea92010acf ("cgroup: remove the ns_cgroup") removed the
ns_cgroup but it forgot to remove the related doc in
feature-removal-schedule.txt.
Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com>
Cc: Daniel Lezcano <daniel.lezcano@free.fr>
Cc: Serge E.  Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mel Gorman [Wed, 15 Jun 2011 22:08:52 +0000 (15:08 -0700)]
 
mm: compaction: abort compaction if too many pages are isolated and caller is asynchronous V2
Asynchronous compaction is used when promoting to huge pages.  This is all
very nice but if there are a number of processes in compacting memory, a
large number of pages can be isolated.  An "asynchronous" process can
stall for long periods of time as a result with a user reporting that
firefox can stall for 10s of seconds.  This patch aborts asynchronous
compaction if too many pages are isolated as it's better to fail a
hugepage promotion than stall a process.
[minchan.kim@gmail.com: return COMPACT_PARTIAL for abort]
Reported-and-tested-by: Ury Stankevich <urykhy@gmail.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrea Arcangeli [Wed, 15 Jun 2011 22:08:51 +0000 (15:08 -0700)]
 
mm: vmscan: do not use page_count without a page pin
It is unsafe to run page_count during the physical pfn scan because
compound_head could trip on a dangling pointer when reading
page->first_page if the compound page is being freed by another CPU.
[mgorman@suse.de: split out patch]
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mel Gorman [Wed, 15 Jun 2011 22:08:50 +0000 (15:08 -0700)]
 
mm: compaction: ensure that the compaction free scanner does not move to the next zone
Compaction works with two scanners, a migration and a free scanner.  When
the scanners crossover, migration within the zone is complete.  The
location of the scanner is recorded on each cycle to avoid excesive
scanning.
When a zone is small and mostly reserved, it's very easy for the migration
scanner to be close to the end of the zone.  Then the following situation
can occurs
  o migration scanner isolates some pages near the end of the zone
  o free scanner starts at the end of the zone but finds that the
    migration scanner is already there
  o free scanner gets reinitialised for the next cycle as
    cc->migrate_pfn + pageblock_nr_pages
    moving the free scanner into the next zone
  o migration scanner moves into the next zone
When this happens, NR_ISOLATED accounting goes haywire because some of the
accounting happens against the wrong zone.  One zones counter remains
positive while the other goes negative even though the overall global
count is accurate.  This was reported on X86-32 with !SMP because !SMP
allows the negative counters to be visible.  The fact that it is the bug
should theoritically be possible there.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Shaohua Li [Wed, 15 Jun 2011 22:08:49 +0000 (15:08 -0700)]
 
compaction: checks correct fragmentation index
fragmentation_index() returns -1000 when the allocation might succeed
This doesn't match the comment and code in compaction_suitable(). I
thought compaction_suitable should return COMPACT_PARTIAL in -1000
case, because in this case allocation could succeed depending on
watermarks.
The impact of this is that compaction starts and compact_finished() is
called which rechecks the watermarks and the free lists.  It should have
the same result in that compaction should not start but is more expensive.
Acked-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Minchan Kim [Wed, 15 Jun 2011 22:08:48 +0000 (15:08 -0700)]
 
mm/memory-failure.c: fix page isolated count mismatch
Pages isolated for migration are accounted with the vmstat counters
NR_ISOLATE_[ANON|FILE].  Callers of migrate_pages() are expected to
increment these counters when pages are isolated from the LRU.  Once the
pages have been migrated, they are put back on the LRU or freed and the
isolated count is decremented.
Memory failure is not properly accounting for pages it isolates causing
the NR_ISOLATED counters to be negative.  On SMP builds, this goes
unnoticed as negative counters are treated as 0 due to expected per-cpu
drift.  On UP builds, the counter is treated by too_many_isolated() as a
large value causing processes to enter D state during page reclaim or
compaction.  This patch accounts for pages isolated by memory failure
correctly.
[mel@csn.ul.ie: rewrote changelog]
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
Cc: Andi Kleen <andi@firstfloor.org>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Josh Triplett [Wed, 15 Jun 2011 22:08:47 +0000 (15:08 -0700)]
 
gcov: disable CONFIG_CONSTRUCTORS when not needed by CONFIG_GCOV_KERNEL
CONFIG_CONSTRUCTORS controls support for running constructor functions at
kernel init time.  According to commit 
b99b87f70c7785ab ("kernel:
constructor support"), gcov (CONFIG_GCOV_KERNEL) needs this.  However,
CONFIG_CONSTRUCTORS currently defaults to y, with no option to disable it,
and CONFIG_GCOV_KERNEL depends on it.  Instead, default it to n and have
CONFIG_GCOV_KERNEL select it, so that the normal case of
CONFIG_GCOV_KERNEL=n will result in CONFIG_CONSTRUCTORS=n.
Observed in the short list of =y values in a minimal kernel configuration.
Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Acked-by: WANG Cong <xiyou.wangcong@gmail.com>
Acked-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jean Delvare [Wed, 15 Jun 2011 22:08:46 +0000 (15:08 -0700)]
 
MAINTAINERS: add entry for legacy eeprom driver
I shall maintain the legacy eeprom driver, until we finally get rid of it.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
KAMEZAWA Hiroyuki [Wed, 15 Jun 2011 22:08:46 +0000 (15:08 -0700)]
 
memcg: avoid percpu cached charge draining at softlimit
Based on Michal Hocko's comment.
We are not draining per cpu cached charges during soft limit reclaim
because background reclaim doesn't care about charges.  It tries to free
some memory and charges will not give any.
Cached charges might influence only selection of the biggest soft limit
offender but as the call is done only after the selection has been already
done it makes no change.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
KAMEZAWA Hiroyuki [Wed, 15 Jun 2011 22:08:45 +0000 (15:08 -0700)]
 
memcg: fix percpu cached charge draining frequency
For performance, memory cgroup caches some "charge" from res_counter into
per cpu cache.  This works well but because it's cache, it needs to be
flushed in some cases.  Typical cases are
   1. when someone hit limit.
   2. when rmdir() is called and need to charges to be 0.
But "1" has problem.
Recently, with large SMP machines, we see many kworker runs because of
flushing memcg's cache.  Bad things in implementation are that even if a
cpu contains a cache for memcg not related to a memcg which hits limit,
drain code is called.
This patch does
        A) check percpu cache contains a useful data or not.
        B) check other asynchronous percpu draining doesn't run.
        C) don't call local cpu callback.
(*)This patch avoid changing the calling condition with hard-limit.
When I run "cat 1Gfile > /dev/null" under 300M limit memcg,
[Before]
13767 kamezawa  20   0 98.6m  424  416 D 10.0  0.0   0:00.61 cat
   58 root      20   0     0    0    0 S  0.6  0.0   0:00.09 kworker/2:1
   60 root      20   0     0    0    0 S  0.6  0.0   0:00.08 kworker/4:1
    4 root      20   0     0    0    0 S  0.3  0.0   0:00.02 kworker/0:0
   57 root      20   0     0    0    0 S  0.3  0.0   0:00.05 kworker/1:1
   61 root      20   0     0    0    0 S  0.3  0.0   0:00.05 kworker/5:1
   62 root      20   0     0    0    0 S  0.3  0.0   0:00.05 kworker/6:1
   63 root      20   0     0    0    0 S  0.3  0.0   0:00.05 kworker/7:1
[After]
 2676 root      20   0 98.6m  416  416 D  9.3  0.0   0:00.87 cat
 2626 kamezawa  20   0 15192 1312  920 R  0.3  0.0   0:00.28 top
    1 root      20   0 19384 1496 1204 S  0.0  0.0   0:00.66 init
    2 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kthreadd
    3 root      20   0     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/0
    4 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kworker/0:0
[akpm@linux-foundation.org: make percpu_charge_mutex static, tweak comments]
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Tested-by: Ying Han <yinghan@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
KAMEZAWA Hiroyuki [Wed, 15 Jun 2011 22:08:44 +0000 (15:08 -0700)]
 
memcg: fix wrong check of noswap with softlimit
Hierarchical reclaim doesn't swap out if memsw and resource limits are
thye same (memsw_is_minimum == true) because we would hit mem+swap limit
anyway (during hard limit reclaim).
If it comes to the soft limit we shouldn't consider memsw_is_minimum at
all because it doesn't make much sense.  Either the soft limit is bellow
the hard limit and then we cannot hit mem+swap limit or the direct reclaim
takes a precedence.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
KAMEZAWA Hiroyuki [Wed, 15 Jun 2011 22:08:43 +0000 (15:08 -0700)]
 
memcg: clear mm->owner when last possible owner leaves
The following crash was reported:
> Call Trace:
> [<
ffffffff81139792>] mem_cgroup_from_task+0x15/0x17
> [<
ffffffff8113a75a>] __mem_cgroup_try_charge+0x148/0x4b4
> [<
ffffffff810493f3>] ? need_resched+0x23/0x2d
> [<
ffffffff814cbf43>] ? preempt_schedule+0x46/0x4f
> [<
ffffffff8113afe8>] mem_cgroup_charge_common+0x9a/0xce
> [<
ffffffff8113b6d1>] mem_cgroup_newpage_charge+0x5d/0x5f
> [<
ffffffff81134024>] khugepaged+0x5da/0xfaf
> [<
ffffffff81078ea0>] ? __init_waitqueue_head+0x4b/0x4b
> [<
ffffffff81133a4a>] ? add_mm_counter.constprop.5+0x13/0x13
> [<
ffffffff81078625>] kthread+0xa8/0xb0
> [<
ffffffff814d13e8>] ? sub_preempt_count+0xa1/0xb4
> [<
ffffffff814d5664>] kernel_thread_helper+0x4/0x10
> [<
ffffffff814ce858>] ? retint_restore_args+0x13/0x13
> [<
ffffffff8107857d>] ? __init_kthread_worker+0x5a/0x5a
What happens is that khugepaged tries to charge a huge page against an mm
whose last possible owner has already exited, and the memory controller
crashes when the stale mm->owner is used to look up the cgroup to charge.
mm->owner has never been set to NULL with the last owner going away, but
nobody cared until khugepaged came along.
Even then it wasn't a problem because the final mmput() on an mm was
forced to acquire and release mmap_sem in write-mode, preventing an
exiting owner to go away while the mmap_sem was held, and until "692e0b3
mm: thp: optimize memcg charge in khugepaged", the memory cgroup charge
was protected by mmap_sem in read-mode.
Instead of going back to relying on the mmap_sem to enforce lifetime of a
task, this patch ensures that mm->owner is properly set to NULL when the
last possible owner is exiting, which the memory controller can handle
just fine.
[akpm@linux-foundation.org: tweak comments]
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Hugh Dickins <hughd@google.com>
Reported-by: Dave Jones <davej@redhat.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
KAMEZAWA Hiroyuki [Wed, 15 Jun 2011 22:08:42 +0000 (15:08 -0700)]
 
memcg: fix init_page_cgroup nid with sparsemem
Commit 
21a3c9646873 ("memcg: allocate memory cgroup structures in local
nodes") makes page_cgroup allocation as NUMA aware.  But that caused a
problem https://bugzilla.kernel.org/show_bug.cgi?id=36192.
The problem was getting a NID from invalid struct pages, which was not
initialized because it was out-of-node, out of [node_start_pfn,
node_end_pfn)
Now, with sparsemem, page_cgroup_init scans pfn from 0 to max_pfn.  But
this may scan a pfn which is not on any node and can access memmap which
is not initialized.
This makes page_cgroup_init() for SPARSEMEM node aware and remove a code
to get nid from page->flags.  (Then, we'll use valid NID always.)
[akpm@linux-foundation.org: try to fix up comments]
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
KAMEZAWA Hiroyuki [Wed, 15 Jun 2011 22:08:41 +0000 (15:08 -0700)]
 
mm: memory.numa_stat: fix file permission
Commit 
406eb0c9ba76 ("memcg: add memory.numastat api for numa
statistics") adds memory.numa_stat file for memory cgroup.  But the file
permissions are wrong.
  [kamezawa@bluextal linux-2.6]$ ls -l /cgroup/memory/A/memory.numa_stat
  ---------- 1 root root 0 Jun  9 18:36 /cgroup/memory/A/memory.numa_stat
This patch fixes the permission as
  [root@bluextal kamezawa]# ls -l /cgroup/memory/A/memory.numa_stat
  -r--r--r-- 1 root root 0 Jun 10 16:49 /cgroup/memory/A/memory.numa_stat
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Ying Han <yinghan@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Miao [Wed, 15 Jun 2011 22:08:40 +0000 (15:08 -0700)]
 
leds: fix the incorrect display in menuconfig
Seems when a config option does not have a dependency of the menuconfig,
it messes the display of the rest configs, even if it's a hidden one.
Signed-off-by: Eric Miao <eric.y.miao@gmail.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>