pandora-kernel.git
13 years agocris: Convert V10 interrupt handling
Thomas Gleixner [Wed, 19 Jan 2011 12:54:54 +0000 (13:54 +0100)]
cris: Convert V10 interrupt handling

Convert the irq_chip functions and install handle_simple_irq for each
interrupt. This converts V10 to the flow handling and lets us remove
__do_IRQ().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Mikael Starvik <starvik@axis.com>
13 years agocris: Use irq handling wrapper
Thomas Gleixner [Wed, 19 Jan 2011 12:59:01 +0000 (13:59 +0100)]
cris: Use irq handling wrapper

Use the wrapper around __do_IRQ() so we can convert V10 and V32
seperately.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Mikael Starvik <starvik@axis.com>
13 years agoh8300: Use generic irq Kconfig
Thomas Gleixner [Wed, 19 Jan 2011 11:26:32 +0000 (12:26 +0100)]
h8300: Use generic irq Kconfig

Switch to the generic irq Kconfig. h8300 has all irq chips converted
to the new functions, so select the GENERIC_HARDIRQS_NO_DEPRECATED
switch as well. Fixup the resulting fallout in show_interrupts().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Paul Mundt <lethal@linux-sh.org>
13 years agoh8300: Convert interrupt handling to flow handler
Thomas Gleixner [Wed, 19 Jan 2011 11:18:57 +0000 (12:18 +0100)]
h8300: Convert interrupt handling to flow handler

__do_IRQ is deprecated so h8300 needs to be converted to proper flow
handling. The irq chip is simple and does not required any
mask/ack/eoi functions, so we can use handle_simple_irq.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Paul Mundt <lethal@linux-sh.org>
13 years agoh8300: Convert to new irq_chip functions
Thomas Gleixner [Wed, 19 Jan 2011 11:15:29 +0000 (12:15 +0100)]
h8300: Convert to new irq_chip functions

No functional change, just straight forward conversion.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Paul Mundt <lethal@linux-sh.org>
13 years agonet_sched: accurate bytes/packets stats/rates
Eric Dumazet [Fri, 21 Jan 2011 07:31:33 +0000 (23:31 -0800)]
net_sched: accurate bytes/packets stats/rates

In commit 44b8288308ac9d (net_sched: pfifo_head_drop problem), we fixed
a problem with pfifo_head drops that incorrectly decreased
sch->bstats.bytes and sch->bstats.packets

Several qdiscs (CHOKe, SFQ, pfifo_head, ...) are able to drop a
previously enqueued packet, and bstats cannot be changed, so
bstats/rates are not accurate (over estimated)

This patch changes the qdisc_bstats updates to be done at dequeue() time
instead of enqueue() time. bstats counters no longer account for dropped
frames, and rates are more correct, since enqueue() bursts dont have
effect on dequeue() rate.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoMerge branch 'fix/asoc' into for-linus
Takashi Iwai [Fri, 21 Jan 2011 07:10:14 +0000 (08:10 +0100)]
Merge branch 'fix/asoc' into for-linus

13 years agoMerge branch 'fix/misc' into for-linus
Takashi Iwai [Fri, 21 Jan 2011 07:10:09 +0000 (08:10 +0100)]
Merge branch 'fix/misc' into for-linus

13 years agopowerpc/mpic: Fix mask/unmask timeout message
Scott Wood [Mon, 17 Jan 2011 12:10:41 +0000 (12:10 +0000)]
powerpc/mpic: Fix mask/unmask timeout message

Don't say that enable timed out when it was disable, and
show which IRQ had the problem.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
13 years agopowerpc/pseries: Add BNX2=m to defconfig
Nishanth Aravamudan [Thu, 13 Jan 2011 13:22:39 +0000 (13:22 +0000)]
powerpc/pseries: Add BNX2=m to defconfig

Upcoming servers will include a Broadcom NIC, add to the defconfig to
increase testing coverage and make sure mainline builds come up with
networking.

Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
13 years agopowerpc: Enable 64kB pages and 1024 threads in pseries config
Anton Blanchard [Wed, 12 Jan 2011 02:14:32 +0000 (02:14 +0000)]
powerpc: Enable 64kB pages and 1024 threads in pseries config

- Enable 64kB pages so it gets some regular testing.

- The largest POWER7 has 1024 threads so bump NR_CPUS it to match.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
13 years agopowerpc: Disable mcount tracers in pseries defconfig
Anton Blanchard [Wed, 12 Jan 2011 02:12:43 +0000 (02:12 +0000)]
powerpc: Disable mcount tracers in pseries defconfig

IRQSOFF_TRACER and STACK_TRACER force the kernel to be built with -pg
which is a substantial overhead.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
13 years agopowerpc/boot/dts: Install dts from the right directory
Ben Hutchings [Sat, 8 Jan 2011 14:24:01 +0000 (14:24 +0000)]
powerpc/boot/dts: Install dts from the right directory

The dts-installed variable is initialised using a wildcard path that
will be expanded relative to the build directory.  Use the existing
variable dtstree to generate an absolute wildcard path that will work
when building in a separate directory.

Reported-by: Gerhard Pircher <gerhard_pircher@gmx.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Tested-by: Gerhard Pircher <gerhard_pircher@gmx.net> [against 2.6.32]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
13 years agopowerpc: machine_check_generic is wrong on 64bit
Anton Blanchard [Tue, 11 Jan 2011 19:52:31 +0000 (19:52 +0000)]
powerpc: machine_check_generic is wrong on 64bit

Decoding machine checks is CPU specific and so machine_check_generic doesn't
do the right thing on 64bit chips. Luckily we never call into this code
because we call ppc_md.machine_check_exception instead if available.

Since we check cur_cpu_spec->machine_check before calling it, we may as
well remove machine_check_generic from 64bit archs.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
13 years agopowerpc: Check RTAS extended log flag before checking length
Anton Blanchard [Tue, 11 Jan 2011 19:51:31 +0000 (19:51 +0000)]
powerpc: Check RTAS extended log flag before checking length

The spec suggests we should first check the extended log flag before checking
the length field.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
13 years agopowerpc: Fix corruption when grabbing FWNMI data
Anton Blanchard [Tue, 11 Jan 2011 19:50:51 +0000 (19:50 +0000)]
powerpc: Fix corruption when grabbing FWNMI data

The FWNMI code uses a global buffer without any locks to read the RTAS error
information. If two CPUs take a machine check at once then we will corrupt
this buffer.

Since most FWNMI rtas messages are not of the extended type, we can create a
64bit percpu buffer and use it where possible. If we do receive an extended
RTAS log then we fall back to the old behaviour of using the global buffer.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
13 years agopowerpc: Rework pseries machine check handler
Anton Blanchard [Tue, 11 Jan 2011 19:49:19 +0000 (19:49 +0000)]
powerpc: Rework pseries machine check handler

Rework pseries machine check handler:

- If MSR_RI isn't set, we cannot recover even if the machine check was fully
  recovered

- Rename nonfatal to recovered

- Handle RTAS_DISP_LIMITED_RECOVERY

- Use BUS_MCEERR_AR instead of BUS_ADRERR

- Don't check all the RTAS error log fields when receiving a synchronous
  machine check. Recent versions of the pseries firmware do not fill them
  in during a machine check and instead send a follow up error log with
  the detailed information. If we see a synchronous machine check, and we
  came from userspace then kill the task.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
13 years agopowerpc: Don't silently handle machine checks from userspace
Anton Blanchard [Tue, 11 Jan 2011 19:48:14 +0000 (19:48 +0000)]
powerpc: Don't silently handle machine checks from userspace

If a machine check comes from userspace we send a SIGBUS to the task and
fail to printk anything.

If we are taking machine checks due to bad hardware we want to know about
it right away. Furthermore if we don't complain loudly then it will look
a lot like a bug in the userspace application, potentially causing a lot
of confusion.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
13 years agopowerpc: Remove duplicate debugger hook in machine_check_exception
Anton Blanchard [Tue, 11 Jan 2011 19:47:20 +0000 (19:47 +0000)]
powerpc: Remove duplicate debugger hook in machine_check_exception

We are calling debugger_fault_handler twice in machine_check_exception.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
13 years agopowerpc: Never halt RTAS error logging after receiving an unrecoverable machine check
Anton Blanchard [Tue, 11 Jan 2011 19:46:29 +0000 (19:46 +0000)]
powerpc: Never halt RTAS error logging after receiving an unrecoverable machine check

Newer versions of the System p firwmare send a partial RTAS error log in the
machine check handler with a more detailed response appearing sometime later
via check event.

This means at machine check time we do not have enough information to
ascertain exactly what went on. Furthermore, I have found the RTAS error
logs in the machine check handler contain no useful information, so halting on
them makes little sense. If we want to halt it would make more sense to do
it following the error log received sometime later via check event.

In light of this, never halt the error log in the pseries machine
check handler.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
13 years agopowerpc: Don't force MSR_RI in machine_check_exception
Anton Blanchard [Tue, 11 Jan 2011 19:45:31 +0000 (19:45 +0000)]
powerpc: Don't force MSR_RI in machine_check_exception

We should never force MSR_RI on. If we take a machine check with MSR_RI off
then we have no chance of recovering safely.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
13 years agopowerpc: Print 32 bits of DSISR in show_regs
Anton Blanchard [Tue, 11 Jan 2011 19:44:30 +0000 (19:44 +0000)]
powerpc: Print 32 bits of DSISR in show_regs

We were printing 64 bits of DSISR in show_regs even though it is 32 bit.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
13 years agopowerpc/kdump: Disable ftrace during kexec
Anton Blanchard [Thu, 6 Jan 2011 18:00:36 +0000 (18:00 +0000)]
powerpc/kdump: Disable ftrace during kexec

We should disable ftrace during kexec, some of the tracers are very invasive
and we do not want them going off while doing the low level work of swapping
one kernel out for another. This mirrors what we do on x86.

Even though we cannot return from a kexec on powerpc (since we do not implement
CONFIG_KEXEC_JUMP), add the restore code in case we do one day.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
13 years agopowerpc/kdump: Move crash_kexec_stop_spus to kdump crash handler
Anton Blanchard [Fri, 21 Jan 2011 02:43:59 +0000 (13:43 +1100)]
powerpc/kdump: Move crash_kexec_stop_spus to kdump crash handler

Use the crash handler hooks to run the SPU stop code, just like we do for
ehea and cell RAS code.

While I'm here I noticed "CPUSs reliabally"

so fix the spelling MISTAKESs reliabally.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
13 years agopowerpc/kexec: Remove empty ppc_md.machine_kexec_prepare
Anton Blanchard [Thu, 6 Jan 2011 17:58:36 +0000 (17:58 +0000)]
powerpc/kexec: Remove empty ppc_md.machine_kexec_prepare

We check for a valid handler before calling ppc_md.machine_kexec_prepare
so we can just remove these empty handlers.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
13 years agopowerpc/kexec: Don't initialise kexec hooks to default handlers
Anton Blanchard [Thu, 6 Jan 2011 17:57:03 +0000 (17:57 +0000)]
powerpc/kexec: Don't initialise kexec hooks to default handlers

There's no need to initialise ppc_md.machine_kexec and
ppc_md.machine_kexec_prepare to the default handlers.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
13 years agopowerpc/kdump: Remove ppc_md.machine_crash_shutdown
Anton Blanchard [Thu, 6 Jan 2011 17:56:09 +0000 (17:56 +0000)]
powerpc/kdump: Remove ppc_md.machine_crash_shutdown

No one uses ppc_md.machine_crash_shutdown, so remove it.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
13 years agopowerpc/kexec: Remove ppc_md.machine_kexec
Anton Blanchard [Thu, 6 Jan 2011 17:55:36 +0000 (17:55 +0000)]
powerpc/kexec: Remove ppc_md.machine_kexec

No one uses ppc_md.machine_kexec, so remove it.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
13 years agopowerpc/kexec: Remove ppc_md.machine_kexec_cleanup
Anton Blanchard [Thu, 6 Jan 2011 17:54:58 +0000 (17:54 +0000)]
powerpc/kexec: Remove ppc_md.machine_kexec_cleanup

No one uses ppc_md.machine_kexec_cleanup, so remove it.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
13 years agopowerpc/kexec: Move all ppc_md kexec function pointers together
Anton Blanchard [Thu, 6 Jan 2011 17:54:15 +0000 (17:54 +0000)]
powerpc/kexec: Move all ppc_md kexec function pointers together

Move all the kexec handlers together.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
13 years agopowerpc/cell: Use system_wq in cpufreq_spudemand
Tejun Heo [Mon, 3 Jan 2011 03:49:25 +0000 (03:49 +0000)]
powerpc/cell: Use system_wq in cpufreq_spudemand

With cmwq, there's no reason to use a separate workqueue in
cpufreq_spudemand.  Use system_wq instead.  The work items are already
sync canceled on stop, so it's already guaranteed that no work is
running when spu_gov_exit() is entered.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Dave Jones <davej@redhat.com>
Cc: cpufreq@vger.kernel.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
13 years agopowerpc/macintosh: Fix wrong test in fan_{read,write}_reg()
roel kluin [Fri, 31 Dec 2010 04:57:46 +0000 (04:57 +0000)]
powerpc/macintosh: Fix wrong test in fan_{read,write}_reg()

Fix error test in fan_{read,write}_reg()

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
13 years agopowerpc/rtas_flash: Use simple_read_from_buffer
Akinobu Mita [Fri, 24 Dec 2010 20:03:59 +0000 (20:03 +0000)]
powerpc/rtas_flash: Use simple_read_from_buffer

Simplify read file operation for /proc/powerpc/rtas/* interface
by using simple_read_from_buffer.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
13 years agopowerpc/spufs: Use simple_write_to_buffer
Akinobu Mita [Fri, 24 Dec 2010 20:03:56 +0000 (20:03 +0000)]
powerpc/spufs: Use simple_write_to_buffer

Simplify several write fileoperations for spufs by using
simple_write_to_buffer().

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
13 years agopowerpc/ppc32/tracing: Add stack frame to calls of trace_hardirqs_on/off
Steven Rostedt [Wed, 22 Dec 2010 16:42:56 +0000 (16:42 +0000)]
powerpc/ppc32/tracing: Add stack frame to calls of trace_hardirqs_on/off

32-bit variant of the previous patch for 64-bit:

<<
    When an interrupt occurs in userspace, we can call trace_hardirqs_on/off()
    With one level stack. But if we have irqsoff tracing enabled,
    it checks both CALLER_ADDR0 and CALLER_ADDR1. The second call
    goes two stack frames up. If this is from user space, then there may
    not exist a second stack....
>>

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
13 years agopowerpc/ppc64/tracing: Add stack frame to calls of trace_hardirqs_on/off
Steven Rostedt [Thu, 23 Dec 2010 19:46:06 +0000 (19:46 +0000)]
powerpc/ppc64/tracing: Add stack frame to calls of trace_hardirqs_on/off

    When an interrupt occurs in userspace, we can call trace_hardirqs_on/off()
    With one level stack. But if we have irqsoff tracing enabled,
    it checks both CALLER_ADDR0 and CALLER_ADDR1. The second call
    goes two stack frames up. If this is from user space, then there may
    not exist a second stack.

    Add a second stack when calling trace_hardirqs_on/off() otherwise
    the following oops might occur:

    Oops: Kernel access of bad area, sig: 11 [#1]
    PREEMPT SMP NR_CPUS=2 PA Semi PWRficient
    last sysfs file: /sys/block/sda/size
    Modules linked in: ohci_hcd ehci_hcd usbcore
    NIP: c0000000000e1c00 LR: c0000000000034d4 CTR: 000000011012c440
    REGS: c00000003e2f3af0 TRAP: 0300   Not tainted  (2.6.37-rc6+)
    MSR: 9000000000001032 <ME,IR,DR>  CR: 48044444  XER: 20000000
    DAR: 00000001ffb9db50, DSISR: 0000000040000000
    TASK = c00000003e1a00a0[2088] 'emacs' THREAD: c00000003e2f0000 CPU: 1
    GPR00: 0000000000000001 c00000003e2f3d70 c00000000084e0d0 c0000000008816e8
    GPR04: 000000001034c678 000000001032e8f9 0000000010336540 0000000040020000
    GPR08: 0000000040020000 00000001ffb9db40 c00000003e2f3e30 0000000060000000
    GPR12: 100000000000f032 c00000000fff0280 000000001032e8c9 0000000000000008
    GPR16: 00000000105be9c0 00000000105be950 00000000105be9b0 00000000105be950
    GPR20: 00000000ffb9dc50 00000000ffb9dbf0 00000000102f0000 00000000102f0000
    GPR24: 00000000102e0000 00000000102f0000 0000000010336540 c0000000009ded38
    GPR28: 00000000102e0000 c0000000000034d4 c0000000007ccb10 c00000003e2f3d70
    NIP [c0000000000e1c00] .trace_hardirqs_off+0xb0/0x1d0
    LR [c0000000000034d4] decrementer_common+0xd4/0x100
    Call Trace:
    [c00000003e2f3d70] [c00000003e2f3e30] 0xc00000003e2f3e30 (unreliable)
    [c00000003e2f3e30] [c0000000000034d4] decrementer_common+0xd4/0x100
    Instruction dump:
    81690000 7f8b0000 419e0018 f84a0028 60000000 60000000 60000000 e95f0000
    80030000 e92a0000 eb6301f8 2f800000 <eb89001041fe00dc a06d000a eb1e8050
    ---[ end trace 4ec7fd2be9240928 ]---

Reported-by: Joerg Sommer <joerg@alea.gnuu.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
13 years agopowerpc: Ensure the else case of feature sections will fit
Michael Ellerman [Sun, 7 Nov 2010 18:22:29 +0000 (18:22 +0000)]
powerpc: Ensure the else case of feature sections will fit

When we create an alternative feature section, the else case must be the
same size or smaller than the body. This is because when we patch the
else case in we just overwrite the body, so there must be room.

Up to now we just did this by inspection, but it's quite easy to enforce
it in the assembler, so we should.

The only change is to add the ifgt block, but that effects the alignment
of the tabs and so the whole macro is modified.

Also add a test, but #if 0 it because we don't want to break the build.
Anyone who's modifying the feature macros should enable the test.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
13 years agoMerge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 21 Jan 2011 02:30:37 +0000 (18:30 -0800)]
Merge branch 'core-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  smp: Allow on_each_cpu() to be called while early_boot_irqs_disabled status to init/main.c
  lockdep: Move early boot local IRQ enable/disable status to init/main.c

13 years agoACPI / PM: Call suspend_nvs_free() earlier during resume
Rafael J. Wysocki [Wed, 19 Jan 2011 21:27:55 +0000 (22:27 +0100)]
ACPI / PM: Call suspend_nvs_free() earlier during resume

It turns out that some device drivers map pages from the ACPI NVS region
during resume using ioremap(), which conflicts with ioremap_cache() used
for mapping those pages by the NVS save/restore code in nvs.c.

Make the NVS pages mapped by the code in nvs.c be unmapped before device
drivers' resume routines run.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agoACPI: Introduce acpi_os_ioremap()
Rafael J. Wysocki [Wed, 19 Jan 2011 21:27:14 +0000 (22:27 +0100)]
ACPI: Introduce acpi_os_ioremap()

Commit ca9b600be38c ("ACPI / PM: Make suspend_nvs_save() use
acpi_os_map_memory()") attempted to prevent the code in osl.c and nvs.c
from using different ioremap() variants by making the latter use
acpi_os_map_memory() for mapping the NVS pages.  However, that also
requires acpi_os_unmap_memory() to be used for unmapping them, which
causes synchronize_rcu() to be executed many times in a row
unnecessarily and introduces substantial delays during resume on some
systems.

Instead of using acpi_os_map_memory() for mapping the NVS pages in nvs.c
introduce acpi_os_ioremap() calling ioremap_cache() and make the code in
both osl.c and nvs.c use it.

Reported-by: Jeff Chua <jeff.chua.linux@gmail.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agocifs: fix up CIFSSMBEcho for unaligned access
Jeff Layton [Fri, 21 Jan 2011 02:19:25 +0000 (21:19 -0500)]
cifs: fix up CIFSSMBEcho for unaligned access

Make sure that CIFSSMBEcho can handle unaligned fields. Also fix a minor
bug that causes this warning:

fs/cifs/cifssmb.c: In function 'CIFSSMBEcho':
fs/cifs/cifssmb.c:740: warning: large integer implicitly truncated to unsigned type

...WordCount is u8, not __le16, so no need to convert it.

This patch should apply cleanly on top of the rest of the patchset to
clean up unaligned access.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
13 years agoMerge branch 'for-next'
Steve French [Fri, 21 Jan 2011 02:19:30 +0000 (02:19 +0000)]
Merge branch 'for-next'

13 years agoMerge branch 'akpm'
Linus Torvalds [Fri, 21 Jan 2011 01:02:14 +0000 (17:02 -0800)]
Merge branch 'akpm'

* akpm:
  kernel/smp.c: consolidate writes in smp_call_function_interrupt()
  kernel/smp.c: fix smp_call_function_many() SMP race
  memcg: correctly order reading PCG_USED and pc->mem_cgroup
  backlight: fix 88pm860x_bl macro collision
  drivers/leds/ledtrig-gpio.c: make output match input, tighten input checking
  MAINTAINERS: update Atmel AT91 entry
  mm: fix truncate_setsize() comment
  memcg: fix rmdir, force_empty with THP
  memcg: fix LRU accounting with THP
  memcg: fix USED bit handling at uncharge in THP
  memcg: modify accounting function for supporting THP better
  fs/direct-io.c: don't try to allocate more than BIO_MAX_PAGES in a bio
  mm: compaction: prevent division-by-zero during user-requested compaction
  mm/vmscan.c: remove duplicate include of compaction.h
  memblock: fix memblock_is_region_memory()
  thp: keep highpte mapped until it is no longer needed
  kconfig: rename CONFIG_EMBEDDED to CONFIG_EXPERT

13 years agokernel/smp.c: consolidate writes in smp_call_function_interrupt()
Milton Miller [Thu, 20 Jan 2011 22:44:34 +0000 (14:44 -0800)]
kernel/smp.c: consolidate writes in smp_call_function_interrupt()

We have to test the cpu mask in the interrupt handler before checking the
refs, otherwise we can start to follow an entry before its deleted and
find it partially initailzed for the next trip.  Presently we also clear
the cpumask bit before executing the called function, which implies
getting write access to the line.  After the function is called we then
decrement refs, and if they go to zero we then unlock the structure.

However, this implies getting write access to the call function data
before and after another the function is called.  If we can assert that no
smp_call_function execution function is allowed to enable interrupts, then
we can move both writes to after the function is called, hopfully allowing
both writes with one cache line bounce.

On a 256 thread system with a kernel compiled for 1024 threads, the time
to execute testcase in the "smp_call_function_many race" changelog was
reduced by about 30-40ms out of about 545 ms.

I decided to keep this as WARN because its now a buggy function, even
though the stack trace is of no value -- a simple printk would give us the
information needed.

Raw data:

Without patch:
  ipi_test startup took 1219366ns complete 539819014ns total 541038380ns
  ipi_test startup took 1695754ns complete 543439872ns total 545135626ns
  ipi_test startup took 7513568ns complete 539606362ns total 547119930ns
  ipi_test startup took 13304064ns complete 533898562ns total 547202626ns
  ipi_test startup took 8668192ns complete 544264074ns total 552932266ns
  ipi_test startup took 4977626ns complete 548862684ns total 553840310ns
  ipi_test startup took 2144486ns complete 541292318ns total 543436804ns
  ipi_test startup took 21245824ns complete 530280180ns total 551526004ns

With patch:
  ipi_test startup took 5961748ns complete 500859628ns total 506821376ns
  ipi_test startup took 8975996ns complete 495098924ns total 504074920ns
  ipi_test startup took 19797750ns complete 492204740ns total 512002490ns
  ipi_test startup took 14824796ns complete 487495878ns total 502320674ns
  ipi_test startup took 11514882ns complete 494439372ns total 505954254ns
  ipi_test startup took 8288084ns complete 502570774ns total 510858858ns
  ipi_test startup took 6789954ns complete 493388112ns total 500178066ns

#include <linux/module.h>
#include <linux/init.h>
#include <linux/sched.h> /* sched clock */

#define ITERATIONS 100

static void do_nothing_ipi(void *dummy)
{
}

static void do_ipis(struct work_struct *dummy)
{
int i;

for (i = 0; i < ITERATIONS; i++)
smp_call_function(do_nothing_ipi, NULL, 1);

printk(KERN_DEBUG "cpu %d finished\n", smp_processor_id());
}

static struct work_struct work[NR_CPUS];

static int __init testcase_init(void)
{
int cpu;
u64 start, started, done;

start = local_clock();
for_each_online_cpu(cpu) {
INIT_WORK(&work[cpu], do_ipis);
schedule_work_on(cpu, &work[cpu]);
}
started = local_clock();
for_each_online_cpu(cpu)
flush_work(&work[cpu]);
done = local_clock();
pr_info("ipi_test startup took %lldns complete %lldns total %lldns\n",
started-start, done-started, done-start);

return 0;
}

static void __exit testcase_exit(void)
{
}

module_init(testcase_init)
module_exit(testcase_exit)
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anton Blanchard");

Signed-off-by: Milton Miller <miltonm@bga.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agokernel/smp.c: fix smp_call_function_many() SMP race
Anton Blanchard [Thu, 20 Jan 2011 22:44:33 +0000 (14:44 -0800)]
kernel/smp.c: fix smp_call_function_many() SMP race

I noticed a failure where we hit the following WARN_ON in
generic_smp_call_function_interrupt:

                if (!cpumask_test_and_clear_cpu(cpu, data->cpumask))
                        continue;

                data->csd.func(data->csd.info);

                refs = atomic_dec_return(&data->refs);
                WARN_ON(refs < 0);      <-------------------------

We atomically tested and cleared our bit in the cpumask, and yet the
number of cpus left (ie refs) was 0.  How can this be?

It turns out commit 54fdade1c3332391948ec43530c02c4794a38172
("generic-ipi: make struct call_function_data lockless") is at fault.  It
removes locking from smp_call_function_many and in doing so creates a
rather complicated race.

The problem comes about because:

 - The smp_call_function_many interrupt handler walks call_function.queue
   without any locking.
 - We reuse a percpu data structure in smp_call_function_many.
 - We do not wait for any RCU grace period before starting the next
   smp_call_function_many.

Imagine a scenario where CPU A does two smp_call_functions back to back,
and CPU B does an smp_call_function in between.  We concentrate on how CPU
C handles the calls:

CPU A            CPU B                  CPU C              CPU D

smp_call_function
                                        smp_call_function_interrupt
                                            walks
call_function.queue sees
data from CPU A on list

                 smp_call_function

                                        smp_call_function_interrupt
                                            walks

                                        call_function.queue sees
                                          (stale) CPU A on list
   smp_call_function int
   clears last ref on A
   list_del_rcu, unlock
smp_call_function reuses
percpu *data A
                                         data->cpumask sees and
                                         clears bit in cpumask
                                         might be using old or new fn!
                                         decrements refs below 0

set data->refs (too late!)

The important thing to note is since the interrupt handler walks a
potentially stale call_function.queue without any locking, then another
cpu can view the percpu *data structure at any time, even when the owner
is in the process of initialising it.

The following test case hits the WARN_ON 100% of the time on my PowerPC
box (having 128 threads does help :)

#include <linux/module.h>
#include <linux/init.h>

#define ITERATIONS 100

static void do_nothing_ipi(void *dummy)
{
}

static void do_ipis(struct work_struct *dummy)
{
int i;

for (i = 0; i < ITERATIONS; i++)
smp_call_function(do_nothing_ipi, NULL, 1);

printk(KERN_DEBUG "cpu %d finished\n", smp_processor_id());
}

static struct work_struct work[NR_CPUS];

static int __init testcase_init(void)
{
int cpu;

for_each_online_cpu(cpu) {
INIT_WORK(&work[cpu], do_ipis);
schedule_work_on(cpu, &work[cpu]);
}

return 0;
}

static void __exit testcase_exit(void)
{
}

module_init(testcase_init)
module_exit(testcase_exit)
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anton Blanchard");

I tried to fix it by ordering the read and the write of ->cpumask and
->refs.  In doing so I missed a critical case but Paul McKenney was able
to spot my bug thankfully :) To ensure we arent viewing previous
iterations the interrupt handler needs to read ->refs then ->cpumask then
->refs _again_.

Thanks to Milton Miller and Paul McKenney for helping to debug this issue.

[miltonm@bga.com: add WARN_ON and BUG_ON, remove extra read of refs before initial read of mask that doesn't help (also noted by Peter Zijlstra), adjust comments, hopefully clarify scenario ]
[miltonm@bga.com: remove excess tests]
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Milton Miller <miltonm@bga.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: <stable@kernel.org> [2.6.32+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agomemcg: correctly order reading PCG_USED and pc->mem_cgroup
Johannes Weiner [Thu, 20 Jan 2011 22:44:31 +0000 (14:44 -0800)]
memcg: correctly order reading PCG_USED and pc->mem_cgroup

The placement of the read-side barrier is confused: the writer first
sets pc->mem_cgroup, then PCG_USED.  The read-side barrier has to be
between testing PCG_USED and reading pc->mem_cgroup.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agobacklight: fix 88pm860x_bl macro collision
Randy Dunlap [Thu, 20 Jan 2011 22:44:31 +0000 (14:44 -0800)]
backlight: fix 88pm860x_bl macro collision

Fix collision with kernel-supplied #define:

  drivers/video/backlight/88pm860x_bl.c:24:1: warning: "CURRENT_MASK" redefined
  arch/x86/include/asm/page_64_types.h:6:1: warning: this is the location of the previous definition

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Haojian Zhuang <haojian.zhuang@marvell.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agodrivers/leds/ledtrig-gpio.c: make output match input, tighten input checking
Janusz Krzysztofik [Thu, 20 Jan 2011 22:44:29 +0000 (14:44 -0800)]
drivers/leds/ledtrig-gpio.c: make output match input, tighten input checking

Replicate changes made to drivers/leds/ledtrig-backlight.c.

Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Richard Purdie <richard.purdie@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agoMAINTAINERS: update Atmel AT91 entry
Nicolas Ferre [Thu, 20 Jan 2011 22:44:27 +0000 (14:44 -0800)]
MAINTAINERS: update Atmel AT91 entry

Add two co-maintainers and update the entry with new information.

Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Acked-by: Andrew Victor <linux@maxim.org.za>
Acked-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agomm: fix truncate_setsize() comment
Jan Kara [Thu, 20 Jan 2011 22:44:26 +0000 (14:44 -0800)]
mm: fix truncate_setsize() comment

Contrary to what the comment says, truncate_setsize() should be called
*before* filesystem truncated blocks.

Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agomemcg: fix rmdir, force_empty with THP
KAMEZAWA Hiroyuki [Thu, 20 Jan 2011 22:44:25 +0000 (14:44 -0800)]
memcg: fix rmdir, force_empty with THP

Now, when THP is enabled, memcg's rmdir() function is broken because
move_account() for THP page is not supported.

This will cause account leak or -EBUSY issue at rmdir().
This patch fixes the issue by supporting move_account() THP pages.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agomemcg: fix LRU accounting with THP
KAMEZAWA Hiroyuki [Thu, 20 Jan 2011 22:44:24 +0000 (14:44 -0800)]
memcg: fix LRU accounting with THP

memory cgroup's LRU stat should take care of size of pages because
Transparent Hugepage inserts hugepage into LRU.  If this value is the
number wrong, memory reclaim will not work well.

Note: only head page of THP's huge page is linked into LRU.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agomemcg: fix USED bit handling at uncharge in THP
KAMEZAWA Hiroyuki [Thu, 20 Jan 2011 22:44:24 +0000 (14:44 -0800)]
memcg: fix USED bit handling at uncharge in THP

Now, under THP:

at charge:
  - PageCgroupUsed bit is set to all page_cgroup on a hugepage.
    ....set to 512 pages.
at uncharge
  - PageCgroupUsed bit is unset on the head page.

So, some pages will remain with "Used" bit.

This patch fixes that Used bit is set only to the head page.
Used bits for tail pages will be set at splitting if necessary.

This patch adds this lock order:
   compound_lock() -> page_cgroup_move_lock().

[akpm@linux-foundation.org: fix warning]
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agomemcg: modify accounting function for supporting THP better
KAMEZAWA Hiroyuki [Thu, 20 Jan 2011 22:44:23 +0000 (14:44 -0800)]
memcg: modify accounting function for supporting THP better

mem_cgroup_charge_statisics() was designed for charging a page but now, we
have transparent hugepage.  To fix problems (in following patch) it's
required to change the function to get the number of pages as its
arguments.

The new function gets following as argument.
  - type of page rather than 'pc'
  - size of page which is accounted.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agofs/direct-io.c: don't try to allocate more than BIO_MAX_PAGES in a bio
David Dillow [Thu, 20 Jan 2011 22:44:22 +0000 (14:44 -0800)]
fs/direct-io.c: don't try to allocate more than BIO_MAX_PAGES in a bio

When using devices that support max_segments > BIO_MAX_PAGES (256), direct
IO tries to allocate a bio with more pages than allowed, which leads to an
oops in dio_bio_alloc().  Clamp the request to the supported maximum, and
change dio_bio_alloc() to reflect that bio_alloc() will always return a
bio when called with __GFP_WAIT and a valid number of vectors.

[akpm@linux-foundation.org: remove redundant BUG_ON()]
Signed-off-by: David Dillow <dillowda@ornl.gov>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agomm: compaction: prevent division-by-zero during user-requested compaction
Johannes Weiner [Thu, 20 Jan 2011 22:44:21 +0000 (14:44 -0800)]
mm: compaction: prevent division-by-zero during user-requested compaction

Up until 3e7d344 ("mm: vmscan: reclaim order-0 and use compaction instead
of lumpy reclaim"), compaction skipped calculating the fragmentation index
of a zone when compaction was explicitely requested through the procfs
knob.

However, when compaction_suitable was introduced, it did not come with an
extra check for order == -1, set on explicit compaction requests, and
passed this order on to the fragmentation index calculation, where it
overshifts the number of requested pages, leading to a division by zero.

This patch makes sure that order == -1 is recognized as the flag it is
rather than passing it along as valid order parameter.

[akpm@linux-foundation.org: add comment, per Mel]
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agomm/vmscan.c: remove duplicate include of compaction.h
Jesper Juhl [Thu, 20 Jan 2011 22:44:20 +0000 (14:44 -0800)]
mm/vmscan.c: remove duplicate include of compaction.h

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agomemblock: fix memblock_is_region_memory()
Tomi Valkeinen [Thu, 20 Jan 2011 22:44:20 +0000 (14:44 -0800)]
memblock: fix memblock_is_region_memory()

memblock_is_region_memory() uses reserved memblocks to search for the
given region, while it should use the memory memblocks.

I encountered the problem with OMAP's framebuffer ram allocation.
Normally the ram is allocated dynamically, and this function is not
called.  However, if we want to pass the framebuffer from the bootloader
to the kernel (to retain the boot image), this function is used to check
the validity of the kernel parameters for the framebuffer ram area.

Signed-off-by: Tomi Valkeinen <tomi.valkeinen@nokia.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agothp: keep highpte mapped until it is no longer needed
Johannes Weiner [Thu, 20 Jan 2011 22:44:18 +0000 (14:44 -0800)]
thp: keep highpte mapped until it is no longer needed

Two users reported THP-related crashes on 32-bit x86 machines.  Their oops
reports indicated an invalid pte, and subsequent code inspection showed
that the highpte is actually used after unmap.

The fix is to unmap the pte only after all operations against it are
finished.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Ilya Dryomov <idryomov@gmail.com>
Reported-by: werner <w.landgraf@ru.ru>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Tested-by: Ilya Dryomov <idryomov@gmail.com>
Tested-by: Steven Rostedt <rostedt@goodmis.org
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agokconfig: rename CONFIG_EMBEDDED to CONFIG_EXPERT
David Rientjes [Thu, 20 Jan 2011 22:44:16 +0000 (14:44 -0800)]
kconfig: rename CONFIG_EMBEDDED to CONFIG_EXPERT

The meaning of CONFIG_EMBEDDED has long since been obsoleted; the option
is used to configure any non-standard kernel with a much larger scope than
only small devices.

This patch renames the option to CONFIG_EXPERT in init/Kconfig and fixes
references to the option throughout the kernel.  A new CONFIG_EMBEDDED
option is added that automatically selects CONFIG_EXPERT when enabled and
can be used in the future to isolate options that should only be
considered for embedded systems (RISC architectures, SLOB, etc).

Calling the option "EXPERT" more accurately represents its intention: only
expert users who understand the impact of the configuration changes they
are making should enable it.

Reviewed-by: Ingo Molnar <mingo@elte.hu>
Acked-by: David Woodhouse <david.woodhouse@intel.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Greg KH <gregkh@suse.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Robin Holt <holt@sgi.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agobonding: Ensure that we unshare skbs prior to calling pskb_may_pull
Neil Horman [Thu, 20 Jan 2011 09:02:31 +0000 (09:02 +0000)]
bonding: Ensure that we unshare skbs prior to calling pskb_may_pull

Recently reported oops:

kernel BUG at net/core/skbuff.c:813!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/devices/virtual/net/bond0/broadcast
CPU 8
Modules linked in: sit tunnel4 cpufreq_ondemand acpi_cpufreq freq_table bonding
ipv6 dm_mirror dm_region_hash dm_log cdc_ether usbnet mii serio_raw i2c_i801
i2c_core iTCO_wdt iTCO_vendor_support shpchp ioatdma i7core_edac edac_core bnx2
ixgbe dca mdio sg ext4 mbcache jbd2 sd_mod crc_t10dif mptsas mptscsih mptbase
scsi_transport_sas dm_mod [last unloaded: microcode]

Modules linked in: sit tunnel4 cpufreq_ondemand acpi_cpufreq freq_table bonding
ipv6 dm_mirror dm_region_hash dm_log cdc_ether usbnet mii serio_raw i2c_i801
i2c_core iTCO_wdt iTCO_vendor_support shpchp ioatdma i7core_edac edac_core bnx2
ixgbe dca mdio sg ext4 mbcache jbd2 sd_mod crc_t10dif mptsas mptscsih mptbase
scsi_transport_sas dm_mod [last unloaded: microcode]
Pid: 0, comm: swapper Not tainted 2.6.32-71.el6.x86_64 #1 BladeCenter HS22
-[7870AC1]-
RIP: 0010:[<ffffffff81405b16>]  [<ffffffff81405b16>]
pskb_expand_head+0x36/0x1e0
RSP: 0018:ffff880028303b70  EFLAGS: 00010202
RAX: 0000000000000002 RBX: ffff880c6458ec80 RCX: 0000000000000020
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880c6458ec80
RBP: ffff880028303bc0 R08: ffffffff818a6180 R09: ffff880c6458ed64
R10: ffff880c622b36c0 R11: 0000000000000400 R12: 0000000000000000
R13: 0000000000000180 R14: ffff880c622b3000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff880028300000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00000038653452a4 CR3: 0000000001001000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 0, threadinfo ffff8806649c2000, task ffff880c64f16ab0)
Stack:
 ffff880028303bc0 ffffffff8104fff9 000000000000001c 0000000100000000
<0> ffff880000047d80 ffff880c6458ec80 000000000000001c ffff880c6223da00
<0> ffff880c622b3000 0000000000000000 ffff880028303c10 ffffffff81407f7a
Call Trace:
<IRQ>
 [<ffffffff8104fff9>] ? __wake_up_common+0x59/0x90
 [<ffffffff81407f7a>] __pskb_pull_tail+0x2aa/0x360
 [<ffffffffa0244530>] bond_arp_rcv+0x2c0/0x2e0 [bonding]
 [<ffffffff814a0857>] ? packet_rcv+0x377/0x440
 [<ffffffff8140f21b>] netif_receive_skb+0x2db/0x670
 [<ffffffff8140f788>] napi_skb_finish+0x58/0x70
 [<ffffffff8140fc89>] napi_gro_receive+0x39/0x50
 [<ffffffffa01286eb>] ixgbe_clean_rx_irq+0x35b/0x900 [ixgbe]
 [<ffffffffa01290f6>] ixgbe_clean_rxtx_many+0x136/0x240 [ixgbe]
 [<ffffffff8140fe53>] net_rx_action+0x103/0x210
 [<ffffffff81073bd7>] __do_softirq+0xb7/0x1e0
 [<ffffffff810d8740>] ? handle_IRQ_event+0x60/0x170
 [<ffffffff810142cc>] call_softirq+0x1c/0x30
 [<ffffffff81015f35>] do_softirq+0x65/0xa0
 [<ffffffff810739d5>] irq_exit+0x85/0x90
 [<ffffffff814cf915>] do_IRQ+0x75/0xf0
 [<ffffffff81013ad3>] ret_from_intr+0x0/0x11
 <EOI>
 [<ffffffff8101bc01>] ? mwait_idle+0x71/0xd0
 [<ffffffff814cd80a>] ? atomic_notifier_call_chain+0x1a/0x20
 [<ffffffff81011e96>] cpu_idle+0xb6/0x110
 [<ffffffff814c17c8>] start_secondary+0x1fc/0x23f

Resulted from bonding driver registering packet handlers via dev_add_pack and
then trying to call pskb_may_pull. If another packet handler (like for AF_PACKET
sockets) gets called first, the delivered skb will have a user count > 1, which
causes pskb_may_pull to BUG halt when it does its skb_shared check.  Fix this by
calling skb_share_check prior to the may_pull call sites in the bonding driver
to clone the skb when needed.  Tested by myself and the reported successfully.

Signed-off-by: Neil Horman
CC: Andy Gospodarek <andy@greyhouse.net>
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agocxgb4: fix reported state of interfaces without link
Dimitris Michailidis [Wed, 19 Jan 2011 15:29:05 +0000 (15:29 +0000)]
cxgb4: fix reported state of interfaces without link

Currently tools like ip and ifconfig report incorrect state for cxgb4
interfaces that are up but do not have link and do so until first link
establishment.  This is because the initial netif_carrier_off call is
before register_netdev and it needs to be after to be fully effective.
Fix this by moving netif_carrier_off into .ndo_open.

Signed-off-by: Dimitris Michailidis <dm@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoMerge branch 'tty-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Fri, 21 Jan 2011 00:39:23 +0000 (16:39 -0800)]
Merge branch 'tty-linus' of git://git./linux/kernel/git/gregkh/tty-2.6

* 'tty-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6:
  tty: update MAINTAINERS file due to driver movement
  tty: move drivers/serial/ to drivers/tty/serial/
  tty: move hvc drivers to drivers/tty/hvc/

13 years agoMerge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 21 Jan 2011 00:37:55 +0000 (16:37 -0800)]
Merge branch 'sched-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched, cgroup: Use exit hook to avoid use-after-free crash
  sched: Fix signed unsigned comparison in check_preempt_tick()
  sched: Replace rq->bkl_count with rq->rq_sched_info.bkl_count
  sched, autogroup: Fix CONFIG_RT_GROUP_SCHED sched_setscheduler() failure
  sched: Display autogroup names in /proc/sched_debug
  sched: Reinstate group names in /proc/sched_debug
  sched: Update effective_load() to use global share weights

13 years agoMerge branch 'xen/xenbus' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen
Linus Torvalds [Fri, 21 Jan 2011 00:37:28 +0000 (16:37 -0800)]
Merge branch 'xen/xenbus' of git://git./linux/kernel/git/jeremy/xen

* 'xen/xenbus' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen:
  xenbus: Fix memory leak on release
  xenbus: avoid zero returns from read()
  xenbus: add missing wakeup in concurrent read/write
  xenbus: allow any xenbus command over /proc/xen/xenbus
  xenfs/xenbus: report partial reads/writes correctly

13 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6
Linus Torvalds [Fri, 21 Jan 2011 00:36:38 +0000 (16:36 -0800)]
Merge branch 'master' of git://git./linux/kernel/git/sfrench/cifs-2.6

* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  cifs: mangle existing header for SMB_COM_NT_CANCEL
  cifs: remove code for setting timeouts on requests
  [CIFS] cifs: reconnect unresponsive servers
  cifs: set up recurring workqueue job to do SMB echo requests
  cifs: add ability to send an echo request
  cifs: add cifs_call_async
  cifs: allow for different handling of received response
  cifs: clean up sync_mid_result
  cifs: don't reconnect server when we don't get a response
  cifs: wait indefinitely for responses
  cifs: Use mask of ACEs for SID Everyone to calculate all three permissions user, group, and other
  cifs: Fix regression during share-level security mounts (Repost)
  [CIFS] Update cifs version number
  cifs: move mid result processing into common function
  cifs: move locked sections out of DeleteMidQEntry and AllocMidQEntry
  cifs: clean up accesses to midCount
  cifs: make wait_for_free_request take a TCP_Server_Info pointer
  cifs: no need to mark smb_ses_list as cifs_demultiplex_thread is exiting
  cifs: don't fail writepages on -EAGAIN errors
  CIFS: Fix oplock break handling (try #2)

13 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
Linus Torvalds [Fri, 21 Jan 2011 00:31:20 +0000 (16:31 -0800)]
Merge git://git./linux/kernel/git/rusty/linux-2.6-for-linus

* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
  virtio: remove virtio-pci root device
  LGUEST_GUEST: fix unmet direct dependencies (VIRTUALIZATION && VIRTIO)
  lguest: compile fixes
  lguest: Use this_cpu_ops
  lguest: document --rng in example Launcher
  lguest: example launcher to use guard pages, drop PROT_EXEC, fix limit logic
  lguest: --username and --chroot options

13 years agoMerge branch 'for-38-rc2' of git://codeaurora.org/quic/kernel/davidb/linux-msm
Linus Torvalds [Fri, 21 Jan 2011 00:30:22 +0000 (16:30 -0800)]
Merge branch 'for-38-rc2' of git://codeaurora.org/quic/kernel/davidb/linux-msm

* 'for-38-rc2' of git://codeaurora.org/quic/kernel/davidb/linux-msm:
  msm: qsd8x50: Platform data isn't init data

13 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris...
Linus Torvalds [Fri, 21 Jan 2011 00:29:43 +0000 (16:29 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jmorris/security-testing-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
  trusted-keys: avoid scattring va_end()
  trusted-keys: check for NULL before using it
  trusted-keys: another free memory bugfix
  trusted-keys: free memory bugfix

13 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes
Linus Torvalds [Fri, 21 Jan 2011 00:29:04 +0000 (16:29 -0800)]
Merge git://git./linux/kernel/git/steve/gfs2-2.6-fixes

* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes:
  GFS2: Fix error path in gfs2_lookup_by_inum()
  GFS2: remove iopen glocks from cache on failed deletes

13 years agoMerge branch 'acpica' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux...
Linus Torvalds [Fri, 21 Jan 2011 00:28:34 +0000 (16:28 -0800)]
Merge branch 'acpica' of git://git./linux/kernel/git/lenb/linux-acpi-2.6

* 'acpica' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
  ACPICA: Update version to 20110112
  ACPICA: Update all ACPICA copyrights and signons to 2011
  ACPICA: Fix issues/fault with automatic "serialized" method support
  ACPICA: Debugger: Lock namespace for duration of a namespace dump
  ACPICA: Fix namespace race condition
  ACPICA: Fix memory leak in acpi_ev_asynch_execute_gpe_method().

13 years agoFix broken "pipe: use event aware wakeups" optimization
Linus Torvalds [Fri, 21 Jan 2011 00:21:59 +0000 (16:21 -0800)]
Fix broken "pipe: use event aware wakeups" optimization

Commit e462c448fdc8 ("pipe: use event aware wakeups") optimized the pipe
event wakeup calls to avoid wakeups if the events do not match the
requested set.

However, the optimization was buggy, in that it didn't actually use the
correct sets for the events: when we make room for more data to be
written, the pipe poll() routine will return both the POLLOUT _and_
POLLWRNORM bits.  Similarly for read.

And most critically, when a pipe is released, that will potentially
result in POLLHUP|POLLERR (depending on whether it was the last reader
or writer), not just the regular POLLIN|POLLOUT.

This bug showed itself as a hung gnome-screensaver-dialog process, stuck
forever (or at least until it was poked by a signal or by being traced)
in a poll() system call.

Cc: Davide Libenzi <davidel@xmailserver.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agoi915: Fix i915 suspend delay
Linus Torvalds [Thu, 20 Jan 2011 21:19:55 +0000 (13:19 -0800)]
i915: Fix i915 suspend delay

During system suspend, the "wait for ring buffer to empty" loop would
always time out after three seconds, because the faster cached ring
buffer head read would always return zero.  Force the slow-and-careful
PIO read on all but the first iterations of the loop to fix it.

This also removes the unused (and useless) 'actual_head' variable that
tried to approximate doing this, but did it incorrectly.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Dave Airlie <airlied@linux.ie>
Cc: DRI mailing list <dri-devel@lists.freedesktop.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agoMerge remote branch 'kumar/next' into merge
Benjamin Herrenschmidt [Fri, 21 Jan 2011 00:00:44 +0000 (11:00 +1100)]
Merge remote branch 'kumar/next' into merge

13 years agofirewire: net: is not experimental anymore
Stefan Richter [Wed, 19 Jan 2011 23:07:46 +0000 (00:07 +0100)]
firewire: net: is not experimental anymore

thanks to Clemens' and Maxim's fixes to firewire-ohci and -net in the
last two kernel releases.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
13 years agofirewire: net: invalidate ARP entries of removed nodes
Maxim Levitsky [Mon, 29 Nov 2010 02:09:52 +0000 (04:09 +0200)]
firewire: net: invalidate ARP entries of removed nodes

This makes it possible to resume communication with a node that dropped
off the bus for a brief period.  Otherwise communication will only be
possible after ARP cache entry timeouts.

Signed-off-by: Maxim Levitsky <maximlevitsky@gmail.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> (rebased)
13 years agofirewire: core: fix unstable I/O with Canon camcorder
Stefan Richter [Sat, 15 Jan 2011 17:19:48 +0000 (18:19 +0100)]
firewire: core: fix unstable I/O with Canon camcorder

Regression since commit 10389536742c, "firewire: core: check for 1394a
compliant IRM, fix inaccessibility of Sony camcorder":

The camcorder Canon MV5i generates lots of bus resets when asynchronous
requests are sent to it (e.g. Config ROM read requests or FCP Command
write requests) if the camcorder is not root node.  This causes drop-
outs in videos or makes the camcorder entirely inaccessible.
https://bugzilla.redhat.com/show_bug.cgi?id=633260

Fix this by allowing any Canon device, even if it is a pre-1394a IRM
like MV5i are, to remain root node (if it is at least Cycle Master
capable).  With the FireWire controller cards that I tested, MV5i always
becomes root node when plugged in and left to its own devices.

Reported-by: Ralf Lange
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: <stable@kernel.org> # 2.6.32.y and newer
13 years agocifs: fix unaligned accesses in cifsConvertToUCS
Jeff Layton [Thu, 20 Jan 2011 18:36:51 +0000 (13:36 -0500)]
cifs: fix unaligned accesses in cifsConvertToUCS

Move cifsConvertToUCS to cifs_unicode.c where all of the other unicode
related functions live. Have it store mapped characters in 'temp' and
then use put_unaligned_le16 to copy it to the target buffer. Also fix
the comments to match kernel coding style.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Pavel Shilovsky <piastryyy@gmail.com>
Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
13 years agocifs: clean up unaligned accesses in cifs_unicode.c
Jeff Layton [Thu, 20 Jan 2011 18:36:51 +0000 (13:36 -0500)]
cifs: clean up unaligned accesses in cifs_unicode.c

Make sure we use get/put_unaligned routines when accessing wide
character strings.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Pavel Shilovsky <piastryyy@gmail.com>
Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
13 years agocifs: fix unaligned access in check2ndT2 and coalesce_t2
Jeff Layton [Thu, 20 Jan 2011 18:36:51 +0000 (13:36 -0500)]
cifs: fix unaligned access in check2ndT2 and coalesce_t2

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Pavel Shilovsky <piastryyy@gmail.com>
Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
13 years agocifs: clean up unaligned accesses in validate_t2
Jeff Layton [Thu, 20 Jan 2011 18:36:51 +0000 (13:36 -0500)]
cifs: clean up unaligned accesses in validate_t2

...and clean up function to reduce indentation.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Pavel Shilovsky <piastryyy@gmail.com>
Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
13 years agocifs: use get/put_unaligned functions to access ByteCount
Jeff Layton [Thu, 20 Jan 2011 18:36:51 +0000 (13:36 -0500)]
cifs: use get/put_unaligned functions to access ByteCount

It's possible that when we access the ByteCount that the alignment
will be off. Most CPUs deal with that transparently, but there's
usually some performance impact. Some CPUs raise an exception on
unaligned accesses.

Fix this by accessing the byte count using the get_unaligned and
put_unaligned inlined functions. While we're at it, fix the types
of some of the variables that end up getting returns from these
functions.

Acked-by: Pavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
13 years agocifs: move time field in cifsInodeInfo
Jeff Layton [Thu, 20 Jan 2011 18:36:50 +0000 (13:36 -0500)]
cifs: move time field in cifsInodeInfo

...and remove length qualifiers from bools.

Before:

/* size: 1176, cachelines: 19, members: 13 */
/* sum members: 1165, holes: 2, sum holes: 11 */
/* bit holes: 1, sum bit holes: 4 bits */
/* last cacheline: 24 bytes */

After:

/* size: 1168, cachelines: 19, members: 13 */
/* last cacheline: 16 bytes */

...savings of 8 bytes per inode.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
13 years agocifs: TCP_Server_Info diet
Jeff Layton [Thu, 20 Jan 2011 18:36:50 +0000 (13:36 -0500)]
cifs: TCP_Server_Info diet

Remove fields that are completely unused, and rearrange struct
according to recommendations by "pahole".

Before:

/* size: 1112, cachelines: 18, members: 49 */
/* sum members: 1086, holes: 8, sum holes: 26 */
/* bit holes: 1, sum bit holes: 7 bits */
/* last cacheline: 24 bytes */

After:

/* size: 1072, cachelines: 17, members: 42 */
/* sum members: 1065, holes: 3, sum holes: 7 */
/* last cacheline: 48 bytes */

...savings of 40 bytes per struct on x86_64. 21 bytes by field removal,
and 19 by reorganizing to eliminate holes.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
13 years agoCIFS: Implement cifs_strict_readv (try #4)
Pavel Shilovsky [Tue, 14 Dec 2010 08:50:41 +0000 (11:50 +0300)]
CIFS: Implement cifs_strict_readv (try #4)

Read from the cache if we have at least Level II oplock - otherwise
read from the server. Add cifs_user_readv to let the client read into
iovec buffers.

Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Pavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
13 years agoCIFS: Implement cifs_file_strict_mmap (try #2)
Pavel Shilovsky [Tue, 14 Dec 2010 08:29:51 +0000 (11:29 +0300)]
CIFS: Implement cifs_file_strict_mmap (try #2)

Invalidate inode mapping if we don't have at least Level II oplock.

Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Pavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
13 years agoCIFS: Implement cifs_strict_fsync
Pavel Shilovsky [Sun, 12 Dec 2010 10:11:13 +0000 (13:11 +0300)]
CIFS: Implement cifs_strict_fsync

Invalidate inode mapping if we don't have at least Level II oplock in
cifs_strict_fsync. Also remove filemap_write_and_wait call from cifs_fsync
because it is previously called from vfs_fsync_range. Add file operations'
structures for strict cache mode.

Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Pavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
13 years agoCIFS: Make cifsFileInfo_put work with strict cache mode
Pavel Shilovsky [Sun, 21 Nov 2010 19:36:12 +0000 (22:36 +0300)]
CIFS: Make cifsFileInfo_put work with strict cache mode

On strict cache mode when we close the last file handle of the inode we
should set invalid_mapping flag on this inode to prevent data coherency
problem when we open it again but it has been modified on the server.

Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Pavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
13 years agoACPI / Battery: remove battery refresh on resume
Linus Torvalds [Thu, 20 Jan 2011 21:14:10 +0000 (13:14 -0800)]
ACPI / Battery: remove battery refresh on resume

This partially reverts commit da8aeb92d4853f37e281f11fddf61f9c7d84c3cd
("ACPI / Battery: Update information on info notification and resume"),
which causes a hang on resume on at least some machines.

This bug was bisected on an ASUS EeePC 901, which hangs at resume time
if we do that "acpi_battery_refresh(battery)" in the battery resume
function.

Rafael suspects we'll still need to refresh the sysfs files upon resume,
but that that can be done from a PM notifier (that will run after
thawing user space).

Bisected-and-tested-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew Garrett <mjg@redhat.com>
Cc: Len Brown <len.brown@intel.com>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agoxen: fix non-ANSI function warning in irq.c
Randy Dunlap [Sun, 9 Jan 2011 04:00:36 +0000 (20:00 -0800)]
xen: fix non-ANSI function warning in irq.c

Fix sparse warning for non-ANSI function declaration:

arch/x86/xen/irq.c:129:30: warning: non-ANSI function declaration of function 'xen_init_irq_ops'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
13 years agocifs: mangle existing header for SMB_COM_NT_CANCEL
Jeff Layton [Tue, 11 Jan 2011 12:24:24 +0000 (07:24 -0500)]
cifs: mangle existing header for SMB_COM_NT_CANCEL

The NT_CANCEL command looks just like the original command, except for a
few small differences. The send_nt_cancel function however currently takes
a tcon, which we don't have in SendReceive and SendReceive2.

Instead of "respinning" the entire header for an NT_CANCEL, just mangle
the existing header by replacing just the fields we need. This means we
don't need a tcon and allows us to call it from other places.

Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com>
Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
13 years agocifs: remove code for setting timeouts on requests
Jeff Layton [Tue, 11 Jan 2011 12:24:23 +0000 (07:24 -0500)]
cifs: remove code for setting timeouts on requests

Since we don't time out individual requests anymore, remove the code
that we used to use for setting timeouts on different requests.

Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com>
Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
13 years ago[CIFS] cifs: reconnect unresponsive servers
Steve French [Thu, 20 Jan 2011 18:06:34 +0000 (18:06 +0000)]
[CIFS] cifs: reconnect unresponsive servers

If the server isn't responding to echoes, we don't want to leave tasks
hung waiting for it to reply. At that point, we'll want to reconnect
so that soft mounts can return an error to userspace quickly.

If the client hasn't received a reply after a specified number of echo
intervals, assume that the transport is down and attempt to reconnect
the socket.

The number of echo_intervals to wait before attempting to reconnect is
tunable via a module parameter. Setting it to 0, means that the client
will never attempt to reconnect. The default is 5.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
13 years agocifs: set up recurring workqueue job to do SMB echo requests
Jeff Layton [Tue, 11 Jan 2011 12:24:23 +0000 (07:24 -0500)]
cifs: set up recurring workqueue job to do SMB echo requests

Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
13 years agocifs: add ability to send an echo request
Jeff Layton [Tue, 11 Jan 2011 12:24:21 +0000 (07:24 -0500)]
cifs: add ability to send an echo request

Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
13 years agocifs: add cifs_call_async
Jeff Layton [Tue, 11 Jan 2011 12:24:21 +0000 (07:24 -0500)]
cifs: add cifs_call_async

Add a function that will send a request, and set up the mid for an
async reply.

Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
13 years agocifs: allow for different handling of received response
Jeff Layton [Tue, 11 Jan 2011 12:24:21 +0000 (07:24 -0500)]
cifs: allow for different handling of received response

In order to incorporate async requests, we need to allow for a more
general way to do things on receive, rather than just waking up a
process.

Turn the task pointer in the mid_q_entry into a callback function and a
generic data pointer. When a response comes in, or the socket is
reconnected, cifsd can call the callback function in order to wake up
the process.

The default is to just wake up the current process which should mean no
change in behavior for existing code.

Also, clean up the locking in cifs_reconnect. There doesn't seem to be
any need to hold both the srv_mutex and GlobalMid_Lock when walking the
list of mids.

Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
13 years agocifs: clean up sync_mid_result
Jeff Layton [Tue, 11 Jan 2011 12:24:02 +0000 (07:24 -0500)]
cifs: clean up sync_mid_result

Make it use a switch statement based on the value of the midStatus. If
the resp_buf is set, then MID_RESPONSE_RECEIVED is too.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
13 years agocifs: don't reconnect server when we don't get a response
Jeff Layton [Tue, 11 Jan 2011 12:24:02 +0000 (07:24 -0500)]
cifs: don't reconnect server when we don't get a response

We only want to force a reconnect to the server under very limited and
specific circumstances. Now that we have processes waiting indefinitely
for responses, we shouldn't reach this point unless a reconnect is
already in process. Thus, there's no reason to re-mark the server for
reconnect here.

Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
13 years agocifs: wait indefinitely for responses
Jeff Layton [Tue, 11 Jan 2011 12:24:02 +0000 (07:24 -0500)]
cifs: wait indefinitely for responses

The client should not be timing out on individual SMB requests. Too much
of the state between client and server is tied to the state of the
socket. If we time out requests and issue spurious disconnects then that
comprimises data integrity.

Instead of doing this complicated dance where we try to decide how long
to wait for a response for particular requests, have the client instead
wait indefinitely for a response. Also, use a TASK_KILLABLE sleep here
so that fatal signals will break out of this waiting.

Later patches will add support for detecting dead peers and forcing
reconnects based on that.

Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de>
Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>