pandora-kernel.git
19 years ago[SPARC64]: Do not include compat.h from asm-sparc64/signal.h any more.
David S. Miller [Mon, 2 Oct 2006 21:30:45 +0000 (14:30 -0700)]
[SPARC64]: Do not include compat.h from asm-sparc64/signal.h any more.

It's not needed, now that all of that stuff is now in
asm/compat_signal.h, and it breaks the build too :-)

Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[SPARC64]: Move signal compat bits to new header file.
David S. Miller [Mon, 2 Oct 2006 21:17:57 +0000 (14:17 -0700)]
[SPARC64]: Move signal compat bits to new header file.

Create asm-sparc64/compat_signal.h and stuff things there.

This avoids the "linux/compat.h includes asm/signal.h but
asm/signal.h needs compat_sigset_t which isn't defined yet"
problems introduced recently.

Signed-off-by: David S. Miller <davem@davemloft.net>
19 years agoAdd prototype for sigset_from_compat()
Linus Torvalds [Mon, 2 Oct 2006 21:05:20 +0000 (14:05 -0700)]
Add prototype for sigset_from_compat()

Duh.  I screwed up editing David Howells patch in commit
3f2e05e90e0846c42626e3d272454f26be34a1bc, and the actual declaration for
the sigset_from_compat() function went missing. My bad.

Olaf Hering saved the day and noticed that I'm a moron.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years agoMerge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik...
Linus Torvalds [Mon, 2 Oct 2006 15:56:58 +0000 (08:56 -0700)]
Merge branch 'upstream-linus' of /linux/kernel/git/jgarzik/netdev-2.6

* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6: (37 commits)
  [netdrvr] hp100: encapsulate all non-module code
  drivers/net/wireless/{airo,ipw2100}: fix error handling bugs
  [netdrvr] phy: Fix bugs in error handling
  [PATCH] spidernet: Use pci_dma_mapping_error()
  [PATCH] sky2: version 1.9
  [PATCH] sky2: fragmented receive for large MTU
  [PATCH] sky2: use netif_tx_lock instead of LLTX
  [PATCH] sky2: incremental transmit completion
  [PATCH] sky2: name irq after eth for irqbalance
  [PATCH] sky2: workarounds for some 88e806x chips
  [PATCH] sky2: use standard pci register capabilties for error register
  [PATCH] sky2: gigabit full duplex negotiation
  e100, e1000, ixgb: increment version numbers
  ixgb: convert to netdev_priv(netdev)
  ixgb: combine more rx descriptors to improve performance
  e1000: possible memory leak in e1000_set_ringparam
  e1000: Janitor: Use #defined values for literals
  e1000: don't strip vlan ID if 8021q claims it
  e1000: rework polarity, NVM, eeprom code and fixes.
  e1000: driver state fixes (race fix)
  ...

19 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shaggy...
Linus Torvalds [Mon, 2 Oct 2006 15:27:09 +0000 (08:27 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/shaggy/jfs-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shaggy/jfs-2.6:
  JFS: White space cleanup
  [PATCH] JFS: return correct error when i-node allocation failed
  JFS: Remove shadow variable from fs/jfs/jfs_txnmgr.c:xtLog()

19 years agoMerge git://git.infradead.org/mtd-2.6
Linus Torvalds [Mon, 2 Oct 2006 15:22:17 +0000 (08:22 -0700)]
Merge git://git.infradead.org/mtd-2.6

* git://git.infradead.org/mtd-2.6:
  [MTD] Cleanup of 'ioremap balanced with iounmap for drivers/mtd subsystem'
  [MTD] fix nftl_write warning
  [MTD] fix printk warning
  [MTD ONENAND] Check OneNAND lock scheme & all block unlock command support
  [MTD ONENAND] Remove unused MTD_ONENAND_SYNC_READ configuration
  [MTD ONENAND] Fix OneNAND probe
  [MTD NAND] Provide prototype for newly-exported nand_wait_ready()
  [MTD] Remove #ifndef __KERNEL__ hack in <mtd/mtd-abi.h>
  [MTD NAND] Allow override of page read and write functions.
  [MTD NAND] Allocate chip->buffers separately to allow it to be overridden
  [MTD NAND] Split nand_scan() into two parts; allow board driver to intervene
  [MTD NAND] Export nand_wait_ready() for use by board drivers

19 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Linus Torvalds [Mon, 2 Oct 2006 15:20:33 +0000 (08:20 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (35 commits)
  Input: wistron - add support for Acer TravelMate 2424NWXCi
  Input: wistron - fix setting up special buttons
  Input: add KEY_BLUETOOTH and KEY_WLAN definitions
  Input: add new BUS_VIRTUAL bus type
  Input: add driver for stowaway serial keyboards
  Input: make input_register_handler() return error codes
  Input: remove cruft that was needed for transition to sysfs
  Input: fix input module refcounting
  Input: constify input core
  Input: libps2 - rearrange exports
  Input: atkbd - support Microsoft Natural Elite Pro keyboards
  Input: i8042 - disable MUX mode on Toshiba Equium A110
  Input: i8042 - get rid of polling timer
  Input: send key up events at disconnect
  Input: constify psmouse driver
  Input: i8042 - add Amoi to the MUX blacklist
  Input: logips2pp - add sugnature 56 (Cordless MouseMan Wheel), cleanup
  Input: add driver for Touchwin serial touchscreens
  Input: add driver for Touchright serial touchscreens
  Input: add driver for Penmount serial touchscreens
  ...

19 years agoMerge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus
Linus Torvalds [Mon, 2 Oct 2006 15:18:43 +0000 (08:18 -0700)]
Merge branch 'upstream' of git://ftp.linux-mips.org/upstream-linus

* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
  [MIPS] Remove unused galileo-boars header files
  [MIPS] Rename SERIAL_PORT_DEFNS for EV64120
  [MIPS] Add UART IRQ number for EV64120
  [MIPS] Remove excite_flash.c
  [MIPS] Update i8259 resources.
  [MIPS] Make unwind_stack() can dig into interrupted context
  [MIPS] Stacktrace build-fix and improvement
  [MIPS] QEMU: Add support for little endian mips
  [MIPS] Remove __flush_icache_page
  [MIPS] lockdep: update defconfigs
  [MIPS] lockdep: Add STACKTRACE_SUPPORT and enable LOCKDEP_SUPPORT
  [MIPS] lockdep: fix TRACE_IRQFLAGS_SUPPORT

19 years ago[PATCH] BLOCK: Revert patch to hack around undeclared sigset_t in linux/compat.h
David Howells [Mon, 2 Oct 2006 13:12:31 +0000 (14:12 +0100)]
[PATCH] BLOCK: Revert patch to hack around undeclared sigset_t in linux/compat.h

Revert Andrew Morton's patch to temporarily hack around the lack of a
declaration of sigset_t in linux/compat.h to make the block-disablement
patches build on IA64.  This got accidentally pushed to Linus and should
be fixed in a different manner.

Also make linux/compat.h #include asm/signal.h to gain a definition of
sigset_t so that it can externally declare sigset_from_compat().

This has been compile-tested for i386, x86_64, ia64, mips, mips64, frv, ppc and
ppc64 and run-tested on frv.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] replace cad_pid by a struct pid
Cedric Le Goater [Mon, 2 Oct 2006 09:19:00 +0000 (02:19 -0700)]
[PATCH] replace cad_pid by a struct pid

There are a few places in the kernel where the init task is signaled.  The
ctrl+alt+del sequence is one them.  It kills a task, usually init, using a
cached pid (cad_pid).

This patch replaces the pid_t by a struct pid to avoid pid wrap around
problem.  The struct pid is initialized at boot time in init() and can be
modified through systctl with

/proc/sys/kernel/cad_pid

[ I haven't found any distro using it ? ]

It also introduces a small helper routine kill_cad_pid() which is used
where it seemed ok to use cad_pid instead of pid 1.

[akpm@osdl.org: cleanups, build fix]
Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] introduce get_task_pid() to fix unsafe get_pid()
Oleg Nesterov [Mon, 2 Oct 2006 09:18:59 +0000 (02:18 -0700)]
[PATCH] introduce get_task_pid() to fix unsafe get_pid()

proc_pid_make_inode:

ei->pid = get_pid(task_pid(task));

I think this is not safe.  get_pid() can be preempted after checking "pid
!= NULL".  Then the task exits, does detach_pid(), and RCU frees the pid.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] proc: comment what proc_fill_cache does
Eric W. Biederman [Mon, 2 Oct 2006 09:18:57 +0000 (02:18 -0700)]
[PATCH] proc: comment what proc_fill_cache does

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] proc: remove the useless SMP-safe comments from /proc
Eric W. Biederman [Mon, 2 Oct 2006 09:18:57 +0000 (02:18 -0700)]
[PATCH] proc: remove the useless SMP-safe comments from /proc

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] proc: remove trailing blank entry from pid_entry arrays
Eric W. Biederman [Mon, 2 Oct 2006 09:18:56 +0000 (02:18 -0700)]
[PATCH] proc: remove trailing blank entry from pid_entry arrays

It was pointed out that since I am taking ARRAY_SIZE anyway the trailing empty
entry is silly and just wastes space.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] proc: properly compute TGID_OFFSET
Eric W. Biederman [Mon, 2 Oct 2006 09:18:55 +0000 (02:18 -0700)]
[PATCH] proc: properly compute TGID_OFFSET

The value doesn't change but this ensures I will have the proper value when
other files are added to proc_base_stuff.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] proc: drop tasklist lock in task_state()
Oleg Nesterov [Mon, 2 Oct 2006 09:18:54 +0000 (02:18 -0700)]
[PATCH] proc: drop tasklist lock in task_state()

task_state() needs tasklist_lock to protect ->parent/->real_parent.  However
task->parent points to nowhere only when the actions below happen in order

1) release_task(task)
2) release_task(task->parent)
3) a grace period passed

But 3) implies that the memory ops from 1) should be finished, so pid_alive()
can't be true in such a case.

Otherwise, we don't care if ->parent/->real_parent changes under us.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] proc: convert do_task_stat() to use lock_task_sighand()
Oleg Nesterov [Mon, 2 Oct 2006 09:18:53 +0000 (02:18 -0700)]
[PATCH] proc: convert do_task_stat() to use lock_task_sighand()

Drop tasklist_lock. ->siglock protects almost all interesting data
(including sub-threads traversal) except:

->signal->tty
protected by tty_mutex

->real_parent
the task can't be unhashed while we are holding
->siglock, so ->real_parent can change from under us
but we can safely dereference it under rcu_read_lock()

->pgrp/->session
we can get inconsistent numbers if the task does
sys_setsid/daemonize at the same time. I hope this
is acceptable.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] proc: convert task_sig() to use lock_task_sighand()
Oleg Nesterov [Mon, 2 Oct 2006 09:18:52 +0000 (02:18 -0700)]
[PATCH] proc: convert task_sig() to use lock_task_sighand()

lock_task_sighand() can take ->siglock without holding tasklist_lock.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] proc: Use pid_task instead of open coding it
Eric W. Biederman [Mon, 2 Oct 2006 09:18:51 +0000 (02:18 -0700)]
[PATCH] proc: Use pid_task instead of open coding it

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] proc: Merge proc_tid_attr and proc_tgid_attr
Eric W. Biederman [Mon, 2 Oct 2006 09:18:50 +0000 (02:18 -0700)]
[PATCH] proc: Merge proc_tid_attr and proc_tgid_attr

The implementation is exactly the same and there is currently nothing to
distinguish proc_tid_attr, and proc_tgid_attr.  So it is pointless to have two
separate implementations.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] proc: Remove the hard coded inode numbers
Eric W. Biederman [Mon, 2 Oct 2006 09:18:49 +0000 (02:18 -0700)]
[PATCH] proc: Remove the hard coded inode numbers

The hard coded inode numbers in proc currently limit its maintainability,
its flexibility, and what can be done with the rest of system.  /proc limits
pid-max to 32768 on 32 bit systems it limits fd-max to 32768 on all systems,
and placing the pid in the inode number really gets in the way of implementing
subdirectories of per process information.

Ever since people started adding to the middle of the file type enumeration we
haven't been maintaing the historical inode numbers, all we have really
succeeded in doing is keeping the pid in the proc inode number.  The pid is
already available in the directory name so no information is lost removing it
from the inode number.

So if something in user space cares if we remove the inode number from the
/proc inode it is almost certainly broken.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] proc: Factor out an instantiate method from every lookup method
Eric W. Biederman [Mon, 2 Oct 2006 09:18:49 +0000 (02:18 -0700)]
[PATCH] proc: Factor out an instantiate method from every lookup method

To remove the hard coded proc inode numbers it is necessary to be able to
create the proc inodes during readdir.  The instantiate methods are the subset
of lookup that is needed to accomplish that.

This first step just splits the lookup methods into 2 functions.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] proc: Make the generation of the self symlink table driven
Eric W. Biederman [Mon, 2 Oct 2006 09:18:48 +0000 (02:18 -0700)]
[PATCH] proc: Make the generation of the self symlink table driven

This patch generalizes the concept of files in /proc that are related to
processes but live in the root directory of /proc

Ideally this would reuse infrastructure from the rest of the process specific
parts of proc but unfortunately security_task_to_inode must not be called on
files that are not strictly per process.  security_task_to_inode really needs
to be reexamined as the security label can change in important places that we
are not currently catching, but I'm not certain that simplifies this problem.

By at least matching the structure of the rest of proc we get more idiom reuse
and it becomes easier to spot problems in the way things are put together.

Later things like /proc/mounts are likely to be moved into proc_base as well.
If union mounts are ever supported we may be able to make /proc a union mount,
and properly split it into 2 filesystems.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] AVR32: Implement kernel_execve
Haavard Skinnemoen [Mon, 2 Oct 2006 09:18:46 +0000 (02:18 -0700)]
[PATCH] AVR32: Implement kernel_execve

Move execve() into arch/avr32/kernel/sys_avr32.c, rename it to
kernel_execve() and return the syscall return value directly without
setting errno.

This also gets rid of the __KERNEL_SYSCALLS__ stuff from unistd.h and
expands #ifdef __KERNEL__ to cover everything in unistd.h except the
__NR_foo definitions.

Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] remove remaining errno and __KERNEL_SYSCALLS__ references
Arnd Bergmann [Mon, 2 Oct 2006 09:18:44 +0000 (02:18 -0700)]
[PATCH] remove remaining errno and __KERNEL_SYSCALLS__ references

The last in-kernel user of errno is gone, so we should remove the definition
and everything referring to it.  This also removes the now-unused lib/execve.c
file that was introduced earlier.

Also remove every trace of __KERNEL_SYSCALLS__ that still remained in the
kernel.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Andi Kleen <ak@muc.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ian Molton <spyro@f2s.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Hirokazu Takata <takata.hirokazu@renesas.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
Cc: Richard Curnow <rc@rc0.org.uk>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
Cc: Chris Zankel <chris@zankel.net>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] sh64: remove the use of kernel syscalls
Arnd Bergmann [Mon, 2 Oct 2006 09:18:41 +0000 (02:18 -0700)]
[PATCH] sh64: remove the use of kernel syscalls

sh64 is using system call macros to call some functions from the kernel.

The old debug code can simply be removed, since we don't really have that much
of a need for it anymore, it was mostly something that was handy during the
initial bringup.  This also brings us closer to something that looks like
readable code again..

I also added a sane kernel_thread() implementation that gets away from this,
so that should take care of sh64 at least.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Andi Kleen <ak@muc.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ian Molton <spyro@f2s.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Hirokazu Takata <takata.hirokazu@renesas.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
Cc: Richard Curnow <rc@rc0.org.uk>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
Cc: Chris Zankel <chris@zankel.net>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] Remove the use of _syscallX macros in UML
Arnd Bergmann [Mon, 2 Oct 2006 09:18:37 +0000 (02:18 -0700)]
[PATCH] Remove the use of _syscallX macros in UML

User mode linux uses _syscallX() to call into the host kernel.  The
recommended way to do this is to use the syscall() function from libc.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Andi Kleen <ak@muc.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ian Molton <spyro@f2s.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Hirokazu Takata <takata.hirokazu@renesas.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
Cc: Richard Curnow <rc@rc0.org.uk>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
Cc: Chris Zankel <chris@zankel.net>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] provide kernel_execve on all architectures
Arnd Bergmann [Mon, 2 Oct 2006 09:18:34 +0000 (02:18 -0700)]
[PATCH] provide kernel_execve on all architectures

This adds the new kernel_execve function on all architectures that were using
_syscall3() to implement execve.

The implementation uses code from the _syscall3 macros provided in the
unistd.h header file.  I don't have cross-compilers for any of these
architectures, so the patch is untested with the exception of i386.

Most architectures can probably implement this in a nicer way in assembly or
by combining it with the sys_execve implementation itself, but this should do
it for now.

[bunk@stusta.de: m68knommu build fix]
[markh@osdl.org: build fix]
[bero@arklinux.org: build fix]
[ralf@linux-mips.org: mips fix]
[schwidefsky@de.ibm.com: s390 fix]
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Andi Kleen <ak@muc.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ian Molton <spyro@f2s.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Hirokazu Takata <takata.hirokazu@renesas.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
Cc: Richard Curnow <rc@rc0.org.uk>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
Cc: Chris Zankel <chris@zankel.net>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Bernhard Rosenkraenzer <bero@arklinux.org>
Signed-off-by: Mark Haverkamp <markh@osdl.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] rename the provided execve functions to kernel_execve
Arnd Bergmann [Mon, 2 Oct 2006 09:18:31 +0000 (02:18 -0700)]
[PATCH] rename the provided execve functions to kernel_execve

Some architectures provide an execve function that does not set errno, but
instead returns the result code directly.  Rename these to kernel_execve to
get the right semantics there.  Moreover, there is no reasone for any of these
architectures to still provide __KERNEL_SYSCALLS__ or _syscallN macros, so
remove these right away.

[akpm@osdl.org: build fix]
[bunk@stusta.de: build fix]
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Andi Kleen <ak@muc.de>
Acked-by: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ian Molton <spyro@f2s.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Hirokazu Takata <takata.hirokazu@renesas.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
Cc: Richard Curnow <rc@rc0.org.uk>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
Cc: Chris Zankel <chris@zankel.net>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] introduce kernel_execve
Arnd Bergmann [Mon, 2 Oct 2006 09:18:26 +0000 (02:18 -0700)]
[PATCH] introduce kernel_execve

The use of execve() in the kernel is dubious, since it relies on the
__KERNEL_SYSCALLS__ mechanism that stores the result in a global errno
variable.  As a first step of getting rid of this, change all users to a
global kernel_execve function that returns a proper error code.

This function is a terrible hack, and a later patch removes it again after the
kernel syscalls are gone.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Andi Kleen <ak@muc.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ian Molton <spyro@f2s.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Hirokazu Takata <takata.hirokazu@renesas.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
Cc: Richard Curnow <rc@rc0.org.uk>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
Cc: Chris Zankel <chris@zankel.net>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] ipc: replace kmalloc and memset in get_undo_list with kzalloc
Matt Helsley [Mon, 2 Oct 2006 09:18:25 +0000 (02:18 -0700)]
[PATCH] ipc: replace kmalloc and memset in get_undo_list with kzalloc

Simplify get_undo_list() by dropping the unnecessary cast, removing the
size variable, and switching to kzalloc() instead of a kmalloc() followed
by a memset().

This cleanup was split then modified from Jes Sorenson's Task Notifiers
patches.

Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
Cc: Jes Sorensen <jes@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] nsproxy cloning error path fix
Pavel [Mon, 2 Oct 2006 09:18:24 +0000 (02:18 -0700)]
[PATCH] nsproxy cloning error path fix

This patch fixes copy_namespaces()'s error path.

when new nsproxy (new_ns) is created pointers to namespaces (ipc, uts) are
copied from the old nsproxy.  Later in copy_utsname, copy_ipcs, etc.
according namespaces are get-ed.  On error path needed namespaces are
put-ed, so there's no need to put new nsproxy itelf as it woud cause
putting namespaces for the second time.

Found when incorporating namespaces into OpenVZ kernel.

Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] IPC namespace - sysctls
Kirill Korotaev [Mon, 2 Oct 2006 09:18:23 +0000 (02:18 -0700)]
[PATCH] IPC namespace - sysctls

Sysctl tweaks for IPC namespace

Signed-off-by: Pavel Emelianiov <xemul@openvz.org>
Signed-off-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] IPC namespace - shm
Kirill Korotaev [Mon, 2 Oct 2006 09:18:22 +0000 (02:18 -0700)]
[PATCH] IPC namespace - shm

IPC namespace support for IPC shm code.

Signed-off-by: Pavel Emelianiov <xemul@openvz.org>
Signed-off-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] IPC namespace - sem
Kirill Korotaev [Mon, 2 Oct 2006 09:18:22 +0000 (02:18 -0700)]
[PATCH] IPC namespace - sem

IPC namespace support for IPC sem code.

Signed-off-by: Pavel Emelianiov <xemul@openvz.org>
Signed-off-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] IPC namespace - msg
Kirill Korotaev [Mon, 2 Oct 2006 09:18:21 +0000 (02:18 -0700)]
[PATCH] IPC namespace - msg

IPC namespace support for IPC msg code.

Signed-off-by: Pavel Emelianiov <xemul@openvz.org>
Signed-off-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] IPC namespace - utils
Kirill Korotaev [Mon, 2 Oct 2006 09:18:20 +0000 (02:18 -0700)]
[PATCH] IPC namespace - utils

This patch adds basic IPC namespace functionality to
IPC utils:
- init_ipc_ns
- copy/clone/unshare/free IPC ns
- /proc preparations

Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Signed-off-by: Kirill Korotaev <dev@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Cedric Le Goater <clg@fr.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] IPC namespace core
Kirill Korotaev [Mon, 2 Oct 2006 09:18:19 +0000 (02:18 -0700)]
[PATCH] IPC namespace core

This patch set allows to unshare IPCs and have a private set of IPC objects
(sem, shm, msg) inside namespace.  Basically, it is another building block of
containers functionality.

This patch implements core IPC namespace changes:
- ipc_namespace structure
- new config option CONFIG_IPC_NS
- adds CLONE_NEWIPC flag
- unshare support

[clg@fr.ibm.com: small fix for unshare of ipc namespace]
[akpm@osdl.org: build fix]
Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Signed-off-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] uts: copy nsproxy only when needed
Serge Hallyn [Mon, 2 Oct 2006 09:18:18 +0000 (02:18 -0700)]
[PATCH] uts: copy nsproxy only when needed

The nsproxy was being copied in unshare() when anything was being unshared,
even if it was something not referenced from nsproxy.  This should end up
in some cases with far more memory usage than necessary.

Signed-off-by: Serge Hallyn <serue@us.ibm.com>
Cc: Kirill Korotaev <dev@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: Andrey Savochkin <saw@sw.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] namespaces: utsname: implement CLONE_NEWUTS flag
Serge E. Hallyn [Mon, 2 Oct 2006 09:18:17 +0000 (02:18 -0700)]
[PATCH] namespaces: utsname: implement CLONE_NEWUTS flag

Implement a CLONE_NEWUTS flag, and use it at clone and sys_unshare.

[clg@fr.ibm.com: IPC unshare fix]
[bunk@stusta.de: cleanup]
Signed-off-by: Serge Hallyn <serue@us.ibm.com>
Cc: Kirill Korotaev <dev@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: Andrey Savochkin <saw@sw.ru>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] namespaces: utsname: remove system_utsname
Serge E. Hallyn [Mon, 2 Oct 2006 09:18:16 +0000 (02:18 -0700)]
[PATCH] namespaces: utsname: remove system_utsname

The system_utsname isn't needed now that kernel/sysctl.c is fixed.
Nuke it.

Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Kirill Korotaev <dev@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: Andrey Savochkin <saw@sw.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] namespaces: utsname: sysctl
Serge E. Hallyn [Mon, 2 Oct 2006 09:18:15 +0000 (02:18 -0700)]
[PATCH] namespaces: utsname: sysctl

Sysctl uts patch.  This will need to be done another way, but since sysctl
itself needs to be container aware, 'the right thing' is a separate patchset.

[akpm@osdl.org: ia64 build fix]
[sam.vilain@catalyst.net.nz: cleanup]
[sam.vilain@catalyst.net.nz: add proc_do_utsns_string]
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Kirill Korotaev <dev@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: Andrey Savochkin <saw@sw.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] namespaces: utsname: implement utsname namespaces
Serge E. Hallyn [Mon, 2 Oct 2006 09:18:14 +0000 (02:18 -0700)]
[PATCH] namespaces: utsname: implement utsname namespaces

This patch defines the uts namespace and some manipulators.
Adds the uts namespace to task_struct, and initializes a
system-wide init namespace.

It leaves a #define for system_utsname so sysctl will compile.
This define will be removed in a separate patch.

[akpm@osdl.org: build fix, cleanup]
Signed-off-by: Serge Hallyn <serue@us.ibm.com>
Cc: Kirill Korotaev <dev@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: Andrey Savochkin <saw@sw.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] namespaces: utsname: use init_utsname when appropriate
Serge E. Hallyn [Mon, 2 Oct 2006 09:18:13 +0000 (02:18 -0700)]
[PATCH] namespaces: utsname: use init_utsname when appropriate

In some places, particularly drivers and __init code, the init utsns is the
appropriate one to use.  This patch replaces those with a the init_utsname
helper.

Changes: Removed several uses of init_utsname().  Hope I picked all the
right ones in net/ipv4/ipconfig.c.  These are now changed to
utsname() (the per-process namespace utsname) in the previous
patch (2/7)

[akpm@osdl.org: CIFS fix]
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Kirill Korotaev <dev@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: Andrey Savochkin <saw@sw.ru>
Cc: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] namespaces: utsname: switch to using uts namespaces
Serge E. Hallyn [Mon, 2 Oct 2006 09:18:11 +0000 (02:18 -0700)]
[PATCH] namespaces: utsname: switch to using uts namespaces

Replace references to system_utsname to the per-process uts namespace
where appropriate.  This includes things like uname.

Changes: Per Eric Biederman's comments, use the per-process uts namespace
for ELF_PLATFORM, sunrpc, and parts of net/ipv4/ipconfig.c

[jdike@addtoit.com: UML fix]
[clg@fr.ibm.com: cleanup]
[akpm@osdl.org: build fix]
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Kirill Korotaev <dev@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: Andrey Savochkin <saw@sw.ru>
Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] namespaces: utsname: introduce temporary helpers
Serge E. Hallyn [Mon, 2 Oct 2006 09:18:10 +0000 (02:18 -0700)]
[PATCH] namespaces: utsname: introduce temporary helpers

Define utsname() and init_utsname() which return &system_utsname.  Users of
system_utsname will be changed to use these helpers, after which
system_utsname will disappear.

Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Kirill Korotaev <dev@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: Andrey Savochkin <saw@sw.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] namespaces: exit_task_namespaces() invalidates nsproxy
Cedric Le Goater [Mon, 2 Oct 2006 09:18:09 +0000 (02:18 -0700)]
[PATCH] namespaces: exit_task_namespaces() invalidates nsproxy

exit_task_namespaces() has replaced the former exit_namespace().  It
invalidates task->nsproxy and associated namespaces.  This is an issue for
the (futur) pid namespace which is required to be valid in exit_notify().

This patch moves exit_task_namespaces() after exit_notify() to keep nsproxy
valid.

Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Cc: Serge E. Hallyn <serue@us.ibm.com>
Cc: Kirill Korotaev <dev@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: Andrey Savochkin <saw@sw.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] namespaces: incorporate fs namespace into nsproxy
Serge E. Hallyn [Mon, 2 Oct 2006 09:18:08 +0000 (02:18 -0700)]
[PATCH] namespaces: incorporate fs namespace into nsproxy

This moves the mount namespace into the nsproxy.  The mount namespace count
now refers to the number of nsproxies point to it, rather than the number of
tasks.  As a result, the unshare_namespace() function in kernel/fork.c no
longer checks whether it is being shared.

Signed-off-by: Serge Hallyn <serue@us.ibm.com>
Cc: Kirill Korotaev <dev@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: Andrey Savochkin <saw@sw.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] nsproxy: move init_nsproxy into kernel/nsproxy.c
Serge E. Hallyn [Mon, 2 Oct 2006 09:18:07 +0000 (02:18 -0700)]
[PATCH] nsproxy: move init_nsproxy into kernel/nsproxy.c

Move the init_nsproxy definition out of arch/ into kernel/nsproxy.c.  This
avoids all arches having to be updated.  Compiles and boots on s390.

Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Kirill Korotaev <dev@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: Andrey Savochkin <saw@sw.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] namespaces: add nsproxy
Serge E. Hallyn [Mon, 2 Oct 2006 09:18:06 +0000 (02:18 -0700)]
[PATCH] namespaces: add nsproxy

This patch adds a nsproxy structure to the task struct.  Later patches will
move the fs namespace pointer into this structure, and introduce a new utsname
namespace into the nsproxy.

The vserver and openvz functionality, then, would be implemented in large part
by virtualizing/isolating more and more resources into namespaces, each
contained in the nsproxy.

[akpm@osdl.org: build fix]
Signed-off-by: Serge Hallyn <serue@us.ibm.com>
Cc: Kirill Korotaev <dev@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: Andrey Savochkin <saw@sw.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] make kernel/sysctl.c:_proc_do_string() static
Adrian Bunk [Mon, 2 Oct 2006 09:18:05 +0000 (02:18 -0700)]
[PATCH] make kernel/sysctl.c:_proc_do_string() static

This patch makes the needlessly global _proc_do_string() static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] proc: sysctl: add _proc_do_string helper
Sam Vilain [Mon, 2 Oct 2006 09:18:04 +0000 (02:18 -0700)]
[PATCH] proc: sysctl: add _proc_do_string helper

The logic in proc_do_string is worth re-using without passing in a
ctl_table structure (say, we want to calculate a pointer and pass that in
instead); pass in the two fields it uses from that structure as explicit
arguments.

Signed-off-by: Sam Vilain <sam.vilain@catalyst.net.nz>
Cc: Serge E. Hallyn <serue@us.ibm.com>
Cc: Kirill Korotaev <dev@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: Andrey Savochkin <saw@sw.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] nfsd: lockdep annotation
Peter Zijlstra [Mon, 2 Oct 2006 09:18:03 +0000 (02:18 -0700)]
[PATCH] nfsd: lockdep annotation

while doing a kernel make modules_install install over an NFS mount.

  =============================================
  [ INFO: possible recursive locking detected ]
  ---------------------------------------------
  nfsd/9550 is trying to acquire lock:
   (&inode->i_mutex){--..}, at: [<c034c845>] mutex_lock+0x1c/0x1f

  but task is already holding lock:
   (&inode->i_mutex){--..}, at: [<c034c845>] mutex_lock+0x1c/0x1f

  other info that might help us debug this:
  2 locks held by nfsd/9550:
   #0:  (hash_sem){..--}, at: [<cc895223>] exp_readlock+0xd/0xf [nfsd]
   #1:  (&inode->i_mutex){--..}, at: [<c034c845>] mutex_lock+0x1c/0x1f

  stack backtrace:
   [<c0103508>] show_trace_log_lvl+0x58/0x152
   [<c0103b8b>] show_trace+0xd/0x10
   [<c0103c2f>] dump_stack+0x19/0x1b
   [<c012aa57>] __lock_acquire+0x77a/0x9a3
   [<c012af4a>] lock_acquire+0x60/0x80
   [<c034c6c2>] __mutex_lock_slowpath+0xa7/0x20e
   [<c034c845>] mutex_lock+0x1c/0x1f
   [<c0162edc>] vfs_unlink+0x34/0x8a
   [<cc891d98>] nfsd_unlink+0x18f/0x1e2 [nfsd]
   [<cc89884f>] nfsd3_proc_remove+0x95/0xa2 [nfsd]
   [<cc88f0d4>] nfsd_dispatch+0xc0/0x178 [nfsd]
   [<c033e84d>] svc_process+0x3a5/0x5ed
   [<cc88f5ba>] nfsd+0x1a7/0x305 [nfsd]
   [<c0101005>] kernel_thread_helper+0x5/0xb
  DWARF2 unwinder stuck at kernel_thread_helper+0x5/0xb
  Leftover inexact backtrace:
   [<c0103b8b>] show_trace+0xd/0x10
   [<c0103c2f>] dump_stack+0x19/0x1b
   [<c012aa57>] __lock_acquire+0x77a/0x9a3
   [<c012af4a>] lock_acquire+0x60/0x80
   [<c034c6c2>] __mutex_lock_slowpath+0xa7/0x20e
   [<c034c845>] mutex_lock+0x1c/0x1f
   [<c0162edc>] vfs_unlink+0x34/0x8a
   [<cc891d98>] nfsd_unlink+0x18f/0x1e2 [nfsd]
   [<cc89884f>] nfsd3_proc_remove+0x95/0xa2 [nfsd]
   [<cc88f0d4>] nfsd_dispatch+0xc0/0x178 [nfsd]
   [<c033e84d>] svc_process+0x3a5/0x5ed
   [<cc88f5ba>] nfsd+0x1a7/0x305 [nfsd]
   [<c0101005>] kernel_thread_helper+0x5/0xb

  =============================================
  [ INFO: possible recursive locking detected ]
  ---------------------------------------------
  nfsd/9580 is trying to acquire lock:
   (&inode->i_mutex){--..}, at: [<c034cc1d>] mutex_lock+0x1c/0x1f

  but task is already holding lock:
   (&inode->i_mutex){--..}, at: [<c034cc1d>] mutex_lock+0x1c/0x1f

  other info that might help us debug this:
  2 locks held by nfsd/9580:
   #0:  (hash_sem){..--}, at: [<cc89522b>] exp_readlock+0xd/0xf [nfsd]
   #1:  (&inode->i_mutex){--..}, at: [<c034cc1d>] mutex_lock+0x1c/0x1f

  stack backtrace:
   [<c0103508>] show_trace_log_lvl+0x58/0x152
   [<c0103b8b>] show_trace+0xd/0x10
   [<c0103c2f>] dump_stack+0x19/0x1b
   [<c012aa63>] __lock_acquire+0x77a/0x9a3
   [<c012af56>] lock_acquire+0x60/0x80
   [<c034ca9a>] __mutex_lock_slowpath+0xa7/0x20e
   [<c034cc1d>] mutex_lock+0x1c/0x1f
   [<cc892ad1>] nfsd_setattr+0x2c8/0x499 [nfsd]
   [<cc893ede>] nfsd_create_v3+0x31b/0x4ac [nfsd]
   [<cc8984a1>] nfsd3_proc_create+0x128/0x138 [nfsd]
   [<cc88f0d4>] nfsd_dispatch+0xc0/0x178 [nfsd]
   [<c033ec1d>] svc_process+0x3a5/0x5ed
   [<cc88f5ba>] nfsd+0x1a7/0x305 [nfsd]
   [<c0101005>] kernel_thread_helper+0x5/0xb
  DWARF2 unwinder stuck at kernel_thread_helper+0x5/0xb
  Leftover inexact backtrace:
   [<c0103b8b>] show_trace+0xd/0x10
   [<c0103c2f>] dump_stack+0x19/0x1b
   [<c012aa63>] __lock_acquire+0x77a/0x9a3
   [<c012af56>] lock_acquire+0x60/0x80
   [<c034ca9a>] __mutex_lock_slowpath+0xa7/0x20e
   [<c034cc1d>] mutex_lock+0x1c/0x1f
   [<cc892ad1>] nfsd_setattr+0x2c8/0x499 [nfsd]
   [<cc893ede>] nfsd_create_v3+0x31b/0x4ac [nfsd]
   [<cc8984a1>] nfsd3_proc_create+0x128/0x138 [nfsd]
   [<cc88f0d4>] nfsd_dispatch+0xc0/0x178 [nfsd]
   [<c033ec1d>] svc_process+0x3a5/0x5ed
   [<cc88f5ba>] nfsd+0x1a7/0x305 [nfsd]
   [<c0101005>] kernel_thread_helper+0x5/0xb

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Neil Brown <neilb@suse.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] knfsd: allow admin to set nthreads per node
Greg Banks [Mon, 2 Oct 2006 09:18:02 +0000 (02:18 -0700)]
[PATCH] knfsd: allow admin to set nthreads per node

Add /proc/fs/nfsd/pool_threads which allows the sysadmin (or a userspace
daemon) to read and change the number of nfsd threads in each pool.  The
format is a list of space-separated integers, one per pool.

Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] knfsd: make rpc threads pools numa aware
Greg Banks [Mon, 2 Oct 2006 09:18:01 +0000 (02:18 -0700)]
[PATCH] knfsd: make rpc threads pools numa aware

Actually implement multiple pools.  On NUMA machines, allocate a svc_pool per
NUMA node; on SMP a svc_pool per CPU; otherwise a single global pool.  Enqueue
sockets on the svc_pool corresponding to the CPU on which the socket bh is run
(i.e.  the NIC interrupt CPU).  Threads have their cpu mask set to limit them
to the CPUs in the svc_pool that owns them.

This is the patch that allows an Altix to scale NFS traffic linearly
beyond 4 CPUs and 4 NICs.

Incorporates changes and feedback from Neil Brown, Trond Myklebust, and
Christoph Hellwig.

Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] knfsd: use svc_set_num_threads to manage threads in knfsd
Greg Banks [Mon, 2 Oct 2006 09:18:00 +0000 (02:18 -0700)]
[PATCH] knfsd: use svc_set_num_threads to manage threads in knfsd

Replace the existing list of all nfsd threads with new code using
svc_create_pooled().

Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] knfsd: add svc_set_num_threads
Greg Banks [Mon, 2 Oct 2006 09:17:59 +0000 (02:17 -0700)]
[PATCH] knfsd: add svc_set_num_threads

Currently knfsd keeps its own list of all nfsd threads in nfssvc.c; add a new
way of managing the list of all threads in a svc_serv.  Add
svc_create_pooled() to allow creation of a svc_serv whose threads are managed
by the sunrpc code.  Add svc_set_num_threads() to manage the number of threads
in a service, either per-pool or globally across the service.

Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] knfsd: add svc_get
Greg Banks [Mon, 2 Oct 2006 09:17:58 +0000 (02:17 -0700)]
[PATCH] knfsd: add svc_get

add svc_get() for those occasions when we need to temporarily bump up
svc_serv->sv_nrthreads as a pseudo refcount.

Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] knfsd: split svc_serv into pools
Greg Banks [Mon, 2 Oct 2006 09:17:58 +0000 (02:17 -0700)]
[PATCH] knfsd: split svc_serv into pools

Split out the list of idle threads and pending sockets from svc_serv into a
new svc_pool structure, and allocate a fixed number (in this patch, 1) of
pools per svc_serv.  The new structure contains a lock which takes over
several of the duties of svc_serv->sv_lock, which is now relegated to
protecting only sv_tempsocks, sv_permsocks, and sv_tmpcnt in svc_serv.

The point is to move the hottest fields out of svc_serv and into svc_pool,
allowing a following patch to arrange for a svc_pool per NUMA node or per CPU.
 This is a major step towards making the NFS server NUMA-friendly.

Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] knfsd: test and set SK_BUSY atomically
Greg Banks [Mon, 2 Oct 2006 09:17:57 +0000 (02:17 -0700)]
[PATCH] knfsd: test and set SK_BUSY atomically

The SK_BUSY bit in svc_sock->sk_flags ensures that we do not attempt to
enqueue a socket twice.  Currently, setting and clearing the bit is protected
by svc_serv->sv_lock.  As I intend to reduce the data that the lock protects
so it's not held when svc_sock_enqueue() tests and sets SK_BUSY, that test and
set needs to be atomic.

Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] knfsd: convert sk_reserved to atomic_t
Greg Banks [Mon, 2 Oct 2006 09:17:56 +0000 (02:17 -0700)]
[PATCH] knfsd: convert sk_reserved to atomic_t

Convert the svc_sock->sk_reserved variable from an int protected by
svc_serv->sv_lock, to an atomic.  This reduces (by 1) the number of places we
need to take the (effectively global) svc_serv->sv_lock.

Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] knfsd: use new lock for svc_sock deferred list
Greg Banks [Mon, 2 Oct 2006 09:17:55 +0000 (02:17 -0700)]
[PATCH] knfsd: use new lock for svc_sock deferred list

Protect the svc_sock->sk_deferred list with a new lock svc_sock->sk_defer_lock
instead of svc_serv->sv_lock.  Using the more fine-grained lock reduces the
number of places we need to take the svc_serv lock.

Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] knfsd: convert sk_inuse to atomic_t
Greg Banks [Mon, 2 Oct 2006 09:17:54 +0000 (02:17 -0700)]
[PATCH] knfsd: convert sk_inuse to atomic_t

Convert the svc_sock->sk_inuse counter from an int protected by
svc_serv->sv_lock, to an atomic.  This reduces the number of places we need to
take the (effectively global) svc_serv->sv_lock.

Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] knfsd: move tempsock aging to a timer
Greg Banks [Mon, 2 Oct 2006 09:17:54 +0000 (02:17 -0700)]
[PATCH] knfsd: move tempsock aging to a timer

Following are 11 patches from Greg Banks which combine to make knfsd more
Numa-aware.  They reduce hitting on 'global' data structures, and create some
data-structures that can be node-local.

knfsd threads are bound to a particular node, and the thread to handle a new
request is chosen from the threads that are attach to the node that received
the interrupt.

The distribution of threads across nodes can be controlled by a new file in
the 'nfsd' filesystem, though the default approach of an even spread is
probably fine for most sites.

Some (old) numbers that show the efficacy of these patches: N == number of
NICs == number of CPUs == nmber of clients.  Number of NUMA nodes == N/2

N Throughput, MiB/s CPU usage, % (max=N*100)
Before After Before After
--- ------ ---- ----- -----
4 312 435 350 228
6 500 656 501 418
8 562 804 690 589

This patch:

Move the aging of RPC/TCP connection sockets from the main svc_recv() loop to
a timer which uses a mark-and-sweep algorithm every 6 minutes.  This reduces
the amount of work that needs to be done in the main RPC loop and the length
of time we need to hold the (effectively global) svc_serv->sv_lock.

[akpm@osdl.org: cleanup]
Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] knfsd: Correctly handle error condition from lockd_up
NeilBrown [Mon, 2 Oct 2006 09:17:53 +0000 (02:17 -0700)]
[PATCH] knfsd: Correctly handle error condition from lockd_up

If lockd_up fails - what should we expect?  Do we have to later call
lockd_down?

Well the nfs client thinks "no", the nfs server thinks "yes".  lockd thinks
"yes".

The only answer that really makes sense is "no" !!

So:
  Make lockd_up only increment  nlmsvc_users on success.
  Make nfsd handle errors from lockd_up properly.
  Make sure lockd_up(0) never fails when lockd is running
    so that the 'reclaimer' call to lockd_up doesn't need to
    be error checked.

Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] knfsd: Move makesock failed warning into make_socks.
NeilBrown [Mon, 2 Oct 2006 09:17:52 +0000 (02:17 -0700)]
[PATCH] knfsd: Move makesock failed warning into make_socks.

Thus it is printed for any path that leads to failure (make_socks is called
from two places).

Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] knfsd: Check return value of lockd_up in write_ports
NeilBrown [Mon, 2 Oct 2006 09:17:51 +0000 (02:17 -0700)]
[PATCH] knfsd: Check return value of lockd_up in write_ports

We should be checking the return value of lockd_up when adding a new socket to
nfsd.  So move the lockd_up before the svc_addsock and check the return value.

The move is because lockd_down is easy, but there is no easy way to remove a
recently added socket.

Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] knfsd: Drop 'serv' option to svc_recv and svc_process
NeilBrown [Mon, 2 Oct 2006 09:17:50 +0000 (02:17 -0700)]
[PATCH] knfsd: Drop 'serv' option to svc_recv and svc_process

It isn't needed as it is available in rqstp->rq_server, and dropping it allows
some local vars to be dropped.

[akpm@osdl.org: build fix]
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] nfsd: add lock annotations to e_start and e_stop
Josh Triplett [Mon, 2 Oct 2006 09:17:50 +0000 (02:17 -0700)]
[PATCH] nfsd: add lock annotations to e_start and e_stop

e_start acquires svc_export_cache.hash_lock, and e_stop releases it.  Add
lock annotations to these two functions so that sparse can check callers
for lock pairing, and so that sparse will not complain about these
functions since they intentionally use locks in this manner.

Signed-off-by: Josh Triplett <josh@freedesktop.org>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] knfsd: Use SEQ_START_TOKEN instead of hardcoded magic (void*)1
Greg Banks [Mon, 2 Oct 2006 09:17:49 +0000 (02:17 -0700)]
[PATCH] knfsd: Use SEQ_START_TOKEN instead of hardcoded magic (void*)1

Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] knfsd: allow sockets to be passed to nfsd via 'portlist'
NeilBrown [Mon, 2 Oct 2006 09:17:48 +0000 (02:17 -0700)]
[PATCH] knfsd: allow sockets to be passed to nfsd via 'portlist'

Userspace should create and bind a socket (but not connectted) and write the
'fd' to portlist.  This will cause the nfs server to listen on that socket.

To close a socket, the name of the socket - as read from 'portlist' can be
written to 'portlist' with a preceding '-'.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] knfsd: define new nfsdfs file: portlist - contains list of ports
NeilBrown [Mon, 2 Oct 2006 09:17:47 +0000 (02:17 -0700)]
[PATCH] knfsd: define new nfsdfs file: portlist - contains list of ports

This file will list all ports that nfsd has open.
Default when TCP enabled will be
   ipv4 udp 0.0.0.0 2049
   ipv4 tcp 0.0.0.0 2049

Later, the list of ports will be settable.

'portlist' chosen rather than 'ports', to avoid unnecessary confusion with
non-mainline patches which created 'ports' with different semantics.

[akpm@osdl.org: cleanups, build fix]
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] knfsd: separate out some parts of nfsd_svc, which start nfs servers
NeilBrown [Mon, 2 Oct 2006 09:17:46 +0000 (02:17 -0700)]
[PATCH] knfsd: separate out some parts of nfsd_svc, which start nfs servers

Separate out the code for creating a new service, and for creating initial
sockets.

Some of these new functions will have multiple callers soon.

[akpm@osdl.org: cleanups]
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] knfsd: remove nfsd_versbits as intermediate storage for desired versions
NeilBrown [Mon, 2 Oct 2006 09:17:46 +0000 (02:17 -0700)]
[PATCH] knfsd: remove nfsd_versbits as intermediate storage for desired versions

We have an array 'nfsd_version' which lists the available versions of nfsd,
and 'nfsd_versions' (poor choice there :-() which lists the currently active
versions.

Then we have a bitmap - nfsd_versbits which says which versions are wanted.
The bits in this bitset cause content to be copied from nfsd_version to
nfsd_versions when nfsd starts.

This patch removes nfsd_versbits and moves information directly from
nfsd_version to nfsd_versions when requests for version changes arrive.

Note that this doesn't make it possible to change versions while the server is
running.  This is because serv->sv_xdrsize is calculated when a service is
created, and used when threads are created, and xdrsize depends on the active
versions.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] knfsd: be more selective in which sockets lockd listens on
NeilBrown [Mon, 2 Oct 2006 09:17:45 +0000 (02:17 -0700)]
[PATCH] knfsd: be more selective in which sockets lockd listens on

Currently lockd listens on UDP always, and TCP if CONFIG_NFSD_TCP is set.

However as lockd performs services of the client as well, this is a problem.
If CONFIG_NfSD_TCP is not set, and a tcp mount is used, the server will not be
able to call back to lockd.

So:
 - add an option to lockd_up saying which protocol is needed
 - Always open sockets for which an explicit port was given, otherwise
   only open a socket of the type required
 - Change nfsd to do one lockd_up per socket rather than one per thread.

This
 - removes the dependancy on CONFIG_NFSD_TCP
 - means that lockd may open sockets other than at startup
 - means that lockd will *not* listen on UDP if the only
   mounts are TCP mount (and nfsd hasn't started).

The latter is the only one that concerns me at all - I don't know if this
might be a problem with some servers.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] knfsd: add a callback for when last rpc thread finishes
NeilBrown [Mon, 2 Oct 2006 09:17:44 +0000 (02:17 -0700)]
[PATCH] knfsd: add a callback for when last rpc thread finishes

nfsd has some cleanup that it wants to do when the last thread exits, and
there will shortly be some more.  So collect this all into one place and
define a callback for an rpc service to call when the service is about to be
destroyed.

[akpm@osdl.org: cleanups, build fix]
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] knfsd: remove an unused variable from auth_unix_lookup()
Greg Banks [Mon, 2 Oct 2006 09:17:43 +0000 (02:17 -0700)]
[PATCH] knfsd: remove an unused variable from auth_unix_lookup()

Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] knfsd: remove an unused variable from e_show()
Greg Banks [Mon, 2 Oct 2006 09:17:42 +0000 (02:17 -0700)]
[PATCH] knfsd: remove an unused variable from e_show()

Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] knfsd: add some missing newlines in printks
Greg Banks [Mon, 2 Oct 2006 09:17:41 +0000 (02:17 -0700)]
[PATCH] knfsd: add some missing newlines in printks

Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] cpumask: export node_to_cpu_mask consistently
Greg Banks [Mon, 2 Oct 2006 09:17:41 +0000 (02:17 -0700)]
[PATCH] cpumask: export node_to_cpu_mask consistently

cpumask: ensure that node_to_cpumask() is available to modules for all
supported combinations of architecture and CONFIG_NUMA.

Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] cpumask: export cpu_online_map and cpu_possible_map consistently
Greg Banks [Mon, 2 Oct 2006 09:17:40 +0000 (02:17 -0700)]
[PATCH] cpumask: export cpu_online_map and cpu_possible_map consistently

cpumask: ensure that the cpu_online_map and cpu_possible_map bitmasks, and
hence all the macros in <linux/cpumask.h> that require them, are available to
modules for all supported combinations of architecture and CONFIG_SMP.

Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] cpumask: add highest_possible_node_id
Greg Banks [Mon, 2 Oct 2006 09:17:39 +0000 (02:17 -0700)]
[PATCH] cpumask: add highest_possible_node_id

cpumask: add highest_possible_node_id(), analogous to
highest_possible_processor_id().

[pj@sgi.com: fix typo]
Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] isdn: work around excessive udelay()
Andrew Morton [Mon, 2 Oct 2006 09:17:38 +0000 (02:17 -0700)]
[PATCH] isdn: work around excessive udelay()

As reported in http://bugzilla.kernel.org/show_bug.cgi?id=6970, ISDN can issue
excessively-long udelays, which triggers a build-time error on ARM.

This is very sucky of ISDN, but I doubt if anyone is going to suddenly fix it.
So change the macro to do the microsecond counting itself.

Cc: <tch@wpkg.org>
Cc: Karsten Keil <kkeil@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] isdn4linux: Gigaset driver: fix __must_check warning
Tilman Schmidt [Mon, 2 Oct 2006 09:17:37 +0000 (02:17 -0700)]
[PATCH] isdn4linux: Gigaset driver: fix __must_check warning

This patch to the Siemens Gigaset driver fixes the compile warning
"ignoring return value of 'class_device_create_file', declared with
attribute warn_unused_result" appearing with CONFIG_ENABLE_MUST_CHECK=y in
release 2.6.18-rc1-mm1.

Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Acked-by: Hansjoerg Lipp <hjlipp@web.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] Linux Kernel Dump Test Module
Ankita Garg [Mon, 2 Oct 2006 09:17:36 +0000 (02:17 -0700)]
[PATCH] Linux Kernel Dump Test Module

A simple module to test Linux Kernel Dump mechanism.  This module uses
jprobes to install/activate pre-defined crash points.  At different crash
points, various types of crashing scenarios are created like a BUG(),
panic(), exception, recursive loop and stack overflow.  The user can
activate a crash point with specific type by providing parameters at the
time of module insertion.  Please see the file header for usage
information.  The module is based on the Linux Kernel Dump Test Tool by
Fernando <http://lkdtt.sourceforge.net>.

This module could be merged with mainline. Jprobes is used here so that the
context in which crash point is hit, could be maintained. This implements
all the crash points as done by LKDTT except the one in the middle of
tasklet_action().

Signed-off-by: Ankita Garg <ankita@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] kretprobe spinlock deadlock patch
bibo,mao [Mon, 2 Oct 2006 09:17:35 +0000 (02:17 -0700)]
[PATCH] kretprobe spinlock deadlock patch

kprobe_flush_task() possibly calls kfree function during holding
kretprobe_lock spinlock, if kfree function is probed by kretprobe that will
incur spinlock deadlock.  This patch moves kfree function out scope of
kretprobe_lock.

Signed-off-by: bibo, mao <bibo.mao@intel.com>
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] disallow kprobes on notifier_call_chain
bibo,mao [Mon, 2 Oct 2006 09:17:34 +0000 (02:17 -0700)]
[PATCH] disallow kprobes on notifier_call_chain

When kprobe is re-entered, the re-entered kprobe kernel path will will call
atomic_notifier_call_chain function, if this function is kprobed that will
incur numerous kprobe recursive fault.  This patch disallows kprobes on
atomic_notifier_call_chain function.

Signed-off-by: bibo, mao <bibo.mao@intel.com>
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] kprobe whitespace cleanup
bibo,mao [Mon, 2 Oct 2006 09:17:33 +0000 (02:17 -0700)]
[PATCH] kprobe whitespace cleanup

Whitespace is used to indent, this patch cleans up these sentences by
kernel coding style.

Signed-off-by: bibo, mao <bibo.mao@intel.com>
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] Update Documentation/kprobes.txt
Ananth N Mavinakayanahalli [Mon, 2 Oct 2006 09:17:32 +0000 (02:17 -0700)]
[PATCH] Update Documentation/kprobes.txt

Documentation/kprobes.txt updated to reflect:

o In-kernel symbol resolution
o CONFIG_KALLSYMS dependency
o Usage of JPROBE_ENTRY
o Addition of regs_return_value()

Also update the references list and usage examples to use correct module
interfaces.

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Acked-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] Add regs_return_value() helper
Ananth N Mavinakayanahalli [Mon, 2 Oct 2006 09:17:31 +0000 (02:17 -0700)]
[PATCH] Add regs_return_value() helper

Add the regs_return_value() macro to extract the return value in an
architecture agnostic manner, given the pt_regs.

Other architecture maintainers may want to add similar helpers.

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] kprobes: handle symbol resolution when <module:.symbol> is specified
Ananth N Mavinakayanahalli [Mon, 2 Oct 2006 09:17:31 +0000 (02:17 -0700)]
[PATCH] kprobes: handle symbol resolution when <module:.symbol> is specified

kallsyms_lookup_name() allows for <module:symbol> style specification for
looking up symbol addresses.  Handle the case where the user specifies
<module:.symbol> on powerpc, given that 64-bit powerpc uses function
descriptors.

Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] Kprobes: Make kprobe modules more portable
Ananth N Mavinakayanahalli [Mon, 2 Oct 2006 09:17:30 +0000 (02:17 -0700)]
[PATCH] Kprobes: Make kprobe modules more portable

In an effort to make kprobe modules more portable, here is a patch that:

o Introduces the "symbol_name" field to struct kprobe.
  The symbol->address resolution now happens in the kernel in an
  architecture agnostic manner. 64-bit powerpc users no longer have
  to specify the ".symbols"
o Introduces the "offset" field to struct kprobe to allow a user to
  specify an offset into a symbol.
o The legacy mechanism of specifying the kprobe.addr is still supported.
  However, if both the kprobe.addr and kprobe.symbol_name are specified,
  probe registration fails with an -EINVAL.
o The symbol resolution code uses kallsyms_lookup_name(). So
  CONFIG_KPROBES now depends on CONFIG_KALLSYMS
o Apparantly kprobe modules were the only legitimate out-of-tree user of
  the kallsyms_lookup_name() EXPORT. Now that the symbol resolution
  happens in-kernel, remove the EXPORT as suggested by Christoph Hellwig
o Modify tcp_probe.c that uses the kprobe interface so as to make it
  work on multiple platforms (in its earlier form, the code wouldn't
  work, say, on powerpc)

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] s390: update fs3270 to use a struct pid
Cedric Le Goater [Mon, 2 Oct 2006 09:17:28 +0000 (02:17 -0700)]
[PATCH] s390: update fs3270 to use a struct pid

Replaces the pid_t value with a struct pid to avoid pid wrap around
problems.

Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] usb: fixup usb so it uses struct pid
Eric W. Biederman [Mon, 2 Oct 2006 09:17:28 +0000 (02:17 -0700)]
[PATCH] usb: fixup usb so it uses struct pid

The problem with remembering a user space process by its pid is that it is
possible that the process will exit, pid wrap around will occur.
Converting to a struct pid avoid that problem, and paves the way for
implementing a pid namespace.

Also since usb is the only user of kill_proc_info_as_uid rename
kill_proc_info_as_uid to kill_pid_info_as_uid and have the new version take
a struct pid.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] file: Add locking to f_getown
Eric W. Biederman [Mon, 2 Oct 2006 09:17:27 +0000 (02:17 -0700)]
[PATCH] file: Add locking to f_getown

This has been needed for a long time, but now with the advent of a
reference counted struct pid there are real consequences for getting this
wrong.

Someone I think it was Oleg Nesterov pointed out that this construct was
missing locking, when I introduced struct pid.  After taking time to review
the locking construct already present I figured out which lock needs to be
taken.  The other paths that access f_owner.pid take either the f_owner
read or the write lock.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] update mq_notify to use a struct pid
Cedric Le Goater [Mon, 2 Oct 2006 09:17:26 +0000 (02:17 -0700)]
[PATCH] update mq_notify to use a struct pid

Message queues can signal a process waiting for a message.

This patch replaces the pid_t value with a struct pid to avoid pid wrap
around problems.

Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Acked-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] Use struct pspace in next_pidmap and find_ge_pid
Eric W. Biederman [Mon, 2 Oct 2006 09:17:25 +0000 (02:17 -0700)]
[PATCH] Use struct pspace in next_pidmap and find_ge_pid

This updates my proc: readdir race fix (take 3) patch
to account for the changes made by: Sukadev Bhattiprolu <sukadev@us.ibm.com>
to introduce struct pspace.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] Define struct pspace
Sukadev Bhattiprolu [Mon, 2 Oct 2006 09:17:24 +0000 (02:17 -0700)]
[PATCH] Define struct pspace

Define a per-container pid space object.  And create one instance of this
object, init_pspace, to define the entire pid space.  Subsequent patches
will provide/use interfaces to create/destroy pid spaces.

Its a subset/rework of Eric Biederman's patch
http://lkml.org/lkml/2006/2/6/285 .

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: Serge Hallyn <serue@us.ibm.com>
Cc: Cedric Le Goater <clg@fr.ibm.com>
Cc: Kirill Korotaev <dev@sw.ru>
Cc: Andrey Savochkin <saw@sw.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] Move pidmap to pspace.h
Sukadev Bhattiprolu [Mon, 2 Oct 2006 09:17:23 +0000 (02:17 -0700)]
[PATCH] Move pidmap to pspace.h

Move struct pidmap and PIDMAP_ENTRIES to a new file, include/linux/pspace.h
where it will be used in subsequent patches to define pid spaces.

Its a subset of Eric Biederman's patch http://lkml.org/lkml/2006/2/6/285

[akpm@osdl.org: cleanups]
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: Serge Hallyn <serue@us.ibm.com>
Cc: Cedric Le Goater <clg@fr.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
19 years ago[PATCH] pid: simplify pid iterators
Oleg Nesterov [Mon, 2 Oct 2006 09:17:22 +0000 (02:17 -0700)]
[PATCH] pid: simplify pid iterators

I think it is hardly possible to read the current do_each_task_pid().  The
new version is much simpler and makes the code smaller.

Only the do_each_task_pid change is tested, the do_each_pid_task isn't.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>