Pandora Sourcecodes - pandora-kernel.git/log

Merge git://git./linux/kernel/git/steve/gfs2-2.6-fixes

* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes:
GFS2: Fix mount hang caused by certain access pattern to sysfs files

Merge branch 'for-linus' of git://git./linux/kernel/git/tiwai/sound-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6: (22 commits)
  ALSA: hda - Cirrus Logic CS421x support
  ALSA: Make pcm.h self-contained
  ALSA: hda - Allow codec-specific set_power_state ops
  ALSA: hda - Add post_suspend patch ops
  ALSA: hda - Make CONFIG_SND_HDA_POWER_SAVE depending on CONFIG_PM
  ALSA: hda - Make sure mute led reflects master mute state
  ALSA: hda - Fix invalid mute led state on resume of IDT codecs
  ASoC: Revert "ASoC: SAMSUNG: Add I2S0 internal dma driver"
  ALSA: hda - Add support of the 4 internal speakers on certain HP laptops
  ALSA: Make snd_pcm_debug_name usable outside pcm_lib
  ALSA: hda - Fix DAC filling for multi-connection pins in Realtek parser
  ASoC: dapm - Add methods to retrieve snd_card and soc_card from dapm context.
  ASoC: SAMSUNG: Add I2S0 internal dma driver
  ASoC: SAMSUNG: Modify I2S driver to support idma
  ASoC: davinci: add missing break statement
  ASoC: davinci: fix codec start and stop functions
  ASoC: dapm - add DAPM macro for external enum widgets
  ASoC: Acknowledge WM8962 interrupts before acting on them
  ASoC: sgtl5000: guide user when regulator support is needed
  ASoC: sgtl5000: refactor registering internal ldo
  ...

Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (53 commits)
  Input: synaptics - fix reporting of min coordinates
  Input: tegra-kbc - enable key autorepeat
  Input: kxtj9 - fix locking typo in kxtj9_set_poll()
  Input: kxtj9 - fix bug in probe()
  Input: intel-mid-touch - remove pointless checking for variable 'found'
  Input: hp_sdc - staticize hp_sdc_kicker()
  Input: pmic8xxx-keypad - fix a leak of the IRQ during init failure
  Input: cy8ctmg110_ts - set reset_pin and irq_pin from platform data
  Input: cy8ctmg110_ts - constify i2c_device_id table
  Input: cy8ctmg110_ts - fix checking return value of i2c_master_send
  Input: lifebook - make dmi callback functions return 1
  Input: atkbd - make dmi callback functions return 1
  Input: gpio_keys - switch to using SIMPLE_DEV_PM_OPS
  Input: gpio_keys - add support for device-tree platform data
  Input: aiptek - remove double define
  Input: synaptics - set minimum coordinates as reported by firmware
  Input: synaptics - process button bits in AGM packets
  Input: synaptics - rename set_slot to be more descriptive
  Input: synaptics - fuzz position for touchpad with reduced filtering
  Input: synaptics - set resolution for MT_POSITION_X/Y axes
  ...

Merge branch 'next' of git://git.monstr.eu/linux-2.6-microblaze

* 'next' of git://git.monstr.eu/linux-2.6-microblaze:
  microblaze: Do not show error message for 32 interrupt lines
  Revert "microblaze: PCI fix typo fault in of_node pointer moving into pci_bus"
  microblaze: PCI fix typo fault in of_node pointer moving into pci_bus
  microblaze: Add support for early console on mdm
  microblaze: Simplify early console binding from DT
  microblaze: Get early printk console earlier
  microblaze: Standardise cpuinfo output for cache policy
  microblaze: Unprivileged stream instruction awareness
  microblaze: trivial: Fix typo fault
  microblaze: exec: Remove redundant set_fs(USER_DS)
  microblaze: Remove duplicated prototype of start_thread()
  microblaze: Fix unaligned value saving to the stack for system with MMU
  microblaze/irqs: Do not trace arch_local_{*,irq_*} functions

staging: brcm80211: Fix double include introduced by bad merge

A merge with Linus' tree added a double include of linux/interrupt.h.
Fix by removing one of the includes.

Signed-off-by: Daniel Morsing <daniel.morsing@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ALSA: hda - Fix duplicated DAC assignments for Realtek

Copying hp_pins and speaker_pins from line_out_pins may confuse the
parser, and it can lead to duplicated initializations for the same pin
with a wrong DAC assignment. The problem appears in 3.0 kernel code.

Cc: <stable@kernel.org> (for 3.0)
Signed-off-by: Takashi Iwai <tiwai@suse.de>

ALSA: asihpi - off by one in asihpi_hpi_ioctl()

"adapter" is used as an array index in the adapters[] array so
the off by one would make us read past the end.

1c073b67979 "ALSA: asihpi - Remove spurious adapter index check"
reverted Dan Rosenberg's check that would have prevented the
overflow here.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>

ALSA: hda - Fix Oops with Realtek quirks with NULL adc_nids

Somce quirk models don't set adc_nids but let the parser filling it.
But the recent code has unnecessary NULL-checks of spec->input_mux,
and it resulted in NULL dereferences.
This patch fixes that regression.

Reported-and-tested-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>

gro: Only reset frag0 when skb can be pulled

Currently skb_gro_header_slow unconditionally resets frag0 and
frag0_len. However, when we can't pull on the skb this leaves
the GRO fields in an inconsistent state.

This patch fixes this by only resetting those fields after the
pskb_may_pull test.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

microblaze: Do not show error message for 32 interrupt lines

When interrupt controller uses 32 interrupts lines the kernel
show error message about mismatch in kind-of-intr parameter
because it exceeds u32. Recast fixs this issue.

Signed-off-by: Michal Simek <monstr@monstr.eu>

ALSA: asihpi - bug fix pa use before init.

Fixes bug introduced by 1c073b67.
Also declare pa local to block in which it is used.

Signed-off-by: Eliot Blennerhassett <eblennerhassett@audioscience.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>

Merge branch 'next' into for-linus

ALSA: hda - Add support for vref-out based mute LED control on IDT codecs

This patch also registers all necessary callbacks to support mute LED
only when such control is enabled. And it keeps codec AFG in D0 or D1
state all the time when aggressive power managemnt is enabled for vref-out
control (and mute LED) work correctly.

Signed-off-by: Vitaliy Kulikov <Vitaliy.Kulikov@idt.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>

xfs: optimize the negative xattr caching

Since the addition of file capabilities every write needs to read xattrs to
check if we have any capabilities to clear. In Linux 3.0 Andi Kleen added
a flag to cache the fact that we do not have any attributes on an inode.
Make sure to already mark a file as not having any attributes when reading
it from disk in case it doesn't even have an attribute fork. Based on an
earlier patch from Andi Kleen.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>

xfs: prevent against ioend livelocks in xfs_file_fsync

We need to take some locks to prevent new ioends from coming in when we wait
for all existing ones to go away. Up to Linux 3.0 that was done using the
i_mutex held by the VFS fsync code, but now that we are called without
it we need to take care of it ourselves. Use the I/O lock instead of
i_mutex just like we do in other places.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>

xfs: flag all buffers as metadata

Now that REQ_META bios aren't treated specially in the CFQ I/O schedule
anymore, we can tag all buffers as metadata to make blktrace traces more
meaningful. Note that we use buffers also to zero out partial blocks
in the preallocation / hole punching code, and while they operate on
data blocks the zeros written certainly aren't data. I think this case
is borderline metadata enough to not bother special casing it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>

xfs: encapsulate a block of debug code

Pull into a helper function some debug-only code that validates a
xfs_da_blkinfo structure that's been read from disk.

Signed-off-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Christoph Hellwig <hch@infradead.org>

Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  merge fchmod() and fchmodat() guts, kill ancient broken kludge
  xfs: fix misspelled S_IS...()
  xfs: get rid of open-coded S_ISREG(), etc.
  vfs: document locking requirements for d_move, __d_move and d_materialise_unique
  omfs: fix (mode & S_IFDIR) abuse
  btrfs: S_ISREG(mode) is not mode & S_IFREG...
  ima: fmode_t misspelled as mode_t...
  pci-label.c: size_t misspelled as mode_t
  jffs2: S_ISLNK(mode & S_IFMT) is pointless
  snd_msnd ->mode is fmode_t, not mode_t
  v9fs_iop_get_acl: get rid of unused variable
  vfs: dont chain pipe/anon/socket on superblock s_inodes list
  Documentation: Exporting: update description of d_splice_alias
  fs: add missing unlock in default_llseek()

MD: generate an event when array sync is complete

This patch causes MD to generate an event (for device-mapper) when the
synchronization thread is reaped. This is expected behavior for device-mapper.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>

MD bitmap: Revert DM dirty log hooks

Revert most of commit e384e58549a2e9a83071ad80280c1a9053cfd84c
md/bitmap: prepare for storing write-intent-bitmap via dm-dirty-log.

MD should not need to use DM's dirty log - we decided to use md's
bitmaps instead.

Keeping the DIV_ROUND_UP clean-ups that were part of commit
e384e58549a2e9a83071ad80280c1a9053cfd84c, however.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>

MD: raid1 s/sysfs_notify_dirent/sysfs_notify_dirent_safe

If device-mapper creates a RAID1 array that includes devices to
be rebuilt, it will deref a NULL pointer when finished because
sysfs is not used by device-mapper instantiated RAID devices.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>

md/raid5: Avoid BUG caused by multiple failures.

While preparing to write a stripe we keep the parity block or blocks
locked (R5_LOCKED) - towards the end of schedule_reconstruction.

If the array is discovered to have failed before this write completes
we can leave those blocks LOCKED, and init_stripe will notice that a
free stripe still has a locked block and will complain.

So clear the R5_LOCKED flag in handle_failed_stripe, and demote the
'BUG' to a 'WARN_ON'.

Signed-off-by: NeilBrown <neilb@suse.de>

md/raid10: move rdev->corrected_errors counting

Read errors are considered to corrected if write-back and re-read
cycle is finished without further problems. Thus moving the rdev->
corrected_errors counting after the re-reading looks more reasonable
IMHO.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>

md/raid5: move rdev->corrected_errors counting

Read errors are considered to corrected if write-back and re-read
cycle is finished without further problems. Thus moving the rdev->
corrected_errors counting after the re-reading looks more reasonable
IMHO.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>

md/raid1: move rdev->corrected_errors counting

Read errors are considered to corrected if write-back and re-read
cycle is finished without further problems. Thus moving the rdev->
corrected_errors counting after the re-reading looks more reasonable
IMHO. Also included a couple of whitespace fixes on sync_page_io().

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>

md: get rid of unnecessary casts on page_address()

page_address() returns void pointer, so the casts can be removed.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>

md/raid10: Improve decision on whether to fail a device with a read error.

Normally we would fail a device with a READ error. However if doing
so causes the array to fail, it is better to leave the device
in place and just return the read error to the caller.

The current test for decide if the array will fail is overly
simplistic.
We have a function 'enough' which can tell if the array is failed or
not, so use it to guide the decision.

Signed-off-by: NeilBrown <neilb@suse.de>

md/raid10: Make use of new recovery_disabled handling

When we get a read error during recovery, RAID10 previously
arranged for the recovering device to appear to fail so that
the recovery stops and doesn't restart. This is misleading and wrong.

Instead, make use of the new recovery_disabled handling and mark
the target device and having recovery disabled.

Add appropriate checks in add_disk and remove_disk so that devices
are removed and not re-added when recovery is disabled.

Signed-off-by: NeilBrown <neilb@suse.de>

md: change managed of recovery_disabled.

If we hit a read error while recovering a mirror, we want to abort the
recovery without necessarily failing the disk - as having a disk this
a read error is better than not having an array at all.

Currently this is managed with a per-array flag "recovery_disabled"
and is only implemented for RAID1. For RAID10 we will need finer
grained control as we might want to disable recovery for individual
devices separately.

So push more of the decision making into the personality.
'recovery_disabled' is now a 'cookie' which is copied when the
personality want to disable recovery and is changed when a device is
added to the array as this is used as a trigger to 'try recovery
again'.

This will allow RAID10 to get the control that it needs.

Signed-off-by: NeilBrown <neilb@suse.de>

md: remove ro check in md_check_recovery()

Commit c89a8eee6154 ("Allow faulty devices to be removed from a
readonly array.") added some work on ro array in the function,
but it couldn't be done since we didn't allow the ro array to be
handled from the beginning. Fix it.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>

md: introduce link/unlink_rdev() helpers

There are places where sysfs links to rdev are handled
in a same way. Add the helper functions to consolidate
them.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>

md/raid: use printk_ratelimited instead of printk_ratelimit

As per printk_ratelimit comment, it should not be used.

Signed-off-by: Christian Dietrich <christian.dietrich@informatik.uni-erlangen.de>
Signed-off-by: NeilBrown <neilb@suse.de>

md: use proper little-endian bitops

Using __test_and_{set,clear}_bit_le() with ignoring its return value
can be replaced with __{set,clear}_bit_le().

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: NeilBrown <neilb@suse.de>
Cc: linux-raid@vger.kernel.org
Signed-off-by: NeilBrown <neilb@suse.de>

md/raid5: finalise new merged handle_stripe.

handle_stripe5() and handle_stripe6() are now virtually identical.
So discard one and rename the other to 'analyse_stripe()'.

It always returns 0, so change it to 'void' and remove the 'done'
variable in handle_stripe().

Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Namhyung Kim <namhyung@gmail.com>

md/raid5: move some more common code into handle_stripe

The RAID6 version of this code is usable for RAID5 providing:
  - we test "conf->max_degraded" rather than "2" as appropriate
  - we make sure s->failed_num[1] is meaningful (and not '-1')
    when s->failed > 1

The 'return 1' must become 'goto finish' in the new location.

Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Namhyung Kim <namhyung@gmail.com>

md/raid5: move more common code into handle_stripe

Apart from 'prexor' which can only be set for RAID5, and
'qd_idx' which can only be meaningful for RAID6, these two
chunks of code are nearly the same.

So combine them into one adding a test to call either
handle_parity_checks5 or handle_parity_checks6 as appropriate.

Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Namhyung Kim <namhyung@gmail.com>

md/raid5: unite handle_stripe_dirtying5 and handle_stripe_dirtying6

RAID6 is only allowed to choose 'reconstruct-write' while RAID5 is
also allow 'read-modify-write'
Apart from this difference, handle_stripe_dirtying[56] are nearly
identical. So resolve these differences and create just one function.

Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Namhyung Kim <namhyung@gmail.com>

md/raid5: unite fetch_block5 and fetch_block6

Provided that ->failed_num[1] is not a valid device number (which is
easily achieved) fetch_block6 provides all the functionality of
fetch_block5.

So remove the latter and rename the former to simply "fetch_block".

Then handle_stripe_fill5 and handle_stripe_fill6 become the same and
can similarly be united.

Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Namhyung Kim <namhyung@gmail.com>

md/raid5: rearrange a test in fetch_block6.

Next patch will unite fetch_block5 and fetch_block6.
First I want to make the differences a little more clear.

For RAID6 if we are writing at all and there is a failed device, then
we need to load or compute every block so we can do a
reconstruct-write.
This case isn't needed for RAID5 - we will do a read-modify-write in
that case.
So make that test a separate test in fetch_block6 rather than merged
with two other tests.

Make a similar change in fetch_block5 so the one bit that is not
needed for RAID6 is clearly separate.

Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Namhyung Kim <namhyung@gmail.com>

md/raid5: move more code into common handle_stripe

The difference between the RAID5 and RAID6 code here is easily
resolved using conf->max_degraded.

Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Namhyung Kim <namhyung@gmail.com>

md/raid5: Move code for finishing a reconstruction into handle_stripe.

Prior to commit ab69ae12ceef7 the code in handle_stripe5 and
handle_stripe6 to "Finish reconstruct operations initiated by the
expansion process" was identical.
That commit added an identical stanza of code to each function, but in
different places. That was careless.

The raid5 code was correct, so move that out into handle_stripe and
remove raid6 version.

Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Namhyung Kim <namhyung@gmail.com>

md/raid5: Remove stripe_head_state arg from handle_stripe_expansion.

This arg is only used to differentiate between RAID5 and RAID6 but
that is not needed. For RAID5, raid5_compute_sector will set qd_idx
to "~0" so j with certainly not equals qd_idx, so there is no need
for a guard on that condition.

So remove the guard and remove the arg from the declaration and
callers of handle_stripe_expansion.

Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Namhyung Kim <namhyung@gmail.com>

Merge branch 'next/devel2' of git://git./linux/kernel/git/arm/linux-arm-soc

* 'next/devel2' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/linux-arm-soc: (47 commits)
  OMAP: Add debugfs node to show the summary of all clocks
  OMAP2+: hwmod: Follow the recommended PRCM module enable sequence
  OMAP2+: clock: allow per-SoC clock init code to prevent clockdomain calls from clock code
  OMAP2+: clockdomain: Add per clkdm lock to prevent concurrent state programming
  OMAP2+: PM: idle clkdms only if already in idle
  OMAP2+: clockdomain: add clkdm_in_hwsup()
  OMAP2+: clockdomain: Add 2 APIs to control clockdomain from hwmod framework
  OMAP: clockdomain: Remove redundant call to pwrdm_wait_transition()
  OMAP4: hwmod: Introduce the module control in hwmod control
  OMAP4: cm: Add two new APIs for modulemode control
  OMAP4: hwmod data: Add modulemode entry in omap_hwmod structure
  OMAP4: hwmod data: Add PRM context register offset
  OMAP4: prm: Remove deprecated functions
  OMAP4: prm: Replace warm reset API with the offset based version
  OMAP4: hwmod: Replace RSTCTRL absolute address with offset macros
  OMAP: hwmod: Wait the idle status to be disabled
  OMAP4: hwmod: Replace CLKCTRL absolute address with offset macros
  OMAP2+: hwmod: Init clkdm field at boot time
  OMAP4: hwmod data: Add clock domain attribute
  OMAP4: clock data: Add missing divider selection for auxclks
  ...

Merge branch 'next/devel' of ssh:///linux/kernel/git/arm/linux-arm-soc

* 'next/devel' of ssh://master.kernel.org/pub/scm/linux/kernel/git/arm/linux-arm-soc: (128 commits)
  ARM: S5P64X0: External Interrupt Support
  ARM: EXYNOS4: Enable MFC on Samsung NURI
  ARM: EXYNOS4: Enable MFC on universal_c210
  ARM: S5PV210: Enable MFC on Goni
  ARM: S5P: Add support for MFC device
  ARM: EXYNOS4: Add support FIMD on SMDKC210
  ARM: EXYNOS4: Add platform device and helper functions for FIMD
  ARM: EXYNOS4: Add resource definition for FIMD
  ARM: EXYNOS4: Change devname for FIMD clkdev
  ARM: SAMSUNG: Add IRQ_I2S0 definition
  ARM: SAMSUNG: Add platform device for idma
  ARM: EXYNOS4: Add more registers to be saved and restored for PM
  ARM: EXYNOS4: Add more register addresses of CMU
  ARM: EXYNOS4: Add platform device for dwmci driver
  ARM: EXYNOS4: configure rtc-s3c on NURI
  ARM: EXYNOS4: configure MAX8903 secondary charger on NURI
  ARM: EXYNOS4: configure ADC on NURI
  ARM: EXYNOS4: configure MAX17042 fuel gauge on NURI
  ARM: EXYNOS4: configure regulators and PMIC(MAX8997) on NURI
  ARM: EXYNOS4: Increase NR_IRQS for devices with more IRQs
  ...

Fix up tons of silly conflicts:
- arch/arm/mach-davinci/include/mach/psc.h
- arch/arm/mach-exynos4/Kconfig
- arch/arm/mach-exynos4/mach-smdkc210.c
- arch/arm/mach-exynos4/pm.c
- arch/arm/mach-imx/mm-imx1.c
- arch/arm/mach-imx/mm-imx21.c
- arch/arm/mach-imx/mm-imx25.c
- arch/arm/mach-imx/mm-imx27.c
- arch/arm/mach-imx/mm-imx31.c
- arch/arm/mach-imx/mm-imx35.c
- arch/arm/mach-mx5/mm.c
- arch/arm/mach-s5pv210/mach-goni.c
- arch/arm/mm/Kconfig

Merge branch 'next/board' of git://git./linux/kernel/git/arm/linux-arm-soc

* 'next/board' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/linux-arm-soc:
  ARM: S3C64XX: Configure backup battery charger on Cragganmore
  ARM: S3C64XX: Fix WM8915 IRQ polarity on Cragganmore
  ARM: S3C64XX: Configure supplies for all Cragganmore regulators
  ARM: S3C64XX: Refresh Cragganmore support
  ARM: S3C64XX: Initial support for Wolfson/Simtec Cragganmore/Banff
  OMAP4: Keyboard: Mux changes in the board file
  omap: blaze: add mmc5/wl1283 device support
  omap: 4430SDP: Register the card detect GPIO properly
  arm: omap3: cm-t35: add support for cm-t3730
  OMAP3: beagle: add support for beagleboard xM revision C
  OMAP3: rx-51: Add full regulator definitions
  omap: rx51: Platform support for lp5523 led chip

Merge branch 'next/cross-platform' of git://git./linux/kernel/git/arm/linux-arm-soc

* 'next/cross-platform' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/linux-arm-soc:
  ARM: Consolidate the clkdev header files
  ARM: set vga memory base at run-time
  ARM: convert PCI defines to variables
  ARM: pci: make pcibios_assign_all_busses use pci_has_flag
  ARM: remove unnecessary mach/hardware.h includes
  pci: move microblaze and powerpc pci flag functions into asm-generic
  powerpc: rename ppc_pci_*_flags to pci_*_flags

Fix up conflicts in arch/microblaze/include/asm/pci-bridge.h

Merge branch 'next/fixes2' of git://git./linux/kernel/git/arm/linux-arm-soc

* 'next/fixes2' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/linux-arm-soc: (24 commits)
  ASoC: omap: McBSP: fix build breakage on OMAP1
  OMAP: hwmod: fix the i2c-reset timeout during bootup
  I2C: OMAP2+: add correct functionality flags to all omap2plus i2c dev_attr
  I2C: OMAP2+: Tag all OMAP2+ hwmod defintions with I2C IP revision
  I2C: OMAP1/OMAP2+: create omap I2C functionality flags for each cpu_... test
  I2C: OMAP2+:  Introduce I2C IP versioning constants
  I2C: OMAP2+: increase omap_i2c_dev_attr flags from u8 to u32
  I2C: OMAP2+: Set hwmod flags to only allow 16-bit accesses to i2c
  OMAP4: hwmod data: Change DSS main_clk scheme
  OMAP4: powerdomain data: Remove unsupported MPU powerdomain state
  OMAP4: clock data: Keep GPMC clocks always enabled and hardware managed
  OMAP4: powerdomain data: Fix core mem states and missing cefuse flag
  OMAP2+: PM: Initialise sleep_switch to a non-valid value
  OMAP4: hwmod data: Modify DSS opt clocks
  OMAP4: iommu: fix clock name
  omap: iovmm: s/sg_dma_len(sg)/sg->length/
  omap: iommu: fix pte programming
  arm: omap3: cm-t35: fix slow path warning
  arm: omap3: cm-t35: minor comments fixes
  omap: ZOOM: QUART: Request reset GPIO
  ...

Merge branch 'next/soc' of git://git./linux/kernel/git/arm/linux-arm-soc

* 'next/soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/linux-arm-soc:
  MAINTAINERS: add maintainer of CSR SiRFprimaII machine
  ARM: CSR: initializing L2 cache
  ARM: CSR: mapping early DEBUG_LL uart
  ARM: CSR: Adding CSR SiRFprimaII board support
  OMAP4: clocks: Update the clock tree with 4460 clock nodes
  OMAP4: PRCM: OMAP4460 specific PRM and CM register bitshifts
  OMAP4: ID: add omap_has_feature for max freq supported
  OMAP: ID: introduce chip detection for OMAP4460
  ARM: Xilinx: merge board file into main platform code
  ARM: Xilinx: Adding Xilinx board support

Fix up conflicts in arch/arm/mach-omap2/cm-regbits-44xx.h

Merge branch 'next-i2c' of git://git.fluff.org/bjdooks/linux

* 'next-i2c' of git://git.fluff.org/bjdooks/linux:
  i2c-eg20t : Fix the issue of Combined R/W transfer mode
  i2c-eg20t : Support Combined R/W transfer mode
  i2c: Tegra: Add DeviceTree support

asm-generic/atomic.h: allow SMP peeps to leverage this

Only a few core funcs need to be implemented for SMP systems, so allow the
arches to override them while getting the rest for free.

At least, this is enough to allow the Blackfin SMP port to use things.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: Arun Sharma <asharma@fb.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

asm-generic/atomic.h: add atomic_set_mask() helper

Since arches are expected to implement this guy, add a common version for
people the same way as atomic_clear_mask is handled.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: Arun Sharma <asharma@fb.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

asm-generic/atomic.h: fix type used in atomic_clear_mask

The atomic helpers are supposed to take an atomic_t pointer, not a random
unsigned long pointer. So convert atomic_clear_mask over.

While we're here, also add some nice documentation to the func.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: Arun Sharma <asharma@fb.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

asm-generic/atomic.h: simplify inc/dec test helpers

We already declared inc/dec helpers, so we don't need to call the
atomic_{add,sub}_return funcs directly.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: Arun Sharma <asharma@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

atomic: Update comments in atomic.h

This clarifies the differences between <linux/atomic.h> and
<asm-generic/atomic.h>

Signed-off-by: Arun Sharma <asharma@fb.com>
Suggested-by: Mike Frysinger <vapier.adi@gmail.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

atomic: cleanup asm-generic atomic*.h inclusion

After changing all consumers of atomics to include <linux/atomic.h>, we
ran into some compile time errors due to this dependency chain:

linux/atomic.h
  -> asm/atomic.h
    -> asm-generic/atomic-long.h

where atomic-long.h could use funcs defined later in linux/atomic.h
without a prototype.  This patches moves the code that includes
asm-generic/atomic*.h to linux/atomic.h.

Archs that need <asm-generic/atomic64.h> need to select
CONFIG_GENERIC_ATOMIC64 from now on (some of them used to include it
unconditionally).

Compile tested on i386 and x86_64 with allnoconfig.

Signed-off-by: Arun Sharma <asharma@fb.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

atomic: move atomic_add_unless to generic code

This is in preparation for more generic atomic primitives based on
__atomic_add_unless.

Signed-off-by: Arun Sharma <asharma@fb.com>
Signed-off-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

atomic: use <linux/atomic.h>

This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>

Signed-off-by: Arun Sharma <asharma@fb.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

asm-generic: add another generic ext2 atomic bitops

The majority of architectures implement ext2 atomic bitops as
test_and_{set,clear}_bit() without spinlock.

This adds this type of generic implementation in ext2-atomic-setbit.h and
use it wherever possible.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Suggested-by: Andreas Dilger <adilger@dilger.ca>
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fail_make_request: cleanup should_fail_request

This changes should_fail_request() to more usable wrapper function of
should_fail(). It can avoid putting #ifdef CONFIG_FAIL_MAKE_REQUEST in
the middle of a function.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fail_page_alloc: simplify debugfs initialization

Now cleanup_fault_attr_dentries() recursively removes a directory, So we
can simplify the error handling in the initialization code and no need
to hold dentry structs for each debugfs file.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

failslab: simplify debugfs initialization

Now cleanup_fault_attr_dentries() recursively removes a directory, So we
can simplify the error handling in the initialization code and no need
to hold dentry structs for each debugfs file.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fault-injection: use debugfs_remove_recursive

Use debugfs_remove_recursive() to simplify initialization and
deinitialization of fault injection debugfs files.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fault-injection: cleanup simple attribute of stacktrace_depth

Minor cosmetic changes for simple attribute of stacktrace_depth:

- use min_t()
- reduce #ifdef by moving a function
- do not use partly capitalized function name

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fault-injection: remove nonexistent function extern

should_fail_srandom() does not exist.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fault-injection: do not include unneeded header

No need to include linux/kallsyms.h.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ramoops: make record_size a module parameter

The size of the dump is currently set using the RECORD_SIZE macro which
is set to a page size. This patch makes the record size a module
parameter and allows it to be set through platform data as well to allow
larger dumps if needed.

Signed-off-by: Sergiu Iordache <sergiu@chromium.org>
Acked-by: Marco Stornelli <marco.stornelli@gmail.com>
Cc: "Ahmed S. Darwish" <darwish.07@gmail.com>
Cc: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ramoops: move dump_oops into platform data

The platform driver currently allows setting the mem_size and
mem_address.

ince dump_oops is also a module parameter it would be more consistent if
it could be set through platform data as well.

Signed-off-by: Sergiu Iordache <sergiu@chromium.org>
Acked-by: Marco Stornelli <marco.stornelli@gmail.com>
Cc: "Ahmed S. Darwish" <darwish.07@gmail.com>
Cc: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ramoops: add new line to each print

Add new line to each print.

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
Reported-by: Stevie Trujillo <stevie.trujillo@gmail.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Américo Wang <xiyou.wangcong@gmail.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ramoops: use module parameters instead of platform data if not available

Use generic module parameters instead of platform data, if platform data
are not available. This limitation has been introduced with commit
c3b92ce9e75 ("ramoops: use the platform data structure instead of module
params").

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Américo Wang <xiyou.wangcong@gmail.com>
Reported-by: Stevie Trujillo <stevie.trujillo@gmail.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Vmware balloon: switch to using sysem-wide freezable workqueue

With the arrival of concurrency-managed workqueues there is no need for
our driver to use dedicated workqueue; system-wide one should suffice just
fine.

[akpm@linux-foundation.org: fix comment layout & grammar]
Signed-off-by: Dmitry Torokhov <dtor@vmware.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drivers/w1/slaves/w1_therm.c: add support for DS28EA00

Signed-off-by: Christian Glindkamp <christian.glindkamp@taskit.de>
Cc: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

panic, vt: do not force oops output when panic_timeout < 0

Don't force output if you intend to reboot immediately.

In this patch, I'm disabling the functionality enabled by
vc->vc_panic_force_write if panic_timeout < 0 (i.e.  no timeout).
vc_panic_force_write is only enabled for fb video consoles if the
FBINFO_CAN_FORCE_OUTPUT flag is set.

For our application, we're using ram_oops to preserved the panic in
memory.  We want to reliably, and as fast as possible, machine_restart.
The vc_panic_force_write flag results in a bunch of graphics driver code
to be invoked which slows down restart and decreases reliability.  Since
we're already storing the panic in RAM and are going to reboot
immediately, there is no benefit in mode switching back to the vc in
order to display the panic output.  The log buffer will get flushed by
the console_unblank() call so remote management consoles should see all
output.

Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Olaf Hering <olaf@aepfle.de>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

panic: panic=-1 for immediate reboot

When a kernel BUG or oops occurs, ChromeOS intends to panic and
immediately reboot, with stacktrace and other messages preserved in RAM
across reboot.

But the longer we delay, the more likely the user is to poweroff and
lose the info.

panic_timeout (seconds before rebooting) is set by panic= boot option or
sysctl or /proc/sys/kernel/panic; but 0 means wait forever, so at
present we have to delay at least 1 second.

Let a negative number mean reboot immediately (with the small cosmetic
benefit of suppressing that newline-less "Rebooting in %d seconds.."
message).

Signed-off-by: Hugh Dickins <hughd@chromium.org>
Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Olaf Hering <olaf@aepfle.de>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Documentation/DMA-API-HOWTO.txt: fix misleading example

See: DMA-API.txt, part Id, DMA_FROM_DEVICE description.

Signed-off-by: Michal Miroslaw <mirq-linux@rere.qmqm.pl>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

include/linux/dma-mapping.h: remove DMA_xxBIT_MASK macros

git grep shows there are no users in tree, so we can remove them safely.

Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Jiri Slaby <jslaby@suse.cz>
Acked-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

gcov: disable CONSTRUCTORS for UML

Selecting GCOV for UML causing configuration mismatch:

warning: (GCOV_KERNEL) selects CONSTRUCTORS which has unmet direct dependencies (!UML)

Constructors are not needed for UML.

Signed-off-by: Vitaliy Ivanov <vitalivanov@gmail.com>
Cc: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Acked-by: Richard Weinberger <richard@nod.at>
Acked-by: WANG Cong <xiyou.wangcong@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drivers/edac/mpc85xx_edac.c: correct offset_in_page mask bits in edac_mc_handle_ce()

Parameter offset_in_page in edac_mc_handle_ce() should mask the higher
bits above the page size, not the lower bits. The original input
sometimes causes a crash.

Signed-off-by: Kai.Jiang <Kai.Jiang@freescale.com>
Signed-off-by: Shaohui Xie <Shaohui.Xie@freescale.com>
Cc: Anton Vorontsov <avorontsov@mvista.com>
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ipc: introduce shm_rmid_forced sysctl

Add support for the shm_rmid_forced sysctl.  If set to 1, all shared
memory objects in current ipc namespace will be automatically forced to
use IPC_RMID.

The POSIX way of handling shmem allows one to create shm objects and
call shmdt(), leaving shm object associated with no process, thus
consuming memory not counted via rlimits.

With shm_rmid_forced=1 the shared memory object is counted at least for
one process, so OOM killer may effectively kill the fat process holding
the shared memory.

It obviously breaks POSIX - some programs relying on the feature would
stop working.  So set shm_rmid_forced=1 only if you're sure nobody uses
"orphaned" memory.  Use shm_rmid_forced=0 by default for compatability
reasons.

The feature was previously impemented in -ow as a configure option.

[akpm@linux-foundation.org: fix documentation, per Randy]
[akpm@linux-foundation.org: fix warning]
[akpm@linux-foundation.org: readability/conventionality tweaks]
[akpm@linux-foundation.org: fix shm_rmid_forced/shm_forced_rmid confusion, use standard comment layout]
Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "Serge E. Hallyn" <serge.hallyn@canonical.com>
Cc: Daniel Lezcano <daniel.lezcano@free.fr>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Solar Designer <solar@openwall.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ipc/mqueue.c: fix mq_open() return value

We return ENOMEM from mqueue_get_inode even when we have enough memory.
Namely in case the system rlimit of mqueue was reached.  This error
propagates to mq_queue and user sees the error unexpectedly.  So fix
this up to properly return EMFILE as described in the manpage:

EMFILE The process already has the maximum number of files and
       message queues open.

instead of:

ENOMEM Insufficient memory.

With the previous patch we just switch to ERR_PTR/PTR_ERR/IS_ERR error
handling here.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ipc/mqueue.c: refactor failure handling

If new_inode fails to allocate an inode we need only to return with
NULL. But now we test the opposite and have all the work in a nested
block. So do the opposite to save one indentation level (and remove
unnecessary line breaks).

This is only a preparation/cleanup for the next patch where we fix up
return values from mqueue_get_inode.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cpumask: add cpumask_var_t documentation

cpumask_var_t has one notable difference from cpumask_t. Add the
explanation.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Thiago Farina <tfransosi@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cpumask: alloc_cpumask_var() use NUMA_NO_NODE

NUMA_NO_NODE and numa_node_id() have different meanings. NUMA_NO_NODE is
obviously the recommended fallback.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cpumask: convert for_each_cpumask() with for_each_cpu()

Adapt new API fashion.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fs/exec.c:acct_arg_size(): ptl is no longer needed for add_mm_counter()

acct_arg_size() takes ->page_table_lock around add_mm_counter() if
!SPLIT_RSS_COUNTING. This is not needed after commit 172703b08cd0 ("mm:
delete non-atomic mm counter implementation").

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Matt Fleming <matt.fleming@linux.intel.com>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

exec: do not retry load_binary method if CONFIG_MODULES=n

If CONFIG_MODULES=n, it makes no sense to retry the list of binary formats
handler because the list will not be modified by request_module().

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Richard Weinberger <richard@nod.at>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

exec: do not call request_module() twice from search_binary_handler()

Currently, search_binary_handler() tries to load binary loader module
using request_module() if a loader for the requested program is not yet
loaded. But second attempt of request_module() does not affect the result
of search_binary_handler().

If request_module() triggered recursion, calling request_module() twice
causes 2 to the power of MAX_KMOD_CONCURRENT (= 50) repetitions. It is
not an infinite loop but is sufficient for users to consider as a hang up.

Therefore, this patch changes not to call request_module() twice, making 1
to the power of MAX_KMOD_CONCURRENT repetitions in case of recursion.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: Richard Weinberger <richard@nod.at>
Tested-by: Richard Weinberger <richard@nod.at>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fs/exec.c: use BUILD_BUG_ON for VM_STACK_FLAGS & VM_STACK_INCOMPLETE_SETUP

Commit a8bef8ff6ea1 ("mm: migration: avoid race between
shift_arg_pages() and rmap_walk() during migration by not migrating
temporary stacks") introduced a BUG_ON() to ensure that VM_STACK_FLAGS
and VM_STACK_INCOMPLETE_SETUP do not overlap. The check is a compile
time one, so BUILD_BUG_ON is more appropriate.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Richard Weinberger <richard@nod.at>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

kernel/fork.c: fix a few coding style issues

Signed-off-by: Daniel Rebelo de Oliveira <psykon@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

proc: fix a race in do_io_accounting()

If an inode's mode permits opening /proc/PID/io and the resulting file
descriptor is kept across execve() of a setuid or similar binary, the
ptrace_may_access() check tries to prevent using this fd against the
task with escalated privileges.

Unfortunately, there is a race in the check against execve(). If
execve() is processed after the ptrace check, but before the actual io
information gathering, io statistics will be gathered from the
privileged process. At least in theory this might lead to gathering
sensible information (like ssh/ftp password length) that wouldn't be
available otherwise.

Holding task->signal->cred_guard_mutex while gathering the io
information should protect against the race.

The order of locking is similar to the one inside of ptrace_attach():
first goes cred_guard_mutex, then lock_task_sighand().

Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

procfs: return ENOENT on opening a being-removed proc entry

Change the return value to ENOENT. This return value is then returned
when opening the proc entry that have been removed. For example,
open("/proc/bus/pci/XX/YY") when the corresponding device is being
hot-removed.

Signed-off-by: Daisuke Ogino <ogino.daisuke@jp.fujitsu.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Acked-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

h8300/m68k/xtensa: __FD_ISSET should return 0/1

Harmonise these return values with other architectures. In some cases
this affects all compilers and in other cases non-gcc compilers only.

Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Chris Zankel <chris@zankel.net>
Cc: Ulrich Drepper <drepper@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

do_coredump: fix the "ispipe" error check

do_coredump() assumes that if format_corename() fails it should return
-ENOMEM. This is not true, for example cn_print_exe_file() can propagate
the error from d_path. Even if it was true, this is too fragile. Change
the code to check "ispipe < 0".

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Reviewed-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

coredump: escape / in hostname and comm

Change every occurence of / in comm and hostname to !.  If the process
changes its name to contain /, the core is not dumped (if the directory
tree doesn't exist like that).  The same with hostname being something
like myhost/3.  Fix this behaviour by using the escape loop used in %E.
(We extract it to a separate function.)

Now both with comm == myprocess/1 and hostname == myhost/1, the core is
dumped like (kernel.core_pattern='core.%p.%e.%h):
core.2349.myprocess!1.myhost!1

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

coredump: use task comm instead of (unknown)

If we don't know the file corresponding to the binary (i.e. exe_file is
unknown), use "task->comm (path unknown)" instead of simple "(unknown)"
as suggested by ak.

The fallback is the same as %e except it will append "(path unknown)".

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ptrace: unify show_regs() prototype

[ poleg@redhat.com: no need to declare show_regs() in ptrace.h, sched.h does this ]
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cpusets: randomize node rotor used in cpuset_mem_spread_node()

[ This patch has already been accepted as commit 0ac0c0d0f837 but later
  reverted (commit 35926ff5fba8) because it itroduced arch specific
  __node_random which was defined only for x86 code so it broke other
  archs.  This is a followup without any arch specific code.  Other than
  that there are no functional changes.]

Some workloads that create a large number of small files tend to assign
too many pages to node 0 (multi-node systems).  Part of the reason is
that the rotor (in cpuset_mem_spread_node()) used to assign nodes starts
at node 0 for newly created tasks.

This patch changes the rotor to be initialized to a random node number
of the cpuset.

[akpm@linux-foundation.org: fix layout]
[Lee.Schermerhorn@hp.com: Define stub numa_random() for !NUMA configuration]
[mhocko@suse.cz: Make it arch independent]
[akpm@linux-foundation.org: fix CONFIG_NUMA=y, MAX_NUMNODES>1 build]
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Paul Menage <menage@google.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Robin Holt <holt@sgi.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Paul Menage <menage@google.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Robin Holt <holt@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

memcg: get rid of percpu_charge_mutex lock

percpu_charge_mutex protects from multiple simultaneous per-cpu charge
caches draining because we might end up having too many work items. At
least this was the case until commit 26fe61684449 ("memcg: fix percpu
cached charge draining frequency") when we introduced a more targeted
draining for async mode.

Now that also sync draining is targeted we can safely remove mutex
because we will not send more work than the current number of CPUs.
FLUSHING_CACHED_CHARGE protects from sending the same work multiple
times and stock->nr_pages == 0 protects from pointless sending a work if
there is obviously nothing to be done. This is of course racy but we
can live with it as the race window is really small (we would have to
see FLUSHING_CACHED_CHARGE cleared while nr_pages would be still
non-zero).

The only remaining place where we can race is synchronous mode when we
rely on FLUSHING_CACHED_CHARGE test which might have been set by other
drainer on the same group but we should wait in that case as well.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

memcg: add mem_cgroup_same_or_subtree() helper

We are checking whether a given two groups are same or at least in the
same subtree of a hierarchy at several places. Let's make a helper for
it to make code easier to read.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

memcg: unify sync and async per-cpu charge cache draining

Currently we have two ways how to drain per-CPU caches for charges.
drain_all_stock_sync will synchronously drain all caches while
drain_all_stock_async will asynchronously drain only those that refer to
a given memory cgroup or its subtree in hierarchy.  Targeted async
draining has been introduced by 26fe6168 (memcg: fix percpu cached
charge draining frequency) to reduce the cpu workers number.

sync draining is currently triggered only from mem_cgroup_force_empty
which is triggered only by userspace (mem_cgroup_force_empty_write) or
when a cgroup is removed (mem_cgroup_pre_destroy).  Although these are
not usually frequent operations it still makes some sense to do targeted
draining as well, especially if the box has many CPUs.

This patch unifies both methods to use the single code (drain_all_stock)
which relies on the original async implementation and just adds
flush_work to wait on all caches that are still under work for the sync
mode.  We are using FLUSHING_CACHED_CHARGE bit check to prevent from
waiting on a work that we haven't triggered.  Please note that both sync
and async functions are currently protected by percpu_charge_mutex so we
cannot race with other drainers.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

memcg: do not try to drain per-cpu caches without pages

drain_all_stock_async tries to optimize a work to be done on the work
queue by excluding any work for the current CPU because it assumes that
the context we are called from already tried to charge from that cache
and it's failed so it must be empty already.

While the assumption is correct we can optimize it even more by checking
the current number of pages in the cache. This will also reduce a work
on other CPUs with an empty stock.

For the current CPU we can simply call drain_local_stock rather than
deferring it to the work queue.

[kamezawa.hiroyu@jp.fujitsu.com: use drain_local_stock for current CPU optimization]
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>