pandora-kernel.git
12 years agomemcg: do not try to drain per-cpu caches without pages
Michal Hocko [Tue, 26 Jul 2011 23:08:27 +0000 (16:08 -0700)]
memcg: do not try to drain per-cpu caches without pages

drain_all_stock_async tries to optimize a work to be done on the work
queue by excluding any work for the current CPU because it assumes that
the context we are called from already tried to charge from that cache
and it's failed so it must be empty already.

While the assumption is correct we can optimize it even more by checking
the current number of pages in the cache.  This will also reduce a work
on other CPUs with an empty stock.

For the current CPU we can simply call drain_local_stock rather than
deferring it to the work queue.

[kamezawa.hiroyu@jp.fujitsu.com: use drain_local_stock for current CPU optimization]
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agomemcg: add memory.vmscan_stat
KAMEZAWA Hiroyuki [Tue, 26 Jul 2011 23:08:26 +0000 (16:08 -0700)]
memcg: add memory.vmscan_stat

The commit log of 0ae5e89c60c9 ("memcg: count the soft_limit reclaim
in...") says it adds scanning stats to memory.stat file.  But it doesn't
because we considered we needed to make a concensus for such new APIs.

This patch is a trial to add memory.scan_stat. This shows
  - the number of scanned pages(total, anon, file)
  - the number of rotated pages(total, anon, file)
  - the number of freed pages(total, anon, file)
  - the number of elaplsed time (including sleep/pause time)

  for both of direct/soft reclaim.

The biggest difference with oringinal Ying's one is that this file
can be reset by some write, as

  # echo 0 ...../memory.scan_stat

Example of output is here. This is a result after make -j 6 kernel
under 300M limit.

  [kamezawa@bluextal ~]$ cat /cgroup/memory/A/memory.scan_stat
  [kamezawa@bluextal ~]$ cat /cgroup/memory/A/memory.vmscan_stat
  scanned_pages_by_limit 9471864
  scanned_anon_pages_by_limit 6640629
  scanned_file_pages_by_limit 2831235
  rotated_pages_by_limit 4243974
  rotated_anon_pages_by_limit 3971968
  rotated_file_pages_by_limit 272006
  freed_pages_by_limit 2318492
  freed_anon_pages_by_limit 962052
  freed_file_pages_by_limit 1356440
  elapsed_ns_by_limit 351386416101
  scanned_pages_by_system 0
  scanned_anon_pages_by_system 0
  scanned_file_pages_by_system 0
  rotated_pages_by_system 0
  rotated_anon_pages_by_system 0
  rotated_file_pages_by_system 0
  freed_pages_by_system 0
  freed_anon_pages_by_system 0
  freed_file_pages_by_system 0
  elapsed_ns_by_system 0
  scanned_pages_by_limit_under_hierarchy 9471864
  scanned_anon_pages_by_limit_under_hierarchy 6640629
  scanned_file_pages_by_limit_under_hierarchy 2831235
  rotated_pages_by_limit_under_hierarchy 4243974
  rotated_anon_pages_by_limit_under_hierarchy 3971968
  rotated_file_pages_by_limit_under_hierarchy 272006
  freed_pages_by_limit_under_hierarchy 2318492
  freed_anon_pages_by_limit_under_hierarchy 962052
  freed_file_pages_by_limit_under_hierarchy 1356440
  elapsed_ns_by_limit_under_hierarchy 351386416101
  scanned_pages_by_system_under_hierarchy 0
  scanned_anon_pages_by_system_under_hierarchy 0
  scanned_file_pages_by_system_under_hierarchy 0
  rotated_pages_by_system_under_hierarchy 0
  rotated_anon_pages_by_system_under_hierarchy 0
  rotated_file_pages_by_system_under_hierarchy 0
  freed_pages_by_system_under_hierarchy 0
  freed_anon_pages_by_system_under_hierarchy 0
  freed_file_pages_by_system_under_hierarchy 0
  elapsed_ns_by_system_under_hierarchy 0

total_xxxx is for hierarchy management.

This will be useful for further memcg developments and need to be
developped before we do some complicated rework on LRU/softlimit
management.

This patch adds a new struct memcg_scanrecord into scan_control struct.
sc->nr_scanned at el is not designed for exporting information.  For
example, nr_scanned is reset frequentrly and incremented +2 at scanning
mapped pages.

To avoid complexity, I added a new param in scan_control which is for
exporting scanning score.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Ying Han <yinghan@google.com>
Cc: Andrew Bresticker <abrestic@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agomemcg: fix behavior of mem_cgroup_resize_limit()
Daisuke Nishimura [Tue, 26 Jul 2011 23:08:25 +0000 (16:08 -0700)]
memcg: fix behavior of mem_cgroup_resize_limit()

Commit 22a668d7c3ef ("memcg: fix behavior under memory.limit equals to
memsw.limit") introduced "memsw_is_minimum" flag, which becomes true
when mem_limit == memsw_limit.  The flag is checked at the beginning of
reclaim, and "noswap" is set if the flag is true, because using swap is
meaningless in this case.

This works well in most cases, but when we try to shrink mem_limit,
which is the same as memsw_limit now, we might fail to shrink mem_limit
because swap doesn't used.

This patch fixes this behavior by:
 - check MEM_CGROUP_RECLAIM_SHRINK at the begining of reclaim
 - If it is set, don't set "noswap" flag even if memsw_is_minimum is true.

Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <bsingharora@gmail.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Ying Han <yinghan@google.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agomemcg: fix vmscan count in small memcgs
KAMEZAWA Hiroyuki [Tue, 26 Jul 2011 23:08:24 +0000 (16:08 -0700)]
memcg: fix vmscan count in small memcgs

Commit 246e87a93934 ("memcg: fix get_scan_count() for small targets")
fixes the memcg/kswapd behavior against small targets and prevent vmscan
priority too high.

But the implementation is too naive and adds another problem to small
memcg.  It always force scan to 32 pages of file/anon and doesn't handle
swappiness and other rotate_info.  It makes vmscan to scan anon LRU
regardless of swappiness and make reclaim bad.  This patch fixes it by
adjusting scanning count with regard to swappiness at el.

At a test "cat 1G file under 300M limit." (swappiness=20)
 before patch
        scanned_pages_by_limit 360919
        scanned_anon_pages_by_limit 180469
        scanned_file_pages_by_limit 180450
        rotated_pages_by_limit 31
        rotated_anon_pages_by_limit 25
        rotated_file_pages_by_limit 6
        freed_pages_by_limit 180458
        freed_anon_pages_by_limit 19
        freed_file_pages_by_limit 180439
        elapsed_ns_by_limit 429758872
 after patch
        scanned_pages_by_limit 180674
        scanned_anon_pages_by_limit 24
        scanned_file_pages_by_limit 180650
        rotated_pages_by_limit 35
        rotated_anon_pages_by_limit 24
        rotated_file_pages_by_limit 11
        freed_pages_by_limit 180634
        freed_anon_pages_by_limit 0
        freed_file_pages_by_limit 180634
        elapsed_ns_by_limit 367119089
        scanned_pages_by_system 0

the numbers of scanning anon are decreased(as expected), and elapsed time
reduced. By this patch, small memcgs will work better.
(*) Because the amount of file-cache is much bigger than anon,
    recalaim_stat's rotate-scan counter make scanning files more.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Ying Han <yinghan@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agomemcg: change memcg_oom_mutex to spinlock
Michal Hocko [Tue, 26 Jul 2011 23:08:24 +0000 (16:08 -0700)]
memcg: change memcg_oom_mutex to spinlock

memcg_oom_mutex is used to protect memcg OOM path and eventfd interface
for oom_control.  None of the critical sections which it protects sleep
(eventfd_signal works from atomic context and the rest are simple linked
list resp.  oom_lock atomic operations).

Mutex is also too heavyweight for those code paths because it triggers a
lot of scheduling.  It also makes makes convoying effects more visible
when we have a big number of oom killing because we take the lock
mutliple times during mem_cgroup_handle_oom so we have multiple places
where many processes can sleep.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agomemcg: make oom_lock 0 and 1 based rather than counter
Michal Hocko [Tue, 26 Jul 2011 23:08:23 +0000 (16:08 -0700)]
memcg: make oom_lock 0 and 1 based rather than counter

Commit 867578cb ("memcg: fix oom kill behavior") introduced a oom_lock
counter which is incremented by mem_cgroup_oom_lock when we are about to
handle memcg OOM situation.  mem_cgroup_handle_oom falls back to a sleep
if oom_lock > 1 to prevent from multiple oom kills at the same time.
The counter is then decremented by mem_cgroup_oom_unlock called from the
same function.

This works correctly but it can lead to serious starvations when we have
many processes triggering OOM and many CPUs available for them (I have
tested with 16 CPUs).

Consider a process (call it A) which gets the oom_lock (the first one
that got to mem_cgroup_handle_oom and grabbed memcg_oom_mutex) and other
processes that are blocked on the mutex.  While A releases the mutex and
calls mem_cgroup_out_of_memory others will wake up (one after another)
and increase the counter and fall into sleep (memcg_oom_waitq).

Once A finishes mem_cgroup_out_of_memory it takes the mutex again and
decreases oom_lock and wakes other tasks (if releasing memory by
somebody else - e.g.  killed process - hasn't done it yet).

A testcase would look like:
  Assume malloc XXX is a program allocating XXX Megabytes of memory
  which touches all allocated pages in a tight loop
  # swapoff SWAP_DEVICE
  # cgcreate -g memory:A
  # cgset -r memory.oom_control=0   A
  # cgset -r memory.limit_in_bytes= 200M
  # for i in `seq 100`
  # do
  #     cgexec -g memory:A   malloc 10 &
  # done

The main problem here is that all processes still race for the mutex and
there is no guarantee that we will get counter back to 0 for those that
got back to mem_cgroup_handle_oom.  In the end the whole convoy
in/decreases the counter but we do not get to 1 that would enable
killing so nothing useful can be done.  The time is basically unbounded
because it highly depends on scheduling and ordering on mutex (I have
seen this taking hours...).

This patch replaces the counter by a simple {un}lock semantic.  As
mem_cgroup_oom_{un}lock works on the a subtree of a hierarchy we have to
make sure that nobody else races with us which is guaranteed by the
memcg_oom_mutex.

We have to be careful while locking subtrees because we can encounter a
subtree which is already locked: hierarchy:

          A
        /   \
       B     \
      /\      \
     C  D     E

B - C - D tree might be already locked.  While we want to enable locking
E subtree because OOM situations cannot influence each other we
definitely do not want to allow locking A.

Therefore we have to refuse lock if any subtree is already locked and
clear up the lock for all nodes that have been set up to the failure
point.

On the other hand we have to make sure that the rest of the world will
recognize that a group is under OOM even though it doesn't have a lock.
Therefore we have to introduce under_oom variable which is incremented
and decremented for the whole subtree when we enter resp.  leave
mem_cgroup_handle_oom.  under_oom, unlike oom_lock, doesn't need be
updated under memcg_oom_mutex because its users only check a single
group and they use atomic operations for that.

This can be checked easily by the following test case:

  # cgcreate -g memory:A
  # cgset -r memory.use_hierarchy=1 A
  # cgset -r memory.oom_control=1   A
  # cgset -r memory.limit_in_bytes= 100M
  # cgset -r memory.memsw.limit_in_bytes= 100M
  # cgcreate -g memory:A/B
  # cgset -r memory.oom_control=1 A/B
  # cgset -r memory.limit_in_bytes=20M
  # cgset -r memory.memsw.limit_in_bytes=20M
  # cgexec -g memory:A/B malloc 30  &    #->this will be blocked by OOM of group B
  # cgexec -g memory:A   malloc 80  &    #->this will be blocked by OOM of group A

While B gets oom_lock A will not get it.  Both of them go into sleep and
wait for an external action.  We can make the limit higher for A to
enforce waking it up

  # cgset -r memory.memsw.limit_in_bytes=300M A
  # cgset -r memory.limit_in_bytes=300M A

malloc in A has to wake up even though it doesn't have oom_lock.

Finally, the unlock path is very easy because we always unlock only the
subtree we have locked previously while we always decrement under_oom.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agomemcg: consolidate memory cgroup lru stat functions
KAMEZAWA Hiroyuki [Tue, 26 Jul 2011 23:08:22 +0000 (16:08 -0700)]
memcg: consolidate memory cgroup lru stat functions

In mm/memcontrol.c, there are many lru stat functions as..

  mem_cgroup_zone_nr_lru_pages
  mem_cgroup_node_nr_file_lru_pages
  mem_cgroup_nr_file_lru_pages
  mem_cgroup_node_nr_anon_lru_pages
  mem_cgroup_nr_anon_lru_pages
  mem_cgroup_node_nr_unevictable_lru_pages
  mem_cgroup_nr_unevictable_lru_pages
  mem_cgroup_node_nr_lru_pages
  mem_cgroup_nr_lru_pages
  mem_cgroup_get_local_zonestat

Some of them are under #ifdef MAX_NUMNODES >1 and others are not.
This seems bad. This patch consolidates all functions into

  mem_cgroup_zone_nr_lru_pages()
  mem_cgroup_node_nr_lru_pages()
  mem_cgroup_nr_lru_pages()

For these functions, "which LRU?" information is passed by a mask.

example:
  mem_cgroup_nr_lru_pages(mem, BIT(LRU_ACTIVE_ANON))

And I added some macro as ALL_LRU, ALL_LRU_FILE, ALL_LRU_ANON.

example:
  mem_cgroup_nr_lru_pages(mem, ALL_LRU)

BTW, considering layout of NUMA memory placement of counters, this patch seems
to be better.

Now, when we gather all LRU information, we scan in following orer
    for_each_lru -> for_each_node -> for_each_zone.

This means we'll touch cache lines in different node in turn.

After patch, we'll scan
    for_each_node -> for_each_zone -> for_each_lru(mask)

Then, we'll gather information in the same cacheline at once.

[akpm@linux-foundation.org: fix warnigns, build error]
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Ying Han <yinghan@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agomemcg: export memory cgroup's swappiness with mem_cgroup_swappiness()
KAMEZAWA Hiroyuki [Tue, 26 Jul 2011 23:08:21 +0000 (16:08 -0700)]
memcg: export memory cgroup's swappiness with mem_cgroup_swappiness()

Each memory cgroup has a 'swappiness' value which can be accessed by
get_swappiness(memcg).  The major user is try_to_free_mem_cgroup_pages()
and swappiness is passed by argument.  It's propagated by scan_control.

get_swappiness() is a static function but some planned updates will need
to get swappiness from files other than memcontrol.c This patch exports
get_swappiness() as mem_cgroup_swappiness().  With this, we can remove the
argument of swapiness from try_to_free...  and drop swappiness from
scan_control.  only memcg uses it.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Ying Han <yinghan@google.com>
Cc: Shaohua Li <shaohua.li@intel.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agortc: fix hrtimer deadlock
Thomas Gleixner [Tue, 26 Jul 2011 23:08:20 +0000 (16:08 -0700)]
rtc: fix hrtimer deadlock

Ben reported a lockup related to rtc. The lockup happens due to:

CPU0                                        CPU1

rtc_irq_set_state()     __run_hrtimer()
  spin_lock_irqsave(&rtc->irq_task_lock)    rtc_handle_legacy_irq();
      spin_lock(&rtc->irq_task_lock);
  hrtimer_cancel()
    while (callback_running);

So the running callback never finishes as it's blocked on
rtc->irq_task_lock.

Use hrtimer_try_to_cancel() instead and drop rtc->irq_task_lock while
waiting for the callback.  Fix this for both rtc_irq_set_state() and
rtc_irq_set_freq().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reported-by: Ben Greear <greearb@candelatech.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agortc: limit frequency
Thomas Gleixner [Tue, 26 Jul 2011 23:08:19 +0000 (16:08 -0700)]
rtc: limit frequency

Due to the hrtimer self rearming mode a user can DoS the machine simply
because it's starved by hrtimer events.

The RTC hrtimer is self rearming.  We really need to limit the frequency
to something sensible.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Ben Greear <greearb@candelatech.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agortc: handle errors correctly in rtc_irq_set_state()
Thomas Gleixner [Tue, 26 Jul 2011 23:08:18 +0000 (16:08 -0700)]
rtc: handle errors correctly in rtc_irq_set_state()

The code checks the correctness of the parameters, but unconditionally
arms/disarms the hrtimer.

The result is that a random task might arm/disarm rtc timer and surprise
the real owner by either generating events or by stopping them.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Ben Greear <greearb@candelatech.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agomn10300, exec: remove redundant set_fs(USER_DS)
Mathias Krause [Tue, 26 Jul 2011 23:08:17 +0000 (16:08 -0700)]
mn10300, exec: remove redundant set_fs(USER_DS)

The address limit is already set in flush_old_exec() so this
set_fs(USER_DS) is redundant.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agodrivers/base/power/opp.c: fix dev_opp initial value
Jonghwan Choi [Tue, 26 Jul 2011 23:08:16 +0000 (16:08 -0700)]
drivers/base/power/opp.c: fix dev_opp initial value

Dev_opp initial value shoule be ERR_PTR(), IS_ERR() is used to check
error.

Signed-off-by: Jonghwan Choi <jhbird.choi@samsung.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agofrv, exec: remove redundant set_fs(USER_DS)
Mathias Krause [Tue, 26 Jul 2011 23:08:15 +0000 (16:08 -0700)]
frv, exec: remove redundant set_fs(USER_DS)

The address limit is already set in flush_old_exec() so those calls to
set_fs(USER_DS) are redundant.

Also removed the dead code in flush_thread().

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoi2c-eg20t : Fix the issue of Combined R/W transfer mode
Tomoya MORINAGA [Thu, 23 Jun 2011 07:17:10 +0000 (16:17 +0900)]
i2c-eg20t : Fix the issue of Combined R/W transfer mode

issue-1
In case combined transfer mode fails halfway, the processing must be stopped halfway.
However currently, the processing is continued.
This patch breaks the processing.

issue-2
Currently, pch_i2c_xfer returns read/write size at that time.
However pch_i2c_xfer must return the number of messages to be read/written.
This patch modifies correctly.

Signed-off-by: Tomoya MORINAGA <tomoya-linux@dsn.okisemi.com>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
12 years agoi2c-eg20t : Support Combined R/W transfer mode
Tomoya MORINAGA [Thu, 9 Jun 2011 02:29:29 +0000 (11:29 +0900)]
i2c-eg20t : Support Combined R/W transfer mode

Currently, Combined R/W transfer mode is not supported.
This patch enables Combined R/W transfer mode.

Signed-off-by: Tomoya MORINAGA <tomoya-linux@dsn.okisemi.com>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
12 years agoi2c: Tegra: Add DeviceTree support
John Bonesio [Wed, 22 Jun 2011 16:16:56 +0000 (09:16 -0700)]
i2c: Tegra: Add DeviceTree support

This patch modifies the tegra i2c driver so that it can be initiailized
using the device tree along with the devices connected to the i2c bus.

Signed-off-by: John Bonesio <bones@secretlab.ca>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: OIof Johansson <olof@lixom.net>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
12 years agoMerge branch 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus
Linus Torvalds [Tue, 26 Jul 2011 21:17:28 +0000 (14:17 -0700)]
Merge branch 'upstream' of git://git.linux-mips.org/upstream-linus

* 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus: (31 commits)
  MIPS: Close races in TLB modify handlers.
  MIPS: Add uasm UASM_i_SRL_SAFE macro.
  MIPS: RB532: Use hex_to_bin()
  MIPS: Enable cpu_has_clo_clz for MIPS Technologies' platforms
  MIPS: PowerTV: Provide cpu-feature-overrides.h
  MIPS: Remove pointless return statement from empty void functions.
  MIPS: Limit fixrange_init() to the FIXMAP region
  MIPS: Install handlers for software IRQs
  MIPS: Move FIXADDR_TOP into spaces.h
  MIPS: Add SYNC after cacheflush
  MIPS: pfn_valid() is broken on low memory HIGHMEM systems
  MIPS: HIGHMEM DMA on noncoherent MIPS32 processors
  MIPS: topdown mmap support
  MIPS: Remove redundant addr_limit assignment on exec.
  MIPS: AR7: Replace __attribute__((__packed__)) with __packed
  MIPS: AR7: Remove 'space before tabs' in platform.c
  MIPS: Lantiq: Add missing clk_enable and clk_disable functions.
  MIPS: AR7: Fix trailing semicolon bug in clock.c
  MAINTAINERS: Update MIPS entry.
  MIPS: BCM63xx: Remove duplicate PERF_IRQSTAT_REG definition
  ...

12 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph...
Linus Torvalds [Tue, 26 Jul 2011 20:38:50 +0000 (13:38 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/sage/ceph-client

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (23 commits)
  ceph: document unlocked d_parent accesses
  ceph: explicitly reference rename old_dentry parent dir in request
  ceph: document locking for ceph_set_dentry_offset
  ceph: avoid d_parent in ceph_dentry_hash; fix ceph_encode_fh() hashing bug
  ceph: protect d_parent access in ceph_d_revalidate
  ceph: protect access to d_parent
  ceph: handle racing calls to ceph_init_dentry
  ceph: set dir complete frag after adding capability
  rbd: set blk_queue request sizes to object size
  ceph: set up readahead size when rsize is not passed
  rbd: cancel watch request when releasing the device
  ceph: ignore lease mask
  ceph: fix ceph_lookup_open intent usage
  ceph: only link open operations to directory unsafe list if O_CREAT|O_TRUNC
  ceph: fix bad parent_inode calc in ceph_lookup_open
  ceph: avoid carrying Fw cap during write into page cache
  libceph: don't time out osd requests that haven't been received
  ceph: report f_bfree based on kb_avail rather than diffing.
  ceph: only queue capsnap if caps are dirty
  ceph: fix snap writeback when racing with writes
  ...

12 years agomerge fchmod() and fchmodat() guts, kill ancient broken kludge
Al Viro [Tue, 26 Jul 2011 08:15:54 +0000 (04:15 -0400)]
merge fchmod() and fchmodat() guts, kill ancient broken kludge

The kludge in question is undocumented and doesn't work for 32bit
binaries on amd64, sparc64 and s390.  Passing (mode_t)-1 as
mode had (since 0.99.14v and contrary to behaviour of any
other Unix, prescriptions of POSIX, SuS and our own manpages)
was kinda-sorta no-op.  Note that any software relying on
that (and looking for examples shows none) would be visibly
broken on sparc64, where practically all userland is built
32bit.  No such complaints noticed...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
12 years agoxfs: fix misspelled S_IS...()
Al Viro [Tue, 26 Jul 2011 00:54:24 +0000 (20:54 -0400)]
xfs: fix misspelled S_IS...()

mode_t is not a bitmap...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
12 years agoxfs: get rid of open-coded S_ISREG(), etc.
Al Viro [Tue, 26 Jul 2011 06:31:30 +0000 (02:31 -0400)]
xfs: get rid of open-coded S_ISREG(), etc.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
12 years agogma500: udelay(20000) it too long again
Stephen Rothwell [Mon, 25 Jul 2011 05:18:44 +0000 (15:18 +1000)]
gma500: udelay(20000) it too long again

so replace it with mdelay(20).

Fixes build error:

  ERROR: "__bad_udelay" [drivers/staging/gma500/psb_gfx.ko] undefined!

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoUSB / Renesas: Fix build issue related to struct scatterlist
Rafael J. Wysocki [Tue, 26 Jul 2011 18:51:01 +0000 (20:51 +0200)]
USB / Renesas: Fix build issue related to struct scatterlist

Fix build issue caused by undefined struct scatterlist in
drivers/usb/renesas_usbhs/fifo.c.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoMMC / TMIO: Fix build issue related to struct scatterlist
Rafael J. Wysocki [Tue, 26 Jul 2011 18:50:23 +0000 (20:50 +0200)]
MMC / TMIO: Fix build issue related to struct scatterlist

Fix build issue caused by undefined struct scatterlist in
drivers/mmc/host/tmio_mmc.c.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoMerge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux...
Linus Torvalds [Tue, 26 Jul 2011 18:34:40 +0000 (11:34 -0700)]
Merge branch 'for_linus' of git://git./linux/kernel/git/jack/linux-fs-2.6

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6:
  jbd: change the field "b_cow_tid" of struct journal_head from type unsigned to tid_t
  ext3.txt: update the links in the section "useful links" to the latest ones
  ext3: Fix data corruption in inodes with journalled data
  ext2: check xattr name_len before acquiring xattr_sem in ext2_xattr_get
  ext3: Fix compilation with -DDX_DEBUG
  quota: Remove unused declaration
  jbd: Use WRITE_SYNC in journal checkpoint.
  jbd: Fix oops in journal_remove_journal_head()
  ext3: Return -EINVAL when start is beyond the end of fs in ext3_trim_fs()
  ext3/ioctl.c: silence sparse warnings about different address spaces
  ext3/ext4 Documentation: remove bh/nobh since it has been deprecated
  ext3: Improve truncate error handling
  ext3: use proper little-endian bitops
  ext2: include fs.h into ext2_fs.h
  ext3: Fix oops in ext3_try_to_allocate_with_rsv()
  jbd: fix a bug of leaking jh->b_jcount
  jbd: remove dependency on __GFP_NOFAIL
  ext3: Convert ext3 to new truncate calling convention
  jbd: Add fixed tracepoints
  ext3: Add fixed tracepoints

Resolve conflicts in fs/ext3/fsync.c due to fsync locking push-down and
new fixed tracepoints.

12 years agoceph: document unlocked d_parent accesses
Sage Weil [Tue, 26 Jul 2011 18:31:26 +0000 (11:31 -0700)]
ceph: document unlocked d_parent accesses

For the most part we don't care about racing with rename when directing
MDS requests; either the old or new parent is fine.  Document that, and
do some minor cleanup.

Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
12 years agoceph: explicitly reference rename old_dentry parent dir in request
Sage Weil [Tue, 26 Jul 2011 18:31:14 +0000 (11:31 -0700)]
ceph: explicitly reference rename old_dentry parent dir in request

We carry a pin on the parent directory for the rename source and dest
dentries.  For the source it's r_locked_dir; we need to explicitly
reference the old_dentry parent as well, since the dentry's d_parent may
change between when the request was created and pinned and when it is
freed.

Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
12 years agoceph: document locking for ceph_set_dentry_offset
Sage Weil [Tue, 26 Jul 2011 18:31:08 +0000 (11:31 -0700)]
ceph: document locking for ceph_set_dentry_offset

Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
12 years agoceph: avoid d_parent in ceph_dentry_hash; fix ceph_encode_fh() hashing bug
Sage Weil [Tue, 26 Jul 2011 18:30:55 +0000 (11:30 -0700)]
ceph: avoid d_parent in ceph_dentry_hash; fix ceph_encode_fh() hashing bug

Have caller pass in a safely-obtained reference to the parent directory
for calculating a dentry's hash valud.

While we're here, simpify the flow through ceph_encode_fh() so that there
is a single exit point and cleanup.

Also fix a bug with the dentry hash calculation: calculate the hash for the
dentry we were given, not its parent.

Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
12 years agoceph: protect d_parent access in ceph_d_revalidate
Sage Weil [Tue, 26 Jul 2011 18:30:43 +0000 (11:30 -0700)]
ceph: protect d_parent access in ceph_d_revalidate

Protect d_parent with d_lock.  Carry a reference.  Simplify the flow so
that there is a single exit point and cleanup.

Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
12 years agoceph: protect access to d_parent
Sage Weil [Tue, 26 Jul 2011 18:30:29 +0000 (11:30 -0700)]
ceph: protect access to d_parent

d_parent is protected by d_lock: use it when looking up a dentry's parent
directory inode.  Also take a reference and drop it in the caller to avoid
a use-after-free.

Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
12 years agoceph: handle racing calls to ceph_init_dentry
Sage Weil [Tue, 26 Jul 2011 18:30:15 +0000 (11:30 -0700)]
ceph: handle racing calls to ceph_init_dentry

The ->lookup() and prepopulate_readdir() callers are working with unhashed
dentries, so we don't have to worry.  The export.c callers, though, need
to initialize something they got back from d_obtain_alias() and are
potentially racing with other callers.  Make sure we don't return unless
the dentry is properly initialized (by us or someone else).

Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
12 years agoceph: set dir complete frag after adding capability
Sage Weil [Tue, 26 Jul 2011 18:30:02 +0000 (11:30 -0700)]
ceph: set dir complete frag after adding capability

Curretly ceph_add_cap clears the complete bit if we are newly issued the
FILE_SHARED cap, which is normally the case for a newly issue cap on a new
directory.  That means we clear the just-set bit.  Move the check that sets
the flag to after the cap is added/updated.

Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
12 years agorbd: set blk_queue request sizes to object size
Josh Durgin [Fri, 22 Jul 2011 18:35:23 +0000 (11:35 -0700)]
rbd: set blk_queue request sizes to object size

This improves performance since more requests can be merged.

Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
12 years agoceph: set up readahead size when rsize is not passed
Yehuda Sadeh [Fri, 22 Jul 2011 18:12:28 +0000 (11:12 -0700)]
ceph: set up readahead size when rsize is not passed

This should improve the default read performance, as without it
readahead is practically disabled.

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
12 years agorbd: cancel watch request when releasing the device
Yehuda Sadeh [Tue, 12 Jul 2011 23:56:57 +0000 (16:56 -0700)]
rbd: cancel watch request when releasing the device

We were missing this cleanup, so when a device was released
the osd didn't clean up its watchers list, so following notifications
could be slow as osd needed to timeout on the client.

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
12 years agoceph: ignore lease mask
Sage Weil [Tue, 26 Jul 2011 18:28:25 +0000 (11:28 -0700)]
ceph: ignore lease mask

The lease mask is no longer used (and it changed a while back).  Instead,
use a non-zero duration to indicate that there is a lease being issued.

Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
12 years agoceph: fix ceph_lookup_open intent usage
Sage Weil [Tue, 26 Jul 2011 18:28:11 +0000 (11:28 -0700)]
ceph: fix ceph_lookup_open intent usage

We weren't properly calling lookup_instantiate_filp when setting up the
lookup intent, which could lead to file leakage on errors.  So:

 - use separate helper for the hidden snapdir translation, immediately
   following the mds request
 - use ceph_finish_lookup for the final dentry/return value dance in the
   exit path
 - lookup_instantiate_filp on success

Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
12 years agoceph: only link open operations to directory unsafe list if O_CREAT|O_TRUNC
Sage Weil [Tue, 26 Jul 2011 18:27:59 +0000 (11:27 -0700)]
ceph: only link open operations to directory unsafe list if O_CREAT|O_TRUNC

We only need to put these on the directory unsafe list if they have
side effects that fsync(2) should flush out.

Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
12 years agoceph: fix bad parent_inode calc in ceph_lookup_open
Sage Weil [Tue, 26 Jul 2011 18:27:48 +0000 (11:27 -0700)]
ceph: fix bad parent_inode calc in ceph_lookup_open

We were always getting NULL here because the intent file f_dentry is always
NULL at this point, which means we were always passing NULL to
ceph_mdsc_do_request.  In reality, this was fine, since this isn't
currently ever a write operation that needs to get strung on the dir's
unsafe list.

Use the dir explicitly, and only pass it if this open has side-effects that
a dir fsync should flush.

Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
12 years agoceph: avoid carrying Fw cap during write into page cache
Sage Weil [Tue, 26 Jul 2011 18:27:34 +0000 (11:27 -0700)]
ceph: avoid carrying Fw cap during write into page cache

The generic_file_aio_write call may block on balance_dirty_pages while we
flush data to the OSDs.  If we hold a reference to the FILE_WR cap during
that interval revocation by the MDS (e.g., to do a stat(2)) may be very
slow.

Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
12 years agolibceph: don't time out osd requests that haven't been received
Sage Weil [Tue, 26 Jul 2011 18:27:24 +0000 (11:27 -0700)]
libceph: don't time out osd requests that haven't been received

Keep track of when an outgoing message is ACKed (i.e., the server fully
received it and, presumably, queued it for processing).  Time out OSD
requests only if it's been too long since they've been received.

This prevents timeouts and connection thrashing when the OSDs are simply
busy and are throttling the requests they read off the network.

Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
12 years agoceph: report f_bfree based on kb_avail rather than diffing.
Greg Farnum [Tue, 26 Jul 2011 18:26:54 +0000 (11:26 -0700)]
ceph: report f_bfree based on kb_avail rather than diffing.

Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
12 years agoceph: only queue capsnap if caps are dirty
Sage Weil [Tue, 26 Jul 2011 18:26:41 +0000 (11:26 -0700)]
ceph: only queue capsnap if caps are dirty

We used to go into this branch if i_wrbuffer_ref_head was non-zero.  This
was an ancient check from before we were careful about dealing with all
kinds of caps (and not just dirty pages).  It is cleaner to only queue a
capsnap if there is an actual dirty cap.  If we are racing with...
something...we will end up here with ci->i_wrbuffer_refs but no dirty
caps.

Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
12 years agoceph: fix snap writeback when racing with writes
Sage Weil [Tue, 26 Jul 2011 18:26:31 +0000 (11:26 -0700)]
ceph: fix snap writeback when racing with writes

There are two problems that come up when we try to queue a capsnap while a
write is in progress:

 - The FILE_WR cap is held, but not yet dirty, so we may queue a capsnap
   with dirty == 0.  That will crash later in __ceph_flush_snaps().  Or
   on the FILE_WR cap if a write is in progress.
 - We may not have i_head_snapc set, which causes problems pretty quickly.
   Look to the snaprealm in this case.

Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
12 years agoceph: use flag bit for at_end readdir flag
Sage Weil [Tue, 26 Jul 2011 18:26:18 +0000 (11:26 -0700)]
ceph: use flag bit for at_end readdir flag

This saves us a word of memory per file.

Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
12 years agoceph: add F_SYNC file flag to force sync (non-O_DIRECT) io
Sage Weil [Tue, 26 Jul 2011 18:26:07 +0000 (11:26 -0700)]
ceph: add F_SYNC file flag to force sync (non-O_DIRECT) io

This allows us to force IO through the sync path which you normally only
get when multiple clients are reading/writing to the same file or by
mounting with -o sync.  Among other things, this lets test programs verify
correctness with a single mount.

Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
12 years agoceph: add flags field to file_info
Sage Weil [Tue, 26 Jul 2011 18:25:27 +0000 (11:25 -0700)]
ceph: add flags field to file_info

Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
12 years agoMerge branch 'x86-olpc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Tue, 26 Jul 2011 18:11:54 +0000 (11:11 -0700)]
Merge branch 'x86-olpc-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'x86-olpc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, olpc-xo15-sci: Enable EC wakeup capability
  x86, olpc: Fix dependency on POWER_SUPPLY
  x86, olpc: Add XO-1.5 SCI driver
  x86, olpc: Add XO-1 RTC driver
  x86, olpc-xo1-sci: Propagate power supply/battery events
  x86, olpc-xo1-sci: Add lid switch functionality
  x86, olpc-xo1-sci: Add GPE handler and ebook switch functionality
  x86, olpc: EC SCI wakeup mask functionality
  x86, olpc: Add XO-1 SCI driver and power button control
  x86, olpc: Add XO-1 suspend/resume support
  x86, olpc: Rename olpc-xo1 to olpc-xo1-pm
  x86, olpc: Move CS5536-related constants to cs5535.h
  x86, olpc: Add missing elements to device tree

12 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6
Linus Torvalds [Tue, 26 Jul 2011 18:11:28 +0000 (11:11 -0700)]
Merge git://git./linux/kernel/git/sfrench/cifs-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  cifs: Cleanup: check return codes of crypto api calls
  CIFS: Fix oops while mounting with prefixpath
  [CIFS] Redundant null check after dereference
  cifs: use cifs_dirent in cifs_save_resume_key
  cifs: use cifs_dirent to replace cifs_get_name_from_search_buf
  cifs: introduce cifs_dirent
  cifs: cleanup cifs_filldir

12 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt...
Linus Torvalds [Tue, 26 Jul 2011 18:10:56 +0000 (11:10 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/rostedt/linux-2.6-ktest

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-ktest:
  ktest: Fix bug when ADD_CONFIG is set but MIN_CONFIG is not
  ktest: Keep fonud configs separate from default configs
  ktest: Add prompt to use OUTPUT_MIN_CONFIG
  ktest: Use Kconfig dependencies to shorten time to make min_config
  ktest: Add test type make_min_config
  ktest: Require one TEST_START in config file
  ktest: Add helper function to avoid duplicate code
  ktest: Add IGNORE_WARNINGS to ignore warnings in some patches
  ktest: Fix tar extracting of modules to target
  ktest: Have the testing tmp dir include machine name
  ktest: Add POST/PRE_BUILD options
  ktest: Allow initrd processing without modules defined
  ktest: Have LOG_FILE evaluate options as well
  ktest: Have wait on stdio honor bug timeout
  ktest: Implement our own force min config
  ktest: Add TEST_NAME option
  ktest: Add CONFIG_BISECT_GOOD option
  ktest: Add detection of triple faults
  ktest: Notify reason to break out of monitoring boot

12 years agovfs: document locking requirements for d_move, __d_move and d_materialise_unique
Jeff Layton [Tue, 26 Jul 2011 17:33:16 +0000 (13:33 -0400)]
vfs: document locking requirements for d_move, __d_move and d_materialise_unique

Adding a comment to d_materialise_unique per Al's request...

d_move and __d_move have some pretty substantial locking requirements,
but they are not clearly documented. Add some comments spelling them
out. Also, document the requirement for the i_mutex of the parent in
d_materialise_unique.

Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
12 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/writeback
Linus Torvalds [Tue, 26 Jul 2011 17:39:54 +0000 (10:39 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/wfg/writeback

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/writeback: (27 commits)
  mm: properly reflect task dirty limits in dirty_exceeded logic
  writeback: don't busy retry writeback on new/freeing inodes
  writeback: scale IO chunk size up to half device bandwidth
  writeback: trace global_dirty_state
  writeback: introduce max-pause and pass-good dirty limits
  writeback: introduce smoothed global dirty limit
  writeback: consolidate variable names in balance_dirty_pages()
  writeback: show bdi write bandwidth in debugfs
  writeback: bdi write bandwidth estimation
  writeback: account per-bdi accumulated written pages
  writeback: make writeback_control.nr_to_write straight
  writeback: skip tmpfs early in balance_dirty_pages_ratelimited_nr()
  writeback: trace event writeback_queue_io
  writeback: trace event writeback_single_inode
  writeback: remove .nonblocking and .encountered_congestion
  writeback: remove writeback_control.more_io
  writeback: skip balance_dirty_pages() for in-memory fs
  writeback: add bdi_dirty_limit() kernel-doc
  writeback: avoid extra sync work at enqueue time
  writeback: elevate queue_io() into wb_writeback()
  ...

Fix up trivial conflicts in fs/fs-writeback.c and mm/filemap.c

12 years agoomfs: fix (mode & S_IFDIR) abuse
Al Viro [Tue, 26 Jul 2011 06:34:33 +0000 (02:34 -0400)]
omfs: fix (mode & S_IFDIR) abuse

granted, on a filesystem that has only regular files and directories
it happens to work, but really should be S_ISDIR(mode)...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
12 years agobtrfs: S_ISREG(mode) is not mode & S_IFREG...
Al Viro [Sun, 24 Jul 2011 21:08:40 +0000 (17:08 -0400)]
btrfs: S_ISREG(mode) is not mode & S_IFREG...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
12 years agoima: fmode_t misspelled as mode_t...
Al Viro [Tue, 26 Jul 2011 08:30:35 +0000 (04:30 -0400)]
ima: fmode_t misspelled as mode_t...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
12 years agoMerge branch 'for-linus' of git://git.kernel.dk/linux-block
Linus Torvalds [Tue, 26 Jul 2011 17:02:58 +0000 (10:02 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block

* 'for-linus' of git://git.kernel.dk/linux-block:
  block: fix warning with calling smp_processor_id() in preemptible section

12 years agopci-label.c: size_t misspelled as mode_t
Al Viro [Mon, 25 Jul 2011 03:31:06 +0000 (23:31 -0400)]
pci-label.c: size_t misspelled as mode_t

no, really, strlen() and snprintf() do not return mode_t values...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
12 years agojffs2: S_ISLNK(mode & S_IFMT) is pointless
Al Viro [Sun, 24 Jul 2011 21:11:33 +0000 (17:11 -0400)]
jffs2: S_ISLNK(mode & S_IFMT) is pointless

it's S_ISLNK(mode), TYVM...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
12 years agosnd_msnd ->mode is fmode_t, not mode_t
Al Viro [Tue, 26 Jul 2011 08:35:42 +0000 (04:35 -0400)]
snd_msnd ->mode is fmode_t, not mode_t

we put FMODE_... in there

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
12 years agov9fs_iop_get_acl: get rid of unused variable
Al Viro [Tue, 26 Jul 2011 16:57:42 +0000 (12:57 -0400)]
v9fs_iop_get_acl: get rid of unused variable

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
12 years agovfs: dont chain pipe/anon/socket on superblock s_inodes list
Eric Dumazet [Tue, 26 Jul 2011 09:36:34 +0000 (11:36 +0200)]
vfs: dont chain pipe/anon/socket on superblock s_inodes list

Workloads using pipes and sockets hit inode_sb_list_lock contention.

superblock s_inodes list is needed for quota, dirty, pagecache and
fsnotify management. pipe/anon/socket fs are clearly not candidates for
these.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
12 years agoDocumentation: Exporting: update description of d_splice_alias
Phillip Lougher [Tue, 26 Jul 2011 02:40:45 +0000 (03:40 +0100)]
Documentation: Exporting: update description of d_splice_alias

Following commits a904937 and 0c1aa9a update the d_splice_alias
desciption.

Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
12 years agofs: add missing unlock in default_llseek()
Dan Carpenter [Tue, 26 Jul 2011 14:25:20 +0000 (17:25 +0300)]
fs: add missing unlock in default_llseek()

A recent change in linux-next, 982d816581 "fs: add SEEK_HOLE and
SEEK_DATA flags" added some direct returns on error, but it should
have been a goto out.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
12 years agoMerge branch 'drm-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied...
Linus Torvalds [Tue, 26 Jul 2011 16:21:09 +0000 (09:21 -0700)]
Merge branch 'drm-core-next' of git://git./linux/kernel/git/airlied/drm-2.6

* 'drm-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (135 commits)
  drm/radeon/kms: fix DP training for DPEncoderService revision bigger than 1.1
  drm/radeon/kms: add missing vddci setting on NI+
  drm/radeon: Add a rmb() in IH processing
  drm/radeon: ATOM Endian fix for atombios_crtc_program_pll()
  drm/radeon: Fix the definition of RADEON_BUF_SWAP_32BIT
  drm/radeon: Do an MMIO read on interrupts when not uisng MSIs
  drm/radeon: Writeback endian fixes
  drm/radeon: Remove a bunch of useless _iomem casts
  drm/gem: add support for private objects
  DRM: clean up and document parsing of video= parameter
  DRM: Radeon: Fix section mismatch.
  drm: really make debug levels match in edid failure code
  drm/radeon/kms: fix i2c map for rv250/280
  drm/nouveau/gr: disable fifo access and idle before suspend ctx unload
  drm/nouveau: pass flag to engine fini() method on suspend
  drm/nouveau: replace nv04_graph_fifo_access() use with direct reg bashing
  drm/nv40/gr: rewrite/split context takedown functions
  drm/nouveau: detect disabled device in irq handler and return IRQ_NONE
  drm/nouveau: ignore connector type when deciding digital/analog on DVI-I
  drm/nouveau: Add a quirk for Gigabyte NX86T
  ...

12 years agoMerge branch 'fix/asoc' into for-linus
Takashi Iwai [Tue, 26 Jul 2011 15:47:05 +0000 (17:47 +0200)]
Merge branch 'fix/asoc' into for-linus

12 years agoALSA: hda - Cirrus Logic CS421x support
Tim Howe [Fri, 22 Jul 2011 21:41:00 +0000 (16:41 -0500)]
ALSA: hda - Cirrus Logic CS421x support

This update includes the changes necessary for supporting the
CS421x family of codecs.  Previously this file only supported
the CS420x family of codecs.

This file also contains init verbs to correct several issues in
the CS421x hardware.

Behavior between the CS421x and CS420x codec families is similar,
so several functions have been reused with "if" statements to
determine which codec family (CS421x or CS420x) is present.

Also, this file will be updated sometime in the near future in
order to add support for a system using CS421x that requires
mono mix on the speaker output only.

[Fix const usages and adaption for new APIs by tiwai]

Signed-off-by: Tim Howe <tim.howe@cirrus.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
12 years agoALSA: Make pcm.h self-contained
Takashi Iwai [Tue, 26 Jul 2011 10:02:56 +0000 (12:02 +0200)]
ALSA: Make pcm.h self-contained

Move the macros depending on snd_mask_min() and co out of pcm.h into
pcm_params.h.  Otherwise using some params_*() macros will give comiple
errors without inclusion of pcm_params.h.

Also use hw_param_interval_c() and hw_param_mask_c() for const pointer.

Reported-by: Tim Blechmann <tim@klingt.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
12 years agoALSA: hda - Allow codec-specific set_power_state ops
Takashi Iwai [Tue, 26 Jul 2011 08:33:10 +0000 (10:33 +0200)]
ALSA: hda - Allow codec-specific set_power_state ops

The procedure for codec D-state change may have exceptional cases
depending on the codec chip, such as a longer delay or suppressing D3.

This patch adds a new codec ops, set_power_state() to override the system
default function.  For ease of porting, snd_hda_codec_set_power_to_all()
helper function is extracted from the default set_power_state() function.

As an example, the Conexant codec-specific delay is removed from the
default routine but moved to patch_conexant.c.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
12 years agoALSA: hda - Add post_suspend patch ops
Takashi Iwai [Tue, 26 Jul 2011 08:19:20 +0000 (10:19 +0200)]
ALSA: hda - Add post_suspend patch ops

Add a new ops, post_suspend(), which is called after suspend() ops is
performed.  This is called only in the case of the real PM suspend, and
the codec driver can use this for further changing of D-state or
clearing the LED, etc.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
12 years agoALSA: hda - Make CONFIG_SND_HDA_POWER_SAVE depending on CONFIG_PM
Takashi Iwai [Tue, 26 Jul 2011 07:52:50 +0000 (09:52 +0200)]
ALSA: hda - Make CONFIG_SND_HDA_POWER_SAVE depending on CONFIG_PM

It makes little sense to enable power-saving without PM.
This removes SND_HDA_NEEDS_RESUME define so that we can use CONFIG_PM
in all places.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
12 years agoblock: fix warning with calling smp_processor_id() in preemptible section
Jens Axboe [Tue, 26 Jul 2011 13:01:15 +0000 (15:01 +0200)]
block: fix warning with calling smp_processor_id() in preemptible section

After commit 5757a6d7 introduced an unsafe calling of
smp_processor_id(), with preempt debuggin turned on we spew a lot of:

BUG: using smp_processor_id() in preemptible [00000000] code: kjournald/514
caller is __make_request+0x1b8/0x308
[<c0019f44>] (unwind_backtrace+0x0/0xe8) from [<c024b4cc>] (debug_smp_processor_id+0xbc/0xf0)
[<c024b4cc>] (debug_smp_processor_id+0xbc/0xf0) from [<c0223d14>] (__make_request+0x1b8/0x308)
[<c0223d14>] (__make_request+0x1b8/0x308) from [<c02215ac>] (generic_make_request+0x4dc/0x558)
[<c02215ac>] (generic_make_request+0x4dc/0x558) from [<c022173c>] (submit_bio+0x114/0x138)
[<c022173c>] (submit_bio+0x114/0x138) from [<c011f504>] (submit_bh+0x148/0x16c)
[<c011f504>] (submit_bh+0x148/0x16c) from [<c0121ed8>] (__sync_dirty_buffer+0x88/0xd8)
[<c0121ed8>] (__sync_dirty_buffer+0x88/0xd8) from [<c01aff78>] (journal_commit_transaction+0x1198/0x1688)
[<c01aff78>] (journal_commit_transaction+0x1198/0x1688) from [<c01b4034>] (kjournald+0xb4/0x224)
[<c01b4034>] (kjournald+0xb4/0x224) from [<c0069ea0>] (kthread+0x8c/0x94)
[<c0069ea0>] (kthread+0x8c/0x94) from [<c00137f8>] (kernel_thread_exit+0x0/0x8)

Fix this by just using raw_smp_processor_id(), it's just a hint
after all. There's no pinning of the CPU or accessing per-cpu
structures involved.

Reported-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
12 years agodrm/radeon/kms: fix DP training for DPEncoderService revision bigger than 1.1
Jerome Glisse [Mon, 25 Jul 2011 15:57:43 +0000 (11:57 -0400)]
drm/radeon/kms: fix DP training for DPEncoderService revision bigger than 1.1

DPEncoderService newer than 1.1 can't properly program the DP (display port)
link training. When facing such version use the DIGxEncoderControl method
instead. Fix DP link training on some R7XX.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
12 years agodrm/radeon/kms: add missing vddci setting on NI+
Alex Deucher [Mon, 25 Jul 2011 22:50:08 +0000 (18:50 -0400)]
drm/radeon/kms: add missing vddci setting on NI+

Need to add vddci setting to pm init as well as
resume.  Fixes hangs on load on some boards.

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=38754

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
12 years agotarget: Convert to DIV_ROUND_UP_SECTOR_T usage for sectors / dev_max_sectors
Nicholas Bellinger [Tue, 26 Jul 2011 07:38:40 +0000 (00:38 -0700)]
target: Convert to DIV_ROUND_UP_SECTOR_T usage for sectors / dev_max_sectors

This patch adds the new macro usage of include/linux/kernel.h:DIV_ROUND_UP_SECTOR_T
for the new DIV_ROUND_UP_ULL() usage for 32-bit architectures with unsigned long long
sector_t division in transport_allocate_data_tasks() usage for target_core_mod v4.1

Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Nicholas A. Bellinger <nab@linux-iscsi.org>
12 years agoRevert "microblaze: PCI fix typo fault in of_node pointer moving into pci_bus"
Michal Simek [Tue, 26 Jul 2011 09:24:56 +0000 (11:24 +0200)]
Revert "microblaze: PCI fix typo fault in of_node pointer moving into pci_bus"

This reverts commit c9d761b7c4b658a937a941aea2781f511a0ff3ec.

Ben' commit "microblaze/pci: Move the remains of pci_32.c to pci-common.c"
(sha1: bf13a6fa09b8db7f1fd59b5e2ed3674a89a6a25c)
completely removed pci_32.c that's why my fixing commit caused
the problem with merging and need to be revert.

Signed-off-by: Michal Simek <monstr@monstr.eu>
12 years agoGFS2: Fix mount hang caused by certain access pattern to sysfs files
Steven Whitehouse [Tue, 26 Jul 2011 08:15:45 +0000 (09:15 +0100)]
GFS2: Fix mount hang caused by certain access pattern to sysfs files

Depending upon the order of userspace/kernel during the
mount process, this can result in a hang without the
_all version of the completion.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
12 years agokernel.h: Add DIV_ROUND_UP_ULL and DIV_ROUND_UP_SECTOR_T macro usage
Nicholas Bellinger [Tue, 26 Jul 2011 07:35:26 +0000 (00:35 -0700)]
kernel.h: Add DIV_ROUND_UP_ULL and DIV_ROUND_UP_SECTOR_T macro usage

Add new DIV_ROUND_UP_SECTOR_T macro usage for 32-bit architectures requiring
a new DIV_ROUND_UP_ULL, and existing 64-bit usage with DIV_ROUND_UP.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Nicholas A. Bellinger <nab@linux-iscsi.org>
12 years agoiscsi-target: Add iSCSI fabric support for target v4.1
Nicholas Bellinger [Sat, 23 Jul 2011 06:43:04 +0000 (06:43 +0000)]
iscsi-target: Add iSCSI fabric support for target v4.1

The Linux-iSCSI.org target module is a full featured in-kernel
software implementation of iSCSI target mode (RFC-3720) for the
current WIP mainline target v4.1 infrastructure code for the v3.1
kernel.  More information can be found here:

http://linux-iscsi.org/wiki/ISCSI

This includes support for:

   * RFC-3720 defined request / response state machines and support for
     all defined iSCSI operation codes from Section 10.2.1.2 using libiscsi
     include/scsi/iscsi_proto.h PDU definitions
   * Target v4.1 compatible control plane using the generic layout in
     target_core_fabric_configfs.c and fabric dependent attributes
     within /sys/kernel/config/target/iscsi/ subdirectories.
   * Target v4.1 compatible iSCSI statistics based on RFC-4544 (iSCSI MIBS)
   * Support for IPv6 and IPv4 network portals in M:N mapping to TPGs
   * iSCSI Error Recovery Hierarchy support
   * Per iSCSI connection RX/TX thread pair scheduling affinity
   * crc32c + crc32c_intel SSEv4 instruction offload support using libcrypto
   * CHAP Authentication support using libcrypto
   * Conversion to use internal SGl allocation with iscsit_alloc_buffs() ->
     transport_generic_map_mem_to_cmd()

(nab: Fix iscsi_proto.h struct scsi_lun usage from linux-next in commit:
      iscsi: Use struct scsi_lun in iscsi structs instead of u8[8])
(nab: Fix 32-bit compile warnings)

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andy Grover <agrover@redhat.com>
Acked-by: Roland Dreier <roland@kernel.org>
Signed-off-by: Nicholas A. Bellinger <nab@linux-iscsi.org>
12 years agoALSA: hda - Make sure mute led reflects master mute state
Vitaliy Kulikov [Mon, 25 Jul 2011 22:52:57 +0000 (17:52 -0500)]
ALSA: hda - Make sure mute led reflects master mute state

This patch adds checking of mute state on all outputs besides just
speakers to calculate the master mute state for mute led support.
It also renames and splits the function that does it for better code
clarity.

Signed-off-by: Vitaliy Kulikov <Vitaliy.Kulikov@idt.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
12 years agoALSA: hda - Fix invalid mute led state on resume of IDT codecs
Vitaliy Kulikov [Fri, 22 Jul 2011 23:18:15 +0000 (18:18 -0500)]
ALSA: hda - Fix invalid mute led state on resume of IDT codecs

Codec state is not restored immediately on resume but on the first
access when power-save is enabled.  That leads to an invalid mute led
state after resume until either sound is played or some control is
changed.  This patch adds a possibility for a vendor specific patch to
restore codec state immediately after resume if required.  And it adds
code to restore IDT codecs state immediately on resume on HP systems
with mute led support.

Signed-off-by: Vitaliy Kulikov <Vitaliy.Kulikov@idt.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
12 years agop9: avoid unused variable warning
Linus Torvalds [Tue, 26 Jul 2011 06:43:53 +0000 (23:43 -0700)]
p9: avoid unused variable warning

Commit 4e34e719e457 ("fs: take the ACL checks to common code") removed
the use of the 'acl' variable in v9fs_iop_get_acl(), but left the
variable definition around.  Remove it to get rid of the warning:

  fs/9p/acl.c: In function ‘v9fs_iop_get_acl’:
  fs/9p/acl.c:101:20: warning: unused variable ‘acl’

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoMerge branch 'staging-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Tue, 26 Jul 2011 06:26:34 +0000 (23:26 -0700)]
Merge branch 'staging-next' of git://git./linux/kernel/git/gregkh/staging-2.6

* 'staging-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6: (741 commits)
  staging:iio:meter:ade7753 should be 16 bit read not 8 bit for mode register.
  staging:iio:kfifo_buf fix double initialization of the ring device structure.
  staging:iio:accel:lis3l02dq: fix incorrect pointer passed to spi_set_drvdata.
  staging:iio:imu fix missing register table index for some channels
  spectra: enable device before poking it
  staging: rts_pstor: Fix a miswriting
  staging/lirc_bt829: Return -ENODEV when no hardware is found.
  staging/lirc_parallel: remove pointless prototypes.
  staging/lirc_parallel: fix panic on rmmod
  staging:iio:adc:ad7476: Incorrect pointer into spi_set_drvdata.
  Staging: zram: Fix kunmapping order
  Revert "gma500: Fix dependencies"
  gma500: Add medfield header
  gma500: wire up the mrst i2c bus from chip_info
  gma500: Fix DPU build
  gma500: Clean up the DPU config and make it runtime
  gma500: resync with Medfield progress
  gma500: Use the mrst helpers and power control for mode commit
  gma500@ Fix backlight range error
  gma500: More Moorestown muddle meddling means MM maybe might modeset
  ...

Fix up fairly trivial conflicts all over, mostly due to header file
cleanup conflicts, but some deleted files and some just context changes:
 - Documentation/feature-removal-schedule.txt
 - drivers/staging/bcm/headers.h
 - drivers/staging/brcm80211/brcmfmac/dhd_linux.c
 - drivers/staging/brcm80211/brcmfmac/dhd_sdio.c
 - drivers/staging/brcm80211/brcmfmac/wl_cfg80211.h
 - drivers/staging/brcm80211/brcmfmac/wl_iw.c
 - drivers/staging/et131x/et131x_netdev.c
 - drivers/staging/rtl8187se/ieee80211/ieee80211_softmac.c
 - drivers/staging/rtl8192e/r8192E.h
 - drivers/staging/usbip/userspace/src/utils.h

12 years agoMerge branch 'tty-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6
Linus Torvalds [Tue, 26 Jul 2011 06:09:27 +0000 (23:09 -0700)]
Merge branch 'tty-next' of git://git./linux/kernel/git/gregkh/tty-2.6

* 'tty-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6: (26 commits)
  amba pl011: workaround for uart registers lockup
  n_gsm: fix the wrong FCS handling
  pch_uart: add missing comment about OKI ML7223
  pch_uart: Add MSI support
  tty: fix "IRQ45: nobody cared"
  PTI feature to allow user to name and mark masterchannel request.
  0 for o PTI Makefile bug.
  tty: serial: samsung.c remove legacy PM code.
  SERIAL: SC26xx: Fix link error.
  serial: mrst_max3110: initialize waitqueue earlier
  mrst_max3110: Change max missing message priority.
  tty: s5pv210: Add delay loop on fifo reset function for UART
  tty/serial: Fix XSCALE serial ports, e.g. ce4100
  serial: bfin_5xx: fix off-by-one with resource size
  drivers/tty: use printk_ratelimited() instead of printk_ratelimit()
  tty: n_gsm: Added refcount usage to gsm_mux and gsm_dlci structs
  tty: n_gsm: Add raw-ip support
  tty: n_gsm: expose gsmtty device nodes at ldisc open time
  pch_phub: Fix register miss-setting issue
  serial: 8250, increase PASS_LIMIT
  ...

12 years agoMerge branch 'usb-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6
Linus Torvalds [Tue, 26 Jul 2011 06:08:32 +0000 (23:08 -0700)]
Merge branch 'usb-next' of git://git./linux/kernel/git/gregkh/usb-2.6

* 'usb-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: (115 commits)
  EHCI: fix direction handling for interrupt data toggles
  USB: serial: add IDs for WinChipHead USB->RS232 adapter
  USB: OHCI: fix another regression for NVIDIA controllers
  usb: gadget: m66592-udc: add pullup function
  usb: gadget: m66592-udc: add function for external controller
  usb: gadget: r8a66597-udc: add pullup function
  usb: renesas_usbhs: support multi driver
  usb: renesas_usbhs: inaccessible pipe is not an error
  usb: renesas_usbhs: care buff alignment when dma handler
  USB: PL2303: correctly handle baudrates above 115200
  usb: r8a66597-hcd: fixup USB_PORT_STAT_C_SUSPEND shift
  usb: renesas_usbhs: compile/config are rescued
  usb: renesas_usbhs: fixup comment-out
  usb: update email address in ohci-sh and r8a66597-hcd
  usb: r8a66597-hcd: add function for external controller
  EHCI: only power off port if over-current is active
  USB: mon: Allow to use usbmon without debugfs
  USB: EHCI: go back to using the system clock for QH unlinks
  ehci: add pci quirk for Ordissimo and RM Slate 100 too
  ehci: refactor pci quirk to use standard dmi_check_system method
  ...

Fix up trivial conflicts in Documentation/feature-removal-schedule.txt

12 years agoMerge branch 'driver-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Tue, 26 Jul 2011 06:06:24 +0000 (23:06 -0700)]
Merge branch 'driver-core-next' of git://git./linux/kernel/git/gregkh/driver-core-2.6

* 'driver-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6:
  updated Documentation/ja_JP/SubmittingPatches
  debugfs: add documentation for debugfs_create_x64
  uio: uio_pdrv_genirq: Add OF support
  firmware: gsmi: remove sysfs entries when unload the module
  Documentation/zh_CN: Fix messy code file email-clients.txt
  driver core: add more help description for "path to uevent helper"
  driver-core: modify FIRMWARE_IN_KERNEL help message
  driver-core: Kconfig grammar corrections in firmware configuration
  DOCUMENTATION: Replace create_device() with device_create().
  DOCUMENTATION: Update overview.txt in Doc/driver-model.
  pti: pti_tty_install documentation mispelling.

12 years agoMerge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Linus Torvalds [Tue, 26 Jul 2011 05:59:39 +0000 (22:59 -0700)]
Merge branch 'next' of git://git./linux/kernel/git/benh/powerpc

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (99 commits)
  drivers/virt: add missing linux/interrupt.h to fsl_hypervisor.c
  powerpc/85xx: fix mpic configuration in CAMP mode
  powerpc: Copy back TIF flags on return from softirq stack
  powerpc/64: Make server perfmon only built on ppc64 server devices
  powerpc/pseries: Fix hvc_vio.c build due to recent changes
  powerpc: Exporting boot_cpuid_phys
  powerpc: Add CFAR to oops output
  hvc_console: Add kdb support
  powerpc/pseries: Fix hvterm_raw_get_chars to accept < 16 chars, fixing xmon
  powerpc/irq: Quieten irq mapping printks
  powerpc: Enable lockup and hung task detectors in pseries and ppc64 defeconfigs
  powerpc: Add mpt2sas driver to pseries and ppc64 defconfig
  powerpc: Disable IRQs off tracer in ppc64 defconfig
  powerpc: Sync pseries and ppc64 defconfigs
  powerpc/pseries/hvconsole: Fix dropped console output
  hvc_console: Improve tty/console put_chars handling
  powerpc/kdump: Fix timeout in crash_kexec_wait_realmode
  powerpc/mm: Fix output of total_ram.
  powerpc/cpufreq: Add cpufreq driver for Momentum Maple boards
  powerpc: Correct annotations of pmu registration functions
  ...

Fix up trivial Kconfig/Makefile conflicts in arch/powerpc, drivers, and
drivers/cpufreq

12 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu
Linus Torvalds [Tue, 26 Jul 2011 05:50:54 +0000 (22:50 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/gerg/m68knommu

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
  m68k: Revive reporting of spurious interrupts
  m68knommu: Move forward declaration of do_IRQ() from machdep.h to irq.h
  m68k: fix some atomic operation asm address modes for ColdFire
  m68k: use CPU_HAS_NO_BITFIELDS for signal functions
  m68k: merge and clean up delay.h files
  m68knommu: correctly use trap_init
  m68knommu: merge ColdFire 5206 and 5206e platform code
  m68k: merge mmu and non-mmu bitops.h
  m68k: merge MMU and non MMU versions of system.h
  m68k: merge MMU and non-MMU versions of asm/hardirq.h
  m68k: merge the non-mmu and mmu versions of module.c
  m68knommu: Fix printk() format in free_initrd_mem()
  m68knommu: Make empty_zero_page "void *", like on m68k

12 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-linus
Linus Torvalds [Tue, 26 Jul 2011 05:50:35 +0000 (22:50 -0700)]
Merge git://git./linux/kernel/git/pkl/squashfs-linus

* git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-linus:
  Squashfs: Make ZLIB compression support optional
  Squashfs: Update documentation for XZ and add squashfs-tools devel tree

12 years agoMerge branch 'for-3.1' of git://linux-nfs.org/~bfields/linux
Linus Torvalds [Tue, 26 Jul 2011 05:49:19 +0000 (22:49 -0700)]
Merge branch 'for-3.1' of git://linux-nfs.org/~bfields/linux

* 'for-3.1' of git://linux-nfs.org/~bfields/linux:
  nfsd: don't break lease on CLAIM_DELEGATE_CUR
  locks: rename lock-manager ops
  nfsd4: update nfsv4.1 implementation notes
  nfsd: turn on reply cache for NFSv4
  nfsd4: call nfsd4_release_compoundargs from pc_release
  nfsd41: Deny new lock before RECLAIM_COMPLETE done
  fs: locks: remove init_once
  nfsd41: check the size of request
  nfsd41: error out when client sets maxreq_sz or maxresp_sz too small
  nfsd4: fix file leak on open_downgrade
  nfsd4: remember to put RW access on stateid destruction
  NFSD: Added TEST_STATEID operation
  NFSD: added FREE_STATEID operation
  svcrpc: fix list-corrupting race on nfsd shutdown
  rpc: allow autoloading of gss mechanisms
  svcauth_unix.c: quiet sparse noise
  svcsock.c: include sunrpc.h to quiet sparse noise
  nfsd: Remove deprecated nfsctl system call and related code.
  NFSD: allow OP_DESTROY_CLIENTID to be only op in COMPOUND

Fix up trivial conflicts in Documentation/feature-removal-schedule.txt

12 years agoMIPS: Close races in TLB modify handlers.
David Daney [Tue, 5 Jul 2011 23:34:46 +0000 (16:34 -0700)]
MIPS: Close races in TLB modify handlers.

Page table entries are made invalid by writing a zero into the the PTE
slot in a page table.  This creates a race condition with the TLB
modify handlers when they are updating the PTE.

CPU0                              CPU1

Test for _PAGE_PRESENT
.                                 set to not _PAGE_PRESENT (zero)
Set to _PAGE_VALID

So now the page not present value (zero) is suddenly valid and user
space programs have access to physical page zero.

We close the race by putting the test for _PAGE_PRESENT and setting of
_PAGE_VALID into an atomic LL/SC section.  This requires more registers
than just K0 and K1 in the handlers, so we need to save some registers
to a save area and then restore them when we are done.

The save area is an array of cacheline aligned structures that should
not suffer cache line bouncing as they are CPU private.

[ralf@linux-mips.org: Fix !defined(CONFIG_MIPS_PGD_C0_CONTEXT) build error.]

Signed-off-by: David Daney <david.daney@cavium.com>
To: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2577/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
12 years agoMIPS: Add uasm UASM_i_SRL_SAFE macro.
David Daney [Tue, 5 Jul 2011 23:34:45 +0000 (16:34 -0700)]
MIPS: Add uasm UASM_i_SRL_SAFE macro.

This can be used from either 32-bit or 64-bit code to generate logical
right shifts of any constant amount.

Signed-off-by: David Daney <david.daney@cavium.com>
To: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2576/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
12 years agovfs: fix check_acl compile error when CONFIG_FS_POSIX_ACL is not set
Linus Torvalds [Tue, 26 Jul 2011 05:47:03 +0000 (22:47 -0700)]
vfs: fix check_acl compile error when CONFIG_FS_POSIX_ACL is  not set

Commit e77819e57f08 ("vfs: move ACL cache lookup into generic code")
didn't take the FS_POSIX_ACL config variable into account - when that is
not set, ACL's go away, and the cache helper functions do not exist,
causing compile errors like

  fs/namei.c: In function 'check_acl':
  fs/namei.c:191:10: error: implicit declaration of function 'negative_cached_acl'
  fs/namei.c:196:2: error: implicit declaration of function 'get_cached_acl'
  fs/namei.c:196:6: warning: assignment makes pointer from integer without a cast
  fs/namei.c:212:11: error: implicit declaration of function 'set_cached_acl'

Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoMerge 'akpm' patch series
Linus Torvalds [Tue, 26 Jul 2011 04:00:19 +0000 (21:00 -0700)]
Merge 'akpm' patch series

* Merge akpm patch series: (122 commits)
  drivers/connector/cn_proc.c: remove unused local
  Documentation/SubmitChecklist: add RCU debug config options
  reiserfs: use hweight_long()
  reiserfs: use proper little-endian bitops
  pnpacpi: register disabled resources
  drivers/rtc/rtc-tegra.c: properly initialize spinlock
  drivers/rtc/rtc-twl.c: check return value of twl_rtc_write_u8() in twl_rtc_set_time()
  drivers/rtc: add support for Qualcomm PMIC8xxx RTC
  drivers/rtc/rtc-s3c.c: support clock gating
  drivers/rtc/rtc-mpc5121.c: add support for RTC on MPC5200
  init: skip calibration delay if previously done
  misc/eeprom: add eeprom access driver for digsy_mtc board
  misc/eeprom: add driver for microwire 93xx46 EEPROMs
  checkpatch.pl: update $logFunctions
  checkpatch: make utf-8 test --strict
  checkpatch.pl: add ability to ignore various messages
  checkpatch: add a "prefer __aligned" check
  checkpatch: validate signature styles and To: and Cc: lines
  checkpatch: add __rcu as a sparse modifier
  checkpatch: suggest using min_t or max_t
  ...

Did this as a merge because of (trivial) conflicts in
 - Documentation/feature-removal-schedule.txt
 - arch/xtensa/include/asm/uaccess.h
that were just easier to fix up in the merge than in the patch series.

12 years agodrivers/connector/cn_proc.c: remove unused local
Andrew Morton [Tue, 26 Jul 2011 00:13:39 +0000 (17:13 -0700)]
drivers/connector/cn_proc.c: remove unused local

Fix the warning

  drivers/connector/cn_proc.c: In function 'proc_ptrace_connector':
  drivers/connector/cn_proc.c:176: warning: unused variable 'tracer'

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoDocumentation/SubmitChecklist: add RCU debug config options
Paul E. McKenney [Tue, 26 Jul 2011 00:13:38 +0000 (17:13 -0700)]
Documentation/SubmitChecklist: add RCU debug config options

There have been persistent lockdep RCU splats, indicating that submitters
are not testing with CONFIG_PROVE_RCU.  Add this config option to the list
in Documentation/SubmitChecklist.  Also add CONFIG_DEBUG_OBJECTS_RCU_HEAD
for good measure.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoreiserfs: use hweight_long()
Akinobu Mita [Tue, 26 Jul 2011 00:13:38 +0000 (17:13 -0700)]
reiserfs: use hweight_long()

Use hweight_long() to count free bits in the bitmap.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoreiserfs: use proper little-endian bitops
Akinobu Mita [Tue, 26 Jul 2011 00:13:37 +0000 (17:13 -0700)]
reiserfs: use proper little-endian bitops

Using __test_and_{set,clear}_bit_le() with ignoring its return value can
be replaced with __{set,clear}_bit_le().

This introduces reiserfs_{set,clear}_le_bit for __{set,clear}_bit_le and
does the above change with them.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agopnpacpi: register disabled resources
Witold Szczeponik [Tue, 26 Jul 2011 00:13:36 +0000 (17:13 -0700)]
pnpacpi: register disabled resources

When parsing PnP ACPI resource structures, it may happen that some of
the resources are disabled (in which case "the size" of the resource
equals zero).

The current solution is to skip these resources completely - with the
unfortunate side effect that they are not registered despite the fact
that they exist, after all.  (The downside of this approach is that
these resources cannot be used as templates for setting the actual
device's resources because they are missing from the template.) The
kernel's APM implementation does not suffer from this problem and
registers all resources regardless of "their size".

This patch fixes a problem with (at least) the vintage IBM ThinkPad 600E
(and most likely also with the 600, 600X, and 770X which have a very
similar layout) where some of its PnP devices support options where
either an IRQ, a DMA, or an IO port is disabled.  Without this patch,
the devices can not be configured using the
"/sys/bus/pnp/devices/*/resources" interface.

The manipulation of these resources is important because the 600E has
very demanding requirements.  For instance, the number of IRQs is not
sufficient to support all devices of the 600E.  Fortunately, some of the
devices, like the sound card's MPU-401 UART, can be configured to not
use any IRQ, hence freeing an IRQ for a device that requires one.
(Still, the device's "ResourceTemplate" requires an IRQ resource
descriptor which cannot be created if the resource has not been
registered in the first place.)

As an example, the dependent sets of the 600E's CSC0103 device (the
MPU-401 UART) are listed, with the patch applied, as:

  Dependent: 00 - Priority preferred
    port 0x300-0x330, align 0xf, size 0x4, 16-bit address decoding
    irq <none> High-Edge
  Dependent: 01 - Priority acceptable
    port 0x300-0x330, align 0xf, size 0x4, 16-bit address decoding
    irq 5,7,2/9,10,11,15 High-Edge

(The same result is obtained when PNPBIOS is used instead of PnP ACPI.)
Without the patch, the IRQ resource in the preferred option is not
listed at all:

  Dependent: 00 - Priority preferred
    port 0x300-0x330, align 0xf, size 0x4, 16-bit address decoding
  Dependent: 01 - Priority acceptable
    port 0x300-0x330, align 0xf, size 0x4, 16-bit address decoding
    irq 5,7,2/9,10,11,15 High-Edge

And in fact, the 600E's DSDT lists the disabled IRQ as an option, as can
be seen from the following excerpt from the DSDT:

Name (_PRS, ResourceTemplate ()
{
        StartDependentFn (0x00, 0x00)
        {
            IO (Decode16, 0x0300, 0x0330, 0x10, 0x04)
            IRQNoFlags () {}
        }
        StartDependentFn (0x01, 0x00)
        {
            IO (Decode16, 0x0300, 0x0330, 0x10, 0x04)
            IRQNoFlags () {5,7,9,10,11,15}
        }
        EndDependentFn ()
})

With this patch applied, a user space program - or maybe even the kernel
- can allocate all devices' resources optimally.  For the 600E, this
means to find optimal resources for (at least) the serial port, the
parallel port, the infrared port, the MWAVE modem, the sound card, and
the MPU-401 UART.

The patch applies the idea to register disabled resources to all types
of resources, not just to IRQs, DMAs, and IO ports.  At the same time,
it mimics the behavior of the "pnp_assign_xxx" functions from
"drivers/pnp/manager.c" where resources with "no size" are considered
disabled.

No regressions were observed on hardware that does not require this
patch.

The patch is applied against 2.6.39.

NB: The kernel's current PnP interface does not allow for disabling individual
resources using the "/sys/bus/pnp/devices/$device/resources" file.  Assuming
this could be done, a device could be configured to use a disabled resource
using a simple series of calls:

  echo disable > /sys/bus/pnp/devices/$device/resources
  echo clear > /sys/bus/pnp/devices/$device/resources
  echo set irq disabled > /sys/bus/pnp/devices/$device/resources
  echo fill > /sys/bus/pnp/devices/$device/resources
  echo activate > /sys/bus/pnp/devices/$device/resources

This patch addresses only the parsing of PnP ACPI devices.

ChangeLog (v1 -> v2):
 - extend patch description
 - fix typo in patch itself

Signed-off-by: Witold Szczeponik <Witold.Szczeponik@gmx.net>
Cc: Len Brown <lenb@kernel.org>
Cc: Adam Belay <abelay@mit.edu>
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Cc: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>