pandora-kernel.git
8 years agoxhci: Fix spurious wakeups after S5 on Haswell
Takashi Iwai [Thu, 12 Sep 2013 06:11:06 +0000 (08:11 +0200)]
xhci: Fix spurious wakeups after S5 on Haswell

commit 638298dc66ea36623dbc2757a24fc2c4ab41b016 upstream.

Haswell LynxPoint and LynxPoint-LP with the recent Intel BIOS show
mysterious wakeups after shutdown occasionally.  After discussing with
BIOS engineers, they explained that the new BIOS expects that the
wakeup sources are cleared and set to D3 for all wakeup devices when
the system is going to sleep or power off, but the current xhci driver
doesn't do this properly (partly intentionally).

This patch introduces a new quirk, XHCI_SPURIOUS_WAKEUP, for
fixing the spurious wakeups at S5 by calling xhci_reset() in the xhci
shutdown ops as done in xhci_stop(), and setting the device to PCI D3
at shutdown and remove ops.

The PCI D3 call is based on the initial fix patch by Oliver Neukum.

[Note: Sarah changed the quirk name from XHCI_HSW_SPURIOUS_WAKEUP to
XHCI_SPURIOUS_WAKEUP, since none of the other quirks have system names
in them.  Sarah also fixed a collision with a quirk submitted around the
same time, by changing the xhci->quirks bit from 17 to 18.]

This patch should be backported to kernels as old as 3.0, that
contain the commit 1c12443ab8eba71a658fae4572147e56d1f84f66 "xhci: Add
Lynx Point to list of Intel switchable hosts."

Cc: Oliver Neukum <oneukum@suse.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoxhci: quirk for extra long delay for S4
Oliver Neukum [Mon, 30 Sep 2013 13:50:54 +0000 (15:50 +0200)]
xhci: quirk for extra long delay for S4

commit 455f58925247e8a1a1941e159f3636ad6ee4c90b upstream.

It has been reported that this chipset really cannot
sleep without this extraordinary delay.

This patch should be backported, in order to ensure this host functions
under stable kernels.  The last quirk for Fresco Logic hosts (commit
bba18e33f25072ebf70fd8f7f0cdbf8cdb59a746 "xhci: Extend Fresco Logic MSI
quirk.") was backported to stable kernels as old as 2.6.36.

Signed-off-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
[bwh: Backported to 3.2:
 - Adjust context
 - Use xhci_dbg() instead of xhci_dbg_trace()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoxhci: Don't enable/disable RWE on bus suspend/resume.
Sarah Sharp [Mon, 5 Aug 2013 20:36:00 +0000 (13:36 -0700)]
xhci: Don't enable/disable RWE on bus suspend/resume.

commit f217c980ca980e3a645b7485ea5eae9a747f4945 upstream.

The RWE bit of the USB 2.0 PORTPMSC register is supposed to enable
remote wakeup for devices in the lower power link state L1.  It has
nothing to do with the device suspend remote wakeup from L2.  The RWE
bit is designed to be set once (when USB 2.0 LPM is enabled for the
port) and cleared only when USB 2.0 LPM is disabled for the port.

The xHCI bus suspend method was setting the RWE bit erroneously, and the
bus resume method was clearing it.  The xHCI 1.0 specification with
errata up to Aug 12, 2012 says in section 4.23.5.1.1.1 "Hardware
Controlled LPM":

"While Hardware USB2 LPM is enabled, software shall not modify the
HIRDBESL or RWE fields of the USB2 PORTPMSC register..."

If we have previously enabled USB 2.0 LPM for a device, that means when
the USB 2.0 bus is resumed, we violate the xHCI specification by
clearing RWE.  It also means that after a bus resume, the host would
think remote wakeup is disabled from L1 for ports with USB 2.0 Link PM
enabled, which is not what we want.

This patch should be backported to kernels as old as 3.2, that
contain the commit 65580b4321eb36f16ae8b5987bfa1bb948fc5112 "xHCI: set
USB2 hardware LPM".  That was the first kernel that supported USB 2.0
Link PM.

Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
[bwh: Backported to 3.2: deleted code was cosmetically different]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agodrm/radeon: fix hw contexts for SUMO2 asics
wojciech kapuscinski [Tue, 1 Oct 2013 23:54:33 +0000 (19:54 -0400)]
drm/radeon: fix hw contexts for SUMO2 asics

commit 50b8f5aec04ebec7dbdf2adb17220b9148c99e63 upstream.

They have 4 rather than 8.

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=63599

Signed-off-by: wojciech kapuscinski <wojtask9@wp.pl>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agohwmon: (applesmc) Always read until end of data
Henrik Rydberg [Wed, 2 Oct 2013 17:15:03 +0000 (19:15 +0200)]
hwmon: (applesmc) Always read until end of data

commit 25f2bd7f5add608c1d1405938f39c96927b275ca upstream.

The crash reported and investigated in commit 5f4513 turned out to be
caused by a change to the read interface on newer (2012) SMCs.

Tests by Chris show that simply reading the data valid line is enough
for the problem to go away. Additional tests show that the newer SMCs
no longer wait for the number of requested bytes, but start sending
data right away.  Apparently the number of bytes to read is no longer
specified as before, but instead found out by reading until end of
data. Failure to read until end of data confuses the state machine,
which eventually causes the crash.

As a remedy, assuming bit0 is the read valid line, make sure there is
nothing more to read before leaving the read function.

Tested to resolve the original problem, and runtested on MBA3,1,
MBP4,1, MBP8,2, MBP10,1, MBP10,2. The patch seems to have no effect on
machines before 2012.

Tested-by: Chris Murphy <chris@cmurf.com>
Signed-off-by: Henrik Rydberg <rydberg@euromail.se>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agomac80211: correctly close cancelled scans
Emmanuel Grumbach [Mon, 16 Sep 2013 08:12:07 +0000 (11:12 +0300)]
mac80211: correctly close cancelled scans

commit a754055a1296fcbe6f32de3a5eaca6efb2fd1865 upstream.

__ieee80211_scan_completed is called from a worker. This
means that the following flow is possible.

 * driver calls ieee80211_scan_completed
 * mac80211 cancels the scan (that is already complete)
 * __ieee80211_scan_completed runs

When scan_work will finally run, it will see that the scan
hasn't been aborted and might even trigger another scan on
another band. This leads to a situation where cfg80211's
scan is not done and no further scan can be issued.

Fix this by setting a new flag when a HW scan is being
cancelled so that no other scan will be triggered.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoALSA: hda - Add fixup for ASUS N56VZ
Takashi Iwai [Tue, 8 Oct 2013 17:57:50 +0000 (19:57 +0200)]
ALSA: hda - Add fixup for ASUS N56VZ

commit c6cc3d58b4042f5cadae653ff8d3df26af1a0169 upstream.

ASUS N56VZ needs a fixup for the bass speaker pin, which was already
provided via model=asus-mode4.

Bugzilla: https://bugzilla.novell.com/show_bug.cgi?id=841645
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agolibata: make ata_eh_qc_retry() bump scmd->allowed on bogus failures
Gwendal Grignou [Fri, 7 Aug 2009 23:17:49 +0000 (16:17 -0700)]
libata: make ata_eh_qc_retry() bump scmd->allowed on bogus failures

commit f13e220161e738c2710b9904dcb3cf8bb0bcce61 upstream.

libata EH decrements scmd->retries when the command failed for reasons
unrelated to the command itself so that, for example, commands aborted
due to suspend / resume cycle don't get penalized; however,
decrementing scmd->retries isn't enough for ATA passthrough commands.

Without this fix, ATA passthrough commands are not resend to the
drive, and no error is signalled to the caller because:

- allowed retry count is 1
- ata_eh_qc_complete fill the sense data, so result is valid
- sense data is filled with untouched ATA registers.

Signed-off-by: Gwendal Grignou <gwendal@google.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoALSA: snd-usb-usx2y: remove bogus frame checks
Daniel Mack [Wed, 2 Oct 2013 15:49:50 +0000 (17:49 +0200)]
ALSA: snd-usb-usx2y: remove bogus frame checks

commit a9d14bc0b188a822e42787d01e56c06fe9750162 upstream.

The frame check in i_usX2Y_urb_complete() and
i_usX2Y_usbpcm_urb_complete() is bogus and produces false positives as
described in this LAU thread:

  http://linuxaudio.org/mailarchive/lau/2013/5/20/200177

This patch removes the check code entirely.

Cc: fzu@wemgehoertderstaat.de
Reported-by: Dr Nicholas J Bailey <nicholas.bailey@glasgow.ac.uk>
Suggested-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Daniel Mack <zonque@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoiwlwifi: pcie: add SKUs for 6000, 6005 and 6235 series
Emmanuel Grumbach [Tue, 24 Sep 2013 16:34:26 +0000 (19:34 +0300)]
iwlwifi: pcie: add SKUs for 6000, 6005 and 6235 series

commit 08a5dd3842f2ac61c6d69661d2d96022df8ae359 upstream.

Add some new PCI IDs to the table for 6000, 6005 and 6235 series.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
[bwh: Backported to 3.2:
 - Adjust filenames
 - Drop const from struct iwl_cfg]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoiwlwifi: add new pci id for 6x35 series
Shuduo Sang [Sat, 30 Mar 2013 06:26:37 +0000 (14:26 +0800)]
iwlwifi: add new pci id for 6x35 series

commit 20ecf9fd3bebc4147e2996c08a75d6f0229b90df upstream.

some new thinkpad laptops use intel chip with new pci id need be added
lspci -vnn output:
 Network controller [0280]: Intel Corporation Centrino Advanced-N 6235
 [8086:088f] (rev 24)
 Subsystem: Intel Corporation Device [8086:5260]

Signed-off-by: Shuduo Sang <sangshuduo@gmail.com>
Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
[bwh: Backported to 3.2: adjust filename]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoiwlwifi: one more sku added to 6x35 series
Wey-Yi Guy [Wed, 22 Feb 2012 16:18:55 +0000 (08:18 -0800)]
iwlwifi: one more sku added to 6x35 series

commit 259653d86b80ed01c70d47b7307140ae0ba19420 upstream.

Add new sku to 6x35 series

Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoiwlwifi: update pci subsystem id
Wey-Yi Guy [Wed, 22 Feb 2012 18:21:09 +0000 (10:21 -0800)]
iwlwifi: update pci subsystem id

commit 378911233f424d7a1bf4a579587ae71c7d887166 upstream.

Update the pci subsystem id and product name for 6005 series devices

Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoiwlwifi: remove un-supported SKUs
Wey-Yi Guy [Thu, 10 Nov 2011 14:55:06 +0000 (06:55 -0800)]
iwlwifi: remove un-supported SKUs

commit b6cb406a023184733bffc7762a75a2e204fff6b9 upstream.

BG only SKUs are no longer supported by 2000 and 1x5 series. Remove it

Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoiwlwifi: two more SKUs for 6x05 series
Wey-Yi Guy [Thu, 10 Nov 2011 14:55:03 +0000 (06:55 -0800)]
iwlwifi: two more SKUs for 6x05 series

commit 75a56eccb01fcc3c1ae8000130f3c9b3c8ec68d9 upstream.

Add two more SKUs for 6x05 series of device.
First SKU has low 5GHz channels actives, the other SKU has high 5GHz channels actives.

Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agotile: use a more conservative __my_cpu_offset in CONFIG_PREEMPT
Chris Metcalf [Thu, 26 Sep 2013 17:24:53 +0000 (13:24 -0400)]
tile: use a more conservative __my_cpu_offset in CONFIG_PREEMPT

commit f862eefec0b68e099a9fa58d3761ffb10bad97e1 upstream.

It turns out the kernel relies on barrier() to force a reload of the
percpu offset value.  Since we can't easily modify the definition of
barrier() to include "tp" as an output register, we instead provide a
definition of __my_cpu_offset as extended assembly that includes a fake
stack read to hazard against barrier(), forcing gcc to know that it
must reread "tp" and recompute anything based on "tp" after a barrier.

This fixes observed hangs in the slub allocator when we are looping
on a percpu cmpxchg_double.

A similar fix for ARMv7 was made in June in change 509eb76ebf97.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agomac80211: update sta->last_rx on acked tx frames
Felix Fietkau [Sun, 29 Sep 2013 19:39:34 +0000 (21:39 +0200)]
mac80211: update sta->last_rx on acked tx frames

commit 0c5b93290b2f3c7a376567c03ae8d385b0e99851 upstream.

When clients are idle for too long, hostapd sends nullfunc frames for
probing. When those are acked by the client, the idle time needs to be
updated.

To make this work (and to avoid unnecessary probing), update sta->last_rx
whenever an ACK was received for a tx packet. Only do this if the flag
IEEE80211_HW_REPORTS_TX_ACK_STATUS is set.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agomac80211: drop spoofed packets in ad-hoc mode
Felix Fietkau [Tue, 17 Sep 2013 09:15:43 +0000 (11:15 +0200)]
mac80211: drop spoofed packets in ad-hoc mode

commit 6329b8d917adc077caa60c2447385554130853a3 upstream.

If an Ad-Hoc node receives packets with the Cell ID or its own MAC
address as source address, it hits a WARN_ON in sta_info_insert_check()
With many packets, this can massively spam the logs. One way that this
can easily happen is through having Cisco APs in the area with rouge AP
detection and countermeasures enabled.
Such Cisco APs will regularly send fake beacons, disassoc and deauth
packets that trigger these warnings.

To fix this issue, drop such spoofed packets early in the rx path.

Reported-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de>
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
[bwh: Backported to 3.2: use compare_ether_addr() instead of ether_addr_equal()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agorandom: run random_int_secret_init() run after all late_initcalls
Theodore Ts'o [Tue, 10 Sep 2013 14:52:35 +0000 (10:52 -0400)]
random: run random_int_secret_init() run after all late_initcalls

commit 47d06e532e95b71c0db3839ebdef3fe8812fca2c upstream.

The some platforms (e.g., ARM) initializes their clocks as
late_initcalls for some unknown reason.  So make sure
random_int_secret_init() is run after all of the late_initcalls are
run.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agojfs: fix error path in ialloc
Dave Kleikamp [Sat, 7 Sep 2013 02:49:56 +0000 (21:49 -0500)]
jfs: fix error path in ialloc

commit 8660998608cfa1077e560034db81885af8e1e885 upstream.

If insert_inode_locked() fails, we shouldn't be calling
unlock_new_inode().

Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Tested-by: Michael L. Semon <mlsemon35@gmail.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoinclude/linux/fs.h: disable preempt when acquire i_size_seqcount write lock
Fan Du [Tue, 30 Apr 2013 22:27:27 +0000 (15:27 -0700)]
include/linux/fs.h: disable preempt when acquire i_size_seqcount write lock

commit 74e3d1e17b2e11d175970b85acd44f5927000ba2 upstream.

Two rt tasks bind to one CPU core.

The higher priority rt task A preempts a lower priority rt task B which
has already taken the write seq lock, and then the higher priority rt
task A try to acquire read seq lock, it's doomed to lockup.

rt task A with lower priority: call write
i_size_write                                        rt task B with higher priority: call sync, and preempt task A
  write_seqcount_begin(&inode->i_size_seqcount);    i_size_read
  inode->i_size = i_size;                             read_seqcount_begin <-- lockup here...

So disable preempt when acquiring every i_size_seqcount *write* lock will
cure the problem.

Signed-off-by: Fan Du <fan.du@windriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agotracing: Fix potential out-of-bounds in trace_get_user()
Steven Rostedt [Thu, 10 Oct 2013 02:23:23 +0000 (22:23 -0400)]
tracing: Fix potential out-of-bounds in trace_get_user()

commit 057db8488b53d5e4faa0cedb2f39d4ae75dfbdbb upstream.

Andrey reported the following report:

ERROR: AddressSanitizer: heap-buffer-overflow on address ffff8800359c99f3
ffff8800359c99f3 is located 0 bytes to the right of 243-byte region [ffff8800359c9900ffff8800359c99f3)
Accessed by thread T13003:
  #0 ffffffff810dd2da (asan_report_error+0x32a/0x440)
  #1 ffffffff810dc6b0 (asan_check_region+0x30/0x40)
  #2 ffffffff810dd4d3 (__tsan_write1+0x13/0x20)
  #3 ffffffff811cd19e (ftrace_regex_release+0x1be/0x260)
  #4 ffffffff812a1065 (__fput+0x155/0x360)
  #5 ffffffff812a12de (____fput+0x1e/0x30)
  #6 ffffffff8111708d (task_work_run+0x10d/0x140)
  #7 ffffffff810ea043 (do_exit+0x433/0x11f0)
  #8 ffffffff810eaee4 (do_group_exit+0x84/0x130)
  #9 ffffffff810eafb1 (SyS_exit_group+0x21/0x30)
  #10 ffffffff81928782 (system_call_fastpath+0x16/0x1b)

Allocated by thread T5167:
  #0 ffffffff810dc778 (asan_slab_alloc+0x48/0xc0)
  #1 ffffffff8128337c (__kmalloc+0xbc/0x500)
  #2 ffffffff811d9d54 (trace_parser_get_init+0x34/0x90)
  #3 ffffffff811cd7b3 (ftrace_regex_open+0x83/0x2e0)
  #4 ffffffff811cda7d (ftrace_filter_open+0x2d/0x40)
  #5 ffffffff8129b4ff (do_dentry_open+0x32f/0x430)
  #6 ffffffff8129b668 (finish_open+0x68/0xa0)
  #7 ffffffff812b66ac (do_last+0xb8c/0x1710)
  #8 ffffffff812b7350 (path_openat+0x120/0xb50)
  #9 ffffffff812b8884 (do_filp_open+0x54/0xb0)
  #10 ffffffff8129d36c (do_sys_open+0x1ac/0x2c0)
  #11 ffffffff8129d4b7 (SyS_open+0x37/0x50)
  #12 ffffffff81928782 (system_call_fastpath+0x16/0x1b)

Shadow bytes around the buggy address:
  ffff8800359c9700: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  ffff8800359c9780: fd fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa
  ffff8800359c9800: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  ffff8800359c9880: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  ffff8800359c9900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>ffff8800359c9980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00[03]fb
  ffff8800359c9a00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  ffff8800359c9a80: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  ffff8800359c9b00: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
  ffff8800359c9b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  ffff8800359c9c00: 00 00 00 00 00 00 00 00 fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap redzone:          fa
  Heap kmalloc redzone:  fb
  Freed heap region:     fd
  Shadow gap:            fe

The out-of-bounds access happens on 'parser->buffer[parser->idx] = 0;'

Although the crash happened in ftrace_regex_open() the real bug
occurred in trace_get_user() where there's an incrementation to
parser->idx without a check against the size. The way it is triggered
is if userspace sends in 128 characters (EVENT_BUF_SIZE + 1), the loop
that reads the last character stores it and then breaks out because
there is no more characters. Then the last character is read to determine
what to do next, and the index is incremented without checking size.

Then the caller of trace_get_user() usually nulls out the last character
with a zero, but since the index is equal to the size, it writes a nul
character after the allocated space, which can corrupt memory.

Luckily, only root user has write access to this file.

Link: http://lkml.kernel.org/r/20131009222323.04fd1a0d@gandalf.local.home
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agonetfilter: nf_ct_sip: don't drop packets with offsets pointing outside the packet
Patrick McHardy [Fri, 5 Apr 2013 08:13:30 +0000 (08:13 +0000)]
netfilter: nf_ct_sip: don't drop packets with offsets pointing outside the packet

commit 3a7b21eaf4fb3c971bdb47a98f570550ddfe4471 upstream.

Some Cisco phones create huge messages that are spread over multiple packets.
After calculating the offset of the SIP body, it is validated to be within
the packet and the packet is dropped otherwise. This breaks operation of
these phones. Since connection tracking is supposed to be passive, just let
those packets pass unmodified and untracked.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
[bwh: Backported to 3.2: there is no log message to delete]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years ago8139cp: re-enable interrupts after tx timeout
David Woodhouse [Sat, 24 Nov 2012 12:11:21 +0000 (12:11 +0000)]
8139cp: re-enable interrupts after tx timeout

commit 01ffc0a7f1c1801a2354719dedbc32aff45b987d upstream.

Recovery doesn't work too well if we leave interrupts disabled...

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Acked-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoinet: fix possible memory corruption with UDP_CORK and UFO
Hannes Frederic Sowa [Mon, 21 Oct 2013 22:07:47 +0000 (00:07 +0200)]
inet: fix possible memory corruption with UDP_CORK and UFO

[ This is a simplified -stable version of a set of upstream commits. ]

This is a replacement patch only for stable which does fix the problems
handled by the following two commits in -net:

"ip_output: do skb ufo init for peeked non ufo skb as well" (e93b7d748be887cd7639b113ba7d7ef792a7efb9)
"ip6_output: do skb ufo init for peeked non ufo skb as well" (c547dbf55d5f8cf615ccc0e7265e98db27d3fb8b)

Three frames are written on a corked udp socket for which the output
netdevice has UFO enabled.  If the first and third frame are smaller than
the mtu and the second one is bigger, we enqueue the second frame with
skb_append_datato_frags without initializing the gso fields. This leads
to the third frame appended regulary and thus constructing an invalid skb.

This fixes the problem by always using skb_append_datato_frags as soon
as the first frag got enqueued to the skb without marking the packet
as SKB_GSO_UDP.

The problem with only two frames for ipv6 was fixed by "ipv6: udp
packets following an UFO enqueued packet need also be handled by UFO"
(2811ebac2521ceac84f2bdae402455baa6a7fb47).

Cc: Jiri Pirko <jiri@resnulli.us>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoperf tools: Fix getrusage() related build failure on glibc trunk
Markus Trippelsdorf [Wed, 4 Apr 2012 08:45:27 +0000 (10:45 +0200)]
perf tools: Fix getrusage() related build failure on glibc trunk

commit 7b78f13603c6fcb64e020a0bbe31a651ea2b657b upstream.

On a system running glibc trunk perf doesn't build:

    CC builtin-sched.o
builtin-sched.c: In function ‘get_cpu_usage_nsec_parent’: builtin-sched.c:399:16: error: storage size of ‘ru’ isn’t known builtin-sched.c:403:2: error: implicit declaration of function ‘getrusage’ [-Werror=implicit-function-declaration]
    [...]

Fix it by including sys/resource.h.

Signed-off-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20120404084527.GA294@x4
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoxen-netback: use jiffies_64 value to calculate credit timeout
Wei Liu [Mon, 28 Oct 2013 12:07:57 +0000 (12:07 +0000)]
xen-netback: use jiffies_64 value to calculate credit timeout

[ Upstream commit 059dfa6a93b779516321e5112db9d7621b1367ba ]

time_after_eq() only works if the delta is < MAX_ULONG/2.

For a 32bit Dom0, if netfront sends packets at a very low rate, the time
between subsequent calls to tx_credit_exceeded() may exceed MAX_ULONG/2
and the test for timer_after_eq() will be incorrect. Credit will not be
replenished and the guest may become unable to send packets (e.g., if
prior to the long gap, all credit was exhausted).

Use jiffies_64 variant to mitigate this problem for 32bit Dom0.

Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Jason Luan <jianhai.luan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoperf: Fix perf ring buffer memory ordering
Peter Zijlstra [Mon, 28 Oct 2013 12:55:29 +0000 (13:55 +0100)]
perf: Fix perf ring buffer memory ordering

commit bf378d341e4873ed928dc3c636252e6895a21f50 upstream.

The PPC64 people noticed a missing memory barrier and crufty old
comments in the perf ring buffer code. So update all the comments and
add the missing barrier.

When the architecture implements local_t using atomic_long_t there
will be double barriers issued; but short of introducing more
conditional barrier primitives this is the best we can do.

Reported-by: Victor Kaplansky <victork@il.ibm.com>
Tested-by: Victor Kaplansky <victork@il.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: michael@ellerman.id.au
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: anton@samba.org
Cc: benh@kernel.crashing.org
Link: http://lkml.kernel.org/r/20131025173749.GG19466@laptop.lan
Signed-off-by: Ingo Molnar <mingo@kernel.org>
[bwh: Backported to 3.2: adjust filename]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agozram: allow request end to coincide with disksize
Sergey Senozhatsky [Sat, 22 Jun 2013 14:21:00 +0000 (17:21 +0300)]
zram: allow request end to coincide with disksize

commit 75c7caf5a052ffd8db3312fa7864ee2d142890c4 upstream.

Pass valid_io_request() checks if request end coincides with disksize
(end equals bound), only fail if we attempt to read beyond the bound.

mkfs.ext2 produces numerous errors:
[ 2164.632747] quiet_error: 1 callbacks suppressed
[ 2164.633260] Buffer I/O error on device zram0, logical block 153599
[ 2164.633265] lost page write due to I/O error on zram0

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoext3: return 32/64-bit dir name hash according to usage type
Eric Sandeen [Thu, 26 Apr 2012 18:10:39 +0000 (13:10 -0500)]
ext3: return 32/64-bit dir name hash according to usage type

commit d7dab39b6e16d5eea78ed3c705d2a2d0772b4f06 upstream.

This is based on commit d1f5273e9adb40724a85272f248f210dc4ce919a
ext4: return 32/64-bit dir name hash according to usage type
by Fan Yong <yong.fan@whamcloud.com>

Traditionally ext2/3/4 has returned a 32-bit hash value from llseek()
to appease NFSv2, which can only handle a 32-bit cookie for seekdir()
and telldir().  However, this causes problems if there are 32-bit hash
collisions, since the NFSv2 server can get stuck resending the same
entries from the directory repeatedly.

Allow ext3 to return a full 64-bit hash (both major and minor) for
telldir to decrease the chance of hash collisions.

This patch does implement a new ext3_dir_llseek op, because with 64-bit
hashes, nfs will attempt to seek to a hash "offset" which is much
larger than ext3's s_maxbytes.  So for dx dirs, we call
generic_file_llseek_size() with the appropriate max hash value as the
maximum seekable size.  Otherwise we just pass through to
generic_file_llseek().

Patch-updated-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
Patch-updated-by: Eric Sandeen <sandeen@redhat.com>
(blame us if something is not correct)

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agonfsd: vfs_llseek() with 32 or 64 bit offsets (hashes)
Bernd Schubert [Mon, 19 Mar 2012 02:44:50 +0000 (22:44 -0400)]
nfsd: vfs_llseek() with 32 or 64 bit offsets (hashes)

commit 06effdbb49af5f6c7d20affaec74603914acc768 upstream.

Use 32-bit or 64-bit llseek() hashes for directory offsets depending on
the NFS version. NFSv2 gets 32-bit hashes only.

NOTE: This patch got rather complex as Christoph asked to set the
filp->f_mode flag in the open call or immediatly after dentry_open()
in nfsd_open() to avoid races.
Personally I still do not see a reason for that and in my opinion
FMODE_32BITHASH/FMODE_64BITHASH flags could be set nfsd_readdir(), as it
follows directly after nfsd_open() without a chance of races.

Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Acked-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agonfsd: rename 'int access' to 'int may_flags' in nfsd_open()
Bernd Schubert [Mon, 19 Mar 2012 02:44:49 +0000 (22:44 -0400)]
nfsd: rename 'int access' to 'int may_flags' in nfsd_open()

commit 999448a8c0202d8c41711c92385323520644527b upstream.

Just rename this variable, as the next patch will add a flag and
'access' as variable name would not be correct any more.

Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Acked-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoext4: return 32/64-bit dir name hash according to usage type
Fan Yong [Mon, 19 Mar 2012 02:44:40 +0000 (22:44 -0400)]
ext4: return 32/64-bit dir name hash according to usage type

commit d1f5273e9adb40724a85272f248f210dc4ce919a upstream.

Traditionally ext2/3/4 has returned a 32-bit hash value from llseek()
to appease NFSv2, which can only handle a 32-bit cookie for seekdir()
and telldir().  However, this causes problems if there are 32-bit hash
collisions, since the NFSv2 server can get stuck resending the same
entries from the directory repeatedly.

Allow ext4 to return a full 64-bit hash (both major and minor) for
telldir to decrease the chance of hash collisions.  This still needs
integration on the NFS side.

Patch-updated-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
(blame me if something is not correct)

Signed-off-by: Fan Yong <yong.fan@whamcloud.com>
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agofs: add new FMODE flags: FMODE_32bithash and FMODE_64bithash
Bernd Schubert [Wed, 14 Mar 2012 02:51:38 +0000 (22:51 -0400)]
fs: add new FMODE flags: FMODE_32bithash and FMODE_64bithash

commit 6a8a13e03861c0ab83ab07d573ca793cff0e5d00 upstream.

Those flags are supposed to be set by NFS readdir() to tell ext3/ext4
to 32bit (NFSv2) or 64bit hash values (offsets) in seekdir().

Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoPCI: fix truncation of resource size to 32 bits
Nikhil P Rao [Wed, 20 Jun 2012 19:56:00 +0000 (12:56 -0700)]
PCI: fix truncation of resource size to 32 bits

commit d6776e6d5c2f8db0252f447b09736075e1bbe387 upstream.

_pci_assign_resource() took an int "size" argument, which meant that
sizes larger than 4GB were truncated.  Change type to resource_size_t.

[bhelgaas: changelog]
Signed-off-by: Nikhil P Rao <nikhil.rao@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agodavinci_emac.c: Fix IFF_ALLMULTI setup
Mariusz Ceier [Mon, 21 Oct 2013 17:45:04 +0000 (19:45 +0200)]
davinci_emac.c: Fix IFF_ALLMULTI setup

[ Upstream commit d69e0f7ea95fef8059251325a79c004bac01f018 ]

When IFF_ALLMULTI flag is set on interface and IFF_PROMISC isn't,
emac_dev_mcast_set should only enable RX of multicasts and reset
MACHASH registers.

It does this, but afterwards it either sets up multicast MACs
filtering or disables RX of multicasts and resets MACHASH registers
again, rendering IFF_ALLMULTI flag useless.

This patch fixes emac_dev_mcast_set, so that multicast MACs filtering and
disabling of RX of multicasts are skipped when IFF_ALLMULTI flag is set.

Tested with kernel 2.6.37.

Signed-off-by: Mariusz Ceier <mceier+kernel@gmail.com>
Acked-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agonet: fix cipso packet validation when !NETLABEL
Seif Mazareeb [Fri, 18 Oct 2013 03:33:21 +0000 (20:33 -0700)]
net: fix cipso packet validation when !NETLABEL

[ Upstream commit f2e5ddcc0d12f9c4c7b254358ad245c9dddce13b ]

When CONFIG_NETLABEL is disabled, the cipso_v4_validate() function could loop
forever in the main loop if opt[opt_iter +1] == 0, this will causing a kernel
crash in an SMP system, since the CPU executing this function will
stall /not respond to IPIs.

This problem can be reproduced by running the IP Stack Integrity Checker
(http://isic.sourceforge.net) using the following command on a Linux machine
connected to DUT:

"icmpsic -s rand -d <DUT IP address> -r 123456"
wait (1-2 min)

Signed-off-by: Seif Mazareeb <seif@marvell.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agonet: unix: inherit SOCK_PASS{CRED, SEC} flags from socket to fix race
Daniel Borkmann [Thu, 17 Oct 2013 20:51:31 +0000 (22:51 +0200)]
net: unix: inherit SOCK_PASS{CRED, SEC} flags from socket to fix race

[ Upstream commit 90c6bd34f884cd9cee21f1d152baf6c18bcac949 ]

In the case of credentials passing in unix stream sockets (dgram
sockets seem not affected), we get a rather sparse race after
commit 16e5726 ("af_unix: dont send SCM_CREDENTIALS by default").

We have a stream server on receiver side that requests credential
passing from senders (e.g. nc -U). Since we need to set SO_PASSCRED
on each spawned/accepted socket on server side to 1 first (as it's
not inherited), it can happen that in the time between accept() and
setsockopt() we get interrupted, the sender is being scheduled and
continues with passing data to our receiver. At that time SO_PASSCRED
is neither set on sender nor receiver side, hence in cmsg's
SCM_CREDENTIALS we get eventually pid:0, uid:65534, gid:65534
(== overflow{u,g}id) instead of what we actually would like to see.

On the sender side, here nc -U, the tests in maybe_add_creds()
invoked through unix_stream_sendmsg() would fail, as at that exact
time, as mentioned, the sender has neither SO_PASSCRED on his side
nor sees it on the server side, and we have a valid 'other' socket
in place. Thus, sender believes it would just look like a normal
connection, not needing/requesting SO_PASSCRED at that time.

As reverting 16e5726 would not be an option due to the significant
performance regression reported when having creds always passed,
one way/trade-off to prevent that would be to set SO_PASSCRED on
the listener socket and allow inheriting these flags to the spawned
socket on server side in accept(). It seems also logical to do so
if we'd tell the listener socket to pass those flags onwards, and
would fix the race.

Before, strace:

recvmsg(4, {msg_name(0)=NULL, msg_iov(1)=[{"blub\n", 4096}],
        msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET,
        cmsg_type=SCM_CREDENTIALS{pid=0, uid=65534, gid=65534}},
        msg_flags=0}, 0) = 5

After, strace:

recvmsg(4, {msg_name(0)=NULL, msg_iov(1)=[{"blub\n", 4096}],
        msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET,
        cmsg_type=SCM_CREDENTIALS{pid=11580, uid=1000, gid=1000}},
        msg_flags=0}, 0) = 5

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agowanxl: fix info leak in ioctl
Salva Peiró [Wed, 16 Oct 2013 10:46:50 +0000 (12:46 +0200)]
wanxl: fix info leak in ioctl

[ Upstream commit 2b13d06c9584b4eb773f1e80bbaedab9a1c344e1 ]

The wanxl_ioctl() code fails to initialize the two padding bytes of
struct sync_serial_settings after the ->loopback member. Add an explicit
memset(0) before filling the structure to avoid the info leak.

Signed-off-by: Salva Peiró <speiro@ai2.upv.es>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agosctp: Perform software checksum if packet has to be fragmented.
Vlad Yasevich [Wed, 16 Oct 2013 02:01:31 +0000 (22:01 -0400)]
sctp: Perform software checksum if packet has to be fragmented.

[ Upstream commit d2dbbba77e95dff4b4f901fee236fef6d9552072 ]

IP/IPv6 fragmentation knows how to compute only TCP/UDP checksum.
This causes problems if SCTP packets has to be fragmented and
ipsummed has been set to PARTIAL due to checksum offload support.
This condition can happen when retransmitting after MTU discover,
or when INIT or other control chunks are larger then MTU.
Check for the rare fragmentation condition in SCTP and use software
checksum calculation in this case.

CC: Fan Du <fan.du@windriver.com>
Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agosctp: Use software crc32 checksum when xfrm transform will happen.
Fan Du [Wed, 16 Oct 2013 02:01:30 +0000 (22:01 -0400)]
sctp: Use software crc32 checksum when xfrm transform will happen.

[ Upstream commit 27127a82561a2a3ed955ce207048e1b066a80a2a ]

igb/ixgbe have hardware sctp checksum support, when this feature is enabled
and also IPsec is armed to protect sctp traffic, ugly things happened as
xfrm_output checks CHECKSUM_PARTIAL to do checksum operation(sum every thing
up and pack the 16bits result in the checksum field). The result is fail
establishment of sctp communication.

Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Fan Du <fan.du@windriver.com>
Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agonet: dst: provide accessor function to dst->xfrm
Vlad Yasevich [Wed, 16 Oct 2013 02:01:29 +0000 (22:01 -0400)]
net: dst: provide accessor function to dst->xfrm

[ Upstream commit e87b3998d795123b4139bc3f25490dd236f68212 ]

dst->xfrm is conditionally defined.  Provide accessor funtion that
is always available.

Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agobnx2x: record rx queue for LRO packets
Eric Dumazet [Sat, 12 Oct 2013 21:08:34 +0000 (14:08 -0700)]
bnx2x: record rx queue for LRO packets

[ Upstream commit 60e66fee56b2256dcb1dc2ea1b2ddcb6e273857d ]

RPS support is kind of broken on bnx2x, because only non LRO packets
get proper rx queue information. This triggers reorders, as it seems
bnx2x like to generate a non LRO packet for segment including TCP PUSH
flag : (this might be pure coincidence, but all the reorders I've
seen involve segments with a PUSH)

11:13:34.335847 IP A > B: . 415808:447136(31328) ack 1 win 457 <nop,nop,timestamp 3789336 3985797>
11:13:34.335992 IP A > B: . 447136:448560(1424) ack 1 win 457 <nop,nop,timestamp 3789336 3985797>
11:13:34.336391 IP A > B: . 448560:479888(31328) ack 1 win 457 <nop,nop,timestamp 3789337 3985797>
11:13:34.336425 IP A > B: P 511216:512640(1424) ack 1 win 457 <nop,nop,timestamp 3789337 3985798>
11:13:34.336423 IP A > B: . 479888:511216(31328) ack 1 win 457 <nop,nop,timestamp 3789337 3985798>
11:13:34.336924 IP A > B: . 512640:543968(31328) ack 1 win 457 <nop,nop,timestamp 3789337 3985798>
11:13:34.336963 IP A > B: . 543968:575296(31328) ack 1 win 457 <nop,nop,timestamp 3789337 3985798>

We must call skb_record_rx_queue() to properly give to RPS (and more
generally for TX queue selection on forward path) the receive queue
information.

Similar fix is needed for skb_mark_napi_id(), but will be handled
in a separate patch to ease stable backports.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Eilon Greenstein <eilong@broadcom.com>
Acked-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoconnector: use nlmsg_len() to check message length
Mathias Krause [Mon, 30 Sep 2013 20:03:07 +0000 (22:03 +0200)]
connector: use nlmsg_len() to check message length

[ Upstream commit 162b2bedc084d2d908a04c93383ba02348b648b0 ]

The current code tests the length of the whole netlink message to be
at least as long to fit a cn_msg. This is wrong as nlmsg_len includes
the length of the netlink message header. Use nlmsg_len() instead to
fix this "off-by-NLMSG_HDRLEN" size check.

Cc: stable@vger.kernel.org # v2.6.14+
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agofarsync: fix info leak in ioctl
Salva Peiró [Fri, 11 Oct 2013 09:50:03 +0000 (12:50 +0300)]
farsync: fix info leak in ioctl

[ Upstream commit 96b340406724d87e4621284ebac5e059d67b2194 ]

The fst_get_iface() code fails to initialize the two padding bytes of
struct sync_serial_settings after the ->loopback member. Add an explicit
memset(0) before filling the structure to avoid the info leak.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agol2tp: must disable bh before calling l2tp_xmit_skb()
Eric Dumazet [Thu, 10 Oct 2013 13:30:09 +0000 (06:30 -0700)]
l2tp: must disable bh before calling l2tp_xmit_skb()

[ Upstream commit 455cc32bf128e114455d11ad919321ab89a2c312 ]

François Cachereul made a very nice bug report and suspected
the bh_lock_sock() / bh_unlok_sock() pair used in l2tp_xmit_skb() from
process context was not good.

This problem was added by commit 6af88da14ee284aaad6e4326da09a89191ab6165
("l2tp: Fix locking in l2tp_core.c").

l2tp_eth_dev_xmit() runs from BH context, so we must disable BH
from other l2tp_xmit_skb() users.

[  452.060011] BUG: soft lockup - CPU#1 stuck for 23s! [accel-pppd:6662]
[  452.061757] Modules linked in: l2tp_ppp l2tp_netlink l2tp_core pppoe pppox
ppp_generic slhc ipv6 ext3 mbcache jbd virtio_balloon xfs exportfs dm_mod
virtio_blk ata_generic virtio_net floppy ata_piix libata virtio_pci virtio_ring virtio [last unloaded: scsi_wait_scan]
[  452.064012] CPU 1
[  452.080015] BUG: soft lockup - CPU#2 stuck for 23s! [accel-pppd:6643]
[  452.080015] CPU 2
[  452.080015]
[  452.080015] Pid: 6643, comm: accel-pppd Not tainted 3.2.46.mini #1 Bochs Bochs
[  452.080015] RIP: 0010:[<ffffffff81059f6c>]  [<ffffffff81059f6c>] do_raw_spin_lock+0x17/0x1f
[  452.080015] RSP: 0018:ffff88007125fc18  EFLAGS: 00000293
[  452.080015] RAX: 000000000000aba9 RBX: ffffffff811d0703 RCX: 0000000000000000
[  452.080015] RDX: 00000000000000ab RSI: ffff8800711f6896 RDI: ffff8800745c8110
[  452.080015] RBP: ffff88007125fc18 R08: 0000000000000020 R09: 0000000000000000
[  452.080015] R10: 0000000000000000 R11: 0000000000000280 R12: 0000000000000286
[  452.080015] R13: 0000000000000020 R14: 0000000000000240 R15: 0000000000000000
[  452.080015] FS:  00007fdc0cc24700(0000) GS:ffff8800b6f00000(0000) knlGS:0000000000000000
[  452.080015] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  452.080015] CR2: 00007fdb054899b8 CR3: 0000000074404000 CR4: 00000000000006a0
[  452.080015] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  452.080015] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  452.080015] Process accel-pppd (pid: 6643, threadinfo ffff88007125e000, task ffff8800b27e6dd0)
[  452.080015] Stack:
[  452.080015]  ffff88007125fc28 ffffffff81256559 ffff88007125fc98 ffffffffa01b2bd1
[  452.080015]  ffff88007125fc58 000000000000000c 00000000029490d0 0000009c71dbe25e
[  452.080015]  000000000000005c 000000080000000e 0000000000000000 ffff880071170600
[  452.080015] Call Trace:
[  452.080015]  [<ffffffff81256559>] _raw_spin_lock+0xe/0x10
[  452.080015]  [<ffffffffa01b2bd1>] l2tp_xmit_skb+0x189/0x4ac [l2tp_core]
[  452.080015]  [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp]
[  452.080015]  [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24
[  452.080015]  [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6
[  452.080015]  [<ffffffff81254e88>] ? __schedule+0x5c1/0x616
[  452.080015]  [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c
[  452.080015]  [<ffffffff810bbd21>] ? fget_light+0x75/0x89
[  452.080015]  [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56
[  452.080015]  [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b
[  452.080015]  [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b
[  452.080015] Code: 81 48 89 e5 72 0c 31 c0 48 81 ff 45 66 25 81 0f 92 c0 5d c3 55 b8 00 01 00 00 48 89 e5 f0 66 0f c1 07 0f b6 d4 38 d0 74 06 f3 90 <8a> 07 eb f6 5d c3 90 90 55 48 89 e5 9c 58 0f 1f 44 00 00 5d c3
[  452.080015] Call Trace:
[  452.080015]  [<ffffffff81256559>] _raw_spin_lock+0xe/0x10
[  452.080015]  [<ffffffffa01b2bd1>] l2tp_xmit_skb+0x189/0x4ac [l2tp_core]
[  452.080015]  [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp]
[  452.080015]  [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24
[  452.080015]  [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6
[  452.080015]  [<ffffffff81254e88>] ? __schedule+0x5c1/0x616
[  452.080015]  [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c
[  452.080015]  [<ffffffff810bbd21>] ? fget_light+0x75/0x89
[  452.080015]  [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56
[  452.080015]  [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b
[  452.080015]  [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b
[  452.064012]
[  452.064012] Pid: 6662, comm: accel-pppd Not tainted 3.2.46.mini #1 Bochs Bochs
[  452.064012] RIP: 0010:[<ffffffff81059f6e>]  [<ffffffff81059f6e>] do_raw_spin_lock+0x19/0x1f
[  452.064012] RSP: 0018:ffff8800b6e83ba0  EFLAGS: 00000297
[  452.064012] RAX: 000000000000aaa9 RBX: ffff8800b6e83b40 RCX: 0000000000000002
[  452.064012] RDX: 00000000000000aa RSI: 000000000000000a RDI: ffff8800745c8110
[  452.064012] RBP: ffff8800b6e83ba0 R08: 000000000000c802 R09: 000000000000001c
[  452.064012] R10: ffff880071096c4e R11: 0000000000000006 R12: ffff8800b6e83b18
[  452.064012] R13: ffffffff8125d51e R14: ffff8800b6e83ba0 R15: ffff880072a589c0
[  452.064012] FS:  00007fdc0b81e700(0000) GS:ffff8800b6e80000(0000) knlGS:0000000000000000
[  452.064012] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  452.064012] CR2: 0000000000625208 CR3: 0000000074404000 CR4: 00000000000006a0
[  452.064012] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  452.064012] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  452.064012] Process accel-pppd (pid: 6662, threadinfo ffff88007129a000, task ffff8800744f7410)
[  452.064012] Stack:
[  452.064012]  ffff8800b6e83bb0 ffffffff81256559 ffff8800b6e83bc0 ffffffff8121c64a
[  452.064012]  ffff8800b6e83bf0 ffffffff8121ec7a ffff880072a589c0 ffff880071096c62
[  452.064012]  0000000000000011 ffffffff81430024 ffff8800b6e83c80 ffffffff8121f276
[  452.064012] Call Trace:
[  452.064012]  <IRQ>
[  452.064012]  [<ffffffff81256559>] _raw_spin_lock+0xe/0x10
[  452.064012]  [<ffffffff8121c64a>] spin_lock+0x9/0xb
[  452.064012]  [<ffffffff8121ec7a>] udp_queue_rcv_skb+0x186/0x269
[  452.064012]  [<ffffffff8121f276>] __udp4_lib_rcv+0x297/0x4ae
[  452.064012]  [<ffffffff8121c178>] ? raw_rcv+0xe9/0xf0
[  452.064012]  [<ffffffff8121f4a7>] udp_rcv+0x1a/0x1c
[  452.064012]  [<ffffffff811fe385>] ip_local_deliver_finish+0x12b/0x1a5
[  452.064012]  [<ffffffff811fe54e>] ip_local_deliver+0x53/0x84
[  452.064012]  [<ffffffff811fe1d0>] ip_rcv_finish+0x2bc/0x2f3
[  452.064012]  [<ffffffff811fe78f>] ip_rcv+0x210/0x269
[  452.064012]  [<ffffffff8101911e>] ? kvm_clock_get_cycles+0x9/0xb
[  452.064012]  [<ffffffff811d88cd>] __netif_receive_skb+0x3a5/0x3f7
[  452.064012]  [<ffffffff811d8eba>] netif_receive_skb+0x57/0x5e
[  452.064012]  [<ffffffff811cf30f>] ? __netdev_alloc_skb+0x1f/0x3b
[  452.064012]  [<ffffffffa0049126>] virtnet_poll+0x4ba/0x5a4 [virtio_net]
[  452.064012]  [<ffffffff811d9417>] net_rx_action+0x73/0x184
[  452.064012]  [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core]
[  452.064012]  [<ffffffff810343b9>] __do_softirq+0xc3/0x1a8
[  452.064012]  [<ffffffff81013b56>] ? ack_APIC_irq+0x10/0x12
[  452.064012]  [<ffffffff81256559>] ? _raw_spin_lock+0xe/0x10
[  452.064012]  [<ffffffff8125e0ac>] call_softirq+0x1c/0x26
[  452.064012]  [<ffffffff81003587>] do_softirq+0x45/0x82
[  452.064012]  [<ffffffff81034667>] irq_exit+0x42/0x9c
[  452.064012]  [<ffffffff8125e146>] do_IRQ+0x8e/0xa5
[  452.064012]  [<ffffffff8125676e>] common_interrupt+0x6e/0x6e
[  452.064012]  <EOI>
[  452.064012]  [<ffffffff810b82a1>] ? kfree+0x8a/0xa3
[  452.064012]  [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core]
[  452.064012]  [<ffffffffa01b2c25>] ? l2tp_xmit_skb+0x1dd/0x4ac [l2tp_core]
[  452.064012]  [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp]
[  452.064012]  [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24
[  452.064012]  [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6
[  452.064012]  [<ffffffff81254e88>] ? __schedule+0x5c1/0x616
[  452.064012]  [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c
[  452.064012]  [<ffffffff810bbd21>] ? fget_light+0x75/0x89
[  452.064012]  [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56
[  452.064012]  [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b
[  452.064012]  [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b
[  452.064012] Code: 89 e5 72 0c 31 c0 48 81 ff 45 66 25 81 0f 92 c0 5d c3 55 b8 00 01 00 00 48 89 e5 f0 66 0f c1 07 0f b6 d4 38 d0 74 06 f3 90 8a 07 <eb> f6 5d c3 90 90 55 48 89 e5 9c 58 0f 1f 44 00 00 5d c3 55 48
[  452.064012] Call Trace:
[  452.064012]  <IRQ>  [<ffffffff81256559>] _raw_spin_lock+0xe/0x10
[  452.064012]  [<ffffffff8121c64a>] spin_lock+0x9/0xb
[  452.064012]  [<ffffffff8121ec7a>] udp_queue_rcv_skb+0x186/0x269
[  452.064012]  [<ffffffff8121f276>] __udp4_lib_rcv+0x297/0x4ae
[  452.064012]  [<ffffffff8121c178>] ? raw_rcv+0xe9/0xf0
[  452.064012]  [<ffffffff8121f4a7>] udp_rcv+0x1a/0x1c
[  452.064012]  [<ffffffff811fe385>] ip_local_deliver_finish+0x12b/0x1a5
[  452.064012]  [<ffffffff811fe54e>] ip_local_deliver+0x53/0x84
[  452.064012]  [<ffffffff811fe1d0>] ip_rcv_finish+0x2bc/0x2f3
[  452.064012]  [<ffffffff811fe78f>] ip_rcv+0x210/0x269
[  452.064012]  [<ffffffff8101911e>] ? kvm_clock_get_cycles+0x9/0xb
[  452.064012]  [<ffffffff811d88cd>] __netif_receive_skb+0x3a5/0x3f7
[  452.064012]  [<ffffffff811d8eba>] netif_receive_skb+0x57/0x5e
[  452.064012]  [<ffffffff811cf30f>] ? __netdev_alloc_skb+0x1f/0x3b
[  452.064012]  [<ffffffffa0049126>] virtnet_poll+0x4ba/0x5a4 [virtio_net]
[  452.064012]  [<ffffffff811d9417>] net_rx_action+0x73/0x184
[  452.064012]  [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core]
[  452.064012]  [<ffffffff810343b9>] __do_softirq+0xc3/0x1a8
[  452.064012]  [<ffffffff81013b56>] ? ack_APIC_irq+0x10/0x12
[  452.064012]  [<ffffffff81256559>] ? _raw_spin_lock+0xe/0x10
[  452.064012]  [<ffffffff8125e0ac>] call_softirq+0x1c/0x26
[  452.064012]  [<ffffffff81003587>] do_softirq+0x45/0x82
[  452.064012]  [<ffffffff81034667>] irq_exit+0x42/0x9c
[  452.064012]  [<ffffffff8125e146>] do_IRQ+0x8e/0xa5
[  452.064012]  [<ffffffff8125676e>] common_interrupt+0x6e/0x6e
[  452.064012]  <EOI>  [<ffffffff810b82a1>] ? kfree+0x8a/0xa3
[  452.064012]  [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core]
[  452.064012]  [<ffffffffa01b2c25>] ? l2tp_xmit_skb+0x1dd/0x4ac [l2tp_core]
[  452.064012]  [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp]
[  452.064012]  [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24
[  452.064012]  [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6
[  452.064012]  [<ffffffff81254e88>] ? __schedule+0x5c1/0x616
[  452.064012]  [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c
[  452.064012]  [<ffffffff810bbd21>] ? fget_light+0x75/0x89
[  452.064012]  [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56
[  452.064012]  [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b
[  452.064012]  [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b

Reported-by: François Cachereul <f.cachereul@alphalink.fr>
Tested-by: François Cachereul <f.cachereul@alphalink.fr>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agonet: vlan: fix nlmsg size calculation in vlan_get_size()
Marc Kleine-Budde [Mon, 7 Oct 2013 21:19:58 +0000 (23:19 +0200)]
net: vlan: fix nlmsg size calculation in vlan_get_size()

[ Upstream commit c33a39c575068c2ea9bffb22fd6de2df19c74b89 ]

This patch fixes the calculation of the nlmsg size, by adding the missing
nla_total_size().

Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoipv6: restrict neighbor entry creation to output flow
Marcelo Ricardo Leitner [Tue, 8 Oct 2013 14:41:13 +0000 (16:41 +0200)]
ipv6: restrict neighbor entry creation to output flow

This patch is based on 3.2.y branch, the one used by reporter. Please let me
know if it should be different. Thanks.

The patch which introduced the regression was applied on stables:
3.0.64 3.4.31 3.7.8 3.2.39

The patch which introduced the regression was for stable trees only.

---8<---

Commit 0d6a77079c475033cb622c07c5a880b392ef664e "ipv6: do not create
neighbor entries for local delivery" introduced a regression on
which routes to local delivery would not work anymore. Like this:

    $ ip -6 route add local 2001::/64 dev lo
    $ ping6 -c1 2001::9
    PING 2001::9(2001::9) 56 data bytes
    ping: sendmsg: Invalid argument

As this is a local delivery, that commit would not allow the creation of a
neighbor entry and thus the packet cannot be sent.

But as TPROXY scenario actually needs to avoid the neighbor entry creation only
for input flow, this patch now limits previous patch to input flow, keeping
output as before that patch.

Reported-by: Debabrata Banerjee <dbavatar@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
CC: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agocan: dev: fix nlmsg size calculation in can_get_size()
Marc Kleine-Budde [Sat, 5 Oct 2013 19:25:17 +0000 (21:25 +0200)]
can: dev: fix nlmsg size calculation in can_get_size()

[ Upstream commit fe119a05f8ca481623a8d02efcc984332e612528 ]

This patch fixes the calculation of the nlmsg size, by adding the missing
nla_total_size().

Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoipv4: fix ineffective source address selection
Jiri Benc [Fri, 4 Oct 2013 15:04:48 +0000 (17:04 +0200)]
ipv4: fix ineffective source address selection

[ Upstream commit 0a7e22609067ff524fc7bbd45c6951dd08561667 ]

When sending out multicast messages, the source address in inet->mc_addr is
ignored and rewritten by an autoselected one. This is caused by a typo in
commit 813b3b5db831 ("ipv4: Use caller's on-stack flowi as-is in output
route lookups").

Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoproc connector: fix info leaks
Mathias Krause [Mon, 30 Sep 2013 20:03:06 +0000 (22:03 +0200)]
proc connector: fix info leaks

[ Upstream commit e727ca82e0e9616ab4844301e6bae60ca7327682 ]

Initialize event_data for all possible message types to prevent leaking
kernel stack contents to userland (up to 20 bytes). Also set the flags
member of the connector message to 0 to prevent leaking two more stack
bytes this way.

Cc: stable@vger.kernel.org # v2.6.15+
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agonet: heap overflow in __audit_sockaddr()
Dan Carpenter [Wed, 2 Oct 2013 21:27:20 +0000 (00:27 +0300)]
net: heap overflow in __audit_sockaddr()

[ Upstream commit 1661bf364ae9c506bc8795fef70d1532931be1e8 ]

We need to cap ->msg_namelen or it leads to a buffer overflow when we
to the memcpy() in __audit_sockaddr().  It requires CAP_AUDIT_CONTROL to
exploit this bug.

The call tree is:
___sys_recvmsg()
  move_addr_to_user()
    audit_sockaddr()
      __audit_sockaddr()

Reported-by: Jüri Aedla <juri.aedla@gmail.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agonet: do not call sock_put() on TIMEWAIT sockets
Eric Dumazet [Wed, 2 Oct 2013 04:04:11 +0000 (21:04 -0700)]
net: do not call sock_put() on TIMEWAIT sockets

[ Upstream commit 80ad1d61e72d626e30ebe8529a0455e660ca4693 ]

commit 3ab5aee7fe84 ("net: Convert TCP & DCCP hash tables to use RCU /
hlist_nulls") incorrectly used sock_put() on TIMEWAIT sockets.

We should instead use inet_twsk_put()

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agotcp: do not forget FIN in tcp_shifted_skb()
Eric Dumazet [Fri, 4 Oct 2013 17:31:41 +0000 (10:31 -0700)]
tcp: do not forget FIN in tcp_shifted_skb()

[ Upstream commit 5e8a402f831dbe7ee831340a91439e46f0d38acd ]

Yuchung found following problem :

 There are bugs in the SACK processing code, merging part in
 tcp_shift_skb_data(), that incorrectly resets or ignores the sacked
 skbs FIN flag. When a receiver first SACK the FIN sequence, and later
 throw away ofo queue (e.g., sack-reneging), the sender will stop
 retransmitting the FIN flag, and hangs forever.

Following packetdrill test can be used to reproduce the bug.

$ cat sack-merge-bug.pkt
`sysctl -q net.ipv4.tcp_fack=0`

// Establish a connection and send 10 MSS.
0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+.000 bind(3, ..., ...) = 0
+.000 listen(3, 1) = 0

+.050 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
+.000 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 6>
+.001 < . 1:1(0) ack 1 win 1024
+.000 accept(3, ..., ...) = 4

+.100 write(4, ..., 12000) = 12000
+.000 shutdown(4, SHUT_WR) = 0
+.000 > . 1:10001(10000) ack 1
+.050 < . 1:1(0) ack 2001 win 257
+.000 > FP. 10001:12001(2000) ack 1
+.050 < . 1:1(0) ack 2001 win 257 <sack 10001:11001,nop,nop>
+.050 < . 1:1(0) ack 2001 win 257 <sack 10001:12002,nop,nop>
// SACK reneg
+.050 < . 1:1(0) ack 12001 win 257
+0 %{ print "unacked: ",tcpi_unacked }%
+5 %{ print "" }%

First, a typo inverted left/right of one OR operation, then
code forgot to advance end_seq if the merged skb carried FIN.

Bug was added in 2.6.29 by commit 832d11c5cd076ab
("tcp: Try to restore large SKBs while SACK processing")

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agotcp: must unclone packets before mangling them
Eric Dumazet [Tue, 15 Oct 2013 18:54:30 +0000 (11:54 -0700)]
tcp: must unclone packets before mangling them

[ Upstream commit c52e2421f7368fd36cbe330d2cf41b10452e39a9 ]

TCP stack should make sure it owns skbs before mangling them.

We had various crashes using bnx2x, and it turned out gso_size
was cleared right before bnx2x driver was populating TC descriptor
of the _previous_ packet send. TCP stack can sometime retransmit
packets that are still in Qdisc.

Of course we could make bnx2x driver more robust (using
ACCESS_ONCE(shinfo->gso_size) for example), but the bug is TCP stack.

We have identified two points where skb_unclone() was needed.

This patch adds a WARN_ON_ONCE() to warn us if we missed another
fix of this kind.

Kudos to Neal for finding the root cause of this bug. Its visible
using small MSS.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoLinux 3.2.52 v3.2.52
Ben Hutchings [Sat, 26 Oct 2013 20:06:14 +0000 (21:06 +0100)]
Linux 3.2.52

8 years agocan: flexcan: flexcan_chip_start: fix regression, mark one MB for TX and abort pending TX
Marc Kleine-Budde [Fri, 4 Oct 2013 08:52:36 +0000 (10:52 +0200)]
can: flexcan: flexcan_chip_start: fix regression, mark one MB for TX and abort pending TX

commit d5a7b406c529e4595ce03dc8f6dcf7fa36f106fa upstream.

In patch

    0d1862e can: flexcan: fix flexcan_chip_start() on imx6

the loop in flexcan_chip_start() that iterates over all mailboxes after the
soft reset of the CAN core was removed. This loop put all mailboxes (even the
ones marked as reserved 1...7) into EMPTY/INACTIVE mode. On mailboxes 8...63,
this aborts any pending TX messages.

After a cold boot there is random garbage in the mailboxes, which leads to
spontaneous transmit of CAN frames during first activation. Further if the
interface was disabled with a pending message (usually due to an error
condition on the CAN bus), this message is retransmitted after enabling the
interface again.

This patch fixes the regression by:
1) Limiting the maximum number of used mailboxes to 8, 0...7 are used by the RX
FIFO, 8 is used by TX.
2) Marking the TX mailbox as EMPTY/INACTIVE, so that any pending TX of that
mailbox is aborted.

Cc: Lothar Waßmann <LW@KARO-electronics.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
[bwh: Backported to 3.2:
 - Adjust context
 - Hardware local echo is still enabled]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agogianfar: Change default HW Tx queue scheduling mode
Claudiu Manoil [Sun, 23 Sep 2012 22:39:08 +0000 (22:39 +0000)]
gianfar: Change default HW Tx queue scheduling mode

commit b98b8babd6e3370fadb7c6eaacb00eb2f6344a6c upstream.

This is primarily to address transmission timeout occurrences, when
multiple H/W Tx queues are being used concurrently. Because in
the priority scheduling mode the controller does not service the
Tx queues equally (but in ascending index order), Tx timeouts are
being triggered rightaway for a basic test with multiple simultaneous
connections like:
iperf -c <server_ip> -n 100M -P 8

resulting in kernel trace:
NETDEV WATCHDOG: eth1 (fsl-gianfar): transmit queue <X> timed out
------------[ cut here ]------------
WARNING: at net/sched/sch_generic.c:255
...
and controller reset during intense traffic, and possibly further
complications.

This patch changes the default H/W Tx scheduling setting (TXSCHED)
for multi-queue devices, from priority scheduling mode to a weighted
round robin mode with equal weights for all H/W Tx queues, and
addresses the issue above.

Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agomm, show_mem: suppress page counts in non-blockable contexts
David Rientjes [Mon, 29 Apr 2013 22:06:11 +0000 (15:06 -0700)]
mm, show_mem: suppress page counts in non-blockable contexts

commit 4b59e6c4730978679b414a8da61514a2518da512 upstream.

On large systems with a lot of memory, walking all RAM to determine page
types may take a half second or even more.

In non-blockable contexts, the page allocator will emit a page allocation
failure warning unless __GFP_NOWARN is specified.  In such contexts, irqs
are typically disabled and such a lengthy delay may even result in NMI
watchdog timeouts.

To fix this, suppress the page walk in such contexts when printing the
page allocation failure warning.

Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoACPI / IPMI: Fix atomic context requirement of ipmi_msg_handler()
Lv Zheng [Fri, 13 Sep 2013 05:13:23 +0000 (13:13 +0800)]
ACPI / IPMI: Fix atomic context requirement of ipmi_msg_handler()

commit 06a8566bcf5cf7db9843a82cde7a33c7bf3947d9 upstream.

This patch fixes the issues indicated by the test results that
ipmi_msg_handler() is invoked in atomic context.

BUG: scheduling while atomic: kipmi0/18933/0x10000100
Modules linked in: ipmi_si acpi_ipmi ...
CPU: 3 PID: 18933 Comm: kipmi0 Tainted: G       AW    3.10.0-rc7+ #2
Hardware name: QCI QSSC-S4R/QSSC-S4R, BIOS QSSC-S4R.QCI.01.00.0027.070120100606 07/01/2010
 ffff8838245eea00 ffff88103fc63c98 ffffffff814c4a1e ffff88103fc63ca8
 ffffffff814bfbab ffff88103fc63d28 ffffffff814c73e0 ffff88103933cbd4
 0000000000000096 ffff88103fc63ce8 ffff88102f618000 ffff881035c01fd8
Call Trace:
 <IRQ>  [<ffffffff814c4a1e>] dump_stack+0x19/0x1b
 [<ffffffff814bfbab>] __schedule_bug+0x46/0x54
 [<ffffffff814c73e0>] __schedule+0x83/0x59c
 [<ffffffff81058853>] __cond_resched+0x22/0x2d
 [<ffffffff814c794b>] _cond_resched+0x14/0x1d
 [<ffffffff814c6d82>] mutex_lock+0x11/0x32
 [<ffffffff8101e1e9>] ? __default_send_IPI_dest_field.constprop.0+0x53/0x58
 [<ffffffffa09e3f9c>] ipmi_msg_handler+0x23/0x166 [ipmi_si]
 [<ffffffff812bf6e4>] deliver_response+0x55/0x5a
 [<ffffffff812c0fd4>] handle_new_recv_msgs+0xb67/0xc65
 [<ffffffff81007ad1>] ? read_tsc+0x9/0x19
 [<ffffffff814c8620>] ? _raw_spin_lock_irq+0xa/0xc
 [<ffffffffa09e1128>] ipmi_thread+0x5c/0x146 [ipmi_si]
 ...

Also Tony Camuso says:

 We were getting occasional "Scheduling while atomic" call traces
 during boot on some systems. Problem was first seen on a Cisco C210
 but we were able to reproduce it on a Cisco c220m3. Setting
 CONFIG_LOCKDEP and LOCKDEP_SUPPORT to 'y' exposed a lockdep around
 tx_msg_lock in acpi_ipmi.c struct acpi_ipmi_device.

 =================================
 [ INFO: inconsistent lock state ]
 2.6.32-415.el6.x86_64-debug-splck #1
 ---------------------------------
 inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
 ksoftirqd/3/17 [HC0[0]:SC1[1]:HE1:SE0] takes:
  (&ipmi_device->tx_msg_lock){+.?...}, at: [<ffffffff81337a27>] ipmi_msg_handler+0x71/0x126
 {SOFTIRQ-ON-W} state was registered at:
   [<ffffffff810ba11c>] __lock_acquire+0x63c/0x1570
   [<ffffffff810bb0f4>] lock_acquire+0xa4/0x120
   [<ffffffff815581cc>] __mutex_lock_common+0x4c/0x400
   [<ffffffff815586ea>] mutex_lock_nested+0x4a/0x60
   [<ffffffff8133789d>] acpi_ipmi_space_handler+0x11b/0x234
   [<ffffffff81321c62>] acpi_ev_address_space_dispatch+0x170/0x1be

The fix implemented by this change has been tested by Tony:

 Tested the patch in a boot loop with lockdep debug enabled and never
 saw the problem in over 400 reboots.

Reported-and-tested-by: Tony Camuso <tcamuso@redhat.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Reviewed-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agostaging: comedi: ni_65xx: (bug fix) confine insn_bits to one subdevice
Ian Abbott [Wed, 2 Oct 2013 13:57:51 +0000 (14:57 +0100)]
staging: comedi: ni_65xx: (bug fix) confine insn_bits to one subdevice

commit 677a31565692d596ef42ea589b53ba289abf4713 upstream.

The `insn_bits` handler `ni_65xx_dio_insn_bits()` has a `for` loop that
currently writes (optionally) and reads back up to 5 "ports" consisting
of 8 channels each.  It reads up to 32 1-bit channels but can only read
and write a whole port at once - it needs to handle up to 5 ports as the
first channel it reads might not be aligned on a port boundary.  It
breaks out of the loop early if the next port it handles is beyond the
final port on the card.  It also breaks out early on the 5th port in the
loop if the first channel was aligned.  Unfortunately, it doesn't check
that the current port it is dealing with belongs to the comedi subdevice
the `insn_bits` handler is acting on.  That's a bug.

Redo the `for` loop to terminate after the final port belonging to the
subdevice, changing the loop variable in the process to simplify things
a bit.  The `for` loop could now try and handle more than 5 ports if the
subdevice has more than 40 channels, but the test `if (bitshift >= 32)`
ensures it will break out early after 4 or 5 ports (depending on whether
the first channel is aligned on a port boundary).  (`bitshift` will be
between -7 and 7 inclusive on the first iteration, increasing by 8 for
each subsequent operation.)

Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[Ian Abbott: This patch applies to kernels 2.6.34.y through to 3.5.y
 inclusive.]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoext4: avoid hang when mounting non-journal filesystems with orphan list
Theodore Ts'o [Thu, 27 Dec 2012 06:42:50 +0000 (01:42 -0500)]
ext4: avoid hang when mounting non-journal filesystems with orphan list

commit 0e9a9a1ad619e7e987815d20262d36a2f95717ca upstream.

When trying to mount a file system which does not contain a journal,
but which does have a orphan list containing an inode which needs to
be truncated, the mount call with hang forever in
ext4_orphan_cleanup() because ext4_orphan_del() will return
immediately without removing the inode from the orphan list, leading
to an uninterruptible loop in kernel code which will busy out one of
the CPU's on the system.

This can be trivially reproduced by trying to mount the file system
found in tests/f_orphan_extents_inode/image.gz from the e2fsprogs
source tree.  If a malicious user were to put this on a USB stick, and
mount it on a Linux desktop which has automatic mounts enabled, this
could be considered a potential denial of service attack.  (Not a big
deal in practice, but professional paranoids worry about such things,
and have even been known to allocate CVE numbers for such problems.)

-js: This is a fix for CVE-2013-2015.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agohwmon: (applesmc) Silence uninitialized warnings
Henrik Rydberg [Thu, 26 Jan 2012 11:08:41 +0000 (06:08 -0500)]
hwmon: (applesmc) Silence uninitialized warnings

commit 0fc86eca1b338d06ec500b34ef7def79c32b602b upstream.

Some error paths do not set a result, leading to the (false)
assumption that the value may be used uninitialized. Set results for
those paths as well.

Signed-off-by: Henrik Rydberg <rydberg@euromail.se>
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoxhci: Fix race between ep halt and URB cancellation
Florian Wolter [Wed, 14 Aug 2013 08:33:16 +0000 (10:33 +0200)]
xhci: Fix race between ep halt and URB cancellation

commit 526867c3ca0caa2e3e846cb993b0f961c33c2abb upstream.

The halted state of a endpoint cannot be cleared over CLEAR_HALT from a
user process, because the stopped_td variable was overwritten in the
handle_stopped_endpoint() function. So the xhci_endpoint_reset() function will
refuse the reset and communication with device can not run over this endpoint.
https://bugzilla.kernel.org/show_bug.cgi?id=60699

Signed-off-by: Florian Wolter <wolly84@web.de>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agocciss: fix info leak in cciss_ioctl32_passthru()
Dan Carpenter [Tue, 24 Sep 2013 22:27:45 +0000 (15:27 -0700)]
cciss: fix info leak in cciss_ioctl32_passthru()

commit 58f09e00ae095e46ef9edfcf3a5fd9ccdfad065e upstream.

The arg64 struct has a hole after ->buf_size which isn't cleared.  Or if
any of the calls to copy_from_user() fail then that would cause an
information leak as well.

This was assigned CVE-2013-2147.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agocpqarray: fix info leak in ida_locked_ioctl()
Dan Carpenter [Tue, 24 Sep 2013 22:27:44 +0000 (15:27 -0700)]
cpqarray: fix info leak in ida_locked_ioctl()

commit 627aad1c01da6f881e7f98d71fd928ca0c316b1a upstream.

The pciinfo struct has a two byte hole after ->dev_fn so stack
information could be leaked to the user.

This was assigned CVE-2013-2147.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoiscsi: don't hang in endless loop if no targets present
Sasha Levin [Thu, 26 Jan 2012 03:16:16 +0000 (22:16 -0500)]
iscsi: don't hang in endless loop if no targets present

commit 46a7c17d26967922092f3a8291815ffb20f6cabe upstream.

iscsi_if_send_reply() may return -ESRCH if there were no targets to send
data to. Currently we're ignoring this value and looping in attempt to do it
over and over, which will usually lead in a hung task like this one:

[ 4920.817298] INFO: task trinity:9074 blocked for more than 120 seconds.
[ 4920.818527] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4920.819982] trinity         D 0000000000000000  5504  9074   2756 0x00000004
[ 4920.825374]  ffff880003961a98 0000000000000086 ffff8800001aa000 ffff8800001aa000
[ 4920.826791]  00000000001d4340 ffff880003961fd8 ffff880003960000 00000000001d4340
[ 4920.828241]  00000000001d4340 00000000001d4340 ffff880003961fd8 00000000001d4340
[ 4920.833231]
[ 4920.833519] Call Trace:
[ 4920.834010]  [<ffffffff826363fa>] schedule+0x3a/0x50
[ 4920.834953]  [<ffffffff82634ac9>] __mutex_lock_common+0x209/0x5b0
[ 4920.836226]  [<ffffffff81af805d>] ? iscsi_if_rx+0x2d/0x990
[ 4920.837281]  [<ffffffff81053943>] ? sched_clock+0x13/0x20
[ 4920.838305]  [<ffffffff81af805d>] ? iscsi_if_rx+0x2d/0x990
[ 4920.839336]  [<ffffffff82634eb0>] mutex_lock_nested+0x40/0x50
[ 4920.840423]  [<ffffffff81af805d>] iscsi_if_rx+0x2d/0x990
[ 4920.841434]  [<ffffffff810dffed>] ? sub_preempt_count+0x9d/0xd0
[ 4920.842548]  [<ffffffff82637bb0>] ? _raw_read_unlock+0x30/0x60
[ 4920.843666]  [<ffffffff821f71de>] netlink_unicast+0x1ae/0x1f0
[ 4920.844751]  [<ffffffff821f7997>] netlink_sendmsg+0x227/0x350
[ 4920.845850]  [<ffffffff821857bd>] ? sock_update_netprioidx+0xdd/0x1b0
[ 4920.847060]  [<ffffffff82185732>] ? sock_update_netprioidx+0x52/0x1b0
[ 4920.848276]  [<ffffffff8217f226>] sock_aio_write+0x166/0x180
[ 4920.849348]  [<ffffffff810dfe41>] ? get_parent_ip+0x11/0x50
[ 4920.850428]  [<ffffffff811d0d9a>] do_sync_write+0xda/0x120
[ 4920.851465]  [<ffffffff810dffed>] ? sub_preempt_count+0x9d/0xd0
[ 4920.852579]  [<ffffffff810dfe41>] ? get_parent_ip+0x11/0x50
[ 4920.853608]  [<ffffffff81791887>] ? security_file_permission+0x27/0xb0
[ 4920.854821]  [<ffffffff811d0f4c>] vfs_write+0x16c/0x180
[ 4920.855781]  [<ffffffff811d104f>] sys_write+0x4f/0xa0
[ 4920.856798]  [<ffffffff82638e79>] system_call_fastpath+0x16/0x1b
[ 4920.877487] 1 lock held by trinity/9074:
[ 4920.878239]  #0:  (rx_queue_mutex){+.+...}, at: [<ffffffff81af805d>] iscsi_if_rx+0x2d/0x990
[ 4920.880005] Kernel panic - not syncing: hung_task: blocked tasks

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Acked-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoRevert "sctp: fix call to SCTP_CMD_PROCESS_SACK in sctp_cmd_interpreter()"
Ben Hutchings [Sun, 20 Oct 2013 16:06:39 +0000 (17:06 +0100)]
Revert "sctp: fix call to SCTP_CMD_PROCESS_SACK in sctp_cmd_interpreter()"

This reverts commit de77b7955c3985ca95f64af3cb10557eb17eacee, which was
commit f6e80abeab928b7c47cc1fbf53df13b4398a2bec upstream.

This fix was only appropriate for Linux 3.7 onward, and introduced a
regression when applied to earlier versions.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoisofs: Refuse RW mount of the filesystem instead of making it RO
Jan Kara [Thu, 25 Jul 2013 09:49:11 +0000 (11:49 +0200)]
isofs: Refuse RW mount of the filesystem instead of making it RO

commit 17b7f7cf58926844e1dd40f5eb5348d481deca6a upstream.

Refuse RW mount of isofs filesystem. So far we just silently changed it
to RO mount but when the media is writeable, block layer won't notice
this change and thus will think device is used RW and will block eject
button of the drive. That is unexpected by users because for
non-writeable media eject button works just fine.

Userspace mount(8) command handles this just fine and retries mounting
with MS_RDONLY set so userspace shouldn't see any regression.  Plus any
tool mounting isofs is likely confronted with the case of read-only
media where block layer already refuses to mount the filesystem without
MS_RDONLY set so our behavior shouldn't be anything new for it.

Reported-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoHID: usbhid: quirk for N-Trig DuoSense Touch Screen
Vasily Titskiy [Fri, 30 Aug 2013 22:25:04 +0000 (18:25 -0400)]
HID: usbhid: quirk for N-Trig DuoSense Touch Screen

commit 9e0bf92c223dabe0789714f8f85f6e26f8f9cda4 upstream.

The DuoSense touchscreen device causes a 10 second timeout. This fix
removes the delay.

Signed-off-by: Vasily Titskiy <qehgt0@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoHID: Fix Speedlink VAD Cezanne support for some devices
Stefan Kriwanek [Sun, 25 Aug 2013 08:46:13 +0000 (10:46 +0200)]
HID: Fix Speedlink VAD Cezanne support for some devices

commit 06bb5219118fb098f4b0c7dcb484b28a52bf1c14 upstream.

Some devices of the "Speedlink VAD Cezanne" model need more aggressive fixing
than already done.

I made sure through testing that this patch would not interfere with the proper
working of a device that is bug-free. (The driver drops EV_REL events with
abs(val) >= 256, which are not achievable even on the highest laser resolution
hardware setting.)

Signed-off-by: Stefan Kriwanek <mail@stefankriwanek.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agofanotify: dont merge permission events
Lino Sanfilippo [Fri, 23 Mar 2012 01:42:23 +0000 (02:42 +0100)]
fanotify: dont merge permission events

commit 03a1cec1f17ac1a6041996b3e40f96b5a2f90e1b upstream.

Boyd Yang reported a problem for the case that multiple threads of the same
thread group are waiting for a reponse for a permission event.
In this case it is possible that some of the threads are never woken up, even
if the response for the event has been received
(see http://marc.info/?l=linux-kernel&m=131822913806350&w=2).

The reason is that we are currently merging permission events if they belong to
the same thread group. But we are not prepared to wake up more than one waiter
for each event. We do

wait_event(group->fanotify_data.access_waitq, event->response ||
atomic_read(&group->fanotify_data.bypass_perm));
and after that
  event->response = 0;

which is the reason that even if we woke up all waiters for the same event
some of them may see event->response being already set 0 again, then go back to
sleep and block forever.

With this patch we avoid that more than one thread is waiting for a response
by not merging permission events for the same thread group any more.

Reported-by: Boyd Yang <boyd.yang@gmail.com>
Signed-off-by: Lino Sanfilippo <LinoSanfilipp@gmx.de>
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoperf tools: Handle JITed code in shared memory
Andi Kleen [Thu, 25 Apr 2013 00:03:02 +0000 (17:03 -0700)]
perf tools: Handle JITed code in shared memory

commit 89365e6c9ad4c0e090e4c6a4b67a3ce319381d89 upstream.

Need to check for /dev/zero.

Most likely more strings are missing too.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1366848182-30449-1-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoperf: Fix perf_cgroup_switch for sw-events
Peter Zijlstra [Tue, 2 Oct 2012 13:41:23 +0000 (15:41 +0200)]
perf: Fix perf_cgroup_switch for sw-events

commit 95cf59ea72331d0093010543b8951bb43f262cac upstream.

Jiri reported that he could trigger the WARN_ON_ONCE() in
perf_cgroup_switch() using sw-events. This is because sw-events share
a cpuctx with multiple PMUs.

Use the ->unique_pmu pointer to limit the pmu iteration to unique
cpuctx instances.

Reported-and-Tested-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-so7wi2zf3jjzrwcutm2mkz0j@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoperf: Clarify perf_cpu_context::active_pmu usage by renaming it to ::unique_pmu
Peter Zijlstra [Tue, 2 Oct 2012 13:38:52 +0000 (15:38 +0200)]
perf: Clarify perf_cpu_context::active_pmu usage by renaming it to ::unique_pmu

commit 3f1f33206c16c7b3839d71372bc2ac3f305aa802 upstream.

Stephane thought the perf_cpu_context::active_pmu name confusing and
suggested using 'unique_pmu' instead.

This pointer is a pointer to a 'random' pmu sharing the cpuctx
instance, therefore limiting a for_each_pmu loop to those where
cpuctx->unique_pmu matches the pmu we get a loop over unique cpuctx
instances.

Suggested-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-kxyjqpfj2fn9gt7kwu5ag9ks@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agocgroup: fail if monitored file and event_control are in different cgroup
Li Zefan [Mon, 18 Feb 2013 06:13:35 +0000 (14:13 +0800)]
cgroup: fail if monitored file and event_control are in different cgroup

commit f169007b2773f285e098cb84c74aac0154d65ff7 upstream.

If we pass fd of memory.usage_in_bytes of cgroup A to cgroup.event_control
of cgroup B, then we won't get memory usage notification from A but B!

What's worse, if A and B are in different mount hierarchy, we'll end up
accessing NULL pointer!

Disallow this kind of invalid usage.

Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agosfc: Fix efx_rx_buf_offset() for recycled pages
Ben Hutchings [Fri, 6 Sep 2013 21:39:20 +0000 (22:39 +0100)]
sfc: Fix efx_rx_buf_offset() for recycled pages

This bug fix is only for stable branches older than 3.10.  The bug was
fixed upstream by commit 2768935a4660 ('sfc: reuse pages to avoid DMA
mapping/unmapping costs'), but that change is totally unsuitable for
stable.

Commit b590ace09d51 ('sfc: Fix efx_rx_buf_offset() in the presence of
swiotlb') added an explicit page_offset member to struct
efx_rx_buffer, which must be set consistently with the u.page and
dma_addr fields.  However, it failed to add the necessary assignment
in efx_resurrect_rx_buffer().  It also did not correct the calculation
of efx_rx_buffer::dma_addr in efx_resurrect_rx_buffer(), which assumes
that DMA-mapping a page will result in a page-aligned DMA address
(exactly what swiotlb violates).

Add the assignment of efx_rx_buffer::page_offset and change the
calculation of dma_addr to make use of it.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
8 years agomacvtap: do not zerocopy if iov needs more pages than MAX_SKB_FRAGS
Jason Wang [Thu, 18 Jul 2013 02:55:16 +0000 (10:55 +0800)]
macvtap: do not zerocopy if iov needs more pages than MAX_SKB_FRAGS

commit ece793fcfc417b3925844be88a6a6dc82ae8f7c6 upstream.

We try to linearize part of the skb when the number of iov is greater than
MAX_SKB_FRAGS. This is not enough since each single vector may occupy more than
one pages, so zerocopy_sg_fromiovec() may still fail and may break the guest
network.

Solve this problem by calculate the pages needed for iov before trying to do
zerocopy and switch to use copy instead of zerocopy if it needs more than
MAX_SKB_FRAGS.

This is done through introducing a new helper to count the pages for iov, and
call uarg->callback() manually when switching from zerocopy to copy to notify
vhost.

We can do further optimization on top.

This bug were introduced from b92946e2919134ebe2a4083e4302236295ea2a73
(macvtap: zerocopy: validate vectors before building skb).

Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: callback only takes one argument]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoRevert "zram: use zram->lock to protect zram_free_page() in swap free notify path"
Ben Hutchings [Sun, 20 Oct 2013 12:37:39 +0000 (13:37 +0100)]
Revert "zram: use zram->lock to protect zram_free_page() in swap free notify path"

This reverts commit 9e443904906ca2b5b3ae71f34ac4a4fa6905623e, which
was commit 57ab048532c0d975538cebd4456491b5c34248f4 upstream.

Taking the semaphore here leads to sleeping in atomic context.  This
was fixed in mainline commit a0c516cbfc74 ("zram: don't grab mutex in
zram_slot_free_noity") but that is too difficult to backport.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agopowerpc/pseries/lparcfg: Fix possible overflow are more than 1026
Chen Gang [Mon, 22 Apr 2013 17:12:54 +0000 (17:12 +0000)]
powerpc/pseries/lparcfg: Fix possible overflow are more than 1026

commit 5676005acf26ab7e924a8438ea4746e47d405762 upstream.

need set '\0' for 'local_buffer'.

SPLPAR_MAXLENGTH is 1026, RTAS_DATA_BUF_SIZE is 4096. so the contents of
rtas_data_buf may truncated in memcpy.

if contents are really truncated.
  the splpar_strlen is more than 1026. the next while loop checking will
  not find the end of buffer. that will cause memory access violation.

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agom68knommu: clean up linker script
Greg Ungerer [Thu, 5 Jan 2012 05:51:13 +0000 (15:51 +1000)]
m68knommu: clean up linker script

commit f84f52a5c15db7d14a534815f27253b001735183 upstream.

There is a lot of years of collected cruft in the m68knommu linker script.
Clean it all up and use the well defined linker script support macros.

Support is maintained for building both ROM/FLASH based and RAM based setups.
No major changes to section layouts, though the rodata section is now lumped
in with the read/write data section.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agom68k: use non-MMU linker script for ColdFire MMU builds
Greg Ungerer [Wed, 19 Oct 2011 03:50:35 +0000 (13:50 +1000)]
m68k: use non-MMU linker script for ColdFire MMU builds

commit ed865e31a8273be200db9ddcdb6b844e48777abd upstream.

Use the non-MMU linker script for ColdFire builds when we are building
for MMU enabled. The image layout is correct for loading on existing
ColdFire dev boards. The only addition required to the current non-MMU
linker script is to add support for the fixup section.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Matt Waddel <mwaddel@yahoo.com>
Acked-by: Kurt Mahan <kmahan@xmission.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agom68k: consolidate the vmlinux.lds linker scripts
Greg Ungerer [Thu, 8 Dec 2011 05:39:05 +0000 (15:39 +1000)]
m68k: consolidate the vmlinux.lds linker scripts

commit 40c1b9cfeedf79b909c961e0e00a13497e80bc82 upstream.

The merge of m68knommu left the linker scripts a little disorganized.
Some consistent naming and squashing two of scripts that just include
others can simplify things a lot.

So merge the two simple including scripts, and rename the nommu script
to be consistent with the existing m68k linker scripts.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agousb: core: don't try to reset_device() a port that got just disconnected
Julius Werner [Wed, 31 Jul 2013 02:51:20 +0000 (19:51 -0700)]
usb: core: don't try to reset_device() a port that got just disconnected

commit 481f2d4f89f87a0baa26147f323380e31cfa7c44 upstream.

The USB hub driver's event handler contains a check to catch SuperSpeed
devices that transitioned into the SS.Inactive state and tries to fix
them with a reset. It decides whether to do a plain hub port reset or
call the usb_reset_device() function based on whether there was a device
attached to the port.

However, there are device/hub combinations (found with a JetFlash
Transcend mass storage stick (8564:1000) on the root hub of an Intel
LynxPoint PCH) which can transition to the SS.Inactive state on
disconnect (and stay there long enough for the host to notice). In this
case, above-mentioned reset check will call usb_reset_device() on the
stale device data structure. The kernel will send pointless LPM control
messages to the no longer connected device address and can even cause
several 5 second khubd stalls on some (buggy?) host controllers, before
finally accepting the device's fate amongst a flurry of error messages.

This patch makes the choice of reset dependent on the port status that
has just been read from the hub in addition to the existence of an
in-kernel data structure for the device, and only proceeds with the more
extensive reset if both are valid.

Signed-off-by: Julius Werner <jwerner@chromium.org>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agodebugfs: debugfs_remove_recursive() must not rely on list_empty(d_subdirs)
Oleg Nesterov [Fri, 26 Jul 2013 15:12:56 +0000 (17:12 +0200)]
debugfs: debugfs_remove_recursive() must not rely on list_empty(d_subdirs)

commit 776164c1faac4966ab14418bb0922e1820da1d19 upstream.

debugfs_remove_recursive() is wrong,

1. it wrongly assumes that !list_empty(d_subdirs) means that this
   dir should be removed.

   This is not that bad by itself, but:

2. if d_subdirs does not becomes empty after __debugfs_remove()
   it gives up and silently fails, it doesn't even try to remove
   other entries.

   However ->d_subdirs can be non-empty because it still has the
   already deleted !debugfs_positive() entries.

3. simple_release_fs() is called even if __debugfs_remove() fails.

Suppose we have

dir1/
dir2/
file2
file1

and someone opens dir1/dir2/file2.

Now, debugfs_remove_recursive(dir1/dir2) succeeds, and dir1/dir2 goes
away.

But debugfs_remove_recursive(dir1) silently fails and doesn't remove
this directory. Because it tries to delete (the already deleted)
dir1/dir2/file2 again and then fails due to "Avoid infinite loop"
logic.

Test-case:

#!/bin/sh

cd /sys/kernel/debug/tracing
echo 'p:probe/sigprocmask sigprocmask' >> kprobe_events
sleep 1000 < events/probe/sigprocmask/id &
echo -n >| kprobe_events

[ -d events/probe ] && echo "ERR!! failed to rm probe"

And after that it is not possible to create another probe entry.

With this patch debugfs_remove_recursive() skips !debugfs_positive()
files although this is not strictly needed. The most important change
is that it does not try to make ->d_subdirs empty, it simply scans
the whole list(s) recursively and removes as much as possible.

Link: http://lkml.kernel.org/r/20130726151256.GC19472@redhat.com
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoperf: Use css_tryget() to avoid propping up css refcount
Salman Qazi [Thu, 14 Jun 2012 22:31:09 +0000 (15:31 -0700)]
perf: Use css_tryget() to avoid propping up css refcount

commit 9c5da09d266ca9b32eb16cf940f8161d949c2fe5 upstream.

An rmdir pushes css's ref count to zero.  However, if the associated
directory is open at the time, the dentry ref count is non-zero.  If
the fd for this directory is then passed into perf_event_open, it
does a css_get().  This bounces the ref count back up from zero.  This
is a problem by itself.  But what makes it turn into a crash is the
fact that we end up doing an extra dput, since we perform a dput
when css_put sees the ref count go down to zero.

css_tryget() does not fall into that trap. So, we use that instead.

Reproduction test-case for the bug:

 #include <unistd.h>
 #include <sys/types.h>
 #include <sys/stat.h>
 #include <fcntl.h>
 #include <linux/unistd.h>
 #include <linux/perf_event.h>
 #include <string.h>
 #include <errno.h>
 #include <stdio.h>

 #define PERF_FLAG_PID_CGROUP    (1U << 2)

 int perf_event_open(struct perf_event_attr *hw_event_uptr,
                     pid_t pid, int cpu, int group_fd, unsigned long flags) {
         return syscall(__NR_perf_event_open,hw_event_uptr, pid, cpu,
                 group_fd, flags);
 }

 /*
  * Directly poke at the perf_event bug, since it's proving hard to repro
  * depending on where in the kernel tree.  what moved?
  */
 int main(int argc, char **argv)
 {
        int fd;
        struct perf_event_attr attr;
        memset(&attr, 0, sizeof(attr));
        attr.exclude_kernel = 1;
        attr.size = sizeof(attr);
        mkdir("/dev/cgroup/perf_event/blah", 0777);
        fd = open("/dev/cgroup/perf_event/blah", O_RDONLY);
        perror("open");
        rmdir("/dev/cgroup/perf_event/blah");
        sleep(2);
        perf_event_open(&attr, fd, 0, -1,  PERF_FLAG_PID_CGROUP);
        perror("perf_event_open");
        close(fd);
        return 0;
 }

Signed-off-by: Salman Qazi <sqazi@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Tejun Heo <tj@kernel.org>
Link: http://lkml.kernel.org/r/20120614223108.1025.2503.stgit@dungbeetle.mtv.corp.google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agokernel-doc: bugfix - multi-line macros
Daniel Santos [Fri, 5 Oct 2012 00:15:05 +0000 (17:15 -0700)]
kernel-doc: bugfix - multi-line macros

commit 654784284430bf2739985914b65e09c7c35a7273 upstream.

Prior to this patch the following code breaks:

/**
 * multiline_example - this breaks kernel-doc
 */
 #define multiline_example( \
myparam)

Producing this error:

Error(somefile.h:983): cannot understand prototype: 'multiline_example( \ '

This patch fixes the issue by appending all lines ending in a blackslash
(optionally followed by whitespace), removing the backslash and any
whitespace after it prior to appending (just like the C pre-processor
would).

This fixes a break in kerel-doc introduced by the additions to rbtree.h.

Signed-off-by: Daniel Santos <daniel.santos@pobox.com>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Michal Marek <mmarek@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agosparc32: Fix exit flag passed from traced sys_sigreturn
Kirill Tkhai [Thu, 25 Jul 2013 21:17:15 +0000 (01:17 +0400)]
sparc32: Fix exit flag passed from traced sys_sigreturn

[ Upstream commit 7a3b0f89e3fea680f93932691ca41a68eee7ab5e ]

Pass 1 in %o1 to indicate that syscall_trace accounts exit.

Signed-off-by: Kirill Tkhai <tkhai@yandex.ru>
CC: David Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agosparc64: Fix not SRA'ed %o5 in 32-bit traced syscall
Kirill Tkhai [Fri, 26 Jul 2013 13:21:12 +0000 (17:21 +0400)]
sparc64: Fix not SRA'ed %o5 in 32-bit traced syscall

[ Upstream commit ab2abda6377723e0d5fbbfe5f5aa16a5523344d1 ]

(From v1 to v2: changed comment)

On the way linux_sparc_syscall32->linux_syscall_trace32->goto 2f,
register %o5 doesn't clear its second 32-bit.

Fix that.

Signed-off-by: Kirill Tkhai <tkhai@yandex.ru>
CC: David Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agosparc64: Fix off by one in trampoline TLB mapping installation loop.
David S. Miller [Thu, 22 Aug 2013 23:38:46 +0000 (16:38 -0700)]
sparc64: Fix off by one in trampoline TLB mapping installation loop.

[ Upstream commit 63d499662aeec1864ec36d042aca8184ea6a938e ]

Reported-by: Kirill Tkhai <tkhai@yandex.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agosparc64: Remove RWSEM export leftovers
Kirill Tkhai [Mon, 12 Aug 2013 12:02:24 +0000 (16:02 +0400)]
sparc64: Remove RWSEM export leftovers

[ Upstream commit 61d9b9355b0d427bd1e732bd54628ff9103e496f ]

The functions

__down_read
__down_read_trylock
__down_write
__down_write_trylock
__up_read
__up_write
__downgrade_write

are implemented inline, so remove corresponding EXPORT_SYMBOLs
(They lead to compile errors on RT kernel).

Signed-off-by: Kirill Tkhai <tkhai@yandex.ru>
CC: David Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agosparc64: Fix ITLB handler of null page
Kirill Tkhai [Fri, 2 Aug 2013 15:23:18 +0000 (19:23 +0400)]
sparc64: Fix ITLB handler of null page

[ Upstream commit 1c2696cdaad84580545a2e9c0879ff597880b1a9 ]

1)Use kvmap_itlb_longpath instead of kvmap_dtlb_longpath.

2)Handle page #0 only, don't handle page #1: bleu -> blu

 (KERNBASE is 0x400000, so #1 does not exist too. But everything
  is possible in the future. Fix to not to have problems later.)

3)Remove unused kvmap_itlb_nonlinear.

Signed-off-by: Kirill Tkhai <tkhai@yandex.ru>
CC: David Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoesp_scsi: Fix tag state corruption when autosensing.
David S. Miller [Fri, 2 Aug 2013 01:08:34 +0000 (18:08 -0700)]
esp_scsi: Fix tag state corruption when autosensing.

[ Upstream commit 21af8107f27878813d0364733c0b08813c2c192a ]

Meelis Roos reports a crash in esp_free_lun_tag() in the presense
of a disk which has died.

The issue is that when we issue an autosense command, we do so by
hijacking the original command that caused the check-condition.

When we do so we clear out the ent->tag[] array when we issue it via
find_and_prep_issuable_command().  This is so that the autosense
command is forced to be issued non-tagged.

That is problematic, because it is the value of ent->tag[] which
determines whether we issued the original scsi command as tagged
vs. non-tagged (see esp_alloc_lun_tag()).

And that, in turn, is what trips up the sanity checks in
esp_free_lun_tag().  That function needs the original ->tag[] values
in order to free up the tag slot properly.

Fix this by remembering the original command's tag values, and
having esp_alloc_lun_tag() and esp_free_lun_tag() use them.

Reported-by: Meelis Roos <mroos@linux.ee>
Tested-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoll_temac: Reset dma descriptors indexes on ndo_open
Ricardo Ribalda [Tue, 1 Oct 2013 06:17:10 +0000 (08:17 +0200)]
ll_temac: Reset dma descriptors indexes on ndo_open

[ Upstream commit 7167cf0e8cd10287b7912b9ffcccd9616f382922 ]

The dma descriptors indexes are only initialized on the probe function.

If a packet is on the buffer when temac_stop is called, the dma
descriptors indexes can be left on a incorrect state where no other
package can be sent.

So an interface could be left in an usable state after ifdow/ifup.

This patch makes sure that the descriptors indexes are in a proper
status when the device is open.

Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoipv6 mcast: use in6_dev_put in timer handlers instead of __in6_dev_put
Salam Noureddine [Sun, 29 Sep 2013 20:41:34 +0000 (13:41 -0700)]
ipv6 mcast: use in6_dev_put in timer handlers instead of __in6_dev_put

[ Upstream commit 9260d3e1013701aa814d10c8fc6a9f92bd17d643 ]

It is possible for the timer handlers to run after the call to
ipv6_mc_down so use in6_dev_put instead of __in6_dev_put in the
handler function in order to do proper cleanup when the refcnt
reaches 0. Otherwise, the refcnt can reach zero without the
inet6_dev being destroyed and we end up leaking a reference to
the net_device and see messages like the following,

unregister_netdevice: waiting for eth0 to become free. Usage count = 1

Tested on linux-3.4.43.

Signed-off-by: Salam Noureddine <noureddine@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoipv4 igmp: use in_dev_put in timer handlers instead of __in_dev_put
Salam Noureddine [Sun, 29 Sep 2013 20:39:42 +0000 (13:39 -0700)]
ipv4 igmp: use in_dev_put in timer handlers instead of __in_dev_put

[ Upstream commit e2401654dd0f5f3fb7a8d80dad9554d73d7ca394 ]

It is possible for the timer handlers to run after the call to
ip_mc_down so use in_dev_put instead of __in_dev_put in the handler
function in order to do proper cleanup when the refcnt reaches 0.
Otherwise, the refcnt can reach zero without the in_device being
destroyed and we end up leaking a reference to the net_device and
see messages like the following,

unregister_netdevice: waiting for eth0 to become free. Usage count = 1

Tested on linux-3.4.43.

Signed-off-by: Salam Noureddine <noureddine@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agobonding: Fix broken promiscuity reference counting issue
Neil Horman [Fri, 27 Sep 2013 16:22:15 +0000 (12:22 -0400)]
bonding: Fix broken promiscuity reference counting issue

[ Upstream commit 5a0068deb611109c5ba77358be533f763f395ee4 ]

Recently grabbed this report:
https://bugzilla.redhat.com/show_bug.cgi?id=1005567

Of an issue in which the bonding driver, with an attached vlan encountered the
following errors when bond0 was taken down and back up:

dummy1: promiscuity touches roof, set promiscuity failed. promiscuity feature of
device might be broken.

The error occurs because, during __bond_release_one, if we release our last
slave, we take on a random mac address and issue a NETDEV_CHANGEADDR
notification.  With an attached vlan, the vlan may see that the vlan and bond
mac address were in sync, but no longer are.  This triggers a call to dev_uc_add
and dev_set_rx_mode, which enables IFF_PROMISC on the bond device.  Then, when
we complete __bond_release_one, we use the current state of the bond flags to
determine if we should decrement the promiscuity of the releasing slave.  But
since the bond changed promiscuity state during the release operation, we
incorrectly decrement the slave promisc count when it wasn't in promiscuous mode
to begin with, causing the above error

Fix is pretty simple, just cache the bonding flags at the start of the function
and use those when determining the need to set promiscuity.

This is also needed for the ALLMULTI flag

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: Mark Wu <wudxw@linux.vnet.ibm.com>
CC: "David S. Miller" <davem@davemloft.net>
Reported-by: Mark Wu <wudxw@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agodm9601: fix IFF_ALLMULTI handling
Peter Korsgaard [Mon, 30 Sep 2013 21:28:20 +0000 (23:28 +0200)]
dm9601: fix IFF_ALLMULTI handling

[ Upstream commit bf0ea6380724beb64f27a722dfc4b0edabff816e ]

Pass-all-multicast is controlled by bit 3 in RX control, not bit 2
(pass undersized frames).

Reported-by: Joseph Chang <joseph_chang@davicom.com.tw>
Signed-off-by: Peter Korsgaard <peter@korsgaard.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agovia-rhine: fix VLAN priority field (PCP, IEEE 802.1p)
Roger Luethi [Sat, 21 Sep 2013 12:24:11 +0000 (14:24 +0200)]
via-rhine: fix VLAN priority field (PCP, IEEE 802.1p)

[ Upstream commit 207070f5221e2a901d56a49df9cde47d9b716cd7 ]

Outgoing packets sent by via-rhine have their VLAN PCP field off by one
(when hardware acceleration is enabled). The TX descriptor expects only VID
and PCP (without a CFI/DEI bit).

Peter Boström noticed and reported the bug.

Signed-off-by: Roger Luethi <rl@hellgate.ch>
Cc: Peter Boström <peter.bostrom@netrounds.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoipv6: udp packets following an UFO enqueued packet need also be handled by UFO
Hannes Frederic Sowa [Sat, 21 Sep 2013 04:27:00 +0000 (06:27 +0200)]
ipv6: udp packets following an UFO enqueued packet need also be handled by UFO

[ Upstream commit 2811ebac2521ceac84f2bdae402455baa6a7fb47 ]

In the following scenario the socket is corked:
If the first UDP packet is larger then the mtu we try to append it to the
write queue via ip6_ufo_append_data. A following packet, which is smaller
than the mtu would be appended to the already queued up gso-skb via
plain ip6_append_data. This causes random memory corruptions.

In ip6_ufo_append_data we also have to be careful to not queue up the
same skb multiple times. So setup the gso frame only when no first skb
is available.

This also fixes a shortcoming where we add the current packet's length to
cork->length but return early because of a packet > mtu with dontfrag set
(instead of sutracting it again).

Found with trinity.

Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>