pandora-kernel.git
15 years agoMerge branch 'net-2.6-misc-20080611a' of git://git.linux-ipv6.org/gitroot/yoshfuji...
David S. Miller [Thu, 12 Jun 2008 01:11:16 +0000 (18:11 -0700)]
Merge branch 'net-2.6-misc-20080611a' of git://git.linux-ipv6.org/gitroot/yoshfuji/linux-2.6-fix

15 years agoMerge branch 'master' of git://eden-feed.erg.abdn.ac.uk/net-2.6
David S. Miller [Thu, 12 Jun 2008 00:53:04 +0000 (17:53 -0700)]
Merge branch 'master' of git://eden-feed.erg.abdn.ac.uk/net-2.6

15 years agonetfilter: nf_conntrack: fix ctnetlink related crash in nf_nat_setup_info()
Patrick McHardy [Thu, 12 Jun 2008 00:51:10 +0000 (17:51 -0700)]
netfilter: nf_conntrack: fix ctnetlink related crash in nf_nat_setup_info()

When creation of a new conntrack entry in ctnetlink fails after having
set up the NAT mappings, the conntrack has an extension area allocated
that is not getting properly destroyed when freeing the conntrack again.
This means the NAT extension is still in the bysource hash, causing a
crash when walking over the hash chain the next time:

BUG: unable to handle kernel paging request at 00120fbd
IP: [<c03d394b>] nf_nat_setup_info+0x221/0x58a
*pde = 00000000
Oops: 0000 [#1] PREEMPT SMP

Pid: 2795, comm: conntrackd Not tainted (2.6.26-rc5 #1)
EIP: 0060:[<c03d394b>] EFLAGS: 00010206 CPU: 1
EIP is at nf_nat_setup_info+0x221/0x58a
EAX: 00120fbd EBX: 00120fbd ECX: 00000001 EDX: 00000000
ESI: 0000019e EDI: e853bbb4 EBP: e853bbc8 ESP: e853bb78
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process conntrackd (pid: 2795, ti=e853a000 task=f7de10f0 task.ti=e853a000)
Stack: 00000000 e853bc2c e85672ec 00000008 c0561084 63c1db4a 00000000 00000000
       00000000 0002e109 61d2b1c3 00000000 00000000 00000000 01114e22 61d2b1c3
       00000000 00000000 f7444674 e853bc04 00000008 c038e728 0000000a f7444674
Call Trace:
 [<c038e728>] nla_parse+0x5c/0xb0
 [<c0397c1b>] ctnetlink_change_status+0x190/0x1c6
 [<c0397eec>] ctnetlink_new_conntrack+0x189/0x61f
 [<c0119aee>] update_curr+0x3d/0x52
 [<c03902d1>] nfnetlink_rcv_msg+0xc1/0xd8
 [<c0390228>] nfnetlink_rcv_msg+0x18/0xd8
 [<c0390210>] nfnetlink_rcv_msg+0x0/0xd8
 [<c038d2ce>] netlink_rcv_skb+0x2d/0x71
 [<c0390205>] nfnetlink_rcv+0x19/0x24
 [<c038d0f5>] netlink_unicast+0x1b3/0x216
 ...

Move invocation of the extension destructors to nf_conntrack_free()
to fix this problem.

Fixes http://bugzilla.kernel.org/show_bug.cgi?id=10875

Reported-and-Tested-by: Krzysztof Piotr Oledzki <ole@ans.pl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agonetfilter: Make nflog quiet when no one listen in userspace.
Eric Leblond [Thu, 12 Jun 2008 00:50:27 +0000 (17:50 -0700)]
netfilter: Make nflog quiet when no one listen in userspace.

The message "nf_log_packet: can't log since no backend logging module loaded
in! Please either load one, or disable logging explicitly" was displayed for
each logged packet when no userspace application was listening to nflog events.
The message suggests that a kernel module is missing, but that is not the
case here. I thus propose to suppress the message (I don't see any reason to
flood the log because a user application has crashed).

Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoipv6: Fail with appropriate error code when setting not-applicable sockopt.
YOSHIFUJI Hideaki [Wed, 11 Jun 2008 18:27:26 +0000 (03:27 +0900)]
ipv6: Fail with appropriate error code when setting not-applicable sockopt.

IPV6_MULTICAST_HOPS, for example, is not valid for stream sockets.
Since they are virtually unavailable for stream sockets,
we should return ENOPROTOOPT instead of EINVAL.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
15 years agoipv6: Check IPV6_MULTICAST_LOOP option value.
YOSHIFUJI Hideaki [Wed, 11 Jun 2008 18:14:51 +0000 (03:14 +0900)]
ipv6: Check IPV6_MULTICAST_LOOP option value.

Only 0 and 1 are valid for IPV6_MULTICAST_LOOP socket option,
and we should return an error of EINVAL otherwise, per RFC3493.
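
A minimal sketch of the value check described above, written as plain userspace C rather than the actual kernel change. The helper name check_mc_loop() is hypothetical and only illustrates the rule that 0 and 1 are accepted and everything else yields EINVAL.

```c
#include <errno.h>

/*
 * Hypothetical helper (not the kernel function): validate a value
 * requested for IPV6_MULTICAST_LOOP.
 * Returns 0 if the value is acceptable, -EINVAL otherwise.
 */
static int check_mc_loop(int val)
{
    /* Only "off" (0) and "on" (1) are meaningful for this option. */
    if (val != 0 && val != 1)
        return -EINVAL;
    return 0;
}
```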

Based on patch from Shan Wei <shanwei@cn.fujitsu.com>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
15 years agoipv6: Check the hop limit setting in ancillary data.
Shan Wei [Tue, 10 Jun 2008 07:50:55 +0000 (15:50 +0800)]
ipv6: Check the hop limit setting in ancillary data.

When specifying the outgoing hop limit as ancillary data for sendmsg(),
the kernel doesn't check the integer hop limit value as specified in
[RFC-3542] section 6.3.
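
As a rough illustration of the range check in RFC 3542 section 6.3: values below -1 or above 255 are rejected with EINVAL, -1 requests the kernel default, and 0 through 255 are used as given. The helper name check_hoplimit() is an assumption made for this sketch and is not taken from the patch.

```c
#include <errno.h>

/*
 * Hypothetical helper: validate a hop limit passed as ancillary data.
 * -1 means "use the kernel default"; 0..255 are literal hop limits.
 * Returns 0 if the value is acceptable, -EINVAL otherwise.
 */
static int check_hoplimit(int hlim)
{
    if (hlim < -1 || hlim > 255)
        return -EINVAL;
    return 0;
}
```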

Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
15 years agoipv6 route: Fix route lifetime in netlink message.
YOSHIFUJI Hideaki [Mon, 12 May 2008 17:52:55 +0000 (02:52 +0900)]
ipv6 route: Fix route lifetime in netlink message.

1) We may have a route lifetime larger than INT_MAX.
In that case we had a weird value in lifetime.
Use INT_MAX if lifetime does not fit in s32.

2) Lifetime is valid iff RTF_EXPIRES is set.
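
The two points above can be sketched in userspace C as follows. The function name report_lifetime() and the flag SKETCH_RTF_EXPIRES are hypothetical, and reporting 0 for a route without an expiry is an assumption of this sketch rather than something stated in the message.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's RTF_EXPIRES flag bit. */
#define SKETCH_RTF_EXPIRES 0x1u

/*
 * Hypothetical helper: compute the lifetime value to place in a
 * netlink message, which carries it as a signed 32-bit integer.
 */
static int32_t report_lifetime(unsigned int flags, unsigned long remaining)
{
    /* Point 2: the lifetime only means something if the route expires. */
    if (!(flags & SKETCH_RTF_EXPIRES))
        return 0;

    /* Point 1: clamp instead of letting the value wrap in an s32. */
    if (remaining > (unsigned long)INT_MAX)
        return INT_MAX;

    return (int32_t)remaining;
}
```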

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
15 years agoipv6 mcast: Check address family of gf_group in getsockopt(MS_FILTER).
YOSHIFUJI Hideaki [Mon, 28 Apr 2008 05:40:55 +0000 (14:40 +0900)]
ipv6 mcast: Check address family of gf_group in getsockopt(MS_FILTER).

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
15 years agodccp: Bug in initial acknowledgment number assignment
Gerrit Renker [Wed, 11 Jun 2008 10:19:10 +0000 (11:19 +0100)]
dccp: Bug in initial acknowledgment number assignment

Step 8.5 in RFC 4340 says for the newly cloned socket

           Initialize S.GAR := S.ISS,

but what in fact the code (minisocks.c) does is

           Initialize S.GAR := S.ISR,

which is wrong (typo?) -- fixed by the patch.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
15 years agodccp ccid-3: X truncated due to type conversion
Gerrit Renker [Wed, 11 Jun 2008 10:19:10 +0000 (11:19 +0100)]
dccp ccid-3: X truncated due to type conversion

This fixes a bug in computing the inter-packet-interval t_ipi = s/X:

 scaled_div32(a, b) uses u32 for b, but in "scaled_div32(s, X)" the type of the
 sending rate `X' is u64. Since X is scaled by 2^6, this truncates rates greater
 than 2^26 Bps (~537 Mbps).

Using full 64-bit division now.
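
The truncation can be reproduced outside the kernel with a small sketch. Both function names are hypothetical, the microsecond scaling by 10^6 is an assumption about how t_ipi is expressed, and the zero-divisor guards exist only to keep the sketch well defined.

```c
#include <stdint.h>

/*
 * Hypothetical model of the buggy path: the 64-bit scaled rate X
 * (bytes per second << 6) is squeezed into a 32-bit divisor, so the
 * upper bits are lost once X exceeds 2^26 bytes per second.
 */
static uint64_t ipi_truncating(uint32_t s, uint64_t x_scaled)
{
    uint32_t divisor = (uint32_t)x_scaled; /* upper 32 bits discarded */

    if (divisor == 0)
        return UINT64_MAX;
    return (((uint64_t)s << 6) * 1000000u) / divisor;
}

/* Hypothetical model of the fixed path: full 64-bit division. */
static uint64_t ipi_full(uint32_t s, uint64_t x_scaled)
{
    if (x_scaled == 0)
        return UINT64_MAX;
    return (((uint64_t)s << 6) * 1000000u) / x_scaled;
}
```

With s = 1000 bytes and X = 100,000,000 bytes per second (about 800 Mbps), the full division gives an interval of 10 microseconds, while the truncating variant gives 30, that is, a sender that is three times too slow. Below 2^26 bytes per second both variants agree.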

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
15 years agodccp ccid-3: TFRC reverse-lookup Bug-Fix
Gerrit Renker [Wed, 11 Jun 2008 10:19:10 +0000 (11:19 +0100)]
dccp ccid-3: TFRC reverse-lookup Bug-Fix

This fixes a bug in the reverse lookup of p: given a value f(p), instead of p,
the function returned the smallest tabulated value f(p).

The smallest tabulated value of 10^6 * f(p), where

   f(p) =  sqrt(2*p/3) + 12 * sqrt(3*p/8) * (32 * p^3 + p),

is 8172, for p=0.0001.

Since this value is scaled by 10^6, the outcome of this bug is that a loss
of 8172/10^6 = 0.8172% was reported whenever the input was below the table
resolution of 0.01%.

This means that the value was over 80 times too high, resulting in large spikes
of the initial loss interval, thus unnecessarily reducing the throughput.

Also corrected the printk format (%u for u32).

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
15 years agodccp ccid-2: Bug-Fix - Ack Vectors need to be ignored on request sockets
Gerrit Renker [Wed, 11 Jun 2008 10:19:09 +0000 (11:19 +0100)]
dccp ccid-2: Bug-Fix - Ack Vectors need to be ignored on request sockets

This fixes an oversight from an earlier patch, ensuring that Ack Vectors
are not processed on request sockets.

The issue is that Ack Vectors must not be parsed on request sockets, since
the Ack Vector feature depends on the selection of the (TX) CCID. During the
initial handshake the CCIDs are undefined, and so RFC 4340, 10.3 applies:

 "Using CCID-specific options and feature options during a negotiation
  for the corresponding CCID feature is NOT RECOMMENDED [...]"

And it is not even possible: when the server receives the Request from the
client, the CCID and Ack vector features are undefined; when the Ack finalising
the 3-way handshake arrives, the request socket has not been cloned yet into a
full socket. (This order is necessary, since otherwise the newly created socket
would have to be destroyed whenever an option error occurred - a malicious
hacker could simply send garbage options and exploit this.)

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
15 years agodccp: Fix sparse warnings
Gerrit Renker [Wed, 11 Jun 2008 10:19:09 +0000 (11:19 +0100)]
dccp: Fix sparse warnings

This patch fixes the following sparse warnings:
 * nested min(max()) expression:
   net/dccp/ccids/ccid3.c:91:21: warning: symbol '__x' shadows an earlier one
   net/dccp/ccids/ccid3.c:91:21: warning: symbol '__y' shadows an earlier one

 * Declaration of function prototypes in .c instead of .h file, resulting in
   "should it be static?" warnings.

 * Declared "struct dccpw" static (local to dccp_probe).

 * Disabled dccp_delayed_ack() - not fully removed due to RFC 4340, 11.3
   ("Receivers SHOULD implement delayed acknowledgement timers ...").

 * Used a different local variable name to avoid
   net/dccp/ackvec.c:293:13: warning: symbol 'state' shadows an earlier one
   net/dccp/ackvec.c:238:33: originally declared here

 * Removed unused functions `dccp_ackvector_print' and `dccp_ackvec_print'.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
15 years agodccp ccid-3: Bug-Fix - Zero RTT is possible
Gerrit Renker [Wed, 11 Jun 2008 10:19:09 +0000 (11:19 +0100)]
dccp ccid-3: Bug-Fix - Zero RTT is possible

In commit $(825de27d9e40b3117b29a79d412b7a4b78c5d815) (from 27th May, commit
message `dccp ccid-3: Fix "t_ipi explosion" bug'), the CCID-3 window counter
computation was fixed to cope with RTTs < 4 microseconds.

Such RTTs can be found e.g. when running CCID-3 over loopback. The fix removed
a check against RTT < 4, but introduced a divide-by-zero bug.

All steady-state RTTs in DCCP are filtered using dccp_sample_rtt(), which
ensures non-zero samples. However, a zero RTT is possible on initialisation,
when there is no RTT sample from the Request/Response exchange.

The fix is to use the fallback-RTT from RFC 4340, 3.4.

This is also better than just fixing update_win_count() since it allows other
parts of the code to always assume that the RTT is non-zero during the time
that the CCID is used.
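
A minimal sketch of the idea, assuming the fallback RTT of RFC 4340, 3.4 is 0.2 seconds and that RTTs are kept in microseconds. The names SKETCH_FALLBACK_RTT_US, initial_rtt_us() and quarter_rtts() are hypothetical and do not come from the patch.

```c
#include <stdint.h>

/* Assumed fallback RTT of 0.2 s, expressed in microseconds. */
#define SKETCH_FALLBACK_RTT_US 200000u

/*
 * Hypothetical helper: choose the RTT to start with. A sample of 0
 * means that the Request/Response exchange produced no RTT estimate.
 */
static uint32_t initial_rtt_us(uint32_t sample_us)
{
    return sample_us ? sample_us : SKETCH_FALLBACK_RTT_US;
}

/*
 * Hypothetical window-counter style computation: number of quarter
 * RTTs contained in elapsed_us. The caller must pass a non-zero RTT,
 * which initial_rtt_us() guarantees.
 */
static uint32_t quarter_rtts(uint32_t elapsed_us, uint32_t rtt_us)
{
    return (uint32_t)((4 * (uint64_t)elapsed_us) / rtt_us);
}
```

Substituting the fallback at initialisation, instead of guarding each division, is what lets every later computation assume a non-zero RTT.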

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
15 years agoMerge branch 'davem-fixes' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik...
David S. Miller [Tue, 10 Jun 2008 23:21:55 +0000 (16:21 -0700)]
Merge branch 'davem-fixes' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6

15 years agonet: Fix routing tables with id > 255 for legacy software
Krzysztof Piotr Oledzki [Tue, 10 Jun 2008 22:44:49 +0000 (15:44 -0700)]
net: Fix routing tables with id > 255 for legacy software

Most legacy software does not like tables > 255, as rtm_table is u8,
so tb_id is sent &0xff and it is possible to confuse, for example,
table 510 with table 254 (main).

This patch introduces RT_TABLE_COMPAT=252 so the code uses it if
tb_id > 255. It makes such old applications happy, while new
ones are still able to use RTA_TABLE to get a proper table id.
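
The aliasing and the fix can be illustrated with a short userspace sketch. The helper names are hypothetical; only the value 252 for RT_TABLE_COMPAT and the 510 versus 254 example come from the message above.

```c
#include <stdint.h>

/* Value of RT_TABLE_COMPAT as given in the commit message. */
#define SKETCH_RT_TABLE_COMPAT 252u

/* Old behaviour (hypothetical helper): the id is simply masked to 8 bits. */
static uint8_t legacy_table_masked(uint32_t tb_id)
{
    return (uint8_t)(tb_id & 0xff);
}

/* New behaviour (hypothetical helper): ids above 255 map to one marker. */
static uint8_t legacy_table_compat(uint32_t tb_id)
{
    return tb_id < 256 ? (uint8_t)tb_id : (uint8_t)SKETCH_RT_TABLE_COMPAT;
}
```

Masking turns table 510 into 254, which legacy tools read as the main table; the compat mapping reports 252 instead and leaves ids up to 255 untouched.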

Signed-off-by: Krzysztof Piotr Oledzki <ole@ans.pl>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agosky2: Hold RTNL while calling dev_close()
Ben Hutchings [Sat, 31 May 2008 15:52:52 +0000 (16:52 +0100)]
sky2: Hold RTNL while calling dev_close()

dev_close() must be called holding the RTNL.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
15 years agos2io iomem annotations
Al Viro [Mon, 2 Jun 2008 09:59:02 +0000 (10:59 +0100)]
s2io iomem annotations

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
15 years agoatl1: fix suspend regression
Jay Cliburn [Sun, 1 Jun 2008 21:57:11 +0000 (16:57 -0500)]
atl1: fix suspend regression

Using vendor magic to force the PHY into power save mode breaks
suspend.  It isn't needed anyway, so remove it.

Tested-by: Avuton Olrich <avuton@gmail.com>
Signed-off-by: Jay Cliburn <jacliburn@bellsouth.net>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
15 years agoqeth: start dev queue after tx drop error
Frank Blaschka [Fri, 6 Jun 2008 10:37:48 +0000 (12:37 +0200)]
qeth: start dev queue after tx drop error

In case the xmit function drops out with an error, we have to wake
the netdevice queue to start another xmit.

Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
15 years agoqeth: Prepare-function to call s390dbf was wrong
Peter Tiedemann [Fri, 6 Jun 2008 10:37:47 +0000 (12:37 +0200)]
qeth: Prepare-function to call s390dbf was wrong

The prepare function used to call s390dbf handled variable arguments
incorrectly. This worked as a macro but no longer works as a function.
Now using va_list processing.
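
For readers unfamiliar with the pattern, here is a generic userspace sketch of va_list processing. It is not the qeth code, and dbf_format() and dbf_vformat() are hypothetical names. The point is that a variadic function cannot pass its "..." on directly the way a macro can; it has to capture the arguments in a va_list and hand that to a v*-style function.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical worker: takes an already started va_list. */
static int dbf_vformat(char *buf, size_t len, const char *fmt, va_list args)
{
    return vsnprintf(buf, len, fmt, args);
}

/* Hypothetical front end: variadic, forwards its arguments as a va_list. */
static int dbf_format(char *buf, size_t len, const char *fmt, ...)
{
    va_list args;
    int n;

    va_start(args, fmt);
    n = dbf_vformat(buf, len, fmt, args);
    va_end(args);

    return n;
}
```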

Signed-off-by: Peter Tiedemann <ptiedem@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
15 years agoqeth: reduce number of kernel messages
Frank Blaschka [Fri, 6 Jun 2008 10:37:46 +0000 (12:37 +0200)]
qeth: reduce number of kernel messages

Remove unnecessary messages. Write important debug information to
s390dbf.

Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
15 years agoqeth: Use ccw_device_get_id().
Cornelia Huck [Fri, 6 Jun 2008 10:37:45 +0000 (12:37 +0200)]
qeth: Use ccw_device_get_id().

Get the devno from the ccw device via ccw_device_get_id() instead
of parsing the bus_id.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
15 years agoqeth: layer 3 Oops in ip event handler
Frank Blaschka [Fri, 6 Jun 2008 10:37:44 +0000 (12:37 +0200)]
qeth: layer 3 Oops in ip event handler

The ip event handler may present us with non-qeth network interfaces.
Add qeth card pointer check.

Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
15 years agovirtio: use callback on empty in virtio_net
Rusty Russell [Sun, 8 Jun 2008 10:51:55 +0000 (20:51 +1000)]
virtio: use callback on empty in virtio_net

virtio_net uses a timer to free old transmitted packets, rather than
leaving callbacks enabled all the time.  If the host promises to
always notify us when the transmit ring is empty, we can free packets
at that point and avoid the timer.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
15 years agovirtio: virtio_net free transmit skbs in a timer
Mark McLoughlin [Sun, 8 Jun 2008 10:50:56 +0000 (20:50 +1000)]
virtio: virtio_net free transmit skbs in a timer

virtio_net currently only frees old transmit skbs just
before queueing new ones. If the queue is full, it then
enables interrupts and waits for notification that more
work has been performed.

However, a side-effect of this scheme is that there are
always xmit skbs left dangling when no new packets are
sent, against the Documentation/networking/driver.txt
guideline:

  "... it is not allowed for your TX mitigation scheme
   to let TX packets "hang out" in the TX ring unreclaimed
   forever if no new TX packets are sent."

Add a timer to ensure that any time we queue new TX
skbs, we will shortly free them again.

This fixes an easily reproduced hang at shutdown where
iptables attempts to unload nf_conntrack and nf_conntrack
waits for an skb it is tracking to be freed, but virtio_net
never frees it.

Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
15 years agovirtio: Fix typo in virtio_net_hdr comments
Mark McLoughlin [Sun, 8 Jun 2008 10:49:59 +0000 (20:49 +1000)]
virtio: Fix typo in virtio_net_hdr comments

Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
15 years agovirtio_net: Fix skb->csum_start computation
Mark McLoughlin [Sun, 8 Jun 2008 10:49:00 +0000 (20:49 +1000)]
virtio_net: Fix skb->csum_start computation

hdr->csum_start is the offset from the start of the ethernet
header to the transport layer checksum field. skb->csum_start
is the offset from skb->head.

skb_partial_csum_set() assumes that skb->data points to the
ethernet header - i.e. it computes skb->csum_start by adding
the headroom to hdr->csum_start.

Since eth_type_trans() skb_pull()s the ethernet header,
skb_partial_csum_set() should be called before
eth_type_trans().

(Without this patch, GSO packets from a guest to the world outside the
host are corrupted).

Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
15 years agoehea: set mac address fix
Jan-Bernd Themann [Mon, 9 Jun 2008 14:17:37 +0000 (15:17 +0100)]
ehea: set mac address fix

eHEA has to call firmware functions in order to change the mac address
of a logical port. This patch checks if the logical port is up
when calling the register / deregister mac address calls. If the port
is down these firmware calls would fail and are therefore not executed.

Signed-off-by: Jan-Bernd Themann <themann@de.ibm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
15 years agosfc: Recover from RX queue flush failure
Steve Hodgson [Mon, 9 Jun 2008 18:34:32 +0000 (19:34 +0100)]
sfc: Recover from RX queue flush failure

RX queue flush can fail if traffic continues to arrive.  Recover by
performing an invisible reset.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
15 years agoadd missing lance_* exports
Adrian Bunk [Mon, 9 Jun 2008 22:22:16 +0000 (01:22 +0300)]
add missing lance_* exports

This patch fixes the following build error:

<--  snip  -->

...
  Building modules, stage 2.
  MODPOST 1203 modules
ERROR: "lance_open" [drivers/net/mvme147.ko] undefined!
ERROR: "lance_close" [drivers/net/mvme147.ko] undefined!
ERROR: "lance_tx_timeout" [drivers/net/mvme147.ko] undefined!
ERROR: "lance_set_multicast" [drivers/net/mvme147.ko] undefined!
ERROR: "lance_start_xmit" [drivers/net/mvme147.ko] undefined!
...
make[2]: *** [__modpost] Error 1

<--  snip  -->

Reported-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
15 years agoixgbe: fix typo
Jeff Kirsher [Mon, 9 Jun 2008 22:57:17 +0000 (15:57 -0700)]
ixgbe: fix typo

Define names were accidentally transposed.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
15 years agoforcedeth: msi interrupts
Ayaz Abdulla [Mon, 9 Jun 2008 23:51:06 +0000 (16:51 -0700)]
forcedeth: msi interrupts

Add a workaround for lost MSI interrupts.  There is a race condition in
the HW in which future interrupts could be missed.  The workaround is to
toggle the MSI irq mask.

Added cleanup based on comments from Andrew Morton.

Signed-off-by: Ayaz Abdulla <aabdulla@nvidia.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
15 years agoipsec: pfkey should ignore events when no listeners
Jamal Hadi Salim [Tue, 10 Jun 2008 21:25:34 +0000 (14:25 -0700)]
ipsec: pfkey should ignore events when no listeners

When pfkey has no km listeners, it still does a lot of work
before finding out there ain't nobody out there.
If a tree falls in a forest and no one is around to hear it, does it make
a sound? In this case it makes a lot of noise.
With this short-circuit, performance when adding tens of thousands of SAs
using netlink improves by ~10%.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agopppoe: Unshare skb before anything else
Herbert Xu [Tue, 10 Jun 2008 21:08:25 +0000 (14:08 -0700)]
pppoe: Unshare skb before anything else

We need to unshare the skb first as otherwise pskb_may_pull may
write to a shared skb which could be bad.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agonet pppoe: Check packet length on all receive paths
Herbert Xu [Tue, 10 Jun 2008 21:07:25 +0000 (14:07 -0700)]
net pppoe: Check packet length on all receive paths

The length field in the PPPOE header wasn't checked completely.
This patch causes all packets shorter than the declared length
to be dropped.

It also changes the memcpy_toiovec call to skb_copy_datagram_iovec
so that paged packets (rare for PPPOE) are handled properly.
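
The length check can be sketched in userspace C as below. This is not the driver code: the function name is hypothetical, and the 6-byte header layout with a big-endian length field in bytes 4 and 5 is taken from RFC 2516.

```c
#include <stddef.h>
#include <stdint.h>

/* PPPoE header: ver/type, code, session id (2 bytes), length (2 bytes). */
#define SKETCH_PPPOE_HDR_LEN 6

/*
 * Hypothetical helper: return the payload length to deliver, or -1 if
 * the frame is shorter than the length declared in its header and
 * must be dropped.
 */
static long pppoe_payload_len(const uint8_t *frame, size_t frame_len)
{
    size_t declared;

    if (frame_len < SKETCH_PPPOE_HDR_LEN)
        return -1;

    /* The length field is in network byte order. */
    declared = ((size_t)frame[4] << 8) | frame[5];

    if (frame_len - SKETCH_PPPOE_HDR_LEN < declared)
        return -1;

    return (long)declared;
}
```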

Thanks to Ilja of the Netric Security Team for discovering and
reporting this bug, and Chris Wright for the total_len check.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoisdn: use simple_read_from_buffer()
Akinobu Mita [Tue, 10 Jun 2008 19:50:14 +0000 (12:50 -0700)]
isdn: use simple_read_from_buffer()

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Karsten Keil <kkeil@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoisdn divas: fix proc creation
Alexey Dobriyan [Tue, 10 Jun 2008 19:49:31 +0000 (12:49 -0700)]
isdn divas: fix proc creation

1. creating a proc entry without saving the pointer to the PDE, and then
   checking that pointer, is not going to work.
2. if the proc entry wasn't created, there is no reason to remove it on the
   error path.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agodrivers/atm/eni.h: remove unused macro KERNEL_OFFSET
Pradeep Singh Rautela [Tue, 10 Jun 2008 19:46:52 +0000 (12:46 -0700)]
drivers/atm/eni.h: remove unused macro KERNEL_OFFSET

KERNEL_OFFSET macro in eni.h is not required as it is not used anywhere.
Remove the unused macro from eni.h header file.

Signed-off-by: Pradeep Singh <rautelap@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoinet{6}_request_sock: Init ->opt and ->pktopts in the constructor
Arnaldo Carvalho de Melo [Tue, 10 Jun 2008 19:39:35 +0000 (12:39 -0700)]
inet{6}_request_sock: Init ->opt and ->pktopts in the constructor

Wei Yongjun noticed that we may call reqsk_free on request sock objects where
the opt fields may not be initialized, fix it by introducing inet_reqsk_alloc
where we initialize ->opt to NULL and set ->pktopts to NULL in
inet6_reqsk_alloc.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoipv4: Remove unused declaration from include/net/tcp.h.
Rami Rosen [Tue, 10 Jun 2008 19:37:42 +0000 (12:37 -0700)]
ipv4: Remove unused declaration from include/net/tcp.h.

- The tcp_unhash() declaration in include/net/tcp.h is no longer needed, as
the unhash method in the tcp_prot structure is now inet_unhash (instead of
tcp_unhash, as in the past); see the tcp_prot structure in net/ipv4/tcp_ipv4.c.

- So, this patch removes the tcp_unhash() declaration from include/net/tcp.h.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agol2tp: Fix potential memory corruption in pppol2tp_recvmsg()
James Chapman [Tue, 10 Jun 2008 19:35:00 +0000 (12:35 -0700)]
l2tp: Fix potential memory corruption in pppol2tp_recvmsg()

This patch fixes a potential memory corruption in
pppol2tp_recvmsg(). If skb->len is bigger than the caller's buffer
length, memcpy_toiovec() will go into uninitialized data on the kernel
heap, interpret it as an iovec and start modifying memory.

The fix is to change the memcpy_toiovec() call to
skb_copy_datagram_iovec() so that paged packets (rare for PPPOL2TP)
are handled properly. Also check that the caller's buffer is big
enough for the data and set the MSG_TRUNC flag if it is not.
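
The buffer-size check amounts to the following pattern, shown here as a userspace sketch with a flat buffer instead of an iovec. copy_datagram() and SKETCH_MSG_TRUNC are hypothetical names; the real fix uses skb_copy_datagram_iovec() as described above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the MSG_TRUNC flag. */
#define SKETCH_MSG_TRUNC 0x20

/*
 * Hypothetical helper: copy at most dst_len bytes of the packet and
 * record in *flags whether the packet had to be truncated.
 * Returns the number of bytes actually copied.
 */
static size_t copy_datagram(void *dst, size_t dst_len,
                            const void *pkt, size_t pkt_len, int *flags)
{
    size_t n = pkt_len;

    if (n > dst_len) {
        n = dst_len;
        *flags |= SKETCH_MSG_TRUNC;
    }
    memcpy(dst, pkt, n);
    return n;
}
```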

Reported-by: Ilja <ilja@netric.org>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoipv6 netns: init net is used to set bindv6only for new sock
Pavel Emelyanov [Mon, 9 Jun 2008 22:53:30 +0000 (15:53 -0700)]
ipv6 netns: init net is used to set bindv6only for new sock

The bindv6only is tuned via sysctl. It is already on a struct net
and per-net sysctls allow for its modification (ipv6_sysctl_net_init).

Despite this the value configured in the init net is used for the
rest of them.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoirda: net/irda build fix: mcs7780
Ingo Molnar [Mon, 9 Jun 2008 22:47:38 +0000 (15:47 -0700)]
irda: net/irda build fix: mcs7780

-tip testing found the following build error:

  drivers/built-in.o: In function `mcs_receive_irq':
  mcs7780.c:(.text+0x4e429): undefined reference to `crc32_le'
  drivers/built-in.o: In function `mcs_hard_xmit':
  mcs7780.c:(.text+0x4e9af): undefined reference to `crc32_le'

with:

  http://redhat.com/~mingo/misc/config-Sun_Jun__8_22_56_14_CEST_2008.bad

The reason is that the Kconfig entry does not enable the CRC32 library.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agotg3: Update version to 3.92.1
Matt Carlson [Mon, 9 Jun 2008 22:41:33 +0000 (15:41 -0700)]
tg3: Update version to 3.92.1

This patch increments the version to 3.92.1.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agotg3: Fix 5761 WOL
Matt Carlson [Mon, 9 Jun 2008 22:41:12 +0000 (15:41 -0700)]
tg3: Fix 5761 WOL

On 5761 non-e devices, two problems prevent the administrator from
overriding the WOL settings in the device's NVRAM.

The first problem is that GPIO 0 and GPIO 2 have been swapped.  This
change prevented the administrator from turning on WOL when it is
disabled in NVRAM.  The fix is to add a new path for the 5761 that
swaps the two GPIOs in the code as well.

The second problem is that GPIO 1 could not be toggled by the driver
because the GPIO is shared with the debug UART GPIO.  This will prevent
the administrator from being able to turn WOL off if it was enabled in
NVRAM.  The fix is to always disable the debug UART after a GRC reset.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agotg3: Fix a flags typo
Matt Carlson [Mon, 9 Jun 2008 22:40:26 +0000 (15:40 -0700)]
tg3: Fix a flags typo

This patch fixes a problem where the TG3_FLAG_10_100_ONLY flag was
being tested against the wrong flags variable.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agotg3: Fix 5714S / 5715S / 5780S link failures
Matt Carlson [Mon, 9 Jun 2008 22:39:55 +0000 (15:39 -0700)]
tg3: Fix 5714S / 5715S / 5780S link failures

The git commit ef167e27039eeaea6d3cdd5c547b082e89840bdd entitled
"Fix supporting flowctrl code" introduced a bug that prevents 5714S,
5715S and 5780S devices from falling back to a forced link mode.  The
problem is that the added flow control check will always fail if flow
control is set to autoneg and either RX or TX (or both) flow control
is enabled.  The driver defaults to setting flow control to autoneg
and advertises both RX and TX flow control.

The fix is to remove the errant check.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoiwlwifi: fix oops in iwl3945_led_brightness_set
Marcin Slusarz [Sun, 8 Jun 2008 11:13:06 +0000 (13:13 +0200)]
iwlwifi: fix oops in iwl3945_led_brightness_set

fix race between:
ieee80211_open->ieee80211_led_radio->led_trigger_event->led_set_brightness->iwl3945_led_brightness_set
(which assumes that "led->priv" is not NULL)
and
iwl3945_pci_probe->iwl3945_setup_deferred_work->(...)->iwl3945_bg_alive_start->iwl3945_alive_start->iwl3945_led_register->iwl3945_led_register_led
which sets priv field in struct iwl3945_led
after
led->led_dev.brightness_set = iwl3945_led_brightness_set;
(...)
led_classdev_register(device, &led->led_dev);

http://kerneloops.org/guilty.php?guilty=iwl3945_led_brightness_set&version=2.6.25-release&start=1671168&end=1703935&class=oops

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Zhu Yi <yi.zhu@intel.com>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Cc: Tomas Winkler <tomas.winkler@intel.com>
Cc: linux-wireless@vger.kernel.org
Cc: ipw3945-devel@lists.sourceforge.net
Signed-off-by: John W. Linville <linville@tuxdriver.com>
15 years agoinclude/linux/ssb/ssb_driver_gige.h typo fix
Adrian Bunk [Thu, 5 Jun 2008 18:29:49 +0000 (21:29 +0300)]
include/linux/ssb/ssb_driver_gige.h typo fix

This patch fixes a typo in the name of a config variable.

Reported-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Reviewed-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
15 years agomac80211: Checking IBSS support while changing channel in ad-hoc mode
Assaf Krauss [Thu, 5 Jun 2008 16:55:21 +0000 (19:55 +0300)]
mac80211: Checking IBSS support while changing channel in ad-hoc mode

This patch adds a check to the set_channel flow. When attempting to change
the channel while in IBSS mode, and the new channel does not support IBSS
mode, the flow returns with an error value with no consequences on the
mac80211 and driver state.

Signed-off-by: Assaf Krauss <assaf.krauss@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
15 years agomac80211: decrease IBSS creation latency
Dan Williams [Wed, 4 Jun 2008 17:59:34 +0000 (13:59 -0400)]
mac80211: decrease IBSS creation latency

Sufficient scans (at least 2 or 3) should have been done within 7
seconds to find an existing IBSS to join.  This should improve IBSS
creation latency; and since IBSS merging is still in effect, shouldn't
have detrimental effects on eventual IBSS convergence.

Signed-off-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
15 years agozd1211rw: Fix data padding for QoS
Michael Buesch [Thu, 5 Jun 2008 14:55:10 +0000 (16:55 +0200)]
zd1211rw: Fix data padding for QoS

This patch fixes a data alignment issue in the zd1211rw driver.
The IEEE80211_STYPE_QOS_DATA bit should be used as a bitwise test
to test for the presence of the 2 byte QoS control field.
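
The difference between a bitwise test and an exact subtype match can be shown with a small sketch. The function names are hypothetical, the constants mirror the 802.11 frame control layout, and the exact-match variant is included only for contrast; whether the driver previously compared this way is an assumption.

```c
#include <stdint.h>

/* Frame control masks and values, following the 802.11 layout. */
#define SKETCH_FCTL_FTYPE     0x000c
#define SKETCH_FTYPE_DATA     0x0008
#define SKETCH_FCTL_STYPE     0x00f0
#define SKETCH_STYPE_QOS_DATA 0x0080

/* Exact match: only the plain "QoS Data" subtype is recognised. */
static int has_qos_exact(uint16_t fc)
{
    return (fc & SKETCH_FCTL_FTYPE) == SKETCH_FTYPE_DATA &&
           (fc & SKETCH_FCTL_STYPE) == SKETCH_STYPE_QOS_DATA;
}

/* Bitwise test: any data subtype with the QoS bit set is recognised. */
static int has_qos_bitwise(uint16_t fc)
{
    return (fc & SKETCH_FCTL_FTYPE) == SKETCH_FTYPE_DATA &&
           (fc & SKETCH_STYPE_QOS_DATA) != 0;
}
```

A QoS Null frame (frame control 0x00c8) carries the 2 byte QoS control field but fails the exact match, so only the bitwise test accounts for those 2 bytes when computing padding.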

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
15 years agomac80211: Fixing slow IBSS rejoin
Assaf Krauss [Wed, 4 Jun 2008 17:27:59 +0000 (20:27 +0300)]
mac80211: Fixing slow IBSS rejoin

This patch fixes the issue of slow reconnection to an IBSS cell after
disconnection from it. Now the interface's bssid is reset upon ifdown.

ieee80211_sta_find_ibss:
if (found && memcmp(ifsta->bssid, bssid, ETH_ALEN) != 0 &&
    (bss = ieee80211_rx_bss_get(dev, bssid,
                                local->hw.conf.channel->center_freq,
                                ifsta->ssid, ifsta->ssid_len)))

Note:
In general disconnection is still not handled properly in mac80211

Signed-off-by: Assaf Krauss <assaf.krauss@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
15 years agolibertas: fix sleep confirmation
Holger Schurig [Wed, 4 Jun 2008 09:10:40 +0000 (11:10 +0200)]
libertas: fix sleep confirmation

This fixes an issue that prevented "iwconfig eth1 power on" from working.
When we get a "PS sleep" event, we have to confirm this to the firmware.
The confirm happens with a command, but this command is special: the
firmware won't send us a response. if_cs_host_to_card() is setting
priv->dnld_sent anyway, so this variable stayed at DNLD_DATA_SENT and
was never cleared back.

Now the special knowledge that CMD_802_11_PS_MODE with
CMD_SUBCMD_SLEEP_CONFIRMED doesn't need a response is encoded by directly
clearing the dnld_sent state in lbs_send_confirmsleep().

Signed-off-by: Holger Schurig <hs4233@mail.mn-solutions.de>
Acked-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
15 years agomac80211: send association event on IBSS create
Dan Williams [Wed, 4 Jun 2008 03:39:55 +0000 (23:39 -0400)]
mac80211: send association event on IBSS create

Otherwise userspace has no idea the IBSS creation succeeded.

Signed-off-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
15 years agoipw2200: queue direct scans
Dan Williams [Mon, 2 Jun 2008 21:51:23 +0000 (17:51 -0400)]
ipw2200: queue direct scans

When another scan is in progress, a direct scan gets dropped on the
floor.  However, that direct scan is usually the scan that's really
needed by userspace, and gets stomped on by all the broadcast scans the
ipw2200 driver issues internally.  Make sure the direct scan happens
eventually, and as a bonus ensure that the passive scan worker is
cleaned up when appropriate.

The change of request_passive_scan from a struct work to a struct
delayed_work is only to make the set_wx_scan() code a bit simpler; it's
still only used with a delay of 0 to match previous behavior.

Signed-off-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
15 years agol2tp: Fix possible oops if transmitting or receiving when tunnel goes down
James Chapman [Wed, 4 Jun 2008 22:54:07 +0000 (15:54 -0700)]
l2tp: Fix possible oops if transmitting or receiving when tunnel goes down

Some problems have been experienced in the field which cause an oops
in the pppol2tp driver if L2TP tunnels fail while passing data.

The pppol2tp driver uses private data that is referenced via the
sk->sk_user_data of its UDP and PPPoL2TP sockets. This patch makes
sure that the driver uses sock_hold() when it holds a reference to the
sk pointer. This affects its sendmsg(), recvmsg(), getname(),
[gs]etsockopt() and ioctl() handlers.

Tested by ISP where problem was seen. System has been up 10 days with
no oops since running this patch. Without the patch, an oops would
occur every 1-2 days.

Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agotcp: Fix for race due to temporary drop of the socket lock in skb_splice_bits.
Octavian Purdila [Wed, 4 Jun 2008 22:45:58 +0000 (15:45 -0700)]
tcp: Fix for race due to temporary drop of the socket lock in skb_splice_bits.

skb_splice_bits temporarily drops the socket lock while iterating over
the socket queue in order to break a reverse locking condition which
happens with sendfile. This, however, opens a window of opportunity
for tcp_collapse() to aggregate skbs and thus potentially free the
current skb used in skb_splice_bits and tcp_read_sock.

This patch fixes the problem by (re-)getting the same "logical skb"
after the lock has been temporarily dropped.

Based on idea and initial patch from Evgeniy Polyakov.

Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
Acked-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agotcp: Increment OUTRSTS in tcp_send_active_reset()
Sridhar Samudrala [Wed, 4 Jun 2008 22:19:35 +0000 (15:19 -0700)]
tcp: Increment OUTRSTS in tcp_send_active_reset()

TCP "resets sent" counter is not incremented when a TCP Reset is
sent via tcp_send_active_reset().

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoraw: Raw socket leak.
Denis V. Lunev [Wed, 4 Jun 2008 22:16:12 +0000 (15:16 -0700)]
raw: Raw socket leak.

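The program below leaks the raw kernel socket:

#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

int main(void) {
        int fd = socket(PF_INET, SOCK_RAW, IPPROTO_UDP);
        struct sockaddr_in addr;

        memset(&addr, 0, sizeof(addr));
        inet_aton("127.0.0.1", &addr.sin_addr);
        addr.sin_family = AF_INET;
        addr.sin_port = htons(2048);
        /* MSG_MORE corks the packet; the socket is never explicitly closed */
        sendto(fd, "a", 1, MSG_MORE, (struct sockaddr *)&addr, sizeof(addr));
        return 0;
}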

Corked packet is allocated via sock_wmalloc which holds the owner socket,
so one should uncork it and flush all pending data on close. Do this in the
same way as in UDP.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agolt2p: Fix possible WARN_ON from socket code when UDP socket is closed
James Chapman [Wed, 4 Jun 2008 22:07:32 +0000 (15:07 -0700)]
lt2p: Fix possible WARN_ON from socket code when UDP socket is closed

If an L2TP daemon closes a tunnel socket while packets are queued in
the tunnel's reorder queue, a kernel warning is logged because the
socket is closed while skbs are still referencing it. The fix is to
purge the queue in the socket's release handler.

WARNING: at include/net/sock.h:351 udp_lib_unhash+0x41/0x68()
Pid: 12998, comm: openl2tpd Not tainted 2.6.25 #8
 [<c0423c58>] warn_on_slowpath+0x41/0x51
 [<c05d33a7>] udp_lib_unhash+0x41/0x68
 [<c059424d>] sk_common_release+0x23/0x90
 [<c05d16be>] udp_lib_close+0x8/0xa
 [<c05d8684>] inet_release+0x42/0x48
 [<c0592599>] sock_release+0x14/0x60
 [<c059299f>] sock_close+0x29/0x30
 [<c046ef52>] __fput+0xad/0x15b
 [<c046f1d9>] fput+0x17/0x19
 [<c046c8c4>] filp_close+0x50/0x5a
 [<c046da06>] sys_close+0x69/0x9f
 [<c04048ce>] syscall_call+0x7/0xb

Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoMerge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireles...
David S. Miller [Wed, 4 Jun 2008 21:58:13 +0000 (14:58 -0700)]
Merge branch 'master' of /linux/kernel/git/linville/wireless-2.6

15 years agoUSB ID for Philips CPWUA054/00 Wireless USB Adapter 11g
Felix Homann [Thu, 29 May 2008 07:36:45 +0000 (00:36 -0700)]
USB ID for Philips CPWUA054/00 Wireless USB Adapter 11g

Enable the Philips CPWUA054/00 in p54usb.

Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
15 years agossb: Fix context assertion in ssb_pcicore_dev_irqvecs_enable
Michael Buesch [Mon, 2 Jun 2008 14:15:23 +0000 (16:15 +0200)]
ssb: Fix context assertion in ssb_pcicore_dev_irqvecs_enable

This fixes a context assertion in ssb that makes b44 print
out warnings on resume.

This fixes the following kernel oopses:
http://www.kerneloops.org/oops.php?number=12732
http://www.kerneloops.org/oops.php?number=11410

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
15 years agolibertas: fix command size for CMD_802_11_SUBSCRIBE_EVENT
Holger Schurig [Fri, 30 May 2008 12:53:22 +0000 (14:53 +0200)]
libertas: fix command size for CMD_802_11_SUBSCRIBE_EVENT

The size was too small by two bytes.

Signed-off-by: Holger Schurig
Signed-off-by: John W. Linville <linville@tuxdriver.com>
15 years agoipw2200: expire and use oldest BSS on adhoc create
Dan Williams [Thu, 29 May 2008 18:38:28 +0000 (14:38 -0400)]
ipw2200: expire and use oldest BSS on adhoc create

If there are no networks on the free list, expire the oldest one when
creating a new adhoc network.  Because ipw2200 and the ieee80211 stack
don't actually cull old networks and place them back on the free list
unless they are needed for new probe responses, over time the free list
would become empty and creating an adhoc network would fail due to the
!list_empty(...) check.

Signed-off-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
15 years agoairo warning fix
Andrew Morton [Wed, 28 May 2008 19:40:39 +0000 (12:40 -0700)]
airo warning fix

WARNING: space prohibited between function name and open parenthesis '('
#22: FILE: drivers/net/wireless/airo.c:2907:
+ while ((IN4500 (ai, COMMAND) & COMMAND_BUSY) && (delay < 10000)) {

total: 0 errors, 1 warnings, 8 lines checked

./patches/wireless-airo-waitbusy-wont-delay.patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Dan Williams <dcbw@redhat.com>
Cc: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
15 years agob43legacy: Fix controller restart crash
Michael Buesch [Thu, 22 May 2008 15:06:36 +0000 (17:06 +0200)]
b43legacy: Fix controller restart crash

This fixes a kernel crash on rmmod, in the case where the controller
was restarted before doing the rmmod.

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
15 years agosctp: Fix ECN markings for IPv6
Vlad Yasevich [Wed, 4 Jun 2008 19:40:15 +0000 (12:40 -0700)]
sctp: Fix ECN markings for IPv6

Commit e9df2e8fd8fbc95c57dbd1d33dada66c4627b44c ("[IPV6]: Use
appropriate sock tclass setting for routing lookup.") also changed the
way that ECN capable transports mark this capability in IPv6.  As a
result, SCTP was not marking ECN capability because the traffic class
was never set.  This patch brings back the markings for IPv6 traffic.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agosctp: Flush the queue only once during fast retransmit.
Vlad Yasevich [Wed, 4 Jun 2008 19:39:36 +0000 (12:39 -0700)]
sctp: Flush the queue only once during fast retransmit.

When fast retransmit is triggered by a sack, we should flush the queue
only once so that only 1 retransmit happens.  Also, since we could
potentially have non-fast-rtx chunks on the retransmit queue, we need
to make sure any chunks eligible for fast retransmit are sent first
during fast retransmission.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Tested-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agosctp: Start T3-RTX timer when fast retransmitting lowest TSN
Vlad Yasevich [Wed, 4 Jun 2008 19:39:11 +0000 (12:39 -0700)]
sctp: Start T3-RTX timer when fast retransmitting lowest TSN

When we are trying to fast retransmit the lowest outstanding TSN, we
need to restart the T3-RTX timer, so that subsequent timeouts will
correctly tag all the packets necessary for retransmissions.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Tested-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agosctp: Correctly implement Fast Recovery cwnd manipulations.
Vlad Yasevich [Wed, 4 Jun 2008 19:38:43 +0000 (12:38 -0700)]
sctp: Correctly implement Fast Recovery cwnd manipulations.

Correctly keep track of Fast Recovery state and do not reduce
congestion window multiple times during such state.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Tested-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agosctp: Move sctp_v4_dst_saddr out of loop
Gui Jianfeng [Wed, 4 Jun 2008 19:38:07 +0000 (12:38 -0700)]
sctp: Move sctp_v4_dst_saddr out of loop

There's no need to execute sctp_v4_dst_saddr() on each
iteration; just move it out of the loop.

Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agosctp: retran_path update bug fix
Gui Jianfeng [Wed, 4 Jun 2008 19:37:33 +0000 (12:37 -0700)]
sctp: retran_path update bug fix

If the current retran_path is the only active one, it should
be updated to the next inactive one.

Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoMerge branch 'net-2.6-misc-20080605a' of git://git.linux-ipv6.org/gitroot/yoshfuji...
David S. Miller [Wed, 4 Jun 2008 19:10:21 +0000 (12:10 -0700)]
Merge branch 'net-2.6-misc-20080605a' of git://git.linux-ipv6.org/gitroot/yoshfuji/linux-2.6-fix

15 years agotcp: fix skb vs fack_count out-of-sync condition
Ilpo Järvinen [Wed, 4 Jun 2008 19:07:44 +0000 (12:07 -0700)]
tcp: fix skb vs fack_count out-of-sync condition

This bug is able to corrupt fackets_out in very rare cases.
In order for this to cause corruption:
  1) DSACK in the middle of previous SACK block must be generated.
  2) In order to take that particular branch, part or all of the
     DSACKed segment must already be SACKed so that we have that
     in cache in the first place.
  3) The new info must be top enough so that fackets_out will be
     updated on this iteration.
...then fack_count is updated while skb wasn't, then we walk again
that particular segment thus updating fack_count twice for
a single skb and finally that value is assigned to fackets_out
by tcp_sacktag_one.

It is safe to call tcp_sacktag_one just once for a segment (at
DSACK), no need to call again for plain SACK.

Potential problems from the miscount are limited to premature entry
to recovery and to inflated reordering metric (which could even
cancel each other out in the luckiest scenarios :-)).
Both are quite insignificant in worst case too and there exists
also code to reset them (fackets_out once sacked_out becomes zero
and reordering metric on RTO).

This has been reported by a number of people; because it occurred
quite rarely, it has been very elusive. Andy Furniss was able to
get it to occur a couple of times so that a bit more info was
collected about the problem using a debug patch, though it still
required a lot of checking around. Thanks also to others who have
tried to help here.

This is listed as Bugzilla #10346. The bug was introduced by
me in commit 68f8353b48 ([TCP]: Rewrite SACK block processing &
sack_recv_cache use), I probably thought back then that there's
need to scan that entry twice or didn't dare to make it go
through it just once there. Going through twice would have
required restoring fack_count after the walk but as noted above,
I chose to drop the additional walk step altogether here.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agosunhme: Cleanup use of deprecated calls to save_and_cli and restore_flags.
Mark Asselstine [Wed, 4 Jun 2008 19:06:28 +0000 (12:06 -0700)]
sunhme: Cleanup use of deprecated calls to save_and_cli and restore_flags.

Make use of local_irq_save and local_irq_restore rather than the
deprecated save_and_cli and restore_flags calls.

Signed-off-by: Mark Asselstine <mark.asselstine@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoxfrm: xfrm_algo: correct usage of RIPEMD-160
Adrian-Ken Rueegsegger [Wed, 4 Jun 2008 19:04:55 +0000 (12:04 -0700)]
xfrm: xfrm_algo: correct usage of RIPEMD-160

This patch fixes the usage of RIPEMD-160 in xfrm_algo which in turn
allows hmac(rmd160) to be used as authentication mechanism in IPsec
ESP and AH (see RFC 2857).

Signed-off-by: Adrian-Ken Rueegsegger <rueegsegger@swiss-it.ch>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years ago[IPV6]: Do not change protocol for UDPv6 sockets with pending sent data.
Denis V. Lunev [Wed, 4 Jun 2008 11:49:08 +0000 (15:49 +0400)]
[IPV6]: Do not change protocol for UDPv6 sockets with pending sent data.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
15 years ago[IPV6]: inet_sk(sk)->cork.opt leak
Denis V. Lunev [Wed, 4 Jun 2008 11:49:07 +0000 (15:49 +0400)]
[IPV6]: inet_sk(sk)->cork.opt leak

IPv6 UDP sockets with IPv4-mapped addresses actually use udp_sendmsg to
send the data. In this case ip_flush_pending_frames should be called
instead of ip6_flush_pending_frames.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
15 years ago[IPV6]: Do not change protocol for raw IPv6 sockets.
Denis V. Lunev [Wed, 4 Jun 2008 11:49:06 +0000 (15:49 +0400)]
[IPV6]: Do not change protocol for raw IPv6 sockets.

It is not allowed to change the underlying protocol for
   int fd = socket(PF_INET6, SOCK_RAW, IPPROTO_UDP);

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
15 years ago[IPV6] NETNS: Handle ancillary data in appropriate namespace.
YOSHIFUJI Hideaki [Wed, 4 Jun 2008 04:02:49 +0000 (13:02 +0900)]
[IPV6] NETNS: Handle ancillary data in appropriate namespace.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
15 years ago[IPV6]: Check outgoing interface even if source address is unspecified.
YOSHIFUJI Hideaki [Wed, 4 Jun 2008 04:01:37 +0000 (13:01 +0900)]
[IPV6]: Check outgoing interface even if source address is unspecified.

The outgoing interface index (ipi6_ifindex) in IPV6_PKTINFO
ancillary data is not checked if the source address (ipi6_addr)
is unspecified.  If ipi6_ifindex refers to a nonexistent interface,
the call should fail.

Based on patch from Shan Wei <shanwei@cn.fujitsu.com> and
Brian Haley <brian.haley@hp.com>.

Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
15 years ago[IPV6]: Fix the data length of get destination options with short length
Yang Hongyang [Wed, 28 May 2008 08:27:28 +0000 (16:27 +0800)]
[IPV6]: Fix the data length of get destination options with short length

If the destination options are requested with a buffer length that is
too small for the option, getsockopt() still returns the real length of
the option, which is larger than the buffer space.
This is because ipv6_getsockopt_sticky() returns the real length of
the option.

This patch fixes this problem.
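
The intended behaviour can be sketched in user-space C as follows. This is only an illustration under assumptions: copy_option() is a hypothetical helper, not the kernel's ipv6_getsockopt_sticky(), and it simply clamps the copied and reported length to the caller's buffer size.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy at most buflen bytes of the option into buf
 * and return the number of bytes actually copied, so the reported
 * length can never exceed the space the caller provided.
 */
static size_t copy_option(void *buf, size_t buflen,
                          const void *opt, size_t optlen)
{
    size_t n = buflen < optlen ? buflen : optlen;

    memcpy(buf, opt, n);
    return n;
}
```

With a 24 byte option and an 8 byte buffer, this sketch reports 8 rather than 24.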

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
15 years ago[IPV6]: Fix the return value of get destination options with NULL data pointer
Yang Hongyang [Wed, 28 May 2008 08:23:47 +0000 (16:23 +0800)]
[IPV6]: Fix the return value of get destination options with NULL data pointer

If we pass a NULL data buffer to getsockopt(), it will return 0,
and the option length is set to -EFAULT:
    getsockopt(sk, IPPROTO_IPV6, IPV6_DSTOPTS, NULL, &len);

This is because ipv6_getsockopt_sticky() will return -EFAULT or
-EINVAL if some error occurs.

This patch fixes this problem.

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
15 years ago[IPV6] ADDRCONF: Allow longer lifetime on 64bit archs.
YOSHIFUJI Hideaki [Tue, 27 May 2008 08:37:49 +0000 (17:37 +0900)]
[IPV6] ADDRCONF: Allow longer lifetime on 64bit archs.

- Allow longer lifetimes (>= 0x7fffffff/HZ) on 64bit archs
  by using unsigned long.
- Shadow this arithmetic overflow workaround by introducing
  helper functions: addrconf_timeout_fixup() and
  addrconf_finite_timeout().
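
A rough user-space sketch of what such helpers might look like is given below. The names sketch_timeout_fixup() and sketch_finite_timeout() are hypothetical, and the logic is an assumption based on the description above (clamp a 32 bit lifetime so that multiplying it by a non-zero unit such as HZ cannot overflow a long); it is not the kernel's implementation.

```c
#include <limits.h>
#include <stdint.h>

/*
 * Hypothetical sketch: convert a 32-bit lifetime into a value that can
 * be multiplied by `unit` (assumed non-zero, e.g. HZ) without
 * overflowing a signed long.
 */
static unsigned long sketch_timeout_fixup(uint32_t timeout, unsigned int unit)
{
    /* 0xffffffff is treated as "infinite" and maps to all ones. */
    if (timeout == 0xffffffffu)
        return ~0UL;

    /* Clamp so that timeout * unit stays within LONG_MAX. */
    if ((unsigned long)timeout > (unsigned long)LONG_MAX / unit)
        return (unsigned long)LONG_MAX / unit;

    return timeout;
}

/* Non-zero when the (fixed-up) timeout is not the "infinite" marker. */
static int sketch_finite_timeout(unsigned long timeout)
{
    return timeout != ~0UL;
}
```

On a 64 bit long the clamp never triggers for 32 bit inputs, which is how longer lifetimes become representable there.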

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
15 years ago[IPV4] TUNNEL4: Fix incoming packet length check for inter-protocol tunnel.
YOSHIFUJI Hideaki [Fri, 30 May 2008 02:35:03 +0000 (11:35 +0900)]
[IPV4] TUNNEL4: Fix incoming packet length check for inter-protocol tunnel.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
15 years ago[IPV6] TUNNEL6: Fix incoming packet length check for inter-protocol tunnel.
Colin [Mon, 26 May 2008 16:04:43 +0000 (00:04 +0800)]
[IPV6] TUNNEL6: Fix incoming packet length check for inter-protocol tunnel.

I discovered a strange behavior in the [ipv4 in ipv6] tunnel. When the IPv6
tunnel payload is less than 40 (0x28) bytes, the packet can be sent to the
network and received on the physical interface, but is not seen on the IP
tunnel interface. No counter increases on the tunnel interface.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
15 years ago[IPV6] ADDRCONF: Check range of prefix length
Thomas Graf [Wed, 28 May 2008 14:54:22 +0000 (16:54 +0200)]
[IPV6] ADDRCONF: Check range of prefix length

As of now, the prefix length is not validated when adding or deleting
addresses. The value is passed directly into the inet6_ifaddr structure
and later passed on to memcmp() as length indicator which relies on
the value never to exceed 128 (bits).

Due to the missing check, the current code allows for any 8 bit
value to be passed on as prefix length while using the netlink
interface, and any 32 bit value while using the ioctl interface.
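
The kind of range check described here can be sketched as follows. check_prefix_len() and IPV6_ADDR_BITS are hypothetical names used for illustration only; the sketch assumes the check simply rejects anything above 128 bits with -EINVAL.

```c
#include <errno.h>

#define IPV6_ADDR_BITS 128 /* an IPv6 address is 128 bits long */

/*
 * Hypothetical validation helper: reject prefix lengths that cannot
 * describe a prefix of a 128-bit address. Taking an unsigned int means
 * a single comparison covers both the 8 bit and 32 bit input cases.
 */
static int check_prefix_len(unsigned int plen)
{
    if (plen > IPV6_ADDR_BITS)
        return -EINVAL;
    return 0;
}
```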

[Use unsigned int instead to generate better code - yoshfuji]

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
15 years ago[IPV6] UDP: Possible dst leak in udpv6_sendmsg.
YOSHIFUJI Hideaki [Tue, 3 Jun 2008 16:30:25 +0000 (01:30 +0900)]
[IPV6] UDP: Possible dst leak in udpv6_sendmsg.

ip6_sk_dst_lookup returns a held dst entry. It should be released
on all paths beyond this point. Add the missing release when up->pending
is set.

Bug report and initial patch by Denis V. Lunev <den@openvz.org>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Acked-by: Denis V. Lunev <den@openvz.org>
15 years ago[SCTP]: Fix NULL dereference of asoc.
YOSHIFUJI Hideaki [Thu, 29 May 2008 10:55:05 +0000 (19:55 +0900)]
[SCTP]: Fix NULL dereference of asoc.

Commit 7cbca67c073263c179f605bdbbdc565ab29d801d ("[IPV6]: Support
Source Address Selection API (RFC5014)") introduced NULL dereference
of asoc to sctp_v6_get_saddr in net/sctp/ipv6.c.
Pointed out by Johann Felix Soden <johfel@users.sourceforge.net>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
15 years agoMerge branch 'davem-fixes' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik...
David S. Miller [Wed, 4 Jun 2008 18:50:00 +0000 (11:50 -0700)]
Merge branch 'davem-fixes' of /linux/kernel/git/jgarzik/netdev-2.6

15 years agotcp: Fix inconsistency source (CA_Open only when !tcp_left_out(tp))
Ilpo Järvinen [Wed, 4 Jun 2008 18:34:22 +0000 (11:34 -0700)]
tcp: Fix inconsistency source (CA_Open only when !tcp_left_out(tp))

It is possible that this skip path causes TCP to end up in an
invalid state where ca_state was left at CA_Open while some
segments already came into sacked_out. If next valid ACK doesn't
contain new SACK information TCP fails to enter into
tcp_fastretrans_alert(). Thus at least high_seq is set
incorrectly to a too high seqno because some new data segments
could be sent in between (and also, limited transmit is not
being correctly invoked there). Reordering in both directions
can easily cause this situation to occur.

I guess we would want to use tcp_moderate_cwnd(tp) there as well
as it may be possible to use this to trigger oversized burst to
network by sending an old ACK with huge amount of SACK info, but
I'm a bit unsure about its effects (mainly to FlightSize), so to
be on the safe side I just currently fixed it minimally to keep
TCP's state consistent (obviously, such nasty ACKs have been
possible thus far). Though it seems that FlightSize is already
underestimated by some amount, so probably on the long term we
might want to trigger recovery there too, if appropriate, to make
FlightSize calculation to resemble reality at the time when the
losses were discovered (but such a change scares me too much now
and requires some more thinking anyway how to do that as it
likely involves some code shuffling).

This bug was found by Brian Vowell while running my TCP debug
patch to find the cause of another TCP issue (fackets_out
miscount).

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agonetfilter: nf_conntrack_ipv6: fix inconsistent lock state in nf_ct_frag6_gather()
Jarek Poplawski [Wed, 4 Jun 2008 16:58:27 +0000 (09:58 -0700)]
netfilter: nf_conntrack_ipv6: fix inconsistent lock state in nf_ct_frag6_gather()

[   63.531438] =================================
[   63.531520] [ INFO: inconsistent lock state ]
[   63.531520] 2.6.26-rc4 #7
[   63.531520] ---------------------------------
[   63.531520] inconsistent {softirq-on-W} -> {in-softirq-W} usage.
[   63.531520] tcpsic6/3864 [HC0[0]:SC1[1]:HE1:SE0] takes:
[   63.531520]  (&q->lock#2){-+..}, at: [<c07175b0>] ipv6_frag_rcv+0xd0/0xbd0
[   63.531520] {softirq-on-W} state was registered at:
[   63.531520]   [<c0143bba>] __lock_acquire+0x3aa/0x1080
[   63.531520]   [<c0144906>] lock_acquire+0x76/0xa0
[   63.531520]   [<c07a8f0b>] _spin_lock+0x2b/0x40
[   63.531520]   [<c0727636>] nf_ct_frag6_gather+0x3f6/0x910
 ...

According to this and another similar lockdep report, inet_fragment
locks are taken from nf_ct_frag6_gather() with softirqs enabled, but
these locks are mainly used in softirq context, so disabling BHs is
necessary.

Reported-and-tested-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agonetfilter: xt_connlimit: fix accouning when receive RST packet in ESTABLISHED state
Dong Wei [Wed, 4 Jun 2008 16:57:51 +0000 (09:57 -0700)]
netfilter: xt_connlimit: fix accouning when receive RST packet in ESTABLISHED state

In the xt_connlimit match module, the counter of an IP is decreased when
a TCP packet goes through the chain with ip_conntrack state TW.
This works when the server and client close the socket normally
with a FIN packet. But when the client/server closes the socket with an RST
packet (using so_linger), the counter for this connection still exists.
The following patch, which is based on linux-2.6.25.4, fixes it.

Signed-off-by: Dong Wei <dwei.zh@gmail.com>
Acked-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoroute: Remove unused ifa_anycast field
Thomas Graf [Tue, 3 Jun 2008 23:37:33 +0000 (16:37 -0700)]
route: Remove unused ifa_anycast field

The field was supposed to allow the creation of an anycast route by
assigning an anycast address to an address prefix. It was never
implemented so this field is unused and serves no purpose. Remove it.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agonetlink: Improve returned error codes
Thomas Graf [Tue, 3 Jun 2008 23:36:54 +0000 (16:36 -0700)]
netlink: Improve returned error codes

Make nlmsg_trim(), nlmsg_cancel(), genlmsg_cancel(), and
nla_nest_cancel() void functions.

Return -EMSGSIZE instead of -1 if the provided message buffer is not
big enough.
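
As a generic illustration of this error convention (not the netlink API itself), a buffer-append helper can report an undersized buffer with -EMSGSIZE rather than a bare -1. struct msg_buf and msg_put() below are hypothetical names invented for this sketch.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-size message buffer used only for this sketch. */
struct msg_buf {
    unsigned char data[64];
    size_t len; /* number of bytes already used */
};

/*
 * Append n bytes to the message. Returns 0 on success or -EMSGSIZE if
 * the remaining space is too small, so callers can tell this failure
 * apart from other errors.
 */
static int msg_put(struct msg_buf *m, const void *src, size_t n)
{
    if (n > sizeof(m->data) - m->len)
        return -EMSGSIZE;

    memcpy(m->data + m->len, src, n);
    m->len += n;
    return 0;
}
```

A caller can then propagate the return value unchanged instead of translating -1 into a meaningful errno at every call site.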

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 years agoroute: Mark unused routing attributes as such
Thomas Graf [Tue, 3 Jun 2008 23:36:27 +0000 (16:36 -0700)]
route: Mark unused routing attributes as such

Also removes an unused policy entry for an attribute which is
only used in kernel->user direction.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>