From: Andrew Gallatin Date: Wed, 31 Oct 2007 21:40:06 +0000 (-0400) Subject: Fix myri10ge NAPI oops & warnings X-Git-Tag: v2.6.24-rc2~55^2 X-Git-Url: http://git.openpandora.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=c956a24018819bd903fad0cd275a63c089cdba53;p=pandora-kernel.git Fix myri10ge NAPI oops & warnings When testing the myri10ge driver with 2.6.24-rc1, I found that the machine crashed under heavy load: Unable to handle kernel paging request at 0000000000100108 RIP: [] net_rx_action+0x11b/0x184 The address corresponds to the list_move_tail() in netif_rx_complete(): if (unlikely(work == weight)) list_move_tail(&n->poll_list, list); Eventually, I traced the crashes to calling netif_rx_complete() with work_done == budget. From looking at other drivers, it appears that one should only call netif_rx_complete() when work_done < budget. To fix it, I changed the test in myri10ge_poll() so that it refers to to work_done rather than looking at the rx ring status. If work_done is < budget, then that implies we have no more packets to process. Any races will be resolved by the NIC when the write to irq_claim is made. In myri10ge_clean_rx_done(), if we ever exceeded our budget, it would report a work_done one larger than was acutally done. This is because the increment was done in the conditional, so work_done would be incremented regardless of whether or not the test passed or failed. This would lead to the WARN_ON_ONCE(work > weight); warning in net_rx_action triggering. I've moved the increment of work_done inside the loop. Note that this would only be a problem when we had exceeded our budget. Signed off by: Andrew Gallatin Andrew Gallatin Myricom Inc Signed-off-by: Jeff Garzik --- Reading git-diff-tree failed