In several places, this snippet is used when removing neigh entries:
list_del(&neigh->list);
ipoib_neigh_free(neigh);
The list_del() removes neigh from the associated struct ipoib_path, while
ipoib_neigh_free() removes neigh from the device's neigh entry lookup
table. Both of these operations are protected by the priv->lock
spinlock. The table however is also protected via RCU, and so naturally
the lock is not held when doing reads.
This leads to a race condition, in which a thread may successfully look
up a neigh entry that has already been deleted from neigh->list. Since
the previous deletion will have marked the entry with poison, a second
list_del() on the object will cause a panic:
We move the list_del() into ipoib_neigh_free(), so that deletion happens
only once, after the entry has been successfully removed from the lookup
table. This same behavior is already used in ipoib_del_neighs_by_gid()
and __ipoib_reap_neigh().
Signed-off-by: Jim Foraker <foraker1@llnl.gov> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Jack Wang <jinpu.wang@profitbricks.com> Reviewed-by: Shlomo Pongratz <shlomop@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>