of logical flows. Packets for each flow are steered to a separate receive
queue, which in turn can be processed by separate CPUs. This mechanism is
generally known as “Receive-side Scaling” (RSS). The goal of RSS and
-the other scaling techniques to increase performance uniformly.
+the other scaling techniques is to increase performance uniformly.
Multi-queue distribution can also be used for traffic prioritization, but
that is not the focus of these techniques.
an IRQ may be handled on any CPU. Because a non-negligible part of packet
processing takes place in receive interrupt handling, it is advantageous
to spread receive interrupts between CPUs. To manually adjust the IRQ
-affinity of each interrupt see Documentation/IRQ-affinity. Some systems
+affinity of each interrupt see Documentation/IRQ-affinity.txt. Some systems
will be running irqbalance, a daemon that dynamically optimizes IRQ
assignments and as a result may override any manual settings.
same CPU. Indeed, with many flows and few CPUs, it is very likely that
a single application thread handles flows with many different flow hashes.
-rps_sock_table is a global flow table that contains the *desired* CPU for
-flows: the CPU that is currently processing the flow in userspace. Each
-table value is a CPU index that is updated during calls to recvmsg and
-sendmsg (specifically, inet_recvmsg(), inet_sendmsg(), inet_sendpage()
+rps_sock_flow_table is a global flow table that contains the *desired* CPU
+for flows: the CPU that is currently processing the flow in userspace.
+Each table value is a CPU index that is updated during calls to recvmsg
+and sendmsg (specifically, inet_recvmsg(), inet_sendmsg(), inet_sendpage()
and tcp_splice_read()).
When the scheduler moves a thread to a new CPU while it has outstanding
The number of entries in the per-queue flow table are set through:
- /sys/class/net/<dev>/queues/tx-<n>/rps_flow_cnt
+ /sys/class/net/<dev>/queues/rx-<n>/rps_flow_cnt
== Suggested Configuration