@ptesarik@infosec.exchange
@amonakov@mastodon.gamedev.place Interesting. Indeed, on second thought, my explanation doesn't make much sense. But the observed reality is that I can reliably get netperf throughput >= 1850 Mbps with CONFIG_RETHUNK=y but <= 1840 Mbps without (and with no other changes to the setup). All with a tiny (64-byte) buffer size, so an extremely syscall-heavy workload.
EDIT: Obviously, the memory layout also changes, but I have checked that L1I cache misses are comparable (and approx. 0.1%) this time.
@ptesarik@infosec.exchange
@amonakov@mastodon.gamedev.place But are you 100% certain that the BPU cannot predict never-before-seen unconditional jumps?