
Petr Tesařík
@ptesarik@infosec.exchange

@vbabka@mastodon.social @amonakov@mastodon.gamedev.place @ljs@mastodonapp.uk @mpdesouza@floss.social @gnutools@fosstodon.org Those were easy times…
Now imagine an Ice Lake with a micro-op queue fed by the IDQ (Instruction Decode Queue), which offers three paths: DSB (decoded Icache), MITE (legacy decode pipeline) and MS (microcode sequencer).

Alexander Monakov
@amonakov@mastodon.gamedev.place

@ptesarik@infosec.exchange @vbabka@mastodon.social @ljs@mastodonapp.uk @mpdesouza@floss.social @gnutools@fosstodon.org I think I'd rather work with these than Pentium 4's "trace cache"


Petr Tesařík
@ptesarik@infosec.exchange

@amonakov@mastodon.gamedev.place @vbabka@mastodon.social @ljs@mastodonapp.uk @mpdesouza@floss.social @gnutools@fosstodon.org I can feel the pain. Then again, I was still missing an important bit in my Ice Lake case. Most likely it's not L1I aliasing but BPU (Branch Prediction Unit) aliasing. Although the statistics counters don't show any relevant difference between the fast and slow builds, it seems that a single mispredicted branch sends the instruction decoder down the wrong path, incurs a penalty for switching from the DSB to MITE, and evicts useful information from the L1 I-cache. Unfortunately, I'm unable to confirm this hypothesis, because more tracing also ruins the equilibrium, of course. But if true, it's insane that a single case of bad speculation can cost over 4% in a microbenchmark.
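
[For readers following along: one way to probe the hypothesis above is to compare the front-end delivery paths and misprediction counts between the fast and slow builds with Linux perf. This is only a sketch; `./microbench` is a hypothetical binary standing in for the benchmark, and the event names (`idq.dsb_uops`, `idq.mite_uops`, `idq.ms_uops`, `dsb2mite_switches.penalty_cycles`) are the Intel/Linux-perf spellings whose availability depends on the kernel and CPU model.]

```shell
# Events counting uops delivered by each IDQ path: DSB (decoded icache),
# MITE (legacy decode pipeline) and MS (microcode sequencer).
EVENTS=idq.dsb_uops,idq.mite_uops,idq.ms_uops
# Add the DSB->MITE switch penalty and generic branch mispredictions.
EVENTS=$EVENTS,dsb2mite_switches.penalty_cycles,branch-misses

# Print the command rather than executing it, so the sketch also works
# on machines without perf; drop the echo to actually measure.
echo perf stat -e "$EVENTS" -- ./microbench
```

Note the caveat from the post still applies: perf stat in counting mode is relatively unobtrusive, but heavier tracing (e.g. sampling or ptrace-based tools) can itself perturb the code layout and timing enough to destroy the effect being measured.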