High performance often means breaking away from the abstraction.
At my day job we have a server whose sole job is to relay TCP connections from node to node (direct connections are not possible: it’s a mobile network, and discussions with the operator have indicated that allowing them would cause more trouble than it’s worth). It’s written in Zig, but we drop down to native FreeBSD interfaces rather than using std.Io in order to do zero-copy I/O with SO_SPLICE (potentially even at the NIC level with the right card).
I don’t think it’s fair to expect std.Io to max out performance on every OS (even ignoring the questions of latency vs throughput and what “max performance” means).