std.posix.sendmsg handling ETIMEDOUT (errno 110)

Got a weird one here …

I have a server backend that accepts connections, then intermittently writes to them. When a client disconnects, this is normally picked up on the next write, which returns a write error.

I haven't seen an issue with this until now.

I just deployed to an arm64 Linux server (Debian 13), and under load, with clients disconnecting at random, after a while it goes into a crash loop trying to write.

It's reporting UnexpectedErrno (110) in std.posix.sendmsg(), over and over, consuming all CPU.

110 == ETIMEDOUT

It happens after a while - I can’t reproduce the error on demand, just takes a while and then it goes into this death loop.

So I have added a case for ETIMEDOUT in std.posix.sendmsg(), added it to the SendMsgError error set, and rebuilt the service. I'm running it again for a while to try and get it to crash again.

Question then - is it worth putting up a teeny patch to handle timeout errors on sendmsg, since at least this combo of arm64 + Linux on this host seems to be reporting that as a write error?

We discussed a similar issue some days ago.
It is possibly worth adding your request as input to #6389. The fix will be done in this area, so at least the error will be visible, and it may become possible to handle it in code.

I had a similar situation with connect(). Because the fix will be done in 0.15.+ but I am using 0.14.1, I added a fixed implementation to my own code.

I can guess (because of the timeout) that you use blocking sockets.
In that case, a temporary solution is to replace sendmsg() with a loop of send() (to clear the disconnect errors).
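
A minimal sketch of that send() loop, for illustration only. It assumes a blocking socket on Linux and the 0.14/0.15-era std.posix API; `sendAll` and `writeOrDrop` are hypothetical names, not anything from std. The point is that any error from send(), named or unexpected, ends the write and closes the connection instead of retrying it:

```zig
const std = @import("std");
const posix = std.posix;

/// Hypothetical helper: push all of `data` through `sock` with repeated send() calls.
/// Any error (named or error.Unexpected) is returned to the caller.
fn sendAll(sock: posix.socket_t, data: []const u8) !void {
    var sent: usize = 0;
    while (sent < data.len) {
        // MSG.NOSIGNAL: report a closed peer as an error instead of raising SIGPIPE.
        sent += try posix.send(sock, data[sent..], posix.MSG.NOSIGNAL);
    }
}

/// Hypothetical caller: returns false once the connection has been dropped.
fn writeOrDrop(sock: posix.socket_t, data: []const u8) bool {
    sendAll(sock, data) catch |err| {
        std.log.warn("dropping client: {s}", .{@errorName(err)});
        posix.close(sock);
        return false;
    };
    return true;
}
```

Treating every failure as fatal for that one connection is a deliberate simplification here; a non-blocking server would want to handle error.WouldBlock separately.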

Excellent reference… I will tag along with that GitHub issue and try and get it resolved through that.

Thx

Having run into this issue myself, I found that this was the more “correct” way to clear the error from the socket:

// SO_ERROR yields a C int; reading it also resets the socket's pending error.
var err_buf: [@sizeOf(c_int)]u8 = undefined;
std.posix.getsockopt(
    send_sock,
    std.os.linux.SOL.SOCKET,
    std.os.linux.SO.ERROR,
    err_buf[0..],
) catch |err| {
    log.err("Could not clear Socket Error: {t}.", .{err});
    // Handle the socket error not clearing (which shouldn't be able to happen to my knowledge)
};

There was a potentially unwanted side effect from using send() again that’s escaping me at the moment. The general idea is the same though!

Looks like we had a similar experience with this part of the std code.
I used getsockoptError() and it immediately failed for an already connected socket:

.ISCONN => unreachable, // The socket is already connected.

So using getsockopt() directly is the better idea.
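
For reference, a rough sketch of what the direct call can look like when you also want the error code back rather than just clearing it. This assumes Linux and the 0.14/0.15-era std.posix API; `takeSocketError` and `hasTimedOut` are hypothetical helper names:

```zig
const std = @import("std");
const posix = std.posix;

/// Hypothetical helper: read the pending SO_ERROR value for `sock`.
/// Reading SO_ERROR also resets it, so the raw errno value (0 = no error)
/// is handed back to the caller instead of being mapped to a Zig error.
fn takeSocketError(sock: posix.socket_t) !c_int {
    var buf: [@sizeOf(c_int)]u8 = undefined;
    try posix.getsockopt(sock, posix.SOL.SOCKET, posix.SO.ERROR, &buf);
    return std.mem.bytesToValue(c_int, &buf);
}

/// Hypothetical helper: true if the pending error was ETIMEDOUT (110 on Linux).
fn hasTimedOut(sock: posix.socket_t) !bool {
    const code = try takeSocketError(sock);
    return code == @intFromEnum(posix.E.TIMEDOUT);
}
```

Because the raw value is returned, no errno hits an `unreachable` branch here; the caller decides what each code means.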
