The issue I have: the network protocol requires the payload length to be specified in the header. The payload can vary in size, and its length is not easy to compute from a particular request structure. So my idea is to write the data to an array list, write its length to the header, and then use the real writer to push the header and the array list.
std.Io.Writer.Allocating
If your protocol has a max packet size you could use the std.Io.Writer.fixed backed by an array
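
For illustration, here is a rough sketch of the fixed-buffer variant, assuming the Zig 0.15.x std.Io.Writer API. The 2-byte big-endian length header, max_packet_size and encodePacket are made up for the example and are not from the poster's protocol. The payload is encoded after a reserved header slot, and the length is backfilled once it is known:

```zig
const std = @import("std");

// Hypothetical protocol limit, chosen only for this example.
const max_packet_size = 256;

// Encodes `payload` into `out` behind a 2-byte big-endian length header
// (a made-up header layout) and returns the bytes that make up the packet.
fn encodePacket(out: *[max_packet_size]u8, payload: []const u8) ![]const u8 {
    // Skip the header slot and let a fixed writer fill the rest.
    var w = std.Io.Writer.fixed(out[2..]);
    try w.writeAll(payload); // returns an error if the payload does not fit

    // The payload length is now known, so backfill the header.
    const len = w.buffered().len;
    std.mem.writeInt(u16, out[0..2], @intCast(len), .big);
    return out[0 .. 2 + len];
}

test "encodePacket backfills the length" {
    var buf: [max_packet_size]u8 = undefined;
    const pkt = try encodePacket(&buf, "abc");
    try std.testing.expectEqual(@as(usize, 5), pkt.len);
}
```

No allocator is involved here, but it only works when the protocol really does cap the packet size.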
It doesn’t. Currently I have my own ugly writer interface (not compatible with std.io) with counting and fixed implementations, and I do the encoding in two passes: the first writes nothing but counts the final length, the second does the actual write into a buffer allocated with the required capacity.
If I’m reading this correctly, do you mean you want something like
var data = std.Io.Writer.Allocating.init(gpa);
Then, once you have filled data with your unknown amount of data, use data.written().len and data.written() to get your size and payload, and pack them into a second writer which is actually writing onto the wire? If so, another thought is that your data, and thus its allocator, could potentially be reused from payload to payload, and perhaps you could reuse allocated space that way (e.g., with an arena).
(i.e., using data.clearRetainingCapacity(), and so on)
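
Spelled out as a minimal sketch, assuming the Zig 0.15.x std.Io.Writer API: sendFrame, the 4-byte little-endian length prefix, and the fixed writer standing in for the wire are all invented for the example, not taken from the real protocol.

```zig
const std = @import("std");

// Hypothetical framing: 4-byte little-endian payload length, then the payload.
// `scratch` is reused across calls, so it only allocates when a payload is
// larger than anything sent before.
fn sendFrame(
    scratch: *std.Io.Writer.Allocating,
    wire: *std.Io.Writer,
    payload_parts: []const []const u8,
) !void {
    scratch.clearRetainingCapacity(); // drop old contents, keep the memory
    for (payload_parts) |part| try scratch.writer.writeAll(part);

    const payload = scratch.written();
    try wire.writeInt(u32, @intCast(payload.len), .little); // header
    try wire.writeAll(payload); // body
    // The caller decides when to flush `wire`.
}

test "sendFrame writes a length prefix" {
    var scratch = std.Io.Writer.Allocating.init(std.testing.allocator);
    defer scratch.deinit();

    // A fixed writer stands in for the real transport in this example.
    var wire_buf: [64]u8 = undefined;
    var wire = std.Io.Writer.fixed(&wire_buf);

    try sendFrame(&scratch, &wire, &.{ "hello", ", ", "world" });
    try std.testing.expectEqualSlices(u8, "hello, world", wire.buffered()[4..]);
}
```

The encode step is a single pass; the only extra cost is copying the finished payload from the scratch buffer to the wire.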
var aw: std.Io.Writer.Allocating = .fromArrayList(gpa, self);
... = aw.toArrayList();
also exists
Omg. I’m blind. Thank you!
“If so, another thought is that your data, and thus its allocator, could be reused, potentially, from payload to payload”
Exactly. That’s the idea. The array list just grows to the max sent message size, but for smaller payloads it doesn’t need a new allocation.
You don’t need an arena; both the allocating writer and the array list have a clearRetainingCapacity function to do exactly that.
Huh, I’m suddenly wondering if along with std.Io.Writer.Allocating we could create another new implementation that just measures it, like a “std.Io.Writer.Measuring”.
Not that I’ve ever needed to calculate my allocations that exactly, I can’t think of pretty much anything where it’d be worth it in my code but it’s an interesting idea.
I’m not sure I’m following. written().len will always report the bytes written, even if no allocation was necessary (if the allocated space was already sufficiently large). So, you could reserve plenty of space, if you knew how much you needed, and never have an allocating write(). But the OP didn’t know how much memory was going to be needed, so the allocating writer made sense. The length was useful so he could pack the payload size in front of the payload, for the receiving end.
I can confirm, the end result is perfect. It’s a single-pass encode and has zero (amortized) allocations. There is no need for a separate counting writer. Thank you again for the help!
Annotated real code:
    // self.write_buf is an allocating writer
    self.write_buf.clearRetainingCapacity(); // reset
    try request_value.encode(&self.write_buf.writer); // encode payload
    const len = 4 + self.write_buf.written().len; // use payload length
    const pad = wire.pad4(len);

    // write header, `writer` is a real writer
    try writer.writeByte(opcode);
    if (Request.extension == null) {
        try writer.writeByte(request_value.headerByte1());
    } else {
        try writer.writeByte(Request.opcode);
    }
    try writer.writeInt(u16, @intCast((len + pad) / 4), .native); // write packet length

    // write payload
    try writer.writeAll(self.write_buf.written());
    try writer.splatByteAll(0, pad);
It also has a nice symmetry with the read side, which uses std.Io.Reader.appendExact (also with an ArrayList). It could later be changed to a limited reader, though.
Though this might be considered a good case for the question, “do I really need to?” (since your last operations are the likes of writeAll(), which calls drain repeatedly…). Others may weigh in with more sage advice, but, I think:
- “The default flush implementation calls drain repeatedly until end is zero, however it is legal for implementations to manage end differently.” (from the docs). Since it is possible for implementations to do something different from the default, it’s probably good to call flush() regardless.
- You might change your code, e.g., by putting one last writeByte() in, and you’d have to make this change then. I think: “might as well follow the pattern; get used to placing (and looking for) flush() at the tail of a write sequence for which you really are ‘finished’.”
There is a conditional flush
zix11/src/protocol.zig at 653afe4c5cc7a4978fbfd53f851bc502ee75ac01 · baverman/zix11 · GitHub
The protocol in general supports batched requests, and it’s up to the client to decide how to pack them into an actual transport frame.
It’s a silly idea and not a serious suggestion, and I’m not even sure it would work, but I’ll spell it out more clearly. Basically, in response to the idea above of “first figure out the exact total length of the allocation (without allocating anything), then allocate it once and write it”, I’m imagining a new “Writer” implementation that is compatible with the interface but does no writes at all, only length measurement. So you could write to that fake length-calculating one, collect the length measurement, allocate once, and then write into that fixed buffer.
std.Io.Writer.Discarding tracks how much was written.
But using it this way is discouraged: writing often involves expensive computation to produce the data to be written, so that pattern would duplicate that work and add runtime vtable calls on top of it.
Instead, it is usually better to either calculate the size upfront, with dedicated code (not a writer), or dynamically grow.
The above is a good default, but it is always best to benchmark multiple solutions if you need maximum performance.
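
For completeness, this is roughly what the measure-then-write pattern looks like, as a sketch assuming the Zig 0.15.x API (Discarding.init taking a buffer, and fullCount() reporting the total bytes written). encode and encodeExact are hypothetical helpers invented for the example. Note that encode runs twice, which is exactly the duplicated work mentioned above:

```zig
const std = @import("std");

// Hypothetical encoder: a 4-byte little-endian id followed by the name bytes.
fn encode(w: *std.Io.Writer, id: u32, name: []const u8) !void {
    try w.writeInt(u32, id, .little);
    try w.writeAll(name);
}

// Two-pass encode: measure with a discarding writer, allocate exactly once,
// then encode again into the exact-size buffer. Caller owns the result.
fn encodeExact(gpa: std.mem.Allocator, id: u32, name: []const u8) ![]u8 {
    // Pass 1: nothing is kept, only the byte count.
    var count_buf: [64]u8 = undefined;
    var counting = std.Io.Writer.Discarding.init(&count_buf);
    try encode(&counting.writer, id, name);
    const len: usize = @intCast(counting.fullCount());

    // Pass 2: write for real into a buffer of exactly `len` bytes.
    const out = try gpa.alloc(u8, len);
    errdefer gpa.free(out);
    var w = std.Io.Writer.fixed(out);
    try encode(&w, id, name);
    return out;
}

test "encodeExact allocates exactly the encoded size" {
    const bytes = try encodeExact(std.testing.allocator, 7, "zig");
    defer std.testing.allocator.free(bytes);
    try std.testing.expectEqual(@as(usize, 7), bytes.len);
}
```

Whether the exact-size allocation is worth the second pass depends on how expensive encode is, so this is something to benchmark rather than assume.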
That makes perfect sense, thanks!