Where do packets fit into the reader / writer chain?

Background

I have been working on an implementation of a layer 2-ish protocol. I don’t currently use readers or writers across function boundaries. However, with the new writer interface coming, I think it would be convenient to use and might improve the performance of my existing implementation, which does some excessive copying when moving bounded arrays around.

The problem is that my protocol is not a streaming protocol. It is a packet protocol. Here is how it works:

I use Linux raw sockets. The raw socket interface on Linux is roughly:

fn send(bytes: []const u8) !void {
    // NIC sends an ethernet frame with contents `bytes`
    // NIC cannot partially send, it either sends all of it or none of it.
    // always results in a single ethernet frame sent.
    // The frame is delimited by the physical layer. Ethernet frames have a start and an end
    // that can be determined without looking at the data.
}

fn recv(buffer: []u8) !usize {
    // NIC writes to `buffer`. length written is returned as usize
    // NIC cannot partially read, it either reads a full ethernet frame or nothing.
    // Read bytes are always a full ethernet frame, delimited by the physical layer.
}

Let’s dream up an example layer-2 protocol:

  1. There are packets. Packets have meaning, so you are not allowed to concatenate two packets; they must stay delimited. This means you must call send once per packet.
  2. The first byte is the number of “chunks” in the packet, as a u8.
  3. Following the first byte are chunks. Each chunk is 8 bytes long.
  4. Maximum packet length is 1497 bytes (1 + 187 * 8)

Here is a representation of a packet as a struct:

const Packet = struct {
    n_chunks: u8,
    chunks: [][8]u8,
};
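
To make the wire layout concrete, here is a minimal sketch of serializing and validating such a packet. The names `encode`, `chunkCount`, `chunk_len`, `max_chunks` and `max_packet_len` are hypothetical, and the sketch assumes the received bytes contain exactly one packet with no link-layer padding.

```zig
const std = @import("std");

const chunk_len = 8;
const max_chunks = 187;
const max_packet_len = 1 + max_chunks * chunk_len;

/// Serializes `chunks` into `out` and returns the slice holding the packet.
/// The returned slice is what would be passed to a single send() call.
fn encode(chunks: []const [chunk_len]u8, out: []u8) error{ TooManyChunks, BufferTooSmall }![]u8 {
    if (chunks.len > max_chunks) return error.TooManyChunks;
    const len = 1 + chunks.len * chunk_len;
    if (out.len < len) return error.BufferTooSmall;
    // First byte: number of chunks.
    out[0] = @intCast(chunks.len);
    // Remaining bytes: the chunks, back to back.
    for (chunks, 0..) |chunk, i| {
        const start = 1 + i * chunk_len;
        @memcpy(out[start..][0..chunk_len], &chunk);
    }
    return out[0..len];
}

/// Checks that `packet` is well formed and returns its chunk count.
/// Assumption: `packet` holds exactly one packet and nothing else.
fn chunkCount(packet: []const u8) error{Malformed}!usize {
    if (packet.len == 0) return error.Malformed;
    const n: usize = packet[0];
    if (n > max_chunks or packet.len != 1 + n * chunk_len) return error.Malformed;
    return n;
}
```

Because the length is known from the first byte, the sender can build the whole packet in one buffer and hand it to send() in a single call.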

Where do packets fit into the reader / writer chain?

Option 1: BufferedWriter -like thing

I write a struct called RawSocketWriter that wraps a raw socket and exposes the Writer interface:

const RawSocketWriter = struct {
    socket: LinuxRawSocket,
    // sized to the protocol's maximum packet length (1 + 187 * 8 bytes)
    buffer: [1 + 187 * 8]u8,

    fn writer(self: *RawSocketWriter) Writer {
        // return a Writer whose vtable contains
        // a function pointer to write()
    }

    fn write(self: *RawSocketWriter, bytes: []const u8) !void {
        // inspect / parse the data
        // and call `send()` when we detect the end
        // of a packet.
        // this is only possible because the protocol
        // is length-prefixed.
        // if we are not at the end of a packet yet,
        // store the bytes in the buffer
    }
};

This is similar to a buffered writer: it parses the data written to it and calls send() whenever it detects that the end of a packet has been reached.

One downside of this interface is that I have to re-parse the data being sent into my writer, which is a waste of CPU.
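
A rough sketch of what the parsing in Option 1 could look like, independent of any particular Writer interface. `Framer`, `ctx` and `sendFn` are hypothetical names; `sendFn` stands in for the raw socket’s send(), and the constants 187 and 8 come from the example protocol above.

```zig
const std = @import("std");

const max_packet_len = 1 + 187 * 8;

/// Accumulates written bytes and hands each complete packet to `sendFn`
/// exactly once, using the length prefix to find packet boundaries.
const Framer = struct {
    buffer: [max_packet_len]u8 = undefined,
    len: usize = 0, // bytes of the current packet stored so far
    ctx: *anyopaque,
    sendFn: *const fn (ctx: *anyopaque, packet: []const u8) anyerror!void,

    fn write(self: *Framer, bytes: []const u8) !void {
        var rest = bytes;
        while (rest.len > 0) {
            if (self.len == 0) {
                // First byte of a new packet: the chunk count.
                if (rest[0] > 187) return error.TooManyChunks;
                self.buffer[0] = rest[0];
                self.len = 1;
                rest = rest[1..];
            }
            // Total packet size is known as soon as the first byte is stored.
            const total = 1 + @as(usize, self.buffer[0]) * 8;
            const n = @min(total - self.len, rest.len);
            @memcpy(self.buffer[self.len..][0..n], rest[0..n]);
            self.len += n;
            rest = rest[n..];
            if (self.len == total) {
                // Packet complete: one send call per packet.
                try self.sendFn(self.ctx, self.buffer[0..total]);
                self.len = 0;
            }
        }
    }
};
```

The copy into `buffer` and the inspection of the first byte are exactly the extra work described above: the caller already knew where the packet ended, and the framer has to rediscover it.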

Option 2: Expose writer but you must write() complete packets.

Expose a writer, but you must call write on the exposed writer interface with complete packets.

This defeats the purpose of writers and would just force callers to put a buffered writer in front of this one.

Option 3: Continue not using writers

I currently only use writers to build up individual packets, then I put the packets into linked lists / queues for sending. I can continue doing this but I think there is excessive copying going on.

What do you think?

Some unanswered bits:

  1. I will likely have multiple threads constructing / sending / receiving packets.

In the development version, std.Io.Writer has been reworked to put a buffer in the interface. As I understand it, this use case would be solved by 1) making sure you have a large enough buffer and 2) only calling into the write function once an entire packet is ready.

Single threaded, this is not a problem at all.

Multithreaded, either guard the writer/reader with a mutex or:

You can get away with not guarding them if you only read/write complete packets, as long as the buffers can hold them whole, or they fit into a single Ethernet frame.

The raw writer/reader doesn’t need an embedded buffer, because the new interface already contains buffers. It just needs to manage the size of reads/writes appropriately.

The reader would need a buffer large enough to contain a whole Ethernet frame’s data, or a whole multiple of that.
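
For the mutex-guarded variant mentioned above, a minimal sketch might look like the following. `LockedSender` and `worker` are hypothetical names, the `sent` counter is only a stand-in for the raw socket, and the sketch assumes `std.Thread.Mutex` is available in the Zig version being used.

```zig
const std = @import("std");

/// Serializes access to a packet sink so that several threads can each
/// submit complete packets without interleaving.
const LockedSender = struct {
    mutex: std.Thread.Mutex = .{},
    sent: usize = 0, // stand-in for the raw socket

    fn sendPacket(self: *LockedSender, packet: []const u8) void {
        self.mutex.lock();
        defer self.mutex.unlock();
        // A real implementation would call the raw socket's send() here,
        // once per complete packet.
        _ = packet;
        self.sent += 1;
    }
};

/// Example producer: submits 1000 zero-chunk packets.
fn worker(sender: *LockedSender) void {
    const packet = [_]u8{0};
    var i: usize = 0;
    while (i < 1000) : (i += 1) sender.sendPacket(&packet);
}
```

Each thread builds its packet in its own buffer and only takes the lock for the duration of the send, so the critical section stays short.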