I have been playing around with file reading and found I like using a FixedBufferAllocator over a simple read into a buffer, since the returned slice carries its own length rather than depending on the buffer it is held in, so I don’t have to figure out where the input ends in my buffer manually. But is there a speed penalty for constantly allocating and freeing with the FBA?
Standard file read:
const file = try std.fs.cwd().openFile("example_file.txt", .{ .mode = .read_only });
defer file.close();
var buffer: [200]u8 = [_]u8{0} ** 200;
var buffered_reader = std.io.bufferedReader(file.reader());
const buf_reader = buffered_reader.reader();
while (try buf_reader.readUntilDelimiterOrEof(&buffer, '\n') != null) {
    // doing things here
    @memset(&buffer, 0); // clear the buffer: replace all values with 0
}
With FBA:
const file = try std.fs.cwd().openFile("example_file.txt", .{ .mode = .read_only });
defer file.close();
var buffer: [200]u8 = undefined;
var fba = std.heap.FixedBufferAllocator.init(&buffer);
const allocator = fba.allocator();
var buffered_reader = std.io.bufferedReader(file.reader());
const buf_reader = buffered_reader.reader();
var file_line: ?[]u8 = try buf_reader.readUntilDelimiterOrEofAlloc(allocator, '\n', 200);
while (file_line != null) {
    // do things here
    allocator.free(file_line.?);
    file_line = try buf_reader.readUntilDelimiterOrEofAlloc(allocator, '\n', 200);
}
I don’t think there is a meaningful penalty. Really, all an FBA does is return a pointer to some array of bytes and increase its internal index by that number of bytes plus alignment. As for the free, I’m not sure if it is a no-op or if it resets its index to the beginning of the array of bytes, but in any case you are looking at a very small penalty at worst. Depending on your application, I also think the FBA is more resilient in the sense that you can easily change your allocation strategy and add logging capabilities. So even if there is a small but noticeable impact (which I heavily doubt), I think an FBA is still the best approach.
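To make that concrete, here is a rough sketch of the bookkeeping as I understand it (the names and sizes are placeholders, and I’m going from memory of the std.heap API, so double-check the details):

const std = @import("std");

pub fn main() !void {
    var buffer: [200]u8 = undefined;
    var fba = std.heap.FixedBufferAllocator.init(&buffer);
    const allocator = fba.allocator();

    // alloc just bumps fba.end_index forward by the (aligned) request size
    const bytes = try allocator.alloc(u8, 64);
    std.debug.print("end_index after alloc: {d}\n", .{fba.end_index});

    // freeing the most recent allocation moves the index back;
    // freeing anything older is effectively a no-op
    allocator.free(bytes);
    std.debug.print("end_index after free: {d}\n", .{fba.end_index});

    // reset() rewinds the whole buffer in one step
    fba.reset();
}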
Don’t you think the FBA is a more flexible approach? One thing I believe the FBA helps with is communicating intent clearly, but of course you are also right that it’s technically more complex. I can also see the flip side of your argument regarding the error: technically speaking, the “running out of memory” part can also be an upside if you want to write something safer, right?
So normally yes, that is the case and what I would do, but I am writing a FASTA file parser, and the standard format specifies a maximum line length of 120. (I started with 200 since I didn’t fully understand whether that was the most up-to-date standard.)
The reason for an FBA is that, with a standard read into a buffer, the name headers are longer than the lines of gene code, and when I don’t do a cleaning operation I get holdover from previous reads left in the buffer. So an output would look like
and so on.
Maybe using the capture removes this issue? I haven’t tried that yet, but the FBA was the immediate solution I thought of and was easy enough to implement.
Yes, the part of the buffer that is filled is returned by readUntilDelimiterOrEof as a slice and captured as |line|.
You are always going to get the exact contents of the line.
The buffer itself is bigger and contains the line followed by garbage, but the returned slice only covers the line.
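So for your loop, a sketch along these lines (reusing your file name and buffer size, untested) prints exactly the current line on each iteration with no clearing step:

const file = try std.fs.cwd().openFile("example_file.txt", .{ .mode = .read_only });
defer file.close();
var buffer: [200]u8 = undefined;
var buffered_reader = std.io.bufferedReader(file.reader());
const buf_reader = buffered_reader.reader();
while (try buf_reader.readUntilDelimiterOrEof(&buffer, '\n')) |line| {
    // `line` is a slice of `buffer` covering only this line,
    // so leftover bytes from a longer previous line are never visible
    std.debug.print("{s}\n", .{line});
}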
I see. I still don’t fully understand captures, so that is good to know. I thought they were just something used in for loops. Does that mean that in any conditional statement that returns a value, that value can be captured?
ex. (ignoring potential syntax errors)
fn is_odd(x: isize) bool {
    if (x % 2 == 0) return false;
    return true;
}

if (is_odd(x)) |output| {
    print("{d} is odd = {any}\n", .{ x, output });
}
Bookmarking that for future reference when dealing with this more.
So it looks like more reading for me, but I think I’m getting the use case for captures a bit better.
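If I’m reading the docs right, captures hang off optionals and error unions rather than plain bools, so my is_odd example above wouldn’t actually compile, but something like this (with a made-up findEven helper, purely for illustration) would:

const std = @import("std");

fn findEven(values: []const isize) ?isize {
    for (values) |v| {
        if (@rem(v, 2) == 0) return v;
    }
    return null;
}

pub fn main() !void {
    const values = [_]isize{ 3, 7, 8, 5 };

    // optional: the capture unwraps the non-null payload
    if (findEven(&values)) |even| {
        std.debug.print("first even value: {d}\n", .{even});
    }

    // error union: |value| on success, |err| in the else branch
    if (std.fmt.parseInt(isize, "42", 10)) |value| {
        std.debug.print("parsed {d}\n", .{value});
    } else |err| {
        std.debug.print("parse failed: {s}\n", .{@errorName(err)});
    }
}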