How to modify a file in-place?

I have a file foo.txt, which has the following content:

line 1
line2
line_3
line-4

I’m trying to remove the text “line2” from the file and save changes to it.

So far I have:

// doesn't work
test "read file into buffer, search and delete text, overwrite file with modified buffer" {
    const input_file = try std.Io.Dir.cwd().openFile(io, "./data/foo.txt", .{ .mode = .read_write });
    defer input_file.close(io);

    var read_buffer: [512]u8 = undefined;
    var reader = input_file.reader(io, &read_buffer);
    const file_reader = &reader.interface;

    var write_buffer: [512]u8 = undefined;
    var writer = input_file.writer(io, &write_buffer);
    var file_writer = &writer.interface;

    while (try file_reader.takeDelimiter('\n')) |line| {
        if (std.mem.eql(u8, line, "line2")) continue;
        try file_writer.print("{s}\n", .{line});
    }

    try testing.expectEqualStrings("line 1\nline_3\nline-4\n", file_writer.buffered());

    try file_writer.print("{s}", .{file_writer.buffered()});
    try file_writer.flush();
}

which passes the test and updates the file to:

line 1
line_3
line-4
line 1
line_3
line-4

instead of:

line 1
line_3
line-4

To overwrite a file in general, you’ll need to either

  1. Write to a new (perhaps temp) file and rename new to old.
  2. Read the old file into memory, then truncate and write the old file by processing the in-memory data.

P.S. Option 2 is the simplest. Option 1 may be needed when you don’t want to allocate enough memory to hold the entire file and you can filter line by line, or a buffer’s worth at a time.
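A minimal sketch of the processing step in option 2, assuming the whole file has already been read into a byte slice. The helper name removeLine is hypothetical, and the std.ArrayList calls assume the allocator-passing API used elsewhere in this thread. Reading the file and truncating it are deliberately left out.

```zig
const std = @import("std");

/// Hypothetical helper (not part of std): returns a newly allocated copy of
/// `contents` with every line equal to `unwanted` removed.
/// The caller owns the returned slice.
fn removeLine(gpa: std.mem.Allocator, contents: []const u8, unwanted: []const u8) ![]u8 {
    var out: std.ArrayList(u8) = .empty;
    errdefer out.deinit(gpa);

    var lines = std.mem.splitScalar(u8, contents, '\n');
    while (lines.next()) |line| {
        // A trailing '\n' yields one final empty piece; stop there so no
        // extra blank line is emitted.
        if (line.len == 0 and lines.peek() == null) break;
        if (std.mem.eql(u8, line, unwanted)) continue;
        try out.appendSlice(gpa, line);
        try out.append(gpa, '\n');
    }
    return out.toOwnedSlice(gpa);
}
```

The returned slice is what would be written back after the old file has been truncated. Note that this sketch always ends the output with a newline, even if the input did not.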


Truncating the file with try input_file.setLength(io, 0); leaves it:

  • Empty, when the call is placed between const file_reader = &reader.interface; and var write_buffer: [512]u8 = undefined;

  • Like the incorrect output above, when the call is placed after the test case

write the old file by processing the in-memory data.

I’ve been looking for an example of this, but haven’t been able to find it

Super-simple example:

var file_lines: std.ArrayList([]const u8) = .empty;
defer file_lines.deinit(std.testing.allocator);

while(try file_reader.takeDelimiter('\n')) |line| {
    if(std.mem.eql(u8, line, "line2")) continue;
    try file_lines.append(std.testing.allocator, line);
}

// Write the original lines back into the file.
// Close it first so we don't get filesystem errors.
input_file.close(io);

// Create a new file to override the old one.
// This will actually open the old file and truncate its length to 0, exactly like we wanted to do.
var output_file = try std.Io.Dir.cwd().createFile(io, "./data/foo.txt", .{});
defer output_file.close(io);

// We create our writer at this point.
var output_buffer: [512]u8 = undefined;
var output_writer = output_file.writer(io, &output_buffer);
var writer = &output_writer.interface;

// Finally, we write the lines back into the file.
for(file_lines.items) |line| {
     try writer.print("{s}\n", .{line});
}
try writer.flush();

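I also recommend checking line[0..5] instead of line in your std.mem.eql call, because I tested this code on Windows and the carriage return in my text file was causing the comparison to fail. Note that the slice assumes every line is at least 5 bytes long; stripping the trailing '\r' before comparing is more robust.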
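A small sketch of the carriage-return issue, assuming the only difference between platforms is a trailing '\r' left over from a "\r\n" line ending. The helper name lineEquals is hypothetical.

```zig
const std = @import("std");

/// Hypothetical helper: true when `line` equals `expected` once a single
/// trailing '\r' (from a Windows "\r\n" line ending) is ignored.
fn lineEquals(line: []const u8, expected: []const u8) bool {
    const trimmed = if (std.mem.endsWith(u8, line, "\r"))
        line[0 .. line.len - 1]
    else
        line;
    return std.mem.eql(u8, trimmed, expected);
}
```

Compared with a fixed slice such as line[0..5], this does not go out of bounds on short lines and does not match longer lines that merely start with the same text.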


cheating.

But, yeah probably there’s no way to remove a line with seeks and writes. Maybe you can only replace it :thinking:


One way that could potentially work is setting the file’s length, which is what std.Io is doing anyway to truncate it:

  • Store the std.Io.Reader’s seek at the start and end of the line we want to remove
  • Set the input file’s length to the seek at the start of the line we want to remove
  • Make a std.Io.Writer for the input file, and stream all remaining data from the reader to the writer

I haven’t tested this out, but it should be perfectly possible with std.Io.File.setLength().

EDIT: Actually, having a reader and writer simultaneously active for the same file probably isn’t a very smart idea.
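The idea above can be pictured on an in-memory buffer, without any file I/O: shift everything after the removed line down over it, then shrink the length. This is only an analogy and has not been tested against std.Io. The helper name removeRange is hypothetical, and with a real file the tail would presumably have to be copied before the length is reduced, since truncating first would discard it.

```zig
const std = @import("std");

/// Hypothetical helper: removes bytes [start, end) from `data` by shifting
/// the tail down, and returns the shortened slice (the new "file length").
fn removeRange(data: []u8, start: usize, end: usize) []u8 {
    std.debug.assert(start <= end and end <= data.len);
    const tail_len = data.len - end;
    // Source and destination overlap and the destination starts lower,
    // so the copy must run front to back.
    std.mem.copyForwards(u8, data[start .. start + tail_len], data[end..]);
    return data[0 .. start + tail_len];
}
```

Here start and end play the role of the reader positions recorded at the beginning and end of the line to remove.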

In terms of modifying the file, it produces the expected output but also results in a crash.

Complete code:
test "modify a file in-place" {
  const input_file = try std.Io.Dir.cwd().openFile(io, "./data/foo.txt", .{ .mode = .read_write });
  defer input_file.close(io);

  var read_buffer: [512]u8 = undefined;
  var reader = input_file.reader(io, &read_buffer);
  const file_reader = &reader.interface;

  var file_lines: std.ArrayList([]const u8) = .empty;
  defer file_lines.deinit(std.testing.allocator);

  while (try file_reader.takeDelimiter('\n')) |line| {
      if (std.mem.eql(u8, line, "line2")) continue;
      try file_lines.append(std.testing.allocator, line);
  }

  // Write the original lines back into the file.
  // Close it first so we don't get filesystem errors.
  input_file.close(io);

  // Create a new file to override the old one.
  // This will actually open the old file and truncate its length to 0, exactly like we wanted to do.
  var output_file = try std.Io.Dir.cwd().createFile(io, "./data/foo.txt", .{});
  defer output_file.close(io);

  // We create our writer at this point.
  var output_buffer: [512]u8 = undefined;
  var output_writer = output_file.writer(io, &output_buffer);
  var writer = &output_writer.interface;

  // Finally, we write the lines back into the file.
  for (file_lines.items) |line| {
      try writer.print("{s}\n", .{line});
  }
  try writer.flush();
}

Error:

thread 100471 panic: reached unreachable code
/home/mishra/source/zig/build/stage3/lib/zig/std/posix.zig:309:18: 0x1063fe9 in close (std.zig)
.BADF => unreachable, // Always a race condition.
^
/home/mishra/source/zig/build/stage3/lib/zig/std/Io/Threaded.zig:8132:35: 0x104d975 in fileClose (std.zig)
for (files) |file| posix.close(file.handle);
^
/home/mishra/source/zig/build/stage3/lib/zig/std/Io/File.zig:313:31: 0x1060e17 in close (std.zig)
return io.vtable.fileClose(io.userdata, (&file)[0..1]);
^
/home/mishra/projects/zero/examples/temp.zig:55:27: 0x102f95b in test.modify a file in-place (temp.zig)
defer input_file.close(io);
^
/home/mishra/source/zig/build/stage3/lib/zig/compiler/test_runner.zig:255:25: 0x11f8162 in mainTerminal (test_runner.zig)
if (test_fn.func()) |_| {
^
/home/mishra/source/zig/build/stage3/lib/zig/compiler/test_runner.zig:70:28: 0x11f4bd2 in main (test_runner.zig)
return mainTerminal(init);
^
/home/mishra/source/zig/build/stage3/lib/zig/std/start.zig:680:88: 0x11f12a7 in callMain (std.zig)
if (fn_info.params[0].type.? == std.process.Init.Minimal) return wrapMain(root.main(.{
^
/home/mishra/source/zig/build/stage3/lib/zig/std/start.zig:190:5: 0x11f0cd1 in _start (std.zig)
asm volatile (switch (native_arch) {
^
error: the following test command terminated with signal ABRT:
.zig-cache/o/70ef4370fddb8c381f2078561b21ed03/test --seed=0x57e1979c

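That’s because you’re closing a file that has already been closed. The explicit input_file.close(io) runs first, and then the deferred close runs on the same handle when the test returns: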
defer input_file.close(io);

// Write the original lines back into the file.
// Close it first so we don't get filesystem errors.
input_file.close(io);
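One way to avoid the double close is to keep only the defer and limit its scope to the part of the code that reads. The sketch below uses a hypothetical FakeFile stand-in that counts close calls instead of a real file, so that it stays independent of the I/O API.

```zig
const std = @import("std");

/// Hypothetical stand-in for a file handle; it only counts close calls.
const FakeFile = struct {
    close_count: u32 = 0,

    fn close(self: *FakeFile) void {
        self.close_count += 1;
    }
};

/// Hypothetical helper showing the scoping pattern.
fn processFile(input_file: *FakeFile) void {
    {
        // The defer fires when this inner block ends, so the handle is
        // released before the file is reopened for writing.
        defer input_file.close();
        // ... read everything that is needed from input_file here ...
    }
    // No explicit close here: the handle has already been released once.
    // ... reopen the file for writing at this point ...
}
```

Dropping the defer and keeping only the explicit close would also work, at the cost of leaking the handle if an error is returned before the close is reached.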

yep, sorry

I would expect:

line 1
line2
line_3
line-4
line 1
line_3
line-4

i.e. the original contents followed by the written contents.
The reason is that there is a single position in the file at which both reads and writes happen. After reading the entire file, the position is at the end of the file, so the writes land there.

I’m on 0.16.0-dev.2314+9d63dfaa8 if it helps

Isn’t the line invalidated after the next takeDelimiter()? What if it calls rebase()? Test it with a file bigger than the buffer.

Isn’t the line invalidated after the next takeDelimiter()?

No.

What if it calls rebase()? Test it with a file bigger than the buffer.

This does indeed cause issues.

An incredibly simple solution is to append try std.testing.allocator.dupe(u8, line) instead of line, and to write an extra routine that frees each line before deinitializing the std.ArrayList([]const u8).
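A sketch of that approach, with hypothetical helper names appendOwned and freeLines. A small scratch array stands in for the reader’s buffer, which is overwritten on every iteration; this is an assumption made to keep the example free of file I/O.

```zig
const std = @import("std");

/// Hypothetical helper: stores a private copy of `line` in `list`.
fn appendOwned(gpa: std.mem.Allocator, list: *std.ArrayList([]const u8), line: []const u8) !void {
    const owned = try gpa.dupe(u8, line);
    errdefer gpa.free(owned);
    try list.append(gpa, owned);
}

/// Hypothetical helper: frees every stored line, then the list itself.
fn freeLines(gpa: std.mem.Allocator, list: *std.ArrayList([]const u8)) void {
    for (list.items) |line| gpa.free(line);
    list.deinit(gpa);
}

test "lines survive reuse of the read buffer" {
    const gpa = std.testing.allocator;
    var file_lines: std.ArrayList([]const u8) = .empty;
    defer freeLines(gpa, &file_lines);

    // Stand-in for the reader's buffer: overwritten on every iteration.
    var scratch: [8]u8 = undefined;
    const inputs = [_][]const u8{ "line 1", "line_3", "line-4" };
    for (inputs) |input| {
        @memcpy(scratch[0..input.len], input);
        // Appending scratch[0..input.len] directly would leave every
        // entry pointing at the same, later overwritten, memory.
        try appendOwned(gpa, &file_lines, scratch[0..input.len]);
    }

    try std.testing.expectEqualStrings("line 1", file_lines.items[0]);
    try std.testing.expectEqualStrings("line-4", file_lines.items[2]);
}
```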

Documentation reads otherwise:

Invalidates previously returned values

It seems the discussion went with option 2, but just in case: std.Io.File.Atomic is made exactly for option 1. Usage of the new API is shown in #30686 - std: rework atomic file / temp file API - ziglang/zig - Codeberg.org.
