I am trying to delete a specific line in a file by specifying the line number but cannot figure out how to.
Hello @RuinedMango
Welcome to ziggit
This is not an easy task. From the Unix/Linux command line (with GNU sed), sed -i '42d' filename
deletes line 42 from filename.
To do this in Zig, a strategy might be to:
- read the entire file into memory, keeping its size.
- locate the start of line N and the start of line N+1 (by counting newlines)
- move the memory from the start of line N+1 to the start of line N, reducing the size.
- write the memory back to the file
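The in-memory part of those steps could look roughly like the sketch below. The function name deleteLine is made up for illustration, and it assumes lines end with a single '\n' byte and that std.mem.indexOfScalarPos and std.mem.copyForwards are available (Zig 0.11 or later).

```zig
const std = @import("std");

/// Removes 1-based line `n` from `buf` in place and returns the shortened slice.
/// Assumes lines end with a single '\n'. If `n` is 0 or past the last line,
/// the buffer is returned unchanged.
fn deleteLine(buf: []u8, n: usize) []u8 {
    if (n == 0) return buf;

    // Find the start of line n by skipping n - 1 newlines.
    var start: usize = 0;
    var line: usize = 1;
    while (line < n) : (line += 1) {
        const nl = std.mem.indexOfScalarPos(u8, buf, start, '\n') orelse return buf;
        start = nl + 1;
    }

    // Line n + 1 starts one past the next newline (or at the end of the buffer).
    const end = if (std.mem.indexOfScalarPos(u8, buf, start, '\n')) |nl| nl + 1 else buf.len;

    // Shift the tail left over the deleted line. The regions overlap and the
    // destination is lower in memory, so copy forwards.
    std.mem.copyForwards(u8, buf[start..], buf[end..]);
    return buf[0 .. buf.len - (end - start)];
}

pub fn main() void {
    var data = "111\n222\n333\n".*;
    const rest = deleteLine(&data, 2);
    std.debug.print("{s}", .{rest});
}
```

When the shortened buffer is written back, the file also has to be truncated to the new size, otherwise the old tail stays at the end.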
Useful Functions:
Yep, it is really not very easy.
An alternative might be to read the file line by line and write back only those lines that are not in a “skip list”.
Here is a quick and dirty example:
const std = @import("std");

fn help(prog: []const u8) void {
    std.debug.print(
        "usage: {s} <line-number-to-skip> < input-file > output-file\n",
        .{prog},
    );
}

pub fn main() !void {
    const prog = std.mem.sliceTo(std.os.argv[0], 0);
    if (std.os.argv.len != 2) {
        help(prog);
        return;
    }
    // 1-based number of the line to skip
    const lnstr = std.mem.sliceTo(std.os.argv[1], 0);
    const ln = try std.fmt.parseInt(u32, lnstr, 10);
    const i = std.io.getStdIn();
    const is = i.reader();
    const o = std.io.getStdOut();
    const os = o.writer();
    // lines longer than the buffer make readUntilDelimiterOrEof return an error
    var buf: [1024]u8 = undefined;
    var cnt: u32 = 0;
    while (try is.readUntilDelimiterOrEof(&buf, '\n')) |line| {
        cnt += 1;
        if (cnt == ln) continue;
        // write() may write fewer bytes than asked; writeAll() would retry
        _ = try os.write(buf[0..line.len]);
        _ = try os.write("\n");
    }
}
$ cat a.txt
111
222
333
444
555
666
777
888
999
$ ./skip-line 3 < a.txt > b.txt
$ cat b.txt
111
222
444
555
666
777
888
999
EDIT:
For some reason
os.write(line);
does not work (the output file ends up with zero size), although
std.debug.print(line) works!
So I used
os.write(buf[0..line.len]);
The cause was most likely trying to read from and write to the same file. The shell opens both stdin and stdout before launching the program, and opening the file for stdout truncates it, so there’s nothing left for stdin to read.
Edit:
Ok, then it wasn’t that. This topic is still quite relevant to “how would I delete a specific line in a file”, though.
Btw, hi @RuinedMango! Welcome to Ziggit.
This phenomenon is what makes deleting one line from a file tricky, in fact. If you do the obvious thing (open a file for reading, open the same file for writing, then stream the in file to the out file while skipping the lines you don’t want), it will just wipe out the file’s contents.
That’s because opening it for writing truncates the file, instantly. Which is basically never what you want.
One thing you can do is read the entire file into memory, and then open the same file for output, but we can do better than that. The problem with that approach is that there is a period where your original file only exists in memory: any syscall failure or other bug which brings the program down will destroy the file contents.
The maximally-paranoid way to do it is like this:
- open the file “a_file.txt” for reading
- open a temporary file “tmp/out_maybe_a_timestamp.txt” for writing
- stream the input file to the output file, flush, close the handle for the in file and the out file
- rename the in file, within the same directory, to “a_file.txt.old”
- move the temp file to the intended location “a_file.txt”
- then delete “a_file.txt.old”
The advantage of this approach is that the worst case outcome is that the new file is stranded in the tmp directory, and the old file is renamed a_file.txt.old, which is a recoverable situation. A lot of old-school Unix tools do more-or-less exactly this.
It’s a lot of coding just to delete some lines! But it works.
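Here is a rough sketch of that sequence. Several things in it are my own assumptions, not part of the description above: the function name deleteLineInFile, the “.tmp” suffix, putting the temporary file next to the original (so the rename stays on one filesystem), reading the whole input into memory instead of streaming it, and a std.fs API roughly as in Zig 0.11 to 0.14 (Dir.readFileAlloc with a max size argument).

```zig
const std = @import("std");

/// Rewrites `path` (relative to `dir`) without 1-based line `n`.
/// Assumptions: the file fits in 16 MiB, lines end with '\n', and the
/// hypothetical names "<path>.tmp" and "<path>.old" are free to use.
fn deleteLineInFile(allocator: std.mem.Allocator, dir: std.fs.Dir, path: []const u8, n: usize) !void {
    const tmp_path = try std.fmt.allocPrint(allocator, "{s}.tmp", .{path});
    defer allocator.free(tmp_path);
    const old_path = try std.fmt.allocPrint(allocator, "{s}.old", .{path});
    defer allocator.free(old_path);

    // Read the original. It is never opened for writing, so it is never truncated.
    const data = try dir.readFileAlloc(allocator, path, 16 * 1024 * 1024);
    defer allocator.free(data);

    // Write every line except line n to the temporary file, then flush and close it.
    {
        const tmp = try dir.createFile(tmp_path, .{});
        defer tmp.close();
        var start: usize = 0;
        var line_no: usize = 1;
        while (start < data.len) : (line_no += 1) {
            const end = if (std.mem.indexOfScalarPos(u8, data, start, '\n')) |nl| nl + 1 else data.len;
            if (line_no != n) try tmp.writeAll(data[start..end]);
            start = end;
        }
        try tmp.sync();
    }

    // Swap the files. A failure here leaves either the original or the ".old" copy.
    try dir.rename(path, old_path);
    try dir.rename(tmp_path, path);
    try dir.deleteFile(old_path);
}

pub fn main() !void {
    const allocator = std.heap.page_allocator;
    const dir = std.fs.cwd();
    {
        // Create a small demo file in the current directory.
        const f = try dir.createFile("a_file.txt", .{});
        defer f.close();
        try f.writeAll("111\n222\n333\n");
    }
    try deleteLineInFile(allocator, dir, "a_file.txt", 2);
    const result = try dir.readFileAlloc(allocator, "a_file.txt", 1024);
    defer allocator.free(result);
    std.debug.print("{s}", .{result});
}
```

For files too large to hold in memory, the copy loop would have to stream through a fixed buffer instead, but the rename sequence stays the same.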
Additional things to consider here are:
- On which platform do you want this? Some OSs treat files as a bunch of bytes (most modern ones), in other (older) OSs a line has a stricter definition.
- Assuming you plan to define “line” based on a line terminator, which terminator applies to your OS? \n is the most common one, but it could also be \r, \r\n, etc.
- What should the behavior be if your file has no real lines? For example, a binary file (an image, sound, movie, etc.)?
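As a small illustration of the terminator question, one possible way to cope with both \n and \r\n is to split on '\n' and drop a trailing '\r' from each piece. The helper name stripCr is invented for this sketch, it assumes std.mem.splitScalar (Zig 0.11 or later), and it does not handle files that use a lone \r as the terminator.

```zig
const std = @import("std");

/// Returns `line` without a trailing '\r', so that "\r\n"-terminated input
/// split on '\n' yields the same text as "\n"-terminated input.
fn stripCr(line: []const u8) []const u8 {
    if (line.len > 0 and line[line.len - 1] == '\r') return line[0 .. line.len - 1];
    return line;
}

pub fn main() void {
    // Mixed terminators: the first line ends in "\r\n", the second in "\n".
    const text = "one\r\ntwo\nthree";
    var it = std.mem.splitScalar(u8, text, '\n');
    while (it.next()) |raw| {
        std.debug.print("[{s}]\n", .{stripCr(raw)});
    }
}
```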
So, I will restate @dimdin’s initial comment, and say again: this is not an easy task.
I think @dimdin meant that it’s not so easy to delete something from the middle of a bunch of those “somethings” already stored in, well… some “storage”, be it RAM or permanent storage, even with a strict definition of what a “line” actually is.
Take a sheet of paper and write some strings on it.
How to delete the string number X?
My example assumes that files are regular files on some electronic storage with a “file system” abstraction above it, and that “lines” are sequences of bytes terminated by a byte with contents == 0x0A, of course.
+1 - especially if you are dealing with files that have strings inside of them…
const str = "hello\nworld!"
That string would be counted as 2 lines if we’re going by newline alone. Now, this is clearly not impossible, so I don’t want to discourage the OP from trying. Almost every text editor has to overcome this problem. Helix, for instance, knows what line number things are
Are you really sure?..
In the file, '\n' is the two bytes 0x5C 0x6E (a backslash and the letter n), and LF is the single byte 0x0A.
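To make the difference concrete, here is a tiny sketch. The names real_newline, escaped_text and countLf are mine; escaped_text stands for what a file containing the source line above actually holds, written with a doubled backslash so the Zig compiler keeps both characters.

```zig
const std = @import("std");

// In Zig source, "\n" is an escape for the single byte 0x0A (LF), while
// "\\n" is a backslash (0x5C) followed by the letter n (0x6E).
const real_newline = "hello\nworld!";
const escaped_text = "hello\\nworld!";

/// Number of LF (0x0A) bytes in `text`.
fn countLf(text: []const u8) usize {
    return std.mem.count(u8, text, "\n");
}

pub fn main() void {
    std.debug.print("real:    len={d} LF={d}\n", .{ real_newline.len, countLf(real_newline) });
    std.debug.print("escaped: len={d} LF={d}\n", .{ escaped_text.len, countLf(escaped_text) });
}
```

The escaped text is one byte longer and contains no LF at all, so a line-by-line reader sees it as a single line.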
// Same program as above (std import and help() unchanged), with a debug print added.
pub fn main() !void {
    const prog = std.mem.sliceTo(std.os.argv[0], 0);
    if (std.os.argv.len != 2) {
        help(prog);
        return;
    }
    const lnstr = std.mem.sliceTo(std.os.argv[1], 0);
    const ln = try std.fmt.parseInt(u32, lnstr, 10);
    const i = std.io.getStdIn();
    const is = i.reader();
    const o = std.io.getStdOut();
    const os = o.writer();
    // var o = std.io.getStdOut();
    // var bw = std.io.bufferedWriter(o.writer());
    // var os = bw.writer();
    var buf: [1024]u8 = undefined;
    var cnt: u32 = 0;
    while (try is.readUntilDelimiterOrEof(&buf, '\n')) |line| {
        cnt += 1;
        if (cnt == ln) continue;
        // goes to stderr, so it shows up even with stdout redirected
        std.debug.print("line #{}\n", .{cnt});
        // _ = try os.write(buf[0..line.len]);
        // _ = try os.write(line[0..]);
        _ = try os.write(line);
        _ = try os.write("\n");
    }
}
$ cat 1.txt
const str = "hello\nworld!"
const str = "hello\nworld!"
const str = "hello\nworld!"
$ ./skip-line 100 < 1.txt > 2.txt
line #1
line #2
line #3
Yes, if you use a standard utility then you’ll be good - we both agree on that
What I was referring to is a naive string split if someone wasn’t aware of the edge cases (specifically, if they tried to read a file into a flat buffer and then tried to split up lines with a delimiter). That’s what I was referring to with:
This makes more sense… to me. So that we’re sure it makes sense to everyone, you mean that this:
Hello
World!
is not the same as this:
Hello
World!
↲
Rather than talking about a file which contains this sort of thing as its actual contents:
const str = "hello\nworld!";
// vs.
const str2 = "hello\nworld!\n";
Yes sir
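For anyone who wants to see that edge case in code, here is a small sketch comparing a naive split with a count that treats a trailing '\n' as a terminator. The function names naiveCount and lineCount are made up, and std.mem.splitScalar assumes Zig 0.11 or later.

```zig
const std = @import("std");

/// Counts the pieces produced by a naive split on '\n'.
/// A trailing newline yields one extra, empty piece.
fn naiveCount(text: []const u8) usize {
    var it = std.mem.splitScalar(u8, text, '\n');
    var n: usize = 0;
    while (it.next()) |_| n += 1;
    return n;
}

/// Counts lines, treating a trailing '\n' as the terminator of the last line
/// rather than as the start of another one.
fn lineCount(text: []const u8) usize {
    if (text.len == 0) return 0;
    const newlines = std.mem.count(u8, text, "\n");
    return if (text[text.len - 1] == '\n') newlines else newlines + 1;
}

pub fn main() void {
    const str = "hello\nworld!";
    const str2 = "hello\nworld!\n";
    std.debug.print("naive: {d} vs {d}\n", .{ naiveCount(str), naiveCount(str2) });
    std.debug.print("lines: {d} vs {d}\n", .{ lineCount(str), lineCount(str2) });
}
```

The naive split reports 2 and 3 pieces, while the terminator-aware count reports 2 lines for both, which matters when the line number to delete is near the end of the file.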
Yes thank you all