Working with "strings" in zig?

Hey all. Newbie here from a Python / Go background. I am currently using Advent of Code as a way to dive into learning Zig. I completely understand that the language isn’t 1.0 yet, that the documentation is best effort at the moment, and that things change quickly. I also found multiple GitHub issues (about nicer ways to work with strings) where Andrew has explicitly said that a string type will not be coming to Zig. Fair enough. So with that in mind, I’m looking for some advice on how someone with very little experience of pointers, or of strings as numbers in memory, can grok how to work with “strings” in Zig.

A little context for you (try not to read too much into this code snippet; I’m not asking you to solve my AoC problem, just trying to give context around how my brain works and the current state of my Zig knowledge):

const std = @import("std");

pub fn main() !void {
    const file = try std.fs.cwd().openFile("file.txt", .{ .mode = .read_only });
    defer file.close();
    var buffer: [std.mem.page_size]u8 = undefined;
    var sum: u32 = 0;

    _ = try file.read(&buffer);
    var splits = std.mem.split(u8, &buffer, "\n");
    while (splits.next()) |line| {
        var first: u32 = 0;
        var last: u32 = 0;
        std.debug.print("{s}", .{line});
        for (line) |char| {
            if (first == 0) {
                first = char;
            }
            last = char;
            std.debug.print("{any}\n", .{@TypeOf(char)});
        }
        sum += (first + last);
    }
    std.debug.print("{any}", .{sum});
}

This code outputs the following (for one iteration of the split file)

pppgfivesu8
u8
u8
u8
u8
u8
u8
u8
u8
u8
21815

The intent of the code is to read a file full of lines of text that contain numbers, and add up the first and last numbers of each line into a final sum. But again, let’s not dwell on the problem too much.

So even though I’m a complete novice, within an hour I was able to more or less get the above code compiling and running, purely based on Google searching and reading the very good comments in the Zig source (how to use print and formatting was particularly easy to find and understand from the in-source comments! Likewise for opening a file, it was easy enough to find what the heck the flags are for a file, haha). But I’ve been stuck on this string problem for 4 days now, because it isn’t intuitive to me at all, and I can’t really find any good sources explaining exactly how to manage strings in Zig.

The code output is clearly showing me that what I’m iterating over is NOT a char, it’s a number (a pointer? an ASCII character number?). So of course my sum is not going to add up correctly: every char is some number, and even a digit character in the string doesn’t have the value I expect (e.g. the character 8 is not the integer 8).

So with all of the above said, it’s clear that I fundamentally do not understand what strings are in Zig: how to work with them, how to build one, how to look for a specific character (or a character representing a number) inside one, and how to operate on that. What is the best way to go about learning this? Are there any clear learning guides I’ve missed that you can point me to? Or is there some completely different topic I need to learn first? Is this a matter of not understanding how types work, for example?

Any help you can give me would be greatly appreciated!

Hey, I’m also someone with a Go background who’s been doing some AoC in Zig.

For general learning resources, I’d recommend checking out Ziglings if you haven’t already. I found it helpful to go through every exercise once, but I think exercises 6, 7 and 76 might be more helpful to get a better idea of what strings are. Also, if you have a question on a specific topic, sometimes searching Zig News for the topic can be fruitful. It has a bunch of articles aggregated from the community. Searching it for “strings” led me to this article by @dude_the_builder which looks like exactly what you’re looking for! Beware that Zig changes pretty fast, so the older the article is, the more likely it is to be out of date.

(edit: I forgot to mention, the Learn section of the official website has a bunch of stuff. I reference Zig Learn and the Zig Language Reference frequently).

For your code here, you are adding the ASCII codes of the characters (each element is a byte of UTF-8, which for ASCII text is the same as the Unicode code point) instead of the integers they represent, because each line you’re iterating over is just a slice of bytes (kind of like strings in Go). Check out std.fmt.parseInt. It takes a slice of bytes rather than a single u8, so try using it on the one-character slices that hold your first and last digits to convert them to ints.
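A minimal sketch of that suggestion, assuming a made-up input line and a hypothetical helper named parseNumber (neither comes from the original code). It shows that indexing a line yields a numeric byte, and that std.fmt.parseInt turns a slice of digit characters into the integer they spell:

```zig
const std = @import("std");

/// Hypothetical helper: parse a run of ASCII digits as a base-10 u32.
fn parseNumber(text: []const u8) !u32 {
    return std.fmt.parseInt(u32, text, 10);
}

pub fn main() !void {
    const line = "8pppgfives"; // hypothetical input line

    // Each element of the slice is a u8 holding the byte's numeric value.
    std.debug.print("{d}\n", .{line[0]}); // prints 56, the ASCII code of '8'

    // parseInt wants a slice, so pass a one-byte sub-slice for a single character.
    const first = try parseNumber(line[0..1]);
    std.debug.print("{d}\n", .{first}); // prints 8
}
```

parseInt returns an error union, so a non-digit byte surfaces as an error (hence the try) rather than as a silently wrong sum.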

Hi ben. Thanks for the reply!

I had a quick look over some of the Ziglings code and it looks really useful! I will absolutely have a go at that. I’ll take your advice and skip ahead to exercises 6 and 7, but in general I’ll see how far I can get through it.

I should have mentioned that I’ve absolutely been reading the language reference; in fact I’ve run through most of it a few times now, but I definitely missed the parseInt function from the fmt package (wrong terminology?).

As for the problem at hand: right, I figured that’s probably what’s happening here, in the sense that the numbers I’m getting out of the lines are ASCII codes. I really appreciate you replying. I think I have some great information here that will give me a better understanding of working with strings in Zig and of how to move forward with my code.

Thanks so much!

Note that if you know a single ASCII character is a digit (std.ascii.isDigit), then you can just do a subtraction to get its int value:

const digit_int_value = ascii_digit_as_u8 - '0';
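
Applied to the first-and-last-digit task described at the top of the thread, that subtraction could look like the sketch below. The helper name firstPlusLast and the sample line are hypothetical, and an optional is used for first so that a leading digit 0 is not mistaken for "not set yet":

```zig
const std = @import("std");

/// Hypothetical helper: add the first and last ASCII digits found in `line`.
/// Returns 0 when the line contains no digits.
fn firstPlusLast(line: []const u8) u32 {
    var first: ?u32 = null;
    var last: u32 = 0;
    for (line) |char| {
        if (!std.ascii.isDigit(char)) continue;
        const value: u32 = char - '0'; // '0' is 48, so '8' (56) becomes 8
        if (first == null) first = value;
        last = value;
    }
    return (first orelse 0) + last;
}

pub fn main() void {
    std.debug.print("{d}\n", .{firstPlusLast("ab3cd8ef")}); // prints 11
}
```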

Some shameless plugs here but they do have to do with strings and Zig:

  1. Zigstr - A UTF-8 string data structure. It may be overkill for simple cases like this AoC puzzle, but looking at the source code may further help you see how to manipulate strings in Zig.
  2. Ziglyph - Unicode text processing for Zig. This library provides many functions for processing Unicode text, like detecting letters, numbers, segmentation, etc.
  3. Strings episode of Zig in Depth - A video on the basics of strings in Zig.

Thank you all for the further replies! I really appreciate you giving code examples of how to do something and posting further learning materials, it’s all great!

I would also like to call out the mention of zig.news. I did not come across that site in my own Google searches, and since it was pointed out in the first reply I’ve been poking around there a fair bit. There is a huge amount of valuable learning material there!

So thanks again, this has been super valuable to me!

This absolutely worked for my use case. But why? What is subtracting '0' (0 in quotes) actually doing to the ASCII number?

Just because the ASCII code of '0' is 0x30.

Single quotes mean “give me the code for this character”, which for ASCII characters is the ASCII code. In ASCII, all digits are sequential, starting from 0x30, which is '0', so subtracting '0' from a digit character leaves the digit’s numeric value: '8' - '0' is 0x38 - 0x30 = 8.
Another trick: for ASCII letters, the distance between a lower-case letter and its upper-case variant is always 0x20. So to turn upper case into lower case, add 0x20, and for the opposite subtract 0x20.
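
Both points can be checked with a small sketch. The helper lowerAscii is hypothetical and written only for illustration; it guards the addition so that bytes outside 'A' to 'Z' pass through unchanged:

```zig
const std = @import("std");

/// Hypothetical helper: lower-case an ASCII letter by adding 0x20.
/// Bytes that are not 'A'...'Z' are returned unchanged.
fn lowerAscii(c: u8) u8 {
    return if (c >= 'A' and c <= 'Z') c + 0x20 else c;
}

pub fn main() void {
    // A character literal is just an integer: '0' is 0x30 (48 decimal).
    std.debug.print("{d} {d}\n", .{ @as(u8, '0'), @as(u8, '9') }); // prints 48 57
    std.debug.print("{c}\n", .{lowerAscii('Q')}); // prints q
}
```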

Or use std.ascii.toLower and std.ascii.toUpper.

I think ASCII was also designed so that you can switch between upper and lower case just by flipping one bit, so 'A' ^ 32 == 'a' and 'a' ^ 32 == 'A'.
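
A short sketch of the bit-flip trick next to the standard library function, assuming a hypothetical helper named flipCase. The XOR is only meaningful for ASCII letters, so the helper checks first:

```zig
const std = @import("std");

/// Hypothetical helper: flip the case of an ASCII letter by toggling bit 5 (0x20).
/// Non-letters are returned unchanged, since the XOR would turn them into other bytes.
fn flipCase(c: u8) u8 {
    return if (std.ascii.isAlphabetic(c)) c ^ 0x20 else c;
}

pub fn main() void {
    std.debug.print("{c}{c}\n", .{ flipCase('a'), flipCase('A') }); // prints Aa
    std.debug.print("{c}\n", .{std.ascii.toUpper('a')}); // prints A
}
```

For everyday code std.ascii.toUpper and std.ascii.toLower are probably the clearer choice; the XOR version mainly illustrates why the two cases sit exactly 0x20 apart.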
