Entity name collision and scope of variables, constants and functions

Sorry @vulpesx :slight_smile: I have to disagree again. If I cannot enjoy coding and feel comfortable that it is well organised, with good labels on the wiring, a readable welcome mat, etc., then I will not only sell my soul, but I will also make things very uncomfortable for myself in a year when I have to pick it up again.

Again, if I want a house with methods called DecPoint__ContainsRect then I should just stick with C. I am exaggerating, but I think you get the point.

It really confuses me that you keep talking about readability and organisation, yet dislike descriptive names and want a way to use the same name in overlapping contexts. Isn’t that hypocritical?

I think we have very different definitions of organisation and readability.

@chung-leong -

The origin of this was really count. I am doing a big nice opinionated stack-based string implementation (like old pascal). I do counts in everything, whether it be manipulation, indexing etc - all over the show. And in my “newbyness” I used the word count.

As size and len/length already had other implied meanings, my gut told me to create a count function. I was then bowled over by the fact that I could not simply give context but had to change count in many other places (or that the compiler was scope driven, but confused count as a local with count as a function - which I now understand)… with whatever change tracking etc. comes with it.
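
A minimal sketch of the collision described here, assuming a hypothetical Rope struct (not part of the string code in this thread). Zig does not let a parameter or local reuse the name of a declaration that is visible in an enclosing scope, so a parameter called count clashes with a method called count in the same struct. The commented-out line shows the rejected form; the _arg rename is one way around it.

```zig
const std = @import("std");

// Hypothetical type, used only to illustrate the name clash.
const Rope = struct {
    len: usize,

    pub fn count(self: Rope) usize {
        return self.len;
    }

    // Rejected by the compiler: the parameter `count` would shadow the
    // method `count` declared in this struct.
    // pub fn take(self: Rope, count: usize) usize { ... }

    // Accepted: the parameter has its own name.
    pub fn take(self: Rope, count_arg: usize) usize {
        return @min(self.count(), count_arg);
    }
};
```

The method keeps the short name and only the parameter changes, which is the trade-off the rest of this thread argues about.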

Readability here is in calling a rope a rope and not a rope_str… let’s not go down that path again.

And sadly c is not a count, cnt could cut it, but you make calculations much less readable by having too large variables - which becomes counter productive.

And on the hypocrisy, who is not? Ye be the first to cast a stone.

Let’s rest it. You have made your point. I have made my point.

The only way that you are going to convince me totally, is if you come and do my job for me, otherwise I will consider your opinions and test it out, but I kind of already did.

The bottom line still being, I didn’t think this would be an issue in 2025 in something brand new. Please don’t take offense, whoever is involved in the language design - you have such a lot going for you that I have chosen Zig over many other candidates to invest my time in for future projects, and time is something I do not have much of.

Be kind… :slight_smile: herewith a noob class as per the suggestions ( @vulpesx _arg) that, upon first look, I can definitely follow for myself.

Data members start with m_* (snake case)
Method members, camelCase (as per reference)
Struct name, PascalCase (as per reference)
Constant members, camelCase
Parameters, _arg appended (as per @vulpesx - quite like it), snake case (as per reference)
Local variables, start with _ then snake case (as per reference)
res (as result) and self are “special”

I think I can live with this. Especially since:
All locals come up when I press underscore.
All parameters are “near natural”
Even though self is involved, m_ for members not only avoids collisions but is also clear otherwise.

Not all methods are tested or finalized, but here is a rough draft example.

const std = @import("std");
const print = std.debug.print;
const math = std.math;

pub const PascalString = struct {
    pub const Self = @This();
    pub const maxSize: u8 = 255;

    m_size: u8,
    m_buffer: [maxSize]u8,
    m_overflow: bool,

    pub fn new(source_arg: []const u8) PascalString {
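        var res = PascalString{
            .m_overflow = false,
            .m_size = 0,
            .m_buffer = undefined,
        };

        if (source_arg.len > res.m_buffer.len) {
            // Oversized input: clamp and flag it. Casting the raw length
            // to u8 before this check would fail a safety check.
            res.m_overflow = true;
            res.m_size = @intCast(res.m_buffer.len);
        } else {
            res.m_size = @intCast(source_arg.len);
        }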
        @memcpy(
            res.m_buffer[0..res.m_size],
            source_arg[0..res.m_size],
        );

        return res;
    }

    pub fn toString(self: *Self) []u8 {
        return self.m_buffer[0..self.m_size];
    }

    fn availableSlice(self: *Self) []u8 {
        return self.m_buffer[self.m_size..];
    }

    pub fn available(self: *Self) u8 {
        return maxSize - self.m_size;
    }

    pub fn indexOf(self: *Self, find_arg: []const u8) isize {
        if (std.mem.indexOf(
            u8,
            self.m_buffer[0..self.m_size], // search only the bytes in use
            find_arg,
        )) |pos| {
            return @bitCast(pos);
        } else {
            return -1;
        }
    }

    pub fn count(self: *Self, find_arg: []const u8) usize {
        const res = std.mem.count(
            u8,
            self.m_buffer[0..self.m_size], // count only within the bytes in use
            find_arg,
        );

        return res;
    }

    pub fn reset(self: *Self) *Self {
        self.m_size = 0;
        self.m_overflow = false;
        return self;
    }

    pub fn dump(self: *Self) void {
        print("\n*size={}", .{&self.m_size});
        print("\n*buffer={}", .{&self.m_buffer[0]});
        print("\noverflow={}", .{self.m_overflow});

        print("\n", .{});
    }

    // pub fn addAny(self: @This(), other: anytype) !*Self {
    //     var buffer: [maxSize]u8 = undefined;
    //
    //     switch (@TypeOf(other)) {
    //         @TypeOf(u32) => try std.fmt.bufPrint(&buffer, "{d}", .{other}),
    //         else => std.debug.panic("Failed", .{}),
    //     }
    //
    //     return self;
    // }

    pub fn addPascal(self: *Self, append_arg: *Self) *Self {
        return self.addStr(append_arg.toString());
    }

    pub fn addStr(self: *Self, append_arg: []const u8) *Self {
        var _count: u8 = @truncate(
            if (append_arg.len > self.m_buffer.len) self.m_buffer.len else append_arg.len,
        );

        // Make a backup of current size before ruining it
        const _target = self.m_size;

        if (self.m_size + append_arg.len > self.m_buffer.len) {
            _count = @as(u8, self.m_buffer.len) - self.m_size;
            self.m_overflow = true;
            self.m_size = @intCast(self.m_buffer.len);
        } else {
            self.m_size += @as(u8, @intCast(append_arg.len));
        }
        @memcpy(
            self.m_buffer[_target..self.m_size],
            append_arg[0.._count],
        );

        return self;
    }

    // Cases:
    // 1. index > size, just append the string
    // 2. other overflows self from the index
    // 3. other overflows maxSize
    pub fn overtype(self: *Self, overtype_arg: []const u8, index_arg: u8) *Self {
        if (index_arg > self.m_size) {
            return self.addStr(overtype_arg);
        }

        const _end_offset = @as(usize, index_arg) + overtype_arg.len;
        var _copy_size: u8 = @truncate(overtype_arg.len);

        if (_end_offset > self.m_size) {
            if (_end_offset <= maxSize) {
                self.m_size = @truncate(_end_offset);
            } else {
                // Clip the copy at the end of the buffer and flag the loss
                _copy_size = maxSize - index_arg;
                self.m_overflow = true;
                self.m_size = maxSize;
            }
        }

        @memcpy(
            self.m_buffer[index_arg .. index_arg + _copy_size],
            overtype_arg[0.._copy_size],
        );

        return self;
    }

    ///
    pub fn insertStr(self: *Self, source_arg: []const u8, index_arg: u8) *Self {
        if (index_arg > self.m_size) {
            return self.addStr(source_arg);
        }

        var _buffer: [maxSize]u8 = undefined;

        // This is size safe
        @memcpy(
            _buffer[0..index_arg],
            self.m_buffer[0..index_arg],
        );

        // Clip the inserted text so it cannot run past the end of the buffer
        const _length = if (index_arg + source_arg.len > maxSize)
            @as(usize, maxSize - index_arg)
        else
            source_arg.len;

        @memcpy(
            _buffer[index_arg .. index_arg + _length],
            source_arg[0.._length],
        );

        // Tail copy
        const _tail_offset = index_arg + _length;

        if (_tail_offset < maxSize) {
            const _available = maxSize - _tail_offset;
            const _residual = self.m_size - index_arg;

            const _current = index_arg + _length;
            const _tail_size = if (_residual > _available) _available else _residual;

            @memcpy(
                _buffer[_current .. _current + _tail_size],
                self.m_buffer[index_arg .. index_arg + _tail_size],
            );
        }

        if (@as(usize, self.m_size) + source_arg.len > maxSize) {
            self.m_overflow = true;
            self.m_size = maxSize;
        } else {
            self.m_size += @truncate(source_arg.len);
        }

        // Copy back the temp buffer to self.buffer
        @memcpy(
            self.m_buffer[0..self.m_size],
            _buffer[0..self.m_size],
        );

        return self;
    }

    pub fn left(self: *Self, len_arg: u8) []u8 {
        const res = self.m_buffer[0..if (len_arg > self.m_size) self.m_size else len_arg];
        return res;
    }

    pub fn right(self: *Self, len_arg: u8) []u8 {
        const l = if (len_arg > self.m_size) self.m_size else len_arg;
        const res = self.m_buffer[self.m_size - l .. self.m_size];
        return res;
    }

    pub fn deleteIter(self: *Self, index_arg: u8, count_arg: u8) *Self {
        var _offset: u8 = 0;

        for (index_arg + count_arg..self.m_size) |source| {
            if (source < self.m_size) {
                self.m_buffer[index_arg + _offset] = self.m_buffer[source];
                _offset += 1;
            }
        }

        self.m_size = index_arg + _offset;
        return self;
    }

    /// 1234567890  delete(5, 3)
    ///             HEAD: copy(temp, self.buffer[0..5]
    /// tail_from = 5+3 = 8
    /// tail_to = 10  (self.size)
    ///             TAIL: copy(temp, self.buffer[tail_from..tail_to]
    /// tail_size = 10 - 8 = 2
    /// 1234567890  delete(5, 8)
    ///             HEAD: copy(temp, self.buffer[0..5]
    /// tail_from = 5+8 = 13
    /// tail_to = 10 (self.size)
    ///             TAIL: No Copy as tail_to > tail_from
    /// tail_size = 0
    pub fn delete(self: *Self, index_arg: u8, count_arg: u8) *Self {
        if (index_arg > self.m_size or count_arg == 0)
            return self;

        var temp: [maxSize]u8 = undefined;
        // Copy Head
        if (index_arg > 0) {
            @memcpy(temp[0..index_arg], self.m_buffer[0..index_arg]);
        }

        const tail_from = @as(u16, index_arg) + @as(u16, count_arg);
        const tail_to = self.m_size;
        var tail_size: u8 = undefined;

        if (tail_from < tail_to) {
            tail_size = @truncate(tail_to - tail_from);

            @memcpy(
                temp[index_arg .. index_arg + tail_size],
                self.m_buffer[tail_from..tail_to],
            );
        } else {
            tail_size = 0;
        }

        // Shrink first, then copy back only the bytes written to temp
        self.m_size = index_arg + tail_size;

        @memcpy(
            self.m_buffer[0..self.m_size],
            temp[0..self.m_size],
        );

        return self;
    }

    pub fn clear(self: *Self) *Self {
        self.m_size = 0;
        return self;
    }

    pub fn sprintf(self: *Self, comptime format_arg: []const u8, any_arg: anytype) !*Self {
        const _str = try std.fmt.bufPrint(self.availableSlice(), format_arg, any_arg);
        self.m_size += @truncate(_str.len);
        return self;
    }
};


For many things that garner this reaction, I find there are others that will have the complete opposite view. Imagine this quote:

Personally, I don’t think either stance is that useful/persuasive on its own.

I agree. That said, an argument against something that has a reasonably cheap workaround does not carry much weight: in a low-level language, functionality comes before aesthetics.

See my example above… I am sure most will hate it. I can live with it.

We should all learn Finnish. Then we can use nominative for struct fields, genitive for pointers, partitive for slice elements, and accusative plus the rest for function arguments.

You’re right, I do hate it :3

You’re taking this whole name thing to quite the extreme.
The only name conflicts that would have been encountered in that code involve count, available and overtype.

These are all possible names; not all of these suggestions are needed to avoid conflicts:

  • available → unusedCapacity
  • count → countStr or countSubStr
  • overtype → replace
  • in addStr, _count → len or length.
  • in deleteIter, count_arg → source_offset
  • deleteIter → replaceFromSelf
  • in delete, count_arg → num or n
  • in overtype, overtype_arg → str, string or text
  • in insert, _available → remaining, unused

I think some of the names for other things are also too vague/unclear, but that’s beyond the scope of name conflicts.

You don’t have to follow my suggestions, but I thought this might make what I’ve said more clear

Agree with most, thanks. There are a few method names, lingo that I inherit from years of too much Java/Kotlin and DotNet… And like most habits (some bad) - it sticks with me.

overtype → replace , :slight_smile: Now that seems like a no brainer at first until you realize that replace is likely going to be a find/replace instead of directly “typing over” it.

avail/available… vs unusedCapacity… Java
count vs countStr → agreed, again a remnant of my habit of having function overloading - which I don’t have anymore. But unless there are different variations, there is really not much in it. The same issue applies to replace: with function overloading it would not have mattered.

@vulpesx, if conventions are extreme then so be it. I will give this a test run at night for a week.

PS: deleteIter is simply a temporary test method to test speed difference in looping vs memcpy on different platforms

overtype - pascal calls it StuffString → which I even like less.

And no - I am not advocating for function overloading in Zig. I, even as a week old baby, understand that with the comptime “generation” of functions that function overloading would not only be near impossible but simply a very bad idea.
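
For reference, a rough sketch of the two routes Zig leaves open where another language would overload count: one name per argument type, or a single function that takes anytype and dispatches on the type at comptime. The names countByte, countStr and countAny are made up for illustration and are not part of the string type above.

```zig
const std = @import("std");

// Option 1: distinct names, one per argument type.
fn countByte(haystack: []const u8, needle: u8) usize {
    return std.mem.count(u8, haystack, &[_]u8{needle});
}

fn countStr(haystack: []const u8, needle: []const u8) usize {
    return std.mem.count(u8, haystack, needle);
}

// Option 2: one name, dispatching on the argument type at comptime.
// Unsupported types are rejected when the call is compiled.
fn countAny(haystack: []const u8, needle: anytype) usize {
    return switch (@TypeOf(needle)) {
        u8 => countByte(haystack, needle),
        []const u8 => countStr(haystack, needle),
        else => @compileError("countAny: unsupported needle type"),
    };
}
```

A string literal has a pointer-to-array type rather than []const u8, so a call such as countAny(text, @as([]const u8, "an")) needs the explicit cast. That is one reason the plain countStr name may be the simpler choice.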

I would not at all expect it to find/replace, because it’s not called findReplace; it also takes an index, which would make no sense if it did a find/replace.

Sorry - maybe you misunderstood me.

pascal.replace(find_arg , replace_arg) as the simplest form of find/replace of occurrences of a string in another.

But now we are splitting hairs. Thanks again for your feedback.

I think naming struct fields should be the simplest possible.
If functions or arg names collide there is always a simple solution.
And I think _arg is not a bad idea.

Just to continue a bit on readability…

I never ever use prefixes like m_thingy.
(what on earth is “m_” thinks the brain).

I never ever use camelcasing except for Types.
(ok, we need something to distinguish our types from vars).
The old pascal writers used “T” for each type, which is actually not so bad.

appendAssumeCapacity
is less readable than
append_assume_capacity
The eyes have to MoVeOveRtHeHiLls.
The brain has to waste energy to separate the 3 words.

This I find readable:

pub fn index_of(self: *Self, find_arg: []const u8) isize 
{
    if (std.mem.indexOf(u8, &self.buffer, find_arg)) |pos| 
    {
        return @bitCast(pos);
    } 
    else 
    {
        return -1;
    }
}

This hurts my eyes and brain:

pub fn indexOf(self: *Self, find_arg: []const u8) isize {
    if (std.mem.indexOf(
        u8,
        &self.m_buffer,
        find_arg,
    )) |pos| {
        return @bitCast(pos);
    } else {
        return -1;
    }
}

apologies already for my little rant :slight_smile:

Why on earth are you using -1 to signify no index? The only valid reason I can think of would be to limit the size of the return type in a constrained environment, but even that’s a stretch imo.
The downsides to doing such a thing are:

  • removes type safety
  • limits the maximum returned index (ok, so you can limit that to usize_max - 1 by casting it back to usize if it’s not -1, but still…)

edit: i have realised this is from gkell’s example code but still…

I find using m_ in Zig really ugly, in C++ it is already ugly, but there I can sort of understand it, because fields and local variables are intermixed completely, because you don’t need this. to access fields in C++.

Using m_ in Zig just to avoid having to pick different names for fields and function names seems like taking it too far.

It is fine to have data-structures that have a buffer and len (instead of m_size) field and then you can just use unusedCapacity (reusing naming conventions from the standard library) for the available method.

availableSlice → unusedCapacitySlice (standard library convention)

I would call toString just slice, and let it take an anytype so that it accepts both *Self and *const Self, returning a []u8 or []const u8 respectively.

indexOf, count and possibly other methods seem like they could instead be in some other namespace and operate on just a given slice.

Instead of count_arg I would just call it amount.
index_arg → index / idx / i
new → init (convention)
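
A minimal sketch of how these renames might look together, assuming plain buffer and len fields with no m_ prefix. This is a cut-down illustration, not a rewrite of the full type above; max_len and the const-only slice are simplifications chosen for the example.

```zig
const std = @import("std");

const PascalString = struct {
    pub const max_len = 255; // assumed capacity, mirroring maxSize above

    buffer: [max_len]u8 = undefined,
    len: u8 = 0,

    // `init` instead of `new`; oversized input is clamped to the capacity.
    pub fn init(source: []const u8) PascalString {
        var res = PascalString{};
        if (source.len > max_len) {
            res.len = max_len;
        } else {
            res.len = @intCast(source.len);
        }
        @memcpy(res.buffer[0..res.len], source[0..res.len]);
        return res;
    }

    // `unusedCapacity` instead of `available`.
    pub fn unusedCapacity(self: *const PascalString) usize {
        return max_len - @as(usize, self.len);
    }

    // `slice` instead of `toString`; read-only variant only.
    pub fn slice(self: *const PascalString) []const u8 {
        return self.buffer[0..self.len];
    }
};
```

With the fields named buffer and len and the methods named unusedCapacity and slice, no field and method share a name in this sketch, so no prefix is needed.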

To elaborate on what @vulpesx said, returning -1 is a terrible idea. It’s a magic number, and the caller needs to just know that it is the signal for “missing item”. At the very least, you should give it a public name and document it in the function’s doc comment.
Looking just at the function signature and documentation, nothing is said about missing elements. Reading this, I would assume that it’s undefined behavior for the searched element to not exist in the string.
A much better idea is to use Zig’s optional type. ?isize would not be great, because it would take an extra 64 bits just to store that single bit of information. But you have lots of leftover bits that you can use.
The resulting index is never going to be larger than maxSize - 1, so here’s what I suggest:

const max_index = maxSize - 1;
const Index = std.math.IntFittingRange(0, max_index);

pub fn indexOf(self: *Self, find_arg: []const u8) ?Index
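
A fuller sketch of this suggestion, written as a free function over a slice so that it stands alone. The max_size constant and the assertion on the haystack length are assumptions made for the example; in the struct above the haystack would be the used part of the buffer.

```zig
const std = @import("std");

const max_size = 255; // assumed to mirror the string's capacity
const Index = std.math.IntFittingRange(0, max_size - 1);

// Returns the position of the first match, or null when there is none.
// The caller has to unwrap the optional, so "not found" cannot be
// mistaken for a valid index.
fn indexOf(haystack: []const u8, needle: []const u8) ?Index {
    std.debug.assert(haystack.len <= max_size);
    const pos = std.mem.indexOf(u8, haystack, needle) orelse return null;
    const idx: Index = @intCast(pos);
    return idx;
}
```

A caller would then write something like `if (indexOf(text, "world")) |i| { ... } else { ... }`, and the missing case is visible in the signature rather than hidden in a magic value.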

Also forgot to mention, you can’t index with isize, indexing requires usize because you can’t use a negative index.
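
A small sketch of what that means for a caller who receives a signed index: the value has to be checked and converted to usize before it can be used to index. The byteAt helper is hypothetical and exists only for this illustration.

```zig
const std = @import("std");

// Hypothetical helper: turns a signed index into a checked lookup.
fn byteAt(text: []const u8, signed_index: isize) ?u8 {
    // text[signed_index] would not compile: an index must be usize.
    if (signed_index < 0) return null;
    const idx: usize = @intCast(signed_index);
    if (idx >= text.len) return null;
    return text[idx];
}
```

Every caller of a function that returns isize ends up writing some version of this check and cast, which an optional unsigned index makes unnecessary.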
