New LinkedList API footgun

You can read here about Zig new LinkedList API

In previous incarnation LinkedList was generic

test "basic SinglyLinkedList test" {
    const L = SinglyLinkedList(u32);
    var list = L{};

    try testing.expect(list.len() == 0);
    ...................................................

It means that during compilation you got error for non-compatible type.

New SinglyLinkedList completely anonymous, with requirement in the comment:

//! * Homogenous elements.

It’s comment only, it is not forced (compiler does not produce any warning/error - it simply has not any information).

Funny example:

const L = struct {
    data: u32,
    node: SinglyLinkedList.Node = .{},
};
var ll: L = .{ .data = 1234567 };

fn collectFromVendorA(list: *SinglyLinkedList) void {
    list.prepend(&ll.node);
}

const M = struct {
    data: u8,
    node: SinglyLinkedList.Node = .{},
};

var mm: M = .{ .data = 255 };

fn collectFromVendorB(list: *SinglyLinkedList) void {
    list.prepend(&mm.node);
}

test "additional" {
    var list: SinglyLinkedList = .{};
    try testing.expect(list.len() == 0);

    collectFromVendorB(&list);
    collectFromVendorA(&list);

    try testing.expect(list.len() == 2);

    const node = list.popFirst().?;

   // !!!! Actually it's *L !!!
    const x: *M = @fieldParentPtr("node", node);
    _ = x;
}

Looks that this change increases the likelihood of error :joy:

2 Likes

I like the new API, but you do have a point here.

Perhaps there should be an additional generic wrapper type you can use instead of SinglyLinkedList, which provides methods that safely act on an underlying SinglyLinkedList:

pub fn TypedSinglyLinkedList(comptime NodeParent: type, comptime node_field_name: []const u8) type {
    return struct {
        list: std.SinglyLinkedList,

        const Self = @This();

        pub const init: Self = .{ .list = .{} };

        pub fn prepend(self: *Self, new_node: *NodeParent) void {
            self.list.prepend(&@field(new_node, node_field_name));
        }

        pub fn remove(self: *Self, node: *NodeParent) void {
            self.list.remove(&@field(node, node_field_name));
        }

        pub fn popFirst(self: *Self) ?*NodeParent {
            const res_node = self.list.popFirst() orelse return null;

            return @fieldParentPtr(node_field_name, res_node);
        }

        pub fn len(self: *Self) usize {
            return self.list.len();
        }
    };
}
3 Likes

It is definitely a footgun, yeah.
Also, it looks like a perfect case for unions to me.

const ML = struct {
    data: union { m: u8, l: u32 },
    node: SinglyLinkedList.Node = .{},
};
1 Like

Actually it’s functionality of old SinglyLinkedList.

If the desire for change was so strong, why not call the new implementation UntypedSinglyLinkedList , while keeping the old one?

void* forever

The wrapper @milogreg made is still intrusive, the old implementation wasn’t intrusive.

because then there would be two ways two obvious ways to make a linked list which goes against the zig zen: Only one obvious way to do things.

The zig team has chosen the implementation that allows more control and optimisation over the implementation that is more type safe.

Possibly off the topic, but we are talking about LinkedList.

What happened in these tests?

test "SinglyLinkedList test1" {
    const L = std.SinglyLinkedList(u32);
    var list = L{};

    try testing.expect(list.len() == 0);

    var one = L.Node{ .data = 1 };

    list.remove(&one);
}

test "SinglyLinkedList test2" {
    const L = std.SinglyLinkedList(u32);
    var list = L{};

    try testing.expect(list.len() == 0);

    var one = L.Node{ .data = 1 };
    list.prepend(&one);

    var two = L.Node{ .data = 2 };
    list.remove(&two);
}

The same result we will get in the new implementation.

I checked SinglyLinkedList related issues on github
And found one - SinglyLinkedList remove function is missing documentation

Now I understand the reason :joy:

They should have the same result as they are doing the same thing: trying to remove a node that isn’t in the list.

Exactly the same

the following command terminated with signal 6

Check the implementation:

        /// Remove a node from the list.
        ///
        /// Arguments:
        ///     node: Pointer to the node to be removed.
        pub fn remove(list: *Self, node: *Node) void {
            if (list.first == node) {
                list.first = node.next;
            } else {
                var current_elm = list.first.?;
                while (current_elm.next != node) {
                    current_elm = current_elm.next.?;
                }
                current_elm.next = node.next;
            }
        }

I happen to agree with that assessment, which is (part of) why I wrote zelda, which does not have this issue.

Btw:

/// Find and remove `node` from the list.  This compares pointers,
/// not values.  It is valid  to 'remove' a node which is not in
/// the list, which does not make it a good idea.  If the node is
/// found in the list, the 'next' field will be `null`, if it is
/// not, the field will not change.
pub fn remove(list: *SinglyLinkedList, node: *Node) void {
2 Likes

positive result of discussion - #24373

btw I opened bug issue , but it was almost immedialy closed with comment:

Working as designed. This function asserts that node is in list, so violating this triggers Illegal Behavior.

Is it usual reaction?

Hi g41797 :smiley:

I’m am little confused by the wording in your issue. The illegal behavior you are hitting is “calling remove on empty LinkedList” but you talk about removing a node not present in the LinkedList.

My first take away from this is that the remove function need a comment that precisely states the behavior.

The current implementation crashes when the list is empty or if the element is not in the list, so both are IB at the moment. (I think mlugg got these two mixed up when he closed your issue)

IMHO, a PR that adds a nice comment and an explicit checks for both IB with @panic and an explaining message would be a great improvement to the current remove function.

EDIT: silly me read the code wrong. remove panics in both scenarios

My impression is that you are supposed to use the data structure in such a way that you don’t try to remove nodes from a list that doesn’t contain them.

To me this makes sense, intrusive lists are for cases where you do lowlevel programming and potentially have many different lists that work along side another and also along other data structures.

So the surrounding struct (or code) (wherever the node of the intrusive list is embedded) should have a set of invariants which ensure that you will never remove it from a list that didn’t contain it.

For example if you use the intrusive list to track whether the element is alive and used or dead and free, it should only ever be in one of those two lists.

The code that goes through all alive elements simply iterates through the alive list and when it finds an element that should be killed it removes it from the alive list and adds it to the dead list.
Similarly the code that looks for a free/dead element just picks the first free element removes it from the free list and adds it to the alive list.

I think the point of that implementation is that removing non existing elements from single linked lists is bad for performance anyway, so you shouldn’t do it.

1 Like

Something like this:

diff --git a/lib/std/SinglyLinkedList.zig b/lib/std/SinglyLinkedList.zig
index d118ce0395..b844c8a177 100644
--- a/lib/std/SinglyLinkedList.zig
+++ b/lib/std/SinglyLinkedList.zig
@@ -85,13 +85,15 @@ pub fn prepend(list: *SinglyLinkedList, new_node: *Node) void {
     list.first = new_node;
 }
 
+/// Finds and removes a node from the list. Nodes are compared based on their
+/// addresses. Panics if the node is not in the list.
 pub fn remove(list: *SinglyLinkedList, node: *Node) void {
     if (list.first == node) {
         list.first = node.next;
     } else {
-        var current_elm = list.first.?;
+        var current_elm = list.first orelse @panic("Tried to remove a node from an empty linked list.");
         while (current_elm.next != node) {
-            current_elm = current_elm.next.?;
+            current_elm = current_elm.next orelse @panic("Tried to remove a node that was not present in the linked list.");
         }
         current_elm.next = node.next;
     }
2 Likes

I called this issue

SinglyLinkedList remove function failes with signal 6 if removed node does not exist in the list"

Steps to Reproduce and Observed Behavior:

SinglyLinkedList remove function does not check existance of removed node

And provided two tests:

  • remove non-present from empty list
  • remove non-present from non-empty list
test "SinglyLinkedList test1" {
    const L = std.SinglyLinkedList(u32);
    var list = L{};

    try testing.expect(list.len() == 0);

    var one = L.Node{ .data = 1 };

    list.remove(&one); <==== Remove from empty list
}

test "SinglyLinkedList test2" {
    const L = std.SinglyLinkedList(u32);
    var list = L{};

    try testing.expect(list.len() == 0);

    var one = L.Node{ .data = 1 };
    list.prepend(&one);

    var two = L.Node{ .data = 2 };
    list.remove(&two); <==== Remove from non-empty list
}

Description of the failure:

result is “following command terminated with signal 6” for removing of non-existing in the list node
I checked the new intrusive version - looks the same problem
I did not check DoublyLinkedList via tests, but code also does not check existance of removed node.

Expected Behavior

Because remove does not return error, if node in question does not belong to the list, remove funstion shold do nothing.

The answer was:

Working as designed…

I think in safe modes failing with such an explicit panic would be nice, but in unsafe modes it should remain without the panic, because I think that results in faster code.

because linkedlist api was changed and old code should be also changed accordingly, may be better return error from remove ?

No, I think if you want that, you need to use some other/your own implementation.
I think this implementation is intended to work that way.

Not every data-structure wants to deal with every use case, with this one you are simply not allowed to use remove when you don’t know that the item is in the list, sometimes called an invariant.

It doesn’t make sense to give you an error for every possible way to misuse a data-structure, with hash maps you aren’t allowed to mutate the keys you have inserted, you don’t get an error if you do it, you get a malfunctioning program. These two cases are similar.

Sure it would be nice if some linter, etc. was able to tell you that you are misusing the data-structure, but currently that kind of tooling doesn’t really exist in Zig as far as I know.


Currently you have mostly made the point that you want and expect other behavior, but what you want is simply not what is implemented and intended here, as far as I can tell.

I think you could claim that you would rather use a different data-structure, but that doesn’t mean that other people wouldn’t want to continue using the current behavior. (That said, I think a more explicit panic in safe modes would be nice)

3 Likes

Generally speaking, yes. It’s documented that if “assumes” or “asserts” is in the documentation, it’s up to the user not to do those things.

Zelda just does things a bit differently. I don’t actually get why stdlib’s remove function can’t tolerate a non-list member, but really, code shouldn’t be “removing” something which isn’t there to begin with. It just happens that my implementation doesn’t care.