Pointers to optimize darkness propagation in a minecraft-style lighting grid

this is a BFS algorithm for propagating darkness in a minecraft-style lighting system, the lighting system consists of 6 channels, RGB for torch, RGB for sunlight

darkness propagation is as follows:
it utilizes a single queue, and loops until queue is empty: during which
it loops over the 6 cardinal directions, gets the lighting for each node position + direction, then loops over each channel, updating them according to conditionals.

updating light depends on channel conditions, hence looping over each channel.

i’m not sure how to optimize it any further, i am already using bit packing with BlockLight, packed struct, each channel consuming 5 bits
is there another algorithm that achieves the same effect at a better speed?


    var prev_chunk_pos: int3 = darkness_seed.world_pos.worldToChunk();
    var prev_chunk_neighbour: *LocalChunk = ChunkManager.getChunk(io, prev_chunk_pos).?;
    while (queue.count > 0) {
        const node: DarknessNode = queue.get().?;
        std.debug.assert(node.light.equals(BlockLight.min()) == false);

        // 6 cardinal directions
        for (Util.directions) |d| {
            const neighbour_world = node.world_pos.add(d);
            const neighbour_chunk_pos = neighbour_world.worldToChunk();

            // ChunkManager.getChunk() uses an expensive lock
            // cache the previously used pointer and reuse it
            // if neighbour_world_chunk_pos == previous_chunk_pos
            const n_c: *LocalChunk = if (neighbour_chunk_pos.equals(prev_chunk_pos))
                prev_chunk_neighbour
            else
                ChunkManager.getChunk(io, neighbour_chunk_pos) orelse continue;
            prev_chunk_neighbour = n_c;
            prev_chunk_pos = neighbour_chunk_pos;

            const n_index: usize = int3.worldToIndex(neighbour_world);
            const is_down: bool = d.y() == -1;

            const n_light_original: BlockLight = n_c.getLight(n_index);
            var updated_n_light: BlockLight = n_light_original;

            var neighbor_propagation_light: BlockLight = BlockLight.min();
            var did_remove_light: bool = false;

            // 3 channels for torchlight
            // 3 channels for sunlight
            //loop over all the light channels and set them one by one
            for (comptime std.enums.values(Channel)) |t| {
                const node_channel_light: u5 = node.light.get(t);

                if (node_channel_light == 0) continue;

                const neighbour_channel_light: u5 = n_light_original.get(t);
                const sunlight = t == .sun_red or t == .sun_green or t == .sun_blue;

                // propagate darkness if neighbour:
                //  - isnt 0
                //  - either of those two conditions
                //      - channel is sunlight and is down and current node channel == max_channel_Size
                //      - neigbour is less than current node channel
                if (neighbour_channel_light != 0 and
                    (sunlight and is_down and node_channel_light == BlockLight.max_channel_size) or
                    (neighbour_channel_light < node_channel_light))
                {
                    updated_n_light = updated_n_light.set(t, n_c.getBlock(n_index).blockData().light.default.get(t));
                    neighbor_propagation_light = neighbor_propagation_light.set(t, neighbour_channel_light);

                    did_remove_light = true;
                } else if (neighbour_channel_light >= node_channel_light) {
                    n_c.light_repair.setBit3D(n_index, true);
                    for (Util.directions) |nd| {
                        const nnw: int3 = neighbour_world.add(nd);
                        const nnnc: *LocalChunk = ChunkManager.getChunk(io, nnw.worldToChunk()) orelse continue;

                        try visited_blocks.put(allocator, nnw, {});
                        nnnc.light_repair.setBit3D(nnw.worldToIndex(), true);
                    }
                }
            }

            if (did_remove_light) {
                n_c.setLight(n_index, updated_n_light);
                try visited_blocks.put(allocator, neighbour_world, {});

                // light = 0, so skipped next iteration, so might as well not added.
                if (neighbor_propagation_light.equals(BlockLight.min()) == false) {
                    try queue.put(allocator, DarknessNode{
                        .world_pos = neighbour_world,
                        .light = neighbor_propagation_light,
                    });
                }
            }
        }
    }

The thing that sticks out to me is that you’ve folded the routines for populating a chunk with torchlight and populating it with sunlight together, when really they should be separate.
The torchlight routine needs to iterate in all directions and potentially consider neighbouring chunks; the sunlight routine only needs to iterate in one direction, and never needs to consider neighbouring chunks.
So, the natural and most optimised way to calculate sunlight is to simply do it entirely separately from torchlight.

Here’s an example sunlight routine:

/// Method to apply sunlight to a chunk's blocks for the current frame.
pub fn sunlight(self: *LocalChunk, sunlight_r: u5, sunlight_g: u5, sunlight_b: u5) void {
	const sunlight_colour: BlockLight = .{
		.sun_red = sunlight_r,
		.sun_green = sunlight_g,
		.sun_blue = sunlight_b,
		.torch_red = 0,
		.torch_green = 0,
		.torch_blue = 0,
	};
	
	// We bitwise AND this with the sampled block light from the block we want to write to, 
	// then bitwise OR with the sunlight block light and write that value to the block.
	// This overwrites the sunlight colours of the block without touching the torchlight.
	const mask: BlockLight = .{
		.sun_red = 0,
		.sun_green = 0,
		.sun_blue = 0,
		.torch_red = std.math.maxInt(u5),
		.torch_green = std.math.maxInt(u5),
		.torch_blue = std.math.maxInt(u5),
	};
	
	// I'm assuming a +Z=up coordinate system even though that's heresy in Minecraft :O
	for(0..CHUNK_SIZE_X) |x| {
		for(0..CHUNK_SIZE_Y) |y| {
			// Loop over the Z-values and go downwards until we find a solid block.
			const i: usize = blk: {
				for(0..CHUNK_SIZE_Z) |z| {
					if(self.getBlock(.{
						.x = x,
						.y = y,
						.z = CHUNK_SIZE_Z - (z + 1),
					}).blockData().blocks_sunlight){
						break :blk CHUNK_SIZE_Z - (z + 1);
					}
				}
				// If we never found a solid block, optimise by continuing the outer loop.
				continue;
			};
			
			// Regardless of how the loop terminated, we want to write our light values into the chunk.
			for(0..CHUNK_SIZE_Z) |z| {
				const old_colour = self.getLight(.{.x = x, .y = y, .z = z});
				if(z > i){ // Write sunlight - bitwise OR
					self.setLight(.{.x = x, .y = y, .z = z}, (old_colour & mask) | sunlight_colour);
				} else { // Clear sunlight - bitwise AND
					self.setLight(.{.x = x, .y = y, .z = z}, old_colour & mask);
				}
			}
		}
	}
}

The reason behind doing all these bit operation gymnastics is so that we can design the two routines to be run together - since the sunlight routine doesn’t overwrite the torchlight of the blocks, and the torchlight routine doesn’t overwrite the sunlight of the blocks, you can run them in any order you want - and potentially even concurrently!

1 Like

This gets kinda hard without any profiling data but still there are a few things.


You could split this into two phases to avoid “repairing” light while still in the process of “destroying” it.

In the first phase you’d remove the lighting source and iterate over the neighbors. If a node has exactly the light level you’d expect(current - 1 or so) you set it to 0 and add it to a removal_queue. If it had a higher light level, it was lighted from something else. So instead of zeroing it you add it to a relight_queue. Do this until removal_queue is empty.

Then in the second phase you take the relight_queue and propagate the light to the stuff you just zeroed out. Since relight_queue has only valid light levels you don’t need to check those again.


You could do something like SWAR for the colors to remove the for (comptime std.enums.values(Channel)) |t| loop. It should likely be faster.