```zig
/// This is executed only at compile-time to prepopulate a lookup table.
fn calculateSlotCount(size_class_index: usize) SlotIndex {
    const size_class = @as(usize, 1) << @as(Log2USize, @intCast(size_class_index));
    var lower: usize = 1 << minimum_slots_per_bucket_log2;
    var upper: usize = (page_size - bucketSize(lower)) / size_class;
    while (upper > lower) {
        // ... (quote truncated)
```
I thought page_size is now run-time known. How does it calculate the perfect table when the page size is only runtime-known? Gotta bake pizza now, can't dive into code.
So, this "perfect" table uses default values - aka a comptime-known guess - and then later looks up the actual values, hoping the actual runtime value is congruent with the default in order for it to be "perfect". Did I understand that correctly?
If I look at the current min/max definition, that's 4k on x86 for both; what about huge pages? There should be options for 2M and 1G as well, hence page_size_max should be 2 << 20 or even 1 << 30, shouldn't it?
First, this removes comptime-known std.mem.page_size, **which is a nonsensical concept since the page size is in fact runtime-known** (sorry about that), and replaces it with std.heap.page_size_min and std.heap.page_size_max for comptime-known bounds of possible page sizes. Uses of std.mem.page_size in pointer alignment properties, such as in mmap, are migrated to std.heap.page_size_min.
In places where the page size must be used, std.heap.pageSize() provides the answer. **It will return a comptime-known value if possible**, and otherwise queries the operating system, memoizing the result.
In me this triggers cognitive dissonance. Can you help me understand this? If Andrew is correct in the first bolded (by me) statement, how can the thing in the second bolded statement make sense? I was just told this is a nonsensical concept, yet here we are, computing tables based on nonsensical data and potentially even returning it. What is it now: is this nonsense (and if so, why do we do this nonsense) or not (and if so, why is the first statement claiming it is)?
Perfect, so let me ask you as a peer: looking at the portions I highlighted, do you think they're two contradictory statements, or is that sentiment just me?
I’m not sure. If I were trying to figure out whether they were contradictory, I would probably attempt to find a bug caused by the page size incorrectly being assumed to be comptime-knowable.
For example, since (by default) page_size_min == page_size_max for x86_64 Linux, std.heap.pageSize() will return a comptime-known value. But, as I said, I don’t know enough to understand what problems that could cause if the actual page size is different (or even whether it’s possible for the actual page size to be different).
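To make the comptime-known case concrete, here is a minimal sketch with assumed values (pageSizeSketch and the hard-coded bounds are mine, not the real std.heap code): when the bounds pin a single value, the whole lookup folds at compile time.

```zig
const std = @import("std");

// Assumed bounds for illustration; on x86_64-linux the defaults are 4 KiB for both.
const page_size_min: usize = 4 * 1024;
const page_size_max: usize = 4 * 1024;

/// Sketch of the selection logic: a single possible value needs no runtime query.
fn pageSizeSketch() usize {
    if (page_size_min == page_size_max) return page_size_min;
    @panic("would fall through to a runtime query here");
}

comptime {
    // Because the bounds pin exactly one value, this call evaluates entirely at compile time.
    std.debug.assert(pageSizeSketch() == 4 * 1024);
}
```

The open question is then only whether those comptime bounds are actually correct for the machine the binary ends up running on.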
To me, this comment, standing on its own, without diving into any code, should be updated.
It makes me think that knowing the page size at comptime is nonsensical.
And then it goes on to say: well, if we can, we return comptime knowledge.
WRT the comptime-known x86_64 case and, e.g., "huge" pages on Linux, my confusion continues.
I’d love for @andrewrk to update that comment (referring to the runtime page size release notes - repeating the link for Andrew in case he jumps in right here thinking "what the heck") for more clarity with regard to comptime knowledge.
I don’t see the confusion. The release notes clearly state that the comptime-known page_size field is replaced by comptime-known bounds, the min and max fields. If these are different, a runtime query is performed and memoized. If huge pages are enabled, then that value will be returned by the sysconf call in the Linux case.
(If the max size for Linux is wrong, as you indicate, then that’s a bug and should be reported)
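For what it’s worth, my mental model of the "query once, memoize" path is roughly this sketch (the names pageSizeSketch, page_size_cached and queryPageSizeFromOs are mine; the latter stands in for whatever per-OS query the real implementation does, e.g. sysconf on Linux):

```zig
const std = @import("std");

// Assumed bounds where min != max, e.g. a target that may run with 4K or 64K pages.
const page_size_min: usize = 4 * 1024;
const page_size_max: usize = 64 * 1024;

var page_size_cached = std.atomic.Value(usize).init(0);

/// Sketch: comptime-known when the bounds agree, otherwise ask the OS once and memoize.
fn pageSizeSketch() usize {
    if (page_size_min == page_size_max) return page_size_min;
    const cached = page_size_cached.load(.monotonic);
    if (cached != 0) return cached;
    const queried = queryPageSizeFromOs(); // hypothetical stand-in for the per-OS query
    page_size_cached.store(queried, .monotonic);
    return queried;
}

fn queryPageSizeFromOs() usize {
    // Placeholder so the sketch compiles; the real query is OS-specific
    // (sysconf on POSIX, GetSystemInfo on Windows, ...).
    return 16 * 1024;
}
```

The memoization matters because the query is a syscall; after the first call it is just an atomic load.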
Ok, so: I was wondering whether huge pages are a level of abstraction beneath the user of this, the allocator, or not - whether Andrew would want to be a user of that or not. They need different use under Linux, AFAIUI. When ‘not comptime known’ came up, my first thought was of huge pages. In the case of huge pages, sysconf would not return those options. That whole logic would presumably live elsewhere, wrt allocation chunk size selection and method.
I was left wondering: comptime-knowing a value (the page size), and comptime-knowing an upper bound (max page size) and a lower bound (min page size) that coincide in a singular value (hence: the page size), is the same piece of information (the page size). Yet in one place it is tagged as nonsensical since the page size is “in fact runtime-known”, and in another as obviously the right thing to do. With the information from the prior paragraph, “if possible” should read “i.e., never, since the concept of knowing a runtime value at comptime obviously is faulty”. I cannot see how they’re not contradictory, yet for someone to explain to me how they’re not, I assume they’d need to understand how I see them as contradictory, and if I can’t find someone to see that…
So, Asahi Linux Progress Report September [LWN.net] - as Andrew mentioned Asahi by name - brings up that Apple hardware really wants a specific one of the processor-possible page sizes. So it is one runtime environment for arm64 builds that pins the choice? Since it’s run-time (are we on an M1/M2/… or not?), potentially (we could build for the specific CPU, where it would become comptime-known again, wouldn’t it?), this becomes the (sole?) use case, doesn’t it? So the first statement refers to that, doesn’t it? The page size might be compile-time-wrong, ergo it must be run-time checked; we tweak the conditions to only look at the platforms we know are problematic at run-time, but it is no longer a compile-time constant in all cases.
Let me know in a PM if you see a distinction between what I wrote above and what Andrew wrote, and how that has thrown me off in trying to understand what he wrote, so I maybe have a network of people who think in similar patterns to cooperatively navigate issues like this in the future. Thank you.
To bring closure to the first message I wrote: is it correct to say that the “page” here is not what the OS or CPU calls a “page” in mmap / virtual address space mapping, but what Andrew named the unit of allocation the allocator uses? Hence the quoted function refers to setting up internals for the allocator, not to interaction with the host. Ergo, the thing we’re calling page_size here is not the thing that became potentially-runtime-only-known but the allocator’s chunk_size. If that understanding is correct, I’d suggest renaming the backing allocation unit of the allocator to something other than “page”.
I don’t really understand why you are making it more complicated than it is: OS page sizes can be different, min and max sizes can be different, and some OS/architecture or freestanding program (basically a custom OS) may set its own min and max sizes.
Then, when min == max, there is only one possible size, thus it becomes comptime-known; otherwise we can query the value at runtime or fall back to a default that is one of the valid values within that range.
It is called page size because ultimately some allocator (usually the PageAllocator) uses memory-mapping syscalls to map some pages of memory, and for that it needs to know some valid page size. (Before, that was always comptime-determined, even where it didn’t make sense.)
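As a rough illustration of that last point (a sketch of my own, not the actual PageAllocator; pagesNeededFor is a made-up helper), the mapping path only needs a valid page size to round the requested length up to whole pages:

```zig
const std = @import("std");

/// Sketch: round an allocation request up to whole pages before handing it
/// to a memory-mapping syscall. std.heap.pageSize() is the runtime-or-comptime
/// value discussed above; std.mem.alignForward does the rounding.
fn pagesNeededFor(len: usize) usize {
    const page_size = std.heap.pageSize();
    return std.mem.alignForward(usize, len, page_size) / page_size;
}

test "whole pages are requested from the OS" {
    const page_size = std.heap.pageSize();
    try std.testing.expectEqual(@as(usize, 1), pagesNeededFor(1));
    try std.testing.expectEqual(@as(usize, 2), pagesNeededFor(page_size + 1));
}
```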
The rest is pedantry and nitpicking. Sure, we can improve how things are described or change how it is worded, but then just suggest how you want to reword it to make it clearer.