Idiomatic way to query ELF sections?

Hi, I’m working on and off on a side project: building a very basic garbage collector. I’m just at the beginning, but after some exploring I’ve reached the point where I’m parsing the program’s ELF section headers to find the sections a GC needs. Currently I have this small piece of code:

const std = @import("std");

pub fn main() !void {
    var gpa_instance: std.heap.DebugAllocator(.{}) = .init;
    defer _ = gpa_instance.deinit();
    const gpa = gpa_instance.allocator();

    var section_array = std.debug.Dwarf.null_section_array;
    var elf_module = try std.debug.SelfInfo.readElfDebugInfo(
        gpa,
        null,
        null,
        null,
        &section_array,
        null,
    );
    defer elf_module.deinit(gpa);

    var elf_mmem = elf_module.mapped_memory;
    var elf_mmem_stream = std.io.fixedBufferStream(elf_mmem);
    var elf_header = try std.elf.Header.read(&elf_mmem_stream);
    var elf_header_section_iter = elf_header.section_header_iterator(&elf_mmem_stream);

    var sh_str_iter = elf_header.section_header_iterator(&elf_mmem_stream);
    var sh_str: std.elf.Elf64_Shdr = undefined;
    var index: usize = 0;
    while (try sh_str_iter.next()) |section| : (index += 1) {
        if (index == elf_header.shstrndx) {
            sh_str = section;
            break;
        }
    }

    const sh_str_slice = elf_mmem[sh_str.sh_offset .. sh_str.sh_offset + sh_str.sh_size];
    elf_header_section_iter = elf_header.section_header_iterator(&elf_mmem_stream);
    std.debug.print("| {s: ^16} | {s: ^16} | {s: ^16} |\n", .{ "section name", "section offset", "section size" });
    while (try elf_header_section_iter.next()) |section| {
        // Names in .shstrtab are NUL-terminated, so read up to the first 0 byte.
        // (Searching for the next "." would truncate names such as ".rela.dyn".)
        const name = std.mem.sliceTo(sh_str_slice[section.sh_name..], 0);
        std.debug.print("| {s: ^16} | {d: ^16} | {d: ^16} |\n", .{ name, section.sh_offset, section.sh_size });
    }
}

which produces this output:

|   section name   |  section offset  |   section size   |
|                  |        0         |        0         |
|     .rodata      |       576        |      30597       |
|  .eh_frame_hdr   |      31176       |       908        |
|    .eh_frame     |      32088       |       5284       |
|      .text       |      37376       |      178833      |
|      .tbss       |      216216      |        13        |
|       .got       |      216216      |        8         |
|  .relro_padding  |      216224      |       864        |
|      .data       |      216224      |        20        |
|       .bss       |      216244      |      12656       |
|    .debug_loc    |      216244      |      900512      |
|  .debug_abbrev   |     1116756      |       2396       |
|   .debug_info    |     1119152      |      455668      |
|  .debug_ranges   |     1574820      |      230480      |
|    .debug_str    |     1805300      |      167937      |
| .debug_pubnames  |     1973237      |      44262       |
| .debug_pubtypes  |     2017499      |      42094       |
|   .debug_line    |     2059593      |      222085      |
|     .comment     |     2281678      |       103        |
|     .symtab      |     2281784      |       5208       |
|    .shstrtab     |     2286992      |       217        |
|     .strtab      |     2287209      |      11112       |

But I often realize much later that I’ve implemented things that already exist in the std. So my question is: is there already a way to do this in the std that I’m not aware of?

Not really an answer to your question, but here is what I decided after running into a similar problem while writing a debugger. Most of the stuff in the std lib around DWARF (and I expect ELF) is purpose-built for the compiler and the needs of the language. While there are some useful things in the std library, I finally decided to write my own implementations of most of these things, because my objectives just don’t align with those of the compiler. I suspect you might be running into similar issues.

Hmm, that was the response I feared, lol. My initial plan was to use whatever is in the std as a black box to get my offsets and so on, implement the most bare-bones functioning garbage collector (linear scanning, stop-the-world GC), fuzz/test it, and then use that to test and implement a more sophisticated one.

Do you have any advice or a repo related to ELF that I could look at?

Also, in C I know that you can declare extern symbols to get some info/offsets about the different sections, like extern char* __data_start, _edata, _end; and so on. Is that also possible in Zig?

Hmm, no I don’t have any good resources to look at.

Also, in C I know that you can declare extern symbols to get some info/offsets about the different sections, like extern char* __data_start, _edata, _end; and so on. Is that also possible in Zig?

I’m not sure I understand the question.

Yes. Loris did it with his OS in 1000 lines:

Oh thanks, that’s super nice. I tried a few different things but couldn’t figure out how to do it, so I did it in C and exported the symbols to Zig, but this is much nicer.
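
For anyone landing here later, a minimal sketch of the extern-symbol approach discussed above, written directly in Zig. It is not the code from the linked project. It assumes an ELF target whose linker (or linker script) defines `_etext`, `_edata` and `_end`; those names are an assumption about the link setup, and if the linker does not provide them the build fails with undefined symbols. Only the addresses of these symbols are meaningful, so they are declared as single bytes and never read.

```zig
const std = @import("std");

// Linker-provided symbols: no Zig or C source file defines them.
// Assumption: the linker in use places them at the end of the text
// segment, the end of initialized data, and the end of .bss.
// Only take their addresses; never read or write through them.
extern var _etext: u8;
extern var _edata: u8;
extern var _end: u8;

pub fn main() void {
    const text_end = @intFromPtr(&_etext);
    const data_end = @intFromPtr(&_edata);
    const image_end = @intFromPtr(&_end);

    std.debug.print("end of text:             0x{x}\n", .{text_end});
    std.debug.print("end of initialized data: 0x{x}\n", .{data_end});
    std.debug.print("end of bss / image:      0x{x}\n", .{image_end});

    // Under the assumption above, a conservative scanner could treat
    // [data_end, image_end) as roughly the zero-initialized globals.
    std.debug.print("approx. bss size:        {d} bytes\n", .{image_end - data_end});
}
```

With a custom linker script you can export whatever boundary symbols you need and declare them the same way. `__data_start` is left out on purpose, since it may not be defined when libc startup files are not linked.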