Zomino - HTML "server"

https://codeberg.org/jmcaine/zomino

I post this hesitantly. There’s not much to “showcase”, honestly, and the number of people who will actually value this is probably quite low. My real interests for Zig lie elsewhere, but this project is motivated by my Python webserver projects and my age-old, trusty use of dominate, combined with an interest in practicing the likes of @matklad’s NewType and some other idioms, and an interest in “serving” most/all HTML over websockets for client rendering. I may someday rewrite some of my many Python projects to try using this scheme “for real”, but even if the efficiency advantages were actually attractive (I haven’t profiled anything), the results would surely be overkill for my little lightweight projects.

Anyway, be gentle with this beginner, but feel free to poke around and offer advice, unless it’s just, “who would ever use something like this?” - that, I get. :slight_smile:

(This is also my first use of Codeberg, migrating over from my GitHub days.)

P.S. - yes, I know there’s no such thing as an “HTML server”; there’s more to that moniker than I care to babble about now. And NO, this is not an HTTP server.


That was a fun 5-minute read. I didn’t dive very deep. Personally I don’t like this way of constructing HTML, but since you dislike templating I can see why this approach, with its lack of magic and hidden control flow, becomes appealing. It certainly makes it easier to validate that all tags are properly nested and closed, something that can get messed up with templating.

My only real critique would be the fact that you pass along the allocator all the time. In the scope of generating an HTML document, I would consider the few bytes needed to store the allocator in the builder well spent for the convenience of not passing it to every call.

I think std.ArrayList dropping its stored allocator should be seen in light of the fact that some applications have thousands or even millions of lists, at which point the cost adds up. Paying this overhead once per HTML document seems reasonable to me.


I think you’ll be happier if your builders take a *Writer directly. You can use std.Io.Writer.Allocating for string building, or send it down the wire directly, plus you don’t need calls like those bufFmt ones, since the writer can do it for you. That should mean mostly just keeping around that stack of tags you need to close.
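
To make the suggestion concrete, here is a minimal sketch of a streaming builder. It is written in Python for brevity rather than Zig, and it is not zomino's API: `StreamingBuilder`, `render_row`, and the method names are hypothetical. The only state the builder keeps is the stack of tags still waiting to be closed; everything else goes straight to whatever object with a `write()` method it was given.

```python
import io
from html import escape


class StreamingBuilder:
    """Writes markup to `out` as soon as it is known; stores only the open tags."""

    def __init__(self, out):
        self.out = out        # any object with a write(str) method
        self.open_tags = []   # stack of tag names that still need a closing tag

    def open(self, tag, **attrs):
        # Attribute values are escaped so quotes cannot break out of the attribute.
        attr_text = "".join(
            f' {name}="{escape(str(value), quote=True)}"' for name, value in attrs.items()
        )
        self.out.write(f"<{tag}{attr_text}>")
        self.open_tags.append(tag)
        return self

    def text(self, content):
        self.out.write(escape(content))
        return self

    def close(self):
        # Close the most recently opened tag.
        self.out.write(f"</{self.open_tags.pop()}>")
        return self

    def close_all(self):
        while self.open_tags:
            self.close()
        return self


def render_row(cells):
    """Usage example: stream one table row into an in-memory buffer."""
    buf = io.StringIO()
    builder = StreamingBuilder(buf)
    builder.open("tr", id="row-1")
    for cell in cells:
        builder.open("td").text(cell).close()
    builder.close_all()
    return buf.getvalue()
```

Swapping `io.StringIO()` for a socket-backed writer would send the same bytes down the wire without changing the builder.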

XML builders have a long lineage, and this seems like a nice start to a project. Thanks for sharing!


Thank you both. @mnemnion, can I interpret what you mean as something like, “since you’re only building with the purpose of serializing or rendering, and there’s no use case that involves modifying anything in the middle of the data tree, why build the tree and then serialize? Instead, just pass writers directly as you build”? I think this is a reasonable proposal that probably wasn’t given much consideration in my mind, in part due to my (ingrained) experience with creating constructs like this in Python with dominate, where the norm is to just render at the end. In that case, since everything is “dynamic” (no attempt at contiguous indexed storage, etc.), one CAN modify (‘insert’ into) the middle of the tree. However, I’ll need to apply my mind to the ergonomics of write()ing out bits at a time, and read()ing in a mirroring way that is also meaningful for keeping track of the tree depth (“tags you need to close”, as you say). I can see, roughly, how this might work and offer some advantages, but I’ll have to dwell some more on the tradeoffs. I can also see how it might be a more typical approach; I honestly don’t have a lot of experience analyzing a family of solutions that accomplish similar goals. So, thanks for the thoughts.


Yes. I think, Zig being what it is, that a ‘streaming’ XML builder (ok, HTML, but, genus/species) is a more natural fit for the application.

Python doesn’t allow for decisions about when to allocate, and it’s just not that kind of language generally: you do things the Python way and hope Moore’s Law holds out.

I doubt there’s an ergonomic way to use this kind of builder and “stuff” content lower in the stack, and that’s all you’d get by storing snippets as you go, some kind of hypothetical “oops, better find a div and add a class to it” facility. I deem it preferable to simply not support that.

It’s a different architecture in ways which you’ll uncover if you go that route: you’ll discover there are perhaps two states which are ambiguous, and will call for a tie-breaker (either “close here” or “continue here”, probably the former). Still, I think you’ll be happier with the result if you make it streaming. Just a hunch.

I’ll have to think more on your points before I can respond reasonably, but a few thoughts come to mind, and I’d enjoy feedback on the ones that follow.

First, I can identify my use cases: if I rewrite my old Python stuff using httpz or zzz or std, then I’d likely use this little zomino in much the way I used (python-)dominate, which is very similar to the example you see in the readme: building tables and “snippets” of that sort, based on client-fed data, and sending the snippets back to the client over websockets (so, I guess zzz isn’t a webserver option right now). Future use cases are actually very similar, but the server would likely be running on a device (embedded OS), and we’re not talking about serving millions of users, but about serving in resource-gentle ways and yet possibly a good bundle of simultaneous users (dozens). Yes, the data could be exchanged as JSON or XML (meh) or whatever. Right now, client→server is always JSON, and (Python) server→client is always HTML. This zomino idea just sends the “proprietary” serialized data, for a (hypothetical) WASM recipient to unpack to real HTML.

So, I’m inclined to compare the pros of your proposal (to just “stream”) with the pros I might be hoping for in this build-then-send scheme. Two obvious “pros” for streaming come to mind:

  1. no need to build, just send (saving memory, logic, etc.).
  2. as soon as some data is ready, it might be receivable and processable client-side, while the server continues to work on the remainder.

A couple of the “pros” for my build-first scheme (which I’ve not stated yet, so one could say “well, you didn’t say that”) are

  1. compression-decompression (since 100% of string data (text node content as well as, e.g., tag attribute values) is a contiguous array, compression with the whole bulk available could be more efficient).
  2. there are some race scenarios in which the server, upon processing, let’s say, an “interrupt”, may scrap the build and restart; if it has already streamed half the data, it would need a slightly different approach to tell the client, “never mind, this instead”. Not a big deal, but a slight consideration.
  3. it’s normal in at least one of my bigger projects that there are multiple clients intended to receive the result of a built snippet. I’ve always built it in the past, then looped through the registered clients to broadcast over the websockets to each. (So there’s inertia - not that that’s ever a good reason alone.) But there might be an argument for the ergonomics of a logic block that builds, then a sender that broadcasts the result, over a block that builds-and-streams as it goes (to each client) and has to send a little extra data (node terminators/end-tags) along the way. Especially when combined with #2, when the build is interrupted with updated data (this might happen 10% or less of the time), and possibly #1, wanting to compress the text prior to sending, there starts to be a case for the scheme. Maybe. :slight_smile:
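
The three points above can be sketched together in a few lines. This is a Python illustration only, under stated assumptions: `FakeClient` is a hypothetical stand-in for a websocket connection, `build_snippet` and `broadcast` are made-up names, cell text is assumed to be already escaped, and zlib stands in for whatever compression would actually be used.

```python
import zlib


class FakeClient:
    """Hypothetical stand-in for a websocket connection; records what it was sent."""

    def __init__(self):
        self.received = []

    def send(self, payload: bytes):
        self.received.append(payload)


def build_snippet(rows, interrupted=lambda: False):
    """Build the whole snippet in memory; return None if interrupted mid-build."""
    parts = ["<table>"]
    for row in rows:
        if interrupted():
            # Nothing has been sent yet, so scrapping the build needs no
            # "never mind, this instead" message to any client.
            return None
        # Assumption: cell text is already escaped.
        parts.append("<tr>" + "".join(f"<td>{cell}</td>" for cell in row) + "</tr>")
    parts.append("</table>")
    return "".join(parts)


def broadcast(snippet, clients):
    """Compress once with the whole text available, then send to every client."""
    payload = zlib.compress(snippet.encode("utf-8"))
    for client in clients:
        client.send(payload)
    return payload
```

The build step and the send step stay separate, which is the ergonomic argument made in point 3; whether that outweighs the transient allocation is exactly the tradeoff under discussion.
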

There are, or could be, Writers for each of those scenarios. But maybe the balance of factors points to build-first. A bit of transient allocation is not the end of the world. :slight_smile:

But I’ll make that a paragraph instead of a sentence: Writer is an interface. It can be an interface to streaming compression or to “sponge then compress” compression, it can do build-first, and it can be the contact point for multicasting a string. You can make your own! It needs to respond to four commands, and any that don’t make sense can be dummies. It’s rather good engineering in my opinion.
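
As a rough illustration of that idea, here is a sketch in Python, where the “interface” is just duck-typed `write()` and `flush()` methods. This is not Zig's `std.Io.Writer` and makes no claim about its exact commands; `MulticastWriter`, `CompressingWriter`, and `emit` are hypothetical names. The point is that the producer code never changes, only the writer handed to it.

```python
import io
import zlib


class MulticastWriter:
    """Forwards every write to several underlying writers."""

    def __init__(self, *sinks):
        self.sinks = sinks

    def write(self, data: bytes):
        for sink in self.sinks:
            sink.write(data)
        return len(data)

    def flush(self):
        for sink in self.sinks:
            sink.flush()


class CompressingWriter:
    """Streams zlib-compressed bytes into `sink` as data is written."""

    def __init__(self, sink):
        self.sink = sink
        self._compressor = zlib.compressobj()

    def write(self, data: bytes):
        self.sink.write(self._compressor.compress(data))
        return len(data)

    def flush(self):
        # Z_SYNC_FLUSH emits everything buffered so far and keeps the stream usable.
        self.sink.write(self._compressor.flush(zlib.Z_SYNC_FLUSH))
        self.sink.flush()


def emit(writer):
    """Producer code: it only knows that `writer` has write() and flush()."""
    writer.write(b"<ul>")
    writer.write(b"<li>one</li>")
    writer.write(b"</ul>")
    writer.flush()
```

Calling `emit(MulticastWriter(plain, CompressingWriter(packed)))` with two `io.BytesIO()` buffers leaves the raw markup in `plain` and a compressed copy in `packed`, from the same producer code.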


Ah, good point about Writer. Thanks for the great input.