Zeppelin - cross-platform 2D graphics

It’s been a long road, but I’ve finally tagged the first release of Zeppelin, a 2D graphics and window library in pure* Zig for Linux (Wayland) and Windows. :partying_face::partying_face:

It features more or less complete vector graphics (hardware accelerated through Vulkan), a window context and events, keyboard mapping, and more. All this with almost zero dependencies.

It’s still in alpha state but I’ve put quite some work into it and I’m excited to share it with you, hope you’ll like it!

30 Likes

Very cool, looks like a good start! It built and ran right away on my Windows machine. A bit “fuzzy” (see my notes on DPI below), but a good starting point.

I’ve been playing around with my own Windowing library and have some insights to share if you’re interested.

When it comes to cross-platform APIs it can be hard to create a good abstraction. In general you have to accommodate the “worst case”, and I’ve found that for windowing the worst case is Microsoft Windows, which unfortunately is callback based. My understanding right now is that if you try to abstract away the callback-based nature of the underlying system API, it becomes a headache to make everything work properly, and most solutions end up sacrificing features/functionality because of it. I actually ran into the same problem with audio APIs: macOS is callback based, and making an abstraction that works well on all platforms became much simpler once I gave up trying to hide that and just made my API callback based as well. Callback-based APIs are harder to use, but trying to abstract one away is a nightmare.

So right now when I run your code on Windows, if I move/resize the window, the application essentially pauses because the wndproc recurses in on itself and never returns control back to the application loop to update the frames. If the application itself is also callback based then this isn’t a problem, because you’ll continue to get your “paint” callbacks even if the main loop doesn’t return. There are ways to trick/work around this design, but you end up losing built-in features like window snapping, shake gestures, etc., and these features are specific to each version of Windows; attempting to override and re-implement them for every Windows version is a herculean task. Also note that this isn’t just a problem with moving/resizing a window, it’s also a problem with various Win32 APIs: functions like MessageBox or ShellExecute also start their own message loop until they are finished. So be careful about customizing your message loop, or at least about depending on a customized one.
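To make that concrete, here is a minimal C sketch (not Zeppelin code, just one well-known workaround) of how a callback-structured Win32 app keeps rendering while the modal move/resize loop owns the thread: request WM_TIMER messages on WM_ENTERSIZEMOVE and draw from inside the window procedure, since that is the only code that keeps running.

```c
#include <windows.h>

// Placeholder for whatever actually draws and presents a frame.
static void render(HWND hwnd) { (void)hwnd; }

static LRESULT CALLBACK WndProc(HWND hwnd, UINT msg, WPARAM wp, LPARAM lp)
{
    switch (msg) {
    case WM_ENTERSIZEMOVE:
        // The modal move/resize loop starts here; DispatchMessage will not
        // return to the app's main loop until it ends, so ask for periodic
        // WM_TIMER messages to keep getting called back in the meantime.
        SetTimer(hwnd, 1, 16, NULL);
        return 0;
    case WM_EXITSIZEMOVE:
        KillTimer(hwnd, 1);
        return 0;
    case WM_TIMER:
        render(hwnd);
        return 0;
    case WM_PAINT:
        render(hwnd);
        ValidateRect(hwnd, NULL);
        return 0;
    case WM_DESTROY:
        PostQuitMessage(0);
        return 0;
    }
    return DefWindowProcW(hwnd, msg, wp, lp);
}
```

The tradeoff is the one described above: rendering has to be reachable from the callback, not only from the application’s own loop.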

Your API is similar to those I’ve seen that are usually meant for games, and many times games can get away with this simpler API because it’s OK if they don’t interact well with the desktop compositor. It’s OK if they don’t update their window content right away because they’re just going to update it in 10 ms anyway, and a little bit of artifacting while resizing/moving the window isn’t a huge deal because most of the time the user is just playing the game, not managing the window. These problems become more noticeable/important when you’re making a general-purpose UI framework for non-gaming applications.

I’ve learned that SDL3 has added a new callback-based API to address the same problems (Main callbacks in SDL3), and it’s not just Windows that uses this kind of API; I gather there are some newer web-based platforms that do as well.

On another note, it looks like you’ve yet to implement DPI scaling, so here’s some guidance if/when you get to it. In general, to support DPI awareness you’ll want to scale all your UI by a factor of GetDpiForWindow() / 96.0. Note that DPI can be different for each monitor; I like to keep two monitors side by side at different DPIs and quickly move a window between them to test that I’m handling the change correctly. (Changing the DPI on a monitor just means changing the monitor scale, see Win32DPI and MonitorScale.) Next, note that all modern Windows applications should be as DPI-aware as possible (PerMonitorV2 if Windows is new enough). There are ways to enable this programmatically, but in my experience that always seems to break in subtle ways; the only foolproof way I’ve found is adding a manifest file, so the OS can determine the DPI awareness before jumping into the executable’s entry point.

Note that you’ll want to handle WM_GETDPISCALEDSIZE and WM_DPICHANGED. You get the first when the DPI is about to change and Windows needs to know the size of the new window, and the latter is where you actually change the window size after the DPI change occurs. In the latter you never want to re-calculate/change the size you gave the OS in WM_GETDPISCALEDSIZE; otherwise that window size change could place the window on another monitor and change the DPI again, and you can end up in an infinite DPI-change loop. WM_DPICHANGED should only ever call SetWindowPos with the new suggested rect, and it’s important that you do call SetWindowPos yourself; DefWindowProc doesn’t.
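Here is a hedged C sketch of those two handlers, assuming a PerMonitorV2-aware process on a recent SDK (it isn’t Zeppelin code, just the shape of the logic described above):

```c
#include <windows.h>

static LRESULT CALLBACK WndProc(HWND hwnd, UINT msg, WPARAM wp, LPARAM lp)
{
    switch (msg) {
    case WM_GETDPISCALEDSIZE: {
        // Windows asks what total window size we want at the new DPI (wp);
        // fill in the SIZE pointed to by lp and return TRUE to accept it.
        UINT new_dpi = (UINT)wp;
        UINT old_dpi = GetDpiForWindow(hwnd);
        SIZE *size = (SIZE *)lp;
        RECT rc;
        GetWindowRect(hwnd, &rc);
        // Naive proportional scaling; a careful version would recompute the
        // non-client frame with AdjustWindowRectExForDpi instead.
        size->cx = MulDiv(rc.right - rc.left, new_dpi, old_dpi);
        size->cy = MulDiv(rc.bottom - rc.top, new_dpi, old_dpi);
        return TRUE;
    }
    case WM_DPICHANGED: {
        // Apply the rect Windows suggests as-is; recomputing it here could
        // land the window on another monitor and loop the DPI change.
        // DefWindowProc will NOT call SetWindowPos for us.
        const RECT *suggested = (const RECT *)lp;
        SetWindowPos(hwnd, NULL,
                     suggested->left, suggested->top,
                     suggested->right - suggested->left,
                     suggested->bottom - suggested->top,
                     SWP_NOZORDER | SWP_NOACTIVATE);
        // LOWORD(wp) carries the new DPI; rescale the UI by LOWORD(wp) / 96.0.
        return 0;
    }
    }
    return DefWindowProcW(hwnd, msg, wp, lp);
}
```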

P.S. I bring up the points about the callback-based API not to say that your framework should change, just to point out the tradeoffs. I think it’s good to have cross-platform APIs that sacrifice polish for the sake of simplicity, and maybe that’s the tradeoff you want to make with your library.

9 Likes

Thanks for the insights!

I’m aware of both issues (the move/resize event loop and the DPI scaling) and want to tackle both, but haven’t gotten to them yet.

The callback nature of the Windows API is IMO a mistake, but I don’t believe the solution is to obey it. Instead I want to take control, or more specifically in this case, handle the non-client events and control the move/resize ourselves. I know you already tried such a solution and the performance wasn’t great, but I’m not convinced it has to be this way, given that it already works the same on Wayland (handle events first, render afterwards).

I’m a fan of “user-defined” control flow; that’s why this is a library and not a framework, and also why it doesn’t provide a main loop. But let’s see how hard Windows will make it.

The most important thing is to drive the Zig GUI ecosystem forward and to also have some fun in the meantime.

Btw, you really hate git histories, don’t you :sweat_smile:

5 Likes

I really dislike GUI APIs that force you to put your logic into a bunch of separate methods that get called from code you have no control over, so I think it’s great that you’re exploring something that gives the user more freedom.

3 Likes

The callback nature of the Windows API is IMO a mistake, but I don’t believe the solution is to obey it. Instead I want to take control, or more specifically in this case, handle the non-client events and control the move/resize ourselves. I know you already tried such a solution and the performance wasn’t great, but I’m not convinced it has to be this way, given that it already works the same on Wayland (handle events first, render afterwards).

At this point I’m convinced you’re up against an impossible task…but I genuinely hope you prove me wrong :slight_smile:

2 Likes

By the way, you mentioned Microsoft’s callback-based design was a mistake. I’m curious what you’d replace it with, what alternative designs you’d consider, and how you think they would compare and potentially improve things?

The other real-world windowing interface I’m familiar with is X11, where everything is serialized and sent over a socket. I’m not sure I’d say one is superior to the other though; it seems like they both have tradeoffs. The benefit of the callback-based one is that it allows the platform to easily implement message priorities and ordering guarantees. With a serialized system, once you send something you can’t really unsend it and send a new higher-priority operation in front of it. A callback interface also has the opportunity to interact with the application on the same thread, something a serialized interface like X11 can’t do at all. To me the callback-based one in general gives more control to the underlying platform, meaning there are always going to be things a non-callback-based one can’t do that a callback-based one could. Maybe that’s OK though? Maybe the things a serialized interface can’t do aren’t important/necessary? Or maybe you know of another design besides callback/serialized that works better for a windowing system?

2 Likes

While I don’t happen to know of a GUI which uses it, a viable alternative would be a message-passing architecture with priority queues.

Since I’m not aware of an example, it’s kind of a handwaving statement, but I bet it would work and it has nice properties. Actor-message architectures are pleasant to work with, I’m always looking for a way to organize complex code that way and am happy when I find one.

The biggest thing is that changes are local: every state change happens when the actor (a window or whatever) reads the mailbox. Callbacks reach out and touch each other, and that gets arbitrarily painful to reason about.
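To match the hand-waving, here is a tiny C sketch of the idea with made-up names (Mailbox, mailbox_post, window_actor_drain): events carry a priority, and the window “actor” only mutates its state while draining its mailbox.

```c
#include <stdio.h>

typedef enum { PRIO_HIGH = 0, PRIO_NORMAL = 1, PRIO_LOW = 2 } Priority;

typedef struct {
    Priority prio;
    const char *name; // stand-in for a real event payload
} Message;

typedef struct {
    Message items[64];
    int count;
} Mailbox;

// Insert in priority order so the drain loop stays trivial.
static void mailbox_post(Mailbox *mb, Message m)
{
    if (mb->count >= 64)
        return; // sketch: drop on overflow
    int i = mb->count++;
    while (i > 0 && mb->items[i - 1].prio > m.prio) {
        mb->items[i] = mb->items[i - 1];
        i--;
    }
    mb->items[i] = m;
}

// The only place this "actor" touches its own state.
static void window_actor_drain(Mailbox *mb)
{
    for (int i = 0; i < mb->count; i++)
        printf("handling %s\n", mb->items[i].name);
    mb->count = 0;
}

int main(void)
{
    Mailbox mb = {0};
    mailbox_post(&mb, (Message){ PRIO_LOW,    "redraw" });
    mailbox_post(&mb, (Message){ PRIO_HIGH,   "close-request" });
    mailbox_post(&mb, (Message){ PRIO_NORMAL, "pointer-move" });
    window_actor_drain(&mb); // close-request, pointer-move, redraw
    return 0;
}
```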

1 Like

Actually, Windows combines message passing/priority queues with a callback system. At the top level, any thread that is servicing windows has its own message queue. Typically a process that creates windows has one GUI thread servicing all of them with its message queue. However, when it comes time to process/handle the messages, you call a function, DispatchMessage, which gives control back to the platform until it calls your various callbacks. Once inside your callback, it’s also common to recurse even deeper and pass control back to the system yet again via a call to DefWindowProc. Here’s roughly what that looks like:

A. App pops message from queue, calls DispatchMessage to give control to the OS
B.     -> OS does some stuff then forwards the message back to the app via WndProc (the app's callback function)
C.         -> App handles the message, sometimes calls back into the OS via DefWindowProc
D.             -> OS might call WndProc again or even start a new message loop where it starts popping messages off the queue
               ...
E.         <- App returns from WndProc, control given back to OS
F.     <- OS does more stuff in response to app finishing its message handler, then it returns control back to the app (DispatchMessage finally returns).

App is now free to pop more messages off the queue or do whatever else it wants. Note that the queue may not be in the same state as it was when we last interacted with it; it's likely messages were added or removed while we were handling the last message.
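For concreteness, here is a minimal C version of the application side of step A; steps B through F all happen inside the DispatchMessageW call.

```c
#include <windows.h>

// Classic top-level message pump. Control leaves the app at DispatchMessageW
// and only returns once the whole callback chain (WndProc, DefWindowProc,
// any nested modal loops) has unwound.
int run_message_loop(void)
{
    MSG msg;
    while (GetMessageW(&msg, NULL, 0, 0) > 0) {
        TranslateMessage(&msg);
        DispatchMessageW(&msg); // steps B..F happen in here
    }
    return (int)msg.wParam; // wParam of WM_QUIT
}
```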

Note that the important pieces to keep in mind when considering the callback system are parts “E” and “F” labeled above. If an application is prevented from participating in this callback system, there’s no way for it to insert itself at step E. The reason this becomes a problem is that step F sometimes requires that something specific be done (for example, update the contents of the window). If you don’t conform to this, you start getting weird behavior/artifacting because you’ve violated the OS’s assumptions about what your app is doing.

Having gone over all this to explain it, it now seems silly to me that Windows has this callback interface at all. It has already gone to the trouble of creating/storing all the messages in a queue, which means it has already given up all the benefits/control you get from a callback system. It’s like the worst of both worlds: no benefits from a callback-based API, yet you still force the application to conform to this more complicated interface.

5 Likes

This looks really nice. The native Linux build did panic on startup (I opened an issue), but the Win32 build cross-compiled and ran fine under Wine. Very impressive!

I recorded an example that shows the use case that’s hard to support without a callback-based system in the app. Let me know if you figure out how to implement it with your API!

1 Like

@neurocyte Thanks again for the hint, I hope it now also runs smoothly under Wayland!

@marler8997 Assuming the performance issues aren’t inherent, I’d probably handle the whole event tree for resizing/moving myself and drop some extra features like snapping or shake-to-minimize. For features that have a corresponding API, one could optionally use it to emulate the expected behaviour, but it seems that isn’t the case for e.g. window snapping. Another idea I’d like to try is to generate “fake” events and call the default handler on them, at least for those events that don’t block.

I’d say that is smooth, yes!

1 Like

Wow, I didn’t expect one could get more than 1000-1100 FPS out of it, but that really is another level :grin:

Once I’m done with most of the roadmap, maybe I need to invest a little in faster anti-aliasing; MSAA tanks performance quite a bit on lower/mid-tier GPUs.

Ah OK, fair enough. Unfortunate, but it seems it’s the best we can do. I recently implemented this technique in raylib (as it uses the same kind of API) but ended up dropping it, because it felt bad adding a half-baked implementation that would need quite a bit more code to support all the possible features/Windows versions, for something that some raylib applications don’t even care about (updating during window move/resize). But I saved the code in a gist if you’d like to take a look. This one is specifically for moving the window; resizing will be similar, but when I implemented it for resize it looked/felt much “jankier” than the “move window” case: Win32 Silly Override MoveWindow Handler · GitHub

P.S. One thing raylib could do to address this: if applications want this functionality, they could add an optional draw callback. That could be a solution for your library as well.

There’s something about callback-based designs that seems to push the API into an awkward place. I don’t quite have it figured out yet, but I think it has to do with composability and “who’s in charge”. It’s also historically entangled with object-oriented design.

For example, the dvui dropdown API was originally envisioned with a callback (to provide the labels in the dropdown). It seems easy initially, but then, to have multiple instances of the callback, you either use separate functions or pass some user data in to specialize each one. And a dropdown could potentially be inside another dropdown, so now the callback is reentrant. Creating new functions for local stuff is a pain (unless your language has closures).

Making the API not use callbacks was a lot harder up front, but it gives the application a lot more control. Maybe in a few years we’ll know if that was a good choice.

Since most (all?) of the traditional OS GUI toolkits are callback based, my guess is the alternatives haven’t been explored enough to know the answer. I’m trying!

1 Like

I agree callback APIs are more complicated. However, I think I’ve come to the conclusion that if your underlying system/platform is callback based, then most of the time you can’t abstract that away from the application without giving something up.

For me, if I’m writing an app and choosing a library, I generally prefer having that full control even if it means a more complicated API. The simpler API is nice for small apps/demos and getting up and running quickly, but I’m hesitant to invest time/effort into a library that doesn’t give me that control when I need it.

I think the easy/default route any library takes here is to abstract away the platform and make a unified, simple interface. What’s hard is making an abstraction that is easy to use but still allows the application to circumvent it when needed. An ideal API for this requires deep knowledge of all platforms, so as to be only as complicated as necessary while still leaving in proper hooks for circumventing it. It should avoid unnecessary coupling at all costs. I think when you have a good understanding of each platform the API starts to become clear, but it goes against our initial tendency to want that simple unified interface. On top of this, I think Zig’s type system and compile-time reflection really open up some interesting ways to design an API around this. I’m excited to see what we all come up with over the next few years.

1 Like

Total agreement - I’m also excited by Zig’s build-time and comptime flexibility; I keep finding new ways to use them!

1 Like

A bit off-topic, but I noticed in your video that the animation does not freeze when you click and hold the title bar. When I clone and build the zin example on my Windows 10 machine, though, it freezes the animation for ~1 second when the title bar is clicked and held without moving the mouse. I’ve run into this DispatchMessage timer freeze in the past, but I never found a (single-threaded) solution. Maybe this is an issue only with my setup, but I’ve seen it discussed online.

Ah, very interesting. It does the same thing for me as well: if I click the caption (title bar) without moving my mouse, the OS takes over for exactly 500 ms before it starts doing anything. I added some logging to my wndproc and here’s what we see:

info: 2531: WM_NCLBUTTONDOWN:161(hit=caption:2,point=173,71)
info: 2531: WM_NCLBUTTONDOWN:161(hit=caption:2,point=173,71) > 2531: WM_SYSCOMMAND:274(type=0xf010)
info: 3031: WM_NCLBUTTONDOWN:161(hit=caption:2,point=173,71) > 3031: WM_SYSCOMMAND:274(type=0xf010) > 3031: WM_CAPTURECHANGED:533(0)
info: 3031: WM_NCLBUTTONDOWN:161(hit=caption:2,point=173,71) > 3031: WM_SYSCOMMAND:274(type=0xf010) > 3031: WM_GETMINMAXINFO:36
info: 3031: WM_NCLBUTTONDOWN:161(hit=caption:2,point=173,71) > 3031: WM_SYSCOMMAND:274(type=0xf010) > 3031: WM_ENTERSIZEMOVE:561
info: 3031: WM_NCLBUTTONDOWN:161(hit=caption:2,point=173,71) > 3031: WM_SYSCOMMAND:274(type=0xf010) > 3031: WM_TIMER:275
info: 3031: WM_NCLBUTTONDOWN:161(hit=caption:2,point=173,71) > 3031: WM_SYSCOMMAND:274(type=0xf010) > 3031: WM_PAINT:15
info: 3031: WM_NCLBUTTONDOWN:161(hit=caption:2,point=173,71) > 3031: WM_SYSCOMMAND:274(type=0xf010) > 3031: WM_MOVING:534
info: 3031: WM_NCLBUTTONDOWN:161(hit=caption:2,point=173,71) > 3031: WM_SYSCOMMAND:274(type=0xf010) > 3031: WM_WINDOWPOSCHANGING:70
info: 3031: WM_NCLBUTTONDOWN:161(hit=caption:2,point=173,71) > 3031: WM_SYSCOMMAND:274(type=0xf010) > 3031: WM_WINDOWPOSCHANGING:70 > 3031: WM_GETMINMAXINFO:36
info: 3047: WM_NCLBUTTONDOWN:161(hit=caption:2,point=173,71) > 3047: WM_SYSCOMMAND:274(type=0xf010) > 3047: WM_WINDOWPOSCHANGED:71
info: 3047: WM_NCLBUTTONDOWN:161(hit=caption:2,point=173,71) > 3047: WM_SYSCOMMAND:274(type=0xf010) > 3047: WM_WINDOWPOSCHANGED:71 > 3047: WM_MOVE:3
info: 3047: WM_NCLBUTTONDOWN:161(hit=caption:2,point=173,71) > 3047: WM_SYSCOMMAND:274(type=0xf010) > 3047: WM_NCMOUSELEAVE:674
info: 3047: WM_NCLBUTTONDOWN:161(hit=caption:2,point=173,71) > 3047: WM_SYSCOMMAND:274(type=0xf010) > 3047: WM_TIMER:275
info: 3047: WM_NCLBUTTONDOWN:161(hit=caption:2,point=173,71) > 3047: WM_SYSCOMMAND:274(type=0xf010) > 3047: WM_PAINT:15
info: 3062: WM_NCLBUTTONDOWN:161(hit=caption:2,point=173,71) > 3062: WM_SYSCOMMAND:274(type=0xf010) > 3062: WM_TIMER:275
info: 3062: WM_NCLBUTTONDOWN:161(hit=caption:2,point=173,71) > 3062: WM_SYSCOMMAND:274(type=0xf010) > 3062: WM_PAINT:15
info: 3078: WM_NCLBUTTONDOWN:161(hit=caption:2,point=173,71) > 3078: WM_SYSCOMMAND:274(type=0xf010) > 3078: WM_TIMER:275
info: 3078: WM_NCLBUTTONDOWN:161(hit=caption:2,point=173,71) > 3078: WM_SYSCOMMAND:274(type=0xf010) > 3078: WM_PAINT:15

This logs every time the wndproc is called, showing both the message we received and any window messages we are still inside; the number on the left is a millisecond timestamp. When you click the title bar we get the WM_NCLBUTTONDOWN message, and as you can see from the log we never return from this call to the wndproc. It will eventually return when the user releases the mouse button, but I cut the log short before we see that.
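For anyone wanting to reproduce this, here is a hedged reconstruction (not the exact logging used above): keep a stack of the messages the wndproc is currently inside and print the whole chain with a millisecond timestamp on every entry.

```c
#include <windows.h>
#include <stdio.h>

static UINT g_msg_stack[32];
static int  g_msg_depth = 0;

static LRESULT CALLBACK TracingWndProc(HWND hwnd, UINT msg, WPARAM wp, LPARAM lp)
{
    if (g_msg_depth < 32)
        g_msg_stack[g_msg_depth] = msg;
    g_msg_depth++;

    // Print the chain of messages we are still inside, oldest first.
    DWORD now = GetTickCount();
    fprintf(stderr, "info:");
    for (int i = 0; i < g_msg_depth && i < 32; i++)
        fprintf(stderr, "%s %lu: msg %u", i ? " >" : "", (unsigned long)now, g_msg_stack[i]);
    fprintf(stderr, "\n");

    // DefWindowProc may re-enter TracingWndProc with new messages before this
    // call returns, which is exactly what the nested chains in the log show.
    LRESULT result = DefWindowProcW(hwnd, msg, wp, lp);

    g_msg_depth--;
    return result;
}
```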

The current wndproc just calls DefWindowProc when it gets WM_NCLBUTTONDOWN, which calls the wndproc again with WM_SYSCOMMAND with type 0xf010, which is SC_MOVE. So it looks like the default Windows behavior is simply designed such that if you click the title bar without moving the mouse, it pauses for 500 ms before it starts rendering again. Maybe this is Windows 10 specific behavior? Not sure. In any case it’s what the OS does, and will do with every window, so it’s not too concerning to me; it’s the built-in default behavior, so likely what users would expect. I did try adding an invalidate call when I received WM_SYSCOMMAND just to see if that would cause it to re-draw the content, but it doesn’t look like it.

1 Like

Control is the exact reason why I want to keep the callback API stuff out of Zeppelin. Though it’s probably control-over-flow vs. control-to-use-all-Windows-features. But in my experience, keeping ownership of the control flow (“library vs. framework”) pays off later, as you can easily combine multiple libraries with each other; that's much harder with multiple frameworks. Also I suspect that integrating with Zig’s async, once it’s usable, would be easier than with the callback-based approach: it’s hard to suspend a function if there are Win32 stack frames in between. I want to see how far one can get with the fake-event + DefWindowProc approach before giving in to callbacks.

Thanks by the way for the code snippet; I’ll let you know when I try it in Zeppelin, I’m very curious about the performance.

1 Like