Ziggit.dev, utf8, code output & examples; What is the correct way to display code and output which legitimately contains pictures of cats?

Hi

I have zig code with u21 utf8 chars in it. They display fine in terminal. They display fine in vim. They display fine in zig. Ziggit.dev not so much.

Here is my output. All I can see in it is funny squiggles:

😺 😻 😼 😽 😾

I have tried using a Markdown code block, but that does not work:

😺 😻 😼 😽 😾

So what is the correct way to display code and output which contains pictures of cats?

If you could answer this question that would be purrrrfect.

Sorry.

I imagine that has to do with browser support for unicode/font being used in the browser.

I don’t think it’s a browser problem, because:

  1. I used the browser to find the unicode in the first place
  2. I can load the program from a local file and it displays in the browser fine

[Image: Screenshot from 2024-06-19 12-58-53]

Behold, cats: 😺 😼 😸

It can certainly display them, so I’m not sure what’s up? This probably belongs in Site Feedback. Are you trying to get them to display in code blocks? I get this result in that case:

:smiley_cat: :smirk_cat: :smile_cat:

There are ligature differences for code blocks too, though… for instance: → vs ->

Yes, and the output.

It’s actually part of the code, i.e. the 7-bit ASCII equivalent would be ‘A’, i.e. 65.

In this case I am dealing with u21 and utf8
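To make the u21 versus UTF-8 distinction concrete, here is a minimal sketch, assuming a reasonably recent Zig standard library. A character literal such as '😺' is a single code point that fits in a u21, while its UTF-8 form is a sequence of four bytes. The helper name `encodeCodePoint` is made up for this illustration and is not from the poster's code.

```zig
const std = @import("std");

/// Hypothetical helper: encodes one code point into `buf` as UTF-8
/// and returns the slice of bytes that was written.
fn encodeCodePoint(cp: u21, buf: *[4]u8) ![]const u8 {
    const len = try std.unicode.utf8Encode(cp, buf);
    return buf[0..len];
}

pub fn main() !void {
    const cat: u21 = '😺'; // code point U+1F63A
    var buf: [4]u8 = undefined;
    const bytes = try encodeCodePoint(cat, &buf);

    // Print the code point in hex, then each UTF-8 byte in hex.
    std.debug.print("U+{X} -> {d} bytes:", .{ cat, bytes.len });
    for (bytes) |b| std.debug.print(" {X:0>2}", .{b});
    std.debug.print("\n", .{});
}
```

On a UTF-8 terminal this should report four bytes, F0 9F 98 BA, for the single code point U+1F63A.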

If it’s okay with you I will submit my showcase in picture form rather than text.

Sure - let’s give that a shot 👍

Does your HTML say this?

 <meta charset="UTF-8"> 

Because that looks like Latin-1 mojibake.
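For anyone unsure what "Latin-1 mojibake" means here, the following is a rough sketch of the general mechanism, not a diagnosis of this particular case. It takes the UTF-8 bytes of one cat, pretends each byte is a separate Latin-1 character, and re-encodes the result as UTF-8. The function name `latin1ToUtf8` is invented for the example.

```zig
const std = @import("std");

/// Hypothetical helper: re-encodes `input` as if every byte were one
/// Latin-1 character. Returns the part of `out` that was written.
/// `out` must hold at least 2 * input.len bytes.
fn latin1ToUtf8(input: []const u8, out: []u8) ![]const u8 {
    var n: usize = 0;
    for (input) |byte| {
        // Latin-1 byte values map 1:1 onto code points U+0000..U+00FF.
        n += try std.unicode.utf8Encode(byte, out[n..]);
    }
    return out[0..n];
}

pub fn main() !void {
    const cat = "😺"; // stored as the UTF-8 bytes F0 9F 98 BA
    var out: [2 * cat.len]u8 = undefined;
    const garbled = try latin1ToUtf8(cat, &out);

    // Print hex rather than text, since two of the results are control characters.
    std.debug.print("{d} bytes in, {d} bytes out:", .{ cat.len, garbled.len });
    for (garbled) |b| std.debug.print(" {X:0>2}", .{b});
    std.debug.print("\n", .{});
}
```

The four original bytes become four separate characters (ð, two invisible control characters, and º), which is the kind of "funny squiggles" a misdeclared charset produces.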

Sorry. Don’t quite understand what you are asking me here.

I am copying and pasting Zig files using the cat command. (Oh, the irony.)

cat ziggit_showcase.txt

Now I can just paste it into my

    // Example 3 - UTF8_catface, Meow! Purrrr
    var utf8_catface = Range(u21,'😺','😾'){}; // Unicode into unsigned 21 bit
    while ( utf8_catface.step(1) ) |index| { std.debug.print("{u} ",.{index}); }
    std.debug.print("\n",.{});
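The `Range` type in the snippet above is the poster's own and its definition is not shown in this thread, so the snippet does not compile by itself. As a stand-in for readers who want to reproduce the output, here is a minimal sketch that walks the same code points with a plain `while` loop. It assumes the range is inclusive at both ends, which matches the five cats shown earlier; `printRange` is a name made up for this example.

```zig
const std = @import("std");

/// Hypothetical stand-in for the poster's Range iterator: prints every
/// code point from `first` to `last` inclusive and returns how many
/// were printed. Assumes `last` is below the maximum u21 value.
fn printRange(first: u21, last: u21) usize {
    var count: usize = 0;
    var cp: u21 = first;
    while (cp <= last) : (cp += 1) {
        // {u} formats an integer as a Unicode code point.
        std.debug.print("{u} ", .{cp});
        count += 1;
    }
    std.debug.print("\n", .{});
    return count;
}

pub fn main() void {
    _ = printRange('😺', '😾'); // U+1F63A through U+1F63E
}
```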

No html on my end. AFAIK.

Baffling. This is what I get when I copy and paste some Zig code run through zig fmt:

    const cat = '😻';
    const cats = "😹😻🙀😿";

I am running Debian.

Perhaps if I just upload the file.

main.zig (3.7 KB)

If you cat it out and paste what happens?

I see cats:

    // Example 3 - UTF8_catface, Meow! Purrrr
    var utf8_catface = Range(u21,'😺','😾'){}; // Unicode into unsigned 21 bit
    while ( utf8_catface.step(1) ) |index| { std.debug.print("{u} ",.{index}); }
    std.debug.print("\n",.{});

Does the problem persist after rebooting?

Just before I reboot, I am switching from PuTTY to the built-in terminal in Debian.

    // Example 3 - UTF8_catface, Meow! Purrrr
    var utf8_catface = Range(u21,'😺','😾'){}; // Unicode into unsigned 21 bit
    while ( utf8_catface.step(1) ) |index| { std.debug.print("{u} ",.{index}); }
    std.debug.print("\n",.{});

Solved

It’s a PuTTY thing! Which is weird. Sorry for the inconvenience.
PuTTY is pretty much the industry standard where I come from.

Thanks for your help.

Might be a font thing? See “Can PuTTY be configured to display the following UTF-8 characters?” on Server Fault.

My PuTTY is set to UTF-8 encoding and displays the cats fine.
It’s only when I copy and paste that I have the issue.
I tried all the fonts anyway, with no luck.

I then copied and pasted into KWrite and it worked fine.
I then copied and pasted the same text from KWrite (which came from PuTTY) to Ziggit.dev:

    // Example 3 - UTF8_catface, Meow! Purrrr
    var utf8_catface = Range(u21,'😺','😾'){}; // Unicode into unsigned 21 bit
    while ( utf8_catface.step(1) ) |index| { std.debug.print("{u} ",.{index}); }
    std.debug.print("\n",.{});

So it is clearly a PuTTY vs Firefox problem.

So the workaround is to use a different terminal, or to copy and paste via a UTF-8 text editor.

I will plug away. And see if I can resolve this.

Thanks

So it’s a clipboard thing? Really weird. I mean, if an app won’t use the system clipboard for some reason - ok, but messing up characters?.. Anyways, glad the cats finally made it to ziggit 😛

Could be that you’re experiencing a bug in how clipboard content is being converted between applications.

Assuming you’re using X11, see the “content type and conversion” section:
https://www.uninformativ.de/blog/postings/2017-04-02/0/POSTING-en.html

Interesting reading, Thank you.

It would appear that the same UTF-8 character can be encoded in different ways (see below).
Interestingly, one reason I use PuTTY is that its copy and paste is so good. None of this Ctrl-C, Ctrl-V business. It’s just highlight and then middle mouse press. I have a workaround, so I will solve this when I have more free time.

Thanks for all your help. And sorry for troubling you.

Overlong encodings

In principle, it would be possible to inflate the number of bytes in an encoding by padding the code point with leading 0s. To encode the euro sign € from the above example in four bytes instead of three, it could be padded with leading 0s until it was 21 bits long – 000 000010 000010 101100, and encoded as 11110000 10000010 10000010 10101100 (or F0 82 82 AC in hexadecimal). This is called an overlong encoding.

https://en.wikipedia.org/wiki/UTF-8
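One caveat on the quoted passage: overlong forms are not valid UTF-8, and conforming decoders are expected to reject them, so they are unlikely to be what a clipboard produces. A small sketch, assuming Zig's `std.unicode.utf8ValidateSlice`, checks the two euro-sign byte sequences from the quote. The constant names are chosen for this example only.

```zig
const std = @import("std");

// U+20AC (euro sign) in its normal three-byte form.
const shortest = [_]u8{ 0xE2, 0x82, 0xAC };
// The padded four-byte form from the quoted passage.
const overlong = [_]u8{ 0xF0, 0x82, 0x82, 0xAC };

pub fn main() void {
    std.debug.print("shortest form valid: {}\n", .{std.unicode.utf8ValidateSlice(&shortest)});
    std.debug.print("overlong form valid: {}\n", .{std.unicode.utf8ValidateSlice(&overlong)});
}
```

The first check should report true and the second false, so a valid UTF-8 stream has exactly one encoding per code point.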