You can see what’s going on at Unicode plus, a search engine I find very helpful when dealing with Unicode (as I do quite often).
The UTF-8 encoding of Α is 0xce 0x91. If you write the Zig string "\xce\x91" you’ll get Α, or you can write it "Α", or "\u{391}", or "\u{0391}" if you would like (note the curly braces, that’s important).
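const std = @import("std");
const expectEqualStrings = std.testing.expectEqualStrings;

// All three spellings produce the same two bytes, 0xce 0x91.
test "ways to write Α" {
    try expectEqualStrings("Α", "\xce\x91");
    try expectEqualStrings("\u{391}", "\xce\x91");
    try expectEqualStrings("\u{0391}", "\xce\x91");
}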
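If it helps to see the bytes and the code point side by side, here is a minimal sketch. It assumes a reasonably recent Zig and only uses `std.unicode.Utf8View` from the standard library; the test name is just for illustration.

```zig
const std = @import("std");

test "Α is two bytes but one code point" {
    const s = "\u{391}";

    // .len counts bytes, not characters.
    try std.testing.expectEqual(@as(usize, 2), s.len);
    try std.testing.expectEqualSlices(u8, &[_]u8{ 0xce, 0x91 }, s);

    // Decoding the UTF-8 gives back the single code point U+0391.
    const view = try std.unicode.Utf8View.init(s);
    var it = view.iterator();
    try std.testing.expectEqual(@as(?u21, 0x391), it.nextCodepoint());
    try std.testing.expectEqual(@as(?u21, null), it.nextCodepoint());
}
```

Run it with `zig test` on the file.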
In JSON, you can just add Α directly to your string, because like Zig source code, it’s natively UTF-8 encoded. If you want to use an escape, it does have to be "\u0391", with exactly four hex digits and no braces. Unlike Zig strings, JSON strings do not allow arbitrary byte sequences.
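To check that the JSON escape and the Zig escape end up as the same bytes, something like the following should work. This is a sketch that assumes Zig 0.11 or later, where `std.json.parseFromSlice` exists in this form; older versions had a different JSON API.

```zig
const std = @import("std");

test "JSON \\u escape decodes to the same UTF-8 bytes" {
    // The JSON document here is the text "\u0391", quotes included.
    const json_text = "\"\\u0391\"";

    const parsed = try std.json.parseFromSlice(
        []const u8,
        std.testing.allocator,
        json_text,
        .{},
    );
    defer parsed.deinit();

    // Same two bytes as the Zig literals "\u{391}" and "\xce\x91".
    try std.testing.expectEqualStrings("\u{391}", parsed.value);
    try std.testing.expectEqualStrings("\xce\x91", parsed.value);
}
```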
If you want to escape an emoji in JSON, or any other code point outside the Basic Multilingual Plane, you have to use surrogate pairs, like a savage. It’s Microsoft’s fault. In Zig you just write the code point in hexadecimal, and the curly braces mark where it ends, so you can write Α0 as "\u{391}0" and you won’t get 㤐 (U+3910).
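For the curious, here is roughly what a surrogate pair is, plus the brace behaviour, as a sketch. `surrogatePair` is a hypothetical helper I made up for illustration, not something from the standard library, and the single-argument `@intCast` assumes Zig 0.11 or later.

```zig
const std = @import("std");

/// Hypothetical helper: split a code point above U+FFFF into the two
/// UTF-16 surrogates that JSON's \uXXXX\uXXXX form expects.
fn surrogatePair(cp: u21) [2]u16 {
    std.debug.assert(cp >= 0x10000 and cp <= 0x10FFFF);
    const v: u21 = cp - 0x10000; // 20 significant bits remain
    const high: u16 = @intCast(0xD800 + (v >> 10)); // top 10 bits
    const low: u16 = @intCast(0xDC00 + (v & 0x3FF)); // bottom 10 bits
    return .{ high, low };
}

test "U+1F600 needs a surrogate pair in JSON, one escape in Zig" {
    const pair = surrogatePair(0x1F600);
    // JSON spelling would be "\uD83D\uDE00".
    try std.testing.expectEqualSlices(u16, &[_]u16{ 0xD83D, 0xDE00 }, &pair);
    // Zig spelling is a single escape; the result is four UTF-8 bytes.
    try std.testing.expectEqualStrings("\xf0\x9f\x98\x80", "\u{1F600}");
}

test "braces end the escape" {
    // Α followed by the digit 0: three bytes.
    try std.testing.expectEqualStrings("\xce\x910", "\u{391}0");
    // A different code point entirely, U+3910.
    try std.testing.expectEqualStrings("\xe3\xa4\x90", "\u{3910}");
}
```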
Welcome to Ziggit! Hope this helps.