A little update to the bar receipt encoding mystery: I was looking at the wrong code page! While I studied Ancient Greek at school, we didn’t learn about ancient Greek 8-bit encodings – thanks for nothing, German education system!
It turns out that code page 737 is the common Greek 8-bit encoding, not code page 869. With that, we can better reconstruct what happened to the receipts full of question marks.
Example
Let’s compare a good and a messed-up receipt: Φ.Π.Α. turns into ”.?.€.
When we look at the code pages involved:
Code page 737 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
80 | Α | Β | Γ | Δ | Ε | Ζ | Η | Θ | Ι | Κ | Λ | Μ | Ν | Ξ | Ο | Π | 8F |
90 | Ρ | Σ | Τ | Υ | Φ | Χ | Ψ | Ω | α | β | γ | δ | ε | ζ | η | θ | 9F |

Code page 1252 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
80 | € | | ‚ | ƒ | „ | … | † | ‡ | ˆ | ‰ | Š | ‹ | Œ | | Ž | | 8F |
90 | | ‘ | ’ | “ | ” | • | – | — | ˜ | ™ | š | › | œ | | ž | Ÿ | 9F |
…we can trace the conversions of each character.
- Letters Φ and Α are encoded as 94 and 80 (hexadecimal) in code page 737
- When bytes 94 and 80 get parsed as 1252 data, they map to ” and €
- The dot . is at 2E in both code pages and stays intact
- Letter Π is 8F in 737
- But 8F is not assigned in code page 1252 (a gap in the table above)
- It gets replaced with a ?
- Result: ”.?.€.
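The trace above can be replayed in a few lines of Python. This is only a sketch: it assumes Python's built-in `cp737` and `cp1252` codecs match the tables shown here, and `misread` is a made-up helper name. Python marks unassigned bytes with U+FFFD rather than a question mark, so the helper swaps that in to mimic the receipt.

```python
def misread(text, source="cp737", target="cp1252"):
    """Encode text with one code page, then decode the bytes with another."""
    data = text.encode(source)
    # Bytes with no assignment in the target code page decode to U+FFFD.
    garbled = data.decode(target, errors="replace")
    # The receipt shows a plain question mark instead, so mimic that.
    return garbled.replace("\ufffd", "?")


print("Φ.Π.Α.".encode("cp737").hex(" "))  # 94 2e 8f 2e 80 2e
print(misread("Φ.Π.Α."))                  # ”.?.€.
```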
Something is still missing
Or rather: Too much is missing! In other examples of good v. mixed-up texts, there are more question marks than we would expect.
Original | Ξ Ε Ν Ο Δ Ο Χ Ε Ι Α Κ Ω (Ν ) | Σ Υ Ν Ο Λ Ο |
---|---|---|
Expected | ? „ Œ Ž ƒ Ž • „ ˆ € ‰ — | ‘ “ Œ Ž Š Ž |
Actual | ? „ ? ? ƒ ? ? „ ? € ? ? | ? “ ? ? ? ? |
So the “target” code page cannot be the 1252 encoding we know today. It must be a variant with more gaps, i.e. unassigned byte positions, leading to more question marks in the output.
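To see which letters went missing against expectation, the two sample words can be compared position by position. A rough sketch, again assuming Python's `cp737` and `cp1252` codecs; the `SAMPLES` data is copied from the table above with spaces removed and the optional "(Ν )" left out, and the helper names are mine.

```python
# Original word -> what the receipt actually showed.
SAMPLES = {
    "ΞΕΝΟΔΟΧΕΙΑΚΩ": "?„??ƒ??„?€??",
    "ΣΥΝΟΛΟ": "?“????",
}


def misread(text, source="cp737", target="cp1252"):
    """Encode with one code page, decode with another; gaps become '?'."""
    garbled = text.encode(source).decode(target, errors="replace")
    return garbled.replace("\ufffd", "?")


def unexpected_losses(samples):
    """Map each Greek letter to the 1252 character that was expected but came out as '?'."""
    losses = {}
    for original, actual in samples.items():
        expected = misread(original)
        for letter, exp, act in zip(original, expected, actual):
            if act == "?" and exp != "?":
                losses[letter] = exp
    return losses


for letter, expected_char in unexpected_losses(SAMPLES).items():
    print(letter, "should have become", expected_char)
```

Every letter printed here is one where today's 1252 has a character assigned, but the receipt shows a question mark.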
Uppercase Greek letters | ΑΒΓΔΕΖΗΘΙΚΛΜΝΞΟΠΡΣΤΥΦΧΨΩ |
---|---|
737 interpreted as 1252 | €?‚ƒ„…†‡ˆ‰Š‹Œ?Ž??‘’“”•–— |
Observed in the examples (incomplete) | € ƒ„ ‡???????????“”? ? |
737 interpreted as 1253 (wild guess) | €?‚ƒ„…†‡?‰?‹?????‘’“”•–— |
While code page 1253 (1252-variant for Greek) matches somewhat better, it’s not a full match. Capital Kappa Κ maps to ‰, but it should be a ?, etc.
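The "interpreted as" rows of the table can be generated rather than typed by hand. A small sketch, assuming Python's `cp737`, `cp1252` and `cp1253` codecs agree with the real code pages; `reinterpret` is a hypothetical helper name.

```python
GREEK_UPPER = "ΑΒΓΔΕΖΗΘΙΚΛΜΝΞΟΠΡΣΤΥΦΧΨΩ"


def reinterpret(text, target, source="cp737"):
    """Encode as code page 737, then read the bytes as the target code page."""
    garbled = text.encode(source).decode(target, errors="replace")
    # Unassigned positions decode to U+FFFD; show them as '?' like the receipt does.
    return garbled.replace("\ufffd", "?")


print("original", GREEK_UPPER)
for target in ("cp1252", "cp1253"):
    print(target, reinterpret(GREEK_UPPER, target))
```

Swapping other codec names into the loop is a cheap way to try further guesses.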
Phew! So we’re searching for a 1252-like encoding…
- that contains € ƒ ‡ “ ”
- but not Œ Š ‹ Ž ‰ • — ‘ ’
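These two criteria can be turned into a quick filter over the codecs Python ships with. This is only a sketch under assumptions: the candidate list (`cp1250` to `cp1258` plus `cp874`) is my own pick, the byte positions are taken from code page 1252, and "but not" is read as "that byte is unassigned". The real culprit may well be something Python doesn't know at all.

```python
PRESENT = "€ƒ‡“”"       # characters that did show up on the receipts
ABSENT = "ŒŠ‹Ž‰•—‘’"    # characters that should have shown up but were '?'

# Assumed candidate list: the Windows code pages bundled with Python.
CANDIDATES = [f"cp{n}" for n in range(1250, 1259)] + ["cp874"]


def byte_in_1252(char):
    """Byte value of a character in code page 1252."""
    return char.encode("cp1252")[0]


def matches(codec):
    """True if the codec has all PRESENT characters at their 1252 positions
    and leaves every ABSENT position unassigned."""
    for char in PRESENT:
        if bytes([byte_in_1252(char)]).decode(codec, errors="replace") != char:
            return False
    for char in ABSENT:
        if bytes([byte_in_1252(char)]).decode(codec, errors="replace") != "\ufffd":
            return False
    return True


print([codec for codec in CANDIDATES if matches(codec)])
```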
I think I’ll start looking at the beach, with a drink that has ‰! :)