Possible solution to the Windows AtomicParsley crashes?

David Woodhouse dwmw2 at infradead.org
Sat Oct 8 11:20:17 EDT 2011


On Sat, 2011-10-08 at 13:18 +0100, dinkypumpkin wrote:
> Long Version: The overall problem is that get_iplayer doesn't anticipate 
> programme metadata will contain character entities that are outside the 
> ISO 8859-1 character set. 

Ick. That's kind of broken. We should be using UTF-8 *everywhere*, never
legacy 8-bit crap.

>  When that happens, the metadata is automatically decoded as UTF-8,
> and the "Cannot decode string" error stems from attempting to decode
> it again. 

That doesn't make a lot of sense. Characters such as 'a', 'b', 'é', '♥'
etc. are encoded as a sequence of bytes (in UTF-8 they would be 0x61,
0x62, 0xC3 0xA9, and 0xE2 0x99 0xA5 respectively).
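
To check those byte values for yourself, here is a minimal sketch in
Python (purely illustrative; get_iplayer is Perl and this is not code
from it). The names `chars` and `utf8_bytes` are made up for the example.

```python
# Illustrative only: show the UTF-8 byte sequence for each character.
chars = ["a", "b", "é", "♥"]

# Map each character to the bytes that represent it in UTF-8.
utf8_bytes = {c: c.encode("utf-8") for c in chars}

for c, encoded in utf8_bytes.items():
    # Print the code point and its bytes in hex, avoiding non-ASCII output.
    hex_bytes = " ".join(f"0x{byte:02X}" for byte in encoded)
    print(f"U+{ord(c):04X}: {hex_bytes}")
```

One character can take one, two or three bytes here, which is why a
byte count and a character count are not the same thing.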

If you *decode* UTF-8, that implies turning it into actual letters...
but that's not meaningful unless you're actually rendering it as
graphics on a screen. We're not decoding it.

What you might mean is that we are recoding it, or converting from one
encoding to another. The word 'naïve' in UTF-8 would be represented 
by the bytes 6e 61 c3 af 76 65, and you could convert that to the legacy
ISO8859-1 encoding 6e 61 ef 76 65.
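
A rough sketch of that conversion in Python, again only as an
illustration and not anything taken from get_iplayer. Python's API
happens to name the intermediate step `decode`; the net effect is still
a conversion from one encoding to another. The variable names are
hypothetical.

```python
# The word 'naïve' as UTF-8 bytes, exactly as quoted above.
utf8 = bytes.fromhex("6e 61 c3 af 76 65")

# Interpret the bytes as UTF-8, then re-express them as ISO 8859-1.
text = utf8.decode("utf-8")
latin1 = text.encode("iso-8859-1")
print(latin1.hex(" "))

# A character with no ISO 8859-1 representation cannot be converted.
try:
    "♥".encode("iso-8859-1")
    heart_ok = True
except UnicodeEncodeError:
    heart_ok = False
print("heart fits in ISO 8859-1:", heart_ok)
```

The second half is the case from the original report: metadata
containing a character outside ISO 8859-1 has no valid conversion to
that encoding at all.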

But that's not "decoding". That's converting from one encoding to
another. A set of numbers *without* a code is just a set of numbers; it
doesn't mean anything.

>  The "Wide character" warning 
> stems from attempting to write a UTF-8 string to a non-Unicode Perl
> file handle, in this case the handle for the download history file.

That seems wrong. Why do we have a file handle that *isn't* expecting
normal UTF-8 text written to it? It can't be just "non-Unicode". It has
to be interpreted using *some* encoding, and quite frankly using any
encoding other than UTF-8 in the 21st century is completely insane :)
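
For comparison, a small Python analogue of what I mean (not
get_iplayer's code; in Perl the equivalent would be putting an
`:encoding(UTF-8)` layer on the handle). The helper `write_history`, the
file name and the sample entry are all invented for this sketch.

```python
import os
import tempfile

def write_history(path, line, encoding):
    """Write one line to a file whose handle has an explicit encoding."""
    # newline="\n" keeps the bytes on disk identical across platforms.
    with open(path, "w", encoding=encoding, newline="\n") as fh:
        fh.write(line + "\n")

line = "12345|I ♥ Radio"  # hypothetical history entry
path = os.path.join(tempfile.mkdtemp(), "history.txt")

# A handle declared as UTF-8 accepts any character.
write_history(path, line, "utf-8")
with open(path, "rb") as fh:
    raw = fh.read()

# A handle declared as a legacy 8-bit encoding cannot represent '♥'.
try:
    write_history(path, line, "iso-8859-1")
    legacy_failed = False
except UnicodeEncodeError:
    legacy_failed = True
```

Python refuses outright where Perl only warns, but the point is the
same: the handle always has some encoding, so it might as well be
declared as UTF-8.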

-- 
dwmw2