Possible solution to the Windows AtomicParsley crashes?
dinkypumpkin
dinkypumpkin at gmail.com
Sat Oct 8 13:37:15 EDT 2011
On 08/10/2011 16:20, David Woodhouse wrote:
> If you *decode* UTF-8, that implies turning it into actual letters...
That's a different kind of "decoding". What I mean is that
HTML::Entities::decode_entities() returns a "decoded" string with the
utf8 flag set, because the expanded curly quotes are characters outside
Latin-1. The tagging falls over because I didn't check the utf8 flag
before using the metadata. An unfortunate mistake, I know.
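For illustration, here is a minimal sketch of the kind of check I skipped,
using only core Perl. The string literal stands in for what
decode_entities() would return, and the sample title and the
to_utf8_bytes() helper are made up for this example; neither is taken
from get_iplayer itself.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Stand-in for decode_entities() output on metadata that contained
# &ldquo; and &rdquo; (hypothetical sample title).
my $title = "\x{201C}Example\x{201D} Programme";

# Hypothetical helper: if the value is a flagged character string,
# encode it to UTF-8 bytes; otherwise assume it is already bytes.
sub to_utf8_bytes {
    my ($value) = @_;
    return utf8::is_utf8($value) ? encode('UTF-8', $value) : $value;
}

my $bytes = to_utf8_bytes($title);

printf "flag before: %d, flag after: %d\n",
    utf8::is_utf8($title) ? 1 : 0,
    utf8::is_utf8($bytes) ? 1 : 0;
```

The assumption here is that whatever consumes the metadata wants UTF-8
bytes; a value without the flag is passed through untouched.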
> That seems wrong. Why do we have a file handle that *isn't* expecting
> normal UTF-8 text written to it? It can't be just "non-Unicode". It has
What I mean here is that the handle for the download history file is
opened without a Unicode-capable output layer on it, so Perl generates a
warning when the string with expanded curly quotes is written to a
history record. As to why, I can't say. This probably just never came
up when get_iplayer was developed. I've never seen curly quotes or
similar entities in iPlayer metadata before, but they are certainly here
now. Somebody must have started doing cut-and-paste from MS Word.
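To show the difference, here is a rough sketch using only core modules.
A temporary file stands in for the history file, and the record contents
are invented; this is not the actual get_iplayer code. It writes the same
string through a handle with no encoding layer and through one opened
with :encoding(UTF-8), and counts the warnings each produces.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Character string with curly quotes (hypothetical history record).
my $record = "\x{201C}Example\x{201D}|other fields\n";

# Collect warnings so the two cases can be compared.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# Temporary file standing in for the download history file.
my ($tmp, $path) = tempfile(UNLINK => 1);
close $tmp;

# No encoding layer: expect a "Wide character in print" warning.
open(my $plain, '>', $path) or die "Cannot open $path: $!";
print {$plain} $record;
close $plain;
my $without_layer = scalar @warnings;

# Explicit UTF-8 layer: characters are encoded on output.
open(my $layered, '>:encoding(UTF-8)', $path) or die "Cannot open $path: $!";
print {$layered} $record;
close $layered;
my $with_layer = scalar(@warnings) - $without_layer;

print "warnings without layer: $without_layer, with layer: $with_layer\n";
```

Calling binmode($fh, ':encoding(UTF-8)') on an already open handle should
have the same effect, if changing the open() call is awkward.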