parser error
Bernard Peek
bap at shrdlu.com
Sat Oct 28 10:12:48 PDT 2017
On 27/10/2017 21:47, RS wrote:
> On 27/10/2017 19:06, Bernard Peek wrote:
>>>
>>>> It is then up to the calling script (get_iplayer.pl) to decide what
>>>> action to take in response to the action taken by the parser. It is not
>>>> adequate just to allow XML::LibXML to display "parser error" and take
>>>> no further action.
>>>
>>> Even though that's what the XML standard says IS the correct action?
>>>
>>
>> PMFJI
>>
>> I built data transfer standards for the UK's outdoor advertising
>> industry. I deliberately chose to use XML based standards because it
>> enabled automatic validation of data files. The standards were quite
>> specific. All automated systems were required to refuse any files not
>> compatible with the DTD I had on my web server. Data providers were
>> expected to prevalidate any files they sent to any other company.
>>
>> This was my main argument for switching to XML from flat-files.
>>
> If you are both right about the strictness of the standard, and I have
> to defer to your superior knowledge, why does XML::LibXML have options
> for recovery and validation?
If you are particularly masochistic you can write code to recover data
from files that you already know are corrupt, and sometimes you have to,
because you can't just throw the problem back at the data provider. The
nice thing about validation is that the result is boolean: a file either
validates or it doesn't. A failure unambiguously points the finger of
blame at the data provider. Whether you can use that to force them to
fix the problem is a political issue, not a technical one.
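
As a rough illustration of that yes/no outcome, here is a minimal sketch.
It rests on several assumptions: it uses Python's standard-library parser
rather than the Perl XML::LibXML discussed in this thread, it checks
well-formedness only (not validity against a DTD), and the function name
and sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(data: bytes) -> bool:
    """Return True if the parser accepts the document, False otherwise.

    Hypothetical helper: it makes no attempt at recovery, so the caller
    gets a plain accept/reject answer.
    """
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


# Invented sample documents, for illustration only.
samples = {
    "good": b"<poster><site>A1</site></poster>",
    "mismatched tags": b"<poster><site>A1</poster></site>",
    "stray NUL": b"<poster><site>A1\x00</site></poster>",
}

results = {name: is_well_formed(data) for name, data in samples.items()}
for name, ok in results.items():
    print(f"{name}: {'accept' if ok else 'reject and return to provider'}")
```

The strict parser used here treats the mismatched tags and the stray NUL
the same way: neither document is well-formed, so both are refused.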
> According to
> http://search.cpan.org/dist/XML-LibXML/lib/XML/LibXML/Parser.pod#PARSER_OPTIONS
> and
> http://search.cpan.org/dist/XML-LibXML/lib/XML/LibXML/Error.pod
> it also has a choice of Verbose and Quiet error handlers. Authors can
> use their own error handlers, or remove the error handler altogether.
> An example given is recovery from a missing closing tag. I have not
> seen a definition of fatal error. Is a spurious NUL a fatal error? I
> suspect it is less serious than a missing closing tag. It is easy to
> recover from; you just ignore it. Subject to what anyone may tell me,
> I would have thought non-matching tags would be more likely to be a
> fatal error.
>
> It must be remembered that an important function of XML, in contrast
> to other markup languages, is that it is human readable as well as
> machine readable.
>
Making XML human-readable was a compromise. The drawback is that it
encourages tinkerers to believe that they can or should attempt to fix
problems when, in most cases, the only sensible thing to do is kick the
file back to the provider. What you end up with is multiple people in
different places putting in lots of time fixing someone else's mistakes.
Patching bad files downstream is a disservice to other data users and
should be a last resort. Just because something is doable doesn't make
doing it a good idea.
--
Bernard Peek
bap at shrdlu.com