New radio PIDs, more than 8 characters
James Scholes
james at jls-radio.com
Tue Aug 15 04:14:12 PDT 2017
C E Macfarlane wrote:
> Thinking about this a bit more, I wouldn't wish to claim a spurious hit was
> more likely with no upper limit, but nevertheless I would still regard it
> better programming practice to have one - with normal written English, the
> potential for spurious hits would be low, and in the event of one it would
> be delimited quickly by the next space, but if you were trawling raw HTML or
> similar code, which might contain long strings of pseudo-random characters
> as not just PIDs, but also GUIDs, session keys, and the like, then the
> potential for spurious hits would be very much increased, so more would be
> found, and in the interests of program efficiency you'd want them to be
> delimited sooner rather than later.
This is reasonable. The regexp without an upper limit, sourced from the
BBC's code, is used to confirm that a given string consists only of
characters from the set allowed in a PID. In most cases the string
passed in has been explicitly extracted from the request URL, as the
application in question is a server-side, web-based one. For such
purposes I think the lack of an upper limit is completely acceptable,
but if you're writing code to extract a valid PID from text of unknown
length or complexity, the regexp is probably not very efficient.
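
To illustrate the difference, here is a rough Python sketch of the two
uses. The character set, the minimum length of 8 and the upper limit of
15 are assumptions made up for the example, as are the function names;
this is not the BBC's actual pattern.

```python
import re

# Assumed character set and lengths, for illustration only; the actual
# pattern from the BBC's code is not quoted in this thread.
PID_CHARS = r"[0-9a-z]"
MIN_LEN = 8
MAX_LEN = 15  # hypothetical upper limit

# Validation: the whole string must consist of PID characters, so no
# upper limit is needed to stop the match running on.
VALIDATE = re.compile(rf"{PID_CHARS}{{{MIN_LEN},}}")

# Extraction: a bounded length plus lookarounds, so that a longer run of
# the same characters (a session key, say) is rejected outright rather
# than matched in part.
EXTRACT = re.compile(
    rf"(?<!{PID_CHARS}){PID_CHARS}{{{MIN_LEN},{MAX_LEN}}}(?!{PID_CHARS})"
)


def is_valid_pid(candidate: str) -> bool:
    """Check a string already isolated from, e.g., a request URL."""
    return VALIDATE.fullmatch(candidate) is not None


def find_pids(text: str) -> list[str]:
    """Pull PID-like tokens out of arbitrary text."""
    return EXTRACT.findall(text)
```

With these assumed values, is_valid_pid("b08xyz12") accepts the string,
while find_pids() on something like "pid=b08xyz12&sid=" followed by a
32-character hex session key returns only the PID. Note that with a
character set this loose, an ordinary lowercase word of 8 to 15 letters
would still be picked up, so real extraction code would want a tighter
set or more context around the match.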
--
James Scholes
http://twitter.com/JamesScholes