[PATCH 1/5] string: add strtok/strtokv
Tobias Waldekranz
tobias at waldekranz.com
Thu Sep 4 06:35:30 PDT 2025
On Thu, Sep 04, 2025 at 13:00, Ahmad Fatoum <a.fatoum at pengutronix.de> wrote:
> Hello Tobias,
>
> On 8/28/25 5:05 PM, Tobias Waldekranz wrote:
>> Add an implementation of libc's standard strtok(3), which is useful
>> for tokenizing strings.
>
> strtok was previously removed in favor of strsep as it doesn't suffer
> from re-entrancy issues (poller and bthreads can run during delays). If
> you want to allow escapes, there's also strsep_unescaped.
Aha, my bad. I did not realize that there was more than one thread of
execution.
strsep() is not quite the same thing, though: I am really after
strtok()'s behavior of skipping empty tokens. How would you feel about
adding strtok_r() instead?
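
To make the difference concrete: on the input "a,,b," strsep() hands
back "a", "", "b" and "", whereas strtok()-style tokenizing yields only
"a" and "b". A minimal sketch of a re-entrant variant follows, assuming
the same skip-then-cut approach as the strtok() in this patch but with
the cursor kept in a caller-supplied pointer. The name sketch_strtok_r
is made up for illustration and is not an existing barebox function; it
uses strcspn() rather than strsep() only so that it builds as plain
ISO C.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical re-entrant variant of the strtok() in this patch.
 * The name is illustrative only. The cursor lives in *saveptr
 * instead of a static variable, so interleaved callers cannot
 * clobber each other's state.
 */
static char *sketch_strtok_r(char *str, const char *delim, char **saveptr)
{
	char *tok;

	if (str)
		*saveptr = str;

	if (!*saveptr)
		return NULL;

	/* Skip leading delimiters so that empty tokens are never returned. */
	tok = *saveptr + strspn(*saveptr, delim);
	if (*tok == '\0') {
		*saveptr = NULL;
		return NULL;
	}

	/* Terminate the token at the first delimiter, if there is one. */
	*saveptr = tok + strcspn(tok, delim);
	if (**saveptr != '\0')
		*(*saveptr)++ = '\0';
	else
		*saveptr = NULL;

	return tok;
}
```

As with strtok(), the input string is modified in place; the only
change in contract is that each caller owns its own saveptr.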
>> Also, add a version that will collect all tokens from a string into an
>> array, which is useful in situations where you need to know how many
>> tokens there are, and when a token's relative position in the order is
>> significant.
>
> We have the inverse as strjoin, but not this. Maybe call it strsplit
> instead?
If you accept my strtok_r() suggestion, do you still think strsplit() is
a better name, or is there value in signaling the underlying strtok()
behavior?
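
For reference while weighing the name, here is a rough sketch of how
the array-collecting helper could look without any hidden state. It is
only an illustration under assumptions: sketch_strsplit is a
placeholder name, the signature simply mirrors strtokv() from this
patch, and plain realloc() stands in for xrealloc(), so the sketch has
to report allocation failure itself.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical strsplit()-style helper; name and signature are
 * illustrative, loosely following the strtokv() in this patch.
 * Plain realloc() stands in for barebox's xrealloc(), so an
 * allocation failure is reported to the caller as -1.
 */
static int sketch_strsplit(char *str, const char *delim, char ***vecp)
{
	char **vec = NULL, **tmp;
	int cnt = 0;

	for (;;) {
		/* Skip delimiter runs: empty tokens are dropped. */
		str += strspn(str, delim);
		if (*str == '\0')
			break;

		tmp = realloc(vec, (cnt + 1) * sizeof(*vec));
		if (!tmp) {
			free(vec);
			*vecp = NULL;
			return -1;
		}
		vec = tmp;
		vec[cnt++] = str;

		/* Cut the token at the next delimiter, if there is one. */
		str += strcspn(str, delim);
		if (*str != '\0')
			*str++ = '\0';
	}

	*vecp = vec;
	return cnt;
}
```

The tokens point into the original string, so the caller frees only
the array. For an input with no tokens the array is NULL and the count
is 0.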
> Cheers,
> Ahmad
>
>>
>> Signed-off-by: Tobias Waldekranz <tobias at waldekranz.com>
>> ---
>> include/string.h | 2 ++
>> lib/string.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++
>> 2 files changed, 68 insertions(+)
>>
>> diff --git a/include/string.h b/include/string.h
>> index 71affe48b6..c8df8540d8 100644
>> --- a/include/string.h
>> +++ b/include/string.h
>> @@ -8,6 +8,8 @@
>> void *mempcpy(void *dest, const void *src, size_t count);
>> int strtobool(const char *str, int *val);
>> char *strsep_unescaped(char **, const char *, char *);
>> +char *strtok(char *str, const char *delim);
>> +int strtokv(char *str, const char *delim, char ***vecp);
>> char *stpcpy(char *dest, const char *src);
>> bool strends(const char *str, const char *postfix);
>>
>> diff --git a/lib/string.c b/lib/string.c
>> index 73637cd971..be7e65eb45 100644
>> --- a/lib/string.c
>> +++ b/lib/string.c
>> @@ -593,6 +593,72 @@ char *strsep_unescaped(char **s, const char *ct, char *delim)
>> return sbegin;
>> }
>>
>> +/**
>> + * strtok - extract tokens from string
>> + * @str: string to split
>> + * @delim: set of delimiter characters
>> + *
>> + * The strtok() function breaks up a string into zero or more nonempty
>> + * tokens. On the first call, the string to be parsed should be
>> + * specified in @str. In each subsequent call that should parse the
>> + * same string, @str must be NULL.
>> + *
>> + * @delim specifies a set of bytes that delimit the tokens in the
>> + * string.
>> + *
>> + * Each call to strtok() returns a pointer to a string containing the
>> + * next token. This is done by replacing the first delimiter with a
>> + * NUL character, the operation is thus destructive to the string. If
>> + * no more tokens are found, strtok() returns NULL.
>> + */
>> +char *strtok(char *str, const char *delim)
>> +{
>> + static char *cursor;
>> +
>> + if (str)
>> + cursor = str;
>> +
>> + if (!cursor)
>> + return NULL;
>> +
>> + cursor += strspn(cursor, delim);
>> + if (*cursor == '\0') {
>> + cursor = NULL;
>> + return NULL;
>> + }
>> +
>> + return strsep(&cursor, delim);
>> +}
>> +EXPORT_SYMBOL(strtok);
>> +
>> +/**
>> + * strtokv - split string into array of tokens based on a delimiter set
>> + * @str: string to split
>> + * @delim: set of delimiter characters
>> + * @vecp: array of tokens
>> + *
>> + * Split @str into tokens delimited by @delim, using strtok(), and
>> + * store the allocated token array in @vecp, which the caller is
>> + * responsible for freeing.
>> + *
>> + * Return: The number of tokens in the array.
>> + */
>> +int strtokv(char *str, const char *delim, char ***vecp)
>> +{
>> + char *tok, **vec = NULL;
>> + int cnt = 0;
>> +
>> +
>> + for (tok = strtok(str, delim); tok; tok = strtok(NULL, delim)) {
>> + vec = xrealloc(vec, (cnt + 1) * sizeof(*vec));
>> + vec[cnt++] = tok;
>> + }
>> +
>> + *vecp = vec;
>> + return cnt;
>> +}
>> +EXPORT_SYMBOL(strtokv);
>> +
>> #ifndef __HAVE_ARCH_STRSWAB
>> /**
>> * strswab - swap adjacent even and odd bytes in %NUL-terminated string
>
> --
> Pengutronix e.K. | |
> Steuerwalder Str. 21 | http://www.pengutronix.de/ |
> 31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
> Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |