Selected parts of string
token = strtok(str)
token = strtok(str, delimiter)
[token, remain] = strtok(str, ...)
token = strtok(str) parses str from left to right, returning part or all of the text in token. Using the white-space character as a delimiter, the token output begins at the start of str, skipping any delimiters that might appear at the start, and includes all characters up to either the next delimiter or the end of str. White-space characters include space (ASCII 32), tab (ASCII 9), and carriage return (ASCII 13).
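As a brief illustration of the default behavior, the following sketch shows leading white space being skipped and the token ending at the next white-space character. The variable name txt and the sample text are chosen here only for illustration.

```matlab
% Illustrative input: two leading spaces, then words separated by a tab and a space.
txt = ['  hello' char(9) 'world again'];

% With one input argument, strtok uses white space as the delimiter.
token = strtok(txt);

% Leading spaces are skipped and the token stops at the tab character.
disp(token)
```

Here token holds only the first word, because the tab that follows it counts as white space.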
The str argument can be a character vector enclosed in single quotation marks, a cell array of character vectors, or a string array. If str is a cell array or string array containing N pieces of text, then token is an array of N tokens, with token{1} derived from str{1}, token{2} from str{2}, and so on.
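The element-by-element correspondence can be sketched as follows. The sample data and the names c, strs, tokenCell, and tokenStr are hypothetical, and the string array part assumes a MATLAB release that supports string arrays.

```matlab
% Hypothetical sample data: a 3-by-1 cell array of character vectors.
c = {'alpha beta'; 'gamma delta'; 'epsilon'};
tokenCell = strtok(c);      % cell array with one token per element of c

% The same text as a string array.
strs = string(c);
tokenStr = strtok(strs);    % string array with one token per element of strs

% Each output element is derived from the matching input element.
disp(tokenCell{2})          % first word of c{2}
disp(tokenStr(3))           % first word of strs(3)
```

The output type follows the input type: a cell array in gives a cell array out, and a string array in gives a string array out.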
token = strtok(str, delimiter) is the same as the above syntax except that you specify the delimiting character(s) yourself with delimiter. White-space characters are not considered to be delimiters when using this syntax unless you include them in the delimiter argument.
If the delimiter input specifies more than one character, MATLAB® treats each character as a separate delimiter; it does not treat the whole piece of text as one delimiter. The number and order of characters in the delimiter argument are unimportant. Do not use escape sequences as delimiters. For example, use char(9) rather than '\t' for tab.
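The following sketch illustrates these delimiter rules. The sample text and variable names are assumptions made for the example only.

```matlab
% Each character of ',;' is a separate delimiter, so ';,' behaves identically.
txt = 'one,two;three';
t1 = strtok(txt, ',;');
t2 = strtok(txt, ';,');

% White space is a delimiter here only if listed, so pass char(9) for tab.
% The space inside 'left side' does not end the token.
t3 = strtok(['left side' char(9) 'right side'], char(9));

% Pitfall: '\t' is two ordinary characters, a backslash and the letter t.
% Both act as delimiters, so the leading 't' is skipped and the token
% ends at the next 't'.
t4 = strtok('total\tcount', '\t');

disp({t1, t2, t3, t4})
```

The last call shows why escape sequences should be avoided: the letter t is consumed as a delimiter, which is rarely what is intended.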
[token, remain] = strtok(str, ...) returns in remain that part of str, if any, that follows token. The delimiter is included in remain. If no delimiters are found in str, then the whole of str (excluding any leading delimiting characters) is returned in token, and remain has no characters. If str is a cell array of character vectors, token is a cell array of tokens and remain is a cell array of the remainders. If str is a string array, token is a string array of tokens and remain is a string array of the remainders.
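Because remain keeps everything after the token, repeated calls can walk through a piece of text one token at a time. The sketch below is one possible way to do this; the sample text and the names afterFirst, words, rest, whole, and nothingLeft are illustrative only.

```matlab
% Illustrative input.
txt = 'split this text';

% Single call: remain begins with the delimiter that ended the token.
[first, afterFirst] = strtok(txt);

% Repeated calls consume the text one token at a time.
words = {};
rest = txt;
while ~isempty(rest)
    [tok, rest] = strtok(rest);
    if ~isempty(tok)
        words{end+1} = tok; %#ok<AGROW>
    end
end

% No delimiter present: the whole text is the token and remain is empty.
[whole, nothingLeft] = strtok('single');

disp(words)
```

The check on tok guards against text that ends in delimiters, in which case the final call returns an empty token.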
This example uses the default white-space delimiter. Note that space characters at the start of the character vector are not included in the token output, but the space character that follows token is included in remain:

    s = ' This is a simple example.';
    [token, remain] = strtok(s)

    token =
        This
    remain =
         is a simple example.
Take a character vector of HTML code and break it down into segments delimited by the < and > characters. Write a while loop to parse the character vector and print each segment:

    s = sprintf('%s%s%s%s', ...
        '<ul class=continued><li class=continued>', ...
        '<pre><a name="13474"></a>token = strtok', ...
        '(''str'', delimiter)<a name="13475"></a>', ...
        'token = strtok(''str'')');

    remain = s;
    while true
        % Split off the next segment; both < and > act as delimiters.
        [str, remain] = strtok(remain, '<>');
        if isempty(str), break; end
        disp(str)
    end

Here is the output:

    ul class=continued
    li class=continued
    pre
    a name="13474"
    /a
    token = strtok('str', delimiter)
    a name="13475"
    /a
    token = strtok('str')
Using strtok on a cell array of character vectors returns a cell array of tokens in token and a cell array of the remainders in remain:

    s = {'all in good time'; ...
         'my dog has fleas'; ...
         'leave no stone unturned'};

    remain = s;
    for k = 1:4
        % Each pass takes the next word from every element of the cell array.
        [token, remain] = strtok(remain);
        token
    end

Here is the output:

    token =
        'all'
        'my'
        'leave'
    token =
        'in'
        'dog'
        'no'
    token =
        'good'
        'has'
        'stone'
    token =
        'time'
        'fleas'
        'unturned'