MATLAB Function Reference
regexp

Match regular expression

Syntax

```
start = regexp(str,expr)
[start,finish] = regexp(str,expr)
[start,finish,tokens] = regexp(str,expr)
[...] = regexp(str,expr,'once')
```

Description

```start = regexp(str,expr) ``` returns a row vector, `start`, containing the indices of the substrings in `str` that match the regular expression string, `expr`.

When either `str` or `expr` is a cell array of strings, `regexp` returns an `m`-by-`n` cell array of row vectors of indices, where `m` is the number of strings in `str` and `n` is the number of regular expression patterns in `expr`.

```[start,finish] = regexp(str,expr) ``` also returns a row vector, `finish`, containing the index of the last character of each matched substring whose first character is indexed in `start`.

```[start,finish,tokens] = regexp(str,expr) ``` also returns a `1`-by-`n` cell array, `tokens`, containing the beginning and ending indices of the tokens within each matched substring delimited by `start` and `finish`. Tokens are denoted by parentheses in the expression, `expr`.

```[...] = regexp(str,expr,'once') ``` finds just the first match. (By default, `regexp` returns all matches.) If no matches are found, then all return values are empty.
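As a minimal sketch of the `'once'` option, the following reuses the string and pattern from Example 1 and requests only the first match. The variable name `noMatch` and the pattern `'z+'` are illustrative choices, not part of the original reference.

```matlab
% Sketch: the 'once' option stops after the first match.
str = 'bat cat can car coat court cut ct caoueouat';

% Without 'once', regexp reports every match (see Example 1).
% With 'once', only the first match ('cat') is reported.
[s, f] = regexp(str, 'c[aeiou]+t', 'once');

% A pattern that never matches yields empty return values.
% 'z+' is an illustrative pattern; str contains no letter z.
noMatch = regexp(str, 'z+', 'once');
isempty(noMatch)
```

Here `s` and `f` index the first and last characters of `cat`, so `str(s:f)` recovers the matched text.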

Remarks

See Regular Expressions, in the MATLAB documentation, for a listing of all regular expression metacharacters supported by MATLAB.

`regexp` does not support international character sets.

Examples

Example 1

Return a row vector of the starting indices of substrings that start with `c`, end with `t`, and contain one or more vowels between them:

```
str = 'bat cat can car coat court cut ct caoueouat';
regexp(str, 'c[aeiou]+t')
ans =
5    17    28    35
```

Example 2

Return a cell array of row vectors containing the indices of capital letters and of whitespace characters in each string of the cell array, `str`:

```
str = {'Madrid, Spain' 'Romeo and Juliet' 'MATLAB is great'};
s = regexp(str, {'[A-Z]' '\s'});
```

Capital letters, '`[A-Z]`', were found at these `str` indices:

```
s{:,1}
ans =
1     9
ans =
1    11
ans =
1     2     3     4     5     6
```

Space characters, '`\s`', were found at these `str` indices:

```
s{:,2}
ans =
8
ans =
6    10
ans =
7    10
```

Example 3

Return the starting and ending indices of words containing the letter `x`:

```
str = 'regexp helps you relax';
[s,f] = regexp(str, '\w*x\w*')
s =
1    18
f =
6    22
```
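The indices returned in Example 3 can be used to pull the matched words out of `str`. The following is a small sketch of that step; the variable name `words` and the loop are illustrative additions rather than part of the original example.

```matlab
% Sketch: use the start and finish indices to extract each matched substring.
str = 'regexp helps you relax';
[s, f] = regexp(str, '\w*x\w*');

% Collect one substring per match; words is an illustrative name.
words = cell(1, length(s));
for k = 1:length(s)
    words{k} = str(s(k):f(k));   % characters from start index to finish index
end

words{:}
```

Each pair `s(k)`, `f(k)` delimits one match, so `words` holds the two words that contain the letter `x`.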

Example 4

Return the starting and ending indices of substrings that begin and end with the letter `s`. Also return the starting and ending indices of the token defined within the parentheses:

```
str = 'six sides of a hexagon';
[s,f,t] = regexp(str, 's(\w*)s')
s =
5
f =
9
t =
[1x2 double]

t{:}
ans =
6     8
```
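The token indices in Example 4 refer to positions in `str`, so the token text can be recovered by indexing. This sketch assumes the single match and single token shown in that example; the names `ext` and `tok` are illustrative.

```matlab
% Sketch: recover the text of a token from its beginning and ending indices.
str = 'six sides of a hexagon';
[s, f, t] = regexp(str, 's(\w*)s');

% t{1} holds the token indices for the first (and only) match.
ext = t{1};

% Index into str with the token's beginning and ending positions.
tok = str(ext(1,1):ext(1,2))
```

For this input the token is the text between the two `s` characters of `sides`.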

See Also

`regexpi`, `regexprep`, `strfind`, `findstr`, `strmatch`, `strcmp`, `strcmpi`, `strncmp`, `strncmpi`