The origins of regular expressions lie in automata theory and formal language theory, both of which are part of theoretical computer science. These fields study models of computation (automata) and ways to describe and classify formal languages.
In the 1950s, mathematician Stephen Cole Kleene described these models using his mathematical notation called regular sets. The SNOBOL language was an early implementation of pattern matching, although its patterns were not identical to regular expressions. Ken Thompson built Kleene’s notation into the editor QED as a means to match patterns in text files.
A regular expression is entered as part of a command. It is a pattern made up of symbols, letters, and numbers that an input string is tested against, either to match or, in some cases, to exclude. Testing a string against the specified pattern is called pattern matching, and it either succeeds or fails. If a regular expression can match two different parts of an input string, it matches the earliest part first.
Cisco configurations use regular expression pattern matching in several implementations. The following is a list of some of these implementations:
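The earliest-match rule can be sketched with Python's `re` module. This is only an analogy, since Cisco IOS has its own regular expression engine, and the sample string is invented for illustration:

```python
import re

# Hypothetical input in which the pattern "ab+" could match in two places.
text = "xx abb yy abbbb"

# re.search scans from left to right and reports the earliest match,
# even though a longer match exists later in the string.
match = re.search(r"ab+", text)
first_start = match.start()
first_text = match.group(0)
print(first_start, first_text)
```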
- The ‘show’ output command
- BGP IP AS-path and X.29 access lists
- Modem (or chat) and system scripts
- X.25 route substitute destination feature
- Protocol translation ruleset scripts
The table below lists the basic Cisco IOS regular expression characters and their functions.
Regular Expression Character | Function | Examples |
. | Matches any single character. | |
\ | Matches the character following the backslash. Also matches (escapes) special characters. | |
[ ] | Matches the characters or a range of characters separated by a hyphen, within left and right square brackets. | |
^ | Matches the character or null string at the beginning of an input string. | |
? | Matches zero or one occurrence of the pattern. (Precede the question mark with Ctrl-V sequence to prevent it from being interpreted as a help command.) | |
$ | Matches the character or null string at the end of an input string. | |
* | Matches zero or more sequences of the character preceding the asterisk. Also acts as a wildcard for matching any number of characters. | |
+ | Matches one or more sequences of the character preceding the plus sign. | |
() [] | Nest characters for matching. Separate endpoints of a range with a dash (-). | |
| | Concatenates constructs. Matches one of the characters or character patterns on either side of the vertical bar. | |
_ | Replaces a long regular expression list by matching a comma (,), left brace ({), right brace (}), the beginning of the input string, the end of the input string, or a space. | The characters _1300_ can match any of the following strings: |
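The underscore has no special meaning in most other regular expression dialects, so it helps to see what it stands for. The sketch below expands it into an explicit alternation for Python's `re` module. The helper name `cisco_to_python` and the sample AS paths are hypothetical, and the expansion assumes the underscore matches only the delimiters named in the table:

```python
import re

# Assumption: "_" stands for a comma, a brace, a space, or either end of
# the input string, as described in the table.
UNDERSCORE = r"(?:^|$|[,{} ])"

def cisco_to_python(pattern: str) -> str:
    """Hypothetical helper: expand each unescaped '_' into UNDERSCORE.

    Simplification: an underscore inside square brackets is not handled.
    """
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # Keep escaped characters (including \_) untouched.
            out.append(pattern[i:i + 2])
            i += 2
        elif ch == "_":
            out.append(UNDERSCORE)
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)

as_path = re.compile(cisco_to_python("_1300_"))
for path in ["1300", "100 1300 200", "{1300,1400}", "13000", "213001"]:
    print(path, bool(as_path.search(path)))
```

In this sketch the first three sample strings match because 1300 is delimited on both sides, while 13000 and 213001 do not, because 1300 appears there only as part of a longer number.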
Note: The order for matching using the * or + character is longest construct first. Nested constructs are matched from the outside in. Concatenated constructs are matched beginning at the left side.
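The longest-first rule for * and + can be illustrated with Python's `re` module, whose default quantifiers are also greedy. This is a sketch by analogy, not output from a Cisco device, and the input string is made up:

```python
import re

# Hypothetical input string containing a run of five letters a.
text = "caaaaat"

# Both quantifiers take the longest run available at the match position.
star = re.search(r"ca*", text).group(0)
plus = re.search(r"a+", text).group(0)
print(star, plus)
```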
Note: To match any of the characters listed in the table literally, remove its special meaning by preceding it with a backslash (\). The following examples are single-character patterns matching a dollar sign, an underscore, and a plus sign, respectively:
- \$
- \_
- \+
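The three escaped patterns can be checked with Python's `re` module, used here only as a stand-in for the IOS engine. Note that the underscore is not special in Python, so escaping it there is harmless rather than required:

```python
import re

# Each escaped pattern should match exactly one literal character.
literals = {r"\$": "$", r"\_": "_", r"\+": "+"}
results = {pat: bool(re.fullmatch(pat, ch)) for pat, ch in literals.items()}
print(results)

# Unescaped, "+" is a quantifier: "a+" matches a run of a's.
unescaped = re.fullmatch(r"a+", "aaa") is not None
# Escaped, "\+" is a literal plus sign: "a\+" matches the text "a+".
escaped = re.fullmatch(r"a\+", "a+") is not None
print(unescaped, escaped)
```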
Note: You can reverse the matching of a range by placing a caret (^) at the start of the range. The following example matches any single character except the letters listed:
- [^a-dqsv]
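A quick way to see what a negated range accepts is to test individual characters against it. The sketch below uses Python's `re` module as an approximation of the IOS behaviour, with arbitrarily chosen test characters:

```python
import re

negated = re.compile(r"[^a-dqsv]")

# Characters covered by the range are rejected; anything else is accepted,
# including characters that are not letters (the digit 7 here).
rejected = [ch for ch in "abcdqsv" if negated.fullmatch(ch)]
accepted = [ch for ch in "ertz7" if negated.fullmatch(ch)]
print(rejected, accepted)
```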
Examples:
The following example matches anything except a right square bracket (]) or the letter d:
- [^\]d]
The following example matches any number of occurrences of the letter a, including none:
- a*
The following pattern requires that at least one letter a be present in the string to be matched:
- a+
The following pattern matches any number of asterisks (*):
- \**
The following pattern matches any number of the multiple-character string ab:
- (ab)*
The following pattern matches one or more letter-digit pairs (a letter followed by a digit):
- ([A-Za-z][0-9])+
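The quantifier and grouping patterns above can be exercised together. This sketch uses Python's `re.fullmatch`, which requires the whole string to match; that is an assumption made to keep the outcomes easy to read and differs from a search inside a longer string. The input strings are invented:

```python
import re

# (pattern, input string, expected full match?) for the patterns above.
cases = [
    (r"a*", "", True),            # zero occurrences are allowed
    (r"a+", "", False),           # at least one a is required
    (r"\**", "***", True),        # escaped asterisk, repeated
    (r"(ab)*", "ababab", True),   # the group repeats as a unit
    (r"(ab)*", "aba", False),     # the trailing a is only half a pair
    (r"([A-Za-z][0-9])+", "a1B2", True),
    (r"([A-Za-z][0-9])+", "a1B", False),
]

outcomes = [bool(re.fullmatch(pat, s)) for pat, s, _ in cases]
for (pat, s, expected), got in zip(cases, outcomes):
    print(f"{pat!r} on {s!r}: {got} (expected {expected})")
```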
The following regular expression matches an input string only if the string starts with abcd:
- ^abcd
The following expression is a range that matches any single letter, as long as it is not the letters a, b, c, or d:
- [^abcd]
In the following example, the regular expression matches an input string that ends with .12:
- \.12$
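The two anchored patterns can be tried against a few sample strings. The sketch uses Python's `re` module in place of the IOS engine, and the inputs (including the address-like strings) are hypothetical:

```python
import re

starts_with_abcd = re.compile(r"^abcd")
ends_with_dot12 = re.compile(r"\.12$")

# ^ pins the match to the start of the input string.
start_hits = [s for s in ["abcdef", "xabcd"] if starts_with_abcd.search(s)]

# $ pins the match to the end; the escaped dot matches only a literal period.
end_hits = [
    s for s in ["10.1.1.12", "10.1.1.112", "10.1.1x12"]
    if ends_with_dot12.search(s)
]
print(start_hits, end_hits)
```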
The following regular expression matches the letter a followed by any character (call it character #1), followed by bc, followed by any character (character #2), followed by character #1 again, followed by character #2 again. In this way, the regular expression can match aZbcTZT. The software identifies character #1 as Z and character #2 as T, and then uses Z and T again later in the regular expression:
- a(.)bc(.)\1\2
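Python's `re` module supports the same \1 and \2 recall syntax, so the example can be reproduced there as a rough check. The second input string is an invented counterexample:

```python
import re

pattern = re.compile(r"a(.)bc(.)\1\2")

# Group 1 captures Z and group 2 captures T; \1\2 then requires "ZT".
m = pattern.fullmatch("aZbcTZT")
groups = m.groups()

# Swapping the last two characters breaks the recall, so nothing matches.
mismatch = pattern.fullmatch("aZbcTTZ")
print(groups, mismatch)
```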
Remember: Alternation allows you to specify alternative patterns to match against a string. Separate the alternative patterns with a vertical bar (|). Exactly one of the alternatives can match the input string.
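A minimal alternation sketch, again using Python's `re` module as a stand-in for the IOS engine. The two alternatives and the test words are arbitrary:

```python
import re

# Two alternative patterns separated by a vertical bar.
pattern = re.compile(r"codex|telebit")

# A string matches if it satisfies either alternative.
words = ["codex", "telebit", "cisco"]
matched = [w for w in words if pattern.fullmatch(w)]
print(matched)
```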