Sep 29, 2010

Cisco regular expressions

The origins of regular expressions lie in automata theory and formal language theory, both of which are part of theoretical computer science. These fields study models of computation (automata) and ways to describe and classify formal languages.

In the 1950s, mathematician Stephen Cole Kleene described these models using his mathematical notation called regular sets. The SNOBOL language was an early implementation of pattern matching, but not identical to regular expressions. Ken Thompson built Kleene’s notation into the editor QED as a means to match patterns in text files.

A regular expression is entered as part of a command and is a pattern made up of symbols, letters, and numbers that represent an input string for matching (or sometimes not matching). Matching the string to the specified pattern is called pattern matching. Pattern matching either succeeds or fails. If a regular expression can match two different parts of an input string, it will match the earliest part first.
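The succeed-or-fail behaviour and the "earliest part first" rule can be sketched with Python's `re` module. This is only an approximation: Python's engine is assumed here as a stand-in and is not the Cisco IOS engine, and the helper name `first_match` is hypothetical.

```python
import re

def first_match(pattern, text):
    """Return (start_index, matched_text) for the leftmost match, or None on failure."""
    m = re.search(pattern, text)
    return (m.start(), m.group(0)) if m else None

# "12" occurs twice in the input; the search reports the earliest occurrence.
print(first_match(r"12", "0121234"))
# No "99" in the input, so pattern matching fails.
print(first_match(r"99", "0121234"))
```
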

Cisco IOS software uses regular expression pattern matching in several implementations. The following is a list of some of these implementations:

  • Filtering the output of ‘show’ commands
  • BGP IP AS-path and X.29 access lists
  • Modem (or chat) and system scripts
  • X.25 route substitute destination feature
  • Protocol translation ruleset scripts

 

Below are the basic Cisco IOS regular expression characters and their functions.

Regular expression character, function, and examples:
. Matches any single character.
  • 0.0 matches 0x0 and 020
  • t..t matches strings such as test, text, and tart
\ Matches the character following the backslash. Also matches (escapes) special characters.
  • 172\.1\.. matches 172.1.10.10 but not 172.12.0.0
  • \. allows a period to be matched as a period
[ ] Matches the characters or a range of characters separated by a hyphen, within left and right square brackets.
  • [02468a-z] matches 0, 4, and w, but not 1, 9, or K
^ Matches the character or null string at the beginning of an input string.
  • ^123 matches 1234, but not 01234
? Matches zero or one occurrence of the pattern. (Precede the question mark with Ctrl-V sequence to prevent it from being interpreted as a help command.)
  • ba?b matches bb and bab
$ Matches the character or null string at the end of an input string.
  • 123$ matches 0123, but not 1234
* Matches zero or more sequences of the character preceding the asterisk. Combined with a period (.*), it acts as a wildcard for matching any number of characters.
  • 5* matches any occurrence of the number 5 including none
  • 18\..* matches the characters 18. and any characters that follow 18.
+ Matches one or more sequences of the character preceding the plus sign.
  • 8+ requires there to be at least one number 8 in the string to be matched
() [] Nest characters for matching. Separate endpoints of a range with a dash (-).
  • (17)* matches any number of the two-character string 17
  • ([A-Za-z][0-9])+ matches one or more instances of letter-digit pairs: b8 and W4, as examples
| Separates alternative constructs (alternation). Matches one of the characters or character patterns on either side of the vertical bar.
  • A(B|C)D matches ABD and ACD, but not AD, ABCD, ABBD, or ACCD
_ Replaces a long regular expression list by matching a comma (,), left brace ({), right brace (}), the beginning of the input string, the end of the input string, or a space. The characters _1300_ can match any of the following strings:

  • ^1300$
  • ^1300space
  • space1300
  • {1300,
  • {1300}
  • ,1300,
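The underscore has no special meaning in most other regex engines, so its behaviour can be illustrated by expanding it by hand. The sketch below uses Python's `re` and covers only the delimiters named above (comma, braces, space, start and end of string). The helper names `cisco_to_python` and `matches` are hypothetical, and the translation is deliberately naive: it does not treat an underscore inside square brackets specially.

```python
import re

# Assumption: this set mirrors only the delimiters listed in the table above.
UNDERSCORE = r"(?:^|$|[ ,{}])"

def cisco_to_python(pattern):
    """Expand each unescaped underscore into an explicit delimiter group."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            out.append(pattern[i:i + 2])  # keep escaped pairs such as \_ untouched
            i += 2
        elif ch == "_":
            out.append(UNDERSCORE)
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)

def matches(pattern, text):
    """True if the translated pattern is found anywhere in text."""
    return re.search(cisco_to_python(pattern), text) is not None

for candidate in ["1300", "1300 65000", "65000 1300", "{1300,", ",1300,", "{1300}", "13000", "21300"]:
    print(repr(candidate), matches("_1300_", candidate))
```

The last two inputs are rejected because 1300 is embedded in a longer number with no delimiter next to it, which is the reason the underscore is used when matching AS numbers.
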

Note: The order for matching using the * or + character is longest construct first. Nested constructs are matched from the outside in. Concatenated constructs are matched beginning at the left side.
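A rough illustration of the "longest construct first" rule, again using Python's `re` as an assumed stand-in for the IOS engine. Python quantifiers are greedy, which gives the same result for these simple patterns; the helper name `longest` is hypothetical.

```python
import re

def longest(pattern, text):
    """Return the text consumed by the leftmost match, or None if there is no match."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

print(repr(longest(r"5+", "a5551")))              # the whole run of 5s is consumed
print(repr(longest(r"5*", "a555")))               # zero 5s at the very start also counts as a match
print(repr(longest(r"(17)*", "171717x")))         # the group repeats as often as it can
print(repr(longest(r"18\..*", "net 18.0.0.0/8"))) # .* runs to the end of the input
```

The second call shows how the two rules interact: because an empty match at the start of the string is the earliest possible match, 5* never reaches the 5s further to the right.
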
Note: To match one of the special characters listed in the table as a literal character, remove its special meaning by preceding it with a backslash (\). The following examples are single-character patterns matching a dollar sign, an underscore, and a plus sign, respectively:

  • \$
  • \_
  • \+
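The effect of the backslash can be sketched in Python, whose `re` module accepts the same three escaped patterns. This is an assumption-laden illustration rather than IOS behaviour; in particular, the underscore is already a literal in Python, so the backslash in front of it is harmless there. The helper name `position` is hypothetical.

```python
import re

def position(pattern, text):
    """Return the index where the pattern first matches, or None if it does not match."""
    m = re.search(pattern, text)
    return m.start() if m else None

print(position(r"\$", "cost: $5"))  # index of the literal dollar sign
print(position(r"\_", "as_path"))   # index of the literal underscore
print(position(r"\+", "1+1"))       # index of the literal plus sign

# Without the backslash, "$" is an anchor and matches only at the end of the input.
print(position(r"$", "cost: $5"))
```
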

Note: You can reverse the matching of a range by including a caret (^) at the start of the range. The following example matches any single character except the letters listed:

  • [^a-dqsv]
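A small check of the negated range, using Python's `re` as an assumed approximation of the IOS engine. It collects every character of the input that the range accepts; the helper name `accepted` is hypothetical.

```python
import re

def accepted(text):
    """List the characters of text that [^a-dqsv] matches."""
    return re.findall(r"[^a-dqsv]", text)

print(accepted("quiz"))  # q is excluded, the other letters are accepted
print(accepted("dad"))   # every character is in the excluded set
print(accepted("K9-x"))  # uppercase letters, digits and punctuation are accepted too
```
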

 

Examples:
The following example matches anything except a right square bracket (]) or the letter d:

  • [^\]d]

The following example matches any number of occurrences of the letter a, including none:

  • a*

The following pattern requires that at least one letter a be present in the string to be matched:

  • a+

The following pattern matches any number of asterisks (*):

  • \**

The following pattern matches any number of the multiple-character string ab:

  • (ab)*

The following pattern matches one or more instances of alphanumeric pairs:

  • ([A-Za-z][0-9])+

The following regular expression matches an input string only if the string starts with abcd:

  • ^abcd

The following expression is a range that matches any single character, as long as it is not one of the letters a, b, c, or d:

  • [^abcd]

In the following example, the regular expression matches an input string that ends with .12:

  • \.12$
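The two anchored patterns above can be tried out with Python's `re`, assumed here as a stand-in for the IOS engine. The sample inputs and the helper name `found` are made up for illustration.

```python
import re

def found(pattern, text):
    """True if the pattern matches somewhere in text."""
    return re.search(pattern, text) is not None

samples = [
    (r"^abcd", "abcdef"),      # starts with abcd
    (r"^abcd", "xabcd"),       # abcd is present, but not at the start
    (r"\.12$", "10.0.0.12"),   # ends with .12
    (r"\.12$", "10.0.0.120"),  # .12 is present, but not at the end
    (r"\.12$", "10.0.0x12"),   # the escaped period must be a real period
]
for pattern, text in samples:
    print(pattern, text, found(pattern, text))
```
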

This regular expression matches the letter a followed by any character (call it character #1), followed by bc, followed by any character (character #2), followed by character #1 again, followed by character #2 again. In this way, the regular expression can match aZbcTZT. The software identifies character #1 as Z and character #2 as T, and then uses Z and T again later in the regular expression:

  • a(.)bc(.)\1\2
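Python's `re` module supports the same backreference syntax, so the pattern can be tried there as a rough approximation of what IOS does. The helper name `recall` is hypothetical.

```python
import re

PATTERN = re.compile(r"a(.)bc(.)\1\2")

def recall(text):
    """Return the two remembered characters if the whole text matches, else None."""
    m = PATTERN.fullmatch(text)
    return m.groups() if m else None

print(recall("aZbcTZT"))  # character #1 is Z, character #2 is T
print(recall("aZbcTTZ"))  # the remembered characters come back in the wrong order
print(recall("a1bc212"))  # any characters work, as long as they repeat in order
```
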

Remember: Alternation allows you to specify alternative patterns to match against a string. Separate the alternative patterns with a vertical bar (|). The match succeeds when any one of the alternatives matches the input string.
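As a quick check of alternation, the A(B|C)D pattern from the table can be run against its listed inputs with Python's `re`, assumed here as an approximation of the IOS engine. The helper name `whole_match` is hypothetical.

```python
import re

def whole_match(pattern, text):
    """True if the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None

for candidate in ["ABD", "ACD", "AD", "ABCD", "ABBD", "ACCD"]:
    print(candidate, whole_match(r"A(B|C)D", candidate))
```

Only ABD and ACD succeed, because the group must contribute exactly one of the two alternatives between A and D.
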

 

References: