Sun Release 4.1    Last change: 2 October 1989              ED(1)
Regular Expressions

     ed supports a limited form of  regular-expression  notation,
     which can be used in a line address to specify lines by con-
     tent. A regular expression (RE) specifies a set of character
     strings to match against, such as "any string containing
     digits 5 through 9" or "only lines containing uppercase
     letters."  A member of this set of strings is said to be
     matched by the regular expression.  Regular expressions, or
     patterns, are used to address lines in the buffer (see
     Addresses, above), and also to select strings to be
     replaced using the s (substitute) command.

     Where multiple matches are present  in  a  line,  a  regular
     expression  matches  the  longest  of  the leftmost matching
     strings.
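
     The leftmost-longest rule can be illustrated with a short
     sketch.  Python's re module is used here only as a stand-in
     for ed's matcher (an assumption: its syntax differs from
     ed's, but for this simple pattern a greedy match yields the
     same result).

```python
import re

# Two runs of "a" are present: a short one starting at index 1
# and a longer one starting at index 5.
line = "xaaayaaaaa"

# "aa*" means one "a" followed by zero or more "a".
m = re.search(r"aa*", line)

# The leftmost run wins, and within that run the longest string is taken.
leftmost_longest = (m.start(), m.group())
print(leftmost_longest)
```

     The longer run later in the line is not chosen, because the
     match that starts furthest to the left takes priority.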

     Regular expressions can  be  built  up  from  the  following
     "single-character" RE's:

     c    Any ordinary character not listed below.   An  ordinary
          character matches itself.

     \    Backslash.  When followed by a special character, the
          RE matches the "quoted" character itself.  A backslash
          followed by one of <, >, (, ), {, or } represents an
          operator in a regular expression, as described below.

     .    Dot.  Matches any single character except NEWLINE.
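
          A minimal sketch of dot and backslash quoting, using
          Python's re module as a stand-in for ed (an assumption).
          Dot and a quoted dot behave the same way in both; note
          that Python reverses ed's convention for parentheses
          and braces, which are operators there without a
          backslash.

```python
import re

dot = re.compile(r"a.c")      # "." matches any single character except newline
quoted = re.compile(r"a\.c")  # "\." matches only a literal period

dot_hits = [bool(dot.search(s)) for s in ("abc", "a.c", "a\nc")]
quoted_hits = [bool(quoted.search(s)) for s in ("abc", "a.c")]

print(dot_hits)
print(quoted_hits)
```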

     ^    As the leftmost character, a caret (or circumflex) con-
          strains the RE to match the leftmost portion of a line.
          A match of this type  is  called  an  "anchored  match"
          because  it  is  "anchored"  to a specific place in the
          line.  The ^ character loses its special meaning if  it
          appears in any position other than the start of the RE.

     $    As the rightmost character, a  dollar  sign  constrains
          the RE to match the rightmost portion of a line.  The $
          character loses its special meaning if  it  appears  in
          any position other than at the end of the RE.

     ^RE$ The construction ^RE$ constrains the RE  to  match  the
          entire line.
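
          The three anchored forms can be compared side by side.
          This is a sketch in Python's re module, used as a
          stand-in for ed; the sample lines are invented for
          illustration.

```python
import re

# Hypothetical buffer contents, one string per line, without newlines.
lines = ["the cat", "cat nap", "bobcat", "cat"]

starts = [s for s in lines if re.search(r"^cat", s)]   # anchored at line start
ends = [s for s in lines if re.search(r"cat$", s)]     # anchored at line end
whole = [s for s in lines if re.search(r"^cat$", s)]   # must span the entire line

print(starts, ends, whole)
```

          Unlike ed, Python treats ^ and $ as anchors wherever
          they appear in a pattern, so a literal caret or dollar
          sign must be quoted there.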

     \<   The sequence \< in an RE constrains  the  one-character
          RE  immediately following it only to match something at
          the beginning of a  "word";  that  is,  either  at  the
          beginning of a line, or just before a letter, digit, or
          underline and after a character not one of these.

     \>   The sequence \> in an RE constrains  the  one-character
          RE  immediately following it only to match something at
          the end of a "word."
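
          Python's re module has no \< or \> operators.  As an
          approximation (an assumption, not ed's own syntax), \b
          placed before a word character behaves like \<, and \b
          placed after one behaves like \>.  The sample lines are
          invented for illustration.

```python
import re

lines = ["cat nap", "bobcat", "concatenate", "a cat."]

# "\bcat" approximates ed's "\<cat": "cat" must begin a word.
word_start = [s for s in lines if re.search(r"\bcat", s)]

# "cat\b" approximates ed's "cat\>": "cat" must end a word.
word_end = [s for s in lines if re.search(r"cat\b", s)]

print(word_start)
print(word_end)
```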

     [c...]
          A nonempty string of characters enclosed in square
          brackets matches any single character in the string.
          For example, [abcxyz] matches any single character from
          the set `abcxyz'.  When the first character of the
          string is a caret (^), the RE matches any character
          except NEWLINE and those in the remainder of the
          string.  For example, `[^45678]' matches any character
          except NEWLINE and the digits 4 through 8.  A caret in
          any other position is interpreted as an ordinary
          character.

     []c...]
          The right square bracket does not terminate the
          enclosed string if it is the first character in the
          bracketed string (after an initial `^', if any).  In
          this position it is treated as an ordinary character.

     [l-r]
          The minus sign, between  two  characters,  indicates  a
          range  of  consecutive  ASCII characters to match.  For
          example, the range `[0-9]' is equivalent to the  string
          `[0123456789]'.   Such a bracketed string of characters
          is known as a character class.  The `-' is  treated  as
          an  ordinary  character  if  it  occurs first (or first
          after an initial ^) or last in the string.
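
          The bracket rules above can be tried out in a short
          sketch.  Python's re module is used as a stand-in for
          ed (an assumption); it follows the same placement rules
          for ], ^, and - inside brackets, although its negated
          classes also match NEWLINE.  The sample text is
          invented.

```python
import re

text = "a-b]c7 5"

in_set = re.findall(r"[abcxyz]", text)          # any one of a, b, c, x, y, z
not_digits = re.findall(r"[^45678]", "4a5b9")   # anything except 4 through 8
with_bracket = re.findall(r"[]a]", text)        # leading "]" is ordinary
digits = re.findall(r"[0-9]", text)             # range, same as [0123456789]
with_minus = re.findall(r"[a-]", text)          # trailing "-" is ordinary

print(in_set, not_digits, with_bracket, digits, with_minus)
```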

     d    Delimiter character.  The character used to delimit  an
          RE  within  a  command is special for that command (for
          example, see how / is used in the g command, below).

     The following rules and special characters  allow  for  con-
     structing RE's from single-character RE's:

          A concatenation of RE's matches a concatenation of text
          strings,  each  of which is a match for a successive RE
          in the search pattern.

     *    A single-character RE followed by an asterisk (*)
          matches zero or more occurrences of the single-character
          RE.  Such a pattern is called a closure.  For example,
          [a-z][a-z]* matches any string of one or more lowercase
          letters.
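
          A brief sketch of closure, with Python's re module
          standing in for ed (an assumption; the * operator
          behaves the same way for this pattern).  The sample
          text is invented.

```python
import re

# One lowercase letter followed by zero or more lowercase letters.
words = re.findall(r"[a-z][a-z]*", "ed 4.1 sun release")

# "Zero or more" means the starred RE may be absent altogether.
zero_ok = re.fullmatch(r"ab*", "a") is not None
many_ok = re.fullmatch(r"ab*", "abbbb") is not None

print(words, zero_ok, many_ok)
```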

     \{m\}
     \{m,\}
     \{m,n\}
          A  one-character  RE  followed  by  \{m\},  \{m,\},  or
          \{m,n\} is an RE that matches a range of occurrences of
          the one-character RE. The values of m  and  n  must  be
          nonnegative  integers  less  than  256;  \{m\}  matches
          exactly  m  occurrences;  \{m,\}  matches  at  least  m
          occurrences; \{m,n\} matches any number of occurrences
          between m and n, inclusive.  Whenever a choice exists,
          the RE matches as many occurrences as possible.
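
          The three interval forms can be sketched as follows.
          Python's re module is used as a stand-in (an
          assumption): it writes the braces without backslashes,
          so ed's \{m,n\} becomes {m,n}, and ed's limit of 256
          does not carry over.

```python
import re

run = "aaaaaa"  # six occurrences of "a"

exactly_three = re.fullmatch(r"a{3}", "aaa") is not None   # ed: a\{3\}
not_four = re.fullmatch(r"a{3}", "aaaa") is None           # four is too many
at_least_two = re.search(r"a{2,}", run).group()            # ed: a\{2,\}
two_to_four = re.search(r"a{2,4}", run).group()            # ed: a\{2,4\}

print(exactly_three, not_four, at_least_two, two_to_four)
```

          In both bounded and unbounded forms the match takes as
          many occurrences as the interval allows.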

     \(...\)
          An RE enclosed between the character sequences  \(  and
          \) matches whatever the unadorned RE matches, but saves
          the string matched by the enclosed  RE  in  a  numbered
          substring  register.  There can be up to nine such sub-
          strings in an RE,  and  parenthesis  operators  can  be
          nested.

     \n   Match the contents of the nth substring  register  from
          the  current RE. This provides a mechanism for extract-
          ing matched substrings.  For  example,  the  expression
          ^\(..*\)\1$  matches  a line consisting entirely of two
          adjacent non-null appearances of the same string.  When
          nested  parenthesized  substrings  are  present,  n  is
          determined by counting occurrences of \( starting  from
          the left.
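
          The doubled-string example and the left-to-right
          numbering of nested substrings can be checked with a
          sketch.  Python's re module stands in for ed here (an
          assumption): it writes grouping parentheses without
          backslashes, so ed's ^\(..*\)\1$ becomes ^(..*)\1$.

```python
import re

# A line made of two adjacent copies of the same non-null string.
doubled = re.compile(r"^(..*)\1$")

results = {s: bool(doubled.search(s)) for s in ("abcabc", "abcabd", "aa", "a")}

# Nested groups are numbered by the position of their opening parenthesis.
m = re.search(r"((a)(b))c", "abc")
nested = (m.group(1), m.group(2), m.group(3))

print(results)
print(nested)
```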

     //   The null RE (//) is equivalent to the last  RE  encoun-
          tered.
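
          Python's re module has no counterpart to the null RE,
          so the behavior is sketched below with a small helper
          that remembers the last pattern.  The class
          PatternMemory, its method, and the error text are all
          hypothetical and are not part of ed or of Python.

```python
import re


class PatternMemory:
    """Hypothetical helper: an empty pattern reuses the previous one."""

    def __init__(self):
        self.last = None  # most recent non-empty pattern, if any

    def search(self, pattern, line):
        if pattern == "":
            if self.last is None:
                # Illustrative error only; ed reports this case in its own way.
                raise ValueError("no previous regular expression")
            pattern = self.last
        self.last = pattern
        return re.search(pattern, line)


memory = PatternMemory()
first = memory.search(r"[0-9][0-9]*", "line 42")   # sets the remembered RE
again = memory.search("", "page 7 of 9")           # null RE: reuse it

print(first.group(), again.group())
```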