regular expression
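Today I studied Perl's regular expressions. Are they important? I had not used them for a long time, and I always end up scrambling when I need them, because each time I use them once and then drop them without ever properly understanding them.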
Character | Pattern
. | Match any single character except newline. Can match newline in awk.
* | Match any number (or none) of the single character that immediately precedes it. The preceding character can also be a regular expression. For example, since . (dot) means any character, .* means "match any number of any character."
^ | Match the following regular expression at the beginning of the line or string.
$ | Match the preceding regular expression at the end of the line or string.
\ | Turn off the special meaning of the following character.
[ ] | Match any one of the enclosed characters. A hyphen (-) indicates a range of consecutive characters. A circumflex (^) as the first character in the brackets reverses the sense: it matches any one character not in the list. A hyphen or close bracket (]) as the first character is treated as a member of the list. All other metacharacters are treated as members of the list (i.e., literally).
{n,m} | Match a range of occurrences of the single character that immediately precedes it. The preceding character can also be a metacharacter. {n} matches exactly n occurrences; {n,} matches at least n occurrences; and {n,m} matches any number of occurrences between n and m. n and m must be between 0 and 255, inclusive.
\{n,m\} | Just like {n,m}, but with backslashes in front of the braces.
\( \) | Save the pattern enclosed between \( and \) into a special holding space. Up to nine patterns can be saved on a single line. The text matched by the subpatterns can be "replayed" in substitutions by the escape sequences \1 to \9.
\n | Replay the nth sub-pattern enclosed in \( and \) into the pattern at this point. n is a number from 1 to 9, with 1 starting on the left.
\< \> | Match characters at beginning (\<) or end (\>) of a word.
+ | Match one or more instances of preceding regular expression.
? | Match zero or one instances of preceding regular expression.
| | Match the regular expression specified before or after.
( ) | Apply a match to the enclosed group of regular expressions.
\w | Word character
\W | Non-word character
\d | Digit character
\D | Non-digit character
\s | Whitespace character
\S | Non-whitespace character
Note: in Perl regular expressions, the contents of ( ) can serve as a group, and the text they match can also be saved in a variable. That captured text can be used both in the search pattern (as \1) and in the replacement pattern (as $1).
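The table above mixes sed/awk-style and Perl-style syntax. As a minimal sketch of how several of these metacharacters look in Perl itself (the sample string and variable names are made up for illustration): Perl writes grouping and counts as plain ( ) and {n,m} without backslashes, and uses \b rather than \< and \> for word boundaries.

```perl
use strict;
use warnings;

# Hypothetical sample line, used only for illustration.
my $line = "error 404: page not found";

# ^ anchors at the start; [0-9]{3} matches exactly three digits.
my ($code) = $line =~ /^error ([0-9]{3}):/;

# \w+ is one or more word characters; $ anchors at the end.
my ($last_word) = $line =~ /(\w+)$/;

# \s+ matches a run of whitespace; in list context /g returns every match.
my @gaps = $line =~ /\s+/g;

# ? makes the preceding "u" optional, so both spellings match.
my $both = ("color" =~ /^colou?r$/) && ("colour" =~ /^colou?r$/);

# ( ) captures, and \1 replays the captured text inside the same pattern.
my $has_double = ("hello" =~ /(l)\1/) ? 1 : 0;

print "code=$code last=$last_word gaps=", scalar(@gaps), "\n";
```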
1.3.2.2 Replacement patterns
The characters in the following table have special meaning only in replacement patterns:
Character | Pattern
\ | Turn off the special meaning of the following character.
\n | Restore the text matched by the nth pattern previously saved by \( and \). n is a number from 1 to 9, with 1 starting on the left.
& | Reuse the text matched by the search pattern as part of the replacement pattern.
~ | Reuse the previous replacement pattern in the current replacement pattern. Must be the only character in the replacement pattern (ex and vi).
% | Reuse the previous replacement pattern in the current replacement pattern. Must be the only character in the replacement pattern (ed).
\u | Convert first character of replacement pattern to uppercase.
\U | Convert entire replacement pattern to uppercase.
\l | Convert first character of replacement pattern to lowercase.
\L | Convert entire replacement pattern to lowercase.
\E | Turn off previous \U or \L.
\e | Turn off previous \u or \l.
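Several of the entries above belong to ed, ex, vi, or sed rather than Perl. The following is a small sketch of the Perl counterparts, with a made-up input string: captures are written $1 and $2 in the replacement, $& plays the role of &, and \E ends a \U or \L run. Perl has no ~ or % replacement shortcut, and since \u and \l only affect the next character there is nothing to turn off.

```perl
use strict;
use warnings;

# Hypothetical input, used only for illustration.
my $name = "ada lovelace";

# \u uppercases the next character; $1 and $2 are the captured groups.
(my $title = $name) =~ s/(\w+) (\w+)/\u$1 \u$2/;

# \U ... \E uppercases everything between the two escapes.
(my $shout = $name) =~ s/(\w+) (\w+)/\U$1\E $2/;

# $& holds the entire matched text.
(my $quoted = $name) =~ s/\w+/[$&]/g;

print "$title\n$shout\n$quoted\n";
```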
1.3.2 Regular Expression Operators
Perl provides the built-in regular expression operators qr//, m//, and s///, as well as the split function. Each operator accepts a regular expression pattern string that is run through string and variable interpolation and then compiled. Regular expressions are often delimited with the forward slash, but you can pick any non-alphanumeric, non-whitespace character. Here are some examples:
    qr#...#
    m!...!
    m{...}
    s|...|...|
    s[...][...]
    s<...>/.../
A match delimited by slashes (/.../) doesn't require a leading m:
    /.../    # same as m/.../
Using the single quote as a delimiter suppresses interpolation of variables and the constructs \N{name}, \u, \l, \U, \L, \Q, \E. Normally these are interpolated before being passed to the regular expression engine.
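To make the delimiter and interpolation rules concrete, here is a short sketch. The path and the variable names are assumptions chosen for the example, not part of the original text.

```perl
use strict;
use warnings;

# Hypothetical values, used only for illustration.
my $path = "/usr/local/bin";
my $dir  = "local";

# Brace delimiters avoid having to escape the slashes in the pattern.
my $under_usr = ($path =~ m{^/usr/}) ? 1 : 0;

# With ordinary delimiters, $dir is interpolated, so this looks for "/local/".
my $interpolated = ($path =~ m!/$dir/!) ? 1 : 0;

# With single-quote delimiters, $dir is NOT interpolated: the engine sees
# a literal "$" (end-of-string anchor) followed by "dir/", which cannot match here.
my $not_interpolated = ($path =~ m'/$dir/') ? 1 : 0;

print "$under_usr $interpolated $not_interpolated\n";
```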
qr// (Quote Regex)
qr/PATTERN/ismxo
Quote and compile PATTERN as a regular expression. The returned value may be used in a later pattern match or substitution. This saves time if the regular expression is going to be repeatedly interpolated. The match modes (or lack of), /ismxo, are locked in.
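A brief sketch of qr// in use, assuming a made-up list of headlines: the pattern is compiled once, keeps its /i mode, and can be matched directly or interpolated into a larger pattern.

```perl
use strict;
use warnings;

# Compile once; the /i mode is locked into the compiled pattern.
my $hero = qr/spider[- ]?man/i;

# Hypothetical headlines, used only for illustration.
my @headlines = (
    "Spider-Man Menaces City!",
    "Mayor Opens Bridge",
    "SPIDERMAN Saves Bus",
);

# Use the compiled pattern directly as the right-hand side of =~ ...
my @hits = grep { $_ =~ $hero } @headlines;

# ... or interpolate it into a larger pattern.
my @menaces = grep { /$hero\s+menaces/i } @headlines;

print scalar(@hits), " mention the hero, ", scalar(@menaces), " report a menace\n";
```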
m// (Matching)
m/PATTERN/imsxocg
Match PATTERN against input string. In list context, returns a list of substrings matched by capturing parentheses, or else (1) for a successful match or ( ) for a failed match. In scalar context, returns 1 for success or "" for failure. /imsxo are optional mode modifiers. /cg are optional match modifiers. /g in scalar context causes the match to start from the end of the previous match. In list context, a /g match returns all matches or all captured substrings from all matches. A failed /g match will reset the match start to the beginning of the string unless the match is in combined /cg mode.
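The context rules for m// are easier to see side by side. This is a minimal sketch with illustrative strings; the variable names are not from the original.

```perl
use strict;
use warnings;

my $date = "12/30/1969";

# List context: the captured substrings come back as a list.
my ($month, $day, $year) = $date =~ m{(\d+)/(\d+)/(\d+)};

# Scalar context: true on success, "" on failure.
my $starts_with_digit = ($date =~ /^\d/) ? "yes" : "no";

# List context with /g: every match in the string.
my @numbers = $date =~ /\d+/g;

# Scalar context with /g: each iteration resumes where the previous match ended.
my $text = "a1b22c333";   # hypothetical sample
my @ends;
while ($text =~ /\d+/g) {
    push @ends, pos($text);   # offset just past the current match
}

print "$month-$day-$year @numbers @ends\n";
```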
s/// (Substitution)
s/PATTERN/REPLACEMENT/egimosx
Match PATTERN in the input string and replace the match text with REPLACEMENT, returning the number of successes. /imosx are optional mode modifiers. /g substitutes all occurrences of PATTERN. Each /e causes an evaluation of REPLACEMENT as Perl code.
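A small sketch of the return value and the /e modifier, using made-up strings:

```perl
use strict;
use warnings;

# Hypothetical inputs, used only for illustration.
my $html = "one <br> two <BR> three";

# s/// returns the number of substitutions it made.
my $count = ($html =~ s#<br>#<br />#ig);

my $prices = "3 apples and 4 pears";

# /e evaluates the replacement as Perl code: here, double every number.
(my $doubled = $prices) =~ s/(\d+)/$1 * 2/ge;

# A substitution that matches nothing returns a false value.
my $none = ($prices =~ s/kiwi/fig/);

print "$count: $html\n$doubled\n";
```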
split
split /PATTERN/, EXPR, LIMIT
split /PATTERN/, EXPR
split /PATTERN/
split
Return a list of substrings surrounding matches of PATTERN in EXPR. If a positive LIMIT is given, EXPR is split into at most LIMIT substrings, and the last one holds the unsplit remainder. The pattern argument is a match operator, so use m if you want alternate delimiters (e.g., split m{PATTERN}). The match permits the same modifiers as m{}. Table 1-8 lists the after-match variables.
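The following sketch shows split with and without LIMIT. The passwd-style record is an invented example. As far as I understand the perlfunc documentation, a positive LIMIT caps the number of fields returned, so the last field keeps the unsplit remainder.

```perl
use strict;
use warnings;

# Hypothetical record, used only for illustration.
my $record = "root:x:0:0:root:/root:/bin/bash";

# No LIMIT: split at every colon.
my @fields = split /:/, $record;

# LIMIT of 3: at most three fields; the third holds the rest of the string.
my @first3 = split /:/, $record, 3;

# Alternate delimiters need an explicit m.
my @parts = split m{\s*,\s*}, "a , b,c";

# A single-space string is a special case: split on runs of whitespace,
# ignoring leading whitespace.
my @words = split ' ', "  the quick  brown fox ";

print scalar(@fields), " fields; third of three: $first3[2]\n";
```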
1.3.4 Examples
Example 1-1. Simple match
    # Match Spider-Man, Spiderman, SPIDER-MAN, etc.
    my $dailybugle = "Spider-Man Menaces City!";
    if ($dailybugle =~ m/spider[- ]?man/i) { do_something(); }
Example 1-2. Match, capture group, and qr
    # Match dates formatted like MM/DD/YYYY, MM-DD-YY, ...
    my $date = "12/30/1969";
    my $regex = qr!(\d\d)[-/](\d\d)[-/](\d\d(?:\d\d)?)!;
    if ($date =~ m/$regex/) {
        # The first group is the month and the second is the day (MM/DD/YYYY).
        print "Month=", $1, " Day=", $2, " Year=", $3;
    }
Example 1-3. Simple substitution
    # Convert <br> to <br /> for XHTML compliance
    my $text = "Hello World! <br>";
    $text =~ s#<br>#<br />#ig;
Example 1-4. Harder substitution
    # urlify - turn URLs into HTML links
    $text = "Check the website, http://www.oreilly.com/catalog/repr.";
    $text =~
        s{
          \b                         # start at word boundary
          (                          # capture to $1
          (https?|telnet|gopher|file|wais|ftp) :
                                     # resource and colon
          [\w/#~:.?+=&%@!\-] +?      # one or more valid
                                     # characters
                                     # but take as little as
                                     # possible
          )
          (?=                        # lookahead
            [.:?\-] *                # for possible punctuation
            (?: [^\w/#~:.?+=&%@!\-]  # invalid character
              | $ )                  # or end of string
          )
        }{<a href="$1">$1</a>}igox;
Match any word (a word is defined as a sequence of alphanumerics, with no whitespace) that contains a double letter; for example, "book" has a double "o" and "feed" has a double "e".
    /([a-zA-Z])\1/
Note the \1 here: it stands for whatever the preceding ( ) matched.
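As a quick check of the double-letter pattern, the sketch below filters a made-up word list and a made-up sentence. Only the pattern itself comes from the text above.

```perl
use strict;
use warnings;

# Hypothetical word list, used only for illustration.
my @words = qw(book feed cat letter dog);

# ([a-zA-Z]) captures one letter; \1 requires the same letter to follow immediately.
my @doubles = grep { /([a-zA-Z])\1/ } @words;

# The same test applied to the whitespace-separated words of a sentence.
my $sentence = "I need a good book";
my @found = grep { /([a-zA-Z])\1/ } split ' ', $sentence;

print "@doubles\n@found\n";
```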
(Editor: 李大同)