Regular Expressions: Study Notes

Published: 2020-12-14 01:01:15  Category: Encyclopedia  Source: compiled from the web
Full regular expressions are composed of two types of characters. The special characters are called metacharacters, while the rest are called literals, or normal text characters. It might help to consider regular expressions as their own language, with literal text acting as the words and metacharacters as the grammar.

The egrep command interprets its first command-line argument as a regular expression, and any remaining arguments as the file(s) to search. Note, however, that the single quotes placed around the expression on the command line are not part of the regular expression; they are needed by the command shell.

The metacharacters ^ and $ represent the start and end, respectively, of the line of text being checked.

[...], usually called a character class, lets you list the characters you want to allow at that point in the match. Within a character class, the character-class metacharacter - indicates a range of characters. Note that a dash is a metacharacter only within a character class; elsewhere it matches the normal dash character. If you use [^...] instead of [...], the class matches any character that isn't listed.

The metacharacter . is shorthand for a character class that matches any character. It can be convenient when you want an "any character here" placeholder in your expression.

A very convenient metacharacter is |, which means "or". When the alternation is only part of a larger expression, parentheses are required to limit its scope, because without them the expression means something different: gr(a|e)y matches "gray" or "grey", whereas gra|ey matches "gra" or "ey".

Case-insensitive matching is not part of the regular-expression language itself, but it is a related, useful feature that many tools provide. egrep's command-line option -i tells it to do a case-insensitive match.

A common problem is that a regular expression that matches the word you want can often also match where the "word" is embedded within a larger word. You can use the metasequences \< and \> if your version of egrep happens to support them. You can think of them as word-based versions of ^ and $ that match the positions at the start and end of a word.

The metacharacter ? means optional.
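The metacharacters covered so far can be tried out in a few lines of code. The following is only a minimal sketch: it uses Python's re module in place of egrep (an assumption made here for convenience, not something these notes use), and the sample lines and the helper name matching are invented for illustration. Python's re does not support the \< and \> word-boundary metasequences, so the sketch substitutes \b, which matches at either edge of a word.

```python
import re

# Invented sample lines, standing in for the lines of a file given to egrep.
lines = [
    "cat",
    "the cat sat",
    "concatenate",
    "Gray skies",
    "grey skies",
    "cut",
    "c-t",
    "room 42",
]

def matching(pattern, flags=0):
    """Return the sample lines in which the pattern matches somewhere (hypothetical helper)."""
    regex = re.compile(pattern, flags)
    return [line for line in lines if regex.search(line)]

anchored = matching(r"^cat$")        # ^ and $: the whole line must be "cat"
char_class = matching(r"c[au]t")     # [au]: an a or a u is allowed at that point
digit_range = matching(r"[0-9]")     # inside a class, - indicates a range
literal_dash = matching(r"c-t")      # outside a class, - is an ordinary dash
negated = matching(r"c[^a]t")        # [^a]: any character that is not an a
any_char = matching(r"c.t")          # . is an "any character here" placeholder
alternation = matching(r"gr(a|e)y")  # | means "or"; parentheses limit its scope
ignore_case = matching(r"gr(a|e)y", re.IGNORECASE)  # comparable to egrep -i
whole_word = matching(r"\bcat\b")    # \b stands in for \< and \>

for name in ("anchored", "char_class", "digit_range", "literal_dash", "negated",
             "any_char", "alternation", "ignore_case", "whole_word"):
    print(name, globals()[name])
```

Comparing char_class with whole_word shows the embedded-word problem: the first list includes "concatenate", the second does not.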
The ? is placed after the character, or after the parenthesized string, that it applies to. That item is allowed to appear at that point in the expression, but its presence is not required for the match to succeed.

Similar to the question mark are + and *. The metacharacter + means "one or more of the immediately preceding item", and * means "any number, including none, of the item". Some versions of egrep support a metasequence for providing your own minimum and maximum number of repetitions: {min,max}, placed after the item.

Backreferencing is a regular-expression feature that allows you to match new text that is the same as some text matched earlier in the expression. To find a doubled word, for example, we wrap the pattern for the first word in parentheses and replace the second word with the special metasequence \1 (\2, \3, and so on refer to later sets of parentheses). The expression \<([a-zA-Z]+) +\1\> then finds doubled words.
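The quantifiers and the backreference can be checked the same way. This is again a minimal sketch that uses Python's re module rather than egrep (an assumption, not the tool these notes describe); the candidate strings and the helper name full_matches are invented for illustration, and \b replaces \< and \> because Python's re does not support them.

```python
import re

def full_matches(pattern, candidates):
    """Return the candidates that the pattern matches in their entirety (hypothetical helper)."""
    return [text for text in candidates if re.fullmatch(pattern, text)]

# ? makes the immediately preceding item optional.
optional = full_matches(r"colou?r", ["color", "colour", "colouur"])

# A parenthesized string can be made optional as a whole.
optional_group = full_matches(r"Feb(ruary)?", ["Feb", "February", "Febru"])

# + requires one or more of the preceding item; * also allows none.
plus = full_matches(r"ab+c", ["ac", "abc", "abbbc"])
star = full_matches(r"ab*c", ["ac", "abc", "abbbc"])

# {min,max} bounds the number of repetitions: here, two or three digits.
interval = full_matches(r"[0-9]{2,3}", ["7", "42", "123", "1234"])

# Backreference: \1 must match the same text that the first parenthesized
# group matched. \b stands in for \< and \> at the two ends.
doubled_word = re.compile(r"\b([a-zA-Z]+) +\1\b")
sentences = ["this is the the end", "this is fine", "the theory"]
doubled = [m.group(0) for m in (doubled_word.search(s) for s in sentences) if m]

print(optional, optional_group, plus, star, interval, doubled)
```

The word-boundary anchors matter here: without the leading one, "this is" would be reported as a doubled "is", and without the trailing one, "the theory" would be reported as a doubled "the".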

(Editor: 李大同)
