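A Detailed Guide to the Basic Rules of Regular Expressions (01)
This article is fairly long, so readers can use the table of contents above to jump to the parts they need. If anything is unclear, feel free to ask.
Character classes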
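Example:
public class pa1 {
    public static void main(String[] args) {
        String sta = "4";
        // A character class matches any single character listed between the brackets.
        String regex = "[0123456789]";
        System.out.println("This example checks whether a character is a decimal digit");
        if (sta.matches(regex)) {
            System.out.println("It is a decimal digit");
        } else {
            System.out.println("It is not a decimal digit");
        }
    }
}
Output:
This example checks whether a character is a decimal digit
It is a decimal digit
Hyphen
Inside a character class, the hyphen "-" denotes a range, so [0-9] is equivalent to [0123456789]. You can test it:
public class pa1 {
    public static void main(String[] args) {
        String sta = "4";
        // The range 0-9 covers the same ten digits as the explicit list above.
        String regex = "[0-9]";
        System.out.println("This example checks whether a character is a decimal digit");
        if (sta.matches(regex)) {
            System.out.println("It is a decimal digit");
        } else {
            System.out.println("It is not a decimal digit");
        }
    }
}
Output:
This example checks whether a character is a decimal digit
It is a decimal digit
Notes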
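One detail worth checking when working with ranges: a hyphen only denotes a range when it sits between two characters, so it has to be escaped to be matched as a literal character. The following is a minimal sketch and not part of the original examples; the class name HyphenInClassDemo and the helper matchesOne are made up for illustration.

```java
import java.util.regex.Pattern;

class HyphenInClassDemo {
    // Returns true when the whole input is accepted by the regex.
    public static boolean matchesOne(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        String hexDigit = "[0-9a-fA-F]";   // three ranges combined in one class
        String digitOrHyphen = "[0-9\\-]"; // escaped hyphen is a literal "-"

        System.out.println(matchesOne(hexDigit, "c"));      // true
        System.out.println(matchesOne(hexDigit, "g"));      // false
        System.out.println(matchesOne(digitOrHyphen, "-")); // true
        System.out.println(matchesOne("[0-9]", "-"));       // false
    }
}
```

Several ranges can share one class, which is how the hexadecimal digit class above is built.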
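Negated character classes
Example:
public class pa2 {
    public static void main(String[] args) {
        // Starting after "l", the remaining entries are: an ASCII exclamation mark,
        // a full-width (Chinese) exclamation mark, a space, a tab, and a newline.
        String[] stas = { "1", "8", "狼", "l", "!", "!", " ", "\t", "\n" };
        String regex = "[^123]";
        for (String i : stas) {
            System.out.println("Character [" + i + "] match status: [" + i.matches(regex) + "]");
        }
    }
}
Output:
Character [1] match status: [false]
Character [8] match status: [true]
Character [狼] match status: [true]
Character [l] match status: [true]
Character [!] match status: [true]
Character [!] match status: [true]
Character [ ] match status: [true]
Character [	] match status: [true]
Character [
] match status: [true]
So a negated character class excludes the characters listed in it and matches any other single character.
Character class shorthands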
Example:
public class pa3 {
    public static void main(String[] args) {
        // Inside a Java string literal each shorthand needs a doubled backslash.
        String digitChar = "\\d";
        String noDigitChar = "\\D";
        String wordChar = "\\w";
        String noWordChar = "\\W";
        String spaceChar = "\\s";
        String noSpaceChar = "\\S";
        String[] regexes = { digitChar, noDigitChar, wordChar, noWordChar, spaceChar, noSpaceChar };
        String[] strs = new String[] { "0", "3", "8", "9", "a", "z", "E", "G", " ", "\t", "\r", "\n", "!", "!", "狼" };
        // Test every string against each shorthand in turn.
        for (String regex : regexes) {
            for (String s : strs) {
                if (regexMatch(s, regex)) {
                    System.out.println("\"" + regex + "\" can match \"" + s + "\"");
                } else {
                    System.out.println("\"" + regex + "\" can not match \"" + s + "\"");
                }
            }
            System.out.println("");
        }
    }

    public static boolean regexMatch(String s, String regex) {
        return s.matches(regex);
    }
}
Output (whitespace characters are written as escape sequences here for readability):
"\d" can match "0"
"\d" can match "3"
"\d" can match "8"
"\d" can match "9"
"\d" can not match "a"
"\d" can not match "z"
"\d" can not match "E"
"\d" can not match "G"
"\d" can not match " "
"\d" can not match "\t"
"\d" can not match "\r"
"\d" can not match "\n"
"\d" can not match "!"
"\d" can not match "!"
"\d" can not match "狼"
"\D" can not match "0"
"\D" can not match "3"
"\D" can not match "8"
"\D" can not match "9"
"\D" can match "a"
"\D" can match "z"
"\D" can match "E"
"\D" can match "G"
"\D" can match " "
"\D" can match "\t"
"\D" can match "\r"
"\D" can match "\n"
"\D" can match "!"
"\D" can match "!"
"\D" can match "狼"
"\w" can match "0"
"\w" can match "3"
"\w" can match "8"
"\w" can match "9"
"\w" can match "a"
"\w" can match "z"
"\w" can match "E"
"\w" can match "G"
"\w" can not match " "
"\w" can not match "\t"
"\w" can not match "\r"
"\w" can not match "\n"
"\w" can not match "!"
"\w" can not match "!"
"\w" can not match "狼"
"\W" can not match "0"
"\W" can not match "3"
"\W" can not match "8"
"\W" can not match "9"
"\W" can not match "a"
"\W" can not match "z"
"\W" can not match "E"
"\W" can not match "G"
"\W" can match " "
"\W" can match "\t"
"\W" can match "\r"
"\W" can match "\n"
"\W" can match "!"
"\W" can match "!"
"\W" can match "狼"
"\s" can not match "0"
"\s" can not match "3"
"\s" can not match "8"
"\s" can not match "9"
"\s" can not match "a"
"\s" can not match "z"
"\s" can not match "E"
"\s" can not match "G"
"\s" can match " "
"\s" can match "\t"
"\s" can match "\r"
"\s" can match "\n"
"\s" can not match "!"
"\s" can not match "!"
"\s" can not match "狼"
"\S" can match "0"
"\S" can match "3"
"\S" can match "8"
"\S" can match "9"
"\S" can match "a"
"\S" can match "z"
"\S" can match "E"
"\S" can match "G"
"\S" can not match " "
"\S" can not match "\t"
"\S" can not match "\r"
"\S" can not match "\n"
"\S" can match "!"
"\S" can match "!"
"\S" can match "狼"
A special shorthand: the dot
Example:
public class pa4 {
    public static void main(String[] args) {
        String[] strings = new String[] { "a", "A", "0", "$", "(", ".", "\n", "\r" };
        String normalDot = ".";           // matches any character except a line terminator
        String escapedDot = "\\.";        // matches only a literal dot
        String characterClassDot = "[.]"; // inside a character class the dot is literal
        String[] regexes = { normalDot, escapedDot, characterClassDot };
        for (String regex : regexes) {
            for (String s : strings) {
                if (regexMatch(s, regex)) {
                    System.out.println("\"" + s + "\" can be matched with regex \"" + regex + "\"");
                } else {
                    System.out.println("\"" + s + "\" can not be matched with regex \"" + regex + "\"");
                }
            }
            System.out.println("");
        }
    }

    public static boolean regexMatch(String s, String regex) {
        return s.matches(regex);
    }
}
Output (line terminators are written as escape sequences here for readability):
"a" can be matched with regex "."
"A" can be matched with regex "."
"0" can be matched with regex "."
"$" can be matched with regex "."
"(" can be matched with regex "."
"." can be matched with regex "."
"\n" can not be matched with regex "."
"\r" can not be matched with regex "."
"a" can not be matched with regex "\."
"A" can not be matched with regex "\."
"0" can not be matched with regex "\."
"$" can not be matched with regex "\."
"(" can not be matched with regex "\."
"." can be matched with regex "\."
"\n" can not be matched with regex "\."
"\r" can not be matched with regex "\."
"a" can not be matched with regex "[.]"
"A" can not be matched with regex "[.]"
"0" can not be matched with regex "[.]"
"$" can not be matched with regex "[.]"
"(" can not be matched with regex "[.]"
"." can be matched with regex "[.]"
"\n" can not be matched with regex "[.]"
"\r" can not be matched with regex "[.]"
Quantifiers
Example:
public class pa5 {
    public static void main(String[] args) {
        String[] strings = new String[] { "", "a", "aa", "aaa" };
        String regex = "a*";  // zero or more occurrences
        String regex2 = "a?"; // zero or one occurrence
        String regex3 = "a+"; // one or more occurrences
        for (String r : new String[] { regex, regex2, regex3 }) {
            for (String str : strings) {
                if (str.matches(r)) {
                    System.out.println("\"" + str + "\" can be matched with regex \"" + r + "\"");
                } else {
                    System.out.println("\"" + str + "\" can not be matched with regex \"" + r + "\"");
                }
            }
            System.out.println("");
        }
    }
}
Output:
"" can be matched with regex "a*"
"a" can be matched with regex "a*"
"aa" can be matched with regex "a*"
"aaa" can be matched with regex "a*"
"" can be matched with regex "a?"
"a" can be matched with regex "a?"
"aa" can not be matched with regex "a?"
"aaa" can not be matched with regex "a?"
"" can not be matched with regex "a+"
"a" can be matched with regex "a+"
"aa" can be matched with regex "a+"
"aaa" can be matched with regex "a+"
Interval quantifiers
Example:
public class pa6 {
    public static void main(String[] args) {
        String[] strings = new String[] { "", "a", "aa", "aaa", "aaaa", "aaaaa" };
        String regex = "a{2,4}"; // between two and four occurrences
        String regex2 = "a{2,}"; // at least two occurrences
        String regex3 = "a{3}";  // exactly three occurrences
        for (String r : new String[] { regex, regex2, regex3 }) {
            for (String str : strings) {
                if (str.matches(r)) {
                    System.out.println("\"" + str + "\" can be matched with regex \"" + r + "\"");
                } else {
                    System.out.println("\"" + str + "\" can not be matched with regex \"" + r + "\"");
                }
            }
            System.out.println("");
        }
    }
}
Output:
"" can not be matched with regex "a{2,4}"
"a" can not be matched with regex "a{2,4}"
"aa" can be matched with regex "a{2,4}"
"aaa" can be matched with regex "a{2,4}"
"aaaa" can be matched with regex "a{2,4}"
"aaaaa" can not be matched with regex "a{2,4}"
"" can not be matched with regex "a{2,}"
"a" can not be matched with regex "a{2,}"
"aa" can be matched with regex "a{2,}"
"aaa" can be matched with regex "a{2,}"
"aaaa" can be matched with regex "a{2,}"
"aaaaa" can be matched with regex "a{2,}"
"" can not be matched with regex "a{3}"
"a" can not be matched with regex "a{3}"
"aa" can not be matched with regex "a{3}"
"aaa" can be matched with regex "a{3}"
"aaaa" can not be matched with regex "a{3}"
"aaaaa" can not be matched with regex "a{3}"
Limits of quantifiers and the use of parentheses
Example:
public class pa7 {
    public static void main(String[] args) {
        String[] strings = new String[] { "ac", "acc", "accc", "acac", "acacac" };
        String regex = "ac+";    // the quantifier applies only to "c"
        String regex2 = "(ac)+"; // the parentheses make the quantifier apply to "ac" as a whole
        for (String r : new String[] { regex, regex2 }) {
            for (String str : strings) {
                if (str.matches(r)) {
                    System.out.println("\"" + str + "\" can be matched with regex \"" + r + "\"");
                } else {
                    System.out.println("\"" + str + "\" can not be matched with regex \"" + r + "\"");
                }
            }
            System.out.println("");
        }
    }
}
Output:
"ac" can be matched with regex "ac+"
"acc" can be matched with regex "ac+"
"accc" can be matched with regex "ac+"
"acac" can not be matched with regex "ac+"
"acacac" can not be matched with regex "ac+"
"ac" can be matched with regex "(ac)+"
"acc" can not be matched with regex "(ac)+"
"accc" can not be matched with regex "(ac)+"
"acac" can be matched with regex "(ac)+"
"acacac" can be matched with regex "(ac)+"
Uses of parentheses: alternation
Example:
public class pa8 {
    public static void main(String[] args) {
        String[] strings = new String[] { "this", "that", "thit" };
        // Two independent character classes also accept the combination "it".
        String regex = "th[ia][st]";
        for (String str : strings) {
            if (str.matches(regex)) {
                System.out.println("\"" + str + "\" can be matched with regex \"" + regex + "\"");
            } else {
                System.out.println("\"" + str + "\" can not be matched with regex \"" + regex + "\"");
            }
        }
    }
}
Output:
"this" can be matched with regex "th[ia][st]"
"that" can be matched with regex "th[ia][st]"
"thit" can be matched with regex "th[ia][st]"
But what if we do not want to match the misspelled word thit?
public class pa8 {
    public static void main(String[] args) {
        String[] strings = new String[] { "this", "that", "thit" };
        // This can also be written as th(is|at); factoring out the common prefix can make matching more efficient.
        String regex = "(this|that)";
        for (String str : strings) {
            if (str.matches(regex)) {
                System.out.println("\"" + str + "\" can be matched with regex \"" + regex + "\"");
            } else {
                System.out.println("\"" + str + "\" can not be matched with regex \"" + regex + "\"");
            }
        }
    }
}
Output:
"this" can be matched with regex "(this|that)"
"that" can be matched with regex "(this|that)"
"thit" can not be matched with regex "(this|that)"
Uses of parentheses: capturing groups
Example:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class pa9 {
    public static void main(String[] args) {
        String email = "webmaster@itcast.net";
        String regex = "(\\w+)@([\\w.]+)";
        Pattern p = Pattern.compile(regex);
        Matcher m = p.matcher(email);
        if (m.find()) {
            System.out.println("email add is:\t" + m.group(0)); // group 0 is the text matched by the whole regular expression
            System.out.println("username is:\t" + m.group(1));  // text matched by the first pair of parentheses
            System.out.println("hostname is:\t" + m.group(2));  // text matched by the second pair of parentheses
        }
    }
}
Output:
email add is:   webmaster@itcast.net
username is:    webmaster
hostname is:    itcast.net
It is worth emphasizing that groups are numbered in the order in which their opening parentheses appear, starting from 1. Group 0 is the text matched by the whole regular expression.
Notes on capturing groups
Example:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class pa10 {
    public static void main(String[] args) {
        explainGroupNo();
        System.out.println("");
        explainGroupQuantifier();
    }

    // Nested groups are still numbered by the position of their opening parenthesis.
    public static void explainGroupNo() {
        String email = "webmaster@itcast.net";
        String regex = "((\\w+)@([\\w.]+))";
        Pattern p = Pattern.compile(regex);
        Matcher m = p.matcher(email);
        if (m.find()) {
            System.out.println("match result:\t" + m.group(0));
            System.out.println("group No.1 is:\t" + m.group(1));
            System.out.println("group No.2 is:\t" + m.group(2));
            System.out.println("group No.3 is:\t" + m.group(3));
        }
    }

    // When the quantifier is outside the group, the group keeps only its last repetition.
    public static void explainGroupQuantifier() {
        String email = "webmaster@itcast.net";
        String regex = "(\\w)+@([\\w.])+";
        Pattern p = Pattern.compile(regex);
        Matcher m = p.matcher(email);
        if (m.find()) {
            System.out.println("match result:\t" + m.group(0));
            System.out.println("group No.1 is:\t" + m.group(1));
            System.out.println("group No.2 is:\t" + m.group(2));
        }
    }
}
Output:
match result:   webmaster@itcast.net
group No.1 is:  webmaster@itcast.net
group No.2 is:  webmaster
group No.3 is:  itcast.net
match result:   webmaster@itcast.net
group No.1 is:  r
group No.2 is:  t
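To see both points at once (numbering by opening parenthesis, and a quantified group keeping only its last repetition), it can help to list every group of a match. The sketch below does this with Matcher.groupCount(); the class name GroupListDemo and the helper groups are hypothetical names for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupListDemo {
    // Returns group 0..n of the first match, or an empty list when nothing matches.
    public static List<String> groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> result = new ArrayList<>();
        if (m.find()) {
            for (int i = 0; i <= m.groupCount(); i++) {
                result.add(m.group(i));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Group 1 wraps the repetition; group 2 is the repeated single character.
        List<String> found = groups("((\\w)+)@([\\w.]+)", "webmaster@itcast.net");
        for (int i = 0; i < found.size(); i++) {
            System.out.println("group " + i + ": " + found.get(i));
        }
        // Prints:
        // group 0: webmaster@itcast.net
        // group 1: webmaster
        // group 2: r
        // group 3: itcast.net
    }
}
```

Wrapping the quantified group in an outer pair of parentheses, as group 1 does here, is one way to keep the full repeated text.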
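Non-capturing parentheses
Example:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class pa11 {
    public static void main(String[] args) {
        String email = "webmaster@itcast.net";
        // (?:...) groups the alternatives without creating a numbered group.
        String regex = "(?:webmaster|admin)@([\\w.]+)";
        Pattern p = Pattern.compile(regex);
        Matcher m = p.matcher(email);
        if (m.find()) {
            System.out.println("match result:\t" + m.group(0));
            System.out.println("group No.1 is:\t" + m.group(1));
        }
    }
}
Output:
match result:   webmaster@itcast.net
group No.1 is:  itcast.net
Ordinary parentheses always create a capturing group and save what they match, whereas (?:…) does not save anything. That is why group 1 here holds the content of the second pair of parentheses: the first pair is non-capturing, so it gets no number. Group 0 is still the text matched by the whole regular expression.
Uses of parentheses: backreferences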
Example: checking that an HTML opening tag and closing tag match
public class pa12 {
    public static void main(String[] args) {
        String[] strings = new String[] { "<h1>good</h1>", "<h1>bad</h2>" };
        // \1 must repeat exactly the text captured by the first group (the tag name).
        String regex = "<(\\w+)>[^<]+</\\1>";
        for (String str : strings) {
            if (str.matches(regex)) {
                System.out.println("\"" + str + "\" can be matched with regex \"" + regex + "\"");
            } else {
                System.out.println("\"" + str + "\" can not be matched with regex \"" + regex + "\"");
            }
        }
    }
}
Output:
"<h1>good</h1>" can be matched with regex "<(\w+)>[^<]+</\1>"
"<h1>bad</h2>" can not be matched with regex "<(\w+)>[^<]+</\1>"
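The same idea applies whenever the closing delimiter has to equal the opening one. As a hedged sketch, the example below accepts a string quoted with either single or double quotes, provided both ends use the same quote. It does not handle quotes nested inside the string, and the class name BackrefDemo and the method isQuoted are made up.

```java
class BackrefDemo {
    // Group 1 captures the opening quote; \1 requires the same quote at the end.
    public static boolean isQuoted(String s) {
        return s.matches("([\"'])[^\"']*\\1");
    }

    public static void main(String[] args) {
        System.out.println(isQuoted("'abc'"));    // true
        System.out.println(isQuoted("\"abc\""));  // true
        System.out.println(isQuoted("'abc\""));   // false: the quotes differ
        System.out.println(isQuoted("abc"));      // false
    }
}
```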
Example: removing a repeated word
public class pa13 {
    public static void main(String[] args) {
        String dupWords = "word word";
        String dupWordRegex = "(\\w+)\\s+(\\1)";
        System.out.println("Before replace:\t" + dupWords);
        // In the replacement string, $1 refers to the text captured by group 1.
        System.out.println("After replace:\t" + dupWords.replaceAll(dupWordRegex, "$1"));
    }
}
Output:
Before replace: word word
After replace:  word
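One caveat with the pattern above is that it can also fire inside words: in "this is", the "is" at the end of "this" is followed by " is". A possible refinement, sketched below as an assumption rather than the author's method, adds the word-boundary anchor \b (covered in the Anchors section) and lets the repetition occur more than once. The class name DedupDemo and the method dedupe are hypothetical.

```java
class DedupDemo {
    // A whole word followed by one or more repetitions of that same whole word.
    public static String dedupe(String text) {
        return text.replaceAll("\\b(\\w+)(\\s+\\1\\b)+", "$1");
    }

    public static void main(String[] args) {
        System.out.println(dedupe("the the the cat")); // the cat
        System.out.println(dedupe("this is fine"));    // this is fine

        // The pattern without word boundaries changes text it should leave alone.
        System.out.println("this is fine".replaceAll("(\\w+)\\s+(\\1)", "$1")); // this fine
    }
}
```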
By this point readers should have a reasonable grasp of the basics. Take a look at this article:
Anchors
Example:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class pa15 {
    public static void main(String[] args) {
        String[] strings = new String[] {
            "This sentence contain word cat",
            "This sentence contain word \"cat\"",
            "This sentence contain word vacation",
            "This sentence contain word \"cate\"" };
        // \b matches a word boundary, so "cat" must stand alone as a word.
        String regex = "\\bcat\\b";
        for (String str : strings) {
            System.out.println("Checking sentence:\t" + str);
            Pattern p = Pattern.compile(regex);
            Matcher m = p.matcher(str);
            if (m.find()) {
                System.out.println("Found word \"cat\"!");
            } else {
                System.out.println("Can not found word \"cat\"!");
            }
            System.out.println("");
        }
    }
}
Output:
Checking sentence:      This sentence contain word cat
Found word "cat"!
Checking sentence:      This sentence contain word "cat"
Found word "cat"!
Checking sentence:      This sentence contain word vacation
Can not found word "cat"!
Checking sentence:      This sentence contain word "cate"
Can not found word "cat"!
Notes:
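One point to keep in mind about \b: it marks a position between a word character and a non-word character, and the underscore counts as a word character while the hyphen does not. The sketch below illustrates this with ASCII input only; the class name BoundaryDemo and the method containsWordCat are made up for the example.

```java
import java.util.regex.Pattern;

class BoundaryDemo {
    private static final Pattern CAT = Pattern.compile("\\bcat\\b");

    // True when "cat" occurs with a word boundary on both sides.
    public static boolean containsWordCat(String text) {
        return CAT.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(containsWordCat("cat-food")); // true: "-" is not a word character
        System.out.println(containsWordCat("cat_food")); // false: "_" is a word character
        System.out.println(containsWordCat("vacation")); // false
    }
}
```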
Anchors, part two
Example:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class pa16 {
    public static void main(String[] args) {
        String[] strings = new String[] { "start ", " start ", " end ", " end" };
        // ^ and \A anchor the match at the start; $ and \Z anchor it at the end.
        String[] regexes = new String[] { "^start", "\\Astart", "end$", "end\\Z" };
        for (String str : strings) {
            for (String regex : regexes) {
                Pattern p = Pattern.compile(regex);
                Matcher m = p.matcher(str);
                if (m.find()) {
                    System.out.println("\"" + str + "\" can be matched with regex \"" + regex + "\"");
                } else {
                    System.out.println("\"" + str + "\" can not be matched with regex \"" + regex + "\"");
                }
            }
            System.out.println("");
        }
    }
}
Output:
"start " can be matched with regex "^start"
"start " can be matched with regex "\Astart"
"start " can not be matched with regex "end$"
"start " can not be matched with regex "end\Z"
" start " can not be matched with regex "^start"
" start " can not be matched with regex "\Astart"
" start " can not be matched with regex "end$"
" start " can not be matched with regex "end\Z"
" end " can not be matched with regex "^start"
" end " can not be matched with regex "\Astart"
" end " can not be matched with regex "end$"
" end " can not be matched with regex "end\Z"
" end" can not be matched with regex "^start"
" end" can not be matched with regex "\Astart"
" end" can be matched with regex "end$"
" end" can be matched with regex "end\Z"
(Editor: 李大同)