Matcher.find()

Published: 2023-05-23

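In Java's java.util.regex package, Matcher.find() scans the input string for the next subsequence that matches the pattern. It returns true when a match is found. The Matcher itself then holds the position and text of that match, which you read through start(), end(), and group().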
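As a minimal sketch of this behavior, the following shows that find() returns only a boolean and that the match details are read from the Matcher afterwards. The class name FindReturnsBoolean and the helper firstMatch are hypothetical names chosen for illustration; they are not part of java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindReturnsBoolean {
    // Hypothetical helper: returns the first match of regex in input,
    // or null when find() reports that there is no match.
    static String firstMatch(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        // find() only reports success; the match text is read from the Matcher
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstMatch("\\d+", "order 42 shipped")); // 42
        System.out.println(firstMatch("\\d+", "no digits here"));   // null

        Matcher fresh = Pattern.compile("\\d+").matcher("order 42");
        try {
            fresh.group(); // no find() has been attempted yet
        } catch (IllegalStateException e) {
            System.out.println("group() before find() failed: " + e.getMessage());
        }
    }
}
```

Calling group() before a successful find() throws IllegalStateException, which is why the return value of find() is normally checked first.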

1. Searching for a match

Matcher.find() lets you search a string for the text you are looking for. Suppose you have the following string and want to find the word "hello" in it:

String text = "hello world! This is a test.";
Pattern pattern = Pattern.compile("\\bhello\\b");
Matcher matcher = pattern.matcher(text);

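In this example we first define a string, then compile a regular expression that uses \\b to match word boundaries. Finally, we create a Matcher object to run the search. We can now call Matcher.find() to locate the matches and print them: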

while (matcher.find()) {
   System.out.println("Match found at index " + matcher.start() + " to " + matcher.end());
   System.out.println("Match: " + matcher.group());
}

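A while loop is used because the same string may contain more than one match; each call to find() resumes the search after the previous match. The output is: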

Match found at index 0 to 5
Match: hello

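In this example there is only one match. matcher.start() returns the index of the first matched character, and matcher.end() returns the index just past the last matched character, so the match here covers indices 0 through 4. matcher.group() returns the matched text itself.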
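The relationship between start(), end(), and group() can be sketched as follows. This is an illustrative example, not part of the original walkthrough; the class name FindSpans and the helper findSpans are hypothetical. It relies on the fact that end() is exclusive, so input.substring(start, end) yields the same text as group().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindSpans {
    // Hypothetical helper: returns "start-end:text" for every match of regex in input.
    static List<String> findSpans(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        List<String> spans = new ArrayList<>();
        while (matcher.find()) {
            // end() is exclusive, so this substring equals matcher.group()
            String matched = input.substring(matcher.start(), matcher.end());
            spans.add(matcher.start() + "-" + matcher.end() + ":" + matched);
        }
        return spans;
    }

    public static void main(String[] args) {
        // prints [0-5:hello, 13-18:hello]
        System.out.println(findSpans("\\bhello\\b", "hello world, hello again"));
    }
}
```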

2. Regular expression groups

Parenthesized groups allow more advanced use of Matcher.find(), such as extracting every matched word into a list. For example:

String text = "hello world, how are you?";
Pattern pattern = Pattern.compile("(\\w+)");
Matcher matcher = pattern.matcher(text);
List<String> matches = new ArrayList<String>();
while (matcher.find()) {
    matches.add(matcher.group(1));
}
System.out.println(matches);

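Here the regular expression is "(\\w+)". The \\w+ inside the parentheses matches one or more word characters, and the parentheses make that text capture group 1. Because the group spans the whole pattern, matcher.group(1) returns the same text as matcher.group(). The loop calls Matcher.find() repeatedly to find each match and adds the matched string to the list. The program prints [hello, world, how, are, you].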
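Groups become more useful when a pattern has more than one of them. The sketch below assumes a made-up input format of key=value pairs; the class name KeyValueGroups and the helper parsePairs are hypothetical and exist only to illustrate group(1) and group(2).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class KeyValueGroups {
    // Hypothetical helper: extracts key=value pairs from free text.
    static Map<String, String> parsePairs(String input) {
        Pattern pattern = Pattern.compile("(\\w+)=(\\w+)");
        Matcher matcher = pattern.matcher(input);
        Map<String, String> pairs = new LinkedHashMap<>(); // keeps insertion order
        while (matcher.find()) {
            // group(0) is the whole match, e.g. "port=8080";
            // group(1) is the key and group(2) is the value
            pairs.put(matcher.group(1), matcher.group(2));
        }
        return pairs;
    }

    public static void main(String[] args) {
        // prints {host=example, port=8080, mode=debug}
        System.out.println(parsePairs("host=example port=8080, mode=debug"));
    }
}
```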

3. Case sensitivity

By default, patterns are case-sensitive. Searching for "hello" therefore matches only "hello", not "Hello" or "HELLO". To run a case-insensitive search, add the "(?i)" flag to the regular expression. For example:

String text = "HELLO world! This is a test.";
Pattern pattern = Pattern.compile("(?i)\\bhello\\b");
Matcher matcher = pattern.matcher(text);
while (matcher.find()) {
    System.out.println("Match found at index " + matcher.start() + " to " + matcher.end());
    System.out.println("Match: " + matcher.group());
}

Here the "(?i)" flag at the start of the pattern turns on case-insensitive matching. The output is:

Match found at index 0 to 5
Match: HELLO
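The same effect can be obtained with the Pattern.CASE_INSENSITIVE constant instead of the inline flag. The following sketch compares the three variants; the class name CaseInsensitiveCount and the helper countMatches are hypothetical names used only for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CaseInsensitiveCount {
    // Hypothetical helper: counts how many times find() succeeds.
    static int countMatches(Pattern pattern, String input) {
        Matcher matcher = pattern.matcher(input);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "Hello, HELLO, hello";
        Pattern sensitive = Pattern.compile("\\bhello\\b");
        Pattern inlineFlag = Pattern.compile("(?i)\\bhello\\b");
        Pattern constantFlag = Pattern.compile("\\bhello\\b", Pattern.CASE_INSENSITIVE);

        System.out.println(countMatches(sensitive, text));    // 1
        System.out.println(countMatches(inlineFlag, text));   // 3
        System.out.println(countMatches(constantFlag, text)); // 3
    }
}
```

By default, case-insensitive matching only considers US-ASCII characters; combining it with Pattern.UNICODE_CASE extends it to other scripts.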

4. Multiline mode

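Sometimes you need the anchors ^ and $ to match at every line of a text rather than only at its two ends. Matcher.find() always searches the whole input, but by default ^ and $ match only at the beginning and end of the entire input. Multiline mode, enabled with the "(?m)" flag or the Pattern.MULTILINE constant, makes them also match just after and just before each line terminator. For example: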

String text = "hello world\nhow are you\ntoday?";
Pattern pattern = Pattern.compile("^h", Pattern.MULTILINE);
Matcher matcher = pattern.matcher(text);
while (matcher.find()) {
    System.out.println("Match found at index " + matcher.start() + " to " + matcher.end());
    System.out.println("Match: " + matcher.group());
}

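Here the regular expression is "^h", which matches an h at the start of a line. We also pass Pattern.MULTILINE to enable multiline mode. The first two lines start with h and the third does not, so the output is:

Match found at index 0 to 1
Match: h
Match found at index 12 to 13
Match: h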
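To make the effect of the flag visible, the sketch below runs the same "^h" pattern with and without multiline mode and records where each match starts. The class name MultilineAnchors and the helper matchStarts are hypothetical names introduced for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MultilineAnchors {
    // Hypothetical helper: collects the start index of every match.
    static List<Integer> matchStarts(Pattern pattern, String input) {
        Matcher matcher = pattern.matcher(input);
        List<Integer> starts = new ArrayList<>();
        while (matcher.find()) {
            starts.add(matcher.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        String text = "hello world\nhow are you\ntoday?";
        // Without MULTILINE, ^ matches only at the start of the input
        System.out.println(matchStarts(Pattern.compile("^h"), text)); // [0]
        // With MULTILINE, ^ also matches after each line terminator
        System.out.println(matchStarts(Pattern.compile("^h", Pattern.MULTILINE), text)); // [0, 12]
        // The inline flag (?m) is equivalent to Pattern.MULTILINE
        System.out.println(matchStarts(Pattern.compile("(?m)^h"), text)); // [0, 12]
    }
}
```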

5. Greedy and lazy matching

Quantifiers such as + are greedy by default, which means they match as many characters as possible. For example, if you search the string "aaab" for "a+", Matcher.find() consumes the entire run of a characters in a single match and never matches the b:

String text = "aaab";
Pattern pattern = Pattern.compile("a+");
Matcher matcher = pattern.matcher(text);
while (matcher.find()) {
    System.out.println("Match found at index " + matcher.start() + " to " + matcher.end());
    System.out.println("Match: " + matcher.group());
}

The output is:

Match found at index 0 to 3
Match: aaa

Appending "?" to a quantifier in the pattern makes it lazy, which means it matches as few characters as possible. For example:

String text = "aaab";
Pattern pattern = Pattern.compile("a+?");
Matcher matcher = pattern.matcher(text);
while (matcher.find()) {
    System.out.println("Match found at index " + matcher.start() + " to " + matcher.end());
    System.out.println("Match: " + matcher.group());
}

Each call to find() now stops after a single a, and the b is still never matched. The output is:

Match found at index 0 to 1
Match: a
Match found at index 1 to 2
Match: a
Match found at index 2 to 3
Match: a
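The difference between greedy and lazy quantifiers is easier to see when something follows the quantifier in the pattern. The sketch below uses a made-up snippet of markup as input; the class name GreedyVsLazy and the helper allMatches are hypothetical and serve only as an illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsLazy {
    // Hypothetical helper: collects the text of every match.
    static List<String> allMatches(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        List<String> matches = new ArrayList<>();
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String markup = "<b>one</b> and <b>two</b>";
        // Greedy .* runs to the last </b>, so there is a single long match
        System.out.println(allMatches("<b>.*</b>", markup));  // [<b>one</b> and <b>two</b>]
        // Lazy .*? stops at the first </b>, so each element is matched separately
        System.out.println(allMatches("<b>.*?</b>", markup)); // [<b>one</b>, <b>two</b>]
    }
}
```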

Summary

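Matcher.find() is a useful tool for searching for and extracting patterns in strings. Capture groups, case-insensitive matching, and multiline mode let you control what matches in different situations, while greedy and lazy quantifiers give you finer control over how much text each match consumes.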

Complete code example:

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class MatcherExample {
    public static void main(String[] args) {
        // Search for a single word
        String text = "hello world! This is a test.";
        Pattern pattern = Pattern.compile("\\bhello\\b");
        Matcher matcher = pattern.matcher(text);
        System.out.println("Searching for: " + pattern.pattern());
        while (matcher.find()) {
            System.out.println("Match found at index " + matcher.start() + " to " + matcher.end());
            System.out.println("Match: " + matcher.group());
        }
        // Using groups
        String text2 = "hello world, how are you?";
        Pattern pattern2 = Pattern.compile("(\\w+)");
        Matcher matcher2 = pattern2.matcher(text2);
        List<String> matches = new ArrayList<String>();
        System.out.println("\nSearching for: " + pattern2.pattern());
        while (matcher2.find()) {
            matches.add(matcher2.group(1));
        }
        System.out.println("Matches: " + matches);
        // Case insensitive search
        String text3 = "HELLO world! This is a test.";
        Pattern pattern3 = Pattern.compile("(?i)\\bhello\\b");
        Matcher matcher3 = pattern3.matcher(text3);
        System.out.println("\nSearching for: " + pattern3.pattern());
        while (matcher3.find()) {
            System.out.println("Match found at index " + matcher3.start() + " to " + matcher3.end());
            System.out.println("Match: " + matcher3.group());
        }
        // Multi-line search
        String text4 = "hello world\nhow are you\ntoday?";
        Pattern pattern4 = Pattern.compile("^h", Pattern.MULTILINE);
        Matcher matcher4 = pattern4.matcher(text4);
        System.out.println("\nSearching for: " + pattern4.pattern());
        while (matcher4.find()) {
            System.out.println("Match found at index " + matcher4.start() + " to " + matcher4.end());
            System.out.println("Match: " + matcher4.group());
        }
        // Greedy and lazy matching
        String text5 = "aaab";
        // Greedy match
        Pattern pattern5a = Pattern.compile("a+");
        Matcher matcher5a = pattern5a.matcher(text5);
        System.out.println("\nSearching for: " + pattern5a.pattern());
        while (matcher5a.find()) {
            System.out.println("Match found at index " + matcher5a.start() + " to " + matcher5a.end());
            System.out.println("Match: " + matcher5a.group());
        }
        // Lazy match
        Pattern pattern5b = Pattern.compile("a+?");
        Matcher matcher5b = pattern5b.matcher(text5);
        System.out.println("\nSearching for: " + pattern5b.pattern());
        while (matcher5b.find()) {
            System.out.println("Match found at index " + matcher5b.start() + " to " + matcher5b.end());
            System.out.println("Match: " + matcher5b.group());
        }
    }
}