Java anchored regex

I ran into this again today while doing some regex work in Java. When I first started using regex in Java, I was surprised to learn that String.matches() treats every regular expression as anchored: the pattern has to match the whole string, not just part of it. That is, if you have the string foobar and call matches() with “foo”, it will not match. This is different from grep, Perl, and other tools, which look for a match anywhere in the input. In other words, for matches(), the following regexes are equivalent:

"foo"  
"^foo$"

If you want matches() to find foo within foobar, you need to use

".*foo.*"
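
Padding the pattern with .* is not the only option. The whole-string rule belongs to matches(); the standard java.util.regex.Matcher class also has find(), which searches for a match anywhere in the input the way grep does. Here is a minimal sketch of the difference. The class name FindVsMatches and the two helper methods are my own names for illustration, not part of any library:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; only Pattern and Matcher come from the JDK.
class FindVsMatches {
    // True only if the regex matches the entire input (the same rule String.matches uses).
    static boolean matchesWhole(String input, String regex) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.matches();
    }

    // True if the regex matches anywhere in the input, like grep.
    static boolean containsMatch(String input, String regex) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find();
    }

    public static void main(String[] args) {
        System.out.println(matchesWhole("foobar", "foo"));   // false
        System.out.println(containsMatch("foobar", "foo"));  // true
        System.out.println(containsMatch("foobar", "^bar")); // false
        System.out.println(containsMatch("foobar", "bar$")); // true
    }
}
```

With find(), explicit anchors still do their usual job, which is why "^bar" fails while "bar$" succeeds.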

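I discovered one more interesting tidbit. If you put explicit anchors in, optional parts of the regex before ^ or after $ have no effect: nothing can come before the start of the string or after its end, so those parts can only match the empty string.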

Here are some examples:

// some tests which illustrate implicit anchoring  
"foobar".matches("foo"); //false - rewrite = "^foo$"  
"foobar".matches("bar"); //false - rewrite = "^bar$"  
"foobar".matches("foo.*"); //true - rewrite = "^foo.*$"  
"foobar".matches("bar.*"); //false - rewrite = "^bar.*$"  
"foobar".matches(".*foo.*"); //true - rewrite = "^.*foo.*$"  
"foobar".matches(".*bar.*"); //true - rewrite = "^.*bar.*$"  
"foobar".matches(".*oo.*"); //true - rewrite = "^.*oo.*$"  
// now some tests with optional patterns before or after explicit anchors  
// patterns that can match the empty string have no effect before/after the initial/final anchors  
"foobar".matches(".*^foo"); //false - rewrite = "^foo$"  
"foobar".matches(".*^foo.*"); //true - rewrite = "^foo.*$"  
"foobar".matches(".*^foo$.*"); //false - rewrite = "^foo$"  
"foobar".matches(".*^foobar$.*"); //true - rewrite = "^foobar$"  
"foobar".matches("[a-z]*^foobar$.*"); //true - rewrite = "^foobar$"  
"foobar".matches(".+^foobar$.*"); //false - .+ must consume at least one character, and nothing can come before the beginning of the string
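
To check what the parts outside the anchors actually do, one option is to wrap them in capture groups and look at what they captured. This is only a sketch: the class AnchorGroups and its helper surroundingGroups are hypothetical names, and the helper assumes the prefix and suffix patterns contain no capture groups of their own.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; only Pattern and Matcher come from the JDK.
class AnchorGroups {
    // Builds "(prefix)^foobar$(suffix)" and returns what the two groups captured,
    // or null if the whole input does not match.
    static String[] surroundingGroups(String input, String prefix, String suffix) {
        Pattern p = Pattern.compile("(" + prefix + ")^foobar$(" + suffix + ")");
        Matcher m = p.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] g = surroundingGroups("foobar", ".*", ".*");
        // Both groups matched, but each captured the empty string.
        System.out.println(g[0].isEmpty() && g[1].isEmpty());            // true
        // .+ cannot match the empty string, so the whole match fails.
        System.out.println(surroundingGroups("foobar", ".+", ".*") == null); // true
    }
}
```

So the extra parts are not skipped by the engine. They take part in the match, but the only thing they can match next to the anchors is the empty string.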