Java anchored regex

I just discovered this today when doing some regex in Java. When I first started doing regex in Java, I was surprised to learn that String.matches() treats every regular expression as anchored: the pattern has to match the entire string, not just part of it. That is, if you have the string foobar and call matches("foo"), it will not match. This is different from grep, perl, and other tools, which look for a match anywhere in the input. In other words, for matches() the following regexes are equivalent:

"foo"
"^foo$"

If you want to find foo within foobar using matches(), you need to use

".*foo.*"

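The implicit anchoring belongs to matches(), not to the regex engine as a whole. As far as I can tell, Matcher.find() in java.util.regex searches for a match anywhere in the input, the way grep and perl do. Here is a minimal sketch comparing the two; the class and helper names (FindVsMatches, matchesWhole, containsMatch) are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class FindVsMatches {
    // True only if the regex matches the entire input (same rule as String.matches).
    static boolean matchesWhole(String input, String regex) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.matches();
    }

    // True if the regex matches some substring of the input.
    static boolean containsMatch(String input, String regex) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find();
    }

    public static void main(String[] args) {
        System.out.println(matchesWhole("foobar", "foo"));   // false
        System.out.println(containsMatch("foobar", "foo"));  // true
        System.out.println(containsMatch("foobar", "^bar")); // false
        System.out.println(containsMatch("foobar", "bar$")); // true
    }
}
```

With find(), explicit anchors behave the way they do in other tools: ^bar fails because bar is not at the start of foobar, while bar$ succeeds.
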
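I discovered one more interesting tidbit. If you put explicit anchors in, the parts of the regex before ^ and after $ have no effect, as long as they are able to match the empty string. A part that has to consume at least one character (such as .+) makes the whole match fail.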

Here are some examples:

// some tests which illustrate implicit anchoring
"foobar".matches("foo"); //false - rewrite = "^foo$"
"foobar".matches("bar"); //false - rewrite = "^bar$"
"foobar".matches("foo.*"); //true - rewrite = "^foo.*$"
"foobar".matches("bar.*"); //false - rewrite = "^bar.*$"
"foobar".matches(".*foo.*"); //true - rewrite = "^.*foo.*$"
"foobar".matches(".*bar.*"); //true - rewrite = "^.*bar.*$"
"foobar".matches(".*oo.*"); //true - rewrite = "^.*oo.*$"
// now some tests with optional parts before or after explicit anchors
// a part before ^ or after $ can only match the empty string, so if it is optional it has no effect
"foobar".matches(".*^foo"); //false - rewrite = "^foo$"
"foobar".matches(".*^foo.*"); //true - rewrite = "^foo.*$"
"foobar".matches(".*^foo$.*"); //false - rewrite = "^foo$"
"foobar".matches(".*^foobar$.*"); //true - rewrite = "^foobar$"
"foobar".matches("[a-z]*^foobar$.*"); //true - rewrite = "^foobar$"
"foobar".matches(".+^foobar$.*"); //false - .+ can't match a character before the beginning of the string

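One way to see what the part in front of an explicit ^ actually does is to wrap it in a capturing group and look at what it matched. This is only a sketch: the class and method names (AnchorCapture, leadingCapture) are hypothetical, and it assumes the default flags (no MULTILINE), so ^ only matches at the very start of the input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class AnchorCapture {
    // Returns what group 1 (the part before ^) captured,
    // or null when the whole-string match fails.
    static String leadingCapture(String input, String regex) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // [a-z]* backtracks down to the empty string so that ^ can match at position 0.
        System.out.println("[" + leadingCapture("foobar", "([a-z]*)^foobar$.*") + "]"); // []
        // .+ must consume at least one character, so ^ can never match after it.
        System.out.println(leadingCapture("foobar", "(.+)^foobar$.*")); // null
    }
}
```

The first call prints an empty pair of brackets: the leading [a-z]* matched, but only the empty string. The second prints null because the match fails outright.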