1 year ago
Change in regex canonical equivalence matching between Java 8 and 9
I have a large regex that matches Excel-like cell coordinates in text and should ignore ranges. I noticed a change in behavior after upgrading the Java version. Below are a simplified regex and code that reproduce it.
Here is the code:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        // Matches a single cell reference (e.g. A1, $B$2) that is not part of a range.
        String regex = "((?<![\\w$:])\\$?[A-Z]{1,3}\\$?[1-9][0-9]{0,3}(?![\\w(:]))";
        String input = "=A1:B2";

        // Same regex compiled with and without canonical equivalence.
        Pattern withCE = Pattern.compile(regex, Pattern.CANON_EQ);
        Matcher cellReferencesWithCE = withCE.matcher(input);
        Pattern withoutCE = Pattern.compile(regex);
        Matcher cellReferencesWithoutCE = withoutCE.matcher(input);

        System.out.println("Java version : " + System.getProperty("java.version"));
        System.out.println("regex : " + regex);
        System.out.println("String input : " + input);
        // Print whether a match is found and, if so, the first matched text.
        System.out.println("w/ canon eq : " + cellReferencesWithCE.find() + "" + (cellReferencesWithCE.reset().find() ? " => " + cellReferencesWithCE.group() : ""));
        System.out.println("w/o canon eq : " + cellReferencesWithoutCE.find() + "" + (cellReferencesWithoutCE.reset().find() ? " => " + cellReferencesWithoutCE.group() : ""));
    }
}
And here are the results with different Java versions:
Java version : 1.8.0_302
regex : ((?<![\w$:])\$?[A-Z]{1,3}\$?[1-9][0-9]{0,3}(?![\w(:]))
String input : =A1:B2
w/ canon eq : false
w/o canon eq : false
Java version : 9.0.1
regex : ((?<![\w$:])\$?[A-Z]{1,3}\$?[1-9][0-9]{0,3}(?![\w(:]))
String input : =A1:B2
w/ canon eq : true => B2
w/o canon eq : false
All versions prior to 1.8.0_302 behave the same as 1.8.0_302.
All versions after 9.0.1 behave the same as 9.0.1.
What is the correct way to recover the behavior I had in Java 8? Should I update the regex, or should I remove the canonical equivalence flag?
Which version has the expected behavior?
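For reference, one possible workaround is sketched below. It is not a confirmed fix. It assumes that dropping Pattern.CANON_EQ and normalizing the input to NFC beforehand is an acceptable substitute, which only approximates canonical-equivalence matching and is mainly reasonable here because the regex itself contains only ASCII characters. The class name CellRefs and the method name find are hypothetical, chosen for illustration.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CellRefs {
    // Same regex as in the question, compiled WITHOUT Pattern.CANON_EQ.
    private static final Pattern CELL = Pattern.compile(
        "((?<![\\w$:])\\$?[A-Z]{1,3}\\$?[1-9][0-9]{0,3}(?![\\w(:]))");

    // Hypothetical helper: returns every cell reference that is not part of a range.
    static List<String> find(String input) {
        // Assumption: normalizing the input to NFC stands in for CANON_EQ.
        String normalized = Normalizer.normalize(input, Normalizer.Form.NFC);
        Matcher m = CELL.matcher(normalized);
        List<String> refs = new ArrayList<>();
        while (m.find()) {
            refs.add(m.group());
        }
        return refs;
    }

    public static void main(String[] args) {
        System.out.println(find("=A1:B2")); // range, so nothing is reported: []
        System.out.println(find("=A1+B2")); // two standalone cells: [A1, B2]
    }
}
```

Because the pattern is compiled without the flag, the lookbehind and lookahead take the same code path that printed "false" for "=A1:B2" on both Java versions above.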