1 year ago

#274690

Rudy

Change in regex canonical equivalence matching between Java 8 and 9

I have a large regex that matches Excel-like coordinates (cell references) in text and should ignore ranges. I noticed a change in behavior after updating the Java version. I have simplified the regex and the code.

Here is the code:

import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Main
{
    public static void main(String[] args) {
        // A cell reference such as A1 or $A$1, not preceded by a word character, '$' or ':'
        // and not followed by a word character, '(' or ':' (so both ends of a range are skipped).
        String regex = "((?<![\\w$:])\\$?[A-Z]{1,3}\\$?[1-9][0-9]{0,3}(?![\\w(:]))";
        String input = "=A1:B2";
        // Same pattern and input, compiled with and without canonical equivalence.
        Pattern withCE = Pattern.compile(regex, Pattern.CANON_EQ);
        Matcher cellReferenceswithCE = withCE.matcher(input);

        Pattern withoutCE = Pattern.compile(regex);
        Matcher cellReferenceswithoutCE = withoutCE.matcher(input);

        System.out.println("Java version : " + System.getProperty("java.version"));
        System.out.println("regex        : " + regex);
        System.out.println("String input : " + input);
        // Print whether a match exists; reset() rewinds the matcher so the first match can be shown.
        System.out.println("w/ canon eq  : " + cellReferenceswithCE.find() + "" + (cellReferenceswithCE.reset().find()?" => "+cellReferenceswithCE.group():""));
        System.out.println("w/o canon eq : " + cellReferenceswithoutCE.find() + "" + (cellReferenceswithoutCE.reset().find()?" => "+cellReferenceswithoutCE.group():""));
    }
}

And here is the result with different Java versions:

Java version : 1.8.0_302
regex        : ((?<![\w$:])\$?[A-Z]{1,3}\$?[1-9][0-9]{0,3}(?![\w(:]))
String input : =A1:B2
w/ canon eq  : false
w/o canon eq : false


Java version : 9.0.1
regex        : ((?<![\w$:])\$?[A-Z]{1,3}\$?[1-9][0-9]{0,3}(?![\w(:]))
String input : =A1:B2
w/ canon eq  : true => B2
w/o canon eq : false

All versions prior to 1.8.0_302 behave the same as 1.8.0_302.
All versions after 9.0.1 behave the same as 9.0.1.
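
To see whether the difference is limited to this one input, a small probe along the following lines could be run on each JVM. It is only a sketch: the class name CanonEqProbe, the helper findAll and every sample input except "=A1:B2" are made up for illustration. It prints the matches found with and without Pattern.CANON_EQ and marks the inputs where the two disagree.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CanonEqProbe {
    // Same simplified pattern as in the code above.
    static final String REGEX =
        "((?<![\\w$:])\\$?[A-Z]{1,3}\\$?[1-9][0-9]{0,3}(?![\\w(:]))";

    // Collects every match of REGEX in the input, compiled with the given flags.
    static List<String> findAll(int flags, String input) {
        Matcher m = Pattern.compile(REGEX, flags).matcher(input);
        List<String> found = new ArrayList<>();
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // Hypothetical sample inputs; only "=A1:B2" comes from the original test.
        String[] inputs = { "=A1:B2", "=A1+B2", "=SUM(A1)", "=$A$1" };
        System.out.println("Java version : " + System.getProperty("java.version"));
        for (String input : inputs) {
            List<String> plain = findAll(0, input);
            List<String> canon = findAll(Pattern.CANON_EQ, input);
            System.out.println(input + " | plain=" + plain + " | CANON_EQ=" + canon
                + (plain.equals(canon) ? "" : "  <-- differs"));
        }
    }
}
```

Without the flag, the pattern should report no match for "=A1:B2" and should find A1 and B2 in "=A1+B2"; the CANON_EQ column is the part that may vary by Java version.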


What is the correct way to recover the behavior I had in Java 8? Should I update the regex, or should I remove the canonical equivalence flag?
Which version has the expected behavior?
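
For reference, one option I am considering is sketched below. It drops Pattern.CANON_EQ, since the output without the flag is the same on both versions, and normalizes the input to NFC with java.text.Normalizer before matching. This rests on an assumption I have not confirmed: that the flag adds nothing for a pattern made only of ASCII literals and classes, and that normalizing the input is close enough to canonical equivalence for my data. The class name CellReferenceFinder and the method findCells are hypothetical.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CellReferenceFinder {
    // Same simplified pattern, compiled once and without CANON_EQ.
    static final Pattern CELL = Pattern.compile(
        "((?<![\\w$:])\\$?[A-Z]{1,3}\\$?[1-9][0-9]{0,3}(?![\\w(:]))");

    // Returns the cell references found in the text, skipping both ends of a range.
    static List<String> findCells(String text) {
        // Assumption: normalizing to NFC stands in for what CANON_EQ was meant to cover.
        String normalized = Normalizer.normalize(text, Normalizer.Form.NFC);
        Matcher m = CELL.matcher(normalized);
        List<String> cells = new ArrayList<>();
        while (m.find()) {
            cells.add(m.group());
        }
        return cells;
    }

    public static void main(String[] args) {
        System.out.println(findCells("=A1:B2"));    // range only
        System.out.println(findCells("=A1+B2:C3")); // single cell plus a range
    }
}
```

Note that the returned matches come from the normalized text, so their offsets may not line up with the original string if normalization changed it.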

java

regex

java-8

java-9
