Make Pattern instance variables and avoid re calculating each time. #3656
2d94c47 to 7bc206f
Initial/Pre-Review Thoughts
It is a good cleanup, thanks.
Questions I have:
Potential risks:
What could make the full review difficult:
…ernalg/liquibase into arturobernalg-feature/pattern_constant
@@ -15,6 +15,12 @@
import java.util.regex.Pattern;

public class ChangelogRewriter {

    public static final String XSD_PATTERN_STRING = "([dbchangelog|liquibase-pro])-3.[0-9]?[0-9]?.xsd";
I was thinking we could append a REGX/REGEX suffix to these regex constants. Also, I would slightly change the names of some of the patterns, for example XSD_PATTERN to XSD_NAME_PATTERN. What do you think?
    public static final String XSD_PATTERN_STRING = "([dbchangelog|liquibase-pro])-3.[0-9]?[0-9]?.xsd";
    public static final Pattern XSD_PATTERN = Pattern.compile(XSD_PATTERN_STRING);
    private static final String PATTERN_STRING = "(?ms).*<databaseChangeLog[^>]*>";
    private static final Pattern PATTERN = Pattern.compile(PATTERN_STRING);
Also, maybe rename this to something a bit more representative. Do you agree?
@@ -47,11 +47,13 @@ public class OfflineConnection implements DatabaseConnection {
    private boolean sendsStringParametersAsUnicode = true;
    private String connectionUserName;

    private static final Pattern PATTERN = Pattern.compile("offline:(\\w+)\\??(.*)");
Same here, regarding naming. Also, I would follow the same convention of extracting the regex as a separate constant.
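A minimal sketch of the convention being asked for, using the offline regex from the hunk above. The class name OfflineUrlSketch, the constant names OFFLINE_URL_REGEX and OFFLINE_URL_PATTERN, and the two helper methods are hypothetical and are not the names used in the PR. Reading the first group as a database name is also an assumption.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; not part of Liquibase.
final class OfflineUrlSketch {
    // The regex text lives in its own constant, with a descriptive name.
    private static final String OFFLINE_URL_REGEX = "offline:(\\w+)\\??(.*)";
    // The compiled pattern is a second constant derived from the first.
    private static final Pattern OFFLINE_URL_PATTERN = Pattern.compile(OFFLINE_URL_REGEX);

    /** Returns the first captured group (assumed to be a database name), or null if the URL does not match. */
    static String databaseName(String url) {
        Matcher matcher = OFFLINE_URL_PATTERN.matcher(url);
        return matcher.matches() ? matcher.group(1) : null;
    }

    /** Returns the text after the optional '?', or null if the URL does not match. */
    static String parameters(String url) {
        Matcher matcher = OFFLINE_URL_PATTERN.matcher(url);
        return matcher.matches() ? matcher.group(2) : null;
    }

    public static void main(String[] args) {
        String url = "offline:postgresql?version=9.6";
        System.out.println(databaseName(url) + " / " + parameters(url));
    }
}
```

Keeping the regex string separate from the compiled Pattern makes the expression easy to find and reuse, and a name such as OFFLINE_URL_PATTERN says what is being matched, which a bare PATTERN does not.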
@@ -19,6 +19,8 @@ public class CockroachDatabase extends PostgresDatabase {
    private Integer databaseMajorVersion;
    private Integer databaseMinorVersion;

    private static final Pattern VERSION_PATTERN = Pattern.compile("v(\\d+)\\.(\\d+)\\.(\\d+)");
Same here, I would extract the regex as a separate constant.
@@ -20,6 +20,7 @@
public class TimeType extends LiquibaseDataType {

    protected static final int MSSQL_TYPE_TIME_DEFAULT_PRECISION = 7;
    public static final Pattern PATTERN = Pattern.compile("(\\(\\d+\\))");
Same here, regarding naming and regex as a separate constant.
Hi, @arturobernalg! Thanks again for submitting another PR to enhance the readability of our code. I have left you a few comments, all on the same points (pattern naming and extracting the regex as a separate constant). Let me know what you think. Daniel.
Hi @arturobernalg, thank you for taking the time to address my review comments. If it's not too time-consuming for you, would you mind extracting the other regexes touched in this PR into separate constants and renaming some of the remaining generic Thanks,
Hi @MalloD12
c55aeb7 to 0d4d923
Split regex and pattern constant.
01c0ac4 to 72673ff
Review and testing results:
Code looks good to me now. Thanks @arturobernalg for another PR and for applying my suggestions.
Things to be aware of:
- None
Things to worry about:
- None
@@ -15,6 +15,12 @@
import java.util.regex.Pattern;

public class ChangelogRewriter {

    public static final String XSD_FILE_REGEX = "([dbchangelog|liquibase-pro])-3.[0-9]?[0-9]?.xsd";
Do we need to take *-latest and version 4- into account for any of these objects?
This hasn't changed in the last two years. @arturobernalg just extracted this regular expression into a constant. I'm trying to understand whether this regular expression should be updated to include these versions (4- and -latest) or not.
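To make the question concrete, here is a small sketch that checks which schema file names the regex from the hunk above finds. The class XsdRegexCoverageSketch, the method isCovered, and the example file names are hypothetical. It also assumes the pattern is applied with find(); how ChangelogRewriter applies it is not shown in this thread.

```java
import java.util.regex.Pattern;

// Hypothetical illustration class; not part of Liquibase.
final class XsdRegexCoverageSketch {
    // Regex copied verbatim from the diff above.
    static final String XSD_FILE_REGEX = "([dbchangelog|liquibase-pro])-3.[0-9]?[0-9]?.xsd";
    static final Pattern XSD_FILE_PATTERN = Pattern.compile(XSD_FILE_REGEX);

    /** True if the regex is found anywhere in the given file name (assumes find() semantics). */
    static boolean isCovered(String xsdFileName) {
        // Note: the square brackets form a character class, so group 1 captures
        // a single character and the match relies on the literal "-3" that follows.
        return XSD_FILE_PATTERN.matcher(xsdFileName).find();
    }

    public static void main(String[] args) {
        String[] names = {"dbchangelog-3.8.xsd", "dbchangelog-4.1.xsd", "dbchangelog-latest.xsd"};
        for (String name : names) {
            System.out.println(name + " -> " + isCovered(name));
        }
    }
}
```

Under those assumptions, only names containing "-3" followed by the rest of the expression are found, so a 4- or -latest schema name is not covered by the regex as written.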
We seem to define a similar Pattern in multiple classes. In other places the regex is dbchangelog-[\\w\\.]+.xsd or (?:-pro-|-)(?<version>[\\d.]*)\\.xsd. Any chance we can use an identical regex in all the different places? What about having the XSD pattern defined in one place?
I asked @nvoxland about it, and he pointed out that changelogID is a changeset attribute that was introduced in version 4-, so we should be checking to add this field (changelogID) for version 3-0 (not worth doing it for earlier versions as well).
    private static final String TIMESTAMP_REGEX = "^\\d{4}\\-\\d{2}\\-\\d{2}T\\d{2}:\\d{2}:\\d{2}\\.\\d+$";
    private static final Pattern TIMESTAMP_PATTERN = Pattern.compile(TIMESTAMP_REGEX);

    private static final String TIMES_REGEX = "^\\d{2}:\\d{2}:\\d{2}$";
Should this be TIME_REGEX instead of TIMES_REGEX?
Done
    private static final String STANDARD_XSD_URL_REGEX = "http://www.liquibase.org/xml/ns/dbchangelog/(dbchangelog-[\\w\\.]+.xsd)";
    private static final Pattern STANDARD_XSD_URL_PATTERN = Pattern.compile(STANDARD_XSD_URL_REGEX);

    private static final String OLD_STANDARD_XSD_URL_REGEX = "http://www.liquibase.org/xml/ns/migrator/(dbchangelog-[\\w\\.]+.xsd)";
Wow, this must be a really old URL, for sure pre-3.0? Interesting new fact I learned from reading this!
...ibase-core/src/main/java/liquibase/parser/core/formattedsql/FormattedSqlChangeLogParser.java
    private static final String SINGLE_QUOTE_RESULT_REGEX = "^(?:expectedResult:)?'([^']+)' (.*)";
    private static final String DOUBLE_QUOTE_RESULT_REGEX = "^(?:expectedResult:)?\"([^\"]+)\" (.*)";

    private static final Pattern[] PATTERNS = new Pattern[]{
I see the Pattern[] was named patterns in the previous code. That wasn't very confusing because patterns was declared where it was being used. Now that it has moved to a different location, a more descriptive name would be useful. Perhaps something like WORD_AND_QUOTING_PATTERNS?
Done
@arturobernalg, hi! I went regex by regex and copied and pasted to a text file to make sure nothing changed. I have to say, WOW! I did not find a single unintentional change to regex; the best I could come up with is what looks like a typo in one of the regex/pattern names (see comments).
I do have a couple of additional comments for you or @MalloD12 to address. The primary ask is to please be consistent in naming for "changeset" (one word, not two). The secondary ask is to double-check the regex used for XSD pattern matching. Seems like we have a few ways of defining the regex so I'm curious if we can settle on one regex (or even one place where this is defined).
As always, thank you so much for the code improvements, @arturobernalg !!!
@XDelphiGrl, everything you requested is done, right?
Liquibase uses regular expressions in a multitude of places. As @arturobernalg mentions in the PR description, holding compiled regular expressions in instance variables adds computational load compared to defining them as constants, because the regex is recompiled for every new instance. This PR "converts" those instance variables to constants.
- Functional and test harness executions did pass prior to this branch becoming out of date with master.
  - Functional test failures are related to the new history command format property.
  - Tests will pass when master merges to this branch.
- No additional testing required.
APPROVED
Impact
Description
Java Pattern objects are thread safe and immutable (it's the matchers that are not thread safe).
As such, there is no reason not to make them static if they are going to be used by each instance of the class (or again in another method in the class).
Making them instance variables, no matter how short (or long) their lifetime, means that you are recompiling the regular expression each time you create an instance of the class.
One of the key reasons for this structure (Pattern being a factory for Matcher objects) is that compiling the regular expression into its finite automaton is a moderately expensive action. However, one often finds that the same regular expression is used again and again in a given class (either through multiple invocations of the same method or in different spots in the class).
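The point above can be sketched as follows, using the version regex from the CockroachDatabase hunk in this PR. The class VersionParserSketch, the method parseVersion, and the sample input string are hypothetical and only illustrate the static-Pattern, per-call-Matcher split.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; not part of Liquibase.
final class VersionParserSketch {
    private static final String VERSION_REGEX = "v(\\d+)\\.(\\d+)\\.(\\d+)";
    // Compiled once when the class is initialized and shared by all instances and threads.
    private static final Pattern VERSION_PATTERN = Pattern.compile(VERSION_REGEX);

    // By contrast, an instance field such as
    //     private final Pattern versionPattern = Pattern.compile(VERSION_REGEX);
    // would recompile the regex every time an instance is created.

    /** Returns {major, minor, patch}, or null if no version string is found. */
    static int[] parseVersion(String text) {
        // Matcher is stateful and not thread safe, so a fresh one is created per call.
        Matcher matcher = VERSION_PATTERN.matcher(text);
        if (!matcher.find()) {
            return null;
        }
        return new int[]{
                Integer.parseInt(matcher.group(1)),
                Integer.parseInt(matcher.group(2)),
                Integer.parseInt(matcher.group(3))
        };
    }

    public static void main(String[] args) {
        int[] version = parseVersion("CockroachDB CCL v21.2.4");
        System.out.println(version[0] + "." + version[1] + "." + version[2]);
    }
}
```

The Pattern is the shared, immutable part, while each Matcher holds the state of one matching operation and stays local to the call.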
Things to be aware of
Things to worry about