Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Pattern instance variables and avoid re calculating each time. #3656

Merged

Conversation

arturobernalg
Copy link
Contributor

Impact

  • Bug fix (non-breaking change which fixes expected existing functionality)
  • [X ] Enhancement/New feature (adds functionality without impacting existing logic)
  • Breaking change (fix or feature that would cause existing functionality to change)

Description

Java Pattern objects are thread safe and immutable (its the matchers that are not thread safe).

As such, there is no reason not to make them static if they are going to be used by each instance of the class (or again in another method in the class).

Making them instance variables, no matter how short (or long) their life time means that you are recompiling the regular expression each time you create an instance of a class.

One of the key reasons for this structure (Pattern being a factory for Matcher objects) is that compiling the regular expression into its finite automata is a moderately expensive action. However, one finds that often the same regular expression is used again and again in a given class (either through multiple invocations of the same method or different spots in the class).

Things to be aware of

  • Describe the technical choices you made
  • Describe impacts on the codebase

Things to worry about

  • List any questions or concerns you have with the change
  • List unknowns you have

Additional Context

Add any other context about the problem here.

@nvoxland
Copy link
Contributor

Initial/Pre-Review Thoughts

It is a good cleanup, thanks.

Questions I have:

  • None

Potential risks:

  • None

What could make the full review difficult:

  • Nothing

@MalloD12 MalloD12 added the SafeToBuild Indicates that a particular PR contains changes which are safe to build using GitHub actions label Jan 24, 2023
@MalloD12 MalloD12 changed the base branch from master to 1_9 January 24, 2023 21:38
@MalloD12 MalloD12 changed the base branch from 1_9 to master January 24, 2023 21:38
@github-actions
Copy link

github-actions bot commented Jan 24, 2023

Unit Test Results

  5 252 files  ±0    5 252 suites  ±0   39m 45s ⏱️ + 4m 48s
  4 852 tests ±0    4 634 ✔️ ±0     218 💤 ±0  0 ±0 
62 022 runs  ±0  55 998 ✔️ ±0  6 024 💤 ±0  0 ±0 

Results for commit 94cdc01. ± Comparison against base commit 6e433e5.

♻️ This comment has been updated with latest results.

@@ -15,6 +15,12 @@
import java.util.regex.Pattern;

public class ChangelogRewriter {

public static final String XSD_PATTERN_STRING = "([dbchangelog|liquibase-pro])-3.[0-9]?[0-9]?.xsd";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking maybe can append a REGX/REGEX at the end of these regex constants. Also, I would lightly change the name of some of the patterns changing XSD_PATTERN for XSD_NAME_PATTERN. What do you think about it?

public static final String XSD_PATTERN_STRING = "([dbchangelog|liquibase-pro])-3.[0-9]?[0-9]?.xsd";
public static final Pattern XSD_PATTERN = Pattern.compile(XSD_PATTERN_STRING);
private static final String PATTERN_STRING = "(?ms).*<databaseChangeLog[^>]*>";
private static final Pattern PATTERN = Pattern.compile(PATTERN_STRING);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, maybe changing this to something a bit more representative. Do you agree?

@@ -47,11 +47,13 @@ public class OfflineConnection implements DatabaseConnection {
private boolean sendsStringParametersAsUnicode = true;
private String connectionUserName;

private static final Pattern PATTERN = Pattern.compile("offline:(\\w+)\\??(.*)");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, regarding naming. Also, I would follow the same convention of extracting the regex as a separate constant.

@@ -19,6 +19,8 @@ public class CockroachDatabase extends PostgresDatabase {
private Integer databaseMajorVersion;
private Integer databaseMinorVersion;

private static final Pattern VERSION_PATTERN = Pattern.compile("v(\\d+)\\.(\\d+)\\.(\\d+)");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, I would extract the regex as a separate constant.

@@ -20,6 +20,7 @@
public class TimeType extends LiquibaseDataType {

protected static final int MSSQL_TYPE_TIME_DEFAULT_PRECISION = 7;
public static final Pattern PATTERN = Pattern.compile("(\\(\\d+\\))");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, regarding naming and regex as a separate constant.

@MalloD12
Copy link
Contributor

Hi, @arturobernalg!

Thanks again for submitting another PR to enhance the readability of our code. I have left you a few comments all regarding the same (pattern naming and extracting regex as a separate constant). Let me know what do you think about it.

Daniel.

@MalloD12
Copy link
Contributor

Hi @arturobernalg, thank you for taking the time to address my review comments. If it's not a time-consuming thing for you, would you mind updating other regex updated in this PR as a separate constant and renaming some of the remaining generic PATTERN names? Otherwise, if you don't mind I can do it later?

Thanks,
Daniel.

@arturobernalg
Copy link
Contributor Author

Hi, @arturobernalg!

Thanks again for submitting another PR to enhance the readability of our code. I have left you a few comments all regarding the same (pattern naming and extracting regex as a separate constant). Let me know what do you think about it.

Daniel.

Hi @MalloD12
You're right. I was probably lazy that day. I have changed the last ones that I have modified. But there are still others missing. If you can wait I'll look at them today.
I just changed. let me know what do you think
TY

Split regex and petter constant.
@MalloD12 MalloD12 removed the SafeToBuild Indicates that a particular PR contains changes which are safe to build using GitHub actions label Jan 26, 2023
@MalloD12 MalloD12 added the SafeToBuild Indicates that a particular PR contains changes which are safe to build using GitHub actions label Jan 26, 2023
Copy link
Contributor

@MalloD12 MalloD12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review and testing results:

Code looks good to me now. Thanks @arturobernalg for another PR and for applying my suggestions.

Things to be aware of:

  • None

Things to worry about:

  • None

@XDelphiGrl XDelphiGrl self-requested a review January 31, 2023 17:22
@MalloD12 MalloD12 added SafeToBuild Indicates that a particular PR contains changes which are safe to build using GitHub actions and removed SafeToBuild Indicates that a particular PR contains changes which are safe to build using GitHub actions labels Jan 31, 2023
@filipelautert filipelautert requested review from MalloD12 and removed request for MalloD12 February 6, 2023 19:36
@@ -15,6 +15,12 @@
import java.util.regex.Pattern;

public class ChangelogRewriter {

public static final String XSD_FILE_REGEX = "([dbchangelog|liquibase-pro])-3.[0-9]?[0-9]?.xsd";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to take *-latest and version 4- into account for any of these objects?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This hasn't changed in the last two years. @arturobernalg just made the change of extracting this regular expression into a constant. I'm trying to understand whether this regular expression should be updated to include these versions (4- and -latest) or not.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We seem to define a similar Pattern in multiple classes. In other places the regex is dbchangelog-[\\w\\.]+.xsd or (?:-pro-|-)(?<version>[\\d.]*)\\.xsd. Any chance we can use identical regex in all the different places? What about having the XSD pattern defined in one place?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I asked @nvoxland about it, and he was pointing me out that changelogID is a changeset attribute that has been introduced in version 4- so we should be checking to add this field (changelogID) for version 3-0 (not worth doing it also for earlier versions).

private static final String TIMESTAMP_REGEX = "^\\d{4}\\-\\d{2}\\-\\d{2}T\\d{2}:\\d{2}:\\d{2}\\.\\d+$";
private static final Pattern TIMESTAMP_PATTERN = Pattern.compile(TIMESTAMP_REGEX);

private static final String TIMES_REGEX = "^\\d{2}:\\d{2}:\\d{2}$";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be TIME_REGEX instead of TIMES_REGEX?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @arturobernalg,

Would you mind making this minor update?

Thanks,
Daniel.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

private static final String STANDARD_XSD_URL_REGEX = "http://www.liquibase.org/xml/ns/dbchangelog/(dbchangelog-[\\w\\.]+.xsd)";
private static final Pattern STANDARD_XSD_URL_PATTERN = Pattern.compile(STANDARD_XSD_URL_REGEX);

private static final String OLD_STANDARD_XSD_URL_REGEX = "http://www.liquibase.org/xml/ns/migrator/(dbchangelog-[\\w\\.]+.xsd)";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, this must be a really old URL, for sure pre-3.0? Interesting new fact I learned from reading this!

private static final String SINGLE_QUOTE_RESULT_REGEX = "^(?:expectedResult:)?'([^']+)' (.*)";
private static final String DOUBLE_QUOTE_RESULT_REGEX = "^(?:expectedResult:)?\"([^\"]+)\" (.*)";

private static final Pattern[] PATTERNS = new Pattern[]{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the Pattern[] was named patterns in the previous code. That wasn't very confusing because patterns was declared where it was being used. Now that it is moved to a different location, a more descriptive name would be useful. Perhaps something like WORD_AND_QUOTING_PATTERNS?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor

@XDelphiGrl XDelphiGrl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arturobernalg, hi! I went regex by regex and copied and pasted to a text file to make sure nothing changed. I have to say, WOW! I did not find a single unintentional change to regex; the best I could come up with is what looks like a typo in one of the regex/pattern names (see comments).

I do have a couple of additional comments for you or @MalloD12 to address. The primary ask is to please be consistent in naming for "changeset" (one word, not two). The secondary ask is to double-check the regex used for XSD pattern matching. Seems like we have a few ways of defining the regex so I'm curious if we can settle on one regex (or even one place where this is defined).

As always, thank you so much for the code improvements, @arturobernalg !!!

@filipelautert
Copy link
Collaborator

@XDelphiGrl all that you request is done right?

@XDelphiGrl XDelphiGrl added SafeToBuild Indicates that a particular PR contains changes which are safe to build using GitHub actions and removed SafeToBuild Indicates that a particular PR contains changes which are safe to build using GitHub actions labels Feb 15, 2023
Copy link
Contributor

@XDelphiGrl XDelphiGrl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Liquibase uses regular expressions in a multitude of places. As @arturobernalg mentions in the PR description, instance variables of regular expressions introduce a computational load compared to constant definitions for instance variables. This PR "converts" instance variables of regular expressions to constants.

  • Functional and test harness executions did pass prior to this branch becoming out of date with master.
    • Functional test failures are related to the new history command format property.
    • Tests will pass when master merges to this branch.
  • No additional testing required.

APPROVED

@filipelautert filipelautert merged commit 1337cc6 into liquibase:master Mar 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
SafeToBuild Indicates that a particular PR contains changes which are safe to build using GitHub actions sprint2023-43 TypeEnhancement
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

None yet

6 participants