Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generic Merge #757

Merged
merged 18 commits into from
Feb 17, 2021
Merged

Generic Merge #757

merged 18 commits into from
Feb 17, 2021

Conversation

BarisSari
Copy link
Contributor

@BarisSari BarisSari commented Feb 9, 2021

🧰 Issue

Fixes #754

🔨Work carried out:

  • Allow merging Tasks and Tags
  • Allow merging reference table objects
  • Allow merging Sensors and Datafiles (by moving measurements)
  • Tests pass

I also fixed the Sphinx error, which was causing failure: sphinx-doc/sphinx#8885

Confirmations

  • I have chosen reviewers for my PR.
  • I have chosen an appropriate label for the PR, adding interactive_review if reviewers will need to see UI
  • I have extended/updated the documentation in \docs folder
  • Any database changes are recorded in the Log/Changes tables
  • I have completed the mandatory sections of this document.
  • I have deleted any unused sections.

📝 Developer Notes:

Ian suggested writing a test that merges nations, but I thought it would be better to use the most used reference table: Privacies. That's why I tested merge methods with it.

While I was working to find out the referencing foreign key fields of a table, I found sqlalchemy-utils. And I realised that I'm trying to reinvent the wheel. My function was almost identical to one of its helper functions. 😄 So, I also included sqlalchemy-utils in this PR. It has many useful functions, especially foreign key helpers: https://sqlalchemy-utils.readthedocs.io/en/latest/foreign_key_helpers.html .

You might ask why I didn't use merge_references(from_obj, to_obj) for all merge-operations. The reason is that we should handle dependent objects(Platform-Sensor, etc.) differently. If I tried to merge two platforms that have the same sensor name, it would throw a constraint error. So, I tried to use it as much as I can (for merging reference table objects), but I didn't use it for special cases.

@codecov
Copy link

codecov bot commented Feb 9, 2021

Codecov Report

Merging #757 (7d07b5a) into develop (36b0306) will increase coverage by 0.04%.
The diff coverage is 97.10%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop     #757      +/-   ##
===========================================
+ Coverage    90.98%   91.03%   +0.04%     
===========================================
  Files           84       84              
  Lines         7889     7933      +44     
===========================================
+ Hits          7178     7222      +44     
  Misses         711      711              
Impacted Files Coverage Δ
pepys_import/core/store/sqlite_db.py 100.00% <ø> (ø)
pepys_admin/maintenance/gui.py 58.46% <50.00%> (+0.16%) ⬆️
pepys_import/core/store/data_store.py 98.92% <98.48%> (+0.07%) ⬆️
pepys_import/core/store/common_db.py 99.42% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a9e9555...8cdb3c5. Read the comment docs.

@BarisSari BarisSari added the Task Package of work label Feb 11, 2021
@BarisSari BarisSari marked this pull request as ready for review February 11, 2021 09:30
@IanMayo
Copy link
Member

IanMayo commented Feb 11, 2021

So, I tried to use it as much as I can (for merging reference table objects), but I didn't use it for special cases.

Sounds perfect to me, @BarisSari :-)

@IanMayo
Copy link
Member

IanMayo commented Feb 11, 2021

I'm sure I made this comment before, @BarisSari - but now I can't find it.

I couldn't see this in the tests. You're testing merging entries in the Privacies table. It would be a good idea to load some State entries from file, and check their privacy gets changed to the new merged value. This will check that merge_references works in the way we're expecting it to.

@BarisSari
Copy link
Contributor Author

I'm sure I made this comment before, @BarisSari - but now I can't find it.

I couldn't see this in the tests. You're testing merging entries in the Privacies table. It would be a good idea to load some State entries from file, and check their privacy gets changed to the new merged value. This will check that merge_references works in the way we're expecting it to.

Sure, @IanMayo. I'm extending the related test.

@IanMayo IanMayo self-requested a review February 11, 2021 13:35
IanMayo
IanMayo previously approved these changes Feb 11, 2021
Copy link
Member

@IanMayo IanMayo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks. Will wait for @robintw review.

Copy link
Collaborator

@robintw robintw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this @BarisSari.

The approach looks good, and I like the sqlalchemy-utils functions you're using.

I'm a little confused about a few bits though:

  • Can merge_generic merge platforms? It doesn't look like it can at the moment - it'd be really good if all merges could just go through the merge_generic function, so we didn't have to do platforms separately. I can see why the merge_platforms function hasn't been deleted at the moment (as it would break the GUI), but hopefully we could delete it once I'd updated the GUI to use merge_generic.
  • What about the other metadata tables that we might want to merge? I can see the code deals with Sensor and Datafile at the moment. What about Task, HostedBy, Participant etc. @IanMayo I assume I'm correct in saying these should be able to be merged too (I know we're not using them at the moment, but it'd be good if the code supported them for when we start using them shortly).

@BarisSari
Copy link
Contributor Author

Thanks for the review, @robintw.

  1. Yes, I didn't want to break the current flow. But, I'm going to update merge_generic. It makes sense to use it for merging Platforms as well.
  2. I did it on purpose because Ian didn't mention those tables in the issue. According to his answer, I can update the method.

@IanMayo
Copy link
Member

IanMayo commented Feb 12, 2021

@BarisSari - @robintw highlighted a couple of other tables that it would make sense to support merge:

  • Tasks (in case two people add the same exercise, resulting in duplicates)
  • Tags (for removing duplicates, or when granularity was too fine. Analyst creates tags for 2020 Jan, 2020 Feb, ... But then wishes to re-group into 2020 Q1, 2020 Q2, etc).

@BarisSari
Copy link
Contributor Author

@BarisSari - @robintw highlighted a couple of other tables that it would make sense to support merge:

  • Tasks (in case two people add the same exercise, resulting in duplicates)
  • Tags (for removing duplicates, or when granularity was too fine. Analyst creates tags for 2020 Jan, 2020 Feb, ... But then wishes to re-group into 2020 Q1, 2020 Q2, etc).

Okay, @IanMayo. I'm modifying the PR.

Copy link
Collaborator

@robintw robintw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @BarisSari, this looks good.

I see you've replaced most calls to merge_platforms with merge_generic - but there are a couple that have been missed in the tests. Could you switch those over too?

Also, given that we've now replaced all calls to merge_platforms - I think we should be ok to delete merge_platforms (I see you've replaced the call in the GUI too).

@@ -215,7 +218,9 @@ def test_merge_platforms_with_different_sensor_names(self):
assert len(states_before_merge) == 0
assert len(contacts_before_merge) == 0

self.store.merge_platforms([platform_2.platform_id], platform.platform_id)
self.store.merge_platforms(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the other calls to merge_platforms seem to have been switched to merge_generic, apart from this one and a couple of others.

@BarisSari
Copy link
Contributor Author

Okay, @robintw. I'll update the tests. But actually, I'm still calling the merge_platforms inside of merge_generic:
https://github.com/debrief/pepys-import/pull/757/files#diff-438b06426ef37127a7f6f793bc66ffc87ca4752f3f5f653d77fb1e9eb223e53bR1851

I just removed duplicated codes, and simplified the merge_platform. But I didn't delete it completely because merge_generic would be more complex.

@robintw
Copy link
Collaborator

robintw commented Feb 17, 2021

Ah yes of course @BarisSari - that makes sense. I'll approve now.

@robintw robintw self-requested a review February 17, 2021 11:13
Copy link
Collaborator

@robintw robintw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All looks good now, thanks for this Baris.

@IanMayo IanMayo merged commit 9b49d17 into develop Feb 17, 2021
@IanMayo IanMayo deleted the generic_merge branch February 17, 2021 11:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Task Package of work
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4.2 Merge (Generic)
3 participants