I have two data frames with the same schema. Is there a way to compare them so that the result lists the added, deleted, and modified rows? Ideally, the comparison would accept one or more key columns and a set of columns to ignore.
Hi!
We don't have such functionality at the moment, but it might be a handy addition.
Tracking additions, deletions, and modifications, similar to how git would do it, requires a special algorithm. I suppose the Myers diff algorithm could help.
I just tried this algorithm via https://github.com/andrewbailey/Difference on two DataFrames (each as a List<DataRow<*>>), and it correctly reported the remove/move/add operations that likely occurred between them.
We could wrap a library like that in the future to introduce this behavior to DataFrame natively, but in the meantime, you could try that library as well :)
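If your rows can be identified by key columns, as in the original question, a sequence diff may not be needed at all. Below is a minimal sketch of a key-based comparison in plain Kotlin. It is not part of the DataFrame API and does not use the Difference library. `Row`, `RowDiff`, and `diffRows` are hypothetical names made up for illustration, rows are modeled as plain maps from column name to value, and the key values are assumed to be unique within each data set.

```kotlin
// Hypothetical helper types; none of this is part of the Kotlin DataFrame API.
typealias Row = Map<String, Any?>

data class RowDiff(
    val added: List<Row>,              // rows whose key exists only in the new data
    val deleted: List<Row>,            // rows whose key exists only in the old data
    val modified: List<Pair<Row, Row>> // (old, new) pairs with the same key but different values
)

fun diffRows(
    old: List<Row>,
    new: List<Row>,
    keyColumns: List<String>,
    ignoreColumns: Set<String> = emptySet()
): RowDiff {
    // Assumption: keys are unique within each list.
    // With duplicate keys, associateBy keeps only the last row per key.
    fun keyOf(row: Row): List<Any?> = keyColumns.map { row[it] }

    // Drop ignored columns before comparing values.
    fun compared(row: Row): Row = row.filterKeys { it !in ignoreColumns }

    val oldByKey = old.associateBy { keyOf(it) }
    val newByKey = new.associateBy { keyOf(it) }

    val added = new.filter { keyOf(it) !in oldByKey }
    val deleted = old.filter { keyOf(it) !in newByKey }
    val modified = old.mapNotNull { before ->
        val after = newByKey[keyOf(before)] ?: return@mapNotNull null
        if (compared(before) != compared(after)) before to after else null
    }
    return RowDiff(added, deleted, modified)
}
```

Unlike a Myers-style diff, this ignores row order and never reports moves, so it only fits when the key columns really identify a row.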