Proposal: Parallel Uniq and Find #401

Open
steve-hb opened this issue Nov 27, 2023 · 0 comments
steve-hb commented Nov 27, 2023

While working with millions (up to billions) of structs in one large slice, I needed to remove duplicates (because of database limitations related to transactions and duplicate updates/inserts).
At the current speed, this would take hours to process. Before building more complex structures using hashes etc., I'd prefer to simply run the task in parallel. Since I don't care about order, this isn't that big of a task.

Therefore my proposal is to implement this:
func UniqByParallel[T comparable](slice []T, numThreads int, comparator func(item T, other T) bool) []T

I would also propose my (very simplistic and not optimised) version of this, but first I would like to know what others think and what problems they might see that I don't.
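
For discussion, here is a minimal sketch of one way the proposed signature could be implemented. It is not the version mentioned above, and nothing in it comes from the library: the helper `appendUnique` and the chunk-then-merge strategy are assumptions made for illustration. Each goroutine deduplicates one contiguous chunk with the comparator, and a final sequential pass merges the per-chunk results.

```go
package main

import "sync"

// appendUnique (hypothetical helper) appends each item of src to dst unless
// the comparator reports that an equal item is already in dst.
// It performs up to len(dst) comparisons per item of src.
func appendUnique[T any](dst, src []T, equal func(item T, other T) bool) []T {
	for _, item := range src {
		duplicate := false
		for _, kept := range dst {
			if equal(item, kept) {
				duplicate = true
				break
			}
		}
		if !duplicate {
			dst = append(dst, item)
		}
	}
	return dst
}

// UniqByParallel uses the signature from the proposal; the body is only an
// illustrative assumption, not an optimised or agreed-upon implementation.
func UniqByParallel[T comparable](slice []T, numThreads int, comparator func(item T, other T) bool) []T {
	if len(slice) == 0 {
		return []T{}
	}
	// Clamp the worker count to the range [1, len(slice)].
	if numThreads < 1 {
		numThreads = 1
	}
	if numThreads > len(slice) {
		numThreads = len(slice)
	}
	// Ceiling division so that every element lands in some chunk.
	chunkSize := (len(slice) + numThreads - 1) / numThreads

	// Each worker writes only to its own index, so no mutex is needed.
	partials := make([][]T, numThreads)
	var wg sync.WaitGroup
	for w := 0; w < numThreads; w++ {
		start := w * chunkSize
		if start >= len(slice) {
			break
		}
		end := start + chunkSize
		if end > len(slice) {
			end = len(slice)
		}
		wg.Add(1)
		go func(w int, chunk []T) {
			defer wg.Done()
			var local []T
			partials[w] = appendUnique(local, chunk, comparator)
		}(w, slice[start:end])
	}
	wg.Wait()

	// Sequential merge: removes duplicates that span different chunks.
	var result []T
	for _, partial := range partials {
		result = appendUnique(result, partial, comparator)
	}
	return result
}
```

A call such as `UniqByParallel(items, 8, func(a, b Item) bool { return a.ID == b.ID })` (with `Item` and `ID` as placeholder names) would use it. One caveat with this approach: the merge step runs on a single goroutine and still compares every surviving item against the accumulated result, so the speed-up is largest when each chunk contains many duplicates. When the inputs have few duplicates, the merge dominates and a hash-based pass is likely to be much faster.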

PS: I searched for other libraries and solutions but didn't find any easy alternatives. Maybe someone knows of one :)
