Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Removing the requirement to use a task_id #834

Open
Andrew-S-Rosen opened this issue Jul 27, 2023 · 5 comments
Open

Removing the requirement to use a task_id #834

Andrew-S-Rosen opened this issue Jul 27, 2023 · 5 comments

Comments

@Andrew-S-Rosen
Copy link
Member

Andrew-S-Rosen commented Jul 27, 2023

As I discussed with @munrojm, there are instances in maggma where the user has to supply a "task_id" where it is arguably not necessary.

Consider this example:

from maggma.stores import MontyStore

store = MontyStore("my_db")
d = {"test": "hi"}
with store:
    store.update(d)

You get back the following traceback:

--------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[27], line 3
      1 d = {"test": "hi"}
      2 with store:
----> 3     store.update(d)

File [c:\users\asros\github\maggma\src\maggma\stores\mongolike.py:1040](file:///C:/users/asros/github/maggma/src/maggma/stores/mongolike.py:1040), in MontyStore.update(self, docs, key)
   1038     search_doc = {k: d[k] for k in key}
   1039 else:
-> 1040     search_doc = {key: d[key]}
   1042 self._collection.replace_one(search_doc, d, upsert=True)

KeyError: 'task_id'

If the user supplies, say, d = {"task_id": 1, "test": "hi"} it works and you get:

[{'_id': ObjectId('64c2c5e507a03e712c848446'), 'task_id': 1, 'test': 'hi'}]

But this is awkward. If it's already going to assign an _id to the entry, there should not be a need to have the user pass a task_id. If no task_id is specified, it should fallback to using the automatically generated _id.

@rkingsbury
Copy link
Collaborator

Good point @arosen93 . Maybe we could implement this in Store.__init__ such that if you set key=None then the _id field is used by default?

This would require us to manually create ObjectID whenever not using an actual MongoDB - backed store.

@Andrew-S-Rosen
Copy link
Member Author

That seems like a logical approach to me!

@Andrew-S-Rosen
Copy link
Member Author

Motivation for not defaulting to _id: not all Stores have or use _id.

from maggma.stores import JSONStore

with JSONStore("test.json", read_only=False) as store:
    store.update({"test": "hi"})

doesn't work and

from maggma.stores import JSONStore

with JSONStore("test.json", read_only=False) as store:
    store.update({"task_id": "hello", "test": "hi"})

does work and yields a file that contains

[{"task_id": "hello", "test": "hi"}]

with no _id.

@rkingsbury
Copy link
Collaborator

Motivation for not defaulting to _id: not all Stores have or use _id.

I'm not sure I understand your point in this example. Is the recommended behavior still to default to an automatically generated _id field if the user passes task_id=None? Obviously we would need to implement this in the base Store class so that it would always be assigned if needed.

@Andrew-S-Rosen
Copy link
Member Author

Apologies for not being clear. I was knee deep in debugging and being too brief.

Yes, I would recommend having an automatically generated _id field if the user passed task_id=None. In the original example with MontyStore, and _id was automatically generated already. For the JSONStore, there is no such _id field. But I would support this automatic implementation in Store.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants