Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add append and extend #1050

Merged
merged 26 commits into from Dec 6, 2022
Merged

Add append and extend #1050

merged 26 commits into from Dec 6, 2022

Conversation

PK-Neptune
Copy link
Contributor

No description provided.

@github-actions
Copy link

github-actions bot commented Oct 19, 2022

Test Results

    1 files   -      11      1 suites   - 11   35s ⏱️ - 12m 56s
522 tests +     19  519 ✔️ +     16  0 💤 ±  0  3 +3 
522 runs   - 5 514  519 ✔️  - 5 423  0 💤  - 94  3 +3 

For more details on these failures, see this check.

Results for commit d6f6563. ± Comparison against base commit 17e9c61.

♻️ This comment has been updated with latest results.

@PK-Neptune PK-Neptune force-pushed the pk/NPT-12400 branch 2 times, most recently from 6676a7f to 3fda587 Compare October 20, 2022 12:41
@PK-Neptune
Copy link
Contributor Author

src/neptune/new/internal/utils/__init__.py Outdated Show resolved Hide resolved
tests/neptune/new/test_handler.py Outdated Show resolved Hide resolved
src/neptune/new/handler.py Outdated Show resolved Hide resolved
tests/neptune/new/test_handler.py Outdated Show resolved Hide resolved
with self.assertRaises(ValueError):
exp["x"].append([])
with self.assertRaises(ValueError):
exp["x"].append([5, "str"])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once you fix this code by replacing .append with .extend you'll still have some problems.
Here in this line, the error will be raised when you'll try to log str to FloatSeries, but then you'll have defined FloatSeries under path x which is used later int he code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think now it is ok

src/neptune/new/handler.py Outdated Show resolved Hide resolved
self._container.set_attribute(self._path, attr)
attr.log(value, step=step, timestamp=timestamp, wait=wait, **kwargs)

def _create_new_attribute(self, value):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method should be moved to MetadataContainer to be consistent with existing define used in assign.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similar to assign - check it

src/neptune/new/handler.py Outdated Show resolved Hide resolved
e2e_tests/standard/test_series.py Outdated Show resolved Hide resolved
e2e_tests/standard/test_series.py Outdated Show resolved Hide resolved
@@ -73,7 +91,9 @@ def to_dict(self) -> Dict[str, Any]:
return result

def assign(self, value: Union[NamespaceVal, dict, Mapping], wait: bool = False):
if not isinstance(value, NamespaceVal):
if isinstance(value, argparse.Namespace):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is bugfix for assigning argparse.Namespace type as neptune.Namespace when attribute is already defined.

Due to changes I've made in Handler.assign I had to fix it now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact I had to change Handler.assign implementation again, but bugfix left...

attr.process_assignment(value, wait)
else:
self._container.define(self._path, value, wait)
if not attr:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did two things here:

  • removed dependency of MetadataContainer.define which now is defined with this very Handler.assign
  • Implemented flow to be consistent with what we have in: upload, log or _do_extend methods
    • If attribute does not exist, create attribute based on given value
    • then assign original (before we used casted value for assignment) value to newly created attribute

Copy link
Contributor

@shnela shnela Nov 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then assign original (before we used casted value for assignment) value to newly created attribute

I had to give up this idea, so in case of creating attribute, we're assigning value casted by general cast_value function instead of passing responsibility for casting to attribute.

This will have to be discussed.

neptune_value = cast_value(value)
attr = ValueToAttributeVisitor(self._container, parse_path(self._path)).visit(neptune_value)
self._container.set_attribute(self._path, attr)
value = neptune_value # TODO: we should clean some rules regarding type casting in assignment
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Casting value when type is not known (attribute does not exist) and when we already have attribute are completely different things.
We should address this problem soon.

... ts = df["timestamp"]
... run["data/example_series"].extend(ys, timestamps=ts)
"""
ExtendUtils.validate_values_for_extend(values, steps, timestamps)
Copy link
Contributor

@shnela shnela Nov 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calling validate_values_for_extend on every level makes extend and append to work in O(n^2) where n is depth of nested structure number of attributes in nested structure.
We can easily fix it, but I don't think it's a problem.

src/neptune/new/attributes/series/series.py Show resolved Hide resolved
class Series(Attribute, Generic[Val, Data]):
class Series(Attribute, Generic[ValTV, DataTV, LogOperationTV]):
MAX_BATCH_SIZE = None
operation_cls: type(LogOperationTV) = None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it some bad practice? Shouldn't we declare some abstract methods or classmethods instead?
Does access to these variables using self. even work?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it some bad practice?
I don't know, I think it's common practice in Django, when you override values like template_name, response_class etc.

Does access to these variables using self. even work?
Yes that's what Python does by default:

  1. Tries to to obtain instance variable stored in slots or internal __dict__
  2. falls back to class attributes

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't abstract or class method sound like a better idea for you? You don't even have default values here and nothing forces you to override them in derived class.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get your point, but what you're suggesting looks like Java in Python and I don't feel it.

Maybe I just used to fucked up Python patterns, because to me even overriding __init_sublcass__ like here looks like a solution in this case...

I don't know. Maybe @Raalsky has some opinion on writing Python.

Copy link
Contributor

@Raalsky Raalsky Dec 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may miss something but the class properties are pretty useful things like the ground for descriptors and dataclasses fields.

Doesn't abstract or class method sound like a better idea for you? You don't even have default values
here and nothing forces you to override them in derived class.

You can as well override any abstract method, class method etc. (I think so :D).

I don't wanna dig deeply into this PR but it looks ok as a first impression.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Raalsky I agree with class properties being pretty useful in general but here they are used as something to be overriden in subclasses but python does not force you to do it and they don't have any reasonable default. If you add new subclass in future would it be obvious for you to override them?

You can as well override any abstract method, class method etc. (I think so :D).

This is my point. We want them to be overriden.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here they are used as something to be overriden in subclasses but python does not force you to do it and they don't have any reasonable default. If you add new subclass in future would it be obvious for you to override them?

@aniezurawski

Are you ok with __init_sublcass__ as solution for your problem?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I never heard of it earlier so I have no intuition about possible use cases and best practices but it looks like a good idea.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok done. We'll get immediate runtime error when we try to run code with Series subclass without properly overridden values.

src/neptune/new/attributes/series/series.py Outdated Show resolved Hide resolved
src/neptune/new/attributes/series/series.py Show resolved Hide resolved
src/neptune/new/attributes/series/series.py Outdated Show resolved Hide resolved
src/neptune/new/attributes/namespace.py Outdated Show resolved Hide resolved
src/neptune/new/metadata_containers/metadata_container.py Outdated Show resolved Hide resolved
@shnela shnela force-pushed the pk/NPT-12400 branch 3 times, most recently from e654a2b to d6f6563 Compare December 2, 2022 11:33
src/neptune/new/attributes/series/file_series.py Outdated Show resolved Hide resolved
src/neptune/new/handler.py Outdated Show resolved Hide resolved
src/neptune/new/handler.py Show resolved Hide resolved
src/neptune/new/handler.py Show resolved Hide resolved
src/neptune/new/attributes/namespace.py Show resolved Hide resolved
if isinstance(value, Namespace) or is_dict_like(value):
return {k: ExtendUtils.validate_and_transform_to_extend_format(v) for k, v in value.items()}
elif is_collection(value):
raise NeptuneUserApiInputException(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was custom exception a requirement from spec? Because I think ValueError is standard python exception for such cases and I think it allows you to include some custom message.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it was discussed between PK and Mycek and custom exception here looks reasonable, we're changing the API and want to explicitly inform user that he uses new endpoint the wrong way.

Can you confirm @Herudaio

neptune_value = cast_value(value)
attr.process_assignment(neptune_value, wait)
if isinstance(value, Handler):
value = ValueCopy(value)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not cast_value?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we just override value when value is of instance Handler, if not we use original one.

step: Optional[float] = None,
timestamp: Optional[float] = None,
wait: bool = False,
**kwargs,
) -> None:
"""log is a deprecated method, this code should be removed in future"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want some warning?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No.

aniezurawski
aniezurawski previously approved these changes Dec 6, 2022
@shnela shnela merged commit 17b99cd into master Dec 6, 2022
@shnela shnela deleted the pk/NPT-12400 branch December 6, 2022 12:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants