Extend compare_metadata #770

WenzDaniel · 2023-10-26T08:16:22Z

We should add a fallback method to

Line 1682 in 5f9051b

def compare_metadata(self, run_id, target, old_metadata):

which compares only the lineages in case the data for the current data type is not stored. Currently, we load the new metadata via:

# new metadata for the given runid + target; fetch from context
 new_metadata = self.get_metadata(run_id, target)

As a fallback, in case the data is not stored, we should use:

new_lineage = st.key_for(run_id, target).lineage

One only needs to cast all tuples into lists (or vice versa), or deactivate the type checking, as otherwise one cannot compare against metadata files loaded from disk.
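The tuple/list mismatch can be reproduced without strax. This is a minimal sketch, assuming the metadata on disk is stored as JSON (which has no tuple type); the lineage content below is made up for illustration.

```python
import json

# Made-up lineage entry: data type -> (plugin name, version, options)
in_memory = {"event_basics": ("EventBasics", "1.0.0", {"window": (0, 10)})}

# Round trip through JSON, as happens when metadata is written and read back
from_disk = json.loads(json.dumps(in_memory))

print(in_memory == from_disk)  # False: the tuples came back as lists

# Casting the in-memory side the same way makes the two comparable
normalized = json.loads(json.dumps(in_memory))
print(normalized == from_disk)  # True
```

The JSON round trip only works for JSON-serializable content, so a dedicated conversion function may still be the safer choice.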

KaraMelih · 2023-11-03T14:57:44Z

I started looking into this. It looks straightforward, and I think I just now understood what you meant by typecasting. Just to confirm:

a = st.get_metadata(run_id, target)['lineage']
b = st.key_for(run_id, target).lineage
strax.utils.compare_dict(a, b)

These hold the same information, but one of them returns it in lists while the other returns it in tuples.

Then I need to check how deep the lists and tuples are nested to convert them.

KaraMelih · 2023-11-03T16:12:23Z

Should I make a PR branching from this commit, 5f9051b?
The version on master is different.

KaraMelih · 2023-11-06T11:11:38Z

I have refactored the code to check lineages in case the data is not stored. I also wrote the following utility function for converting tuples to lists in objects nested to arbitrary depth.

import copy


def convert_tuple_to_list(init_func_input):
    """Return a copy of the input in which all tuples, at any depth, are lists."""
    func_input = copy.deepcopy(init_func_input)
    # if it is a tuple, convert it and recurse
    if isinstance(func_input, tuple):
        return convert_tuple_to_list(list(func_input))

    # if it is a list, go over all elements until all tuples are lists
    elif isinstance(func_input, list):
        new_func_inp = []
        for i in func_input:
            # recursion continues until all nesting levels are exhausted
            new_func_inp.append(convert_tuple_to_list(i))
        return new_func_inp

    # if it is a dict, convert each value
    elif isinstance(func_input, dict):
        for k, v in func_input.items():
            func_input[k] = convert_tuple_to_list(v)
        return func_input
    else:
        # not a container (int, float, bytes, str, etc.): return as is
        return func_input

I tested it and it seems to be working as intended.
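A more compact variant of the same idea, offered as a sketch and not as the code that ended up in the PR. The name `tuples_to_lists` is hypothetical. It builds new containers on the way down instead of deep-copying at every level, and non-container leaves are returned uncopied.

```python
def tuples_to_lists(obj):
    """Return a copy of obj with every tuple, at any depth, replaced by a list."""
    if isinstance(obj, (tuple, list)):
        return [tuples_to_lists(item) for item in obj]
    if isinstance(obj, dict):
        return {key: tuples_to_lists(value) for key, value in obj.items()}
    return obj  # leaf values are returned as-is, not copied


nested = {"a": (1, [2, (3, 4)]), "b": {"c": ((5,),)}}
print(tuples_to_lists(nested))
# {'a': [1, [2, [3, 4]]], 'b': {'c': [[5]]}}
```

The input is left untouched because only new lists and dicts are created.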

However, I think the lineage comparison might be a little misleading, since the lineages of different runs do not differ if they come from the same context: any run ID with the same data type results in the same lineage, e.g.

st.key_for("000000", "event_basics").lineage == st.key_for("053675", "event_basics").lineage

is always True.
So by comparing the lineages (after converting tuples and lists to a common type where the two sides differ), we can only say that they were created with the same context and refer to the same data type. Is this the intended behavior?

WenzDaniel · 2023-11-06T16:38:52Z

So by comparing the lineages (after converting tuples and lists to a common type where the two sides differ), we can only say that they were created with the same context and refer to the same data type. Is this the intended behavior?

No, then you did not understand my use case. In the current implementation, we can only compare the lineage of the current context with the metadata of some stored data if the metadata of the current context is also already stored somewhere, because the metadata of the context is loaded via self.get_meta, which only works if the data is stored. However, I think the nominal use case is rather: "I cannot load any data with my current context. Why does my lineage differ from that of this piece of data stored in my directory?"
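This use case can be sketched without strax. The function names and the lineage contents below are hypothetical; in the thread's terms, `new` would come from `st.key_for(run_id, target).lineage` and `old` from the `'lineage'` entry of the stored metadata. Instead of casting, the sketch takes the other route mentioned earlier and ignores the tuple/list distinction during comparison.

```python
def same_ignoring_sequence_type(a, b):
    """Compare nested structures, treating tuples and lists as equal."""
    if isinstance(a, (list, tuple)) and isinstance(b, (list, tuple)):
        return len(a) == len(b) and all(
            same_ignoring_sequence_type(x, y) for x, y in zip(a, b)
        )
    if isinstance(a, dict) and isinstance(b, dict):
        return a.keys() == b.keys() and all(
            same_ignoring_sequence_type(a[k], b[k]) for k in a
        )
    return a == b


def differing_data_types(new_lineage, old_lineage):
    """Return the sorted data types whose lineage entries differ or are missing."""
    missing = object()  # sentinel for "not present on this side"
    names = set(new_lineage) | set(old_lineage)
    return sorted(
        name
        for name in names
        if not same_ignoring_sequence_type(
            new_lineage.get(name, missing), old_lineage.get(name, missing)
        )
    )


# Made-up lineages: current context (tuples) vs. stored metadata (lists)
new = {"peaks": ("Peaks", "0.2.0", {"gap": 700}), "records": ("Records", "0.1.0", {})}
old = {"peaks": ["Peaks", "0.1.0", {"gap": 700}], "records": ["Records", "0.1.0", {}]}
print(differing_data_types(new, old))  # ['peaks']
```

Reporting the data types that differ, and not just a yes/no answer, points directly at the plugin whose version or options changed.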

You can just branch off master for your changes.

WenzDaniel self-assigned this Oct 26, 2023

KaraMelih mentioned this issue Nov 27, 2023

Upgrade compare-metadata function #778

Merged
