Add test for optimize_out_slice_nd #543

Open · wants to merge 11 commits into master
Conversation

@jotix16 (Contributor) commented Jun 13, 2021

This is an example case which shows that the slice_nd layer doesn't get properly optimized out of the loop.

This pull request is meant to fix that.

For clarification: we want to generalize SliceNdLayer so that start is not restricted to shape (batch,), but also covers cases with several time axes.

@jotix16 jotix16 marked this pull request as draft June 13, 2021 11:40
@jotix16 jotix16 force-pushed the optimize_out_slice_nd branch 2 times, most recently from a75c364 to bbdbb67, June 13, 2021 17:02
@albertz

This comment has been minimized.

@jotix16 jotix16 force-pushed the optimize_out_slice_nd branch 2 times, most recently from a45b4a5 to e432a28, June 20, 2021 14:12
@jotix16 (Contributor, Author) commented Jun 20, 2021

I've added the TF implementation of the new slice_nd2 in util/basic and some tests. I haven't cleaned out the old ones yet.

Before implementing the RETURNN layer, I wanted to clarify whether we want to allow a slice_axis input parameter to the function, similar to the GatherLayer. If we don't, the user has to make sure that the axes of input and start have the same order.

I.e.

"""
  :param tf.Tensor input: shape (B, T1, ..., Tn, D)
  :param tf.Tensor start: shape (B, T1, ..., Tn-1), int32, which automatically indicates n as the slice axis
"""

returnn/tf/layers/basic.py (outdated review thread, resolved)
@albertz (Member) commented Jun 25, 2021 via email

returnn/tf/layers/basic.py (outdated review thread, resolved)
@albertz (Member) commented Jun 26, 2021

Can you post here what the actual problem/error is when you use these test cases with the original slice_nd? It is not really clear to me why there is actually a problem. (Just post the error with stack trace.)

@jotix16 (Contributor, Author) commented Jun 28, 2021

(Just post the error with stack trace.)

Here is the error stack trace you asked for. The thing is that the slice_nd we had expects start to have only a batch axis [B], but if slice_nd gets pulled out of the loop, so does start, and instead of shape [B], start then has shape [B,T]. So it requires a loop over T; that's what I added with this commit.
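
To illustrate the shape difference (a toy sketch, not taken from the actual test case):

import tensorflow as tf

# Inside the rec loop, the layer sees one start index per batch entry: shape [B].
start_in_loop = tf.constant([0, 1])                  # [B=2]
# When the layer is moved out of the loop, start is stacked over time: shape [B, T].
start_moved_out = tf.constant([[0, 1, 2],
                               [1, 1, 0]])           # [B=2, T=3]
# The old slice_nd only handled the [B] case, so the [B, T] case needs either an
# explicit loop over T or a vectorized slice, which is what this commit adds.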

@jotix16 jotix16 marked this pull request as ready for review June 29, 2021 09:36
assert x.get_dim_tag(start_axis).is_equal(start.get_dim_tag(start_axis), **is_equal_opts)

# Handle the case when the layer is pulled out of the rec loop but the input hasn't changed
if self.optimized_out_of_loop_and_unchanged_input(x, start):
Member

Change the name of the function and the comment. This layer is not related to the rec loop (RecLayer) in any way. No comment should mention anything about rec loop here.

I don't really understand what you are actually testing here.

Contributor Author

If we only expect input_data to come from the base_network, no check is required. We can then always assume that input_data stays the same, whether the layer is pulled out of the loop or not.

Should I follow this logic?

returnn/tf/layers/basic.py (outdated review threads, resolved)
self.output.size_placeholder = x.size_placeholder.copy()
if isinstance(size, tf.Tensor):
  self.output.size_placeholder[0] = tf.maximum(seq_lens - tf.reshape(start, tf.shape(seq_lens)), 0)
self.output.size_placeholder[slice_axis] = size
Member

This is wrong. size is just a scalar. But you need a vector of shape [B] here. What's wrong with the old code?

Contributor Author

But you need a vector of shape [B] here. What's wrong with the old code?

We cannot use start directly to calculate the size, since it has shape [B,T0,...] instead of [B,].
So I am not sure how to set self.output.size_placeholder.

Contributor Author

What is the meaning of self.output.size_placeholder if we have more than one spatial axis? Should every element of self.output.size_placeholder have shape (B,)?

Member

size_placeholder is a dict, mapping each spatial axis (counted without the batch dim) to such a [B] tensor.
So if you have a tensor [B,T1,T2,F], then for each batch entry b you find its length of axis T1 in size_placeholder[0][b], and the one for T2 in size_placeholder[1][b].
However, with this it's not possible for lengths to depend on anything other than the batch entry, i.e. you cannot model that some batch entry should have a different length in axis T2 for different time steps t1.
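
For illustration, a toy example of that mapping (the lengths are made up):

import tensorflow as tf

# For a tensor of shape [B=2, T1, T2, F], size_placeholder maps each spatial
# axis (counted without the batch dim) to a [B] vector of per-batch lengths.
size_placeholder = {
  0: tf.constant([7, 5]),  # lengths along T1 for batch entries 0 and 1
  1: tf.constant([3, 4]),  # lengths along T2 for batch entries 0 and 1
}
# There is exactly one length per batch entry and axis, so the length of T2
# cannot vary over the time steps t1, which is the limitation discussed here.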

Contributor Author

you cannot model that the same batch entry should have a different length in axis T2 for different time steps t1

This is what we have here, though.

What is self.output.size_placeholder used for anyways? Does it cause any problems if set wrong?

Member

The size_placeholder is mostly used for masking, e.g. deciding which entries to ignore when calculating the loss, when reducing a sequence (e.g. taking the average/min/max over the sequence), or in layers which concatenate in time and such.
Many layers take it into account; if it is set wrong then weird things can happen, and your calculations will be wrong and depend on the batching.
So yes, I'd say setting this is kind of important.

I don't know a way to set it the way you want it. Maybe @albertz knows more?
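
For illustration, a small sketch of how such per-batch lengths are typically used for masking (illustrative only, not RETURNN code):

import tensorflow as tf

# Per-batch lengths (what a size_placeholder entry holds) turned into a mask,
# e.g. to exclude padded frames from a loss or from a sequence reduction.
seq_lens = tf.constant([7, 5])                       # [B]
mask = tf.sequence_mask(seq_lens, maxlen=10)         # [B, T], True for valid frames
values = tf.random.normal([2, 10])                   # [B, T]
masked_sum = tf.reduce_sum(tf.where(mask, values, tf.zeros_like(values)), axis=1)  # [B]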

self.output.placeholder = slices

@classmethod
def optimized_out_of_loop_and_unchanged_input(cls, input_data, start):
Member

As explained above (and before), this name must be changed to what this function actually does/checks.

Contributor Author

It is now called input_comes_from_base_network.

def optimized_out_of_loop_and_unchanged_input(cls, input_data, start):
  """
  :rtype: bool
  The idea is to check that the axis after the last common axis is a feature axis instead of spatial.
Member

I don't really understand this. Why do you check this? Why is this relevant?
Also, what does this mean, "the idea"? Is this what this function returns, or how does this idea relate to what this function returns? If this is what the function returns, it should be like :returns: True iff axis after last ... is ... instead of ... or so.

Contributor Author
@jotix16 Jul 13, 2021

I reformulated it. It should be clearer now.

    """
    The idea is to check if the axis after the last common axis is the feature axis instead of another spatial axis.
    Because the input_data should normally have one extra spatial axis compared to start.
    """

slice_idx = tf.tile(tf.expand_dims(start, -1), [1] * len_common_dims + [size]) + tf.range(size)
mask = tf.logical_or(tf.greater(slice_idx, slice_dim - 1), tf.less(slice_idx, 0))  # (B, T1, ..., Tn-1, size)
slice_idx = tf.clip_by_value(slice_idx, 0, slice_dim - 1)  # clipped slice idx
res = tf.gather(x, slice_idx, axis=len_common_dims, batch_dims=len_common_dims)
Member

I think tf.gather with axis and batch_dims is only supported in later TF versions (I don't remember since what version, can you/someone check?).

I'm not sure anymore which is the min TF version we want to support (I think we documented this somewhere; can someone check? @patrick-wilken ? @JackTemaki ?).

Maybe we need our own wrapper for gather. Or you use the more generic gather_nd (which might be slower for this case though).

Contributor Author
@jotix16 Jul 13, 2021

The arguments are present from tf_2.0.0 up to tf_2.3.0 and tf_2.4.0. So it should be fine?
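
For a quick sanity check of the call pattern (illustrative shapes; axis and batch_dims are both set to the number of common leading axes):

import tensorflow as tf

x = tf.reshape(tf.range(2 * 3 * 4 * 5), [2, 3, 4, 5])   # (B, T1, T2, D)
idx = tf.zeros([2, 3, 2], dtype=tf.int32)                # (B, T1, size)
out = tf.gather(x, idx, axis=2, batch_dims=2)            # gathers along T2 per (b, t1)
print(out.shape)  # (2, 3, 2, 5), i.e. (B, T1, size, D)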

returnn/tf/util/basic.py (outdated review thread, resolved)
@jotix16 jotix16 requested a review from a team as a code owner July 13, 2021 17:35
Labels: none yet
Projects: none yet
3 participants