GatherLayer on batch axis #1087

Open
vieting opened this issue Aug 1, 2022 · 3 comments · May be fixed by #1089
vieting (Contributor) commented Aug 1, 2022
I have a case where I'd like to use the GatherLayer on the batch axis. This works, but the size placeholder is not gathered. Does anything speak against gathering on the batch axis, or was this simply not considered before, in which case we should fix the size placeholder?

Not gathering the size placeholder leads to an error later in my model, and in general it doesn't make sense to keep the same size placeholder after the batch axis has been reindexed.

Here is a test case to demonstrate the issue:

def test_GatherLayer_batch_dim():
  with make_scope() as session:
    import numpy as np
    net = TFNetwork(extern_data=ExternData())
    batch_dim, time_dim, feature_dim = 3, 4, 2
    # [B, T, F]
    random = np.random.RandomState(42)
    values_seqs = random.rand(batch_dim, time_dim, feature_dim).astype('float32')
    values_size = np.array([4, 2, 3])
    values_placeholder = tf.constant(values_seqs, dtype=tf.float32)
    values_size_placeholder = {0: tf.constant(values_size, dtype=tf.int32)}
    values = InternalLayer(
      name="values", network=net,
      output=Data(
        name="values",
        batch_dim_axis=0, time_dim_axis=1, feature_dim_axis=2,
        shape=[None, feature_dim],
        placeholder=values_placeholder,
        size_placeholder=values_size_placeholder,
      ))
    position_np = np.array([0, 2])
    position = InternalLayer(
      name="position", network=net,
      output=Data(
        name="position",
        placeholder=tf.constant(position_np, dtype=tf.int64),
        batch_dim_axis=0, shape=[], dtype="int64",
      ))
    values.output.sanity_check()
    position.output.sanity_check()

    # should become [B', T, F]
    layer = GatherLayer(
      name="gather", network=net,
      sources=[values], position=position, axis="B",
      output=GatherLayer.get_out_data_from_opts(
        name="gather", sources=[values], position=position, axis="B"))
    layer.output.sanity_check()
    out_seqs, out_size = session.run([layer.output.placeholder, layer.output.size_placeholder.as_dict()])
    assert isinstance(out_seqs, np.ndarray)

    np.testing.assert_equal(values_seqs[position_np, :], out_seqs)
    np.testing.assert_equal(values_size[position_np], out_size[0])
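For reference, the expected behavior the test asserts can be sketched in plain NumPy (the helper `gather_batch` is hypothetical and not part of RETURNN): gathering on the batch axis must reindex both the value tensor and the per-sequence lengths with the same positions.

```python
import numpy as np

def gather_batch(values, seq_lens, position):
    """Hypothetical sketch: gather on the batch axis.
    values: [B, T, F], seq_lens: [B], position: [B'] -> ([B', T, F], [B'])."""
    # Both the values and the sequence lengths are indexed by position.
    return values[position], seq_lens[position]

values = np.arange(24, dtype=np.float32).reshape(3, 4, 2)  # [B=3, T=4, F=2]
seq_lens = np.array([4, 2, 3])  # per-sequence lengths, one per batch entry
position = np.array([0, 2])     # batch entries to keep

out, out_lens = gather_batch(values, seq_lens, position)
assert out.shape == (2, 4, 2)           # [B'=2, T, F]
assert (out_lens == [4, 3]).all()       # lengths gathered alongside
```

This is exactly the mismatch the issue describes: without the second indexing step, `out` has batch size 2 but the size placeholder still has 3 entries.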
albertz (Member) commented Aug 1, 2022 via email
vieting linked a pull request (#1089) Aug 4, 2022 that will close this issue
vieting (Contributor, Author) commented Aug 30, 2022
As discussed offline, I can get the desired result in my use case with the MaskedComputationLayer. Instead of the indices to gather, it needs a boolean mask over the batch axis. In my use case I have this mask anyway and only computed the indices from it. The mask can be used like this:

network = {
    "encoder": {...},  # B, T, F
    "boolean_mask": {...},  # B
    "encoder_masked": {
        "class": "masked_computation",
        "mask": "boolean_mask",
        "unit": {"class": "copy", "from": "encoder"}
    },  # B', T, F
    ...
}
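Conceptually, the masked computation selects a subset of the batch and masks the sequence lengths alongside, which can be illustrated in plain NumPy (this is an illustration of the semantics, not RETURNN code):

```python
import numpy as np

# A boolean mask over the batch axis selects a subset B' of sequences;
# the per-sequence lengths must be masked with the same mask.
encoder = np.arange(24, dtype=np.float32).reshape(3, 4, 2)  # [B=3, T=4, F=2]
seq_lens = np.array([4, 2, 3])                              # [B]
boolean_mask = np.array([True, False, True])                # [B]

encoder_masked = encoder[boolean_mask]  # [B'=2, T, F]
lens_masked = seq_lens[boolean_mask]    # [B'=2]

assert encoder_masked.shape == (2, 4, 2)
assert (lens_masked == [4, 3]).all()

# The mask and the gather indices are interchangeable:
assert (np.flatnonzero(boolean_mask) == [0, 2]).all()
```

So where GatherLayer takes positions, the masked-computation workaround takes the equivalent boolean mask, and the size placeholder is handled correctly.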

Since that does exactly what I need, I'll close the issue.

vieting closed this as completed Aug 30, 2022
albertz (Member) commented Aug 30, 2022
albertz commented Aug 30, 2022

The issue here is not fixed/implemented yet (GatherLayer on batch axis).
