Dealing with aliasing #239

breuleux · 2019-07-29T19:21:46Z

Myia currently cannot handle aliased tensors in data structures. This issue can crop up in the PyTorch frontend, in code like this:

import torch


class LinearSeq(torch.nn.Module):
    def __init__(self, a, b):
        super(LinearSeq, self).__init__()
        self.lin = torch.nn.Linear(a, b)
        # self.seq[0] is the very same module object as self.lin
        self.seq = torch.nn.Sequential(self.lin)

    def forward(self, x):
        return self.seq(x)

The problem is that Myia sees both self.lin and self.seq[0] but treats them as two different parameters rather than as one and the same parameter. So if forward only uses self.seq, the gradient with respect to self.lin is zero, and the update is applied to seq but not to lin. Furthermore, if both seq and lin are used, they accumulate gradients separately and their values diverge.
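To make the failure mode concrete, here is a minimal sketch in plain Python, with no Myia or PyTorch involved. The names `Param`, `leaves`, and `merge_aliased_grads` are hypothetical and exist only for this illustration. The sketch assumes that aliasing can be detected by object identity: leaves reachable through several paths are grouped by `id`, and their gradients are summed so that every alias sees the same update. This is one possible way to handle it, not a description of what Myia does.

```python
class Param:
    """Hypothetical stand-in for a tensor parameter."""

    def __init__(self, value):
        self.value = value


def leaves(obj, path=()):
    """Yield (path, leaf) pairs for every Param reachable from obj."""
    if isinstance(obj, Param):
        yield path, obj
    elif isinstance(obj, dict):
        for key, child in obj.items():
            yield from leaves(child, path + (key,))
    elif isinstance(obj, (list, tuple)):
        for index, child in enumerate(obj):
            yield from leaves(child, path + (index,))


def merge_aliased_grads(structure, grads_by_path):
    """Sum gradients over all paths that reach the same object."""
    total = {}  # id(leaf) -> summed gradient
    for path, leaf in leaves(structure):
        total[id(leaf)] = total.get(id(leaf), 0.0) + grads_by_path.get(path, 0.0)
    # Hand the merged gradient back to every path that aliases the leaf.
    return {path: total[id(leaf)] for path, leaf in leaves(structure)}


lin = Param(1.0)
model = {"lin": lin, "seq": [lin]}  # same aliasing as self.lin / self.seq[0]

# Gradients as seen by a frontend that ignores aliasing:
# forward only went through seq, so the "lin" path received nothing.
naive = {("lin",): 0.0, ("seq", 0): 0.5}
merged = merge_aliased_grads(model, naive)
```

With the merged gradients, both `("lin",)` and `("seq", 0)` carry the same value, so applying the update through either path keeps the two views of the parameter in sync.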

This is a difficult problem, and if we handle it, I believe it would be best to consider the aliasing patterns statically (by which I mean specialize graphs wrt aliasing patterns). The fact that two tensors in opposite corners of a data structure may be aliased seems particularly difficult to deal with, but maybe we can get away with only supporting a few simple patterns.
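One way to picture "specializing on aliasing patterns" is to compute a canonical signature of which leaves of a data structure are the same object, and to use that signature as part of a specialization key. The following is only a sketch under that assumption; `alias_pattern` is a hypothetical helper, not a Myia function, and it treats anything that is not a dict, list, or tuple as a leaf.

```python
def alias_pattern(obj):
    """Return a tuple with one entry per leaf, in traversal order.

    Each entry is the position of the first leaf that is the same
    object, so two structures with the same sharing get the same tuple.
    """
    first_seen = {}  # id(leaf) -> position of its first occurrence
    pattern = []

    def visit(node):
        if isinstance(node, dict):
            for key in node:
                visit(node[key])
        elif isinstance(node, (list, tuple)):
            for child in node:
                visit(child)
        else:
            # New leaves map to their own position, aliases to the first one.
            pattern.append(first_seen.setdefault(id(node), len(pattern)))

    visit(obj)
    return tuple(pattern)


# Plain object() instances stand in for tensors here; small ints or
# strings would be a poor choice because Python may intern them.
a, b, c = object(), object(), object()

aliased = alias_pattern({"lin": a, "seq": [a], "other": b})
distinct = alias_pattern({"lin": a, "seq": [c], "other": b})
```

Here `aliased` is `(0, 0, 2)` and `distinct` is `(0, 1, 2)`. A compiler could, in principle, keep one specialized graph per such pattern, which covers aliases in "opposite corners" of a structure at the cost of more specializations.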

So the question is, how do we deal with this?

breuleux mentioned this issue Aug 21, 2019

Merge gradients for aliased parameters #255

Merged
