Dealing with aliasing #239

Open
breuleux opened this issue Jul 29, 2019 · 0 comments

Myia is currently not able to handle aliased tensors in data structures. This issue can crop up in the PyTorch frontend, in code like this:

import torch

class LinearSeq(torch.nn.Module):
    def __init__(self, a, b):
        super(LinearSeq, self).__init__()
        self.lin = torch.nn.Linear(a, b)
        # self.seq[0] is self.lin: the same Linear module is reachable
        # through two different paths in the module tree.
        self.seq = torch.nn.Sequential(self.lin)

    def forward(self, x):
        return self.seq(x)
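For reference, on the PyTorch side the two paths really do lead to the same objects, so there is only one underlying parameter to update. A quick check (using the LinearSeq class above):

model = LinearSeq(3, 2)
assert model.seq[0] is model.lin                # same Module object
assert model.seq[0].weight is model.lin.weight  # same Parameter tensor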

The problem is that Myia sees both self.lin and self.seq[0], but it understands them as different parameters rather than the same one. Thus, if forward only uses self.seq, the gradient wrt self.lin is zero, and the update will be applied to seq but not to lin. Furthermore, if both seq and lin are used, the two copies will accumulate gradients separately and diverge.
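Whatever the fix, a prerequisite is detecting the aliases in the first place. A minimal sketch of how that could look on the PyTorch side, grouping by object identity the paths that reach each parameter (alias_groups is a hypothetical helper, not part of Myia or PyTorch):

import torch

def alias_groups(module):
    # Map each Parameter (by object identity) to every path that
    # reaches it; groups with more than one path are aliases.
    paths = {}
    def walk(mod, prefix):
        for name, param in mod.named_parameters(recurse=False):
            paths.setdefault(id(param), []).append(prefix + name)
        for name, child in mod.named_children():
            walk(child, prefix + name + ".")
    walk(module, "")
    return [group for group in paths.values() if len(group) > 1]

print(alias_groups(LinearSeq(3, 2)))
# [['lin.weight', 'seq.0.weight'], ['lin.bias', 'seq.0.bias']]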

This is a difficult problem, and if we handle it, I believe it would be best to consider the aliasing patterns statically (by which I mean specialize graphs wrt aliasing patterns). The fact that two tensors in opposite corners of a data structure may be aliased seems particularly difficult to deal with, but perhaps we can get away with supporting only a few simple patterns.
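To make "specialize graphs wrt aliasing patterns" concrete, one option is to reduce the pattern to a canonical, hashable key and use it alongside shapes and dtypes when deciding whether an existing specialized graph can be reused. A sketch, building on the hypothetical alias_groups helper above:

def aliasing_signature(module):
    # Canonical description of which leaf paths are aliased; two inputs
    # with the same signature could share one specialized graph.
    groups = alias_groups(module)
    return tuple(sorted(tuple(sorted(g)) for g in groups))

# LinearSeq(3, 2) and LinearSeq(5, 7) would map to the same signature:
# (('lin.bias', 'seq.0.bias'), ('lin.weight', 'seq.0.weight'))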

So the question is, how do we deal with this?
