Support view() on batch dimensions for non-contiguous tensors? · Issue #3653 · pytorch/pytorch
Closed
vadimkantorov opened this issue Nov 12, 2017 · 8 comments

@vadimkantorov (Contributor) commented Nov 12, 2017
a = torch.Tensor(16, 8, 32, 64)
a.view(-1, 32, 64)                   # works
a.transpose(-1, -2).view(-1, 64, 32) # doesn't work: a is no longer contiguous after the transpose
a.view(-1, 32, 64).transpose(-1, -2) # works but doesn't fit some interfaces, i.e. sometimes we want to do view as the last operation

Such views are needed to implement a 4D+ bmm that can treat all dimensions except the last two as batch dimensions (similar to the Linear module's behavior). Unless I move the transpose inside the bmm function (which would not match the existing interface), an extra contiguous call is needed.

Does it make sense to support such a view call? On the one hand it breaks the invariant that view always returns a contiguous tensor; on the other hand, the case of several batch dimensions may be common.

Earlier discussed in #764

@apaszke (Contributor) commented Nov 12, 2017

There's a good reason for the view invariant: most such reshapes are impossible to pull off with stride tricks if the tensor isn't contiguous. On the other hand, making it contiguous inside view would mean that the returned tensor sometimes shares storage with the input and sometimes doesn't. This is important for cases like x.view(-1)[::x.size(1) + 1] += c (add c to the diagonal of matrix x). If you know/suspect that tensors might sometimes be non-contiguous, just add .contiguous() before .view(); it's a no-op if the tensor is already contiguous.
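For reference, a minimal sketch of the storage-sharing point above; the concrete sizes are illustrative:

import torch

# The in-place diagonal update only works because view() is guaranteed to
# alias the original storage; if view() silently copied for non-contiguous
# inputs, the += would land on the copy and x would be left unchanged.
x = torch.zeros(4, 4)
c = 1.0
x.view(-1)[::x.size(1) + 1] += c   # adds c to the diagonal of x, in place
print(x)                           # c on the diagonal, zeros elsewhere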

@fmassa (Member) commented Nov 12, 2017

I think Vadim's point is that if the subspace of the non-contiguous tensor that we are modifying is contiguous, then the stride tricks should work, and we could relax some of the restrictions in view.
This is similar in spirit to the unsqueeze function, which required a dedicated implementation because of the restrictions in view.
Not sure if there are edge cases there, though; it might just be safer/better to keep everything uniform.
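For illustration, a rough sketch of that "contiguous subspace" condition in terms of sizes and strides (can_flatten_leading is a hypothetical helper, not an existing API):

import torch

# The leading dims 0..k-1 can be flattened into one dim without a copy iff
# stride[i] == stride[i+1] * size[i+1] for every adjacent pair of leading dims.
def can_flatten_leading(t, num_leading):
    return all(t.stride(i) == t.stride(i + 1) * t.size(i + 1)
               for i in range(num_leading - 1))

a = torch.Tensor(16, 8, 32, 64)
b = a.transpose(-1, -2)            # non-contiguous, shape (16, 8, 64, 32)
print(b.is_contiguous())           # False: the trailing dims are swapped
print(can_flatten_leading(b, 2))   # True: the batch dims are still mergeable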

@apaszke (Contributor) commented Nov 12, 2017

Ohhh, I hadn't noticed this, my bad. What does "works but doesn't fit some interfaces" mean? I think it should give results equivalent to the previous line (if that line worked). I think relaxing the constraint so that view works with certain classes of non-contiguous tensors would be reasonable.

@vadimkantorov (Contributor, Author) commented Nov 12, 2017

Yep, you got me right :)

By "works but doesn't fit some interfaces" I meant that bmm interface accepts BxPxQ and BxQxR, so if I'm implementing a generalized version that flattens some first dimensions, then the inputs must already be properly transposed by caller. And then there's this problem that we can't flatten first dimensions of non-contiguous tensors. In other words, extending a function like bmm requires first a call to transpose and then view, so the last line can't be used without having to change the bmm's contract.

@apaszke (Contributor) commented Nov 13, 2017

I'm not sure I understand. Can't you just call contiguous() inside your function (this doesn't change the interface), and only then flatten the leading dimensions? This will be a no-op if the user gave you a properly transposed tensor, and will fix the problem otherwise.

@vadimkantorov (Contributor, Author) commented Nov 13, 2017

That's what I currently do.

Though B will always be non-contiguous, so an unnecessary copy of B happens, since the transpose has to be applied first (instead of a single BLAS call with the right transa/transb arguments):

import torch
def bmm(A, B):
    print(A.is_contiguous(), B.is_contiguous())
    A_flat = A.contiguous().view(-1, A.size(-2), A.size(-1))
    B_flat = B.contiguous().view(-1, B.size(-2), B.size(-1))
    return torch.bmm(A_flat, B_flat).view(A.size()[:-1] + (-1,))

a = torch.Tensor(16, 8, 32, 64)
bmm(a, a.transpose(-1, -2)) # prints (True, False)

If view is augmented to support what @fmassa described better than I did, then maybe .is_contiguous(dim=[0, 1, 2]) support also makes sense, so that the caller can find out whether a tensor is contiguous in a given subspace of dimensions, along with .contiguous(dim=[0, 1, 2]) to force it to be.
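As a rough sketch of what that could look like (flatten_leading is hypothetical; the dim= arguments above don't exist in the current API), the "force it" part can be emulated with as_strided plus a fallback copy:

import torch

# Flatten the leading batch dims via a stride trick when their strides allow it,
# and fall back to an actual copy only when they don't.
def flatten_leading(t, num_leading):
    mergeable = all(t.stride(i) == t.stride(i + 1) * t.size(i + 1)
                    for i in range(num_leading - 1))
    if not mergeable:
        t = t.contiguous()
    batch = 1
    for i in range(num_leading):
        batch *= t.size(i)
    new_shape = (batch,) + tuple(t.size()[num_leading:])
    new_stride = (t.stride(num_leading - 1),) + tuple(t.stride()[num_leading:])
    return t.as_strided(new_shape, new_stride)

a = torch.Tensor(16, 8, 32, 64)
print(flatten_leading(a.transpose(-1, -2), 2).size())  # torch.Size([128, 64, 32]), no copy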

@ssnl (Collaborator) commented Dec 6, 2017

working on this

@soumith (Member) commented Dec 18, 2017

closed via #4062
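For reference, assuming a PyTorch build that includes the relaxed view semantics referenced above, the example from the original post no longer needs an extra copy, since view now accepts non-contiguous tensors as long as each new dimension only spans original dimensions satisfying stride[i] == stride[i+1] * size[i+1]:

import torch

a = torch.Tensor(16, 8, 32, 64)
b = a.transpose(-1, -2).view(-1, 64, 32)  # the batch dims merge cleanly, so this works
print(b.size())                           # torch.Size([128, 64, 32])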
