Towards modularizing the codebase in a semantically meaningful way · Issue #735 · kymatio/kymatio · GitHub
eickenberg opened this issue Apr 27, 2021 · 0 comments
Hi everybody,

For about a year now I have been mulling over how to reorganize the Kymatio code base: away from the current hard-baked design choices and towards a modular architecture that keeps the same outputs and interfaces under default settings, but lets researchers import parts of the code base instead of having to rewrite them whenever they want to do anything new.
In addition, we currently hide away several types of wavelet transforms and Fourier convolutions whose results can only be accessed through their smoothed moduli. Users should be able to access these intermediate states in an easy way.

Any modifications to the code base must respect some guiding principles.

  1. They should implement new functionality for all backends
  2. They should maintain or improve computation speeds
  3. They should maintain or improve memory requirements
  4. They should at least tend towards unifying the codebase across signal dimensionalities (though this is always hard and case-dependent: there is a reason we set up the codebase so that different signal dimensionalities can be treated with different code)

One important property to maintain through any modifications is the possibility of depth-first traversal of the scattering tree. The outputs live at the leaves of this tree and have lower memory requirements than the intermediate states. Storing all intermediate states, as breadth-first traversal would, can become prohibitive in certain settings, especially for 3D signals at high resolution. This depth-first requirement makes it less straightforward to specify wavelets and scattering layers as a stack of feed-forward modules in the spirit of torch.nn.Sequential.
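To make the depth-first requirement concrete, here is a minimal, hypothetical sketch of traversing a scattering tree with a recursive generator. The node values and the `transform` function are toy stand-ins, not Kymatio code:

```python
# Hypothetical sketch: depth-first traversal of a scattering tree via a
# recursive generator. Only one root-to-leaf path of intermediate states
# is alive at any time, unlike breadth-first traversal, which would store
# a whole level of the tree at once.

def scatter_depth_first(node, transform, max_order, order=0):
    """Yield output coefficients leaf by leaf."""
    if order == max_order:
        yield node
        return
    for child in transform(node):
        # Recurse before moving on to the next sibling: the intermediate
        # state `child` is released as soon as its subtree is exhausted.
        yield from scatter_depth_first(child, transform, max_order, order + 1)

# Toy "wavelet transform": each node spawns two modulus coefficients.
toy_transform = lambda x: [abs(x) + 1, abs(x) + 2]

leaves = list(scatter_depth_first(0.0, toy_transform, max_order=2))
```

Because the leaves are yielded one at a time, the caller can aggregate or store them without ever materializing the full set of intermediate states.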

One way of addressing this issue would be to create modules with hooks/callbacks, hooking a second-order scattering layer into a wavelet transform object.
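For illustration, a hypothetical sketch of that hook/callback alternative (not the design proposed below) might look like this; all class and function names here are invented:

```python
# Sketch of the hook/callback approach: the wavelet transform calls back
# into a hooked second-order layer for each first-order coefficient,
# instead of returning all first-order coefficients at once.

class WaveletTransform:
    def __init__(self, filters):
        self.filters = filters
        self.hooks = []

    def register_hook(self, fn):
        self.hooks.append(fn)

    def __call__(self, signal):
        outputs = []
        for f in self.filters:
            coeff = abs(signal * f)  # placeholder modulus coefficient
            for hook in self.hooks:
                outputs.extend(hook(coeff))  # second order, depth-first
            outputs.append(coeff)            # first-order output
        return outputs

# A toy second-order layer hooked into the first-order transform.
second_order = lambda u: [abs(u * f) for f in (0.5, 0.25)]
wt = WaveletTransform([1.0, -1.0])
wt.register_hook(second_order)
result = wt(2.0)
```

This works, but it inverts control flow: the consumer's logic has to be pushed into the producer, which is exactly what generator pipelines avoid.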

Fortunately, Python comes to our rescue with a more elegant solution: iterator/generator pipelines.
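The pipeline idea can be sketched in a few lines. This is a toy illustration under assumed names, not the API in the linked repo: each stage is a generator that lazily consumes the previous one, so traversal is naturally depth-first and only one coefficient is in flight at a time.

```python
# Toy generator pipeline: convolutions feed a wavelet transform, which
# feeds the scattering transform. Each stage pulls one item at a time
# from the stage below it.

def convolutions(signal, filters):
    for f in filters:
        yield signal * f          # placeholder for a Fourier-domain product

def wavelet_transform(signal, filters):
    for coeff in convolutions(signal, filters):
        yield abs(coeff)          # complex modulus in the real setting

def scattering(signal, filters):
    for first_order in wavelet_transform(signal, filters):
        yield first_order         # first-order output
        for second_order in wavelet_transform(first_order, filters):
            yield second_order    # second-order output

out = list(scattering(2.0, [1.0, -1.0]))
```

Note that the second-order loop runs to completion before the next first-order coefficient is computed, which is precisely the depth-first ordering discussed above.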

Modularizing Kymatio with generator pipelines

I see the following semantic split of kymatio functionalities:

  1. (Convolutions)
  2. Wavelet Transforms
  3. Scattering Transforms

Scattering uses wavelets, and we have implemented several types of them; making wavelets first-class objects would expose them to the user. Wavelet transforms are implemented as convolutions with very specific filters. Sometimes it is useful to use a generic Fourier convolution to implement a wavelet transform, but sometimes there are more efficient ways of computing one.
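As a minimal illustration of "wavelet transform = convolution with specific filters", here is a circular convolution done as a pointwise product in the Fourier domain, with a toy filter bank (numpy is assumed; the function names are illustrative):

```python
import numpy as np

def fourier_conv(signal, filt_hat):
    """Circular convolution with a filter given in the frequency domain."""
    return np.fft.ifft(np.fft.fft(signal) * filt_hat)

def wavelet_coeffs(signal, filter_bank_hat):
    # A wavelet transform is just this convolution repeated over a bank
    # of filters; a generic routine like this works for any filter bank,
    # while specific wavelets may admit faster dedicated algorithms.
    for filt_hat in filter_bank_hat:
        yield fourier_conv(signal, filt_hat)

x = np.array([1.0, 0.0, 0.0, 0.0])             # unit impulse
bank = [np.ones(4), np.array([0.0, 1.0, 0.0, 1.0])]  # toy frequency responses
coeffs = list(wavelet_coeffs(x, bank))
```

The impulse input makes each output simply the filter's impulse response, which is a convenient sanity check for any such implementation.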

The three levels above reflect this setting. Based on it, I wrote a code base, currently in PyTorch only, to implement this idea and to stress-test it on large 3D scattering settings. So far it has held up. I have put the relevant bits in a repo at https://github.com/eickenberg/scattering_iterators for people to check out. Large chunks of it are well documented, and most functions are tested. They should work across all dimensionalities, at least for the basic scattering transform (though I haven't tested 1D).

In order to integrate these ideas into kymatio I propose the following procedure:

TODO

  • Integrate the convolution iterator https://github.com/eickenberg/scattering_iterators/blob/main/convolution_iterator.py into the codebase. This entails writing it for the other backends, and then replacing the loops over filters in the scattering functions with loops over these iterators
  • Integrate the wavelet transform iterator into the codebase by first extending it to all backends, then replacing the loop over convolution iterator with a loop over the wavelet transform iterator
  • Integrate the scattering iterator as above

At the second juncture it will be useful to add several different procedures for wavelet transforms (such as Haar wavelets, as a test case for wavelets computed by differences and subsampling) to see how the system handles them.
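The Haar test case mentioned above is attractive precisely because it needs no Fourier convolution at all; a single level can be computed with pairwise differences and subsampling. A hypothetical sketch (pure Python, illustrative normalization):

```python
# One Haar level computed by differences and subsampling: pairwise
# averages give the lowpass (approximation) band and pairwise
# differences give the highpass (detail) band, each subsampled by two.

def haar_step(signal):
    evens, odds = signal[0::2], signal[1::2]
    approx = [(a + b) / 2 for a, b in zip(evens, odds)]
    detail = [(a - b) / 2 for a, b in zip(evens, odds)]
    return approx, detail

x = [4.0, 2.0, 5.0, 5.0]
approx, detail = haar_step(x)
```

A wavelet-transform iterator that accepts such a procedure, alongside the generic Fourier one, would be a good test of whether the abstraction is genuinely implementation-agnostic.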

I will eventually get to implementing these ideas, but I am also happy to guide anybody who wants to take a stab at it and review any PRs.
