examples/1d/classif_keras.py is misleading #1040
Comments
Hi, sorry for taking so long to respond. I'd like to look into this issue a bit more. Thanks for the information and details on the spoken digits classification. Can you give me more details about the TensorFlow stack trace? How did you get this stack trace? Why is the computation graph so large? What is your suggestion for solving this, just in the case of the 1D TensorFlow implementation?
Hi @MuawizChaudhary, thanks for looking into it. There are in fact multiple issues. You can immediately look at the incorrect selection from the MNIST Digits: only one digit, by a few speakers, is ever used. I have a hunch that someone added

As for TensorFlow, you may simply save the model with the

Generally, at runtime, tensors don't exist as instances of anything, except constants and TensorArray, Dataset, and similar N-dim objects, which are designed to contain data; a regular tensor is a logical abstraction describing the input/output shapes of an op. Ops are graph vertices, and are real sets of pre-written kernels optimized for different devices, but tensors are edges which only convey shapes. In a TF model, "reallocating" or "extending" a tensor does not even have a sensible definition; changing a shape mid-graph may have graph-wide repercussions. As a simplified example, augmenting a list (assuming it's possible, for the sake of the example) in a fixed-bound loop with a final size of, say, 100 consists of 100 vertices, each concatenating an element to the end of a fixed

As you probably remember, there are 2 nested fixed loops, over layers and then over filter scales. Tracing these loops is possible, as in a regular-language automaton (no Python code is executed twice during tracing, so it is always finite), so by the end all iteration counts, all filters etc. are known and each hardcoded, at the price of a massive, extremely slow, non-parallelizable graph. The whole body of the inner loop is traced and emitted into the graph anew.

I shall get back to you in about two weeks; I'm near the finish line of quite a project. We'll be able to discuss the issue then. At first, I removed all but NumPy and TF support and kept only the 1D "classic" (non-JTF) code, then reworked it for TensorFlow compatibility, which required storing additional precomputed constants, including the index of the output coefficient that will eventually be stored: there is no memory reallocation at runtime, or at the least it's very expensive. What worries me is that every framework has its own quirks, and it may be an excruciatingly hard problem to implement the 1D, 2D, and 3D transforms over multiple backends in an idiomatic and efficient way. I gave up in the end and implemented the transform in C++, as I was not happy with its performance, although I had already improved it possibly 50-fold. The highly branching tree with a variable number of branches is not an easy data shape to build efficient algorithms around, unfortunately. The DFS implemented in Kymatio, unlike the BFS in Andén's original 2014 Matlab code, is also the worst, fully sequential strategy for modern CPUs and especially GPUs.

I'm sorry, I won't have time in the next two weeks (let's make it three, just in case), but I believe the general architecture of the project is definitely worth discussing!
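For illustration, a minimal sketch of the unrolling behavior described above, using `tf.concat` for the list-augmentation example (the loop bound of 100 is the illustrative figure from the comment, not a number from Kymatio):

```python
import tensorflow as tf

@tf.function
def build_by_concat(x):
    # The Python loop runs exactly once, at trace time; each iteration
    # emits its own ConcatV2 vertex into the graph.
    out = tf.zeros([0], dtype=x.dtype)
    for _ in range(100):
        out = tf.concat([out, x], axis=0)
    return out

g = build_by_concat.get_concrete_function(
    tf.TensorSpec([1], tf.float32)).graph
print(sum(op.type == "ConcatV2" for op in g.get_operations()))  # -> 100
```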
Hi @kkm000, thanks for the detailed reply. Indeed, there are two separate issues to be resolved here: (1) the error in the example, and (2) the inefficiency of the implementation in graph form.

On 1: I'll try to see if I or someone else has time in the next couple of weeks to resolve it. I'm not experienced with 1D audio or scattering, so preferably someone else would provide a fix.

On 2: yes, I agree this is an issue that will take much time to resolve. Whenever you are done with your deadline, let's discuss more. #735 should be where we discuss this issue. I'll ping you in about a month from now.

EDIT: Good luck with your deadline!
kymatio/examples/1d/classif_keras.py, lines 82–83 at 2e190d9
The dataset is arranged as digit{0..9}, then speaker{1..6}, then utterance{1..50}; i.e., in the first 50 records speaker 1 calls out 'zero', in the next 50 speaker 2 says 'zero', etc. Since there are 6 speakers, 'one' first occurs at record 300. The first 200 records therefore represent 4 speakers (out of 6), all pronouncing 'zero'.
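A minimal sketch of that ordering (labels only; the actual records are audio):

```python
# Record order as described: digit, then speaker, then utterance.
labels = [digit
          for digit in range(10)       # digits 0..9
          for speaker in range(6)      # 6 speakers per digit
          for utterance in range(50)]  # 50 utterances per speaker

assert set(labels[:200]) == {0}  # a 200-record prefix is all 'zero'
assert labels.index(1) == 300    # 'one' first appears at record 300
```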
kymatio/examples/1d/classif_keras.py, lines 5–7 at 2e190d9
kymatio/examples/1d/classif_keras.py, line 168 at 2e190d9
It's a bit hard to see how this is a classification problem. The description reads, and the network structure suggests, as if 10 digits were being classified, not that the model is trained, validated, and evaluated on a single label out of 10.
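A hedged sketch of one possible fix, where `x` and `y` are placeholders for all 3,000 recordings and their digit labels: shuffle and stratify over the whole dataset instead of slicing a contiguous (and therefore single-digit) prefix.

```python
from sklearn.model_selection import train_test_split

# x, y are assumed: all 3000 recordings and their digit labels.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, stratify=y, random_state=0)
```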
kymatio/examples/1d/classif_keras.py, lines 14–16 at 2e190d9
An accuracy of 100% (the actual number on the held-out set after either 50 or 35 epochs) should always be suspicious.
There are extremely serious problems with the TensorFlow implementation (I cannot say anything about the other frameworks), which make Kymatio non-productizable right from the start, and likely slower than a proper implementation should be. The serialized computation graph generated from the same example's model is ≈200 MB in size and consists of 337 identical traces, one per nested-loop iteration in scattering1d.py.
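One way to observe the bloat, as a sketch: here `model` and `T` are placeholders for the Keras model built in classif_keras.py and its input length; the traced graph is inspected without saving it.

```python
import tensorflow as tf

# model, T are assumed: the example's Keras model and its input length.
concrete = tf.function(model).get_concrete_function(
    tf.TensorSpec([1, T], tf.float32))
gd = concrete.graph.as_graph_def()
print(f"{len(gd.node)} nodes, "
      f"{len(gd.SerializeToString()) / 2**20:.0f} MiB serialized")
```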
kymatio/README.md, line 40 at 2e190d9
Having all loops unrolled into a 200 MB executable graph hardly qualifies as "memory-efficient" computation, but that's just my opinion. In the eye of this beholder, the README's tenor sets expectations unreasonably high.
I'll share more detail indeed when I'm finished tearing it apart to get a working TensorFlow 1D implementation of the "classic" Andén et al. 2014 scattering transform; give me a few weeks. All I can say now is that the structure of the code is so Byzantine and, unfortunately, assumes nothing at all about TensorFlow, such that I, personally, wouldn't dare attempt to salvage the codebase and then take it up to its declared scope (i.e., the {1,2,3}D cases with multiple frameworks). Redesigning it would require a much more intimate knowledge of all supported frameworks, which I simply don't have.