smaller model runs slower than a larger one when compiled for edgetpu · Issue #50951 · tensorflow/tensorflow · GitHub

smaller model runs slower than a larger one when compiled for edgetpu #50951


Closed
Drulludanni opened this issue Jul 26, 2021 · 7 comments
Assignees
Labels
comp:lite TF Lite related issues comp:micro Related to TensorFlow Lite Microcontrollers stale This label marks the issue/pr stale - to be closed automatically if no activity stat:awaiting response Status - Awaiting response from author TF 2.5 Issues related to TF 2.5 type:performance Performance Issue

Comments

@Drulludanni

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: coral usb edgetpu
  • TensorFlow version (use command below): v1.12.1-49562-gee58e600bfc 2.5.0-dev20210125
  • Python version: 3.7

Describe the current behavior
So I have two models (U-Nets) that are nearly identical, except that one of them uses fewer filters in some of the convolutional layers, which makes that network strictly smaller. When running the tflite versions of the models, the smaller one is indeed faster than the larger one; however, when compiled for and run on the edgetpu, the smaller network runs slower than the larger one.
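
For reference, a rough hypothetical sketch (not the actual models, which are in the linked folder) of two U-Net-style networks that differ only in the number of filters in some convolutional layers, assuming the 256x256x3 input used in the benchmark below; build_unet and base_filters are made-up names for illustration:

from tensorflow import keras

def build_unet(base_filters):
    # Hypothetical helper: a tiny U-Net-like model with one down/up-sampling step.
    inputs = keras.Input(shape=(256, 256, 3))
    c1 = keras.layers.Conv2D(base_filters, 3, padding="same", activation="relu")(inputs)
    p1 = keras.layers.MaxPooling2D()(c1)
    c2 = keras.layers.Conv2D(base_filters * 2, 3, padding="same", activation="relu")(p1)
    u1 = keras.layers.UpSampling2D()(c2)
    u1 = keras.layers.Concatenate()([u1, c1])  # skip connection from the encoder
    outputs = keras.layers.Conv2D(3, 1, activation="sigmoid")(u1)
    return keras.Model(inputs, outputs)

large_model = build_unet(base_filters=32)  # "large" variant
small_model = build_unet(base_filters=16)  # "small" variant: strictly fewer filters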

Describe the expected behavior
The performance gain from the smaller model seen with tflite should also show up on the edgetpu.

Standalone code to reproduce the issue
https://drive.google.com/drive/folders/1-u9GpNwRdbCAxtaMuAdDZazMWqeMIt_n?usp=sharing

Other info / logs
I already made an issue on the Google Coral edgetpu page, seen here; they said the issue was with interpreter.invoke() in the script ../lib/python3.8/site-packages/tflite_runtime/interpreter.py and that I should contact the TensorFlow team.

@Saduf2019
Contributor

@Drulludanni
We are unable to open the files in the shared drive. Could you reproduce the performance numbers in a Colab gist and share it for us to analyse?

@Saduf2019 Saduf2019 added the stat:awaiting response Status - Awaiting response from author label Jul 26, 2021
@Drulludanni
Author

That is very weird; the folder is shared with anyone with the link (if I open the link in incognito I can still view all the files).

But here is the code:

from pycoral.utils import edgetpu
import numpy as np
import time

models = ['large.tflite', 'small.tflite', 'large_edgetpu.tflite', 'small_edgetpu.tflite']

for model_path in models:
    print(model_path)
    interpreter = edgetpu.make_interpreter(model_path)
    interpreter.allocate_tensors()
    input_index = interpreter.get_input_details()[0]['index']
    output_index = interpreter.get_output_details()[0]['index']

    n_trials = 10
    total = 0
    x = np.zeros((1, 256, 256, 3), dtype=np.uint8)

    # first call is usually slower so we skip it (warm-up run)
    interpreter.set_tensor(input_index, x)
    interpreter.invoke()
    pred = interpreter.get_tensor(output_index)

    # time set_tensor + invoke + get_tensor for each trial
    for i in range(n_trials):
        t = time.perf_counter()
        interpreter.set_tensor(input_index, x)
        interpreter.invoke()
        pred = interpreter.get_tensor(output_index)
        delta = time.perf_counter() - t
        print(delta)
        total += delta

    print("inference time:", total / n_trials)

and this is the output:

large.tflite
8.4517966
8.4927569
8.474044400000004
8.477497300000003
8.472683700000005
8.491142599999996
8.470729400000003
8.475646299999994
8.474072800000002
8.462358199999997
inference time: 8.47427282
small.tflite
5.282427299999995
5.293356000000003
5.279340199999993
5.272431900000001
5.2895331
5.280650100000003
5.273774800000012
5.278881600000005
5.27215240000001
5.2793834
inference time: 5.280193080000002
large_edgetpu.tflite
0.01768240000001242
0.017120800000014924
0.016798199999982444
0.016576700000001665
0.016357799999980216
0.016318699999999353
0.016506899999995994
0.01634260000000154
0.01632789999999318
0.016752200000013318
inference time: 0.016678419999999507
small_edgetpu.tflite
0.021253099999995584
0.021638700000011113
0.02142359999999144
0.020244700000006333
0.01973069999999666
0.019953200000003335
0.019520999999997457
0.01960610000000429
0.02138159999998379
0.02115660000001185
inference time: 0.020590930000000184

I tried to make a Google Colab to run the code, but I have no idea how to make it run, since an edgetpu is required and I don't know whether it is possible to somehow make a virtual one in Google Colab. Here it is anyway: https://colab.research.google.com/drive/1YipG-DUlg0MzGOHV_y4_zadd38YI3wlz?usp=sharing and it should also include the models from my test if you want to download them and use them locally.
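
For what it's worth, the plain .tflite models can at least be timed on the Colab CPU with the stock TFLite interpreter; only the *_edgetpu.tflite files need the physical accelerator. A minimal sketch, assuming large.tflite and small.tflite have been downloaded into the working directory:

import time
import numpy as np
import tensorflow as tf

for model_path in ["large.tflite", "small.tflite"]:
    interpreter = tf.lite.Interpreter(model_path=model_path)
    interpreter.allocate_tensors()
    input_index = interpreter.get_input_details()[0]["index"]
    output_index = interpreter.get_output_details()[0]["index"]

    x = np.zeros((1, 256, 256, 3), dtype=np.uint8)
    interpreter.set_tensor(input_index, x)
    interpreter.invoke()  # warm-up, first call is usually slower

    n_trials = 10
    start = time.perf_counter()
    for _ in range(n_trials):
        interpreter.set_tensor(input_index, x)
        interpreter.invoke()
        _ = interpreter.get_tensor(output_index)
    print(model_path, "average:", (time.perf_counter() - start) / n_trials)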

@Saduf2019
Contributor

@Drulludanni

Could you please refer to these links: link, link1, and let us know.

@Drulludanni
Author

Neither of those links is helpful. The reason the code won't run is that there is no edgetpu connected to the Colab, and that is the problem: I don't know how either to attach an edgetpu to the Colab or to fake one with some kind of edgetpu emulation. As far as I'm aware nobody has done or tried that, which is why I don't think I can ever make the Google Colab work for my problem.

@Saduf2019 Saduf2019 added comp:tpus tpu, tpuestimator TF 2.5 Issues related to TF 2.5 and removed stat:awaiting response Status - Awaiting response from author labels Jul 29, 2021
@Saduf2019 Saduf2019 assigned ymodak and unassigned Saduf2019 Jul 29, 2021
@ymodak ymodak added comp:lite TF Lite related issues stat:awaiting tensorflower Status - Awaiting response from tensorflower and removed comp:tpus tpu, tpuestimator labels Jul 30, 2021
@ymodak ymodak assigned petewarden and unassigned ymodak Jul 30, 2021
@mohantym mohantym self-assigned this Jul 5, 2022
@mohantym mohantym removed their assignment Aug 24, 2022
@mohantym mohantym added the comp:micro Related to TensorFlow Lite Microcontrollers label Oct 6, 2022
@mohantym
Contributor
mohantym commented Oct 6, 2022

Hi @Drulludanni !
We are checking to see whether you still need help with this issue.
You can now check for quantization issues in the above models with the quantization debugger (a rough sketch follows below).

There might be some operations that are not leveraging the accelerator on your Edge TPU. You can find those operations using the flag below.
tf.lite.experimental.Analyzer.analyze(model_content=fb_model, gpu_compatibility=True)
Ref.
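
A minimal sketch of running the quantization debugger, assuming the original float model and a small calibration dataset are still available (the path and dataset below are placeholders, not part of this issue):

import tensorflow as tf

def representative_dataset():
    # Placeholder calibration data matching the 256x256x3 input.
    for _ in range(8):
        yield [tf.random.uniform((1, 256, 256, 3), 0.0, 1.0, dtype=tf.float32)]

converter = tf.lite.TFLiteConverter.from_saved_model("path/to/float_unet")  # placeholder path
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset

debugger = tf.lite.experimental.QuantizationDebugger(
    converter=converter, debug_dataset=representative_dataset)
debugger.run()
with open("layer_stats.csv", "w") as f:
    debugger.layer_statistics_dump(f)  # per-layer quantization error statistics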

Thank you!

@mohantym mohantym added stat:awaiting response Status - Awaiting response from author and removed stat:awaiting tensorflower Status - Awaiting response from tensorflower labels Oct 6, 2022
@mohantym mohantym self-assigned this Oct 6, 2022
@google-ml-butler

This issue has been automatically marked as stale because it has no recent activity. It will be closed if no further activity occurs. Thank you.

@google-ml-butler google-ml-butler bot added the stale This label marks the issue/pr stale - to be closed automatically if no activity label Oct 14, 2022
@google-ml-butler

Closing as stale. Please reopen if you'd like to work on this further.
