8000 BUG retry by Ixiodor · Pull Request #6137 · celery/celery · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

BUG retry #6137

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from
Closed

BUG retry #6137

wants to merge 4 commits into from

Conversation

Ixiodor
Copy link
Contributor
@Ixiodor Ixiodor commented Jun 1, 2020

Bug solve for retry?

Note: Before submitting this pull request, please review our contributing
guidelines
.

Description

Bug solve for retry?
@Ixiodor Ixiodor mentioned this pull request Jun 1, 2020
18 tasks
comment indent
@Ixiodor
Copy link
Contributor Author
Ixiodor commented Jun 1, 2020

result after the fix, to be comparated to the one in issure report

[2020-06-01 14:29:54,698: INFO/MainProcess] Connected to amqp://remote_worker:**@127.0.0.1:5672//
[2020-06-01 14:29:54,736: INFO/MainProcess] mingle: searching for neighbors
[2020-06-01 14:29:55,786: INFO/MainProcess] mingle: all alone
[2020-06-01 14:29:55,807: WARNING/MainProcess] /home/ubuntu/.local/lib/python3.7/site-packages/celery/fixups/django.py:203: UserWarning: Using settings.DEBUG leads to a memory
leak, never use this setting in production environments!
leak, never use this setting in production environments!''')
[2020-06-01 14:29:55,808: INFO/MainProcess] celery@testroom ready.
[2020-06-01 14:30:02,428: INFO/MainProcess] Received task: api_v3.tests.execute[a56bdab9-401a-4cc8-9801-dc8ed3dae764]
[2020-06-01 14:30:02,788: WARNING/ForkPoolWorker-1] started
[2020-06-01 14:30:02,789: WARNING/ForkPoolWorker-1] retry
[2020-06-01 14:30:02,820: INFO/MainProcess] Received task: api_v3.tests.execute[a56bdab9-401a-4cc8-9801-dc8ed3dae764] ETA:[2020-06-01 14:30:12.789893+00:00]
[2020-06-01 14:30:02,880: INFO/ForkPoolWorker-1] Task api_v3.tests.execute[a56bdab9-401a-4cc8-9801-dc8ed3dae764] succeeded in 0.45013424299941107s: None
[2020-06-01 14:30:15,056: WARNING/ForkPoolWorker-1] started
[2020-06-01 14:30:15,056: WARNING/ForkPoolWorker-1] ended
[2020-06-01 14:30:15,116: INFO/ForkPoolWorker-1] Task api_v3.tests.execute[a56bdab9-401a-4cc8-9801-dc8ed3dae764] succeeded in 0.35210886299864796s: None

why someone removed this???
Copy link
Member
@auvipy auvipy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the PR. it needs to update the tests as well :)

@@ -714,6 +714,8 @@ def retry(self, args=None, kwargs=None, exc=None, throw=True,
if is_eager:
# if task was executed eagerly using apply(),
# then the retry must also be executed eagerly in apply method
# This is needed for eaged
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this need tests as well

Copy link
Contributor Author
@Ixiodor Ixiodor Jun 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this one got removed from someone that didn't checked that eager wasn't working...i just re-introduced(Actually was bugged cause of .get() but the correct version is that without .get() )

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for that reason, a unit test is needed to verify this change

Copy link
Contributor Author
@Ixiodor Ixiodor Jun 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't wanna discuss, thanks for your work. But where is the unit test for that new introduced BUG? I updated the 4.4.3 and now i have a new BUG. (The one i fixed here re-adding the apply ).
However I'm not much familiar with Celery unit test, I'll try to get in touch with that but it's better wait someone more familiar.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you share how you do retry your tasks?
for eager retries, you may either use:
raise self.retry(...)
or
return self.retry(..., throw=False)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, it works with both raise/return throw True/False.
#6138

Copy link
Contributor Author
@Ixiodor Ixiodor Jun 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can find in issure report the whole code to test the BUG #6135
Even using raise/return you will get an Exception on "to be retried task" and an execution on "to be retried but changed" task.
This till max_retries hit.

Copy link
@PabloCastellano PabloCastellano Jun 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm having the same issue (openwisp/openwisp-firmware-upgrader#78). A minor release shouldn't contain a change like this IMO. Or at least the release notes should have a section about the migration path

@lgtm-com
Copy link
lgtm-com bot commented Jun 1, 2020

This pull request introduces 1 alert and fixes 1 when merging 67798f4 into 52aef4b - view on LGTM.com

new alerts:

  • 1 for Non-exception in 'except' clause

fixed alerts:

  • 1 for Module is imported with 'import' and 'import from'

@@ -714,6 +714,8 @@ def retry(self, args=None, kwargs=None, exc=None, throw=True,
if is_eager:
# if task was executed eagerly using apply(),
# then the retry must also be executed eagerly in apply method
# This is needed for eaged
S.apply()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@auvipy this is the fixed regression

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you also check the related unit test?

Copy link
Contributor Author
@Ixiodor Ixiodor Jun 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My problem is that i dosn't test things directly on unit test/git solution, but debugging my "work code" entering in celery. I have to setup the env to test/make unit tests

@Ixiodor
Copy link
8000
Contributor Author
Ixiodor commented Jun 1, 2020

do you think it's possibile to release an "urgent hotfix" version? Cause this totally broke the eager and patching retry behaviour could be good too :/
We can connect to try to get an unit test cause requires me too much time to get in

@mchataigner
Copy link
Contributor

If you have urgent production issue I suggest you to pin celery version to 4.4.2 until this gets fixed.
By the way, your patch is breaking tests for eager retry.

@Ixiodor
Copy link
Contributor Author
Ixiodor commented Jun 1, 2020

4.4.2 has that BUG too...I'll try to back 4.4.1, but there was a BUG i fixed in the past (result_extend) and i have to check when.
About the patch. Even the only code where i add the:

except Retry:
    pass

?

@Ixiodor
Copy link
Contributor Author
Ixiodor commented Jun 1, 2020

Ok, we can revert the changes on task.py seems.
4.4.1 is affected to the same BUG too, I'll manually patch for now.
(Trying to rever the commit, i commited to delete the whole file, well, isn't what i wanted)

Seems not needed with 4.4.3
@Ixiodor
Copy link
Contributor Author
Ixiodor commented Jun 1, 2020

@mchataigner sorry, I'm doing things rapidly and I'm doing wrong things.
Eager dosn't work as i said.
"Origianl task" get executed but the "new one" to retry with new parameters dosn't get executed in eager. Get executed in eager=false but with the BUG(same BUG happens in eager before your changes).
As i said before, your tests succeed for lucky reasons

HELLOOOOO
started
retry
HELLOOOOO
started
retry
HELLOOOOO
started
retry
HELLOOOOO
started
retry
BYEEEEE

This is what i get from eager with my test code, as you can see, there is not print("ended")
This is what i get with my fix on 'tasks.py':

HELLOOOOO
started
retry
HELLOOOOO
started
ended
BYEEEEE
HELLOOOOO
started
retry
HELLOOOOO
started
ended
BYEEEEE
HELLOOOOO
started
retry
HELLOOOOO
started
ended
BYEEEEE
HELLOOOOO
started
retry
BYEEEEE
HELLOOOOO
started
retry
BYEEEEE
HELLOOOOO
started
retry
HELLOOOOO
started
ended
BYEEEEE
HELLOOOOO
started
retry
BYEEEEE
HELLOOOOO
started
retry
BYEEEEE
HELLOOOOO
started
retry
HELLOOOOO
started
ended
BYEEEEE
HELLOOOOO
started
retry
HELLOOOOO
started
ended
BYEEEEE
HELLOOOOO
started
retry
BYEEEEE
HELLOOOOO
started
retry
BYEEEEE
HELLOOOOO
started
retry
HELLOOOOO
started
ended
BYEEEEE
HELLOOOOO
started
retry
BYEEEEE
BYEEEEE
HELLOOOOO
started
retry
BYEEEEE

Full fix:

HELLOOOOO
started
retry
HELLOOOOO
started
ended
BYEEEEE

I'll wait your updates

@lgtm-com
Copy link
lgtm-com bot commented Jun 1, 2020

This pull request introduces 1 alert and fixes 4 when merging 45b0fb4 into 52aef4b - view on LGTM.com

new alerts:

  • 1 for Non-exception in 'except' clause

fixed alerts:

  • 4 for Module is imported with 'import' and 'import from'

@mchataigner
Copy link
Contributor

Could you share the skeleton of your task? it would help to understand your problem.

@Ixiodor
Copy link
Contributor Author
Ixiodor commented Jun 1, 2020
@app_cluster.task(bind=True, autoretry_for=(Exception,), max_retries=3,
                  default_retry_delay=10)
def execute(self, param_a, param_b=None, **kwargs):
    print("started")
    if param_b is None:
        param_b = "filled"
        print("retry")
        raise self.retry(exc=Exception("i have filled now"), args=[param_a, param_b], kwargs=kwargs)
    print("ended")

def test_celery(self):
    sig = execute.si("something")
    t = sig.delay()
    t = 0

run this in eager=true in 4.4.3 (can we talk on a faster platform?)
I can share with something like teamviewer and try to follow the flow

@mchataigner
Copy link
Contributor

Ok, I'm able to reproduce locally. I will update you asap with a patch.

@Ixiodor
Copy link
Contributor Author
Ixiodor commented Jun 1, 2020

Did you even noticed the split of tasks?
Thanks

@mchataigner
Copy link
Contributor

So, it seems that putting autoretry_for with Exception is too broad as Celery is managing some states with Exceptions.
But then, there is the autoretry_for mechanism that comes in play. It catches the Retry exception (the one with the correct parameters) and then messes with your new args and throw another Retry exception.
So quick fix for you, specify a concrete Exception in autoretry_for. Actually, you don't need autoretry_for in your case as you are calling self.retry directly, and putting and nested Exception in retry is optional.

So, yes, there is a bug in autoretry_for, not related to my patch concerning eager retry.
But, good news, there is a workaround:

@app_cluster.task(bind=True, max_retries=3,
                  default_retry_delay=10)
def execute(self, param_a, param_b=None, **kwargs):
    print("started")
    if param_b is None:
        param_b = "filled"
        print("retry")
        raise self.retry(args=[param_a, param_b], kwargs=kwargs)
    print("ended")

def test_celery(self):
    sig = execute.si("something")
    t = sig.delay()
    t = 0

@Ixiodor
Copy link
Contributor Author
Ixiodor commented Jun 1, 2020

Well no...this was an example code. I need to autoretry for all Exception(General) in my real code.
And the BUG about the code you introduced is inside eager=true, the second task, the one with new parameters, dosn't get executed, only the first one gets executed over and over again.
You can't just catch Retry exception before "autotry" exceptions as i did?
base.py

except Retry:
  pass

@mchataigner
Copy link
Contributor

Yes, that's what I pushed here #6138
But if you're in a hurry, I would suggest not using autoretry_for and explicitly put a try except around your code:

@app_cluster.task(bind=True, max_retries=3,
                  default_retry_delay=10)
def execute(self, param_a, param_b=None, **kwargs
    print("started")
    try:
        if param_b is None:
            param_b = "filled"
            print("retry")
            raise self.retry(args=[param_a, param_b], kwargs=kwargs)
    except Retry:
        raise
    except Exception as exc:
        raise self.retry(exc=exc, args=[param_a, param_b], kwargs=kwargs)
    print("ended")

def test_celery(self):
    sig = execute.si("something")
    t = sig.delay()
    t = 0

autoretry_for is made as syntactic suggar for simple code.

@Ixiodor
Copy link
Contributor Author
Ixiodor commented Jun 1, 2020

Oh, ok. I got it now. Thanks.
I'll check if it's all fine for now.

@Ixiodor
Copy link
Contributor Author
Ixiodor commented Jun 1, 2020

SEEMS to fix #6132 too

@auvipy auvipy added this to the 4.4.x milestone Jun 1, 2020
@auvipy
Copy link
Member
auvipy commented Jun 1, 2020

thanks for raising this! should be fixed by another PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants
0