8000 Optimize all_equal recipe by bbayles · Pull Request #899 · more-itertools/more-itertools · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Optimize all_equal recipe #899

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Aug 16, 2024
Merged

Optimize all_equal recipe #899

merged 2 commits into from
Aug 16, 2024

Conversation

bbayles
Copy link
Collaborator
@bbayles bbayles commented Aug 15, 2024

This PR updates all_equal with @rhettinger's new version.

Closes #896

@pochmann3
Copy link
Contributor
pochmann3 commented Aug 15, 2024

Did you overlook my benchmark where this was slower than the current version for the case of one group? (Which is the most relevant one for the mentioned usage un iequals.) This does not look like an optimization to me.

@bbayles
Copy link
Collaborator Author
bbayles commented Aug 15, 2024

Thanks for the comment. To be totally transparent, I am going to favor Raymond's suggestions in almost every case – this library is a wrapper around his work, and unless I have a big disagreement with one of his ideas, I'm going to implement them. You have a good track record of contributing to this library as well, and as such I am very open to your suggestion too. That said, I generally discount microbenchmarks on a single machine with un-representative inputs, which is what appears to be the basis for the "this is slower" claim.

@rhettinger
Copy link
Contributor

Pochmann later posted a for/for variant that is much better than his other proposals. After changing the variable name from groups to iterator, I'm happy with it:

def all_equal(iterable, key=None):
    iterator = groupby(iterable, key)
    for first in iterator:
        for second in iterator:
            return False
    return True

I don't think we can do better than this. It is short and relies only on the most basic of Python features. It is a little less straight-forward than the try/except variant but only has a small and tolerable bit of weirdness. This is the one I recommend.

@pochmann3
Copy link
Contributor

I generally discount microbenchmarks on a single machine with un-representative inputs, which is what appears to be the basis for the "this is slower" claim.

As shown in the issue, I also reproduced my findings on another machine. And reproduced Raymond's disagreeing findings on the same machine when measuring his command line way without a function. I still don't know why measuring that way gives such very different results, but that way isn't the realistic way. The real all_equal is a function.

About "un-representative inputs": They do represent all cases: No groups, one group, more than one group. And all solutions under consideration use groupby. So those inputs are fully representative for the differences of the solutions.

I also like the normal version of my "gross" hack:

def all_equal(iterable, key=None):
    seen_one = False
    for _ in groupby(iterable, key):
        if seen_one:
            return False
        seen_one = True
    return True

It's similarly fast:

iterable = ()
True  137 ± 0.3 ns  Stefan_hack
True  140 ± 0.3 ns  Stefan_seen
True  147 ± 0.7 ns  Stefan_loops
True  306 ± 0.9 ns  current
True  356 ± 0.8 ns  Raymond

iterable = (1,)
True  205 ± 0.3 ns  Stefan_hack
True  208 ± 0.7 ns  Stefan_seen
True  221 ± 0.8 ns  Stefan_loops
True  355 ± 1.3 ns  current
True  435 ± 1.0 ns  Raymond

iterable = (1, 2)
False  251 ± 0.4 ns  Stefan_loops
False  252 ± 0.9 ns  Stefan_seen
False  255 ± 0.8 ns  Stefan_hack
False  329 ± 1.1 ns  Raymond
False  396 ± 0.9 ns  current

Python: 3.13.0rc1+ (heads/3.13:009f9efe658, Aug 15 2024, 14:30:32) [Clang 18.1.6 (Fedora 18.1.6-3.fc40)]
Benchmark script
def current(iterable, key=None):
    return len(list(islice(groupby(iterable, key), 2))) <= 1

def Raymond(iterable, key=None):
    iterator = groupby(iterable, key=key)
    try:
        next(iterator)
        next(iterator)
    except StopIteration:
        return True
    return False

def Stefan_hack(iterable, key=None):
    for _ in groupby(iterable, key):
        if iterable is None:
            return False
        iterable = None
    return True

def Stefan_loops(iterable, key=None):
    iterator = groupby(iterable, key)
    for first in iterator:
        for second in iterator:
            return False
    return True

def Stefan_seen(iterable, key=None):
    seen_one = False
    for _ in groupby(iterable, key):
        if seen_one:
            return False
        seen_one = True
    return True


funcs = [current, Raymond, Stefan_hack, Stefan_loops, Stefan_seen]

iterables = (), (1,), (1, 2)

from timeit import timeit
from statistics import mean, stdev
import sys
import random
from itertools import *

for iterable in iterables:
    print(f'{iterable = }')
    times = {f: [] for f in funcs}
    def stats(f):
        ts = [t * 1e9 for t in sorted(times[f])[:10]]
        return f' {mean(ts):3.0f} ± {stdev(ts):3.1f} ns '
    for _ in range(100):
        random.shuffle(funcs)
        for f in funcs:
            t = timeit(lambda: f(iterable), number=1000) / 1000
            times[f].append(t)
    for f in sorted(funcs, key=stats):
        print(f(iterable), stats(f), f.__name__)
    print()

print('Python:', sys.version)

Attempt This Online!

@rhettinger
Copy link
Contributor

Stefan, take the key=key out of the "Raymond" variant. Also we really don't care about the empty iterator case, it is a distractor. (Side note: passing in a lambda : is possibly the worst way to use timeit. Instead, prefer the setup="from __main__ import test" style shown in the examples at the bottom of the timeit docs.)

Bo, aside from the two other hacky variants proposed, I don't think there are any wrong choices here. My order of preference is 1) the try/next/next proposed in the issue, 2) the for first/for second in the current PR, or 3) just leaving it as-is.

Though not currently the fastest, the try/next/next most clearly expresses the algorithmic intent and doesn't ask the interpreter for anything beyond that. As the interpreter evolves, try/next/next will become the fastest (looking up builtins will be as fast as looking up globals/locals , catching exceptions will become cheaper, can catching StopIteration may be special cased). That said, we're only talking a few tens of nanoseconds outside the inner loop, so none of this really matters much. I don't care about it enough to endlessly joust with Stefan.

@bbayles bbayles merged commit 390a3db into master Aug 16, 2024
12 checks passed
@bbayles bbayles deleted the issue-896-all-equal branch August 16, 2024 14:50
@bbayles bbayles mentioned this pull request Sep 5, 2024
github-merge-queue bot referenced this pull request in canonical/charmcraft Sep 6, 2024
This PR contains the following updates:

| Package | Change | Age | Adoption | Passing | Confidence |
|---|---|---|---|---|---|
|
[more-itertools](https://redirect.github.com/more-itertools/more-itertools)
| `==10.4.0` -> `==10.5.0` |
[![age](https://developer.mend.io/api/mc/badges/age/pypi/more-itertools/10.5.0?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
[![adoption](https://developer.mend.io/api/mc/badges/adoption/pypi/more-itertools/10.5.0?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
[![passing](https://developer.mend.io/api/mc/badges/compatibility/pypi/more-itertools/10.4.0/10.5.0?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
[![confidence](https://developer.mend.io/api/mc/badges/confidence/pypi/more-itertools/10.4.0/10.5.0?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|

---

### Release Notes

<details>
<summary>more-itertools/more-itertools (more-itertools)</summary>

###
[`v10.5.0`](https://redirect.github.com/more-itertools/more-itertools/releases/tag/v10.5.0)

[Compare
Source](https://redirect.github.com/more-itertools/more-itertools/compare/v10.4.0...v10.5.0)

#### What's Changed

- Optimize all_equal recipe by
[@&#8203;bbayles](https://redirect.github.com/bbayles) in
[https://github.com/more-itertools/more-itertools/pull/899](https://redirect.github.com/more-itertools/more-itertools/pull/899)
- Reduce groupby.**next** calls in all_equal by
[@&#8203;bbayles](https://redirect.github.com/bbayles) in
[https://github.com/more-itertools/more-itertools/pull/903](https://redirect.github.com/more-itertools/more-itertools/pull/903)
- Fix types.UnionType by
[@&#8203;bbayles](https://redirect.github.com/bbayles) in
[https://github.com/more-itertools/more-itertools/pull/905](https://redirect.github.com/more-itertools/more-itertools/pull/905)
- Version 10.5.0 by
[@&#8203;bbayles](https://redirect.github.com/bbayles) in
[https://github.com/more-itertools/more-itertools/pull/906](https://redirect.github.com/more-itertools/more-itertools/pull/906)

**Full Changelog**:
more-itertools/more-itertools@v10.4.0...v10.5.0

</details>

---

### Configuration

📅 **Schedule**: Branch creation - "every weekend" in timezone Etc/UTC,
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR is behind base branch, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR was generated by [Mend Renovate](https://mend.io/renovate/).
View the [repository job
log](https://developer.mend.io/github/canonical/charmcraft).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzOC41OS4yIiwidXBkYXRlZEluVmVyIjoiMzguNTkuMiIsInRhcmdldEJyYW5jaCI6Im1haW4iLCJsYWJlbHMiOlsiZGVwZW5kZW5jaWVzIl19-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
lengau referenced this pull request in canonical/charmcraft Sep 8, 2024
…#1885)

This PR contains the following updates:

| Package | Change | Age | Adoption | Passing | Confidence |
|---|---|---|---|---|---|
|
[more-itertools](https://redirect.github.com/more-itertools/more-itertools)
| `==10.4.0` -> `==10.5.0` |
[![age](https://developer.mend.io/api/mc/badges/age/pypi/more-itertools/10.5.0?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
[![adoption](https://developer.mend.io/api/mc/badges/adoption/pypi/more-itertools/10.5.0?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
[![passing](https://developer.mend.io/api/mc/badges/compatibility/pypi/more-itertools/10.4.0/10.5.0?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
[![confidence](https://developer.mend.io/api/mc/badges/confidence/pypi/more-itertools/10.4.0/10.5.0?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|

---

### Release Notes

<details>
<summary>more-itertools/more-itertools (more-itertools)</summary>

###
[`v10.5.0`](https://redirect.github.com/more-itertools/more-itertools/releases/tag/v10.5.0)

[Compare
Source](https://redirect.github.com/more-itertools/more-itertools/compare/v10.4.0...v10.5.0)

#### What's Changed

- Optimize all_equal recipe by
[@&#8203;bbayles](https://redirect.github.com/bbayles) in
[https://github.com/more-itertools/more-itertools/pull/899](https://redirect.github.com/more-itertools/more-itertools/pull/899)
- Reduce groupby.**next** calls in all_equal by
[@&#8203;bbayles](https://redirect.github.com/bbayles) in
[https://github.com/more-itertools/more-itertools/pull/903](https://redirect.github.com/more-itertools/more-itertools/pull/903)
- Fix types.UnionType by
[@&#8203;bbayles](https://redirect.github.com/bbayles) in
[https://github.com/more-itertools/more-itertools/pull/905](https://redirect.github.com/more-itertools/more-itertools/pull/905)
- Version 10.5.0 by
[@&#8203;bbayles](https://redirect.github.com/bbayles) in
[https://github.com/more-itertools/more-itertools/pull/906](https://redirect.github.com/more-itertools/more-itertools/pull/906)

**Full Changelog**:
more-itertools/more-itertools@v10.4.0...v10.5.0

</details>

---

### Configuration

📅 **Schedule**: Branch creation - "every weekend" in timezone Etc/UTC,
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR is behind base branch, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR was generated by [Mend Renovate](https://mend.io/renovate/).
View the [repository job
log](https://developer.mend.io/github/canonical/charmcraft).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzOC41OS4yIiwidXBkYXRlZEluVmVyIjoiMzguNTkuMiIsInRhcmdldEJyYW5jaCI6ImhvdGZpeC8zLjIiLCJsYWJlbHMiOlsiZGVwZW5kZW5jaWVzIl19-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
lengau referenced this pull request in canonical/charmcraft Sep 9, 2024
This PR contains the following updates:

| Package | Change | Age | Adoption | Passing | Confidence |
|---|---|---|---|---|---|
|
[more-itertools](https://redirect.github.com/more-itertools/more-itertools)
| `==10.4.0` -> `==10.5.0` |
[![age](https://developer.mend.io/api/mc/badges/age/pypi/more-itertools/10.5.0?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
[![adoption](https://developer.mend.io/api/mc/badges/adoption/pypi/more-itertools/10.5.0?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
[![passing](https://developer.mend.io/api/mc/badges/compatibility/pypi/more-itertools/10.4.0/10.5.0?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
[![confidence](https://developer.mend.io/api/mc/badges/confidence/pypi/more-itertools/10.4.0/10.5.0?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|

---

### Release Notes

<details>
<summary>more-itertools/more-itertools (more-itertools)</summary>

###
[`v10.5.0`](https://redirect.github.com/more-itertools/more-itertools/releases/tag/v10.5.0)

[Compare
Source](https://redirect.github.com/more-itertools/more-itertools/compare/v10.4.0...v10.5.0)

#### What's Changed

- Optimize all_equal recipe by
[@&#8203;bbayles](https://redirect.github.com/bbayles) in
[https://github.com/more-itertools/more-itertools/pull/899](https://redirect.github.com/more-itertools/more-itertools/pull/899)
- Reduce groupby.**next** calls in all_equal by
[@&#8203;bbayles](https://redirect.github.com/bbayles) in
[https://github.com/more-itertools/more-itertools/pull/903](https://redirect.github.com/more-itertools/more-itertools/pull/903)
- Fix types.UnionType by
[@&#8203;bbayles](https://redirect.github.com/bbayles) in
[https://github.com/more-itertools/more-itertools/pull/905](https://redirect.github.com/more-itertools/more-itertools/pull/905)
- Version 10.5.0 by
[@&#8203;bbayles](https://redirect.github.com/bbayles) in
[https://github.com/more-itertools/more-itertools/pull/906](https://redirect.github.com/more-itertools/more-itertools/pull/906)

**Full Changelog**:
more-itertools/more-itertools@v10.4.0...v10.5.0

</details>

---

### Configuration

📅 **Schedule**: Branch creation - "every weekend" in timezone Etc/UTC,
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR is behind base branch, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR was generated by [Mend Renovate](https://mend.io/renovate/).
View the [repository job
log](https://developer.mend.io/github/canonical/charmcraft).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzOC41OS4yIiwidXBkYXRlZEluVmVyIjoiMzguNTkuMiIsInRhcmdldEJyYW5jaCI6Im1haW4iLCJsYWJlbHMiOlsiZGVwZW5kZW5jaWVzIl19-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

all_equal() improvement
3 participants
0