-
Notifications
You must be signed in to change notification settings - Fork 7
MOL-2607/ feat: implement random-k selector #10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MOL-2607/ feat: implement random-k selector #10
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, thanks for adding the selector. I added some comments below.
src/peppr/selector.py
Outdated
The best value is chosen from *k* randomly chosen predictions. | ||
""" | ||
|
||
def __init__(self, k: int) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be good to be able to provide an optional seed
to the constructor that would be used to initialize a NumPy Generator
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, addressed in ce196e1!
Check that the :class:`RandomSelector`, when ran multiple times, selects approximately the expected value for known | ||
examples. | ||
""" | ||
selector = peppr.RandomSelector(k=5) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The random seed could be used in this test to make it deterministic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed in af4a29a.
tests/test_selectors.py
Outdated
selector.select(values, smaller_is_better=False) for _ in range(20) | ||
] | ||
|
||
assert np.isclose(np.mean(selected_values), 5.0, rtol=0.5) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently this test fails, probably because sometimes the mean is randomly not within the tolerance. Using a much larger k
(and more values to select from) would solve this, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed in af4a29a. I was actually computing the wrong expected value.
Looks good to me now, thanks! |
Summary
Implements a new
RandomSelector
class that selects the best value from k randomly chosen values in the input array. This allows for easier benchmarking of confidence model, since the perfomance obtained with aRandomSelector
is the same that would be obtained with aTopSelector
by a random confidence model.Changes
RandomSelector
class toselector.py
test_selectors.py
Implementation Details
RandomSelector
randomly selects k indices from the input arraysmaller_is_better
parameter