Haveibeenpwned sensor platform #3618
Conversation
Nice, but a five-second default scan interval is really needless, and combined with the request timeout it can keep your device busy if the network connection is lost.
You mean if a user starts Home Assistant when there is no network present, and the problem is that it would keep retrying requests at a 5-second interval until the network is back? Do you suggest I revert the 5 seconds back to the default scan interval from hass? The problem is that the more email addresses users specify in their configuration, the longer it takes to get the initial data, since each update handles only one email before moving on to the next (when no error is thrown). If it looped through all emails in one update, that would increase the timeouts: for example, 5 emails specified = 5x30 seconds before the initial data for all addresses is in. That's why I lowered it to 5. I'm not certain it's really a problem if users have to wait longer for the initial data when I revert back to the default scan interval.
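The one-email-per-update cycling described above can be sketched as follows. This is a minimal illustration, not the PR's exact code; the class name, `fetch` callable, and attributes are assumptions:

```python
class BreachData:
    """Cycle through configured emails, fetching one address per update()."""

    def __init__(self, emails):
        self._emails = emails
        self._index = 0   # which email the next update() call handles
        self.data = {}    # email -> latest fetched breach list

    def update(self, fetch):
        """Fetch data for the current email and advance to the next one.

        `fetch` stands in for the real HTTP request; it takes an email
        address and returns its breach list, or raises on error.
        """
        email = self._emails[self._index]
        self.data[email] = fetch(email)
        # Advance only after a successful fetch, so an error retries
        # the same address on the next scan.
        self._index = (self._index + 1) % len(self._emails)
```

With N configured emails and a 30-second scan interval, N updates (and so up to N x 30 seconds) pass before every address has initial data, which is the waiting time the comment is describing.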
_LOGGER.info("Checking for breaches for email %s", self.email)

req = requests.get(url, headers={"User-agent": USER_AGENT},
                   verify=False, allow_redirects=True, timeout=5)
verify=False
done
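The fix the reviewer asked for amounts to dropping `verify=False`, since `requests` validates TLS certificates by default. A hedged sketch (the `USER_AGENT` value and function wrapper are illustrative, not the PR's exact code):

```python
import requests

# Placeholder value; the PR defines its own USER_AGENT constant.
USER_AGENT = "HomeAssistantHaveIBeenPwnedExample"

def fetch_breaches(url, timeout=5):
    """GET breach data with TLS certificate validation left on.

    `verify` defaults to True in requests, so simply omitting
    verify=False restores certificate checking.
    """
    return requests.get(url, headers={"User-agent": USER_AGENT},
                        allow_redirects=True, timeout=timeout)
```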
def update(self):
    """Get the latest data for current email from REST service."""
    try:
        self.data = None
This is such a weird approach. Just keep a dictionary with the latest fetched data per email.
I'm using a dictionary now.
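The suggested dictionary approach can be sketched like this: the shared data object keeps the latest result per email, and each sensor only reads its own entry instead of resetting `self.data = None` every update. Class and attribute names are illustrative, not the PR's exact code:

```python
class HaveIBeenPwnedData:
    """Shared store keeping the latest fetched result per email."""

    def __init__(self, emails):
        self.emails = emails
        self.data = {}   # email -> breach list, preserved across updates

class BreachSensor:
    """Sensor reading its own email's entry from the shared dict."""

    def __init__(self, email, data):
        self._email = email
        self._data = data
        self.state = None

    def update(self):
        breaches = self._data.data.get(self._email)
        if breaches is not None:
            # Keep the previous state rather than wiping it while
            # this email's data has not been fetched yet.
            self.state = len(breaches)
```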
emails = config.get(CONF_EMAIL)

data = HaveIBeenPwnedData(emails)
data.update = Throttle(timedelta(seconds=SCAN_INTERVAL-1))(data.update)
Why is this? Better to just request the info for each email address at startup and throttle the update as normal
The REST service blocks consecutive calls; it needs a delay between them. If I requested all email data in a loop at startup, the requests would fail, so I wanted a 5-second delay between requests at startup. In the beginning I set SCAN_INTERVAL to 5, but with the throttle also at 5 seconds the sensor updated every 10 seconds instead of every 5, because the throttle can block the update when scan_interval and the throttle are set to the same value. That did not happen when I set the throttle 1 second lower than scan_interval.

Further in the code I raise the scan interval to 15 minutes and set the throttle to scan_interval - 5 seconds for the same reason.

Basically I wanted to achieve this: at startup, do the requests per email with 5 seconds between each request; as soon as each email has its data, increase the delay to 15 minutes. I also raised scan_interval because update does not need to be called that often.

At first I tried another approach using the sleep command, but that blocks everything: hass pauses during startup, so I did not want to use sleep. I then tried to find a way to use just the normal update function while making sure only one email is requested per call to update.

Not sure what the best approach should have been in this case?
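The scan-interval/throttle interaction described above can be illustrated with a minimal throttle decorator, a simplified stand-in for `homeassistant.util.Throttle` (all names here are illustrative): if the poll interval equals the throttle window, timing jitter means every other poll can be rejected, which is why the PR sets the throttle one second below scan_interval.

```python
import time
from datetime import timedelta

def throttle(interval: timedelta):
    """Skip calls arriving sooner than `interval` after the last real run."""
    def decorator(func):
        last_run = [None]   # mutable cell so wrapper can update it
        def wrapper(*args, **kwargs):
            now = time.monotonic()
            if (last_run[0] is not None
                    and now - last_run[0] < interval.total_seconds()):
                return None  # throttled: the call is silently dropped
            last_run[0] = now
            return func(*args, **kwargs)
        return wrapper
    return decorator

SCAN_INTERVAL = 5  # seconds, illustrative only

# Throttle set one second below the poll interval, as in the PR,
# so regular polls are not rejected by timing jitter.
@throttle(timedelta(seconds=SCAN_INTERVAL - 1))
def update():
    return "updated"
```

A call made immediately after a successful one returns `None` (dropped); a call made a full scan interval later goes through, since the window is one second shorter than the poll period.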
I'm using track_point_in_time now, but I still had many problems getting it to work; I tried a couple of schemes, each of which had some kind of problem. The current approach works, but I'm not sure it's elegant. Basically there is one throttle value of 15 minutes for normal updates and one for forced updates, and I created no-throttle update functions for the data object as well as the sensor. During hass startup, the no-throttle update function is called for each created sensor (one per email), and each sensor keeps re-calling this routine until the data object has data for that sensor's email; once it does, the sensor stops scheduling a new check 5 seconds in the future. The data object also no longer cycles to the next email unless the service returns data. This makes sure no time is lost at startup updating each email with 5-second delays. It seems to work fine for me when I check the times at which the data is fetched.
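The retry scheme described can be sketched with a toy scheduler standing in for `homeassistant.helpers.event.track_point_in_time` (the real helper takes `hass`, an action, and a point in time): each sensor re-schedules its check 5 seconds out until the shared store has data for its email. All names below are illustrative assumptions, not the PR's code:

```python
from datetime import datetime, timedelta

class FakeScheduler:
    """Toy stand-in for track_point_in_time, firing callbacks on demand."""

    def __init__(self):
        self.pending = []   # (fire_time, callback) in scheduling order

    def track_point_in_time(self, action, point_in_time):
        self.pending.append((point_in_time, action))

    def run_next(self):
        """Fire the oldest scheduled callback (time is simulated)."""
        when, action = self.pending.pop(0)
        action(when)

def schedule_initial_update(scheduler, store, email, fetch):
    """Re-schedule a check every 5 s until `store` has data for `email`."""
    def retry(now):
        result = fetch(email)
        if result is None:
            # Service returned nothing yet: check again 5 seconds out.
            scheduler.track_point_in_time(retry, now + timedelta(seconds=5))
        else:
            store[email] = result   # got data: stop rescheduling
    scheduler.track_point_in_time(retry, datetime.now() + timedelta(seconds=5))
```

Because each sensor stops rescheduling as soon as its email has a result, the 5-second cadence only lasts through startup, after which the normal 15-minute throttled updates take over.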
…nitial startup of hass the requests happen with 5-second delays, and after startup with 15-minute delays. Scan_interval is also increased so update is not called as often
- use the track_point_in_time system to make sure initial data updates happen 5 seconds apart, until every sensor's email has a result in the dict.
Description:
The haveibeenpwned sensor platform creates sensors that check for breached email accounts on haveibeenpwned. It lists every specified email address as a sensor showing the number of breaches for that account, as well as breach metadata (site title + date the breach data was added).
During hass startup it requests breach data for all specified emails with 5 seconds between each request. After hass has started, this delay is increased to 15 minutes, so it checks one email address per 15 minutes. This is done to prevent abuse on the one hand; on the other hand, breach data almost never changes, so there is no need to check frequently for new data. The author of the service (Troy Hunt) had no problem with this protection scheme (I emailed him before creating this PR).
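A hedged configuration example matching the description above (the platform name comes from the PR; the `email` list option is assumed from the description, so confirm against the linked documentation PR):

```yaml
sensor:
  - platform: haveibeenpwned
    email:
      - your_email1@example.com
      - your_email2@example.com
```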
Related issue (if applicable): fixes #
Pull request in home-assistant.io with documentation (if applicable): home-assistant/home-assistant.io#1023
Example entry for configuration.yaml (if applicable):
Checklist:
If user exposed functionality or configuration variables are added/changed:
If the code communicates with devices, web services, or third-party tools:
- Local tests with tox run successfully. Your PR cannot be merged unless tests pass. (can't run tox on windows)
- New dependencies have been added to the REQUIREMENTS variable (example). (no new dependencies)
- New dependencies have been added to requirements_all.txt by running script/gen_requirements_all.py. (no new dependencies)
- New files were added to .coveragerc. (no new dependencies)
If the code does not interact with devices:
- Local tests with tox run successfully. Your PR cannot be merged unless tests pass.