Haveibeenpwned sensor platform #3618
Conversation
Nice, but a five-second default scan interval is really needless, and combined with the request timeout it can keep your device busy if the network connection is lost.
You mean if a user starts Home Assistant when there is no network present, and the problem is that it would keep retrying requests at a 5-second interval until the network is back? Do you suggest I revert the 5 seconds back to the default scan interval from hass? The problem is that the more email addresses users specify in their configuration, the longer it takes to get the initial data, since each update handles only one email before moving on to the next (when no error is thrown). If it looped through all emails in one update, that would increase the timeouts: for example, 5 emails specified = 5x30 seconds before the initial data for all addresses is in. That's why I lowered it to 5. I'm not certain it's really a problem if users have to wait longer for the initial data when I revert back to the default scan interval.
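The one-email-per-update cycling described above can be sketched as follows. This is a minimal illustration, not the PR's exact code; the class name, `fetch` callable, and attributes are assumptions:

```python
class BreachData:
    """Cycle through configured emails, fetching one address per update()."""

    def __init__(self, emails):
        self._emails = emails
        self._index = 0   # which email the next update() call handles
        self.data = {}    # email -> latest fetched breach list

    def update(self, fetch):
        """Fetch data for the current email and advance to the next one.

        `fetch` stands in for the real HTTP request; it takes an email
        address and returns its breach list, or raises on error.
        """
        email = self._emails[self._index]
        self.data[email] = fetch(email)
        # Advance only after a successful fetch, so an error retries
        # the same address on the next scan.
        self._index = (self._index + 1) % len(self._emails)
```

With N configured emails and a 30-second scan interval, N updates (and so up to N x 30 seconds) pass before every address has initial data, which is the waiting time the comment is describing.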
_LOGGER.info("Checking for breaches for email %s", self.email)

req = requests.get(url, headers={"User-agent": USER_AGENT},
                   verify=False, allow_redirects=True, timeout=5)
verify=False
done
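The fix the reviewer asked for amounts to dropping `verify=False`, since `requests` validates TLS certificates by default. A hedged sketch (the `USER_AGENT` value and function wrapper are illustrative, not the PR's exact code):

```python
import requests

# Placeholder value; the PR defines its own USER_AGENT constant.
USER_AGENT = "HomeAssistantHaveIBeenPwnedExample"

def fetch_breaches(url, timeout=5):
    """GET breach data with TLS certificate validation left on.

    `verify` defaults to True in requests, so simply omitting
    verify=False restores certificate checking.
    """
    return requests.get(url, headers={"User-agent": USER_AGENT},
                        allow_redirects=True, timeout=timeout)
```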
def update(self):
    """Get the latest data for current email from REST service."""
    try:
        self.data = None
This is such a weird approach. Just keep a dictionary with the latest fetched data per email.
I'm using a dictionary now.
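The suggested dictionary approach can be sketched like this: the shared data object keeps the latest result per email, and each sensor only reads its own entry instead of resetting `self.data = None` every update. Class and attribute names are illustrative, not the PR's exact code:

```python
class HaveIBeenPwnedData:
    """Shared store keeping the latest fetched result per email."""

    def __init__(self, emails):
        self.emails = emails
        self.data = {}   # email -> breach list, preserved across updates

class BreachSensor:
    """Sensor reading its own email's entry from the shared dict."""

    def __init__(self, email, data):
        self._email = email
        self._data = data
        self.state = None

    def update(self):
        breaches = self._data.data.get(self._email)
        if breaches is not None:
            # Keep the previous state rather than wiping it while
            # this email's data has not been fetched yet.
            self.state = len(breaches)
```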
emails = config.get(CONF_EMAIL)

data = HaveIBeenPwnedData(emails)
data.update = Throttle(timedelta(seconds=SCAN_INTERVAL-1))(data.update)
Why is this? Better to just request the info for each email address at startup and throttle the update as normal
The REST service blocks consecutive calls; it needs a delay between them. If I requested all email data in a loop at startup, the requests would fail, so I wanted a 5-second delay between requests at startup. In the beginning I set SCAN_INTERVAL to 5, but with the throttle also at 5 seconds the sensor updated every 10 seconds instead of every 5, because the throttle can block the update when scan_interval and the throttle are set to the same value. That did not happen when I set the throttle 1 second lower than scan_interval.

Further in the code I raise the scan interval to 15 minutes and set the throttle to scan_interval - 5 seconds for the same reason.

Basically I wanted to achieve this: at startup, do the requests per email with 5 seconds between each request; as soon as each email has its data, increase the delay to 15 minutes. I also raised scan_interval because update does not need to be called that often.

At first I tried another approach using the sleep command, but that blocks everything: hass pauses during startup, so I did not want to use sleep. I then tried to find a way to use just the normal update function while making sure only one email is requested per call to update.

Not sure what the best approach should have been in this case?
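The scan-interval/throttle interaction described above can be illustrated with a minimal throttle decorator, a simplified stand-in for `homeassistant.util.Throttle` (all names here are illustrative): if the poll interval equals the throttle window, timing jitter means every other poll can be rejected, which is why the PR sets the throttle one second below scan_interval.

```python
import time
from datetime import timedelta

def throttle(interval: timedelta):
    """Skip calls arriving sooner than `interval` after the last real run."""
    def decorator(func):
        last_run = [None]   # mutable cell so wrapper can update it
        def wrapper(*args, **kwargs):
            now = time.monotonic()
            if (last_run[0] is not None
                    and now - last_run[0] < interval.total_seconds()):
                return None  # throttled: the call is silently dropped
            last_run[0] = now
            return func(*args, **kwargs)
        return wrapper
    return decorator

SCAN_INTERVAL = 5  # seconds, illustrative only

# Throttle set one second below the poll interval, as in the PR,
# so regular polls are not rejected by timing jitter.
@throttle(timedelta(seconds=SCAN_INTERVAL - 1))
def update():
    return "updated"
```

A call made immediately after a successful one returns `None` (dropped); a call made a full scan interval later goes through, since the window is one second shorter than the poll period.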
I'm using track_point_in_time now, but I still had many problems getting it to work; I tried a couple of schemes, each of which had some kind of problem. The current approach works, but I'm not sure it's elegant. Basically there is one throttle value of 15 minutes for normal updates and one for forced updates, and I created no-throttle update functions for the data object as well as the sensor. During hass startup, the no-throttle update function is called for each created sensor (one per email), and each sensor keeps re-calling this routine until the data object has data for that sensor's email; once it does, the sensor stops scheduling a new check 5 seconds in the future. The data object also no longer cycles to the next email unless the service returns data. This makes sure no time is lost at startup updating each email with 5-second delays. It seems to work fine for me when I check the times at which the data is fetched.
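The retry scheme described can be sketched with a toy scheduler standing in for `homeassistant.helpers.event.track_point_in_time` (the real helper takes `hass`, an action, and a point in time): each sensor re-schedules its check 5 seconds out until the shared store has data for its email. All names below are illustrative assumptions, not the PR's code:

```python
from datetime import datetime, timedelta

class FakeScheduler:
    """Toy stand-in for track_point_in_time, firing callbacks on demand."""

    def __init__(self):
        self.pending = []   # (fire_time, callback) in scheduling order

    def track_point_in_time(self, action, point_in_time):
        self.pending.append((point_in_time, action))

    def run_next(self):
        """Fire the oldest scheduled callback (time is simulated)."""
        when, action = self.pending.pop(0)
        action(when)

def schedule_initial_update(scheduler, store, email, fetch):
    """Re-schedule a check every 5 s until `store` has data for `email`."""
    def retry(now):
        result = fetch(email)
        if result is None:
            # Service returned nothing yet: check again 5 seconds out.
            scheduler.track_point_in_time(retry, now + timedelta(seconds=5))
        else:
            store[email] = result   # got data: stop rescheduling
    scheduler.track_point_in_time(retry, datetime.now() + timedelta(seconds=5))
```

Because each sensor stops rescheduling as soon as its email has a result, the 5-second cadence only lasts through startup, after which the normal 15-minute throttled updates take over.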
…nitial startup of hass the requests happen with 5-second delays, and after startup with 15-minute delays. Scan_interval is also increased so update is not called as often
- use the track_point_in_time system to make sure initial data updates happen 5 seconds apart, until every sensor's email has a result in the dict.
Description:
The haveibeenpwned sensor platform creates sensors that check for breached email accounts on haveibeenpwned. It lists every specified email address as a sensor showing the number of breaches for that account, as well as breach metadata (site title + date the breach data was added).
During hass startup it requests breach data for all specified emails with 5 seconds between each request. After hass has started, this delay is increased to 15 minutes, so it checks one email address per 15 minutes. This is done to prevent abuse on the one hand; on the other hand, breach data almost never changes, so there is no need to check frequently for new data. The author of the service (Troy Hunt) had no problem with this protection scheme (I emailed him before creating this PR).
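A hedged configuration example matching the description above (the platform name comes from the PR; the `email` list option is assumed from the description, so confirm against the linked documentation PR):

```yaml
sensor:
  - platform: haveibeenpwned
    email:
      - your_email1@example.com
      - your_email2@example.com
```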
Related issue (if applicable): fixes #
Pull request in home-assistant.io with documentation (if applicable): home-assistant/home-assistant.io#1023
Example entry for configuration.yaml (if applicable):
Checklist:
If user exposed functionality or configuration variables are added/changed:
If the code communicates with devices, web services, or third-party tools:
- Local tests with tox run successfully. Your PR cannot be merged unless tests pass. (can't run tox on windows)
- New dependencies have been added to the REQUIREMENTS variable (example). (no new dependencies)
- New dependencies have been added to requirements_all.txt by running script/gen_requirements_all.py. (no new dependencies)
- New files were added to .coveragerc. (no new dependencies)
If the code does not interact with devices:
- Local tests with tox run successfully. Your PR cannot be merged unless tests pass.