Cast unique_id and async discovery by OttoWinter · Pull Request #12474 · home-assistant/core · GitHub

Cast unique_id and async discovery #12474


Merged: 10 commits merged into home-assistant:dev from cast-uuid on Feb 23, 2018

Conversation

OttoWinter (Member) commented Feb 17, 2018:

Description:

🏔 This PR adds a unique_id to the cast platform. While doing so, it makes the platform faster by using async discovery and it tries to make the cast code base a bit easier to understand.

⚠️ This needs to be tested ... like, a lot. From the past we know that the cast platform is very fragmented and there are many weird errors that result from 3rd party cast products. While I've tested this over the last 1 ½ days, I can't guarantee it will work for all devices, especially for audio groups.

My Chromecast Audio broke some time ago and my new one is only arriving in about a week (living in the mountains does have its downsides 😕). But then I'll also be able to try audio groups.

Related: #12395, #12405

Fixes #12565, fixes #11905, fixes #7959, probably #11945 (fewer socket_client threads created)

Added benefit: Home Assistant startup time is significantly reduced, and those pesky "Setup of platform media_player.cast is taking over 10 seconds." log entries are gone.

Example entry for configuration.yaml (if applicable):

discovery:
media_player:
  - platform: cast

Checklist:

  • The code change is tested and works locally.

If the code does not interact with devices:

  • Local tests with tox run successfully. Your PR cannot be merged unless tests pass
  • Tests have been added to verify that the new code works.

@@ -33,90 +38,159 @@
SUPPORT_TURN_ON | SUPPORT_TURN_OFF | SUPPORT_PREVIOUS_TRACK | \
SUPPORT_NEXT_TRACK | SUPPORT_PLAY_MEDIA | SUPPORT_STOP | SUPPORT_PLAY

KNOWN_HOSTS_KEY = 'cast_known_hosts'
INTERNAL_DISCOVERY_RUNNING = 'cast_discovery_running'
OttoWinter (Member Author):

In the past, we'd just run a separate pychromecast discovery for every single chromecast. That was an immense waste of resources. This PR uses at most one discovery service. It needs to know whether its own discovery service is already running, though, so I store that in hass.data[cast_discovery_running]. If this is the wrong place, please tell me.

"""Called when zeroconf has discovered a new chromecast."""
try:
chromecast = pychromecast._get_chromecast_from_host(
listener.services[name], blocking=True)
OttoWinter (Member Author):

Using a protected function here as a temporary hack. We need some way of using the non-blocking discovery with blocking Chromecast objects. If we want to merge this PR, I would of course create a PR in pychromecast to allow this without a "hack".

OttoWinter (Member Author):

Additionally, there's another potential problem here (that was also there before). Every time we find a chromecast, we create a full Chromecast object, with a separate Thread object, socket and whatnot. (related: #11945)

We might be able to get away in some cases by just matching the host, port and information we get from zeroconf.

add_devices(casts)
# Manually add a "normal" Chromecast, we can do that without discovery.
try:
chromecast = pychromecast.Chromecast(*want_host)
OttoWinter (Member Author) commented Feb 17, 2018:

Having discovery: and pychromecast's discovery running in parallel is extremely prone to bugs. I understand the reasons why we're doing it right now, but we should try to change it in the future (if at all possible).

Member:

We can always check whether discovery is loaded (via 'discovery' in hass.config.components) and track whether discovery is loaded after this platform (by listening for the component_loaded event) to disable our own discovery when needed.
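
A minimal sketch of that suggestion, assuming the literal event name 'component_loaded' and its 'component' data key (both match Home Assistant's constants at the time of this PR, but verify before relying on them):

from homeassistant.core import callback

def _when_discovery_loaded(hass, on_loaded):
    """Run on_loaded once the discovery component is, or becomes, loaded."""
    if 'discovery' in hass.config.components:
        on_loaded()
        return

    @callback
    def component_loaded(event):
        # Fired once per component; react only to the discovery component.
        if event.data.get('component') == 'discovery':
            on_loaded()

    hass.bus.async_listen('component_loaded', component_loaded)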

OttoWinter (Member Author) commented Feb 18, 2018:

Let me clarify: Right now I think it's better to have both discoveries active since AFAIK, they use separate systems - mDNS for pychromecast and uPnP for netdisco. Plus there's the issue of multiple network interfaces. And removing one system could lead to a lot of users complaining about their devices not being discovered anymore.

During the creation of this PR I found lots and lots of edge cases of the form: what if pychromecast discovers A, but discovery doesn't, but discovers B, and so on. Supporting all of them was difficult, therefore using only one discovery service might be easier.

Edit: Oh, it appears netdisco supports mDNS too. I might take a look at just using netdisco later, as that could be cleaner.

OttoWinter (Member Author):

Tested with the following configurations using Google Home Mini and Chromecast:

discovery:

discovery:
media_player:
  - platform: cast

media_player:
  - platform: cast

media_player:
  - platform: cast
    host: 192.168.178.153

media_player:
  - platform: cast
  - platform: cast
    host: 192.168.178.153

balloob (Member) commented Feb 18, 2018:

For a future PR: To get rid of threads and run in non-blocking mode, we should be able to do something like this:

@callback
def socket_readable():
    # This will cause the status listeners to be executed from within the event loop
    chromecast.socket_client.run_once()

hass.loop.add_reader(chromecast.socket_client.get_socket(), socket_readable)

balloob (Member) commented Feb 18, 2018:

The cast platform is one of our oldest integrations. I'm glad that you're working on cleaning it up 👍

import logging
from typing import Tuple


'typing.Tuple' imported but unused

benjamin1492 commented:

Apologies up front if this is the wrong forum / site / effort for the information provided below. I've never posted to github, and certainly don't want to start off going against the rules.

I have some useful automations that depend on reliably discovering cast devices and groups. I also have a wide array of cast devices: 4 Google Homes, 10 CCA's, 1 1st gen CCV, 2 Nvidia Shields, 1 Insignia alarm clock with CC built-in, and a Sony speaker with CC built-in.

Everything below has been the case since late last year, with the breaking firmware update of Google Cast devices. Prior to that, everything was fine. I have tried every combination of discovery that OttoWinter lays out above multiple times, and the results are similar. Currently I'm set up with platform: cast, global discovery turned off, and NOT specifying individual hosts.

Under the current 0.63.3 cast.py (and previous iterations since breaking firmware update), the most common outcome is that during discovery I either get A) groups and units with older firmware, or B) I get the individual units with no groups. If I get the individual units, then they may fall out of the Google Home app. This means they also aren't available as cast targets from anything else (Pandora, Google Play Music). It's as though they can't exist in both Hass and Google Home simultaneously.

When I get groups and the units with older firmware, those are able to co-exist within Home Assistant and the Google Home app. My automations will work when I cast to one of the older devices or one of the groups. Automations will not work, however, when casting to the units with newer firmware since they aren't in Home Assistant as entities.

All that said, notice that I used terms like "most common" and "may" in describing the scenarios above. Yesterday, after restarting my router, restarting all the cast devices, restarting the Google Home app, and doing a rain dance, all of the devices and groups showed up when I restarted Home Assistant. And all my automations (some of which even my wife was impressed with) worked for 6 glorious hours. Then I made the mistake of adding some sensors and doing a hass restart. I did everything exactly as on the previous startup and still only got the individual units. And in this go 'round, not all of the newer firmware devices fell out of the Google Home app. Inexplicably, 2 out of the 10 new-firmware CCA's were available as cast targets. No idea why.

Anyway, I'm happy to test anything you want me to test, and under any testing rubric you'd like me to try. I'm a Home Assistant addict, but my full time job is about as far away from anything tech related as one can get. This means any skills I've developed over the years have been self-taught and are woefully inadequate for providing the heavy lifting in this exercise. I have a strong attention to detail, however, and I can copy/paste and download with the best of 'em.

I'm giving this branch of cast.py a spin today, and I'll let you know what happens:
Home Assistant 0.63.3 in virtualenv
Python 3.5
Ubuntu 16.04 on virtualbox residing on Windows 2012 Server R2
Stuff installed and working (in case it matters): zwave, nginx, letsencrypt, duckdns, gactions, owntracks, mosquitto mqtt bridge in docker for a couple of smartthings integrations not yet available on hass

OttoWinter (Member Author) commented Feb 18, 2018:

@benjamin1492 What you're describing here is not really related to this PR, and should probably be posted as an issue instead (see the "Issues" tab at the top of this page).

But I could very well use you as my first guinea pig to test this code change 😄. See, I'm quite certain this code change won't fix your problem, but it does improve debugging, which can help resolve your problem. So it would be great if you could override your homeassistant/components/media_player/cast.py file with the content of this PR (if you have issues with this, just contact me on the community forums or the Discord channel; @OttoWinter). Next, put this in your configuration to enable debug logs:

logger:
  default: info
  logs:
    pychromecast.socket_client: debug
    pychromecast: debug
    homeassistant.components.media_player.cast: debug

and send me your debug log. Again, further discussion should be moved elsewhere (community forums, Discord, or even e-mail contact@otto-winter.com if you like) as it's not really related to this code change.

OttoWinter (Member Author):

@balloob Good to know about the add_reader feature - thanks! (By the way, have the config entries progressed far enough to try them out? I haven't had an in-depth look yet, but I think the cast platform could be well suited to test that new feature.)

balloob (Member) commented Feb 18, 2018:

Config entries have been merged; it's something we can explore in a future PR.

import pychromecast
def _setup_internal_discovery(hass: HomeAssistantType) -> None:
"""Set up the pychromecast internal discovery."""
if hass.data.get(INTERNAL_DISCOVERY_RUNNING_KEY, False):
Contributor:

Correct me if I'm wrong: since this is not an async method, it will be called inside an executor thread, right? In that case, this kind of locking is prone to race conditions. If a second call to this method starts while the first is executing between lines 57 and 86, the "lock" will not work. Perhaps use a mutex instead of a boolean?

OttoWinter (Member Author):

Good catch! Will do.

OttoWinter (Member Author):

So apparently there are no atomic booleans in Python. I tried using asyncio.Lock, but that doesn't allow atomic checking and setting at the same time, so I reverted to using threading.Lock.

"""

if chromecast.uuid is None:
# Found a cast without UUID, we don't store it because we won't be able
Contributor:

The docstring in line 42 says that you store devices without UUID with the None key.

OttoWinter (Member Author):

Oh sorry, that was from a previous implementation. As there's no need for us to store devices with a None UUID, I have removed them from the dict to keep the code and the dict schema a bit cleaner.

return CastDevice(chromecast)

# Found a cast with UUID
known_casts = hass.data[ADDED_CAST_DEVICES_KEY]
Contributor:

So you have "known" and "added" casts, and here you assign the ADDED_CAST_DEVICES_KEY stuff to known_casts. I'm confused. What's the difference? Should this be added_casts = …?

OttoWinter (Member Author) commented Feb 19, 2018:

Sure, I'm using both a bit interchangeably here since every time we "know" a chromecast, it is also added.

Edit: OK, my previous comment is not entirely true; there are cases where we know a chromecast but don't add it (groups).

if want_host is None:
# We were explicitly told to enable pychromecast discovery.
enable_discovery = True
elif want_host[1] != DEFAULT_PORT:
Contributor:

Doesn't this fail if the discovery info in line 143 sets the port to DEFAULT_PORT by chance?

OttoWinter (Member Author):

That's the whole point of this line! We only need to run discovery if we find a group (and groups always have non-standard ports).

So if discovery: finds a chromecast with DEFAULT_PORT, we don't need to run our own discovery.
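
Spelled out, the decision logic under discussion looks like this (a sketch mirroring the diff above; DEFAULT_PORT is 8009, the port regular Chromecasts advertise on, as the discovery_info dumps later in this thread show):

if want_host is None:
    # We were explicitly told to enable pychromecast discovery.
    enable_discovery = True
elif want_host[1] != DEFAULT_PORT:
    # Non-default port: this is an audio group, so run our own discovery.
    enable_discovery = True
else:
    # A plain Chromecast on the default port; connect to it directly.
    enable_discovery = False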

Contributor:

Okay, if groups are guaranteed to live on different ports, that should work. 👍

except pychromecast.ChromecastConnectionError:
_LOGGER.warning("Can't set up chromecast on %s", want_host[0])
raise
else:
Contributor:

Since you raise above, you don't need to wrap this into an else, right?

OttoWinter (Member Author):

No, I don't :) Will change it

tinloaf (Contributor) commented Feb 19, 2018:

I also think you need to implement async_will_remove_from_hass, so that cast entities removed from hass are also removed from hass.data[ADDED_CAST_DEVICES_KEY].

OttoWinter (Member Author):

async_will_remove_from_hass

Sure, I was initially looking for that method and didn't find it; eventually I forgot about that functionality. Will do.

@@ -54,7 +55,8 @@

def _setup_internal_discovery(hass: HomeAssistantType) -> None:
"""Set up the pychromecast internal discovery."""
if hass.data.get(INTERNAL_DISCOVERY_RUNNING_KEY, False):
hass.data.setdefault(INTERNAL_DISCOVERY_RUNNING_KEY, threading.Lock())
if not hass.data[INTERNAL_DISCOVERY_RUNNING_KEY].acquire():
Contributor:

I think you should pass blocking=False here. Otherwise, this thread will never finish.

OttoWinter (Member Author):

Oh noes 🤕 I even went through all the documentation and then forgot to include that parameter 😑.
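
For reference, a sketch of the corrected guard: acquire(blocking=False) turns the call into an atomic test-and-set instead of waiting forever.

hass.data.setdefault(INTERNAL_DISCOVERY_RUNNING_KEY, threading.Lock())
if not hass.data[INTERNAL_DISCOVERY_RUNNING_KEY].acquire(blocking=False):
    # Internal discovery is already running in another thread; nothing to do.
    return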

@asyncio.coroutine
def async_will_remove_from_hass(self):
"""Disconnect Chromecast object when removed."""
self._async_disconnect()
Contributor:

You probably also should remove the entity object from hass.data[ADDED_CAST_DEVICES_KEY] here (or in _async_disconnect?).

OttoWinter (Member Author):

I'm kinda already doing that on line 129, but I think I should move that into _async_disconnect, as it makes more sense there.

Contributor:

Oh, yes. But that line is never executed if async_will_remove_from_hass is called from "outside", i.e., not because of a re-discovered entity. Putting it here or in _async_disconnect is the right thing, I guess.
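
A sketch of the agreed cleanup, assuming hass.data[ADDED_CAST_DEVICES_KEY] holds a set of entity objects (the registry's exact shape in the final code may differ):

@asyncio.coroutine
def async_will_remove_from_hass(self):
    """Disconnect Chromecast object when removed, and forget this entity."""
    self._async_disconnect()
    # Drop ourselves from the registry so a re-discovered cast can be re-added.
    self.hass.data[ADDED_CAST_DEVICES_KEY].discard(self)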

OttoWinter (Member Author) commented Feb 21, 2018:

After looking into netdisco a bit more, I think we could get away without pychromecast's internal discovery, since both libraries use the same system for chromecast discovery (mDNS).

For example, if I print out discovery_info, we have all the values we need to create a Chromecast object (examples for a Chromecast and a Google Home):

{
  "host": "192.168.178.162",
  "port": 8009,
  "hostname": "fc73c725-a458-6c01-0bd6-1b7cb49f52e2.local.",
  "properties": {
    "id": "fc73c725a4586c010bd61b7cb49f52e2",
    "cd": "E7873CDA9DFC38A20F9A03064F7B6DEE",
    "rm": false,
    "ve": "05",
    "md": "Chromecast",
    "ic": "/setup/icon.png",
    "fn": "TV the second",
    "ca": "4101",
    "st": "0",
    "bs": "FA8FCA501A8C",
    "nf": "1",
    "rs": false
  }
}
{
  "host": "192.168.178.153",
  "port": 8009,
  "hostname": "57831e09-bcaf-d41f-37b4-c95b260b2964.local.",
  "properties": {
    "id": "57831e09bcafd41f37b4c95b260b2964",
    "cd": "B8EA4C16D100E7DF551A3314B364F55E",
    "rm": false,
    "ve": "05",
    "md": "Google Home",
    "ic": "/setup/icon.png",
    "fn": "Hey Y'all Googles",
    "ca": "2052",
    "st": "0",
    "bs": "FA8FCA6F2D53",
    "nf": "1",
    "rs": false
  }
}

Then we could use more or less the same code pychromecast's discovery uses.
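
As a sketch of that mapping (key names taken from the netdisco output above; the helper name is made up for illustration):

from uuid import UUID

def _info_from_discovery(discovery_info):
    """Extract (host, port, uuid, model, name) from netdisco's cast data."""
    props = discovery_info['properties']
    return (discovery_info['host'],
            discovery_info['port'],
            UUID(props['id']),      # 'id' is the UUID as a bare hex string
            props.get('md'),        # model name, e.g. "Chromecast"
            props.get('fn'))        # friendly name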

Two problems with that approach:

  1. We could theoretically break stuff; both libraries use zeroconf, but maybe some device somewhere only likes pychromecast's discovery.
  2. What I find particularly nice about Home Assistant is that it offloads lots of the more complicated logic into external libraries, and having this discovery_info → (host, port, uuid) logic in Home Assistant is not nice. Of course, I would be happy to create a PR in pychromecast that refactors the add_services method into a public function.

Edit: For 2. we could also modify netdisco's Google Cast discoverable. What do you think?

balloob (Member) commented Feb 21, 2018:

The problem is that not everyone has the discovery component running, especially because it's currently kinda awkward: even if you want something not to be discovered, or you want to configure parts of it, you can't, because it will just get discovered anyway. Eventually most people will remove discovery. This is about to change when we migrate to config entries, as we can then remember discoveries.

balloob (Member) commented Feb 21, 2018:

I wonder if we should just remove all discovery from components and platforms and instead only allow the discovery component to do discovery. (We should only do this once we've migrated to config entries.)

OttoWinter (Member Author):

I see. That's indeed a problem, but isn't it already in Home Assistant? I mean, if discovery: is configured, don't we already discover all chromecasts?

I'm already looking forward to trying out the config entries 😃. So because we'd eventually move to config entries anyway, I think we should indeed not "change the running system" and just keep pychromecast's internal discovery around. It definitely can't hurt.

OttoWinter mentioned this pull request Feb 21, 2018
balloob (Member) commented Feb 22, 2018:

Correct, with discovery: we discover Chromecasts. However, users will disable this component because it currently works so shitty.

cast.disconnect with blocking=False does **not** do I/O; it simply sets an event for the socket client looper
OttoWinter changed the title from "[WIP] Cast unique_id and async discovery" to "Cast unique_id and async discovery" on Feb 22, 2018


class CastDevice(MediaPlayerDevice):
"""Representation of a Cast device on the network."""

def __init__(self, chromecast):
def __init__(self, hass, chromecast):
Contributor:

You shouldn't pass the hass object into your constructor, I think. self.hass will be set by add_devices; just use that.

OttoWinter (Member Author):

Ah yes, that's right. This came up while adding the test, but I think that was because I'm not using async_setup_component.


discover_cast('service1', cast_group1)
yield from hass.async_block_till_done()
yield from hass.async_block_till_done() # having jobs that add jobs
OttoWinter (Member Author):

I'm not sure why I have to do this, but it works 😅 (I suspect it's due to "jobs that add jobs", but really I don't know)

_LOGGER.debug("Discovered new chromecast %s", mdns)
try:
# pylint: disable=protected-access
chromecast = pychromecast._get_chromecast_from_host(
OttoWinter (Member Author):

I'm not entirely happy with having to create the "complete" Chromecast object every time we find some new cast device via discovery. Each call creates a separate thread, socket, and so on.

It's still much better than the previous version, which would create an entirely new object for each call of setup_platform, but I think we could just store the mdns info in hass.data[KNOWN_CHROMECASTS_KEY] and lazily create Chromecast objects inside CastDevice. That would also make handling disconnects cleaner. But I think that should be a separate PR (maybe together with add_reader).
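
A sketch of that lazy-creation idea (attribute names like _mdns_info are hypothetical; this is the future-PR shape, not this PR's code):

class CastDevice(MediaPlayerDevice):
    def _ensure_chromecast(self):
        """Create the heavyweight Chromecast object only on first use."""
        if self._chromecast is None:
            host, port = self._mdns_info  # hypothetical stored discovery data
            self._chromecast = pychromecast.Chromecast(host, port)
        return self._chromecast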

key = (chromecast.host, chromecast.port, chromecast.uuid)
cast_device = _async_create_cast_device(hass, chromecast)
if cast_device is not None:
hass.data[KNOWN_CHROMECASTS_KEY][key] = chromecast
Member:

Do you need to disconnect the chromecast object if you are not going to use it?

OttoWinter (Member Author):

I had that in there before, but then I remembered an edge case: what if _async_create_cast_device takes the last branch, which sets the Chromecast object? Then we'd be invalidating the Chromecast object we just set. I left it out here since it's still much better than the previous code, which created every Chromecast object for every single platform: definition.

OttoWinter (Member Author):

And I think that could best be done with another refactor with add_reader.

return_value=(listener, None)) as start_discovery:
add_devices = yield from async_setup_cast(hass, config, discovery_info)
yield from hass.async_block_till_done()
yield from hass.async_block_till_done()
Member:

The reason you have to do this twice is that @callback functions are scheduled on the event loop without being wrapped in a task, and thus cannot be tracked by async_block_till_done.

Member:

By doing another yield, the callback is executed. If it then schedules a coroutine, that will get picked up by the next async_block_till_done.
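
In test form the pattern reads (a sketch):

yield from hass.async_block_till_done()  # lets the bare @callback run
yield from hass.async_block_till_done()  # awaits any coroutine it scheduled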

balloob (Member) commented Feb 22, 2018:

1 pending disconnect comment.

How do you want to play this? Should we cherry-pick this into 0.64?

edif30 (Contributor) commented Feb 23, 2018:

Do you guys need any help testing? I have 7 cast devices and would like to help in any way that I can.

OttoWinter (Member Author) commented Feb 23, 2018:

@balloob It would be awesome to have all Chromecast issues resolved at the same time (hopefully 😅), but there's always the chance that this introduced another issue.

However, I think I've tested this enough now to say that it works with all standard cast devices, and I think it's ready to be in a release 😄. #12609

OttoWinter added this to the 0.64.0 milestone Feb 23, 2018
OttoWinter (Member Author):

So I spent some time today testing this even more with a friend of mine (who has tons of cast audio groups set up), and it worked perfectly for him. So I'd consider this ready to be released.

balloob merged commit 230b73d into home-assistant:dev Feb 23, 2018
balloob (Member) commented Feb 23, 2018:

Let's do it then! 🎉

OttoWinter deleted the cast-uuid branch February 23, 2018 17:58
balloob pushed a commit that referenced this pull request Feb 25, 2018
* Cast unique_id and async discovery

* Lazily load chromecasts

* Lint

* Fixes & Improvements

* Fixes

* Improve disconnects

cast.disconnect with blocking=False does **not** do I/O; it simply sets an event for the socket client looper

* Add tests

* Remove unnecessary calls

* Lint

* Fix use of hass object
balloob mentioned this pull request Feb 25, 2018
home-assistant locked and limited conversation to collaborators May 29, 2018