Cast unique_id and async discovery #12474
Conversation
@@ -33,90 +38,159 @@
    SUPPORT_TURN_ON | SUPPORT_TURN_OFF | SUPPORT_PREVIOUS_TRACK | \
    SUPPORT_NEXT_TRACK | SUPPORT_PLAY_MEDIA | SUPPORT_STOP | SUPPORT_PLAY

KNOWN_HOSTS_KEY = 'cast_known_hosts'
INTERNAL_DISCOVERY_RUNNING = 'cast_discovery_running'
In the past, we'd just have a separate pychromecast discovery run for every single Chromecast. That was an immense waste of resources. This PR uses at most one discovery service. It needs to know whether its own discovery service is already running, though, so I store that flag in hass.data[cast_discovery_running]. If this is the wrong place, please tell me.
"""Called when zeroconf has discovered a new chromecast.""" | ||
try: | ||
chromecast = pychromecast._get_chromecast_from_host( | ||
listener.services[name], blocking=True) |
Using a protected function here as a temporary hack. We need some way of using the non-blocking discovery with blocking Chromecast objects. If we want to merge this PR, I would of course create a PR in pychromecast to allow this without a "hack".
Additionally, there's another potential problem here (that was also there before): every time we find a chromecast, we create a full Chromecast object, with a separate Thread object, socket, and whatnot (related: #11945).
We might be able to get away in some cases by just matching the host, port, and information we get from zeroconf.
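For illustration, a minimal sketch of deduplicating purely on the zeroconf data, without building a Chromecast object first; the dict name and key layout are illustrative, loosely mirroring what this PR stores later:

KNOWN_CHROMECASTS_KEY = 'cast_known_chromecasts'


def is_known_cast(hass_data: dict, host: str, port: int, uuid) -> bool:
    """Return True if this (host, port, uuid) combination was seen before."""
    known = hass_data.setdefault(KNOWN_CHROMECASTS_KEY, {})
    return (host, port, uuid) in known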
    add_devices(casts)
    # Manually add a "normal" Chromecast, we can do that without discovery.
    try:
        chromecast = pychromecast.Chromecast(*want_host)
Having discovery: and pychromecast's discovery running in parallel is extremely prone to bugs. I understand the reasons why we're doing it right now, but we should try to change it in the future (if at all possible).
We can always check if discovery is loaded (via 'discovery' in hass.config.components) and track whether discovery gets loaded after this platform (by listening for the component_loaded event) to disable discovery when needed, as sketched below.
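A rough, untested sketch of what that could look like; the _disable_internal_discovery helper and the event-data key are assumptions, not part of this PR:

def _watch_discovery_component(hass):
    """Turn off our own discovery once the discovery component is loaded."""
    if 'discovery' in hass.config.components:
        _disable_internal_discovery(hass)  # hypothetical helper
        return

    def component_loaded(event):
        # Event name taken from the comment above; the 'component' data key is assumed.
        if event.data.get('component') == 'discovery':
            _disable_internal_discovery(hass)

    hass.bus.listen('component_loaded', component_loaded)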
Let me clarify: right now I think it's better to have both discoveries active since, AFAIK, they use separate systems: mDNS for pychromecast and UPnP for netdisco. Plus there's the issue of multiple network interfaces. And removing one system could lead to a lot of users complaining about their devices not being discovered anymore.
During the creation of this PR I found lots and lots of edge cases of the form: what if pychromecast discovers A, but discovery doesn't, yet discovery finds B, and so on. Supporting all of them was difficult, so using only one discovery service might be easier.
Edit: Oh, it appears netdisco supports mDNS too. I might take a look at just using netdisco later, as that could be cleaner.
Tested with the following configurations using a Google Home Mini and a Chromecast:

1. discovery:

2. discovery:
   media_player:
     - platform: cast

3. media_player:
     - platform: cast

4. media_player:
     - platform: cast
       host: 192.168.178.153

5. media_player:
     - platform: cast
     - platform: cast
       host: 192.168.178.153
For a future PR: to get rid of threads and run in non-blocking mode, we should be able to do something like this:

@callback
def socket_readable():
    # This will cause the status listeners to be executed from within the event loop
    chromecast.socket_client.run_once()

hass.loop.add_reader(chromecast.socket_client.get_socket(), socket_readable)
The cast platform is one of our oldest integrations. I'm glad that you're working on cleaning it up 👍
import logging
from typing import Tuple
'typing.Tuple' imported but unused
Apologies up front if this is the wrong forum / site / effort for the information provided below. I've never posted to GitHub, and certainly don't want to start off going against the rules.

I have some useful automations that depend on reliably discovering cast devices and groups. I also have a wide array of cast devices: 4 Google Homes, 10 CCAs, 1 first-gen CCV, 2 Nvidia Shields, 1 Insignia alarm clock with CC built-in, and a Sony speaker with CC built-in. Everything below is since late last year with the breaking firmware update of Google Cast devices. Prior to that, everything was fine.

I have tried every combination of discovery that OttoWinter lays out above multiple times, and the results are similar. Currently I'm set up with platform: cast, global discovery turned off, and NOT specifying individual hosts. Under the current 0.63.3 cast.py (and previous iterations since the breaking firmware update), the most common outcome is that during discovery I either get A) groups and units with older firmware, or B) the individual units with no groups. If I get the individual units, then they may fall out of the Google Home app. This means they also aren't available as cast targets from anything else (Pandora, Google Play Music). It's as though they can't exist in both Hass and Google Home simultaneously. When I get groups and the units with older firmware, those are able to co-exist within Home Assistant and the Google Home app. My automations will work when I cast to one of the older devices or one of the groups. Automations will not work, however, when casting to the units with newer firmware, since they aren't in Home Assistant as entities.

All that said, notice that I used terms like "most common" and "may" in describing the scenarios above. Yesterday, after restarting my router, restarting all the cast devices, restarting the Google Home app, and doing a rain dance, all of the devices and groups showed up when I restarted Home Assistant. And all my automations (some of which even my wife was impressed with) worked for 6 glorious hours. Then I made the mistake of adding some sensors and doing a Hass restart. I did everything exactly as in the previous startup and still only got the individual units. And in this go-round, not all of the newer-firmware devices fell out of the Google Home app. Inexplicably, 2 out of the 10 new-firmware CCAs were available as cast targets. No idea why.

Anyway, I'm happy to test anything you want me to test, and under any testing rubric you'd like me to try. I'm a Home Assistant addict, but my full-time job is about as far away from anything tech-related as one can get. This means any skills I've developed over the years have been self-taught and are woefully inadequate for providing the heavy lifting in this exercise. I have a strong attention to detail, however, and I can copy/paste and download with the best of 'em. I'm giving this branch of cast.py a spin today, and I'll let you know what happens.
@benjamin1492 What you're describing here is not really related to this PR and should probably rather be posted as an issue (see the "Issues" tab at the top of this page). But I could very well use you as my first guinea pig to test this code change 😄. See, I'm quite certain this code change won't fix your problem, but it does improve debugging, which can help resolve your problem. So it would be great if you could override your logger configuration:

logger:
  default: info
  logs:
    pychromecast.socket_client: debug
    pychromecast: debug
    homeassistant.components.media_player.cast: debug

and send me your debug log. Again, further discussion should be moved elsewhere (the community forums, Discord, or even e-mail contact@otto-winter.com if you like) as it's not really related to this code change.
@balloob Good to know about the …
Config entries have been merged; it's something we can explore in a future PR.
import pychromecast


def _setup_internal_discovery(hass: HomeAssistantType) -> None:
    """Set up the pychromecast internal discovery."""
    if hass.data.get(INTERNAL_DISCOVERY_RUNNING_KEY, False):
Correct me if I'm wrong: since this is not an async method, it will be called inside an executor thread, right? In that case, this kind of locking is prone to race conditions. If a second call of this method starts while the first is still executing between lines 57 and 86, the "lock" will not work. Perhaps use a mutex instead of a boolean?
Good catch! Will do.
So apparently there are no atomic booleans in Python, so I tried using asyncio.Lock. But that doesn't allow atomic checking and setting at the same time, so I reverted to using threading.Lock.
""" | ||
|
||
if chromecast.uuid is None: | ||
# Found a cast without UUID, we don't store it because we won't be able |
The docstring in line 42 says that you store devices without UUID under the None key.
Oh sorry, that was from a previous implementation. As there's no need for us to store devices with a None UUID, I have removed them from the dict to keep the code and the dict schema a bit cleaner.
        return CastDevice(chromecast)

    # Found a cast with UUID
    known_casts = hass.data[ADDED_CAST_DEVICES_KEY]
So you have "known" and "added" casts, and here you assign the ADDED_CAST_DEVICES_KEY contents to known_casts. I'm confused: what's the difference? Should this be added_casts = …?
Sure, I'm using both a bit interchangeably here, since every time we "know" a chromecast, it is also added.
Edit: OK, my previous comment is not entirely true; there are cases where we know a chromecast but don't add it (groups).
    if want_host is None:
        # We were explicitly told to enable pychromecast discovery.
        enable_discovery = True
    elif want_host[1] != DEFAULT_PORT:
Doesn't this fail if the discovery info in line 143 sets the port to DEFAULT_PORT by chance?
That's the whole point of this line! We only need to run discovery if we find a group (and groups always have non-standard ports). So if discovery: finds a chromecast with DEFAULT_PORT, we don't need to run our own discovery.
Okay, if groups are guaranteed to live on different ports, that should work. 👍
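As a sketch of that reasoning (mirroring the diff above; 8009 is the usual Chromecast port, while groups get dynamic ports):

DEFAULT_PORT = 8009


def needs_internal_discovery(want_host) -> bool:
    """Decide whether pychromecast's own mDNS discovery must be started."""
    if want_host is None:
        return True                  # explicitly asked for discovery
    _host, port = want_host
    return port != DEFAULT_PORT      # non-default port means this is likely a group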
    except pychromecast.ChromecastConnectionError:
        _LOGGER.warning("Can't set up chromecast on %s", want_host[0])
        raise
    else:
Since you raise above, you don't need to wrap this into an else, right?
No, I don't :) Will change it
I also think you need to implement …
Sure, I was initially looking for that method and didn't find it; eventually I forgot about that functionality. Will do.
@@ -54,7 +55,8 @@

def _setup_internal_discovery(hass: HomeAssistantType) -> None:
    """Set up the pychromecast internal discovery."""
-    if hass.data.get(INTERNAL_DISCOVERY_RUNNING_KEY, False):
+    hass.data.setdefault(INTERNAL_DISCOVERY_RUNNING_KEY, threading.Lock())
+    if not hass.data[INTERNAL_DISCOVERY_RUNNING_KEY].acquire():
I think you should pass blocking=False here. Otherwise, this thread will never finish.
Oh noes 🤕 I even went through all the documentation and then forgot to include that parameter 😑.
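For reference, a minimal sketch of the corrected pattern, using a plain dict in place of hass.data; only the lock handling is the point here, the rest is illustrative:

import threading

INTERNAL_DISCOVERY_RUNNING_KEY = 'cast_discovery_running'


def setup_internal_discovery(data: dict) -> None:
    """Start the shared discovery service at most once, without ever blocking."""
    lock = data.setdefault(INTERNAL_DISCOVERY_RUNNING_KEY, threading.Lock())
    if not lock.acquire(blocking=False):
        # Somebody else already holds the lock, i.e. discovery is already running.
        return
    # ... start the single pychromecast discovery listener here ...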
    @asyncio.coroutine
    def async_will_remove_from_hass(self):
        """Disconnect Chromecast object when removed."""
        self._async_disconnect()
You probably also should remove the entity object from hass.data[ADDED_CAST_DEVICES_KEY] here (or in _async_disconnect?).
I'm kinda already doing that on line 129, but I think I should move that into _async_disconnect, as it makes more sense there.
Oh, yes. But that line is never executed if async_will_remove_from_hass is being called from "outside", i.e. not because of a re-discovered entity. Putting it here or in _async_disconnect is the right thing, I guess.
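A sketch of where that removal could live; the layout of hass.data[ADDED_CAST_DEVICES_KEY] as a dict keyed by UUID is an assumption for illustration only:

    @asyncio.coroutine
    def async_will_remove_from_hass(self):
        """Disconnect Chromecast object when removed."""
        self._async_disconnect()
        # Forget this entity so a later re-discovery can add it again
        # (dict-keyed-by-UUID layout assumed).
        self.hass.data[ADDED_CAST_DEVICES_KEY].pop(self._chromecast.uuid, None)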
(Force-pushed from 5ae0e78 to 1be34fe.)
After looking into netdisco's mDNS output: for example, if I print out the discovery info I receive for my two devices, I get:

{
"host": "192.168.178.162",
"port": 8009,
"hostname": "fc73c725-a458-6c01-0bd6-1b7cb49f52e2.local.",
"properties": {
"id": "fc73c725a4586c010bd61b7cb49f52e2",
"cd": "E7873CDA9DFC38A20F9A03064F7B6DEE",
"rm": false,
"ve": "05",
"md": "Chromecast",
"ic": "/setup/icon.png",
"fn": "TV the second",
"ca": "4101",
"st": "0",
"bs": "FA8FCA501A8C",
"nf": "1",
"rs": false
}
}

{
"host": "192.168.178.153",
"port": 8009,
"hostname": "57831e09-bcaf-d41f-37b4-c95b260b2964.local.",
"properties": {
"id": "57831e09bcafd41f37b4c95b260b2964",
"cd": "B8EA4C16D100E7DF551A3314B364F55E",
"rm": false,
"ve": "05",
"md": "Google Home",
"ic": "/setup/icon.png",
"fn": "Hey Y'all Googles",
"ca": "2052",
"st": "0",
"bs": "FA8FCA6F2D53",
"nf": "1",
"rs": false
}
}

Then we could use more or less the same code pychromecast's discovery uses. Two problems with that approach:

Edit: For 2. we could also modify netdisco's Google Cast discoverable. What do you think?
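For illustration, a sketch of pulling the interesting fields out of such a discovery_info dict; the meaning of id (UUID), fn (friendly name) and md (model) is read off the two examples above, everything else is an assumption:

import uuid


def parse_cast_discovery_info(info: dict):
    """Extract (host, port, uuid, model, friendly name) from netdisco mDNS data."""
    props = info.get('properties', {})
    cast_uuid = uuid.UUID(props['id']) if 'id' in props else None
    return (
        info['host'],
        info['port'],
        cast_uuid,
        props.get('md'),  # model, e.g. "Chromecast" or "Google Home"
        props.get('fn'),  # friendly name, e.g. "TV the second"
    )

On the first example above this returns ('192.168.178.162', 8009, UUID('fc73c725-a458-6c01-0bd6-1b7cb49f52e2'), 'Chromecast', 'TV the second').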
The problem is that not everyone has the discovery component running. Especially because it's currently kinda awkward: even if you want something not to be discovered, or if you want to configure parts of it, you can't, because it will just get discovered anyway. Eventually most people will remove discovery: from their config.
I wonder if we should just remove all discovery from components and platforms and instead only allow the discovery component to do discovery. (We should only do this once we've migrated to config entries.)
I see. That's indeed a problem, but isn't that already the case in Home Assistant? I'm already looking forward to trying out config entries 😃, so because we'd eventually move to config entries anyway, I think we should indeed not "change the running system" and just keep pychromecast's internal discovery around. It definitely can't hurt.
Correct, with …
cast.disconnect with blocking=False does **not** do I/O; it simply sets an event for the socket client looper
(Force-pushed from 1be34fe to 6a54042.)
class CastDevice(MediaPlayerDevice):
    """Representation of a Cast device on the network."""

-    def __init__(self, chromecast):
+    def __init__(self, hass, chromecast):
You shouldn't pass the hass object into your constructor, I think. self.hass will be set by add_devices; just use that.
Ah yes, that's right. That came up while adding the test, but I think it was because I'm not using async_setup_component.
    discover_cast('service1', cast_group1)
    yield from hass.async_block_till_done()
    yield from hass.async_block_till_done()  # having jobs that add jobs
I'm not sure why I have to do this, but it works 😅 (I suspect it's due to "jobs that add jobs", but really I don't know)
    _LOGGER.debug("Discovered new chromecast %s", mdns)
    try:
        # pylint: disable=protected-access
        chromecast = pychromecast._get_chromecast_from_host(
I'm not entirely happy with having to create the "complete" Chromecast object every time we find some new cast device with discovery. Each call creates a separate thread, socket, and so on.
It's still much better than the previous version, which would create an entirely new object for each call of setup_platform, but I think we could just store the mdns info in hass.data[KNOWN_CHROMECASTS_KEY] and lazily create Chromecast objects inside CastDevice. That would also make handling disconnects cleaner. But I think that should be in a separate PR (maybe together with add_reader).
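A rough, hypothetical sketch of that lazy approach; the class layout and names are illustrative, not what this PR does:

import pychromecast
from homeassistant.components.media_player import MediaPlayerDevice


class LazyCastDevice(MediaPlayerDevice):
    """Cast entity that only builds the pychromecast object on first use."""

    def __init__(self, host, port, cast_uuid):
        self._host = host
        self._port = port
        self._uuid = cast_uuid
        self._chromecast = None  # created on demand

    def _ensure_chromecast(self):
        """Pay the socket/thread cost of a full Chromecast object only when needed."""
        if self._chromecast is None:
            self._chromecast = pychromecast.Chromecast(self._host, self._port)
        return self._chromecast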
        key = (chromecast.host, chromecast.port, chromecast.uuid)
        cast_device = _async_create_cast_device(hass, chromecast)
        if cast_device is not None:
            hass.data[KNOWN_CHROMECASTS_KEY][key] = chromecast
Do you need to disconnect the chromecast object if you are not going to use it?
I had that in there before, but then I remembered an edge case: what if _async_create_cast_device results in the last branch, where it sets the Chromecast object? Then we'd be invalidating the Chromecast object we just set. I left it out here since it's still much better than the previous code that just created every Chromecast object for every single platform: definition.
And I think that could best be done in another refactor, together with add_reader.
            return_value=(listener, None)) as start_discovery:
        add_devices = yield from async_setup_cast(hass, config, discovery_info)
        yield from hass.async_block_till_done()
        yield from hass.async_block_till_done()
The reason you have to do this twice is because @callback callbacks are scheduled on the event loop without being wrapped in a task, and thus cannot be tracked by async_block_till_done.
By doing another yield, the callback is executed. If it then schedules a coroutine, that will get picked up again by block_till_done.
One pending disconnect comment remains. How do you want to play this? Should we cherry-pick this into 0.64?
Do you guys need any help testing? I have 7 cast devices and would like to help in any way that I can.
@balloob It would be awesome to have all Chromecast issues resolved at the same time (hopefully 😅), but there's always the chance that this introduced another issue. However, I think I've tested this enough now to say that it works with all standard cast devices, and I think it's ready to be in a release 😄. #12609
So I spent some time today testing this even more with a friend of mine (who has tons of cast audio groups set up), and it worked perfectly for him. So I'd consider this ready to be released.
Let's do it then! 🎉
* Cast unique_id and async discovery
* Lazily load chromecasts
* Lint
* Fixes & Improvements
* Fixes
* Improve disconnects: cast.disconnect with blocking=False does **not** do I/O; it simply sets an event for the socket client looper
* Add tests
* Remove unnecessary calls
* Lint
* Fix use of hass object
Description:

🏔 This PR adds a unique_id to the cast platform. While doing so, it makes the platform faster by using async discovery, and it tries to make the cast code base a bit easier to understand. My Chromecast Audio broke some time ago and my new one is only arriving in about a week (living in the mountains does have its downsides 😕). But then I'll also be able to try audio groups.

Related: #12395, #12405
Fixes #12565, fixes #11905, fixes #7959, probably #11945 (fewer socket_client threads created)

Added benefit: Home Assistant startup time is significantly reduced, and those pesky "Setup of platform media_player.cast is taking over 10 seconds." log entries are removed.

Example entry for configuration.yaml (if applicable):

Checklist:

If the code does not interact with devices:
* tox runs successfully. Your PR cannot be merged unless tests pass.