8000 fix: maintain a ref count for objects sent over remote by MarshallOfSound · Pull Request #17464 · electron/electron · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

fix: maintain a ref count for objects sent over remote #17464

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Apr 16, 2019

Conversation

MarshallOfSound
Copy link
Member
@MarshallOfSound MarshallOfSound commented Mar 19, 2019

Remote Fix

See #17464 (comment) for explanation

Fixes #17039

Refs: https://chromium-review.googlesource.com/c/chromium/src/+/1514516

This also includes the test change as that is the "repro" for this issue 😕

Testing Changes

Previously when this assertion failed all tests that ran after the failed assertion also failed. This ensure that the assertion fails for the test that actually caused the issue but cleans up the left-over windows so that future tests do not fail.

Release Notes

Notes: Fixed race condition where the remote module would sometimes fail to fetch properties of a remote object

@electron-cation electron-cation bot added the new-pr 🌱 PR opened recently label Mar 19, 2019
@codebytere codebytere changed the title spec: clean up after a failed window count assertion test: clean up after a failed window count assertion Mar 20, 2019
@electron-cation electron-cation bot removed the new-pr 🌱 PR opened recently label Mar 20, 2019
Previously when this assertion failed all tests that ran after the
failed assertion also failed.  This ensure that the assertion fails for
the test that actually caused the issue but cleans up the left-over
windows so that future tests do not fail.
@MarshallOfSound MarshallOfSound force-pushed the clean-up-window-assertion branch from 367088b to 54d20d1 Compare April 15, 2019 20:55
@MarshallOfSound MarshallOfSound changed the title test: clean up after a failed window count assertion fix: post remote deref message asyncronously Apr 15, 2019
@MarshallOfSound MarshallOfSound force-pushed the clean-up-window-assertion branch from 54d20d1 to 89f9de3 Compare April 15, 2019 23:29
@zcbenz
Copy link
Contributor
zcbenz commented Apr 16, 2019

This would trigger another regression. Consider following situation:

  1. remote.getCurrentWindow() to create remote object A for current window
  2. Garbage collection
  3. OnObjectGC called for remote object A, RunDestructor schedule for next tick
  4. remote.getCurrentWindow() to create remote object B for current window
  5. RunDestructor gets called, and the objectRegistry in the main process clears the entry for current window
  6. Any operations on remote object B now ends up with missing remote object error

See #4733 (comment) for more history on the original regression, and #4869 for the fix.

I think this PR fixes the CI failure because it deferred the RunDestructor so the actual problem is now hidden.

@MarshallOfSound MarshallOfSound force-pushed the clean-up-window-assertion branch from 89f9de3 to f737634 Compare April 16, 2019 04:23
@MarshallOfSound MarshallOfSound changed the title fix: post remote deref message asyncronously fix: maintain a ref count for objects sent over remote Apr 16, 2019
@MarshallOfSound
Copy link
Member Author

@zcbenz I have updated the PR with a (IMO) better approach to ensuring that we only ever remove an object from the cache if all sent references have been dereffed.

@MarshallOfSound
Copy link
Member Author

IMG_20190415_190716

This is the exact race condition we were hitting. In a rare case when a remote object was fetched a second time and a GC occured at the right time (between the main process sending the remote object metadata and the renderer process pulling the proxy out of a weak map) the renderer would tell the main process that the object had been GC'ed but would then on the next tick receive a remote message telling it to use the object it just free'ed.

I've included the commit message below for more info:

Previously there was a race condition where a GC could occur in the
renderer process between the main process sending a meta.id and the
renderer pulling the proxy out its weakmap to stop it being GC'ed.

This fixes that race condition by maintaining a "sent" ref count in the
object registry and a "received" ref count in the object cache on the
renderer side. The deref request now sends the number of refs the
renderer thinks it owns, if the number does not match the value in the
object registry it is assumed that there is an IPC message containing a
new reference in flight and this race condition was hit.

The browser side ref count is then reduced and we wait for the new deref
message. This guaruntees that an object will only be removed from the
registry if every reference we sent has been guarunteed to be unreffed.

@MarshallOfSound MarshallOfSound force-pushed the clean-up-window-assertion branch from f737634 to cb9aeb9 Compare April 16, 2019 04:30
Copy link
Contributor
@zcbenz zcbenz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have ran the code in my mind, seems to be a perfect fix!

@MarshallOfSound
Copy link
Member Author

Note to self: test failures are because I didn't update https://github.com/electron/electron/blob/master/lib/browser/objects-registry.js#L69

Secondary note to self: typescript would have caught this 😆

Previously there was a race condition where a GC could occur in the
renderer process between the main process sending a meta.id and the
renderer pulling the proxy out its weakmap to stop it being GC'ed.

This fixes that race condition by maintaining a "sent" ref count in the
object registry and a "received" ref count in the object cache on the
renderer side.  The deref request now sends the number of refs the
renderer thinks it owns, if the number does not match the value in the
object registry it is assumed that there is an IPC message containing a
new reference in flight and this race condition was hit.

The browser side ref count is then reduced and we wait for the new deref
message.  This guaruntees that an object will only be removed from the
registry if every reference we sent has been guarunteed to be unreffed.
@MarshallOfSound MarshallOfSound force-pushed the clean-up-window-assertion branch from 6347a68 to 3e0f213 Compare April 16, 2019 18:22
@MarshallOfSound MarshallOfSound merged commit 78411db into master Apr 16, 2019
@release-clerk
Copy link
release-clerk bot commented Apr 16, 2019

Release Notes Persisted

Fixed race condition where the remote module would sometimes fail to fetch properties of a remote object

@MarshallOfSound MarshallOfSound deleted the clean-up-window-assertion branch April 16, 2019 20:08
@trop
Copy link
Contributor
trop bot commented Apr 16, 2019

I was unable to backport this PR to "5-0-x" cleanly;
you will need to perform this backport manually.

MarshallOfSound added a commit that referenced this pull request Apr 16, 2019
* spec: clean up after a failed window count assertion

Previously when this assertion failed all tests that ran after the
failed assertion also failed.  This ensure that the assertion fails for
the test that actually caused the issue but cleans up the left-over
windows so that future tests do not fail.

* fix: maintain a ref count for objects sent over remote

Previously there was a race condition where a GC could occur in the
renderer process between the main process sending a meta.id and the
renderer pulling the proxy out its weakmap to stop it being GC'ed.

This fixes that race condition by maintaining a "sent" ref count in the
object registry and a "received" ref count in the object cache on the
renderer side.  The deref request now sends the number of refs the
renderer thinks it owns, if the number does not match the value in the
object registry it is assumed that there is an IPC message containing a
new reference in flight and this race condition was hit.

The browser side ref count is then reduced and we wait for the new deref
message.  This guaruntees that an object will only be removed from the
registry if every reference we sent has been guarunteed to be unreffed.
@trop
Copy link
Contributor
trop bot commented Apr 16, 2019

A maintainer has manually backported this PR to "5-0-x", please check out #17825

1 similar comment
@trop
Copy link
Contributor
trop bot commented Apr 16, 2019

A maintainer has manually backported this PR to "5-0-x", please check out #17825

MarshallOfSound added a commit that referenced this pull request Apr 16, 2019
* spec: clean up after a failed window count assertion

Previously when this assertion failed all tests that ran after the
failed assertion also failed.  This ensure that the assertion fails for
the test that actually caused the issue but cleans up the left-over
windows so that future tests do not fail.

* fix: maintain a ref count for objects sent over remote

Previously there was a race condition where a GC could occur in the
renderer process between the main process sending a meta.id and the
renderer pulling the proxy out its weakmap to stop it being GC'ed.

This fixes that race condition by maintaining a "sent" ref count in the
object registry and a "received" ref count in the object cache on the
renderer side.  The deref request now sends the number of refs the
renderer thinks it owns, if the number does not match the value in the
object registry it is assumed that there is an IPC message containing a
new reference in flight and this race condition was hit.

The browser side ref count is then reduced and we wait for the new deref
message.  This guaruntees that an object will only be removed from the
registry if every reference we sent has been guarunteed to be unreffed.
kiku-jw pushed a commit to kiku-jw/electron that referenced this pull request May 16, 2019
* spec: clean up after a failed window count assertion

Previously when this assertion failed all tests that ran after the
failed assertion also failed.  This ensure that the assertion fails for
the test that actually caused the issue but cleans up the left-over
windows so that future tests do not fail.

* fix: maintain a ref count for objects sent over remote

Previously there was a race condition where a GC could occur in the
renderer process between the main process sending a meta.id and the
renderer pulling the proxy out its weakmap to stop it being GC'ed.

This fixes that race condition by maintaining a "sent" ref count in the
object registry and a "received" ref count in the object cache on the
renderer side.  The deref request now sends the number of refs the
renderer thinks it owns, if the number does not match the value in the
object registry it is assumed that there is an IPC message containing a
new reference in flight and this race condition was hit.

The browser side ref count is then reduced and we wait for the new deref
message.  This guaruntees that an object will only be removed from the
registry if every reference we sent has been guarunteed to be unreffed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

remote API is flaking on 5.0.x
3 participants
0