8000 Deadlock in PlaylistViewController · Issue #3405 · iina/iina · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Deadlock in PlaylistViewController #3405

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
1 task
anohren opened this issue May 8, 2021 · 5 comments
Closed 8000
1 task

Deadlock in PlaylistViewController #3405

anohren opened this issue May 8, 2021 · 5 comments

Comments

@anohren
Copy link
Contributor
anohren commented May 8, 2021

Seems to be a deadlock at

player.playlistQueue.sync {

System and IINA version:

  • macOS mojave
  • IINA 1.1.2

Expected behavior:

Actual behavior:

827 samples of frozen GUI:
Call graph:
    827 Thread_1071212   DispatchQueue_1: com.apple.main-thread  (serial)
    + 827 start  (in libdyld.dylib) + 1  [0x7fff648853d5]
    +   827 main  (in IINA) + 9  [0x1084b2429]
    +     827 NSApplicationMain  (in AppKit) + 777  [0x7fff35e12ac8]
    +       827 -[NSApplication run]  (in AppKit) + 755  [0x7fff35e235c0]
    +         827 -[NSApplication(NSEvent) sendEvent:]  (in AppKit) + 331  [0x7fff35e34e4b]
    +           827 -[NSWindow(NSEventRouting) sendEvent:]  (in AppKit) + 478  [0x7fff35f95667]
    +             827 -[NSWindow(NSEventRouting) _reallySendEvent:isDelayedEvent:]  (in AppKit) + 2295  [0x7fff35f961a6]
    +               827 -[NSWindow(NSEventRouting) _handleMouseDownEvent:isDelayedEvent:]  (in AppKit) + 5724  [0x7fff3605f937]
    +                 827 -[NSTableView mouseDown:]  (in AppKit) + 6161  [0x7fff361d9a81]
    +                   827 -[NSTableView _sendSelectionChangedNotificationForRows:columns:]  (in AppKit) + 228  [0x7fff35fe0942]
    +                     827 -[NSNotificationCenter postNotificationName:object:userInfo:]  (in Foundation) + 66  [0x7fff3aab9aab]
    +                       827 _CFXNotificationPost  (in CoreFoundation) + 840  [0x7fff38815667]
    +                         827 -[_CFXNotificationRegistrar find:object:observer:enumerator:]  (in CoreFoundation) + 1834  [0x7fff388163bc]
    +                           827 ___CFXNotificationPost_block_invoke  (in CoreFoundation) + 87  [0x7fff388ad3ad]
    +                             827 _CFXRegistrationPost  (in CoreFoundation) + 404  [0x7fff388a4f7e]
    +                               827 ___CFXRegistrationPost_block_invoke  (in CoreFoundation) + 63  [0x7fff388a5014]
    +                                 827 __CFNOTIFICATIONCENTER_IS_CALLING_OUT_TO_AN_OBSERVER__  (in CoreFoundation) + 12  [0x7fff388a509a]
    +                                   827 @objc PlaylistViewController.tableViewSelectionDidChange(_:)  (in IINA) + 110  [0x1085db00e]
    +                                     827 PlaylistViewController.tableViewSelectionDidChange(_:)  (in IINA) + 222  [0x1085daf7e]
    +                                       827 PlaylistViewController.showTotalLength()  (in IINA) + 515  [0x1085d7b93]
    +                                         827 _dispatch_sync_f_slow  (in libdispatch.dylib) + 171  [0x7fff64843fce]
    +                                           827 __DISPATCH_WAIT_FOR_QUEUE__  (in libdispatch.dylib) + 269  [0x7fff6484436e]
    +                                             827 _dispatch_event_loop_wait_for_ownership  (in libdispatch.dylib) + 498  [0x7fff64853cd7]
    +                                               827 _dispatch_kq_poll  (in libdispatch.dylib) + 247  [0x7fff6485315f]
    +                                                 827 kevent_id  (in libsystem_kernel.dylib) + 10  [0x7fff649bbc86]



mpv log:

Steps to reproduce:

  1. Load m3u playlist of youtube URLs (maybe 1000 items)
  2. Double click some items until it invariably locks up
  • MPV does not have this problem.

How often does this happen?
Many times a day / every launch

@saagarjha
Copy link
Member

Do you happen to have the entire sample (or even better, a spindump targeting the process)?

@anohren
Copy link
Contributor Author
anohren commented May 9, 2021

As it so happens

Sample

@low-batt
Copy link
Contributor

Analysis

This is a regression introduced by this commit added Feb 27, 2020:
Revert "Guard thread unsafe dictionary in the singleton"

This commit says it reverts the following commit:
Guard thread unsafe dictionary in the singleton

But that commit did not modify PlaylistViewController.showTotalLength. In addition to reverting changes the revert commit also added the offending call to player.playlistQueue.sync cited above to showTotalLength. Apparently the revert commit tried to replace a previous attempt at fixing an issue with a correct one.

Neither commit mentions an issue number. Apparently something was going wrong with the total length displayed by the playlist. The cause appears to be related to mult-threaded access to the swift dictionaries cachedVideoDurationAndProgress and cachedMetadata in PlaybackInfo. Mutable Swift dictionaries are not thread safe. Access must be coordinated if multiple threads reference the dictionary. The changes are using playlistQueue to coordinate access to the dictionaries. Using a serial DispatchQueue to single thread access to a mutable dictionary is a recommended Swift technique.

BUT the clip from the spindump posted in the issue shows that showTotalLength is being called as a part of a task that is being executed by the main DispatchQueue. Tasks in this queue must not take a long time to complete. If the main DispatchQueue falls behind in processing events the macOS window server will mark the application as being unresponsive and will display the spinning wait cursor. The addition of the call to playlistQueue.sync has caused the requirement to not submit long running tasks to the main queue to also apply to the playlistQueue queue as well. So we need to examine what the full spindump shows the playlistQueue is executing.

The spindump shows the playlistQueue is executing a task that calls methods such as:

  • getaddrinfo
  • gnutls_handshake

The getaddrinfo function translates DNS names to IP addresses. The gnutls_handshake function performs a TLS handshake to establish an encrypted connection to a server. So the task executing in the main DispatchQueue is blocked on a task from the playlistQueue that is performing network operations. Network disruptions such as overloaded servers can result in timeouts and retries. Therefore normally network operations should be considered long running and when feasible executed in the background. So what code is currently executing in the playlistQueue?

The method PlaylistViewController.tableView is submitting a task to playlistQueue:

player.playlistQueue.async {

The task being submitted tries to obtain the duration for an entry in the playlist from cachedVideoDurationAndProgress. If the cache does not contain the information then the method PlayerCore.refreshCachedVideoInfo will be called:

self.player.refreshCachedVideoInfo(forVideoPath: item.filename)

As per this comment this is a long running method:

  /**
   Get video duration, playback progress, and metadata, then save it to info.
   It may take some time to run this method, so it should be used in background.
   */
  func refreshCachedVideoInfo(forVideoPath path: String) {
    guard let dict = FFmpegController.probeVideoInfo(forFile: path) else { return }

The method FFmpegController.probeVideoInfo calls the method avformat_open_input and potentially avformat_find_stream_info. These are FFmpeg libavformat methods that will connect to streaming servers and download some of the stream to obtain information about the video. This explains the calls to methods that perform network operations seen in the spindump.

Conclusions:

  • The task running in the main DispatchQueue is pended waiting for a long running background job to finish, this must be corrected
  • The playlistQueue can not be used to both execute background tasks and coordinate single threaded access to the dictionaries
  • The fix applied must preserve proper coordination of multi-threaded access to the mutable dictionaries cachedVideoDurationAndProgress and cachedMetadata

Fixing

I believe the proper Swift way to fix this issue with thread interaction is to use Actors and Async-await. But that is not currently an option. These language features have been added in Swift 5.5, which is available in Xcode 13, however at this time Xcode 13 is in beta. Still useful to be thinking about actors as knowing the future direction of Swift can influence how the fix is structured.

When Swift 5.5 is available, we'd either refactor the PlaybackInfo class into an actor, or breakout the portion of the data that is accessed by multiple threads into a separate actor. To prepare for such a change today we'd want to make the PlaybackInfo class responsible for coordinating thread access to its own data.

The commit in the pull request I will be submitting shortly:

  • Changes cachedVideoDurationAndProgress and cachedMetadata to be private to the class.
  • Adds a private DispatchQueue named lockQueue to coordinate thread access to the cache
  • Adds methods for operating on the cache dictionaries that use lockQueue to single thread the operations
  • Dedicates the DispatchQueue playlistQueue to background tasks

An advantage of making these dictionaries private to the class is that it forces a new programmer to look at the methods provided by the class which will then make it obvious to the programmer that the data is shared between threads and they had better be taking that into consideration when making changes.

The current code is forcing single threaded access over a large set of operations. These changes reduce that to only single thread access to an individual dictionary. Is there some need for atomic access involving multiple dictionaries/data that I missed? Worries me that because I don't know the details of the issue that the current code fixed these changes could bring back that issue.

There are other mutable dictionaries in PlaybackInfo that are accessed by the UI and by background tasks such as matchedSubs. Are there other issues lurking with multi-threaded access to PlaybackInfo data?

Thoughts? Critical reviews including criticism of style and naming are welcome.

Would be good to hear from @inflation and @Alejx as to whether these changes will break what the previous commits fixed.

low-batt added a commit to low-batt/iina that referenced this issue Aug 15, 2021
This commit:
- Changes cachedVideoDurationAndProgress and cachedMetadata to be
  private to the PlaybackInfo class.
- Adds a private DispatchQueue named lockQueue to coordinate thread
  access to the cache
- Adds methods for operating on the cache dictionaries that use lockQueue
  to single thread the operations
- Dedicates the DispatchQueue playlistQueue to background tasks

This corrects a problem where the main thread is blocked waiting for a
background task to finish.
lhc70000 added a commit that referenced this issue Apr 23, 2022
* Fix deadlock in PlaylistViewController, #3405

This commit:
- Changes cachedVideoDurationAndProgress and cachedMetadata to be
  private to the PlaybackInfo class.
- Adds a private DispatchQueue named lockQueue to coordinate thread
  access to the cache
- Adds methods for operating on the cache dictionaries that use lockQueue
  to single thread the operations
- Dedicates the DispatchQueue playlistQueue to background tasks

This corrects a problem where the main thread is blocked waiting for a
background task to finish.

* Add missing getCachedVideoDurationAndProgress

Co-authored-by: low-batt <86170219+low-batt@users.noreply.github.com>
lhc70000 added a commit that referenced this issue Apr 24, 2022
* Fix deadlock in PlaylistViewController, #3405

This commit:
- Changes cachedVideoDurationAndProgress and cachedMetadata to be
  private to the PlaybackInfo class.
- Adds a private DispatchQueue named lockQueue to coordinate thread
  access to the cache
- Adds methods for operating on the cache dictionaries that use lockQueue
  to single thread the operations
- Dedicates the DispatchQueue playlistQueue to background tasks

This corrects a problem where the main thread is blocked waiting for a
background task to finish.

* Add missing getCachedVideoDurationAndProgress

Co-authored-by: low-batt <86170219+low-batt@users.noreply.github.com>
@low-batt
Copy link
Contributor

The fix in PR #3699 has been merged into the develop branch.

@low-batt
Copy link
Contributor

Closing. Fixed in IINA 1.3.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants
0