Description
After the first few punches in a fight, the game just crashes with a device loss on Mali G52 with driver v18 (model: HUAWEI:JNY-LX1).
#02 pc 00000000008675f8 arm64/libppsspp_jni.so (HandleAssert(char const*, char const*, int, char const*, char const*, ...)+344) (BuildId: e2a665cce99374cec395cfef5394541ea8505f76)
#03 pc 00000000008398cc arm64/libppsspp_jni.so (VulkanRenderManager::BeginFrame(bool, bool)+280) (BuildId: e2a665cce99374cec395cfef5394541ea8505f76)
#04 pc 0000000000dd18b0 arm64/libppsspp_jni.so (Draw::VKContext::BeginFrame(Draw::DebugFlags)+32) (BuildId: e2a665cce99374cec395cfef5394541ea8505f76)
#05 pc 00000000008851b0 arm64/libppsspp_jni.so (NativeFrame(GraphicsContext*)+748) (BuildId: e2a665cce99374cec395cfef5394541ea8505f76)
Vulkan validation doesn't flag any problems, so this one might be real tricky to track down...
Runs fine in OpenGL.
Reported by a user on Google Play.
Confirmed affected devices:
- Galaxy S8+ (Mali G71)
- Huawei (Mali G52)
- Galaxy S21 Ultra (Mali G78)
The last one behaved slightly differently: after the bug hits, it stumbles along at 1 fps for a few frames, hangs, and dies after a delay. While it stumbles, our tracked GPU memory consumption does not seem to increase.
02-01 19:26:42.704 22312 22745 E Fence : waitForever: Throttling EGL Production: fence 158 didn't signal in 3000 ms
02-01 19:26:42.704 22312 22745 I Fence : waitForever: fence(mali-mali.timeline1512757-357) status(0)
02-01 19:26:42.704 22312 22745 I Fence : waitForever: sync point: timeline(mali.timeline) drv(mali) status(0) timestamp(0.000000)
I'm trying to rule out causes, some notes:
- Confirmed that this happens on a Galaxy S8+ with PPSSPP 16.4, so not new. It also performs horrendously!
- Enabling the robustBufferAccess feature on the device doesn't do anything.
- Running with Android validation layers doesn't catch anything.
- Skip buffer effects doesn't help
- It is not due to any of these (checked by setting breakpoints that they don't happen):
  - VKRStepType::COPY
  - VKRStepType::BLIT
  - VKRStepType::READBACK
  - VKRStepType::READBACK_IMAGE
- Tried removing all skinned draws and the clear optimization, still crashes
- Disabling pipeline id caching doesn't help.
- Tried removing all but the skinned hardware-transformed draws. At first it didn't seem to crash, which would have pointed at a draw of the background geometry (unless we were simply avoiding some limit). Never mind, it does crash, but it's harder to trigger.
- Removing all hardware-transformed draws seems to make it stable. So does enabling software transform, it seems, but it's very slow so not sure.
- Removing all SOFTWARE-transformed draws ALSO makes it stable, or at least apparently so! This is a promising avenue for investigation. Letting the RECTs through is fine, as well.
- Just filtering out lines (which the game uses a lot) doesn't help.
- Loading a savestate in the middle of the match is somehow stable. Starting to suspect it's some very malformed draw.
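
For anyone who wants to repeat the elimination tests above, here is a minimal sketch of a skip mask for bisecting by draw category. The enum, its values, and `ShouldSubmitDraw` are hypothetical names made up for illustration, not PPSSPP's actual code; the real draw path would need to classify each draw and consult something like this before submitting.

```cpp
#include <cstdint>

// Hypothetical draw categories for bisecting. These names are illustrative
// and do not correspond to identifiers in PPSSPP.
enum DrawCategory : uint32_t {
    DRAW_HW_TRANSFORM = 1 << 0,  // hardware-transformed, not skinned
    DRAW_HW_SKINNED   = 1 << 1,  // hardware-transformed, skinned
    DRAW_SW_TRANSFORM = 1 << 2,  // software-transformed
    DRAW_RECT         = 1 << 3,  // rectangles
    DRAW_LINE         = 1 << 4,  // lines
};

// Returns true if a draw of the given category should be submitted.
// skipMask has one bit set for every category currently filtered out.
inline bool ShouldSubmitDraw(DrawCategory category, uint32_t skipMask) {
    return (skipMask & category) == 0;
}
```

With this, "remove all software-transformed draws but let the RECTs through" would be a mask of just `DRAW_SW_TRANSFORM`, assuming rectangles are classified separately.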
The game does a lot of very suboptimal indexed draws in succession (spread out indices over a large range of vertices, which Mali recommends against), but I don't think we're hitting https://community.arm.com/support-forums/f/graphics-gaming-and-vr-forum/49770/do-we-need-to-repack-our-vertex-buffers-for-mali-g76-to-avoid-vk_device_lost or https://community.arm.com/arm-community-blogs/b/graphics-gaming-and-vr-blog/posts/memory-limits-with-vulkan-on-mali-gpus .
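
To put a number on how spread out those indexed draws are, something like the following could be logged per draw. This is only a sketch: `IndexSpread` and `MeasureIndexSpread` are hypothetical helpers, 16-bit indices are an assumption, and no threshold is implied for what Mali actually tolerates.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not PPSSPP code: summarizes how sparsely an indexed
// draw references the vertex range it spans.
struct IndexSpread {
    uint32_t minIndex = 0;
    uint32_t maxIndex = 0;
    size_t uniqueVertices = 0;  // distinct vertices actually referenced
    size_t rangeSize = 0;       // maxIndex - minIndex + 1

    // 1.0 means every vertex in [minIndex, maxIndex] is used;
    // values near 0 mean the indices are scattered over a large range.
    double Density() const {
        return rangeSize ? double(uniqueVertices) / double(rangeSize) : 1.0;
    }
};

// Assumes 16-bit indices. Copies and sorts the index list, so this is meant
// for debugging only, not for the hot path.
IndexSpread MeasureIndexSpread(const std::vector<uint16_t> &indices) {
    IndexSpread s;
    if (indices.empty())
        return s;
    std::vector<uint16_t> sorted(indices);
    std::sort(sorted.begin(), sorted.end());
    sorted.erase(std::unique(sorted.begin(), sorted.end()), sorted.end());
    s.minIndex = sorted.front();
    s.maxIndex = sorted.back();
    s.uniqueVertices = sorted.size();
    s.rangeSize = size_t(s.maxIndex) - size_t(s.minIndex) + 1;
    assert(s.uniqueVertices <= s.rangeSize);
    return s;
}
```

A draw using indices 0, 1000 and 2000 would report a range of 2001 vertices with only 3 referenced, which is the kind of pattern the Mali guidance recommends against.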
Found another bug while at it: toggling Skip buffer effects and backing out to the pause menu can cause a crash.