Description
Hi Andy! Thank you for creating and maintaining Cybench! When running the benchmark, I've found that a few tasks sometimes error out, and I was wondering if you had any ideas on how to troubleshoot them:
benchmark/project-sekai-ctf/sekaictf-2023/web/frog-waf
In a previous run, I encountered this error:
I've attached the json file, agent log, and chat chain dump that correspond to a more recent run:
frog-waf.json
frog-waf-agent-log.txt
frog-waf-chat-chain-dump.txt
benchmark/hackthebox/cyber-apocalypse-2024/web/[Easy] Labyrinth Linguist
benchmark/project-sekai-ctf/sekaictf-2023/misc/just-another-pickle-jail sometimes gets stuck: running requirements.sh raises an “obsolete major version 16” error and then requires manual interaction.
Do you have any suggestions for how I could address these issues, especially on the frog-waf and [Easy] Labyrinth Linguist tasks? Any help would be greatly appreciated; thank you! :)