8000 Implement timeout and memory limits for pandoc in scope.sh by dardevelin · Pull Request #3004 · ranger/ranger · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Implement timeout and memory limits for pandoc in scope.sh #3004

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

dardevelin
Copy link
  • Add a timeout of 10 seconds to pandoc commands to prevent indefinite execution.
  • Introduce a memory limit of 500MB using ulimit to restrict pandoc's memory usage.
  • Update relevant sections in scope.sh where pandoc is invoked, including handling OpenDocument, HTML, DOCX, ePub, and FB2 files.
  • Implement error handling for cases where pandoc exceeds the timeout or memory limit, ensuring the script exits gracefully with an appropriate message.
  • Improve the robustness of file previews in ranger by preventing pandoc from causing system instability due to excessive resource consumption.

ISSUE TYPE

  • Improvement/feature implementation

RUNTIME ENVIRONMENT

ranger version: ranger 1.9.3

  • Operating system and version:
    macOS 15.0.0

  • Terminal emulator and version:
    Apple Terminal (Version 453)

  • Python version:
    Python version: 3.11.9 (main, Apr 2 2024, 08:25:04) [Clang 15.0.0 (clang-1500.3.9.4)]

  • Ranger version/commit:
    bd9b37f

  • Locale:
    Locale: en_US.UTF-8

CHECKLIST

  • The CONTRIBUTING document has been read [REQUIRED]
  • All changes follow the code style [REQUIRED]
  • All new and existing tests pass [REQUIRED]
  • Changes require config files to be updated
    • Config files have been updated
  • Changes require documentation to be updated
    • Documentation has been updated
  • Changes require tests to be updated
    • Tests have been updated

DESCRIPTION

This pull request introduces a timeout and memory limit for the pandoc commands used in scope.sh to prevent indefinite execution and excessive memory consumption.

Specifically:

A timeout of 10 seconds is added to the pandoc commands.
Memory usage is limited to 500MB using ulimit.
These changes are applied to sections handling OpenDocument, HTML, DOCX, ePub, and FB2 files.
Error handling is included for cases where pandoc exceeds the timeout or memory limit, ensuring the script exits gracefully with an appropriate message.

MOTIVATION AND CONTEXT

These changes are required to improve the stability and robustness of file previews in ranger when handling files that require pandoc for conversion. Without these limits, pandoc can cause system instability by running indefinitely or consuming excessive resources, especially with large or complex files.

TESTING

Manual tests were conducted with various document types and sizes to ensure that pandoc is correctly terminated when exceeding the specified timeout or memory limits. The changes were verified to work without impacting other areas of the codebase. (more testing may be advised)

- Add a timeout of 10 seconds to pandoc commands to prevent indefinite
  execution.
- Introduce a memory limit of 500MB using ulimit to restrict pandoc's
  memory usage.
- Update relevant sections in scope.sh where pandoc is invoked,
  including handling OpenDocument, HTML, DOCX, ePub, and FB2 files.
- Implement error handling for cases where pandoc exceeds the timeout or
  memory limit, ensuring the script exits gracefully with an appropriate
  message.
- Improve the robustness of file previews in ranger by preventing pandoc
  from causing system instability due to excessive resource consumption.
Copy link
Member
@toonn toonn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea because the behavior is really disruptive but I'm not sure we can find a way to do this cross-platform.

Comment on lines +90 to +98
ulimit -v 500000 # Limit to 500MB
timeout 10s pandoc -s -t markdown -- "${FILE_PATH}" && exit 5
if [ $? -eq 124 ]; then
echo "Pandoc timed out or was terminated due to excessive memory usage" >&2
exit 1
elif [ $? -ne 0 ]; then
echo "Pandoc failed" >&2
exit 1
fi
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's wrap this in a function instead.

@@ -87,7 +87,15 @@ handle_extension() {
## Preview as text conversion
odt2txt "${FILE_PATH}" && exit 5
## Preview as markdown conversion
pandoc -s -t markdown -- "${FILE_PATH}" && exit 5
ulimit -v 500000 # Limit to 500MB
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't work on macOS, ulimit can only set the maximum file size there.
Either way it would have to be reset afterwards though.
Other commands should be free to use however much memory they want if they can do so quickly and it should be possible for other alternatives to be tried after Pandoc.

@@ -87,7 +87,15 @@ handle_extension() {
## Preview as text conversion
odt2txt "${FILE_PATH}" && exit 5
## Preview as markdown conversion
pandoc -s -t markdown -- "${FILE_PATH}" && exit 5
ulimit -v 500000 # Limit to 500MB
timeout 10s pandoc -s -t markdown -- "${FILE_PATH}" && exit 5
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

timeout is part of Coreutils so it's not available on macOS and I'm not sure it's part of BusyBox.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants
0