Blockinfile - Add new module option - 'encoding' #85291
Open
+130
−22
SUMMARY
This PR for the `blockinfile` module introduces a new module option, `encoding`, to add compatibility for target files that are not encoded in UTF-8. It mirrors the PR for lineinfile (#84999).

Currently, the `blockinfile` module does a binary read of the target file and holds the contents as bytes in a buffer. This buffer is assumed to contain UTF-8 encoded bytes, and both regex matching and write operations are performed on it. If the target file is not UTF-8 encoded, regex matching does not work correctly, because a UTF-8 encoded pattern is compared against non-UTF-8 bytes. Write operations add UTF-8 bytes to the buffer, so when the buffer of a non-UTF-8 file is written back, the resulting file contains characters from multiple encodings.

The proposed change introduces a new module option, `encoding`. When it is specified, the file contents are read into a buffer of Unicode characters instead of bytes, so that regex matching is done in Unicode and write operations add Unicode characters to the buffer instead of UTF-8 bytes. Since Python 3 strings are sequences of Unicode code points, all of these operations are ordinary Python string operations. File reads and writes are done in text mode so that the optional `encoding` parameter of `open()` can be specified when opening the file (https://docs.python.org/3/library/functions.html#open).

ISSUE TYPE
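For readers who want to see the behaviour described in the SUMMARY, the following is a minimal sketch and not the module's actual code. The function names (`bytes_mode_edit`, `text_mode_edit`, `make_sample`) and the choice of `latin-1` as the non-UTF-8 encoding are assumptions made purely for illustration. It contrasts matching and appending on raw bytes with doing the same on decoded text.

```python
import os
import re
import tempfile


def make_sample(encoding):
    """Create a temp file holding one non-ASCII line in the given encoding."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "w", encoding=encoding, newline="") as f:
        f.write("caf\u00e9 = 1\n")  # "café = 1"
    return path


def bytes_mode_edit(path, pattern, new_line):
    """Hypothetical stand-in for the bytes-based approach.

    The pattern and the inserted line are both encoded as UTF-8,
    whatever the encoding of the file on disk actually is.
    """
    with open(path, "rb") as f:
        buf = f.read()
    matched = re.search(pattern.encode("utf-8"), buf) is not None
    buf += new_line.encode("utf-8")  # UTF-8 bytes appended to foreign bytes
    with open(path, "wb") as f:
        f.write(buf)
    return matched


def text_mode_edit(path, pattern, new_line, encoding):
    """Hypothetical stand-in for the text-mode approach with an encoding."""
    with open(path, "r", encoding=encoding, newline="") as f:
        text = f.read()  # decoded to a Python str
    matched = re.search(pattern, text) is not None
    text += new_line  # plain string operation
    with open(path, "w", encoding=encoding, newline="") as f:
        f.write(text)  # re-encoded consistently on write
    return matched


if __name__ == "__main__":
    pattern = "caf\u00e9"          # "café"
    new_line = "na\u00efve = 2\n"  # "naïve = 2"

    p1 = make_sample("latin-1")
    print("bytes mode matched:", bytes_mode_edit(p1, pattern, new_line))
    with open(p1, "rb") as f:
        print("bytes mode file:", f.read())

    p2 = make_sample("latin-1")
    print("text mode matched:", text_mode_edit(p2, pattern, new_line, "latin-1"))
    with open(p2, "rb") as f:
        print("text mode file:", f.read())

    os.remove(p1)
    os.remove(p2)
```

In the bytes-based variant the pattern fails to match and the file ends up with a latin-1 line followed by a UTF-8 line. In the text-mode variant the pattern matches and both lines are stored in the same encoding.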