Description
I'm working with a very high-resolution file whose frame size (4096x3008, per the log below) exceeds standard 4K resolution (4096x2304). I was able to upload the file to CVAT and annotate it normally. However, when I try to export the annotations, the cvat_worker_export container hits a libopenh264 error. This is what I see when I run docker logs -f cvat_worker_export while everything is up and running:
[2024-02-01 18:51:02,040] ERROR libav.libopenh264: [OpenH264] this = 0x0x55ae68649c50, Error:ParamValidationExt(), width > 0, height > 0, width * height <= 9437184, invalid 4096 x 3008 in dependency layer settings!
[2024-02-01 18:51:02,040] ERROR libav.libopenh264: [OpenH264] this = 0x0x55ae68649c50, Error:WelsInitEncoderExt(), ParamValidationExt failed return 2.
[2024-02-01 18:51:02,040] ERROR libav.libopenh264: [OpenH264] this = 0x0x55ae68649c50, Error:CWelsH264SVCEncoder::Initialize(), WelsInitEncoderExt failed.
[2024-02-01 18:51:02,040] ERROR libav.libopenh264: Initialize failed
2024-02-01 18:51:02,770 DEBG 'rqworker-export-0' stderr output:
[2024-02-01 18:51:02,767] ERROR cvat.apps.dataset_manager.views: [Task.id=213] [cvat.apps.dataset_manager.views @ export]: exception occurred
Traceback (most recent call last):
File "/opt/venv/lib/python3.10/site-packages/datumaro/plugins/voc_format/converter.py", line 222, in save_subsets
self._save_image(item, osp.join(self._images_dir, image_filename))
File "/opt/venv/lib/python3.10/site-packages/datumaro/components/converter.py", line 250, in _save_image
item.media.save(path)
File "/opt/venv/lib/python3.10/site-packages/datumaro/components/media.py", line 163, in save
save_image(path, self.data)
File "/opt/venv/lib/python3.10/site-packages/datumaro/components/media.py", line 104, in data
data = self._data()
File "/opt/venv/lib/python3.10/site-packages/datumaro/util/image.py", line 283, in __call__
image = self._loader(self._path)
File "/home/django/cvat/apps/dataset_manager/bindings.py", line 1417, in <lambda>
loader = lambda _: frame_provider.get_frame(i,
File "/home/django/cvat/apps/engine/frame_provider.py", line 195, in get_frame
chunk_reader = loader.load(chunk_number)
File "/home/django/cvat/apps/engine/frame_provider.py", line 85, in load
self.reader_class([self.get_chunk_path(chunk_id, self.quality, self.db_data)[0]]))
File "/home/django/cvat/apps/engine/cache.py", line 60, in get_task_chunk_data_with_mime
item = self._get_or_set_cache_item(
File "/home/django/cvat/apps/engine/cache.py", line 52, in _get_or_set_cache_item
item = create_function()
File "/home/django/cvat/apps/engine/cache.py", line 62, in <lambda>
create_function=lambda: self._prepare_task_chunk(db_data, quality, chunk_number),
File "/home/django/cvat/apps/engine/cache.py", line 203, in _prepare_task_chunk
writer.save_as_chunk(images, buff)
File "/home/django/cvat/apps/engine/media_extractors.py", line 767, in save_as_chunk
self._encode_images(images, output_container, output_v_stream)
File "/home/django/cvat/apps/engine/media_extractors.py", line 777, in _encode_images
for packet in stream.encode(frame):
File "av/stream.pyx", line 164, in av.stream.Stream.encode
File "av/codec/context.pyx", line 480, in av.codec.context.CodecContext.encode
File "av/codec/context.pyx", line 289, in av.codec.context.CodecContext.open
File "av/error.pyx", line 336, in av.error.err_check
av.error.UnknownError: [Errno 1313558101] Unknown error occurred; last error log: [libopenh264] Initialize failed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/django/cvat/apps/dataset_manager/views.py", line 85, in export
export_fn(db_instance.id, temp_file, dst_format,
File "/home/django/cvat/apps/dataset_manager/task.py", line 904, in export_task
task.export(f, exporter, host=server_url, save_images=save_images)
File "/home/django/cvat/apps/dataset_manager/task.py", line 785, in export
exporter(dst_file, temp_dir, task_data, **options)
File "/home/django/cvat/apps/dataset_manager/formats/registry.py", line 36, in __call__
f_or_cls(*args, **kwargs)
File "/home/django/cvat/apps/dataset_manager/formats/pascal_voc.py", line 26, in _export
dataset.export(temp_dir, 'voc', save_images=save_images,
File "/opt/venv/lib/python3.10/site-packages/datumaro/util/scope.py", line 158, in wrapped_func
ret_val = func(*args, **kwargs)
File "/opt/venv/lib/python3.10/site-packages/datumaro/components/dataset.py", line 1111, in export
raise e.__cause__
datumaro.components.errors.ItemExportError: Failed to export item ('frame_000000', 'default')
Digging deeper into the issue, it appears I've hit a limit in the OpenH264 encoder, which only supports frames up to 4K. This is the actual check that fails:
https://github.com/cisco/openh264/blob/c59550a2147c255cc8e09451f6deb96de2526b6d/codec/encoder/core/src/encoder_ext.cpp#L515C81-L515C98
and this is the hard limit that OpenH264 currently has:
https://github.com/cisco/openh264/blob/c59550a2147c255cc8e09451f6deb96de2526b6d/codec/common/inc/utils.h#L46
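To make the failing check concrete, here is a minimal Python sketch of the condition as it is quoted in the log message (width > 0, height > 0, width * height <= 9437184). The function and constant names are mine for illustration only; they are not OpenH264 or CVAT identifiers, and the real check lives in the C++ code linked above.

```python
# Pixel budget copied from the log line "width * height <= 9437184".
# The constant name is hypothetical, not an OpenH264 identifier.
OPENH264_MAX_PIXELS = 9_437_184


def fits_openh264_limit(width: int, height: int) -> bool:
    """Return True if a frame satisfies the size condition quoted in the log."""
    return width > 0 and height > 0 and width * height <= OPENH264_MAX_PIXELS


if __name__ == "__main__":
    # Standard 4K versus the frame size reported in the error message.
    for w, h in [(4096, 2304), (4096, 3008)]:
        print(f"{w}x{h}: {w * h} pixels, fits={fits_openh264_limit(w, h)}")
```

A 4096x2304 frame lands exactly on the budget, while my 4096x3008 frames come to 12,320,768 pixels and are rejected.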
This seems to make sense, as OpenH264 supports the H.264 spec up to level 5.2, which, according to this reference, tops out at 4K:
https://en.wikipedia.org/wiki/Advanced_Video_Coding#Levels
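As a sanity check on how the level relates to the pixel limit, here is a rough sketch of the arithmetic. It assumes the level 5.2 maximum frame size of 36,864 macroblocks from the levels table, with 16x16-pixel macroblocks; the helper name is hypothetical and nothing here is taken from the OpenH264 sources.

```python
import math

MB_SIZE = 16               # an H.264 macroblock covers 16x16 luma samples
LEVEL_5_2_MAX_FS = 36_864  # max frame size in macroblocks at level 5.2 (levels table)


def frame_size_in_macroblocks(width: int, height: int) -> int:
    """Number of macroblocks needed to cover a frame, rounding each axis up."""
    return math.ceil(width / MB_SIZE) * math.ceil(height / MB_SIZE)


if __name__ == "__main__":
    # 36,864 macroblocks * 256 pixels each matches the 9437184 in the log.
    print("pixel budget:", LEVEL_5_2_MAX_FS * MB_SIZE * MB_SIZE)
    for w, h in [(4096, 2304), (4096, 3008)]:
        mbs = frame_size_in_macroblocks(w, h)
        print(f"{w}x{h}: {mbs} macroblocks, within level 5.2: {mbs <= LEVEL_5_2_MAX_FS}")
```

Under these assumptions, 4096x2304 uses the full 36,864 macroblocks, and 4096x3008 needs 48,128, which is consistent with the encoder refusing to initialize.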
As such, this is probably not a CVAT error but an OpenH264 encoder limitation, so I wanted to reach out to the community for thoughts. It appears that Nvidia's NVENC H.264 encoder supports up to 8K; would it be feasible to use that instead of OpenH264? Might it be a good idea to set a limit in CVAT so that, in standard OpenH264 mode, only videos at 4K or below can be uploaded and people don't run into this same problem? Or am I missing something obvious that would make OpenH264 encode video above 4K as expected?
Thanks!
These are my specs:
OS: Ubuntu 22.04 LTS, Linux kernel 5.15
CVAT tag release: 2.8
Docker version: 24.0.7