CVAT Error working with very large resolution files (4096 x 3008). OpenH264 limitation. · Issue #7425 · cvat-ai/cvat · GitHub
CVAT Error working with very large resolution files (4096 x 3008). OpenH264 limitation. #7425
Closed
@finickyDrone

Description

I'm working with a very high-resolution file (4096x3008) that exceeds the standard 4K resolution of 4096x2304. I was able to upload the file to CVAT and annotate it as expected. However, when I try to export the annotations, the cvat_worker_export container hits a libopenh264 error. This is what I see when I run docker logs -f cvat_worker_export while everything is up and running:

[2024-02-01 18:51:02,040] ERROR libav.libopenh264: [OpenH264] this = 0x0x55ae68649c50, Error:ParamValidationExt(), width > 0, height > 0, width * height <= 9437184, invalid 4096 x 3008 in dependency layer settings!
[2024-02-01 18:51:02,040] ERROR libav.libopenh264: [OpenH264] this = 0x0x55ae68649c50, Error:WelsInitEncoderExt(), ParamValidationExt failed return 2.
[2024-02-01 18:51:02,040] ERROR libav.libopenh264: [OpenH264] this = 0x0x55ae68649c50, Error:CWelsH264SVCEncoder::Initialize(), WelsInitEncoderExt failed.
[2024-02-01 18:51:02,040] ERROR libav.libopenh264: Initialize failed
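
The first log line carries both the limit and the rejected dimensions. A minimal sketch for pulling them out of the worker log, assuming the message is worded exactly as quoted above (the regular expression and the function name are mine, not part of CVAT or OpenH264):

```python
import re

# Pattern written against the log line quoted above; other OpenH264
# versions may word the message differently.
_PARAM_ERROR = re.compile(
    r"width \* height <= (?P<limit>\d+), invalid (?P<width>\d+) x (?P<height>\d+)"
)


def parse_param_validation_error(line):
    """Return (limit, width, height) from a ParamValidationExt line, or None."""
    match = _PARAM_ERROR.search(line)
    if match is None:
        return None
    return int(match["limit"]), int(match["width"]), int(match["height"])


line = (
    "[2024-02-01 18:51:02,040] ERROR libav.libopenh264: [OpenH264] "
    "this = 0x0x55ae68649c50, Error:ParamValidationExt(), width > 0, "
    "height > 0, width * height <= 9437184, invalid 4096 x 3008 "
    "in dependency layer settings!"
)
limit, width, height = parse_param_validation_error(line)
print(width * height, ">", limit)  # 12320768 > 9437184
```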

2024-02-01 18:51:02,770 DEBG 'rqworker-export-0' stderr output:
[2024-02-01 18:51:02,767] ERROR cvat.apps.dataset_manager.views: [Task.id=213] [cvat.apps.dataset_manager.views @ export]: exception occurred
Traceback (most recent call last):
  File "/opt/venv/lib/python3.10/site-packages/datumaro/plugins/voc_format/converter.py", line 222, in save_subsets
    self._save_image(item, osp.join(self._images_dir, image_filename))
  File "/opt/venv/lib/python3.10/site-packages/datumaro/components/converter.py", line 250, in _save_image
    item.media.save(path)
  File "/opt/venv/lib/python3.10/site-packages/datumaro/components/media.py", line 163, in save
    save_image(path, self.data)
  File "/opt/venv/lib/python3.10/site-packages/datumaro/components/media.py", line 104, in data
    data = self._data()
  File "/opt/venv/lib/python3.10/site-packages/datumaro/util/image.py", line 283, in __call__
    image = self._loader(self._path)
  File "/home/django/cvat/apps/dataset_manager/bindings.py", line 1417, in <lambda>
    loader = lambda _: frame_provider.get_frame(i,
  File "/home/django/cvat/apps/engine/frame_provider.py", line 195, in get_frame
    chunk_reader = loader.load(chunk_number)
  File "/home/django/cvat/apps/engine/frame_provider.py", line 85, in load
    self.reader_class([self.get_chunk_path(chunk_id, self.quality, self.db_data)[0]]))
  File "/home/django/cvat/apps/engine/cache.py", line 60, in get_task_chunk_data_with_mime
    item = self._get_or_set_cache_item(
  File "/home/django/cvat/apps/engine/cache.py", line 52, in _get_or_set_cache_item
    item = create_function()
  File "/home/django/cvat/apps/engine/cache.py", line 62, in <lambda>
    create_function=lambda: self._prepare_task_chunk(db_data, quality, chunk_number),
  File "/home/django/cvat/apps/engine/cache.py", line 203, in _prepare_task_chunk
    writer.save_as_chunk(images, buff)
  File "/home/django/cvat/apps/engine/media_extractors.py", line 767, in save_as_chunk
    self._encode_images(images, output_container, output_v_stream)
  File "/home/django/cvat/apps/engine/media_extractors.py", line 777, in _encode_images
    for packet in stream.encode(frame):
  File "av/stream.pyx", line 164, in av.stream.Stream.encode
  File "av/codec/context.pyx", line 480, in av.codec.context.CodecContext.encode
  File "av/codec/context.pyx", line 289, in av.codec.context.CodecContext.open
  File "av/error.pyx", line 336, in av.error.err_check
av.error.UnknownError: [Errno 1313558101] Unknown error occurred; last error log: [libopenh264] Initialize failed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/django/cvat/apps/dataset_manager/views.py", line 85, in export
    export_fn(db_instance.id, temp_file, dst_format,
  File "/home/django/cvat/apps/dataset_manager/task.py", line 904, in export_task
    task.export(f, exporter, host=server_url, save_images=save_images)
  File "/home/django/cvat/apps/dataset_manager/task.py", line 785, in export
    exporter(dst_file, temp_dir, task_data, **options)
  File "/home/django/cvat/apps/dataset_manager/formats/registry.py", line 36, in __call__
    f_or_cls(*args, **kwargs)
  File "/home/django/cvat/apps/dataset_manager/formats/pascal_voc.py", line 26, in _export
    dataset.export(temp_dir, 'voc', save_images=save_images,
  File "/opt/venv/lib/python3.10/site-packages/datumaro/util/scope.py", line 158, in wrapped_func
    ret_val = func(*args, **kwargs)
  File "/opt/venv/lib/python3.10/site-packages/datumaro/components/dataset.py", line 1111, in export
    raise e.__cause__
datumaro.components.errors.ItemExportError: Failed to export item ('frame_000000', 'default')

Digging deeper into the issue, it appears I've hit a limit in the OpenH264 encoder, which only accepts frames of up to 9,437,184 pixels (4096x2304); a 4096x3008 frame has 12,320,768. This is the actual check that fails:
https://github.com/cisco/openh264/blob/c59550a2147c255cc8e09451f6deb96de2526b6d/codec/encoder/core/src/encoder_ext.cpp#L515C81-L515C98

and this is the hard limit that OpenH264 currently has:
https://github.com/cisco/openh264/blob/c59550a2147c255cc8e09451f6deb96de2526b6d/codec/common/inc/utils.h#L46

This seems to make sense, as OpenH264 supports the H.264 spec up to level 5.2, which, according to this reference, means that only 4K is supported:
https://en.wikipedia.org/wiki/Advanced_Video_Coding#Levels
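
To make the connection between the level and the pixel limit concrete, here is a rough worked calculation. It assumes the maximum frame size for level 5.2 is 36,864 macroblocks, as I read it from the levels table in that reference; at 16x16 samples per macroblock that is 36,864 x 256 = 9,437,184 pixels, the same number as in the error message. The names are only for illustration:

```python
import math

# Assumption: 36,864 macroblocks is the maximum frame size listed for
# H.264 level 5.2 in the levels table referenced above.
MAX_FRAME_SIZE_MBS_LEVEL_5_2 = 36_864
MACROBLOCK_SIDE = 16  # an H.264 macroblock covers 16 x 16 luma samples


def macroblocks_per_frame(width, height):
    """Number of 16x16 macroblocks needed to cover a frame."""
    return math.ceil(width / MACROBLOCK_SIDE) * math.ceil(height / MACROBLOCK_SIDE)


for w, h in [(4096, 2304), (4096, 3008)]:
    mbs = macroblocks_per_frame(w, h)
    fits = mbs <= MAX_FRAME_SIZE_MBS_LEVEL_5_2
    print(f"{w}x{h}: {mbs} macroblocks, fits level 5.2: {fits}")
```

With these inputs, 4096x2304 comes to exactly 36,864 macroblocks, while 4096x3008 needs 48,128.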

As such, this is probably not a CVAT error but an OpenH264 encoder limitation. I wanted to reach out to the community for thoughts. It appears that NVIDIA's NVENC H.264 encoder supports up to 8K; would it be feasible to use that instead of OpenH264? Might it be a good idea to set a limit in CVAT so that, in standard OpenH264 mode, only videos at 4K or below can be uploaded and people don't run into this same problem? Am I missing something obvious that would make OpenH264 encode video above 4K as expected?
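
For the upload-limit idea, something along these lines is what I have in mind. This is only a sketch: the function and exception names are hypothetical and not part of CVAT, and the limit is simply copied from the libopenh264 error message in the export log.

```python
# Limit copied from the libopenh264 error message in the export log.
OPENH264_MAX_PIXELS = 9_437_184


class ResolutionTooLargeError(ValueError):
    """Hypothetical error type for this sketch; not part of CVAT."""


def check_resolution_for_openh264(width, height, max_pixels=OPENH264_MAX_PIXELS):
    """Raise if a frame of this size would be rejected by the pixel limit."""
    if width <= 0 or height <= 0:
        raise ValueError(f"invalid frame size {width}x{height}")
    pixels = width * height
    if pixels > max_pixels:
        raise ResolutionTooLargeError(
            f"{width}x{height} has {pixels} pixels, "
            f"above the OpenH264 limit of {max_pixels}"
        )


try:
    check_resolution_for_openh264(4096, 3008)
except ResolutionTooLargeError as exc:
    print(f"upload rejected: {exc}")
```

Failing at upload time with a message like this would at least surface the problem before any annotation work is done, rather than at export.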

Thanks!

These are my specs:
OS: Ubuntu 22.04 LTS, Linux kernel 5.15
CVAT tag release: 2.8
Docker version: 24.0.7

Labels: bug (Something isn't working)