Finlay GC Hudson, William AP Smith
University of York
Paper | Project Page | TABE Codebase | TABE-51 Dataset - COMING SOON
We present TABE-51 (Track Anything Behind Everything 51), a video amodal segmentation dataset that provides high-quality ground truth labels for occluded objects, eliminating the need for human assumptions about hidden pixels. TABE-51 leverages a compositing approach using real-world video clips: we record an initial clip of an object without occlusion and then overlay a second clip featuring the occluding object. This compositing yields realistic video sequences with natural motion and highly accurate ground truth masks, allowing models to be evaluated on how well they handle occluded objects in authentic scenarios.
Once this repo is cloned locally:
Create a virtual environment in your preferred way; for these instructions, venv will be used:
python3 -m venv tabe51-gen_venv
source tabe51-gen_venv/bin/activate
pip install -r requirements.txt
pip install git+https://github.com/facebookresearch/sam2.git@2b90b9f
Download the SAM2 checkpoint (available from the official SAM2 repository).
Within src/tabe51_gen/configs/video_config.py, the most important configs to set are:
- aspect_ratio (tuple): Defines the aspect ratio of the frames (default: 16:9).
- video_names (tuple): List of video file names to process for a single output; the in-front and behind frames can come from different videos to composite together.
- data_name (str): Name you want to give the output data.
- sampled_fps (float): The frame rate to downsample the video to (default: 15.0).
- data_root_dir (Path): Root directory for storing processed data. To start, it must contain a "videos" folder holding the video(s) to work with.
- in_front_frame_ranges (tuple[list[int, int]]): Defines the start and stop frames for in-front objects. This can include multiple ranges for different clips or repeated ranges for multiple objects.
- behind_frame_range (tuple[int, int]): Defines the frame range for the background object.
- sam_checkpoint (Path): Path to the SAM2 segmentation model checkpoint.
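For illustration, the settings for a single composited scene might look something like this. These values are hypothetical placeholders (file names, paths, and frame ranges are assumptions), and the exact layout of video_config.py is defined in the repo:

```python
# Hypothetical example values for the config fields described above.
from pathlib import Path

aspect_ratio = (16, 9)                                    # output frame aspect ratio
video_names = ("behind_clip.mp4", "in_front_clip.mp4")    # layers can come from different videos
data_name = "example_scene"                               # name given to the generated output
sampled_fps = 15.0                                        # fps to downsample the source video(s) to
data_root_dir = Path("data")                              # must contain a "videos" folder to start
in_front_frame_ranges = ([120, 300],)                     # one [start, stop] range per in-front object
behind_frame_range = (0, 180)                             # [start, stop] for the background object
sam_checkpoint = Path("checkpoints/sam2_hiera_large.pt")  # path to the downloaded SAM2 checkpoint
```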
Note: This method isn't as automated as would be ideal and is still a bit laborious; work will be done to improve this flow!
We showcase an example video within the examples directory.
Run: python split_videos_into_frames.py
Desc: First stage; splits the video into separate frames, saved to config.data_root_dir / config.downsampled_frame_dir.
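For intuition, this stage amounts to reading the video and keeping frames at the configured sampled_fps. A minimal sketch of the idea, assuming OpenCV (not the repo's exact implementation):

```python
# Minimal sketch of downsampled frame extraction (not the repo's exact code).
import cv2
from pathlib import Path

def split_video(video_path: Path, out_dir: Path, sampled_fps: float = 15.0) -> None:
    out_dir.mkdir(parents=True, exist_ok=True)
    cap = cv2.VideoCapture(str(video_path))
    src_fps = cap.get(cv2.CAP_PROP_FPS)
    step = max(1, round(src_fps / sampled_fps))  # keep every `step`-th frame
    idx = saved = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:
            cv2.imwrite(str(out_dir / f"{saved:05d}.png"), frame)
            saved += 1
        idx += 1
    cap.release()
```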
Run: python line_up_frames.py
Desc: Manually line up the frames; the script opens a GUI and saves the aligned frames to config.data_root_dir / config.seperated_frames_dir.
Instructions:
1. Set Initial Frame Ranges
   - Define the background frame range as behind_frame_range.
   - Set the in-front frame ranges as in_front_frame_ranges (a tuple of tuples of however many in-front frames are wanted).
2. Check Frame Alignment
   - When the visualizer loads, use the left and right arrow keys to navigate through frames.
   - The far-right view shows an alpha overlap, helping visualize alignment (see the sketch after this list).
3. Adjust Frame Ranges
   - Close the program, edit the frame ranges, and repeat until the alignment looks correct.
4. Save the Aligned Frames
   - In the image viewer, press 's' to save the aligned frames.
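The alpha-overlap view referenced above is essentially a 50/50 blend of the two candidate frames, so misalignment shows up as ghosting. A minimal sketch of the concept, assuming OpenCV and same-sized frames (not the repo's actual GUI code; file paths are placeholders):

```python
# Sketch of an alpha-overlap check: blend two frames 50/50 so that any
# misalignment between the clips appears as ghosting. Paths are placeholders.
import cv2

behind = cv2.imread("behind/00000.png")
in_front = cv2.imread("in_front/00000.png")
overlap = cv2.addWeighted(behind, 0.5, in_front, 0.5, 0.0)  # frames must match in size
cv2.imshow("alpha overlap", overlap)
cv2.waitKey(0)
```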
Run: python auto_bbox_cropper.py
Desc: We've found that SAM2 performs significantly better when the object of interest is within a cropped area of the full frame; attempting to segment directly from the full image can cause details to be lost. To improve accuracy, this script crops the frames to bounding boxes and saves the cropped frames to config.data_root_dir / config.bbox_clipped_frames_dir (a minimal sketch of the cropping idea follows the instructions below).
Instructions:
The script will first start with the behind clip object.
1. Click points on the object
   - Left click for positive points.
   - Middle click for negative points (not always needed, but can help).
   - Note: If the object is not yet in the scene, press 'q' to skip that frame.
2. Run Segmentation
   - Once points are selected (~5 points is a good baseline, covering all elements of the object), press 's' to segment the item through the video.
3. Check Output
   - If the bounding box segments are good, proceed to Step 4.
   - If the bounding box segments are poor:
     - Delete the outputs.
     - Run auto_bbox_cropper.py again, but this time:
       - Press 'm' once the image loads.
       - Draw a manual bounding box around the object of interest.
       - Press Enter to create the output.
     - Either continue using this manual bounding box method or return to Step 1.
4. Once the behind clip object is completed, move to the in-front object and repeat the above steps until completion.
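As promised above, here is a minimal sketch of the cropping idea: cut out a padded bounding box around the object so SAM2 sees it at a higher relative resolution. The [x0, y0, x1, y1] box format and padding value are assumptions for illustration, not the repo's exact code:

```python
# Sketch: crop a frame to a padded bounding box so SAM2 sees the object
# at higher relative resolution. Box format [x0, y0, x1, y1] is an assumption.
import numpy as np

def crop_to_bbox(frame: np.ndarray, box: tuple[int, int, int, int],
                 pad: int = 20) -> np.ndarray:
    h, w = frame.shape[:2]
    x0, y0, x1, y1 = box
    x0, y0 = max(0, x0 - pad), max(0, y0 - pad)   # clamp padded box to the image
    x1, y1 = min(w, x1 + pad), min(h, y1 + pad)
    return frame[y0:y1, x0:x1]
```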
Run: python segment_cropped_frames.py
Desc: Used to segment the object from these cropped images.
Instructions:
Start with the behind clip object.
1. Click points on the object
   - Same approach as auto_bbox_cropper.py.
   - If the object is not yet in the scene, press 'q' to skip that frame.
2. Run Segmentation
   - Once points are selected, choose one of the following methods:
     - Press 's' to segment the object through the entire video.
     - Press 'm' to segment only the current frame.
   - When to use 'm' instead of 's':
     - Use 'm' when only part of the object is visible, as propagating incomplete data across the whole video may lead to poor segmentation.
     - Use 's' if most or all of the object is visible to segment the full video.
3. Once the behind clip object is completed, move to the in-front object and repeat the above steps until completion.
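Under the hood, the point-prompt-then-propagate workflow maps onto SAM2's video predictor API. A minimal sketch of that pattern is below; the config name, checkpoint path, frame directory, and clicked points are placeholders, and the pinned SAM2 commit may expose add_new_points rather than add_new_points_or_box:

```python
# Sketch of SAM2's point-prompt + propagate pattern (paths/points are placeholders).
import numpy as np
import torch
from sam2.build_sam import build_sam2_video_predictor

predictor = build_sam2_video_predictor("sam2_hiera_l.yaml", "checkpoints/sam2_hiera_large.pt")
state = predictor.init_state(video_path="path/to/cropped_frames")  # directory of JPEG frames

points = np.array([[210, 350], [250, 220]], dtype=np.float32)  # clicked (x, y) pixels
labels = np.array([1, 1], dtype=np.int32)                      # 1 = positive, 0 = negative click

with torch.inference_mode():
    # Older pinned commits may name this add_new_points instead.
    predictor.add_new_points_or_box(state, frame_idx=0, obj_id=1,
                                    points=points, labels=labels)
    # Pressing 's' in the tool corresponds to propagating through the whole clip:
    for frame_idx, obj_ids, mask_logits in predictor.propagate_in_video(state):
        masks = (mask_logits > 0.0).cpu().numpy()  # binary mask per tracked object
```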
Run: python composite_runner.py
Desc: Composites the segmented objects together into our dataset format.
Note: This will run on all top-level names within the config.data_root_dir / config.segmented_frames_dir directory.
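Conceptually, the compositing step overlays the segmented in-front object onto the behind clip's frames, while the behind object's full, unoccluded mask serves as the amodal ground truth. A minimal sketch of that overlay (not the repo's dataset-format code):

```python
# Sketch of the core compositing idea: paste the in-front object over the behind
# frame using its mask; the behind object's full mask is the amodal ground truth.
import numpy as np

def composite(behind_frame: np.ndarray, in_front_frame: np.ndarray,
              in_front_mask: np.ndarray) -> np.ndarray:
    m = (in_front_mask > 0)[..., None]            # HxWx1 boolean mask
    return np.where(m, in_front_frame, behind_frame)
```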
If you utilise our code and/or dataset, please consider citing our paper:
@article{hudson2024track,
  title={Track Anything Behind Everything: Zero-Shot Amodal Video Object Segmentation},
  author={Hudson, Finlay GC and Smith, William AP},
  journal={arXiv preprint arXiv:2411.19210},
  year={2024}
}
We welcome any contributions or collaborations on this work. If you find any issues, we will try to help as best we can in the Issues section :)