Generates high-quality 3D videos in multiple formats using AI-powered depth mapping, Pulfrich effect, and customizable visual enhancements for an immersive experience.

MyForking/VisionDepth3D


VisionDepth3D

VisionDepth3D: The All-in-One 3D Suite for Creators

From depth maps to final render — everything you need to turn flat 2D footage into cinematic 3D. Powered by AI. Optimized for speed. Designed for creators.



GUI Layout

(screenshot: GUITabsSBS)



Key Features – VisionDepth3D All-in-One 3D Suite

AI-Powered Depth Estimation (GPU Accelerated)

  • Seamless integration with 20+ transformer-based depth models: ZoeDepth, Depth Anything, MiDaS, DPT, DepthPro, DinoV2, and more
  • One-click model selection with automatic downloads, no CLI setup or config files
  • PyTorch GPU acceleration (no OpenCV recompile needed)
  • Batch support for both video sequences and image folders
  • Temporal smoothing, intelligent scene-adaptive normalization
  • Built-in color inversion, customizable colormaps (Viridis, Inferno, Magma)
  • Real-time frame-by-frame progress bar, FPS display, and ETA tracking
  • Auto-resizing, smart batching, and graceful handling of large resolutions
  • Pause/resume/cancel supported during all GPU operations
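As a rough illustration of the scene-adaptive normalization with temporal smoothing mentioned above, here is a minimal pure-Python sketch. The function name and the EMA blend are assumptions for illustration, not the exact VisionDepth3D code:

```python
def normalize_depth(frames, alpha=0.2):
    """Normalize each frame toward [0, 1] using a temporally smoothed
    min/max range, which reduces flicker across scene changes.
    `frames` is a list of frames; each frame is a flat list of raw depths."""
    smoothed_lo = smoothed_hi = None
    out = []
    for frame in frames:
        lo, hi = min(frame), max(frame)
        if smoothed_lo is None:
            smoothed_lo, smoothed_hi = lo, hi
        else:
            # blend the new frame's range into the running estimate
            smoothed_lo = (1 - alpha) * smoothed_lo + alpha * lo
            smoothed_hi = (1 - alpha) * smoothed_hi + alpha * hi
        span = max(smoothed_hi - smoothed_lo, 1e-6)
        out.append([(v - smoothed_lo) / span for v in frame])
    return out
```

With a steady scene the smoothed range converges to the per-frame range, so the output is a plain min-max normalization; on abrupt range changes the EMA eases the transition instead of snapping.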

Advanced 3D Rendering Engine (Real-Time Stereo Composer)

  • Pixel-accurate depth parallax shifting using CUDA + PyTorch
  • Full control over foreground (pop) / midground (balance) / background (pull) parallax
  • Half-SBS, Full-SBS, VR, Passive Interlaced, Anaglyph, and Dynamic Floating Window formats
  • Dynamic floating window with cinema-style masking that slides and eases smoothly
  • Built-in Pulfrich effect renderer (motion delay-based left-eye blending)
  • Feathered shift masking, sharpening, and edge-aware smoothing
  • Subject tracking-based convergence for natural stereo alignment
  • GPU-accelerated real-time processing with live GUI stats (FPS, elapsed time, %)
  • Output is compatible with Quest VR, YouTube 3D, and most stereo players
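The foreground/midground/background parallax controls boil down to mapping each pixel's depth to a horizontal shift. A minimal sketch — the quadratic weighting here is an assumption for illustration, not the renderer's exact formula:

```python
def pixel_shift(depth, convergence=6.5, midground=1.5, divergence=-12.0):
    """Map a normalized depth value (0 = far, 1 = near) to a horizontal
    pixel shift: positive shifts pop out of the screen, negative shifts
    push content behind it. Defaults mirror the GUI defaults."""
    fg = depth ** 2          # weight for near content
    bg = (1.0 - depth) ** 2  # weight for far content
    mg = 1.0 - fg - bg       # remainder goes to the midground
    return fg * convergence + mg * midground + bg * divergence
```

At `depth=1.0` the shift is the full convergence value (maximum pop); at `depth=0.0` it is the full divergence value (deepest background).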

RIFE Frame Interpolation (ONNX Runtime)

  • Integrated RIFE ONNX model (no PyTorch required) for real-time frame doubling
  • Interpolation modes: 2x, 4x, 8x FPS with smooth motion blending
  • Folder-based processing of raw frames + automatic video reassembly
  • Preserves frame resolution, count, audio sync, and aspect ratio
  • Supports preview and export at high quality using FFmpeg codecs
  • Real-time progress tracking + FPS + ETA built into GUI
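Frame doubling works by synthesizing frames between each adjacent pair and reassembling the sequence. A sketch of that reassembly logic, with `midpoint(a, b, t)` standing in for a RIFE inference call (a hypothetical placeholder, not the actual ONNX API):

```python
def interpolate_sequence(frames, factor, midpoint):
    """Insert (factor - 1) synthesized frames between each adjacent pair,
    giving (n - 1) * factor + 1 output frames for n inputs."""
    out = []
    for a, b in zip(frames, frames[1:]):
        out.append(a)
        for k in range(1, factor):
            # t is the normalized time between the two source frames
            out.append(midpoint(a, b, k / factor))
    out.append(frames[-1])
    return out
```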

4x Super Resolution Upscaling (Real-ESRGAN Powered – ONNX GPU)

  • Integrated Real-ESRGAN (x4) super-resolution model, exported to ONNX with full GPU support
  • Batch upscaling with intelligent VRAM-aware batching (1–8 frames)
  • Supports 720p → 1080p, 1080p → 4K, or any custom resolution
  • Automatically resizes final frames to match output format and target resolution
  • Lightning-fast CUDA-accelerated ONNX runtime (no PyTorch required)
  • Full integration with frame renderer: upscales after 3D rendering or interpolation
  • Clean, artifact-free outputs using enhanced fp16 inference for visual clarity
  • Progress bar, FPS counter, ETA timer integrated into the GUI
  • Fully exportable to video with codec support: MP4V, XVID, MJPG, FFmpeg NVENC
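VRAM-aware batching amounts to clamping the per-batch frame count to whatever fits in free memory. A hypothetical sketch (the function and the per-frame cost model are assumptions):

```python
def pick_batch_size(free_vram_mb, frame_cost_mb, lo=1, hi=8):
    """Choose how many frames to upscale per batch: as many as fit in
    free VRAM, clamped to the supported 1-8 range."""
    fit = free_vram_mb // frame_cost_mb
    return max(lo, min(hi, fit))
```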

Smart Depth-Aware Effects

  • Gradient-aware artifact suppression near depth edges (limbs, hair)
  • Feathered transition masks to avoid ghosting and popping
  • Depth-aware sharpening and blending for polished 3D output
  • Dynamic bar generation for floating window masking that eases smoothly like theatrical films
  • Real-time zero parallax estimation and smoothing per-frame

Audio + Video Re-Integration

  • Audio button to directly rip and attach audio from source video using FFmpeg
  • Format choices: AAC, MP3, WAV with adjustable bitrate
  • Built-in tools, no shell commands needed – fully GUI-based
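Under the hood, ripping audio comes down to invoking FFmpeg with the chosen codec and bitrate. A sketch of the command such a button might build — the flags are standard FFmpeg options, but the exact invocation VisionDepth3D uses is an assumption:

```python
def audio_rip_cmd(src, dst, codec="aac", bitrate="192k"):
    """Build an FFmpeg command that extracts the audio track from `src`.
    -vn drops the video stream; codec and bitrate mirror the GUI choices."""
    return ["ffmpeg", "-y", "-i", src, "-vn",
            "-acodec", codec, "-b:a", bitrate, dst]
```

The resulting list can be passed straight to `subprocess.run`.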

Preview System with Format Testing

  • Choose preview format: Passive Interlaced, HSBS, Shift Heatmap
  • Live preview on frame for quick tuning
  • Auto-exports as image preview file, no temp videos needed
  • Toggle convergence depth and parallax before full render

Real-Time 3D Player (VDPlayer)

  • Lightweight player built for Half-SBS, Full-SBS, and VR output
  • Fast seeking, play/pause/fullscreen toggles
  • Timestamp scrubber + resolution-aware display
  • Designed to instantly preview 3D results without leaving the app

Smart GUI & Workflow Features

  • Multi-tab Tkinter interface, responsive and persistent settings
  • Pause, resume, and cancel buttons for all rendering threads
  • Codec selector with GPU NVENC options (H.264, HEVC, AV1-ready)
  • One-click launcher, no pip/CLI scripting needed
  • Slider recall and auto-cropping for black bars

Supported Output Formats & Aspect Ratios

  • Formats: Half-SBS, Full-SBS, VR Mode, Red-Cyan Anaglyph, Passive Interlaced
  • Ratios: 16:9, CinemaScope (2.39:1), 2.76:1, 4:3, 21:9, Square 1:1, Classic 2.35:1
  • Supports export in MP4, MKV, AVI with codecs: XVID, MP4V, MJPG, DIVX, and FFmpeg NVENC
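For reference, the stereo formats imply different output dimensions. A sketch of the mapping — the handling of Half-SBS and Anaglyph here is the conventional one, assumed rather than taken from the renderer:

```python
def output_size(width, height, fmt):
    """Output dimensions per stereo format. Full-SBS doubles the width
    (one full-resolution image per eye); Half-SBS keeps the source width
    by squeezing each eye into half of it; Anaglyph merges both eyes
    into a single color-filtered image."""
    if fmt == "Full-SBS":
        return (width * 2, height)
    if fmt in ("Half-SBS", "Anaglyph"):
        return (width, height)
    raise ValueError(f"unknown format: {fmt}")
```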

Guide Sheet: Install

📌 System Requirements

  • βœ”οΈ This program runs on python 3.12
  • βœ”οΈ This program has been tested on cuda 12.8
  • βœ”οΈ Conda (Optional, Recommended for Simplicity)
  • ❌ Linux/macOS is not officially supported until a more stable solution is found

📌 Step 1: Download the VisionDepth3D Program

  • 1️⃣ Download the VisionDepth3D zip file from the official download source (green button).
  • 2️⃣ Extract the zip file to your desired folder (e.g., C:\user\VisionDepth3D).
  • 3️⃣ Download the models Here and extract the weights folder into the VisionDepth3D main folder.

📌 Step 2: Create an Environment and Install Required Dependencies

🟢 Option 1: Install via pip (Standard CMD Method)

  • 1. Press Win + R, type cmd, and hit Enter.
  • 2. Clone the repository (skip the git clone if you downloaded the ZIP and start from cd):
    git clone https://github.com/VisionDepth/VisionDepth3D.git
    cd C:\VisionDepth3D-main
    pip install -r requirements.txt
  • 3. Continue to installing PyTorch with CUDA (Step 3 below), then run VisionDepth3D.bat.

🔵 Option 2: Install via Conda (Recommended)

(Automatically manages dependencies & isolates the environment.)

  • 1. Clone the repository (skip the git clone if you downloaded the ZIP and start from cd).
  • 2. Create the Conda environment: copy and paste the following into conda to run:
    git clone https://github.com/VisionDepth/VisionDepth3D.git
    cd VisionDepth3D-main
    conda create -n VD3D python=3.12
    conda activate VD3D
    pip install -r requirements.txt

📌 Step 3: Check if CUDA Is Installed

🔍 Find your CUDA version: before installing PyTorch, check which CUDA version your GPU supports:

  • 1️⃣ Open Command Prompt (Win + R, type cmd, hit Enter).
  • 2️⃣ Run one of the following commands:
    nvcc --version
    nvidia-smi
  • 3️⃣ Look for the CUDA version (e.g., CUDA 11.8, 12.1, etc.)
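If you prefer to script this check, the release number can be parsed out of the `nvcc --version` output. The helper below is hypothetical (not part of VisionDepth3D), and the sample string is a typical line from nvcc:

```python
import re

def parse_cuda_version(nvcc_output):
    """Extract the CUDA release number (e.g. '12.8') from the text
    printed by `nvcc --version`; returns None if no version is found."""
    m = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return m.group(1) if m else None

# a typical line from `nvcc --version` output
sample = "Cuda compilation tools, release 12.8, V12.8.89"
```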

📌 Install PyTorch with the Correct CUDA Version

Go to the official PyTorch website to find the best install command for your setup: 🔗 https://pytorch.org/get-started/locally/

If you are running CUDA 12.8, install the PyTorch (nightly) build for CUDA 12.8; if that doesn't work, use the CUDA 12.6 version.

  • Once PyTorch and all dependencies are installed, run the following command:
    VisionDepth3D.bat

Congrats, you have successfully installed VisionDepth3D! This guide walked you through cloning the repo, creating and activating the environment, and running the app — all in a few simple steps.


Guide Sheet: GUI Inputs

Use the GUI to fine-tune your 3D conversion settings.

1. Codec

  • Description: Sets the output video encoder.
  • Default: mp4v (CPU)
  • Options:
    • mp4v, XVID, DIVX – CPU-based
    • libx264, libx265 – High-quality software (CPU)
    • h264_nvenc, hevc_nvenc – GPU-accelerated (NVIDIA)

2. Convergence Shift (Foreground / Popping out)

  • Description: Pops foreground objects out of the screen.
  • Default: 6.5
  • Range: 3.0 to 8.0
  • Effect: Stronger values create a noticeable 3D "pop" in close objects.

3. Depth Transition (Midground)

  • Description: Depth for mid-layer transition between foreground and background.
  • Default: 1.5
  • Range: -3.0 to 5.0
  • Effect: Smooths the 3D transition — higher values exaggerate depth between layers.

4. Divergence Shift (Screen Plane / Background)

  • Description: Shift depth for background layers (far away).
  • Default: -12.0
  • Range: -10.0 to 0.0
  • Effect: More negative pushes content into the screen (deeper background).

5. Sharpness Factor

  • Description: Applies a sharpening filter to the output.
  • Default: 0.2
  • Range: -1.0 (softer) to 1.0 (sharper)
  • Effect: Brings clarity to 3D edges; avoid over-sharpening to reduce halos.

6. Blend Factor (Pulfrich)

  • Description: Blends delayed and current frames for Pulfrich-style motion depth.
  • Default: 0.5
  • Range: 0.3 (subtle) to 0.7 (stronger)
  • Effect: Controls temporal depth perception. Higher = more blur in motion.

7. Delay Time (Pulfrich)

  • Description: How many seconds to delay the Pulfrich ghost frame.
  • Default: 1/30
  • Range: 1/50 to 1/20
  • Effect: Smaller values = subtle motion depth, larger = stronger Pulfrich 3D.
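The delay in seconds maps to a whole number of frames at the video's FPS. A minimal sketch (the at-least-one-frame clamp is an assumption for illustration):

```python
def delay_frames(delay_seconds, fps):
    """Convert the Pulfrich delay time into a whole number of frames,
    always delaying by at least one frame."""
    return max(1, round(delay_seconds * fps))
```

At the default of 1/30 s on 30 fps footage, the ghost frame trails the current frame by exactly one frame.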

8. Feather Strength (Edge Anti-Aliasing)

  • Description: Softens hard 3D edges using depth gradients.
  • Default: 10.0
  • Range: 0 to 20
  • Effect: Reduces ghosting artifacts and hard cutouts around subjects.

9. Feather Blur Size

  • Description: How wide the smoothing kernel should be.
  • Default: 9
  • Range: 1 to 15
  • Effect: Larger = more smoothing, helps reduce halo noise on edges.
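Feathering can be thought of as down-weighting pixels near strong depth edges so shifted content blends instead of tearing. A 1-D sketch of that idea — the exact kernel VisionDepth3D applies will differ:

```python
def feather_mask(depth_row, strength=10.0):
    """Per-pixel feather weight from the local depth gradient along one
    scanline: flat regions get weight 1.0, strong depth edges get small
    weights so their shifted pixels are blended rather than hard-cut."""
    mask = []
    for i in range(len(depth_row)):
        left = depth_row[max(i - 1, 0)]
        right = depth_row[min(i + 1, len(depth_row) - 1)]
        gradient = abs(right - left) / 2.0  # central difference
        mask.append(1.0 / (1.0 + strength * gradient))
    return mask
```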

10. FFmpeg Codec & CRF Quality

  • Codec: Choose GPU-accelerated encoders (h264_nvenc, hevc_nvenc) for faster renders.
  • CRF (Constant Rate Factor):
    • Default: 23
    • Range: 0 (lossless) to 51 (worst)
    • Lower values = better visual quality.
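A sketch of a CRF-based software-encode command. Note that the NVENC encoders use `-cq`/`-qp` rather than `-crf`, so this covers the CPU codecs only; the helper name and exact flag set are assumptions:

```python
def encode_cmd(src, dst, codec="libx264", crf=23):
    """Build an FFmpeg software-encode command; a lower CRF means
    higher visual quality (0 = lossless, 51 = worst).
    The audio stream is copied through unchanged."""
    return ["ffmpeg", "-y", "-i", src, "-c:v", codec,
            "-crf", str(crf), "-c:a", "copy", dst]
```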

11. Dynamic Subject Locking (New!)

  • Checkbox: Lock Subject to Screen
  • Effect: Enables Dynamic Zero Parallax Tracking — the depth plane will automatically follow the subject's depth to minimize excessive 3D warping.
  • Great for: Human characters or central objects in motion.
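Conceptually, subject locking smooths the subject's depth over time and uses it as the zero-parallax plane. A hypothetical sketch using an exponential moving average (the actual tracker in VisionDepth3D may use a different filter):

```python
def track_zero_parallax(subject_depths, alpha=0.1):
    """Smooth the subject's per-frame depth with an exponential moving
    average, yielding a zero-parallax plane per frame that follows the
    subject without jittering."""
    plane = None
    planes = []
    for d in subject_depths:
        plane = d if plane is None else (1 - alpha) * plane + alpha * d
        planes.append(plane)
    return planes
```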

Depth Map Tips

  • Match resolution and FPS between your input video and depth map.
  • Use the Inverse Depth checkbox if bright = far instead of close.
  • Recommended depth models:
    • ZoeDepth, Depth Anything V2, MiDaS, DPT-Large, etc.
    • Choose Large models for better fidelity.

Rendering Time Estimates

| Clip Length | Estimated Time (with GPU) |
| --- | --- |
| 30 seconds | 1–4 mins |
| 5 minutes | 10–25 mins |
| Full Movie | 6–24+ hours |

Example Workflow

  1. Load video and matching depth map.
  2. Choose output format (Half-SBS, Full-SBS, Anaglyph, etc.).
  3. Enable "Lock Subject to Screen" for tracked parallax.
  4. Set feather smoothing to around 10 and blur 9 for cleanest edges.
  5. Set encoder: use NVENC for speed (h264_nvenc), or libx264 for max compatibility.
  6. Hit "Generate 3D Video" and let it roll!

Pulfrich Effect Quick Guide

  • Works by blending delayed + current frames for moving objects.
  • Best for lateral motion scenes (walking, panning, cars, etc.).
  • Tune:
    • blend_factor = 0.4–0.6
    • delay_time = ~1/30
  • Scene changes are automatically smoothed!
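The blend itself is a simple weighted average of the current and delayed frames. A sketch on flat pixel lists (an illustration of the idea, not the actual renderer code):

```python
def pulfrich_blend(current, delayed, blend_factor=0.5):
    """Blend the delayed frame into the current one for the left eye;
    frames are flat lists of pixel intensities here for simplicity.
    A higher blend_factor gives the delayed frame more weight."""
    return [(1 - blend_factor) * c + blend_factor * d
            for c, d in zip(current, delayed)]
```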

Troubleshooting

  • Black/Empty Output: Wrong depth map resolution or mismatch with input FPS.
  • Halo/Artifacts:
    • Increase feather strength and blur size.
    • Enable subject tracking and clamp the zero parallax offset.
  • Out of Memory (OOM):
    • Enable FFmpeg rendering for better memory usage.
    • Use libx264 or h264_nvenc and avoid long clips in one go.

Dev Notes

This tool is being developed by a solo dev with nightly grind energy (🍕 ~4 hours a night). If you find it helpful, let me know — feedback, bug reports, and feature ideas are always welcome!

Acknowledgments & Credits

I want to express my gratitude to the amazing creators and contributors behind the depth estimation models used in this project. Your work has made it possible to push the boundaries of 3D rendering and video processing. 🙌

Supported Depth Models

| Model Name | Creator / Organization | Hugging Face Repository |
| --- | --- | --- |
| Distil-Any-Depth-Large | xingyang1 | Distil-Any-Depth-Large-hf |
| Distil-Any-Depth-Small | xingyang1 | Distil-Any-Depth-Small-hf |
| Depth Anything V2 Large | Depth Anything Team | Depth-Anything-V2-Large-hf |
| Depth Anything V2 Base | Depth Anything Team | Depth-Anything-V2-Base-hf |
| Depth Anything V2 Small | Depth Anything Team | Depth-Anything-V2-Small-hf |
| Depth Anything V1 Large | LiheYoung | Depth-Anything-V2-Large |
| Depth Anything V1 Base | LiheYoung | depth-anything-base-hf |
| Depth Anything V1 Small | LiheYoung | depth-anything-small-hf |
| V2-Metric-Indoor-Large | Depth Anything Team | Depth-Anything-V2-Metric-Indoor-Large-hf |
| V2-Metric-Outdoor-Large | Depth Anything Team | Depth-Anything-V2-Metric-Outdoor-Large-hf |
| DA_vitl14 | LiheYoung | depth_anything_vitl14 |
| DA_vits14 | LiheYoung | depth_anything_vits14 |
| DepthPro | Apple | DepthPro-hf |
| ZoeDepth | Intel | zoedepth-nyu-kitti |
| MiDaS 3.0 | Intel | dpt-hybrid-midas |
| DPT-Large | Intel | dpt-large |
| DinoV2 | Facebook | dpt-dinov2-small-kitti |
| dpt-beit-large-512 | Intel | dpt-beit-large-512 |

πŸ™ Thank You! A huge thank you to all the researchers, developers, and contributors who created and shared these models. Your work is inspiring and enables developers like me to build exciting and innovative applications! πŸš€πŸ’™
