Stars
Ongoing research training transformer models at scale
[ICML'25] "ConText: Driving In-context Learning for Text Removal and Segmentation"
DeepFashion2 Dataset https://arxiv.org/pdf/1901.07973.pdf
Awesome work on hand pose estimation/tracking
Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving state-of-the-art performance on 38 out of 60 public benchmarks.
[NeurIPS'23] Emergent Correspondence from Image Diffusion
CoTracker is a model for tracking any point (pixel) on a video.
ECCV 2020 paper "Whole-Body Human Pose Estimation in the Wild"
A real-time approach for mapping all human pixels of 2D RGB images to a 3D surface-based model of the body
SAM-PT: Extending SAM to zero-shot video segmentation with point-based tracking.
[CVPR2024, Highlight] Official code for DragDiffusion
This project is the official implementation of our ECCV 2018 paper "Simple Baselines for Human Pose Estimation and Tracking" (https://arxiv.org/abs/1804.06208)
[CVPR 2024] Official implementation of the paper "Visual In-context Learning"
🔥🔥 UNO: A Universal Customization Method for Both Single and Multi-Subject Conditioning
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o's performance
Code for the EMNLP 2024 paper "How do Large Language Models Learn In-Context? Query and Key Matrices of In-Context Heads are Two Towers for Metric Learning"
[ICLR'25] Official code for the paper 'MLLMs Know Where to Look: Training-free Perception of Small Visual Details with Multimodal LLMs'
This repo contains the code for "MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks" [ICLR 2025]
[CVPR 2025] LamRA: Large Multimodal Model as Your Advanced Retrieval Assistant
Qwen2.5-Omni is an end-to-end multimodal model by Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, video, and performing real-time speech generation.
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey
Official repository of "GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing"
Official Repo for Paper "OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision" [ICLR2025]
[ICLR 2025 Spotlight] OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text