
[Feature] Support ViT-Adapter #9354


Open · wants to merge 8 commits into base: dev-3.x

Conversation

@okotaku (Contributor) commented Nov 21, 2022

Motivation

paper: https://arxiv.org/abs/2205.08534
code: https://github.com/czczup/ViT-Adapter
issue: #9044

Related PR

open-mmlab/mmpretrain#1209
open-mmlab/mmcv#2451
open-mmlab/mmcv#2452

Result

| Backbone | Impl. | box AP | mask AP |
| :------: | :------: | :----: | :-----: |
| DeiT-T | official | 46.0 | 41.0 |
| DeiT-T | mmdet | 45.6 | 40.9 |
| BEiT-B | mmdet | 48.7 | 43.1 |

Checklist

  1. Pre-commit or other linting tools are used to fix the potential lint issues.
  2. The modification is covered by complete unit tests. If not, please add more unit tests to ensure correctness.
  3. If the modification has potential influence on downstream projects, this PR should be tested with downstream projects, like MMDet or MMCls.
  4. The documentation has been modified accordingly, like docstring or example tutorials.

@okotaku changed the title from Vitadapter to [Feature] Support ViT-Adapter on Nov 21, 2022
@ZwwWayne requested a review from hhaAndroid on November 21, 2022 02:27
@ZwwWayne added this to the 3.0.0rc5 milestone on Nov 21, 2022
@ZwwWayne (Collaborator) commented

Hi @okotaku,
Thanks for your kind PR. Overall, the code and config look good to us. May I ask why the performance drops and whether we can fix it?

@ZwwWayne assigned hhaAndroid and unassigned Czm369 on Nov 21, 2022
@okotaku (Contributor, Author) commented Nov 21, 2022

> May I ask why the performance drops and whether we can fix it?

I do not yet know the cause of the performance drop.

One thing I wondered about is the two variants of window attention in the ViT-Adapter code:

https://github.com/czczup/ViT-Adapter/blob/main/detection/mmdet_custom/models/backbones/base/vit.py#L123
https://github.com/czczup/ViT-Adapter/blob/main/detection/mmdet_custom/models/backbones/base/vit.py#L170

However, I concluded that there is no functional difference between them, and implemented window attention following the ViTDet implementation in detectron2:

https://github.com/facebookresearch/detectron2/blob/main/detectron2/modeling/backbone/vit.py#L148

I will continue to investigate, but if you notice anything, please let me know.
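
For context, the sketch below shows the window partition/unpartition scheme used by the referenced ViTDet code in detectron2; it is a simplified illustration of the technique, not this PR's actual implementation:

```python
# Minimal sketch of ViTDet-style window attention support
# (after the helpers used by detectron2's vit.py); simplified for illustration.
import torch
import torch.nn.functional as F


def window_partition(x: torch.Tensor, window_size: int):
    """Split (B, H, W, C) features into non-overlapping windows.

    Returns windows of shape (B * num_windows, window_size, window_size, C)
    and the padded size (Hp, Wp) needed to undo the padding later.
    """
    B, H, W, C = x.shape
    pad_h = (window_size - H % window_size) % window_size
    pad_w = (window_size - W % window_size) % window_size
    if pad_h or pad_w:
        # Pad the H and W axes up to multiples of window_size.
        x = F.pad(x, (0, 0, 0, pad_w, 0, pad_h))
    Hp, Wp = H + pad_h, W + pad_w
    x = x.view(B, Hp // window_size, window_size, Wp // window_size, window_size, C)
    windows = x.permute(0, 1, 3, 2, 4, 5).reshape(-1, window_size, window_size, C)
    return windows, (Hp, Wp)


def window_unpartition(windows, window_size, pad_hw, hw):
    """Inverse of window_partition; crops the padding back off."""
    Hp, Wp = pad_hw
    H, W = hw
    B = windows.shape[0] // (Hp * Wp // window_size // window_size)
    x = windows.reshape(B, Hp // window_size, Wp // window_size,
                        window_size, window_size, -1)
    x = x.permute(0, 1, 3, 2, 4, 5).reshape(B, Hp, Wp, -1)
    return x[:, :H, :W, :].contiguous()
```

In a block with window attention, the feature map is partitioned before the attention call and unpartitioned afterwards, so the two ViT-Adapter variants linked above should differ only in how they reshape, not in what the attention computes.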

@okotaku (Contributor, Author) commented Nov 23, 2022

I found two differences.

  1. The dropout rate in MultiScaleDeformableAttention: official = 0.0, mine = 0.1 (the mmcv default).
  2. Whether layer_scale is used: official = with layer_scale, mine = without layer_scale.

I will fix these, retrain, and check the mAP again; a sketch of the corresponding config changes is below.
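
For concreteness, the two settings could be pinned in the config roughly as follows. `dropout` is a real argument of mmcv's MultiScaleDeformableAttention (default 0.1), while the backbone option names (`deform_attn_cfg`, `layer_scale_init_value`) and the registry name are hypothetical, since the mmcls backbone PR is not merged yet:

```python
# Sketch only: aligning the two settings with the official ViT-Adapter code.
model = dict(
    backbone=dict(
        type='ViTAdapter',  # assumed registry name
        # 1. Match the official dropout rate inside the deformable attention.
        deform_attn_cfg=dict(  # hypothetical option name
            type='MultiScaleDeformableAttention',
            dropout=0.0),  # official = 0.0; mmcv default = 0.1
        # 2. Enable layer_scale as in the official implementation.
        layer_scale_init_value=1e-6))  # hypothetical option name
```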

@okotaku changed the title from [Feature] Support ViT-Adapter to [WIP][Feature] Support ViT-Adapter on Nov 24, 2022
@okotaku (Contributor, Author) commented Dec 1, 2022

In my experiments, layer scale made the mAP worse. With the latest config I recorded box AP = 45.6 and mask AP = 40.9, which is close to the official performance.
Once the code on the mmcls side is refactored, I will remove the WIP status.

@okotaku changed the title from [WIP][Feature] Support ViT-Adapter to [Feature] Support ViT-Adapter on Dec 20, 2022

| Backbone | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Config | Download |
| :------: | :-----: | :------: | :------------: | :----: | :-----: | :----------------------------------------------------: | :----------------------: |
| DeiT-T | 3x | | | | | [config](./mask-rcnn_vitadapter-deit-t_fpn_3x_coco.py) | [model](<>) \| [log](<>) |
Collaborator (review comment on the README table above):

We can update this README.

@ZwwWayne (Collaborator) commented

Hi @okotaku,
Thanks for your kind PR. We plan to merge it this week. Would you like to simply put these files under `projects/`, as was done for ConvNeXt V2? That way you would not need to update the metafile.yaml, and the PR could be merged quickly.

@ZwwWayne assigned zwhus and unassigned hhaAndroid on Jan 16, 2023
@okotaku (Contributor, Author) commented Jan 17, 2023

@ZwwWayne I understand.
However, the mmcls PRs have not been merged yet, so we may have to wait for that.

@ZwwWayne (Collaborator) commented

> @ZwwWayne I understand. However, the mmcls PRs have not been merged yet, so we may have to wait for that.

Hi @okotaku,
Thanks for your quick response. I had missed the situation in mmcls and have reminded them. Could you also move the files into the `projects/` folder? The expected file structure looks like this:

|-- .gitignore
|-- projects
    |-- ViTAdapter
        |-- configs
            |-- mask-rcnn_beitadapter-b_fpn_3x_coco.py
        |-- README.md
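
For illustration, configs under `projects/` in mmdet 3.x usually inherit the base configs through the `mmdet::` scope and register the project's local modules via `custom_imports`. In the sketch below, the module path, backbone name, and options are assumptions, not the PR's final config:

```python
# Hypothetical projects/ViTAdapter/configs/mask-rcnn_beitadapter-b_fpn_3x_coco.py
_base_ = [
    'mmdet::_base_/models/mask-rcnn_r50_fpn.py',
    'mmdet::common/ms-poly_3x_coco-instance.py',
]

# Make the modules shipped inside projects/ViTAdapter visible to the registry.
custom_imports = dict(
    imports=['projects.ViTAdapter.vitadapter'],  # assumed package path
    allow_failed_imports=False)

model = dict(
    backbone=dict(
        _delete_=True,      # drop the ResNet-50 backbone from the base config
        type='ViTAdapter',  # assumed registry name
        # adapter-specific options would go here
    ))
```

Keeping everything inside `projects/ViTAdapter` avoids touching mmdet's model zoo metadata, which is why such a PR can be merged quickly.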
