MultiSig

MultiSig is a classification architecture for multi-modal data. Each data modality is tokenized via signature methods. A decoder then performs two classification tasks: predicting the class label and the data type. The shared encoder is especially useful in low-data settings where the modalities are unbalanced. The architecture currently supports image (.jpg), video (.mp4) and audio (.wav) data. The signature tokenizations extend the ideas discussed in ImageSig (https://arxiv.org/abs/2205.06929).
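As a rough illustration of what signature tokenization can look like, the sketch below computes the depth-2 truncated signature of a piecewise-linear path with NumPy and emits one token per window of a stream. The names `signature_depth2` and `tokenize_stream`, the truncation depth, and the windowing scheme are assumptions made for this example. They are not taken from the repository's code, whose tokenizers may differ.

```python
import numpy as np


def signature_depth2(path: np.ndarray) -> np.ndarray:
    """Depth-2 truncated signature of a piecewise-linear path of shape (T, d).

    Returns a vector of length d + d*d: the level-1 terms (total increment)
    followed by the flattened level-2 terms (iterated integrals).
    """
    increments = np.diff(path, axis=0)   # (T-1, d) step vectors
    level1 = increments.sum(axis=0)      # (d,) total displacement
    offsets = path[:-1] - path[0]        # displacement accumulated before each step
    # Exact level-2 terms for a piecewise-linear path:
    # sum_k offset_k (x) step_k + 0.5 * step_k (x) step_k
    level2 = offsets.T @ increments + 0.5 * (increments.T @ increments)
    return np.concatenate([level1, level2.ravel()])


def tokenize_stream(stream: np.ndarray, window: int) -> np.ndarray:
    """Split a (T, d) stream into non-overlapping windows (hypothetical scheme)
    and return one signature token per window, shape (T // window, d + d*d)."""
    n_windows = stream.shape[0] // window
    return np.stack([
        signature_depth2(stream[i * window:(i + 1) * window])
        for i in range(n_windows)
    ])
```

For example, `tokenize_stream(np.random.default_rng(0).normal(size=(20, 3)), window=5)` yields an array of shape `(4, 12)`. An image, video frame sequence or audio waveform would first have to be arranged as such a stream; how that is done is modality-specific and is not shown here.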

The architecture was tested on a (quite unbalanced) dataset with the following structure:

data
├── training_set
│   ├── bird (1000 .jpg / 15 .mp4 / 8 .wav)
│   ├── cat  (5000 .jpg / 65 .mp4 / 5 .wav)
│   └── dog  (5000 .jpg /  2 .mp4 / 4 .wav)
└── test_set
    ├── bird (1000 .jpg /  0 .mp4 / 3 .wav)
    ├── cat  (1000 .jpg /  0 .mp4 / 3 .wav)
    └── dog  (1000 .jpg /  0 .mp4 / 0 .wav)
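A directory layout like this can be indexed into (file, label, data type) triples, which are the two targets the classifier predicts. The following is a minimal sketch using only the standard library. The helper names `index_split` and `modality_counts` and the `EXT_TO_MODALITY` mapping are hypothetical and do not come from the repository.

```python
from collections import Counter
from pathlib import Path

# Assumed mapping from file extension to data type, based on the supported formats.
EXT_TO_MODALITY = {".jpg": "image", ".mp4": "video", ".wav": "audio"}


def index_split(split_dir):
    """Return a list of (path, label, modality) for files under split_dir/<label>/."""
    samples = []
    for class_dir in sorted(Path(split_dir).iterdir()):
        if not class_dir.is_dir():
            continue
        for file_path in sorted(class_dir.iterdir()):
            modality = EXT_TO_MODALITY.get(file_path.suffix.lower())
            if modality is not None:  # skip unsupported file types
                samples.append((file_path, class_dir.name, modality))
    return samples


def modality_counts(samples):
    """Count samples per (label, modality) pair to expose the imbalance."""
    return Counter((label, modality) for _, label, modality in samples)
```

Calling `modality_counts(index_split("data/training_set"))` on the layout above would report the per-class counts for each data type, which makes the imbalance between images and the other modalities easy to inspect.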

This work was produced as part of a two-week industry mini-project in collaboration with DataSig, supervised by Dr Mohamed Ibrahim. An accompanying presentation is available.

lorenzolucchese/multisig

Folders and files

Latest commit

History

Repository files navigation

MultiSig

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages