GitHub - critocrito/models-all-the-way: An investigation by Knowing Machines and Der SPIEGEL into the LAION-5B dataset.

critocrito/models-all-the-way

Models all the Way Down/Wie das Weltbild einer künstlichen Intelligenz entsteht

An investigation by Knowing Machines and Der SPIEGEL into the LAION-5B dataset, a collection of over 5 billion image-text pairs. These are the data assets produced during the investigation:

Domains

NSFW and licenses

Domain classifications

Histograms of similarity, watermark and unsafe attributes.

Breakdown by languages

Check for robots.txt and duplicates across the subsets.
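
This readme does not describe how the robots.txt check was carried out. As a minimal sketch of what such a check could look like, the snippet below uses Python's standard `urllib.robotparser` to test whether a given robots.txt body permits a crawler to fetch a URL. The helper name `is_allowed`, the user agent `ExampleBot`, and the sample robots.txt content and URLs are all hypothetical and are not taken from the investigation.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt content, for illustration only.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    # Hypothetical URLs; a real check would use URLs from the dataset.
    for url in (
        "https://example.com/images/cat.jpg",
        "https://example.com/private/cat.jpg",
    ):
        print(url, is_allowed(EXAMPLE_ROBOTS, "ExampleBot", url))
```

Passing the robots.txt text in as a string keeps the check separate from the download step, so the same function can be run over robots.txt files that were fetched and stored earlier.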

A note on licensing

The Engelberg Center and Knowing Machines do not claim any rights in the data assets included in this repo. Therefore, the CC0 license attached to the repo only applies to the extent that there are new rights in this specific compilation, and to the text of this readme.
