AI Notes

Notes on the AI state of the art, with a focus on generative and large language models. These are the "raw materials" for the https://lspace.swyx.io/ newsletter.

This repo used to be called https://github.com/sw-yx/prompt-eng, but was renamed because Prompt Engineering is Overhyped.

This README is just a high-level overview of the space; most updates land in the other markdown files in this repo:

IMAGE_GEN.md - the most developed file, with the heaviest emphasis on Stable Diffusion, and some notes on Midjourney and DALL-E.
TEXT.md - text generation, mostly with GPT3
CODE.md - codegen models, like Copilot
stubs - very small/lightweight proto pages
- AGENTS.md - tracking "agentic AI"
- AUDIO.md - tracking audio (transcription + generation)

Table of Contents

Motivational Use Cases
Top AI Reads
Communities
People
Misc
Quotes, Reality & Demotivation
Infrastructure
Legal, Ethics, and Privacy

Motivational Use Cases

images
video
- img2img of famous movie scenes (lalaland)
  - img2img transforming actor with ebsynth + koe_recast
- virtual fashion (karenxcheng)
- seamless tiling images
- evolution of scenes (xander)
- outpainting https://twitter.com/orbamsterdam/status/1568200010747068417?s=21&t=rliacnWOIjJMiS37s8qCCw
- webUI img2img collaboration https://twitter.com/_akhaliq/status/1563582621757898752
- image to video with rotation https://twitter.com/TomLikesRobots/status/1571096804539912192
- "prompt paint" https://twitter.com/1littlecoder/status/1572573152974372864
- audio2video animation of your face https://twitter.com/siavashg/status/1597588865665363969
- music videos
  - video killed the radio star (colab): uses OpenAI's Whisper speech-to-text to take a YouTube video and create a Stable Diffusion animation prompted by the lyrics in that video
  - Stable Diffusion Videos generates videos by interpolating between prompts and audio
- direct text2video project
text-to-3d https://twitter.com/_akhaliq/status/1575541930905243652
- https://dreamfusion3d.github.io/
- open source impl: https://github.com/ashawkey/stable-dreamfusion
- demo https://twitter.com/_akhaliq/status/1578035919403503616
text products
Jasper
gpt3 email https://github.com/sw-yx/gpt3-email
gpt3() in google sheet 2020, 2022 - sheet
https://www.summari.com/ Summari helps busy people read more
sequoia market map https://twitter.com/sonyatweetybird/status/1584580362339962880
base10 market map https://twitter.com/letsenhance_io/status/1594826383305449491
- game assets - emad thread https://twitter.com/EMostaque/status/1591436813750906882

Communities

StableDiffusion Discord https://discord.com/invite/stablediffusion
https://reddit.com/r/stableDiffusion
Akhaliq Discord: https://discord.gg/nYqfg4gnBt
Deforum Discord https://discord.gg/upmXXsrwZc
Lexica Discord https://discord.com/invite/bMHBjJ9wRh
Midjourney's discord
- how to use midjourney v4 https://twitter.com/fabianstelzer/status/1588856386540417024?s=20&t=PlgLuGAEEds9HwfegVRrpg
https://stablehorde.net/

People

This list will be out of date but will get you started. My live list of people to follow is at: https://twitter.com/i/lists/1585430245762441216

Misc

Whisper
- https://huggingface.co/spaces/sensahin/YouWhisper YouWhisper converts YouTube videos to text using openai/whisper.
- https://twitter.com/jeffistyping/status/1573145140205846528 youtube whisperer
- multilingual subtitles https://twitter.com/1littlecoder/status/1573030143848722433
- video subtitles https://twitter.com/m1guelpf/status/1574929980207034375
- you can join whisper to stable diffusion for reasons https://twitter.com/fffiloni/status/1573733520765247488/photo/1
- known problems https://twitter.com/lunixbochs/status/1574848899897884672 (edge case with catastrophic failures)
textually guided audio https://twitter.com/FelixKreuk/status/1575846953333579776
Codegen
- CodegeeX https://twitter.com/thukeg/status/1572218413694726144
- https://github.com/salesforce/CodeGen https://joel.tools/codegen/
pdf to structured data https://www.impira.com/blog/hey-machine-whats-my-invoice-total
text to Human Motion diffusion https://twitter.com/GuyTvt/status/1577947409551851520
- abs: https://arxiv.org/abs/2209.14916
- project page: https://guytevet.github.io/mdm-page/

Quotes, Reality & Demotivation

Narrow, tedious-domain use cases https://twitter.com/WillManidis/status/1584900092615528448?s=20&t=aV0Np-2Sx-zq-TQNn2y5AQ
antihype https://twitter.com/alexandr_wang/status/1573302977418387457
things stablediffusion struggles with https://opguides.info/posts/aiartpanic/
New Google
- https://twitter.com/alexandr_wang/status/1585022891594510336
New Powerpoint
via emad
Appending prompts by default in UI
DALLE: https://twitter.com/levelsio/status/1588588688115912705?s=20&t=0ojpGmH9k6MiEDyVG2I6gg

Infrastructure

bananadev cold boot problem https://twitter.com/erikdunteman/status/1584992679330426880?s=20&t=eUFvLqU_v10NTu65H8QMbg
replicate.com
banana.dev
huggingface.co
lambdalabs.com
astriaAI
cost of chatgpt - https://twitter.com/tomgoldsteincs/status/1600196981955100694
- A 3-billion parameter model can generate a token in about 6ms on an A100 GPU
- For a 175B-parameter model, it should take about 350ms for an A100 GPU to print out a single word
- You would need 5 80GB A100 GPUs just to load the model and text. ChatGPT cranks out about 15-20 words per second. If it uses A100s, that could be done on an 8-GPU server (a likely choice on Azure cloud)
- On Azure cloud, each A100 card costs about $3 an hour. That's $0.0003 per word generated.
- The model usually responds to my queries with ~30 words, which adds up to about 1 cent per query.
- If an average user makes 10 queries per day, I think it’s reasonable to estimate that ChatGPT serves ~10M queries per day (which implies roughly 1M users).
- I estimate the cost of running ChatGPT is $100K per day, or $3M per month.
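
The arithmetic in the thread above can be reproduced as a short Python sketch. Every input is a figure quoted in the bullets, except three assumptions added here: fp16 weights (2 bytes per parameter), the upper end of the 15-20 words per second range, and a 30-day month. The function names are illustrative, and like the thread, the sketch treats one token as roughly one word.

```python
import math

# Figures quoted in the thread above (estimates, not measurements).
MS_PER_TOKEN_3B = 6.0      # ms per token for a 3B-parameter model on one A100
GPUS_PER_SERVER = 8
USD_PER_GPU_HOUR = 3.0
WORDS_PER_QUERY = 30
QUERIES_PER_DAY = 10_000_000

# Assumptions added for this sketch (not stated in the thread).
WORDS_PER_SECOND = 20      # upper end of the quoted 15-20 words/sec
BYTES_PER_PARAM = 2        # fp16 weights
GPU_MEMORY_GB = 80
DAYS_PER_MONTH = 30


def ms_per_token(params_billions: float) -> float:
    """Scale the quoted 3B latency linearly with parameter count."""
    return MS_PER_TOKEN_3B * params_billions / 3


def gpus_to_hold_weights(params_billions: float) -> int:
    """Number of GPUs needed just to fit the weights in memory."""
    weights_gb = params_billions * BYTES_PER_PARAM
    return math.ceil(weights_gb / GPU_MEMORY_GB)


def usd_per_word() -> float:
    """Cost of one generated word on a fully used 8-GPU server."""
    server_usd_per_second = GPUS_PER_SERVER * USD_PER_GPU_HOUR / 3600
    return server_usd_per_second / WORDS_PER_SECOND


def usd_per_query() -> float:
    return usd_per_word() * WORDS_PER_QUERY


def usd_per_day() -> float:
    return usd_per_query() * QUERIES_PER_DAY


if __name__ == "__main__":
    print(f"175B latency:   {ms_per_token(175):.0f} ms per token")
    print(f"GPUs for 175B:  {gpus_to_hold_weights(175)}")
    print(f"Cost per word:  ${usd_per_word():.5f}")
    print(f"Cost per query: ${usd_per_query():.3f}")
    print(f"Cost per day:   ${usd_per_day():,.0f}")
    print(f"Cost per month: ${usd_per_day() * DAYS_PER_MONTH:,.0f}")
```

Under these assumptions the per-word cost comes out near $0.0003 and the daily total near $100K, in line with the thread. Using 15 words per second instead raises every cost figure by a third, so the result is sensitive to the throughput guess.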

Legal, Ethics, and Privacy

NSFW filter https://vickiboykis.com/2022/11/18/some-notes-on-the-stable-diffusion-safety-filter/
On "AI Art Panic" https://opguides.info/posts/aiartpanic/
Yannic influencing OpenRAIL-M https://www.youtube.com/watch?v=W5M-dvzpzSQ
art schools accepting AI art https://twitter.com/DaveRogenmoser/status/1597746558145265664
