Due Sunday 14/5/23

Finetuning Stable Diffusion using Dreambooth

primer on stable diffusion → https://jalammar.github.io/illustrated-stable-diffusion/?utm_source=buildspace.so&utm_medium=buildspace_project

wtf is dreambooth? → https://dreambooth.github.io/?utm_source=buildspace.so&utm_medium=buildspace_project

Beyond SD

You’re now well-versed in what Stable Diffusion is and how it works. But it has one big problem: you can’t teach it new things. There’s no way I can give it 10 pictures of my dog and make it generate images of my dog on the bed so I can gaslight him into thinking he broke the rules.

This is where Dreambooth comes in. It's a fine-tuning technique that lets you teach a text-to-image model to generate photorealistic images of a specific subject in all sorts of different contexts.

Stable Diffusion knows everything about the general world - what clouds look like, how bald Dwayne “The Rock” Johnson is, and what rainbows are made of. Using Dreambooth, you can teach it what you look like too!

How do Dreambooth + SD work together?

To use Dreambooth, we’ll have to give it some training data: a handful of images of the subject we want to generate (ourselves, a pet, whatever), along with a unique identifier that stands in for the subject's name and the class of thing the subject belongs to ("person", "dog", etc.).
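To make that concrete, here's a minimal sketch of how those pieces turn into training prompts. The identifier "sks" is a commonly used rare token, but the function name and everything else here is illustrative, not part of any particular library:

```python
# Sketch (hypothetical helper): Dreambooth pairs an "instance prompt"
# containing a rare identifier token with a plain "class prompt".

def build_prompts(identifier: str, subject_class: str) -> dict:
    """Build the two prompts Dreambooth-style training uses."""
    return {
        # Used with YOUR photos: ties the rare token to your subject.
        "instance_prompt": f"a photo of {identifier} {subject_class}",
        # Used with generic class images: reminds the model what the
        # class looks like in general, so it doesn't forget.
        "class_prompt": f"a photo of {subject_class}",
    }

prompts = build_prompts("sks", "dog")
print(prompts["instance_prompt"])  # a photo of sks dog
print(prompts["class_prompt"])     # a photo of dog
```

The rare token matters: pick something the model has no strong associations with, so the only thing it learns to attach to that token is your subject.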

Dreambooth then fine-tunes a pre-trained text-to-image model to learn to recognize and generate images of the specific subject.
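That class label isn't decorative: the Dreambooth paper's trick is a "prior preservation" term, where the model also trains on generic images of the class so it doesn't start thinking every dog is *your* dog. Here's a toy sketch of that objective, with made-up names and plain lists standing in for real noise tensors:

```python
# Toy sketch of the Dreambooth training objective: the usual diffusion
# denoising loss on the subject photos, plus a prior-preservation loss
# on generic class images. Names here are illustrative, not a real API.

def mse(a, b):
    """Mean squared error between two equal-length lists of floats."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def dreambooth_loss(pred_noise, true_noise,
                    pred_noise_class, true_noise_class,
                    prior_weight=1.0):
    # How well the model denoises images of YOUR subject
    # (this is what ties the identifier token to the subject)...
    subject_term = mse(pred_noise, true_noise)
    # ...plus how well it still denoises generic class images
    # (this keeps it from forgetting what "dog" means in general).
    prior_term = mse(pred_noise_class, true_noise_class)
    return subject_term + prior_weight * prior_term

# Perfect on the subject, off by 1.0 everywhere on the class images:
print(dreambooth_loss([1.0, 2.0], [1.0, 2.0], [0.0, 0.0], [1.0, 1.0]))
```

In practice you wouldn't write this yourself; training scripts (e.g. the diffusers Dreambooth example) implement it for you, and you just point them at your photos and prompts.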

How does it work?

Unlike magic, this tech is even cooler when you know the trick.

We've got the OG Stable Diffusion model. That's not the special part. What's really going to make it ours is the set of input photos we'll use. Here's a simplified flow of how it'll work: