email - apsanprofessional@gmail.com
github - APSan785
Progress Report
| Date | Progress |
| --- | --- |
| 05/05/23 - 08/05/23 | → Generated images based on random prompts from Lexica.art → Learned about neural networks and basic ML concepts from these links: Neural networks by 3b1b; "What Is Backpropagation?" (Medium); "First neural network for beginners explained (with code)" by Arthur Arnx (Towards Data Science) → Read about Stable Diffusion at https://jalammar.github.io/illustrated-stable-diffusion/ and tried to understand the maths behind it, but sadly couldn't :( |
| 11/05/23 | Configured a Colab notebook following https://www.youtube.com/watch?v=iMlMfrXJYSg&ab_channel=AmitThinks |
| 15/05/23 | Completed Assignment 1: https://dreambooth.github.io/?utm_source=buildspace.so&utm_medium=buildspace_project. Uploaded 6 images for training, but my laptop camera is trash, so the quality was not good. Uploaded the model to Hugging Face, but how some of the notebook cells work is still not clear to me. My GDrive was also completely filled. |
| 19/05/23 | Read about Stable Diffusion and how each component is implemented with Python libraries: https://towardsdatascience.com/stable-diffusion-using-hugging-face-501d8dbdd8 and https://towardsdatascience.com/stable-diffusion-using-hugging-face-variations-of-stable-diffusion-56fd2ab7a265. These explain CLIP, the VAE, and the UNet. CLIP converts the text prompt to embeddings. The VAE encodes images to latents. The UNet, conditioned on the text embeddings, removes noise from a noisy latent over several steps to get the final latent. The VAE then decodes that latent to an image. |
| 22/05/23 | Tried some video-to-video generation using Pix2Pix from Hugging Face. |
| 30/05/23 | Used InstructPix2Pix for image-to-image editing based on prompts. The results were not as consistent as needed. |
| 06/07/23 | Cross Attention Colab notebook. Used cross attention to generate images and make prompt-to-prompt changes. Large-scale language-image models (e.g. Stable Diffusion) are usually hard to control by editing the prompt alone and can be very unpredictable and unintuitive for users. Cross attention gives access to set the weight of any word (token) entered in the prompt. Also changing the seed |
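
To make the token-weighting idea from the 06/07/23 entry more concrete, here is a toy sketch in plain Python. It is not the code from the Colab notebook. The 2-D embeddings, the `token_weights` parameter, and the helper names are all made up for illustration, and it assumes that re-weighting works by scaling a token's entry in the softmaxed attention map, which is how I understand prompt-to-prompt re-weighting.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(query, keys, values, token_weights=None):
    """One image-side query attending over the prompt tokens.

    token_weights is a hypothetical per-token scale factor applied to the
    attention map after the softmax (assumption: no renormalisation).
    """
    dim = len(query)
    # Scaled dot product between the query and each token's key.
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
        for key in keys
    ]
    attn = softmax(scores)
    if token_weights is not None:
        attn = [a * w for a, w in zip(attn, token_weights)]
    # Output is the attention-weighted sum of the token value vectors.
    out = [
        sum(a * value[i] for a, value in zip(attn, values))
        for i in range(len(values[0]))
    ]
    return attn, out


# Made-up embeddings for a three-token prompt (illustration only).
tokens = ["a", "fluffy", "cat"]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
query = [0.5, 0.5]

base_attn, base_out = cross_attention(query, keys, values)

# Double the influence of "fluffy" and leave the other tokens alone.
weights = [1.0, 2.0, 1.0]
boosted_attn, boosted_out = cross_attention(query, keys, values, weights)

for token, before, after in zip(tokens, base_attn, boosted_attn):
    print(f"{token:>6}: {before:.3f} -> {after:.3f}")
```

Because the value vectors here are one-hot, the output vector equals the attention map, so the effect of the weight on "fluffy" can be read directly from the printed numbers. In a real model the same scaling would happen inside every cross-attention layer of the UNet, for every image position.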