Set Up the Access Token in Hugging Face
Hugging Face gives us access to a Stable Diffusion model. Hugging Face is a platform and community that offers free, open-source machine learning models and datasets. The platform is free to use (a paid tier exists for hosted services), and the models themselves are open source. If you don't already have one, you must create a Hugging Face account. After registering, you need an "Access Token".
Steps to set up the Access Token in Hugging Face:
- Click on your profile icon
- Click on “Settings”
- Navigate to “Access Tokens” on the left tab
- You can either generate a new token or use an existing one
- Copy the token
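Rather than hard-coding the copied token in your script, you can keep it in an environment variable and read it at startup. A minimal sketch, assuming the variable is named HF_TOKEN (this name is just a convention, not something Hugging Face requires):

```python
import os

# Read the access token from an environment variable rather than hard-coding it.
# HF_TOKEN is an assumed name; set it in your shell first, e.g. `export HF_TOKEN=hf_...`
auth_token = os.environ.get("HF_TOKEN", "")
```

This keeps the token out of your source code, so it is not accidentally committed or shared.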
Image Generation
We import a pre-trained Stable Diffusion model, identified by modelid. The model is loaded into the pipe variable using the provided authentication token, and the VAE's encoder, which text-to-image generation does not use, is deleted to save memory.
Using the Stable Diffusion pipeline, we define a generate() function that reads the text prompt from the entry field, generates an image, saves it, and updates image_label to display the new image. The generate() function is called when the "Generate" button is clicked.
from customtkinter import *
from PIL import ImageTk
import torch
from diffusers import StableDiffusionPipeline

auth_token = "Your Auth Token from Huggingface"

app = CTk()
app.geometry("500x500")
set_appearance_mode('light')

prompt = CTkEntry(app, height=30, width=350, font=("Arial", 15), text_color="black", fg_color="white")
prompt.place(x=10, y=10)

image_label = CTkLabel(app, height=400, width=400, bg_color='white', corner_radius=15)
image_label.place(x=50, y=70)

modelid = "CompVis/stable-diffusion-v1-4"
device = "cuda" if torch.cuda.is_available() else "cpu"

pipe = StableDiffusionPipeline.from_pretrained(modelid, use_auth_token=auth_token)
pipe = pipe.to(device)  # move the pipeline to the GPU if one is available
del pipe.vae.encoder  # the VAE encoder is not needed for text-to-image, so free its memory

def generate():
    # read the prompt, run the pipeline, save and display the result
    image = pipe(prompt.get(), guidance_scale=8.5).images[0]
    image.save('generatedimage.png')
    img = ImageTk.PhotoImage(image)
    image_label.configure(image=img)
    image_label.image = img  # keep a reference so the image is not garbage-collected

trigger = CTkButton(app, height=30, width=120, font=("Arial", 15), text_color="white", fg_color="#3DB7E4", command=generate)
trigger.configure(text="Generate")
trigger.place(x=370, y=10)

app.title('Text to Image')
app.mainloop()
The guidance_scale parameter is set to 8.5. The guidance scale controls the strength of the guidance signal sent to the model: a higher value produces images that match the text prompt more closely but may be less varied. Once the code has been executed, give it some time to download the Stable Diffusion model weights. Depending on your internet speed and system specifications, this can take a few minutes.
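Under the hood this is classifier-free guidance: at each denoising step the pipeline predicts noise twice, once without the prompt and once with it, and blends the two predictions. A toy sketch of the blending formula, with flat lists of floats standing in for real tensors:

```python
def apply_guidance(noise_uncond, noise_text, guidance_scale):
    # classifier-free guidance: push the unconditional prediction
    # toward the text-conditioned one, scaled by guidance_scale
    return [u + guidance_scale * (t - u)
            for u, t in zip(noise_uncond, noise_text)]

# With scale 1.0 the result equals the text-conditioned prediction;
# larger scales exaggerate the difference between the two predictions.
blended = apply_guidance([0.0, 1.0], [1.0, 1.0], 8.5)  # -> [8.5, 1.0]
```

This is why a large guidance_scale follows the prompt more literally: it amplifies exactly the component of the prediction that the text conditioning contributed.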
Note: This model will run on the CPU if you don't have access to a GPU. This works, but depending on your CPU, generating images may take considerably longer.
Output
[Output screenshot: Build an AI Image Generator App With Tkinter]
Finally, let's take a brief look at the field of diffusion models, which are used to create images from text. Using a Markov chain, a diffusion model gradually adds noise to the data, then learns to reverse the process, producing the desired data sample from pure noise. Notable diffusion models include Stability AI's Stable Diffusion, Google's Imagen, and OpenAI's DALL·E 2.
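The forward (noising) half of that Markov chain can be sketched in a few lines. This toy version treats an "image" as a flat list of floats and uses a single fixed noise level beta; real models use per-step noise schedules and a trained network for the reverse process:

```python
import random

def forward_diffuse(x, steps, beta=0.1):
    # Toy forward diffusion: each step is a Markov transition that
    # mixes the current sample with fresh Gaussian noise.
    for _ in range(steps):
        x = [(1 - beta) ** 0.5 * xi + beta ** 0.5 * random.gauss(0.0, 1.0)
             for xi in x]
    return x

# After many steps the original signal is mostly replaced by noise;
# generation runs the learned reverse transitions from pure noise back to data.
noisy = forward_diffuse([1.0, -1.0, 0.5], steps=50)
```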