Approach-1 Using StableDiffusionPipeline
In approach-1 we will use the simple StableDiffusionPipeline with a pre-trained model open-sourced by RunwayML.
Import required Libraries
Python3

import torch
from diffusers import StableDiffusionPipeline
Create Stable Diffusion Pipeline
With StableDiffusionPipeline, creating the pipeline is a single line: we pass the name of the pre-trained model, and the pipeline internally loads both the model and the scheduler. To reduce memory usage and speed up inference, the model weights are loaded in 16-bit floating-point precision instead of 32-bit. Finally, the pipeline is moved to CUDA so that inference runs on the GPU.
Python3

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")
Define prompt and run Pipeline
Now we can define a custom prompt and pass the text directly to the pipeline. The call returns an output object whose images attribute holds a list of the generated images.
Python3

prompt = "a horse racing near beach, 8k, realistic photography"
image = pipe(prompt).images[0]
image
Output:
Great, I hope you got good results with your prompt. Let's proceed with the final approach in this article.
Before you proceed to the next approach, create a new notebook and switch the runtime to GPU again. Running both approaches in the same notebook will exhaust GPU memory and raise an error.
Build Text To Image with HuggingFace Diffusers
This article implements a text-to-image application using the Hugging Face Diffusers library. We will demonstrate two different pipelines with two different pre-trained Stable Diffusion models. Before we dive into the code implementation, let us understand Stable Diffusion.