E-DALL-E: Creating Digital Art with Varying Aspect Ratios

You may have seen some images generated from text using DALL-E 2 [1] from OpenAI. Although the system is impressive and the results are incredible, it's currently only available in a closed beta with a [waitlist](https://labs.openai.com/waitlist). However, you can access and run an independent text-to-image system called DALL-E Mini [2], led by developers Boris Dayma and Pedro Cuenca. Although the results are not as spectacular as the images generated by DALL-E 2, they are still excellent, and the source code and trained models are free and open source. You can try it out in their ad-supported [demo](https://www.craiyon.com/).

You may have noticed that the images generated by both DALL-E models use a 1:1 aspect ratio; the images are always dead square. The systems will not produce images in landscape or portrait formats, which limits their usefulness.

However, I noticed that the image generator for DALL-E Mini uses the VQGAN model [3], which I know very well from [several articles](https://robgon.medium.com/list/creating-fine-art-with-ai-73476c209de3) I wrote on image generation. I also know that VQGAN can render images with varying aspect ratios. So I wrote a little code to take the output from the DALL-E models, or any image, and expand the aspect ratio using VQGAN guided by CLIP from OpenAI [4]. I call the system Expand-DALL-E, or E-DALL-E for short. You can run it on Colab [here](https://colab.research.google.com/github/robgon-art/e-dall-e/blob/main/E_DALL_E_Image_Expander.ipynb). And be sure to check out the image gallery in the appendix.

[**Read More**](https://towardsdatascience.com/e-dall-e-creating-digital-art-with-varying-aspect-ratios-5de260f4713d)
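To make the expansion idea more concrete, here is a rough, hypothetical sketch of what CLIP-guided VQGAN outpainting might look like. This is not the E-DALL-E notebook's actual code: the `load_vqgan` helper, the checkpoint name, the canvas size, the learning rate, and the loss weighting are all illustrative assumptions, and only the OpenAI `clip` calls use the real library API.

```python
# Hypothetical sketch of CLIP-guided VQGAN "expansion" -- not the E-DALL-E notebook code.
# Assumes a helper `load_vqgan(name)` that returns a taming-transformers-style model
# with .encode()/.decode(); the CLIP calls use OpenAI's clip package.
import torch
import torch.nn.functional as F
import clip  # pip install git+https://github.com/openai/CLIP.git
from torchvision.transforms import Normalize

device = "cuda" if torch.cuda.is_available() else "cpu"
clip_model, _ = clip.load("ViT-B/32", device=device)
clip_norm = Normalize((0.48145466, 0.4578275, 0.40821073),
                      (0.26862954, 0.26130258, 0.27577711))
vqgan = load_vqgan("vqgan_imagenet_f16_16384").to(device)  # assumed helper

def expand_image(square_img, prompt, width=640, height=384, steps=300):
    """Place a square image (1x3xHxW, values in [0, 1]) on a wider canvas and
    fill the new side regions by optimizing VQGAN latents against a CLIP loss."""
    # 1. Paste the resized square image into the center of a wider, noisy canvas.
    canvas = torch.rand(1, 3, height, width, device=device)
    center = F.interpolate(square_img.to(device), size=(height, height))
    x0 = (width - height) // 2
    canvas[:, :, :, x0:x0 + height] = center

    # 2. Encode the canvas into VQGAN latent space and optimize the latents directly.
    z, *_ = vqgan.encode(canvas * 2 - 1)           # VQGAN expects inputs in [-1, 1]
    z = z.detach().clone().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=0.1)
    text_feat = clip_model.encode_text(clip.tokenize(prompt).to(device)).detach()

    for _ in range(steps):
        out = ((vqgan.decode(z) + 1) / 2).clamp(0, 1)
        # 3. CLIP loss: push the whole decoded image toward the text prompt.
        crop = clip_norm(F.interpolate(out, size=(224, 224)))
        img_feat = clip_model.encode_image(crop)
        clip_loss = -torch.cosine_similarity(img_feat, text_feat).mean()
        # 4. Reconstruction loss: keep the original center region faithful.
        keep_loss = F.mse_loss(out[:, :, :, x0:x0 + height], center)
        loss = clip_loss + 10.0 * keep_loss
        opt.zero_grad(); loss.backward(); opt.step()

    return ((vqgan.decode(z) + 1) / 2).clamp(0, 1)
```

The Colab notebook linked above contains the actual implementation; the sketch here only illustrates the general approach of keeping the known center fixed with a reconstruction term while letting CLIP steer what VQGAN paints into the newly exposed regions.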