
Creating Images Became Easy with Generative AI

February 25, 2023


In this blog, I will cover a few technical aspects of the generative AI used in image generation, along with some interesting tools you can use to generate images.

One of the most influential generative models is the Generative Adversarial Network (GAN). It was introduced in 2014 by Ian Goodfellow and his colleagues at the University of Montreal. A GAN is an unsupervised machine-learning technique that learns the patterns in its input data and generates new outputs that resemble them. It consists of two neural networks: a Generator and a Discriminator.

Look at the image below to understand it further.

Generator Block – It takes random noise as input and generates new images similar to the original images

Discriminator Block – It takes both original images from the dataset and generated images as input, then performs a binary classification into real (1) or fake (0)

Generative Adversarial Networks Architecture

The model is called adversarial because the two neural networks compete against each other. The generator produces fake images and the discriminator tries to tell them apart from the real ones. The generator succeeds when it produces an image that the discriminator classifies as “real”. The discriminator, in turn, keeps updating and learning so that it can catch the improved fakes. This iterative process makes the model more robust and versatile.
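The back-and-forth between the two blocks can be sketched in a few lines of plain Python. This is only a toy under stated assumptions, not a real image model: the "images" are single numbers drawn from a Gaussian with mean 2, the generator is just a learnable shift applied to noise, and the discriminator is a one-input logistic classifier. The names `train_gan`, `theta`, `w` and `c` are made up for this illustration, and the generator uses the common "non-saturating" loss (push the discriminator's output on fakes toward 1).

```python
import math
import random


def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))


def generator(z, theta):
    """Generator block: turns a noise value z into a fake sample."""
    return z + theta


def discriminator(x, w, c):
    """Discriminator block: estimated probability that x is real (1), not fake (0)."""
    return sigmoid(w * x + c)


def train_gan(real_mean=2.0, steps=3000, batch=64, lr=0.1, seed=0):
    rng = random.Random(seed)
    theta = 0.0       # generator parameter, starts away from the real data
    w, c = 0.0, 0.0   # discriminator parameters

    for _ in range(steps):
        real = [rng.gauss(real_mean, 1.0) for _ in range(batch)]
        fake = [generator(rng.gauss(0.0, 1.0), theta) for _ in range(batch)]

        # Discriminator step: binary cross-entropy with labels real=1, fake=0.
        # For a logistic classifier the gradient is (prediction - label) * input.
        grad_w = grad_c = 0.0
        for x in real:
            err = discriminator(x, w, c) - 1.0
            grad_w += err * x
            grad_c += err
        for x in fake:
            err = discriminator(x, w, c) - 0.0
            grad_w += err * x
            grad_c += err
        w -= lr * grad_w / batch
        c -= lr * grad_c / batch

        # Generator step: try to make the discriminator label fresh fakes as real.
        fake = [generator(rng.gauss(0.0, 1.0), theta) for _ in range(batch)]
        grad_theta = sum((discriminator(x, w, c) - 1.0) * w for x in fake) / batch
        theta -= lr * grad_theta

    return theta, w, c


if __name__ == "__main__":
    theta, w, c = train_gan()
    print("learned generator shift:", round(theta, 2))
    print("D(real-looking sample):", round(discriminator(2.0, w, c), 2))
```

With these settings the learned shift should drift toward the real mean, at which point the discriminator can no longer separate the two groups well. Real GANs replace both functions with deep neural networks and the numbers with images, but the alternating update loop has the same shape.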

Other generative models include variational autoencoders (VAEs) and diffusion models.

I concluded my last article with a tool recommendation for text-to-picture conversion. Now, let’s look at some more tools available for working with images:

  1. GoArt changes the style of an original image, for example by turning a photo into a painting. It is also used to create NFTs (Non-fungible Tokens), since the transformed images can serve as ready-to-use NFTs within just a few clicks.
  2. DeOldify colorizes black-and-white images and provides amazing results. It is built on GANs rather than diffusion models.
  3. ImageNet is an online database of images organized into many categories. Its data is free to use for non-commercial research and educational purposes.
  4. GauGAN is a model from NVIDIA that takes a semantic segmentation map (a rough, labeled sketch) as input and converts it into a realistic image. Its successor, GauGAN2, also converts text into images: for instance, you can type “mountains with a sunset” and it will generate different images.
  5. DALL-E is another tool, from OpenAI, for converting text into images. It uses a transformer language model that receives both the text and the image as a single stream of data.
  6. GLIDE is a more recent image generation model built by OpenAI that is based on the diffusion model.
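To make the "single stream of data" idea from the DALL-E entry more concrete, here is a minimal sketch of how caption tokens and image tokens could be joined into one sequence for a transformer. The vocabulary sizes and the helper names `build_stream` and `split_stream` are hypothetical, and the id-shifting scheme is an assumption chosen for illustration, not DALL-E's actual tokenizer.

```python
TEXT_VOCAB_SIZE = 1000   # hypothetical number of distinct text tokens
IMAGE_VOCAB_SIZE = 512   # hypothetical number of distinct image tokens


def build_stream(text_tokens, image_tokens):
    """Join caption tokens and image tokens into one sequence.

    Image ids are shifted past the text ids so the two kinds never collide.
    """
    shifted = [TEXT_VOCAB_SIZE + t for t in image_tokens]
    return list(text_tokens) + shifted


def split_stream(stream):
    """Recover the text part and the image part from a combined sequence."""
    text = [t for t in stream if t < TEXT_VOCAB_SIZE]
    image = [t - TEXT_VOCAB_SIZE for t in stream if t >= TEXT_VOCAB_SIZE]
    return text, image


if __name__ == "__main__":
    stream = build_stream([5, 42], [0, 7])
    print(stream)
    print(split_stream(stream))
```

A model trained to predict the next token of such a stream sees the caption first and then has to continue with image tokens, which is what lets it generate a picture from text.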

Several of the tools mentioned above have publicly available code or research papers, although not all of them are open source. If you are interested, you can dive into their GitHub repositories or Google Colab notebooks, where available, to understand the evolution of these tools in detail.

Are you ready to tell some new visual stories?




Unnati Kohli
Technical Storyteller

Unnati is currently pursuing the flagship PGDM program at Management Development Institute, Gurgaon (2022-24), and is a Strategy & Analytics Intern at Deloitte USI. Prior to this, she worked for 37 months as a Data Analyst at TCS in the Telecom Network domain. She has also worked as a Technical Storyteller with Bloggers Alliance.
