How to use GenAI for image generation the no-code way

Food safety and AI

Image recognition technology is revolutionizing industries from healthcare to retail. In food safety, it plays a crucial role in identifying spoiled or contaminated items, ensuring that what reaches our plate is safe to eat. However, training these models requires balanced, robust and diverse datasets, which are often unavailable in the agrifood sector due to its inherent seasonality and dependence on propitious weather conditions.

To work around this issue, data scientists have traditionally adopted two main approaches:

Image Augmentation Techniques: This includes image transformations such as flipping horizontally or vertically, random cropping, rotation augmentation, translation (shifting images left/right/up/down), noise injection, color distortion, and more. These methods are relatively easy and inexpensive to implement but are not well-suited for situations where you don't have images to begin with.
Synthetic Image Generation: This approach involves creating images from scratch, for example, using deep learning architectures like Generative Adversarial Networks (GANs). While this method can generate new images, it is much more complex to implement and requires substantial computational resources, long training times, and a large training dataset, making it a rather expensive process.
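
To make the first approach more concrete, here is a minimal sketch of a few of the transformations listed above. It is an illustration only: the image is a tiny grid of grayscale values stored as nested lists, and the helper names (flip_horizontal, crop, translate_right, add_noise) are our own. In a real project you would more likely use an image library.

```python
import random
from typing import List

# A tiny grayscale "image" stored as rows of pixel values (0..255).
# This representation is an assumption made to keep the sketch dependency-free.
Image = List[List[int]]


def flip_horizontal(img: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def flip_vertical(img: Image) -> Image:
    """Mirror the image top to bottom."""
    return [row[:] for row in img[::-1]]


def crop(img: Image, top: int, left: int, height: int, width: int) -> Image:
    """Cut out a height x width window starting at (top, left)."""
    return [row[left:left + width] for row in img[top:top + height]]


def translate_right(img: Image, shift: int, fill: int = 0) -> Image:
    """Shift pixels right by `shift` columns, padding the left edge with `fill`."""
    width = len(img[0])
    shift = min(shift, width)
    return [[fill] * shift + row[:width - shift] for row in img]


def add_noise(img: Image, amount: int, rng: random.Random) -> Image:
    """Add uniform integer noise in [-amount, amount], clipped to 0..255."""
    return [
        [max(0, min(255, pixel + rng.randint(-amount, amount))) for pixel in row]
        for row in img
    ]


image = [
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
]

# One original image becomes several training examples.
augmented = [
    flip_horizontal(image),
    flip_vertical(image),
    crop(image, 0, 1, 2, 2),
    translate_right(image, 1),
    add_noise(image, 5, random.Random(0)),
]
```

Every output here is derived from an existing image, which is exactly why augmentation does not help when there are no images to start from.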

Food safety and now GenAI

The recent rise of advanced, multimodal GenAI systems offers an easy, quick and effective way to cope with the shortage of image data. Concretely, this means prompting an image generation model, such as OpenAI’s DALL-E or Stability AI's Stable Diffusion, to create synthetic images that look as realistic as photographs.

Working with such models opens up a lot of opportunities. In just a few steps and with the right prompts, we can get highly customizable images for a relatively low price. These models allow users to generate images tailored to specific needs, reducing the time and cost associated with traditional methods like photoshoots or purchasing stock images.

In this article, you will learn how to leverage DALL-E in KNIME Analytics Platform for generating images of edible and inedible apples. The generated images could then be used to augment an existing dataset (or create a new dataset from scratch) and train or fine-tune an image classifier to enhance food safety. In our example, we’ll use KNIME's AI Extension.

The original DALL-E was a 12-billion parameter version of GPT-3, specifically designed for image generation from text descriptions. It used a transformer-based architecture to encode textual input and decode it into image data. Newer releases, such as DALL-E 3, build on the same text-to-image idea.

KNIME is a low-code data science tool with over 300 connectors to data sources and integrations with all popular machine learning libraries. It's open source, which means it's free to download and use right away.

Generate a dataset of inedible apples

To train an effective image classifier, we need images of both edible and inedible apples. Finding pictures of perfect apples is easy, but what about those that are spoiled or deformed?

Using DALL-E in KNIME, we can easily automate the generation of these images.

An overview of the workflow to automate the generation of images with DALL-E.

Let’s dive into the step-by-step process.

Step 1: Authenticate to OpenAI

First, we authenticate to the OpenAI service using an API key. You can obtain an API key for your account at https://openai.com/. Also make sure your account has credits for DALL-E.

In KNIME, we can authenticate to the service using the Credentials Configuration and the OpenAI Authenticator nodes. Type your API key in the “password” field of the Credentials Configuration node and pass the output credentials flow variable to the OpenAI Authenticator node.

The first step of the workflow where an OpenAI API key is provided to access the model.
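
For readers curious about what this step corresponds to outside KNIME, the sketch below shows one way a script might hold the same credential. It is not what the KNIME nodes do internally. It assumes the key is stored in an environment variable named OPENAI_API_KEY (the name the official OpenAI client libraries look for) and sent as a bearer token; the helper names load_api_key and auth_headers are hypothetical.

```python
import os


def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Read the API key from an environment variable instead of hard-coding it."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your OpenAI API key.")
    return key


def auth_headers(api_key: str) -> dict:
    """Build the HTTP headers that carry the key as a bearer token."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


# Placeholder value for illustration only; never commit a real key to a script.
example_headers = auth_headers("sk-example-not-a-real-key")
```

As with the “password” field in KNIME, the point is to keep the key out of anything you might share, such as the workflow or script itself.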

Step 2: Create a loop

Next, we use the Counting Loop Start node to set the number of images we want to generate. For this example, we want three images of inedible apples, so we set the counter to three.

The configuration window of the Counting Loop Start node. Here, the number of loops is set to 3.

Step 3: Prompt DALL-E

Now comes the core element of the workflow: the OpenAI DALL-E View node.

This node features a user-friendly interface where the prompt is written directly within the node, unlike other nodes in the extension. It allows for comprehensive customization options, including image size, quality, and style adjustments. For pricing information, please see OpenAI's pricing page.

The node offers a convenient preview of the generated image next to the configuration options and features an image output port.

The configuration window of the OpenAI DALL-E View node.

Finding the right prompt for DALL-E to generate what we need is the most creative and exciting part. There are several best practices for prompt engineering, as well as a few common pitfalls to avoid to ensure good outputs. In general, the golden rule is to provide instructions in a clear, specific and descriptive way, avoiding redundancies and vagueness.

For our example, the prompt we used is the following: “Generate the image of a real apple that is heavily bruised and deformed. The apple is not edible.”

Additionally, we set the image quality to “standard” and the style to “natural” to ensure a more realistic output.
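
If you ever want to reproduce these settings in code, the sketch below shows roughly how the same prompt, quality, and style could be sent to OpenAI's image generation endpoint. Treat it as an illustration under assumptions rather than a description of the KNIME node: the model name "dall-e-3" and the 1024x1024 size are our own choices (the article does not state them), and the function names are hypothetical.

```python
import json
import urllib.request

IMAGES_URL = "https://api.openai.com/v1/images/generations"

PROMPT = (
    "Generate the image of a real apple that is heavily bruised and deformed. "
    "The apple is not edible."
)


def build_image_request(prompt, size="1024x1024", quality="standard", style="natural"):
    """Assemble the JSON body for one image generation request."""
    return {
        "model": "dall-e-3",  # assumption: the article does not name the model version
        "prompt": prompt,
        "n": 1,  # one image per request; repeat the request to get more
        "size": size,
        "quality": quality,
        "style": style,
        "response_format": "b64_json",  # ask for the image bytes, base64-encoded
    }


def generate_image(api_key, payload, timeout=120):
    """Send the request and return the base64-encoded image (needs network and credits)."""
    request = urllib.request.Request(
        IMAGES_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["data"][0]["b64_json"]


# Building the payload is free; only generate_image() contacts the service.
payload = build_image_request(PROMPT)
```

Keeping payload construction separate from the network call makes it easy to review the prompt and settings before spending any credits.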

Step 4: Collect images

To organize the generated images into a table, we use the Image to Table node. This node ensures each image is neatly placed in a table row, ready for analysis.

Step 5: End the loop and save the table

The Loop End node concludes the process after generating the desired number of images. In KNIME, looping control nodes are typically light blue, making them easy to identify. Lastly, we can save the generated images using the Image Writer (Table) node.

The output table created after three iterations of the same prompt.
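
Steps 2, 4 and 5 together amount to a simple pattern: repeat the generation a fixed number of times, collect one row per image, and write the images to disk. The sketch below mirrors that pattern in a few lines. It is only an analogy for the KNIME loop, not its implementation: fake_generator is a stand-in that returns placeholder bytes instead of calling DALL-E, and the function names, column names, and file prefix are hypothetical.

```python
import tempfile
from pathlib import Path


def collect_images(n_images, generate_fn):
    """Call generate_fn once per iteration and keep one table row per image."""
    rows = []
    for iteration in range(n_images):
        rows.append({"iteration": iteration, "image": generate_fn()})
    return rows


def save_images(rows, out_dir, prefix="inedible_apple"):
    """Write each image to its own file and return the file paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for row in rows:
        path = out_dir / f"{prefix}_{row['iteration']}.png"
        path.write_bytes(row["image"])
        paths.append(path)
    return paths


def fake_generator():
    """Stand-in for the image model: returns placeholder bytes, not a real picture."""
    return b"placeholder bytes standing in for a generated PNG"


# Three iterations, as in the Counting Loop Start configuration above.
rows = collect_images(3, fake_generator)
paths = save_images(rows, tempfile.mkdtemp())
```

Swapping fake_generator for a function that actually requests an image would turn this into a working generator, at the cost of one paid request per iteration.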

The power of no-code GenAI for image generation

With these simple steps, we can easily automate the creation or expansion of datasets necessary for food safety applications by combining the capabilities of DALL-E with the user-friendly KNIME Analytics Platform. Best of all, we can do this without extensive resources or technical expertise.

The generated dataset can then be used to train image classifiers that ensure food safety by identifying and filtering out spoiled apples before they reach consumers. The ability to generate such specific images on demand can significantly enhance the robustness of your models and open up new avenues for improving food safety standards.

Note that when it comes to GenAI, not all that glitters is gold, and caution is necessary. Working with image generation may involve potential copyright infringement, the possibility of unintentionally creating inappropriate content, risks of image distortions, reliance on third-party services, and susceptibility to disruptions in API-based interactions.