KNIME logo
Contact usDownload
Read time: 6 min

How to create and send daily news digest with GenAI and KNIME

Leverage KNIME and its AI extension to compile a concise press review and automate distribution

January 30, 2025
ML 201 & AI
Summarize world news blog header
Stacked TrianglesPanel BG

Keeping up with world news can feel overwhelming, whether you’re a finance professional tracking market trends or simply someone wanting a quick grasp of current events. This tutorial shows you how to build an AI-powered news summarizer that automatically curates concise, relevant updates from trusted sources. The workflow also delivers the summaries straight to your inbox.

Learn how to save time and stay informed by automating the generation of personalized news digests using Generative AI and KNIME Analytics Platform. KNIME is a free and open-source data science tool that lets you build applications using visual workflows—no extensive coding required.

This blog series on Summarize with GenAI showcases a collection of KNIME workflows designed to access data from various sources (e.g., Box, Zendesk, Jira, Google Drive, etc.) and deliver concise, actionable summaries.

Here is a 1-minute video that gives you a quick overview of the workflow. You can download the example workflow here, to follow along as we go through the tutorial.

Let's dive right in.

Automate news summarization and distribution

Our goal is to automate the creation and distribution of a daily news digest. Thereby, solving the challenges of retrieving and summarizing diverse and lengthy content from multiple sources.

We can do this in three steps:

  1. Access data by scraping world news from trusted public broadcasters and filtering the top five stories per outlet
  2. Prompt an LLM and use GenAI to create clear, concise summaries
  3. Deploy results by compiling these summaries into a structured report and automate the email distribution

This provides an efficient and personalized way to stay updated-to-date on global events.

The workflow that creates and sends a press review per email.
The workflow that creates and sends a press review per email.

Step 1. Access data: Scrape world news from PBS and BBC

The section of the workflow that scrapes world news from each broadcaster.
The section of the workflow that scrapes world news from each broadcaster.

In our example, we pick the American PBS and British BBC as our broadcasters. Using KNIME, we build two custom components to scrape data from their websites.

We use the Webpage Retriever node to fetch articles from the PBS World News page. The node outputs an XML file, a structured format that organizes webpage data in a hierarchical layout, making it easier to identify and extract specific elements.

To parse the XML file, we use the XPath node, which extracts key elements such as the article’s URL, body text, title, and publication date. From these, we select only the top five trending articles as displayed on the broadcaster’s website.

A similar approach is applied to scrape the world news from the BBC World News website, resulting in a structured table containing similar information for this broadcaster. Next, we use the Concatenate node to merge the tables from both sources, creating a single dataset of ten articles ready for summarization.

The resulting table with the scraped world news.
The resulting table with the scraped world news.

Step 2. Prompt LLM: Summarize news with OpenAI’s GPT-3.5-turbo

The section of the workflow that uses GenAI to get news summaries.
The section of the workflow that uses GenAI to get news summaries.

To summarize the scrapped news, we use the KNIME AI extension and select the most suitable LLM for the task, balancing both cost and performance. For example, OpenAI’s GPT-3.5-turbo is a good option, but other alternatives are also possible, including open-source, local models. 

To set up the connection, follow these steps: 

  1. Input the OpenAI API key in the Credentials Configuration node 
  2. Authenticate to the service with the OpenAI Authenticator node
  3. Use the OpenAI Chat Model Connector node to connect to the GPT-3.5 model

Once the connection to the model is established, we can begin engineering a prompt using the Expression node.

When crafting prompts, it’s often helpful to assign a persona to the LLM. This involves explicitly instructing the model on how to behave. In our case, we request the LLM to act as a helpful assistant with knowledge of world news. We also set a maximum response length of 100 tokens to ensure the summaries remain concise.

To provide the model with the necessary context, we automatically reference the columns containing the articles’ title and body within the prompt.

Keep in mind that prompt engineering is an iterative process and you may need to refine the prompt several times to get the best results. Here’s the prompt we used:

join("\n\n", “You're a helpful assistant with knowledge of world news. Summarize the following texts as a concise paragraph of max 100 words. Make sure each summary includes info about: what, when, who, where, why, and how: ”, $["Title"], $["Body"])

The LLM Prompter node queries the LLM to generate summaries that are both concise and comprehensive, providing us a clear overview of each news article without losing essential information.

Step 3: Deploy results: Create a press review report and send it per email

To format the news and compile them into a static report, we use the KNIME Reporting extension along with the component’s composite view.

Within the “News Summaries” component, we post-process the LLM responses and design the layout to combine the summarized news articles with an introductory heading, ensuring the press review report is both engaging and well-structured. The component also uses the Report Template Creator node, which allows us to define the page size and orientation of the final report. 

To automate the distribution of the report, we use KNIME Email Processing extension, which allows us to connect to email servers, manage email movements and send emails. In this example, we simulate sending the report. To do that, we first build a “Fake Email Inbox Setup” component which generates a test inbox with fictitious emails.

Next, we connect to the chosen email provider using the Email Connector (Labs) node. It requires proper authentication credentials for secure access, and configuration for incoming/outgoing emails.

Finally, the Email Sender (Labs) completes the process by delivering the report to recipients. The node also allows us to add a subject line and format the email content in rich text. The generated report is automatically attached without manual intervention, thanks to the second blue input port, which dynamically attaches the file.

In this section of the workflow, the report is formatted and sent per email.
In this section of the workflow, the report is formatted and sent per email.

The result: The hottest world news delivered every morning

The world news summaries are compiled into a clear, well-structured report, providing an efficient way to review the top highlights. For each broadcaster, the press review report displays the news title, date and time, a concise summary and hyperlink to the full article for further reading.

The press review report compiled by the workflow.
The press review report compiled by the workflow. By scrolling down, we can see all ten news articles summarized and divided by broadcaster.

Optionally, we can enhance the appearance of the news in the report by adding a black frame around the articles. To do this, navigate to the component’s "Advanced Composite View Layout” tab and add the CSS style command on the “content” element:  

 “additionalStyles”: [“border: thin solid black;”]

KNIME workflow automation is available through a KNIME Team plan on the KNIME Community Hub. You can automate your workflows by scheduling it to run at specific times or intervals. The Team plan offers flexible, usage-based pricing with no long-term commitment, allowing users to scale usage as needed and avoid the cost of full-time licenses.

For example, every morning at 8:00 AM, so we can enjoy our morning coffee and stay informed about global events.

GenAI for summarization in KNIME

In this article from the Summarize with GenAI series, we explored how you can use KNIME and GenAI to automate the summarization of world news, staying informed and saving time by receiving concise updates directly in your inbox. 

You learned how to:

  • Scrape and parse news articles from different news outlets
  • Use the KNIME AI extension to summarize news articles
  • Compile summaries into a report and distribute it via email