Read time: 15 min

Three KNIME veterans share expertise at a DataHop

On automation, low-code, and the future of AI

May 13, 2024

How far can you go with low-code, what does it mean to do more with less, are you a dreamer, a tinkerer, or a data science maker? Have you seen Terminator and what does that have to do with AI anyway? Get three KNIME veterans in the same room at a KNIME DataHop and you'll find out.

KNIME DataHops are our new roadshow event series where we come to your town to bring together data science peers and experts, discuss best practices for working with data, hear customer use cases, learn about industry trends in data science, analytics, and GenAI, and connect. (Next stops are London on June 27 and Frankfurt on July 25.)

At the KNIME DataHop in Cincinnati, Kyle Thompson, scientific content engineer at CAS, Rachel Ambler, director of database and business intelligence systems at Pure Romance, and Evan Bristow, senior principal analyst at Genesys, took to the stage for a panel discussion. They discussed strategies for winning skeptics over to data science with the transformative power of low-code solutions, what adopting an open-source tool meant for their careers, and the future of data science and AI.

Watch the video or read the write-up below.

Rosaria: Let’s start from the beginning. From the title. The title of this panel is “Do more with less”. What does “Do more with less” mean to you, in your profession? And what does this have to do with adopting KNIME as your tool of choice?

Kyle: About five years ago, I encountered KNIME for the first time when my company asked me to perform some data analytics and suggested KNIME because of its versatility. I had no background in data science, but now I’m at the point where my boss can say “Hey, can we do this?” and I jump into KNIME and figure out some way to accomplish it, no matter what the task is.

Rachel: We started off just with SQL Server, so everything lived in SQL and that was easy to pull into somewhere else, mainly Tableau. Once we started growing, we were pulling data in from different data sources, like Shopify and CRMs, and we were using REST API endpoints to get the data straight into SQL or PostgreSQL. The amount of data increased and our ability to store it inside any data warehouse where it was usable by the company was getting harder and harder. We basically ended up becoming a bottleneck for the analysts.

Getting a tool like KNIME made all the difference because we were no longer the people responsible for slowing the company down. We have enabled the other side of the organization to pivot a lot faster. Thanks to KNIME, my job has also changed. I no longer think of how to support the business in data logic, but how I support the business in getting access to the data that is needed by the business so that an analyst can actually process it with KNIME.

Evan: KNIME allows you to “do more with less” because it provides repeatability, extensibility and replicability.

  • Repeatability: It may take more initial investment upfront, but if I do a task today and put that task into a workflow, then I save hundreds of man hours on doing that task. 
  • Extensibility: You can take a workflow that has already been built and just extend that to the new question at hand. One of the things that's really great about KNIME is that you can take something that you made for one purpose, chop a workflow off about halfway, and then build off a new workflow that answers yet another question.
  • Replicability: With KNIME, I can actually look at what I've made or look at what someone else has made, and use that as an analogy for my dataset. I can't remember exactly how to do everything all the time, so I go to the KNIME Community Hub and look at a workflow that somebody has already made and use that as inspiration, replicating what they've done in that workflow, using my own data. 

Rosaria: Data science is often seen as non-creative work, as a single area of specialization. But this is not true; there are many nuances in our job. Tell us more about your job and how you see yourself in it: a dreamer, a maker, an organizer, or something else. And how has that evolved over your years working in data science?

Rachel: I’m definitely a dreamer and I have been one my entire life. I fell into computers in the first place because I took a leap. I wanted to earn money and had no idea what I was going to do, and computers won me over. All my life, I've been dreaming about different ways to do things, without a clear idea of how I was going to achieve that goal.

One dream I had was connecting my colleagues to the data they needed without me having to do all the work. KNIME was the huge puzzle piece that I didn't know was missing to make my dream come true. Now the people within the company, who are far better at understanding data than I am, can smoothly access it. 

Kyle: I would consider myself a tinkerer. I like to work through things, I enjoy troubleshooting and, at this point, I have built up my own set of tools shaped by my “trial and error” approach. I try something out for the first time, I face initial failures, learn from them and eventually I get something to work. I'm getting to a point in my KNIME journey where I can start sharing my knowledge with other people because I've taken the time to figure things out by myself.

Evan: Because of what I do, I think I’m a bit of all three. Our group sits in the organization below the senior-level sales team and we might receive a question from them asking “how long does it take for a deal to close?”. To answer such a question, you need to interpret it and to understand what the bottom line really is. Nobody really cares how long it takes for a deal to close, so you need to infer what their motivation behind that is. Do they want to get rid of deals in the pipeline that are taking too long? Do they want to estimate when deals are actually gonna land? 

Once you've got that well defined and you have your data repositories, that’s where KNIME comes into action. It allows you to take data from different sources and manipulate it to answer the question at hand, giving you a path towards providing data-driven feedback within your organization.

Rosaria: You are all working with data as data analysts, data engineers or data scientists. People often have no idea of the fantastic solutions you can create based on data. Some data analysts also do not know how sophisticated and flexible a low code tool like KNIME can be. So, tell us about the project in your career that you have been most proud of, your biggest success. Show us how far you went with a low-code tool like KNIME.

Evan: At Genesys, we wanted to figure out how many of the top 1000 global companies were our customers and to do so, we used the Geospatial Analytics Extension for KNIME. I found the list of the top 1000 companies online, and I used the Webpage Retriever node to crawl the web pages containing company information, such as addresses and names. Next, I used the geospatial nodes to geocode those locations. I then looked through our database and compiled a list of both account names and addresses of our customers. I proceeded with fuzzy matching between the name and address in the top 1000 list and our database. As a further step, I also did geocoding on our addresses. The result was a list containing the possible candidates for companies that were originally in that top 1000 list, similar both in terms of the string matching and geographical position to the ones in our database. That was a really neat tool and exercise to do.
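Evan's pipeline ran entirely on KNIME nodes (Webpage Retriever, geospatial geocoding, fuzzy matching). As a rough illustration of just the fuzzy-matching step outside KNIME, here is a minimal Python sketch using the standard library's `difflib`; the company names, addresses, and threshold are invented for the example:

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] between two normalized strings."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()

def fuzzy_match(top_list, customers, threshold=0.75):
    """Pair each (name, address) in top_list with its best customer match.

    Averages name and address similarity; keeps pairs above the threshold.
    """
    matches = []
    for t_name, t_addr in top_list:
        best = max(
            customers,
            key=lambda c: similarity(t_name, c[0]) + similarity(t_addr, c[1]),
        )
        score = (similarity(t_name, best[0]) + similarity(t_addr, best[1])) / 2
        if score >= threshold:
            matches.append((t_name, best[0], round(score, 2)))
    return matches

# Hypothetical data standing in for the "top 1000" list and the CRM export.
top_1000 = [("Acme Corporation", "1 Main St, Springfield"),
            ("Globex Inc", "42 Ocean Ave, Shelbyville")]
crm = [("ACME Corp.", "1 Main Street, Springfield"),
       ("Initech LLC", "99 Office Park, Austin")]
print(fuzzy_match(top_1000, crm))
```

A production version would also geocode both address lists, as Evan describes, so that geographic distance can confirm or reject borderline string matches.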

Rachel: We have two different Tableau servers, one for internal and one for external reporting. If you have been using the standard KNIME integration for publishing to Tableau, you might agree it's not exactly the swiftest thing on the block. When you've got two Tableau servers and you're publishing the same datasets to them, the amount of time to publish on two different servers is enormous. We decided to go back to the drawing board and found a Python library that handles publishing to Tableau a lot better. It understands the Tableau API, so we are now able to publish to two servers using the Python Script node faster than we could publish to one server using the existing Tableau publisher. Now we are able to run these loads once an hour as opposed to once every three or four hours.

Kyle: The thing I’m most proud of is creating a usable SQL database with KNIME, with only a year of experience working with data. We had a couple hundred structure-data files (SDFs), a file format that encodes molecule properties in machine-readable form, from a database we had purchased, and they had remained untouched for two years because no one had structured them. Some customers were interested in that kind of data, so I used KNIME to build, query, and analyze that database, without any knowledge of SQL.
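The SDF format separates records with `$$$$` lines and stores each property under a `> <PropertyName>` header line. Kyle built his pipeline in KNIME, but as a hedged sketch of the same parse-and-load idea, here is a minimal Python version using only `sqlite3` (the sample record, property names, and table schema are invented):

```python
import sqlite3

def parse_sdf(text):
    """Yield one property dict per SDF record ('$$$$' separates records)."""
    for record in text.split("$$$$"):
        props, key = {}, None
        for line in record.splitlines():
            if line.startswith("> <") and line.rstrip().endswith(">"):
                key = line.strip()[3:-1]        # '> <Name>' -> 'Name'
            elif key is not None and line.strip():
                props[key] = line.strip()       # value line after a header
                key = None
        if props:
            yield props

# Hypothetical single-record SDF; the molfile block is irrelevant here.
sample = """\
  -OEChem-
...molfile block omitted...
> <Name>
Caffeine

> <Formula>
C8H10N4O2

$$$$
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE molecules (name TEXT, formula TEXT)")
for rec in parse_sdf(sample):
    conn.execute("INSERT INTO molecules VALUES (?, ?)",
                 (rec.get("Name"), rec.get("Formula")))
print(conn.execute("SELECT name, formula FROM molecules").fetchall())
```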

Rosaria: You have all been using KNIME for a long time. It must be because you found an advantage. How did KNIME help your career? Let’s start with Kyle. I heard you recently got a promotion, is that right?

Kyle: Yes indeed, I got a promotion and moved to a different department. Since I had been helping out various people with troubleshooting their KNIME workflows, I jokingly told my current boss that if she ever found some extra money in her budget, I could just do that full time. Long story short, I’m now working with KNIME almost full time, offering technical support to colleagues. 

Rachel: We inherited our data warehouse eight years ago and it is still an absolute mess due to technical debt. It can only be described as a multidimensional nightmare, and making changes to it is painful in the extreme, since a huge amount of business logic is embedded within those tables. Since we adopted KNIME, we are still running the data warehouse with the same data, but now the analysts are able to pick out the bits they need. If they want to start pulling in data from Shopify or Google Analytics, they make their own calls to Google Analytics straight from within KNIME. It's now reaching the point where we can finally redesign the data warehouse from scratch to make it fit for purpose, which hopefully will give us the next level of speed.

Evan: KNIME was important for my career because, as I mentioned before, I’m being thrown questions with very vague requirements and being able to provide an answer that is backed by data is central to me. A second aspect that is very relevant for me is that KNIME allows for rapid prototyping. I can use it to rapidly prototype an SQL query that I will end up building and standardizing in a view in Snowflake. For each step in the query, I can see the intermediate results. I might start off by making a series of connections, pulling in the data, and seeing exactly how I want to do it. That just helps me get my mind wrapped around what I want when I actually stop and go back and write the SQL view in Snowflake. If you're using it for industrial analysis, you don't have to do that, but if you're using it to create a curated resource for someone else, that really helps. It facilitates a lot of things that don't necessarily end up making it into production, but it can influence the things that end up going into production.

Rosaria: You have a long experience in the data field. Not only technical experience, but also organizational and communication experience. Do you have a word of advice on how to convince colleagues on the use of a low-code tool like KNIME?

Evan: The ability to convince someone to adopt KNIME really depends on their role in the company. I found that folks in Operations tend to be more receptive because you're saving them time and effort. If you show them how easily they can pull something out of Salesforce and blend it with other data they have in Excel, they're much more willing to give the new tool a try. They might know Excel but they are not experts, unlike people in finance, who live and breathe Excel every day and are definitely harder to win over.

Another way to get people on board is showing them how easy it is for them to repurpose a workflow you’ve created for their analysis. Let’s say there is an analytics team that focuses on Territory and Quota. I can cut off the part that is specific to me and then share it with them. They will still need to authenticate to the SQL server themselves, but they start with a leg up since the basic structure of what they need to build off of is already there.

Rachel: KNIME can be downloaded for free so you can hit the ground running without the permission of the financial department. Also, it's Java-based so, technically speaking, it can be installed without admin permissions, and offers a wide array of nodes that can tackle any data problem. If businesses decide that they want to invest in this tool, they can start off by accessing already-made workflows on the KNIME Community Hub, adjust them to their specific tasks, and eventually productionize them with ease on the commercial KNIME product.

Kyle: In the department I was working in before, the focus was on getting the job done, so we had total freedom in the tools we could use. That was the perfect playground to experiment with a tool like KNIME. I find that it's really helpful for people who are still beginners to be presented with a use case. We talked about how KNIME is a great tool, but if I cannot use it right away for a specific purpose, I’ll probably forget about it. Once I know I can solve even a minor problem with it, I’m much more likely to adopt it.

Rosaria: The world of data science is constantly evolving. Every few years or even every few months there is a new disruptive improvement in the field. The talk of the day is Large Language Models and AI technology. Are you using or planning to use the new KNIME nodes for AI? If yes, to do what?

Kyle: Currently, we are not. My company, though, has 125+ years of structured chemical data and that's ripe for AI use, so whenever we have a use case, we’ll absolutely try them out.

Rachel: Not at the moment. AI is a vast field, but for a lot of companies, when they say they are interested in AI, they are usually looking at performing statistical analysis. The latter, no matter how useful, is hardly going to move the needle for complex problems. My idea is to start looking into how to perform sentiment analysis using AI in KNIME. For example, we have around fifty thousand consultants around the world who may raise tickets in the CRM, so it would be nice to do some sentiment analysis on those tickets to sense whether negative sentiment is building, and the same approach could be replicated for our social media.
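As a toy illustration of the ticket-triage idea Rachel describes (not a KNIME workflow, and far simpler than a real trained model), a lexicon-based scorer can flag tickets with more negative than positive keywords; the word lists and sample tickets here are invented:

```python
# Toy lexicon-based sentiment scoring; a real pipeline would use a trained
# model, e.g. via KNIME's Text Processing or AI extensions.
NEGATIVE = {"broken", "late", "angry", "refund", "terrible", "never"}
POSITIVE = {"great", "thanks", "love", "fast", "helpful", "resolved"}

def score(ticket: str) -> int:
    """Positive minus negative keyword hits; < 0 flags a ticket for review."""
    words = {w.strip(".,!?").lower() for w in ticket.split()}
    return len(words & POSITIVE) - len(words & NEGATIVE)

tickets = [
    "My order arrived broken and support never answered, I want a refund!",
    "Great service, the issue was resolved fast. Thanks!",
]
flagged = [t for t in tickets if score(t) < 0]
print(flagged)
```

At the scale of fifty thousand consultants, the value is not the scoring of any one ticket but the trend line: a rising share of flagged tickets is the signal worth surfacing.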

Evan: Has anybody seen Terminator? This is a slippery slope. I always have this philosophical discussion with people in the analytics area about the influence of AI. On the one hand, something like the KNIME AI Assistant can be a great facilitator, especially for someone who is just getting started and needs help building workflows. On the other hand, AI can perform tasks but it doesn't necessarily understand them. You always need somebody who understands the business to be able to build and create those curated datasets. Never forget that AI works because it has examples, and if the underlying features fundamentally change, you will need someone in the driver's seat to test those assumptions and make the necessary adjustments. One example could be how to deal with old data: shall I incorporate data on products that we no longer sell into the model? Or am I just adding noise? These are questions that AI cannot answer.

Rosaria: Here’s my last question, what is the most useful KNIME feature?

Evan: Copy-pasting nodes to reuse parts of my workflows or to add testing branches. I even made some suggestions in the forum on how to improve it.

Kyle: The visual nature of workflow building. Seeing how things are connected, how flow variables parameterize operations, or inspecting exactly which node returns an error are all features that really make troubleshooting so much easier.

Rachel: The biggest things for me are accessibility and scalability across the team. I don't have to take care of all the work myself now. Other people can get easily onboarded, use the tool for their analysis independently, and maintain solutions smoothly.

More about the speakers

Kyle Thompson is a Scientific Content Engineer at CAS, a Division of the American Chemical Society in Columbus, Ohio. As a trained biochemist, he has spent 11 years at CAS building databases, creating custom solutions for customers, and performing data analysis. He has been using KNIME since 2019, and in that time he has become KNIME L1-L3 Certified, was part of the inaugural Train the Trainer program, and trained coworkers to swiftly integrate the program into their daily workflows.

Rachel Ambler is well known to the KNIME community, having presented in several panels and webinars in the past. Over her 30-year career she has worked in multiple verticals, from freight forwarding, the British Royal Air Force and Army, retail, and warehousing to online banking, which has taken her across the globe, including the UK, Europe, South Africa, Asia, and currently the US. She is the Director of Database and Business Intelligence Systems at Pure Romance in Cincinnati, Ohio, where she helps drive the company forward with obtaining, sanitizing, and presenting data using technologies as diverse as PostgreSQL, MS SQL Server, multiple Kubernetes clusters, .NET API servers, Tableau, Power BI, and of course KNIME, allowing Pure Romance to do more with less and providing faster, better methods of data delivery and visualization.

Evan Bristow is a Senior Principal Analyst at Genesys in Indianapolis, Indiana, with both experience and education in leveraging analytic tools to identify key business drivers that aid in management decision making. Over the years, he has applied his expertise and technical skills to conduct competitive market analyses, provide methodological support for business-to-business customer relationships, and produce recurring and ad-hoc operations and financial reporting for organizations in the AMER region. His activity and constant dedication as the main supporter of the KNIME community group on Facebook earned him the title of KNIME Contributor of the Month in January 2021.