At the DataHop in New York, where data meets strategy, three data science experts who work in public utilities, HR, and marketing were invited to discuss what the last mile in data science means to them. Meet the experts, then watch or read the discussion to find out their take on open source data science, the hardest data science challenge they had to solve, and how they solved it.
Charles Hersrud is a data scientist at Snohomish County PUD. He wears many hats in his current job: driving organizational change, designing the pipeline architecture, and integrating the right level of governance. His domain niche is in the utility industry for power scheduling and outage management systems.
Dawn Phipps celebrated her 36th Service Anniversary with Gore in October 2023. The breadth of her experience in technology at Gore is impressive. Today she works in Data & Analytics for Business Execution and provides insights for the HR department.
Michael Richter is a Business Intelligence professional with 20+ years in automotive and finance who excels in turning data into actionable strategies. Currently Director of Business Intelligence at Hennessy Automobile Companies, he is recognized for implementing cutting-edge BI solutions in collaboration with KNIME and enjoys sharing his insights in speaking engagements and articles.
Rosaria: What does “The last mile” mean for you, in your profession? And what does this have to do with adopting KNIME as your tool of choice?
Dawn: When data consumers get to me, they are usually struggling, looking for a solution to try to answer their data question. They may have multiple sources of data that they're trying to blend together and they might have already spent several hours getting to the point where they're at. When we talk about the last mile, it means getting the technology in place so that the user is provided with what they need.
Michael: For me, the last mile is getting all of this data and actually providing actionable insights. To explain this idea, I will give a quick example of what happened at Hennessy Automotive Group, the company where I currently work.
The Marketing team came to me and told me that they saw people come into the store, wanting to get a service but then leaving without getting the service performed. We call this “declined service”. Those are the least expensive customers to get back, and since Marketing cannot call up or send a flier to everyone, they wanted to concentrate their efforts on just the declined service customers. I was able to provide them with a list of customers and the declined services that have the highest conversion rate. The result? Marketing has seen 30% in return services, an astonishing accomplishment. To me, the last mile is a result that people don't argue with.
Charles: For me, the last mile is the gap between a proof of concept and actually bringing something to production. You start putting data together in a vacuum, you develop a model, and you prove that there's value. Once you start productionizing, you have to make sure that all of your ducks are in a row in terms of metadata and lineage around the data that's feeding the model.
Rosaria: I’m interested in your professional life. Tell us about your roles and what you do.
Dawn: I've been at Gore for 36 years and in IT since 2020. I started working with HR reporting and data analytics in 2014. Fast forward, we've organized into a Data Analytics team spread out all around the globe. I use KNIME almost every day to move data around, land it in different locations, and figure out how to connect it and blend it with other sources.
Michael: I am the Director of Business Intelligence at Hennessy Automotive Group. I had my own software company in the past, and before that I built rockets, so when I say that something isn't rocket science, I really mean it because I used to do that.
Hennessy had a vision to start using and collecting data, since they had data from multiple sources and providers. They wanted me to migrate all that data from Excel sheets into a database, so that it would be put into a BI system for reporting for user consumption. Before KNIME, we would have 20 to 30 people going into a source system and running the exact same report every single day, a colossal waste of time. We now use KNIME to automate and send emails, update databases, and to make sure that the data is right. We have an articulated monitoring system that checks not only whether workflows executed correctly, but also whether we got the data we were expecting to get.
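The idea of checking "whether we got the data we were expecting to get" can be illustrated outside KNIME. What follows is a minimal Python sketch, not Hennessy's actual monitoring system: the function name `check_load`, the column names, and the row threshold are all hypothetical.

```python
def check_load(rows, required_columns, min_rows):
    """Return a list of problems found in a load; an empty list means it looks as expected.

    Hypothetical helper: the checks and thresholds are illustrative assumptions.
    """
    problems = []
    # A workflow can finish without errors and still deliver too little data.
    if len(rows) < min_rows:
        problems.append(f"expected at least {min_rows} rows, got {len(rows)}")
    # Count rows where a required column is missing or empty.
    for col in required_columns:
        missing = sum(1 for row in rows if row.get(col) in (None, ""))
        if missing:
            problems.append(f"column '{col}' is empty in {missing} row(s)")
    return problems


# Illustrative data only: one row arrives where at least two were expected,
# and the 'store' value is blank.
sample = [{"vin": "X123", "store": ""}]
issues = check_load(sample, required_columns=["vin", "store"], min_rows=2)
for issue in issues:
    print("ALERT:", issue)
```

A check like this would run after the load itself succeeds, so that a "green" execution with missing data still raises an alert.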
Charles: At Snohomish County PUD, I work in a shared services group that provides BI analytics, data science and data engineering solutions. We manage the entire distribution grid for our area. We brought KNIME on board to fill a gap: we needed a tool that could easily productionize data science workloads. There are some other data scientists sprinkled throughout the company, and KNIME gave us a common platform where we could work together and elaborate on ideas.
Rosaria: Data science, and especially data engineering, is often seen as non-creative work. But that is not true. There are many nuances to our job. Tell us more about your job and how you see yourself in it: a dreamer, a maker, an organizer, or something completely different?
Charles: I would like to think of myself as a dreamer. First of all, I was the one who brought KNIME to our organization. Second, in my daily job I always ask myself the question “can this be done in a better way?”. Because of that, I think I’m a dreamer.
Michael: I’m also a dreamer. Thankfully, the BI team at my company has gotten bigger so now we have a Power BI dashboard developer and a KNIME developer. They are responsible for the day-to-day smooth workflow execution and reporting. Now my main task is interacting with the stakeholders to understand their vision and how to conceptualize it. That’s why I consider myself a dreamer. However, I do like to get into the daily dashboards to see what is happening and what the trends are. If I lost the connection to the data, my interactions with stakeholders wouldn’t make sense anymore.
Dawn: I think I'm a doer or a crafter since I like to make things and see the outcome of what I'm making. I do a lot of testing with the data, ensuring that it's accurate. KNIME just gives us that flexibility to be able to do that really fast, testing and looking at the data to find whether we have a data quality issue and how we can solve it. My job touches very different areas like HR and IT, but also Business Analysis. What it all comes down to is understanding how the data got to a specific point, whether it is the system that drives that process or whether it is something internally that we customized to get that data there. In other words, having a clear idea of the business process.
Rosaria: What is the biggest challenge you’ve had to overcome – or are still trying to overcome – and how did you do it?
Michael: For me, the biggest challenge was converting the organization to a data culture. At first it was hard and took a big time investment. We all know that data is great, but if it's not right, it's not very useful, and it can actually be catastrophic.
I understood that my effort to shape a data-driven organization had paid off when I recently went on vacation for three weeks and because of some internal IT issue, the KNIME Server was down. My phone blew up because many colleagues were asking me where their report was.
Dawn: We needed to do some data integration work, and since I work closely with the HR department I have access to people data. The team that was working on this project estimated it would take around two months. Thanks to KNIME, I was able to cut the time down to just a week's worth of work. I started working with the company to understand where the data was landing in the cloud, the endpoint this data was going to be living on, and the frequency at which I needed to load it there. I had the data engineer on our side looking at the data to make sure it was what they wanted. In the end, everyone was very impressed.
Charles: We have recently migrated to the cloud. One of the challenges was respecting traditional job roles, since there were developers doing things with infrastructure as code that were traditionally in the domain of a security architect or a systems admin. You had to make sure that everyone was doing things properly and following good patterns and a good security architecture. Another challenge arose when we transitioned from multiple data sources to a data lake. In that transition, you really have to slow down and be extra careful about implementing good metadata around the original data source, since a new hire might need to interact with it as well.
Rosaria: You have all been using KNIME for a long time. It must have been because you found an advantage. How did KNIME help your work or career?
Dawn: We are in an HR system where the data is really raw and not pretty, so cleaning that data up and getting it into the hands of our folks that are really driving HR operations is not an easy task. KNIME is now doing all of our HR operational reporting that isn't in a programming logic. For example, grabbing data from the system and just sending it out was automated with KNIME. I’m talking about 50 reports that need to get to about 500 people.
Michael: KNIME unequivocally helped my career. The automotive group is a good-sized group, with ten dealerships and luxury brands, but it does not have an unlimited budget. At first, they only had Excel as a tool, so I needed to get databases, an ETL tool, and a BI back-end. Looking at the magic quadrant for software in data analytics, I saw KNIME, gave it a try, and found it very easy, to the extent that I could produce my first workflow the day I downloaded it. I found out that our source provider was consistently inconsistent with how they identified all of our stores, so if you went to service Lexus of Atlanta it might be LEX, ATL, or even LOA in another system. With KNIME, you can easily clean that up with a Rule Engine node.
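The store-identifier cleanup described here amounts to mapping several source codes onto one canonical name. The following is a minimal Python sketch of that idea, not the Rule Engine node's own syntax. The three codes come from the example above; the function name `normalize_store` and the sample records are hypothetical.

```python
# Alias table: the three codes are taken from the interview example.
# A real table would be built from the actual source systems.
STORE_ALIASES = {
    "LEX": "Lexus of Atlanta",
    "ATL": "Lexus of Atlanta",
    "LOA": "Lexus of Atlanta",
}


def normalize_store(raw_code: str) -> str:
    """Map a source-system store code to one canonical store name (hypothetical helper)."""
    key = raw_code.strip().upper()
    # Unknown codes are flagged for review rather than guessed.
    return STORE_ALIASES.get(key, f"UNKNOWN({key})")


# Illustrative records only, as they might arrive from different systems.
records = [
    {"store": "lex", "service": "oil change"},
    {"store": " LOA ", "service": "brake inspection"},
    {"store": "XYZ", "service": "tire rotation"},
]
cleaned = [{**r, "store": normalize_store(r["store"])} for r in records]
for row in cleaned:
    print(row["store"], "-", row["service"])
```

Flagging unknown codes instead of dropping them keeps new inconsistencies visible as the source provider changes.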
When my CTO joined, he wasn't familiar with KNIME and we ended up doing a development challenge. It took SQL Server scripting experts two days to accomplish what I had created in a couple of hours.
Charles: Yes, it did help my career. We had other data integration software in my organization, but nothing that matched KNIME's features. That is to say, being able to create a visual, easy-to-understand workflow that, once it is scheduled, just works. Even if you're somebody who works in a scripting language like Python or R, you can integrate that smoothly within KNIME.
A couple of years ago, we had a foreman who was on a back road on his way up to work on one of our hydro dams, and because of the wind, a tree crashed down, leaving him severely injured in his truck. That's a really sad story, but we have a really happy ending because of KNIME. We were able to develop a model to predict what wind conditions would look like throughout our area. Now, not only does KNIME execute this model, but we can also leverage it to alert people who are out in the field when there is potential for dangerous conditions.
Rosaria: An old colleague of mine used to say that when you implement a migration, you not only have to migrate data and scripts, but also the people. How did you manage to educate your colleagues around KNIME? And before that, how did you manage to convince your bosses to adopt KNIME?
Charles: Convincing my boss to adopt KNIME was pretty easy at the time, since we didn't have any tool that was tailored to scheduling or executing data science type workloads. When we talk about training beginners, I think that the courses are a great way to get somebody's brain ticking and see what they could accomplish with KNIME. However, I believe that you only really learn once you have a tangible use case and you have to sit down and actually work through a real problem.
Dawn: I totally agree with Charles. The best way to learn is just getting your hands dirty using the tool for a use case, maybe even with a little business pressure, as in “I need this done by Friday, no later”. KNIME makes working with data very easy because you can download it for free in just a few seconds, install it, and search nodes to build your workflows using keywords. Because of its visual component, KNIME makes the process of working with data less foreign and abstract, especially when we are talking about someone who is not a data expert. We didn’t do any formal training, we just got our hands dirty, and now we have become the KNIME experts in the company by helping other colleagues.
Michael: For us the migration was very easy, because we built the analytics infrastructure from scratch. In a sense, it was more of a tool selection problem than a tool migration problem.
As for onboarding others, I think the visual programming-based nature of the tool plays a crucial role. When I had to bring in a new KNIME developer, he was able to understand right away how the data was being handled. He had not worked in the automotive industry before, but he drew on his experience and was able to make the workflows more efficient.
Rosaria: What do you think the most useful feature of KNIME is?
Michael: The best thing is the community. If you are stuck with your workflow, you can search the KNIME Community Hub, and you often realize that the exact workflow you need was already created and shared by someone else. With KNIME, I have the ability to connect with other people who use the tool to find out what I can do next.
Dawn: I’ll go with a node: the Joiner. It saves me so much time when I’m trying to find where things are not matching.
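Finding "where things are not matching" is, in essence, a join that also keeps the unmatched rows from each side. Here is a minimal plain-Python sketch of that idea, not the Joiner node itself. The function name `split_by_match`, the `employee_id` key, and the sample rows are hypothetical.

```python
def split_by_match(left, right, key):
    """Split rows into matched, left-only, and right-only groups by a shared key.

    Hypothetical helper illustrating the idea of inspecting unmatched rows.
    """
    left_keys = {row[key] for row in left}
    right_keys = {row[key] for row in right}
    matched = [row for row in left if row[key] in right_keys]
    left_only = [row for row in left if row[key] not in right_keys]
    right_only = [row for row in right if row[key] not in left_keys]
    return matched, left_only, right_only


# Illustrative data only: an HR extract and a payroll extract that should line up.
hr_rows = [{"employee_id": 1}, {"employee_id": 2}, {"employee_id": 3}]
payroll_rows = [{"employee_id": 2}, {"employee_id": 3}, {"employee_id": 4}]

matched, hr_only, payroll_only = split_by_match(hr_rows, payroll_rows, "employee_id")
print("in HR but not payroll:", hr_only)
print("in payroll but not HR:", payroll_only)
```

The two unmatched groups are usually the interesting output: they point directly at the records that need a data quality fix.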
Charles: My favorite feature is the flexibility that KNIME offers. It caters to people with very different skill sets. Someone who is experienced and knows some programming language can integrate it in their workflows, whereas someone who does not have any experience working with data can also jump in and learn by doing.
Rosaria: Thank you all for taking part!