Casey Kneale

Software Engineer | Data Scientist | Analytical Chemist (Ph.D.) | He/Him

Portrait of Casey Kneale

About Me:

I am a technology generalist based in Central Massachusetts, with several areas of specialization. I am a peer-reviewed scientist with journal articles in the fields of Analytical Chemistry, Optics, and Chemometrics (chemical data science). My doctoral thesis covered the development and study of several statistical methods for forensic data analysis where little can be assumed about the samples of interest. Writing papers was fun, but I now opt to release findings and thoughts through other means (GitHub, ArXiv, NextJournal, etc.).

I tend to serve companies as a Data Scientist or a Software Engineer, either as an employee or a consultant. Both professions allow me to do what I love: designing and engineering solutions to problems with data and evidence. My greatest professional assets are creativity, kindness, and the ability to learn quickly. I enjoy technologies like Linux, Rust, DuckDB, Julia, pencil and paper, and whiteboards. Those tools mesh well with my intuition and serve many of the needs of the projects I tend to work on for fun. I am happy to work with other tools, as my resume demonstrates.

I was the sole author and maintainer of what was the largest open source chemometrics software package, ChemometricsTools.jl, and its accompanying dataset package, ChemometricsData.jl. It may still be the most capable; I no longer keep track. It was a fun project that served as a basis for my own internal research needs and as a way to warm up writing code before work many years ago. I no longer take the external project seriously, and sometimes invest effort in a private internal project instead. My GitHub is mostly for fun, as I believe most open source software efforts should be. I am more invested in my family, job, and personal research, mostly in that order.

I have had the privilege and honor of being an invited guest speaker at a university (UUIC), though I think my sense of humor is a more valuable characteristic. Some good examples of that humor can be found in my VIMKiller or Turtle Swarm Optimizer repositories. It's important not to take all things seriously.

Outside of work/research/programming I am an enjoyer of forests, native plant gardening with my wife, Cape Cod, reading used textbooks for fields I have no business reading, cats, creating scientific instruments, and other calm, mindful activities. Fun fact: I have ~8 years' worth of personal intellectual property (spectroscopy, chemometrics, data science, physics, etc.) and haven't found a good outlet for most of it. Any ideas for what to do with that would be welcomed!

Consulting Services:

I am currently open to consulting opportunities. I have professional experience in a variety of high-stakes industries, ranging from pharmaceutical manufacturing and cyber security/resiliency to oil and gas. Almost every company I was a part of went through a successful merger or acquisition while I was there or shortly after I left. I have seen the consequences of a great many decisions (buy or build, partner or compete, NDA or don't, etc.) in a variety of businesses (B2B, B2G, SaaS, etc.).

In the recent past I had my first consulting opportunity, for a computer vision company in the agriculture industry. Together we worked out the details of a challenging, high-risk problem for the company, while I also up-skilled some of the staff on statistics and machine learning. References available on request.

I offer consultations primarily from a technical basis, in relation to machine learning, computer vision, chemical analysis/PAT, Chemometrics, scientific software, data-oriented software, and to some extent AI. That is what I know best. However, more often than not, critical technical challenges involve direct product or project alignment. I can assist with those inquiries as well. I have a track record of helping to root-cause issues and find gaps from R&D through production and back up to ideation.

My approach is direct but balanced by research and question asking. The techniques I use rely heavily on evidence in the form of proofs of concept, data, or process analysis, and many align closely with Deming. I enjoy interacting with teammates/stakeholders, examining code/architecture, writing code or models, analyzing processes, and doing whatever is necessary to address business interests. I can also help with the daunting elements of greenfield projects; I have professional experience with R&D, statistical evaluations, risk analysis, voice of customer, and competitor analysis, and I work to stay aware of trends. If this is resonating with you, feel free to e-mail me and we can discuss whether my experience is an asset for your current needs.

Public Research & Blog Posts

Jan 3, 2025

A More Optimized Algorithm For Rubberband Baseline Correction

The rubberband baseline correction algorithm involves computing a convex hull around a spectral region. The issue with many convex hull algorithms is how expensive they can be to compute. This repository demonstrates a relatively cheap way to compute the equivalent function by accounting for some of the guarantees in spectra. The algorithm is both simple and easy to write.
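As a rough illustration of the idea (not the repository's actual code), here is a minimal Python sketch. It assumes that one of the "guarantees" is that the x-axis of a spectrum is already sorted, so the lower convex hull can be found in a single left-to-right pass with no sorting step. The function name `rubberband_baseline` is hypothetical.

```python
def rubberband_baseline(x, y):
    """Lower convex hull of a spectrum, interpolated back onto every x.

    Assumption (not taken from the repository): x is strictly increasing,
    which lets us skip the sort that general convex hull routines need.
    """
    hull = []  # indices of the lower-hull vertices, left to right
    for i in range(len(x)):
        # Drop the last vertex while it sits on or above the chord
        # running from the vertex before it to the new point i.
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            cross = (x[b] - x[a]) * (y[i] - y[a]) - (y[b] - y[a]) * (x[i] - x[a])
            if cross <= 0:
                hull.pop()
            else:
                break
        hull.append(i)

    # Linearly interpolate between consecutive hull vertices.
    baseline = [float(v) for v in y]
    for a, b in zip(hull, hull[1:]):
        slope = (y[b] - y[a]) / (x[b] - x[a])
        for i in range(a, b + 1):
            baseline[i] = y[a] + slope * (x[i] - x[a])
    return baseline


# Toy spectrum: subtracting the baseline leaves the signal sitting on zero.
x = [0, 1, 2, 3, 4]
y = [1, 3, 2, 5, 1]
corrected = [yi - bi for yi, bi in zip(y, rubberband_baseline(x, y))]
```

Each point is pushed and popped at most once, so the pass is linear in the number of channels.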


Visit Repository

May 15, 2022

Topological Peak Detection In Rust

Finding peaks or troughs in data is a quest as old as Calculus. One of my favorite ways to approach this challenge is to utilize 'Topological Persistence'. So I wrote my first Rust crate that does effectively that. It's not finely tuned but it served a need I had while working on some audio data analysis. Enjoy!
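For readers unfamiliar with the technique, here is a minimal sketch of 1D topological persistence for peaks. It is written in Python for brevity and is not the crate's API; the function name `persistent_peaks` is hypothetical. Samples are visited from highest to lowest: a sample with no visited neighbor starts a new component (a peak is born), and when two components meet, the one with the lower peak dies. A peak's persistence is its height above that merge level.

```python
def persistent_peaks(values):
    """Return (index, persistence) pairs, most persistent peak first."""
    n = len(values)
    parent = [None] * n  # union-find forest; None means "not visited yet"

    def find(i):
        # Root of i's component; the root is always the component's peak.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    peaks = []
    for i in sorted(range(n), key=lambda k: values[k], reverse=True):
        parent[i] = i
        roots = [find(j) for j in (i - 1, i + 1)
                 if 0 <= j < n and parent[j] is not None]
        if len(roots) == 2:
            # Two components meet at i: the lower peak dies here.
            low, high = sorted(roots, key=lambda r: values[r])
            if values[low] > values[i]:
                peaks.append((low, values[low] - values[i]))
            parent[low] = high
            roots = [high]
        if roots:
            parent[i] = roots[0]

    if n:
        # The global maximum never merges into anything higher.
        peaks.append((find(0), float("inf")))
    peaks.sort(key=lambda p: p[1], reverse=True)
    return peaks


signal = [0, 3, 1, 6, 2, 5, 0]
ranked = persistent_peaks(signal)
```

Thresholding on persistence then separates prominent peaks from noise without any smoothing or derivative estimates.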


Visit Repository

Nov 1, 2020

Extreme Vertex Experimental Designs

Oftentimes a calibration does not need to sample from an entire simplex. This can be due to physical constraints or practical ones. In these cases EVE designs are powerful tools. No free package existed that suited my needs, so I wrote this one.


View Repository

Jul 20, 2020

Why Chemometricians Should use Volume % not Weight %

This is a common pitfall in spectroscopic modeling efforts. Weight % can lead to nonlinear response variables, while Volume % doesn't. This blog explains why!
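One common way to see this (the blog's own argument may differ) is sketched below. It assumes a Beer-Lambert-style response, where the signal is proportional to the amount of analyte per unit volume, and ideal mixing, where component volumes add. Under those assumptions the response is linear in volume fraction, but the map from weight fraction to volume fraction is curved whenever the two densities differ. The densities and function name are made up for illustration.

```python
def volume_fraction_from_weight(w_a, rho_a, rho_b):
    """Volume fraction of component A in a binary mixture.

    Assumes ideal mixing (volumes are additive). w_a is the weight
    fraction of A; rho_a and rho_b are the pure-component densities.
    """
    v_a = w_a / rho_a            # volume of A per unit mass of mixture
    v_b = (1.0 - w_a) / rho_b    # volume of B per unit mass of mixture
    return v_a / (v_a + v_b)


# Hypothetical densities in g/mL, chosen only to make the effect visible.
rho_a, rho_b = 0.8, 1.2

# A 50/50 blend by weight is not a 50/50 blend by volume, so a response
# that is linear in volume fraction cannot also be linear in weight fraction.
phi_mid = volume_fraction_from_weight(0.5, rho_a, rho_b)
```

When the two densities are equal the two scales coincide, which is why the problem can go unnoticed in some systems.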


View Blog

Jul 11, 2020

Derivation for Inverse Gas Chromatography's Operating Principles

Back in graduate school I really wanted to understand how inverse gas chromatography worked. Intuitively it made sense, but physically I did not understand how the linear relationship was obtained. Searching the literature was fruitless. So I derived it myself. Warning: Thermodynamics.


View Blog

Jul 11, 2020

What if Fisher & Anderson Provided Uncertainties

I have dealt with a lot of uncertainty quantification in both my career and hobbies. This blog revisits the famous Iris dataset, pretending there was uncertainty on the measurands, and explores what that uncertainty does to a simple PCA model. The intention is to show data scientists what PCA does to uncertainties and to offer a lesson in thoughtfulness about data quality.


View Blog

If you enjoyed any of the technical writing samples above, be sure to check out my NextJournal, GitHub, and ArXiv for more of the same!