It's about making your data work for you, especially when you plan to train ML and other predictive models. So, when we talk about embracing open and FAIR principles, we mean it in the practical sense, especially across:
- Dianthus™ uHTS (ultra-high-throughput affinity screening).
- Dianthus™ α (affinity measurement for challenging interactions).
- Prometheus™ Panta (protein stability characterisation: thermal stability plus colloidal stability via DLS/SLS).
Different questions, different instruments, one shared need: data that travels cleanly and feeds directly into the models you're building. At NanoTemper, we are building our Open and FAIR Integration concept, which can be summed up in three principles.
At NanoTemper, we develop biophysical solutions to help scientists move from data to decision faster. For R&D teams embracing AI/ML, that only works if the “plumbing” is solid: integration, data structures, and analysis workflows that don’t fall apart the moment you scale experimental throughput.
I didn’t start my scientific life thinking, “One day I’ll have strong opinions about data formats.”
My first biochemical experiment was a practical in the third year of uni, monitoring the activity of purple acid phosphatase, determining initial rates from UV-Vis experiments, and then chucking numbers into Excel to create a Lineweaver–Burk plot. That was my first real experience of using physical chemistry to understand a biological process. It was slow and clunky (and didn’t really work), but pretty cool.
Then came a PhD where I ran tens of thousands of enzyme assays by hand (I mean I had a plate reader and a multi-channel pipette, but that was it). The funny part is that the throughput was still low enough that I could “get away with” brittle workflows: awkward exports, transposing data by hand, manually running fitting equations, ‘negotiating’ with the data until it agreed to look like a curve. It wasn’t efficient, but it taught me that if you don’t understand your data, no amount of analysis will save you. (Also: Excel is not an analysis tool for science. It’s a mood.)
Fast-forward to modern R&D, where AI has the potential to change everything (but only if your data is ready for it). We don’t run more experiments because we enjoy pipetting. We still need wet lab experiments because:
A perspective from Deputy Head of Research Nathan Adams.
Open and FAIR integration: the unglamorous superpower behind faster R&D and being AI-ready.
And that’s where open and FAIR stops being an academic ideal and becomes operational reality. The FAIR acronym stands for Findable, Accessible, Interoperable, Reusable. In plain lab terms, it means you can:
- trace where a result came from (sample, method, run conditions),
- move data into your chosen analysis environment or AI/ML pipeline (not just the vendor’s),
- re-analyse consistently across projects and time.
What this looks like at NanoTemper
Principle 2: Open, human- and machine-readable data (AI-ready from day one)
If an instrument can be automated, integration shouldn’t require folklore, favours, or mystery interfaces. The goal is simple:
- open, easy-to-use APIs with easy-to-understand documentation.
- the same interfaces we rely on for our control software are available to integrators and end users.
Because the ‘boring plumbing’ is what lets R&D move fast without cutting corners. And that’s the whole point: faster decisions, higher confidence, real impact. So yeah, 20-odd years later, it turns out I’ve become extremely opinionated about file formats and data structures. And so has NanoTemper. Working with and learning from our partners over the last three years has given us the insight to build the tools we think are fit for the AI/ML revolution. There is always more to do, but we are up for the challenge.
- We build models (statistical, mechanistic, AI/ML; pick your flavor)
- We must validate those models, which means more experiments.
Principle 1: Integration should be "normal", not a special project
Open, structured data enables things scientists need to do:
- aggregate results across plates, runs, days, and projects.
- apply consistent QC rules (and update them without breaking everything).
- compare like-with-like when conditions shift.
- build traceable analysis pipelines in Python, R, KNIME, or one of the major data analysis and visualisation platforms.
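To make the list above a bit more concrete, here is a minimal Python sketch of aggregating results across runs while applying one shared QC rule. Everything in it is an assumption for illustration: the row fields (`run_id`, `sample_id`, `kd_nM`, `signal_to_noise`), the threshold, and the function names are hypothetical and are not taken from any NanoTemper schema or software.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical flat rows, as they might look after loading several runs.
# Field names and values are invented for this sketch.
rows = [
    {"run_id": "run-001", "sample_id": "cmpd-1", "kd_nM": 12.0, "signal_to_noise": 15.0},
    {"run_id": "run-002", "sample_id": "cmpd-1", "kd_nM": 14.0, "signal_to_noise": 12.0},
    {"run_id": "run-002", "sample_id": "cmpd-2", "kd_nM": 300.0, "signal_to_noise": 2.0},
]

# QC rules are kept as data in one place, so a threshold can be
# changed without editing the aggregation code.
QC_RULES = {"min_signal_to_noise": 5.0}


def passes_qc(row, rules):
    """Return True if the row meets the signal-to-noise threshold."""
    return row["signal_to_noise"] >= rules["min_signal_to_noise"]


def aggregate(rows, rules):
    """Group QC-passing rows by sample and summarise across runs."""
    by_sample = defaultdict(list)
    for row in rows:
        if passes_qc(row, rules):
            by_sample[row["sample_id"]].append(row["kd_nM"])
    return {
        sample: {"n": len(values), "mean_kd_nM": mean(values)}
        for sample, values in by_sample.items()
    }


summary = aggregate(rows, QC_RULES)
print(summary)
```

Because the rule set is passed in as an argument, re-running the same aggregation with a stricter or looser threshold is a one-line change, which is the "update them without breaking everything" part.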
For all automatable NanoTemper devices, we offer a .json export format using a common schema to allow for rapid integration into your existing analysis pipelines. Because your data is already structured and documented, it's AI-ready without extra prep work.
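As a rough idea of what "rapid integration" can mean in practice, the sketch below flattens a structured JSON export into one row per measurement using only the Python standard library. The document shown is invented for this example: the field names (`schema_version`, `run_id`, `measurements`, `well`, `kd_nM`) are assumptions and do not describe the actual NanoTemper export schema.

```python
import json

# Hypothetical export document. The field names are illustrative
# assumptions, not the real NanoTemper schema.
EXAMPLE_EXPORT = """
{
  "schema_version": "1.0",
  "instrument": "example-instrument",
  "run_id": "run-001",
  "measurements": [
    {"well": "A1", "sample_id": "cmpd-1", "kd_nM": 12.5},
    {"well": "A2", "sample_id": "cmpd-2", "kd_nM": 340.0}
  ]
}
"""


def flatten_export(text):
    """Turn one export document into flat rows, one per measurement.

    Run-level fields are copied onto every row so each row stays
    traceable to the run it came from.
    """
    doc = json.loads(text)
    run_fields = {k: v for k, v in doc.items() if k != "measurements"}
    return [{**run_fields, **m} for m in doc["measurements"]]


rows = flatten_export(EXAMPLE_EXPORT)
for row in rows:
    print(row["run_id"], row["well"], row["sample_id"], row["kd_nM"])
```

Rows in this shape drop straight into a dataframe or a feature table, with the run-level context still attached to every data point.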
Principle 3: Documentation that tells you what is going on
To make data genuinely reusable, you need documentation that spells out:
- data structures (what’s in each field, unambiguously).
- analysis workflow logic (what is calculated from what).
- formulas and assumptions.
- versioning (because our schemas are going to change as we think up new cool ideas).
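On the versioning point, one way a downstream pipeline might protect itself is to check the schema version before parsing anything else. This is only a sketch under stated assumptions: the `schema_version` field, its "major.minor" format, and the set of supported versions are hypothetical and not taken from NanoTemper documentation.

```python
# Major schema versions this (hypothetical) pipeline knows how to read.
SUPPORTED_MAJOR_VERSIONS = {1}


def check_schema_version(doc):
    """Return the major schema version, or raise if it is missing or unsupported.

    Assumes a "major.minor" version string in a field named
    "schema_version"; both are assumptions made for this sketch.
    """
    version = doc.get("schema_version")
    if version is None:
        raise ValueError("export has no schema_version field")
    major = int(version.split(".")[0])
    if major not in SUPPORTED_MAJOR_VERSIONS:
        raise ValueError(f"unsupported schema major version: {major}")
    return major


# A minor bump is accepted; an unknown major version would fail loudly
# instead of being silently misread.
print(check_schema_version({"schema_version": "1.2"}))
```

Failing early on an unknown major version is a deliberate choice here: a loud error is easier to deal with than a model quietly trained on misread fields.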
The industry is changing, but, due to legacy tech stacks and corporate inertia, it often still treats data export like a bonus feature. Like: “Here’s a CSV file. Good luck.” With more data points, and AI/ML workflows becoming ever-present, we all need better.