Think like a Health Data Scientist

by Nana Mensah | 6 minute read

This article originally appeared on the STP Perspectives blog.

When we interact with the health service, we leave a footprint. Imagine scrapyards, filled with old metal filing cabinets, retired from their jobs as keepers of our health records. Letters, tests, scans and treatments are still archived by law but today they occupy a digital space.

Healthcare Scientists turn patient data into clinical insight, though few of us write code to do so. The Topol Review stresses that within 20 years, 90% of all jobs in the NHS will require some element of digital skills. Health Data Scientists rely on programming skills to work with vast amounts of healthcare data, making the most of our digital filing cabinets for patient care. What can Healthcare Scientists learn from this emerging profession?

Perhaps we can learn to think like them. To better understand the role, I spoke to Claudia Cabrera and Fiona Grimson. Claudia is a post-doctoral researcher at the WHRI Centre for Translational Bioinformatics, while Fiona works as a Consultant in Data Science at IQVIA. Guided by their responses, here’s my take on what it means to think like a Health Data Scientist.

Data saves lives

“With whom? And why?” would be a reasonable response if asked whether your health data can be shared with professionals who weren’t involved in your care. NHS staff work with universities, consortiums and companies to uncover disease risk factors, reroute demand on services and cut down on prescribing errors, all made possible by insights buried in our collective health data. It’s a fact: data saves lives, and Health Data Scientists know this better than anyone.

Fiona: I define a Health Data Scientist as anyone applying statistics and technology to healthcare data. It’s about understanding the health data you’re given and how it connects to research objectives. Unless you understand the strengths and limitations of the data you’ve got, you can’t build useful, meaningful, robust models. It’s also about upholding principles of data governance, proper research consent and patient privacy. It’s important that the patient is the focus. The work we do benefits patients and that’s why we chose this field. Everyone I work with feels this way.

Healthcare Scientists are on the frontline of the #datasaveslives movement. We should take care not to neglect the value of data for improving our services while educating patients and the public along the way. Not all of us are researchers, but it’s hard to predict the impact we can have just by engaging with wider research initiatives relevant to our specialisms.

Claudia: We recently identified more than 500 genomic regions associated with blood pressure from around 1 million people, allowing us to identify new potential therapeutic drug targets. This was only possible thanks to those who shared their health data with collaborators such as the International Consortium of Blood Pressure, the UK Biobank and the US Million Veteran Program. This work is paving the way to prevention and precision medicine for cardiovascular diseases.

Health Data Scientists are forever searching for the value in data. We can take a leaf out of their book by simply staying open to learning from the data we produce. Treated more like a source of insight than a barren record, health data can save lives.

Tools of the trade

Many Healthcare Scientists rely on the almighty Excel spreadsheet to capture meaningful insight from clinical data. We copy rows between sheets to keep our house in order. We pivot tables and leverage lookups to get another angle on the story. We overwrite cells to fix typos, no crumbs left behind. On the other hand, Health Data Scientists write code. Languages like SAS, R and Python are powerful tools of the trade, designed to slice, query and model data that would be unwieldy to manage with a spreadsheet.
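To make the contrast concrete, here is a minimal sketch in Python of the spreadsheet habits described above: a pivot and a typo fix, written as code. The records, field names and helper functions are all made up for illustration. The point is only that the raw data is never overwritten and every step can be re-run.

```python
from collections import defaultdict

# Hypothetical test results; every value here is invented for illustration.
results = [
    {"patient_id": "P1", "test": "HbA1c", "site": "Clinic A", "value": 48},
    {"patient_id": "P2", "test": "HbA1c", "site": "Clinic B", "value": 53},
    {"patient_id": "P3", "test": "HbA1c", "site": "clinic a", "value": 41},
    {"patient_id": "P1", "test": "Cholesterol", "site": "Clinic A", "value": 5},
]


def clean_site(name):
    """Return a tidied copy of a site name; the raw record is left untouched."""
    return name.strip().title()


def mean_by_site(rows, test):
    """Spreadsheet-style pivot: average value of one test, grouped by site."""
    grouped = defaultdict(list)
    for row in rows:
        if row["test"] == test:
            grouped[clean_site(row["site"])].append(row["value"])
    return {site: sum(values) / len(values) for site, values in grouped.items()}


hba1c_means = mean_by_site(results, "HbA1c")
print(hba1c_means)
```

Unlike an overwritten cell, the inconsistent "clinic a" entry is still there in the raw records, and the cleaning rule that handled it is written down for anyone to review.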

Claudia: There are three key skills I think data scientists need: Programming, Statistics and Domain Knowledge. Programming lets us manage these large datasets so working with Unix, SQL, Python, and R is essential. Statistics allows us to analyse and learn from data. And an understanding of health data, which is often rooted in biology or epidemiology, allows us to correctly interpret the results and improve healthcare.

Should all Healthcare Scientists learn to code? Many of us don’t need advanced computing skills for our jobs, but I find it useful to know that there are free tools beyond Excel that enable us to answer questions we couldn’t before. That can translate to designing studies we otherwise wouldn’t. Regardless of your computing background, there’s one skill scientists can’t do without:

Fiona: Technology is important but so is writing and communicating about your work. The work you do is never just for yourself. A cool complicated statistical model is useless if decision makers aren’t able to understand its strengths and limitations. At IQVIA I was involved in a study on treatment data that looked at everything from association with adverse events, to reimbursement costs for the NHS, to whether practices met guidelines. We have to be able to communicate clearly, otherwise the work won’t have the impact it should for patients.

As a Bioinformatician, this point resonates. Communication is important for Healthcare Scientists, but we tend to emphasise it far more in patient-facing specialisms. In any case, thinking like a Health Data Scientist means keeping on top of the tools and skills at your disposal. That could be writing clear code, or taking pains to collect data in an organised spreadsheet, or the deceptively difficult task of sending a crystal-clear email.

Stay curious

What matters most when recruiting a Health Data Scientist – Qualifications? Experience? Or something else? Recruiters look for “intangibles”, skills that are difficult to spot but mark a candidate’s potential. Attitude, initiative and creativity are common examples. In a guest blog post for a data science recruitment firm, Frank Lo suggests that the number one intangible skill for all Data Scientists is intellectual curiosity.

Fiona: For me, it’s to do with really understanding the context of the data. Understanding as much as you can about where it’s collected, where data is coming from. If it’s from a technician in a hospital, seek to understand what they’re doing, why they use certain codes, why they record something and why they don’t.

Einstein said that “The important thing is not to stop questioning.” Scientific minds come with inquisitive batteries included and clinical services need every bit of our reserves. We stay curious by asking “What if?”, a question that sows the seed of a new audit, or life-changing research, or a service improvement that redefines best practices. Curiosity is particularly important today, when the goalposts for practical technology are always moving.

Fiona: In just two years on the job, I’ve seen a trend towards datasets getting larger, more complex, more detailed and more layered. This presents real challenges that we are constantly grappling with. The technology has to scale to store this data, keeping it safe and secure. Traditional analysis methods become inefficient, leading us to machine learning and artificial intelligence methods. Privacy is extremely important and federated networks are being developed to allow researchers to model data, kept at its source, without ever seeing patient level information.
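The federated idea Fiona describes can be illustrated with a toy sketch in Python. This is an assumption-laden simplification, not how any particular network works: the function names and blood pressure readings are hypothetical, and real federated networks add governance, security and disclosure controls on top. Each site shares only a count and a total, and the researcher combines those summaries without seeing a single patient-level value.

```python
def local_summary(values):
    """Runs at each site: return aggregate figures only, never patient rows."""
    return {"n": len(values), "total": sum(values)}


def pooled_mean(summaries):
    """Runs centrally: combine the site summaries into one overall mean."""
    n = sum(s["n"] for s in summaries)
    total = sum(s["total"] for s in summaries)
    return total / n


# Hypothetical systolic blood pressure readings held separately at two sites.
site_a = [120, 130, 140]
site_b = [110, 150]

# Only the summaries leave each site.
summaries = [local_summary(site_a), local_summary(site_b)]
overall = pooled_mean(summaries)
print(overall)
```

The pooled mean is identical to the one you would get by merging the raw data, which is what makes this kind of analysis possible while the data stays at its source.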

In a health system that has historically struggled to find those capable of drawing insight from data, I think Healthcare Scientists led by curiosity can drive insight and innovation in their respective fields.

Three suggestions

Health Data Scientists leverage statistics, programming and domain expertise to turn health data into insight. Speaking to Fiona and Claudia, I left with three suggestions for all Healthcare Scientists looking to learn from this profession:

  • Stay open to the value of health data

  • Stay on top of the tools of the trade

  • Stay curious

Special thanks to Claudia and Fiona for taking part in interviews for this article. Additional thanks to all the editors as well as Adriana and Jes at STP Perspectives for publishing.