Sources of health data

The common sources of healthcare data are EHRs, insurance claims, research, public health, and user-generated data.

Electronic Health Records: hospitals and doctors keep records of each patient visit in their electronic health record (EHR) systems. Epic and Cerner are by far the largest vendors in the US. EHRs document both clinical information, such as diagnoses and procedures, and information about health facility workflow, such as appointment times. EHRs can also contain clinical notes, radiology images and laboratory results. As such, EHRs are often the richest sources of clinical information.

Common data standards such as HL7 have made interoperability across different EHRs easier in recent years. However, much EHR data remains in silos across different organizations.

Insurance Claims: healthcare providers submit claims to, and receive payment from, insurance companies. Claims data are transaction based, in that they exist to facilitate payment; a claim usually has a member ID, claim ID, service date, high-level clinical information and payment amounts.
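
To make the shape of a claim concrete, here is a minimal Python sketch of a record with the fields listed above, plus a roll-up of payments per member. The class name, field names, helper function and sample values are all made up for illustration; real claim formats carry many more fields and differ between payers.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class Claim:
    """One claim line. Field names are hypothetical, not a real payer schema."""
    member_id: str        # identifies the insured person
    claim_id: str         # identifies the transaction
    service_date: date    # when the service was delivered
    diagnosis_code: str   # high-level clinical information (ICD-10 style code)
    paid_amount: Decimal  # amount paid by the insurer


def total_paid_by_member(claims):
    """Sum paid amounts per member ID."""
    totals = defaultdict(Decimal)  # Decimal() starts each total at 0
    for claim in claims:
        totals[claim.member_id] += claim.paid_amount
    return dict(totals)


# Fabricated sample rows, for illustration only.
SAMPLE_CLAIMS = [
    Claim("M001", "C1001", date(2020, 1, 15), "E11.9", Decimal("120.00")),
    Claim("M001", "C1002", date(2020, 3, 2), "E11.9", Decimal("80.50")),
    Claim("M002", "C1003", date(2020, 2, 20), "J06.9", Decimal("45.25")),
]

if __name__ == "__main__":
    for member, total in total_paid_by_member(SAMPLE_CLAIMS).items():
        print(member, total)
```

Amounts are held as Decimal rather than float so that sums of currency values stay exact.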

Claims data are widely used in policy decision making, largely because of their ready linkage to $$$; EHRs usually contain far less financial information. Both are protected in the US by HIPAA, so you should not find patient-level information publicly.

Research: medical universities and pharmaceutical companies spend lots of resources conducting clinical research. The data generated tend to be scientifically rich and specific to each study, which makes aggregation across different studies and institutions difficult. Nevertheless, published research results are highly valuable and can often be the only source for some data points.

Some organizations have done a great deal of public service by conducting high-quality research that makes complex health matters more accessible, e.g. the Kaiser Family Foundation and the Dartmouth Atlas.

Public health: government agencies and programs, such as the CDC, the Centers for Medicare & Medicaid Services and the Medical Expenditure Panel Survey, gather public health information that spans entire cities or regions. This type of data is usually obtained through regular data submission by healthcare facilities and through population surveys. These are very useful sources of epidemiology data, such as disease prevalence and mortality rates.

The open data initiative is gathering momentum. Various state health agencies in the US, e.g. New York and California, have made their data more easily available.

User generated: the plethora of wearables and the rise of “the empowered patient” mean a wealth of new personal health metrics is being captured. Your heart rate, blood pressure, pulse rate and temperature are all valuable pieces of information that could inform healthcare interventions.

9 thoughts on “Sources of health data”

  1. Pharmacy / SPP data is also a very important data source that’s very frequently used, especially if you do any work for pharma clients. Switch claims data is also super important, but that’s arguably a mixture of several of the above.

