Blog - medidata

Inconvenient Truth about Claims Databases

Ever more often I hear from my clients that they would like to use already existing databases. It saves money and time, they say. The industry sees claims databases as an easy source of evidence: quick, painless and ready-to-go.

Is it really so?

Finding the right database might be tougher than it seems. The key issue to address is the quality of the 'data in the base', and the plethora of aspects that have a profound impact on it. Here are a couple of interesting facts you might want to take into consideration before going all-in for claims databases.

  • The so-called ‘garbage coding’.

Take death registries, for instance, so widely beloved in the US. The American Institute for Health Metrics and Evaluation estimates that 22% of all death records are described too broadly. ‘Heart disease’ is an end point, the fatal effect of many potentially underlying conditions which, developing over decades, ultimately led to death. ‘Brain trauma’ can be caused by a stroke or an ill-fated road trip... A whole range of acute, fatal conditions is nothing more than 'nature's disguise' for long-term causes, such as many NCDs.

A savvy way to illustrate this is to invoke the ‘French Paradox’. Despite a diet full of saturated fats, French people for many years had a lower-than-average rate of heart disease. How so? In 2013 Dr William Kerr from California’s Public Health Institute showed that it had all along been a result of long-term inaccurate record-keeping. It seems French people lived much healthier lives thanks to garbage coding [1].


  • In the 4th revision of its ‘Guide on Methodological Standards in Pharmacoepidemiology’, ENCePP (the European Network of Centres for Pharmacoepidemiology and Pharmacovigilance) gives us another inconvenient fact about the secondary use of data.

A comparative study of a clinical versus an insurance claims database for predictors of prognosis in patients with ischaemic heart disease showed that claims data failed to identify more than half of the patients with conditions important for prognosis when compared with the clinical information system.


  • Speaking of Real-World Data, a Danish register study on non-adherence in general practice found that 9.3% of all prescriptions for new therapies were never redeemed at the pharmacy. How does that reflect on the assessment of exposure? How does that reflect on the Real-World Evidence that we extract from these ‘Real-World Claims Data’?


  • Are you sure you have accounted for risk-window, time-window, immortal-time and all other forms of time-related bias? You’d better be.
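To make the immortal-time point concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `Patient` record, the `rate_ratio` helper and the four-patient toy cohort are hypothetical and do not come from any real database or study. It compares a naive analysis, which counts the whole follow-up of an ever-treated patient as exposed, with a time-varying one, which counts the days before the first dispensing as unexposed (the patient had to survive them to be treated at all).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Patient:
    follow_up_days: int                   # cohort entry to death or censoring
    first_dispensing_day: Optional[int]   # None if never treated
    died: bool


def rate_ratio(patients, time_varying):
    """Death rate ratio (exposed vs unexposed) per person-day.

    Assumes first_dispensing_day <= follow_up_days and that both groups
    accumulate some person-time (no guard against division by zero).
    """
    person_days = {"exposed": 0, "unexposed": 0}
    deaths = {"exposed": 0, "unexposed": 0}
    for p in patients:
        if p.first_dispensing_day is None:
            person_days["unexposed"] += p.follow_up_days
            deaths["unexposed"] += int(p.died)
            continue
        if time_varying:
            # Days before the first dispensing are 'immortal' for the
            # treated patient, so they belong to the unexposed group.
            person_days["unexposed"] += p.first_dispensing_day
            person_days["exposed"] += p.follow_up_days - p.first_dispensing_day
        else:
            # Naive: the whole follow-up is treated as exposed time.
            person_days["exposed"] += p.follow_up_days
        deaths["exposed"] += int(p.died)
    exposed_rate = deaths["exposed"] / person_days["exposed"]
    unexposed_rate = deaths["unexposed"] / person_days["unexposed"]
    return exposed_rate / unexposed_rate


# Hypothetical toy cohort, invented for this illustration only.
cohort = [
    Patient(400, 200, True),
    Patient(400, 200, False),
    Patient(200, None, True),
    Patient(400, None, False),
]

naive = rate_ratio(cohort, time_varying=False)
corrected = rate_ratio(cohort, time_varying=True)
print(f"naive: {naive:.2f}, time-varying: {corrected:.2f}")
```

In this toy cohort the naive ratio falls below 1 and the time-varying ratio rises above 1, so the same records make the treatment look protective or harmful depending only on how the waiting time is classified. Real analyses are more involved, but the direction of the distortion is the thing to watch for.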


To conclude: claims databases do seem convenient, but they come with additional costs one has to be aware of. You cannot simply trick your way out of data quality validation. In prospective observational trials it is decidedly easier to maintain high data quality, assuming that the project design phase was done professionally.

Good day to you!


Philip Rozewski



1. Causes of death a mystery for most people because of inaccurate records. Andrew Masterson, Sydney Morning Herald
