Owen Yang

Many clinicians I know hold a contradictory attitude. Before this type of evidence is generated, they tend to dismiss the idea. Once it is published in a good journal, say the BMJ, they suddenly forget that dismissal and take it too seriously instead.

Pharmacoepidemiology is just jargon

Pharmacoepidemiology is a fancy new term that perhaps should not have existed. We could split the word in half (pharma- and epidemiol-) and say that anything related to both drugs and epidemiology can be pharmacoepidemiology.

Examples include reports of how much a drug is used, or reports of associations between drug use and its intended or unintended effects. Interestingly, drug trials themselves are not normally considered a main focus of pharmacoepidemiology.

Real world evidence is just jargon for data generated from routine healthcare

When someone says ‘real world evidence’, they generally mean that the data they used come from routinely recorded medical information. You may imagine this means the notes that doctors type into electronic health records, but that is generally not the case. Structured data (here meaning categorised or ‘coded’ information, as opposed to free text) are much easier to process, so the so-called real world evidence to date has mostly been generated from data that are already coded as a routine part of medical care. These mostly include prescriptions, diagnoses (or conclusions), and investigation results, which are coded for the purposes of payment or quality assessment. If you have the ability to process unstructured data, there is nothing wrong with using free text or image outputs. So the term ‘real world data’ in principle covers anything generated from routine medical care, but most of the time it means coded data generated from routine medical care.

It is rather ingenious that people invented the term ‘real world evidence’, because its face value is suddenly uplifted compared with ‘evidence generated from routine healthcare records’.

The elephant in the room

Whenever one tries to generate real-world evidence for pharmacoepidemiology, the first thing he or she should address is confounding by indication.

A drug is prescribed for a reason, and in medicine that reason is strictly regulated; it is called the drug indication. One generally would not prescribe a drug without an indication. For example, if you look up the indications for metformin, you would probably find diabetes, and perhaps polycystic ovarian syndrome. This means that in the ‘real world’ you would only find people prescribed metformin for these reasons. Therefore, any association you find with metformin, say between metformin and cancer, may well be due to an association with these indications. In this case, (A) the association between diabetes and cancer and (B) the association between metformin and cancer are very difficult to tell apart.
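To make the point concrete, here is a minimal simulation sketch. Every number in it is an assumption chosen for illustration, not an estimate from any study: 10% of people have diabetes, 80% of them are prescribed metformin, nobody else is, and diabetes doubles the cancer risk while metformin itself does nothing.

```python
import random

def crude_risk_ratio(n=200_000, seed=0):
    """Crude cancer risk ratio, metformin users vs non-users.

    All probabilities are made-up illustration values.
    Metformin has NO effect on cancer in this simulated world.
    """
    rng = random.Random(seed)
    people = {True: 0, False: 0}   # keyed by metformin use
    cancers = {True: 0, False: 0}
    for _ in range(n):
        diabetes = rng.random() < 0.10
        # Indication: only people with diabetes are ever prescribed metformin.
        metformin = diabetes and rng.random() < 0.80
        # Cancer risk depends on diabetes only, never on metformin.
        cancer = rng.random() < (0.06 if diabetes else 0.03)
        people[metformin] += 1
        cancers[metformin] += cancer
    risk_users = cancers[True] / people[True]
    risk_non_users = cancers[False] / people[False]
    return risk_users / risk_non_users

crude_rr = crude_risk_ratio()
print(f"crude risk ratio, users vs non-users: {crude_rr:.2f}")
```

Under these assumptions the crude risk ratio comes out close to 2, even though the drug is inert by construction. The whole association is inherited from the indication.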

What about comparing metformin users and non-users just among those who have diabetes? Certainly we can, but as in all observational studies, things happen or do not happen for reasons, and these reasons may not be random. Here the elephant in the room that needs addressing (on top of the usual limitations of an observational study) is contraindications. A medication is also avoided for a reason. In patients with diabetes who have severe intolerance to metformin, or who have very poor kidney function, metformin is not prescribed. So an association with metformin may well be an association with the tendency towards metformin intolerance, or an association with kidney function, instead.
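The same kind of sketch can illustrate the contraindication problem. Again, all numbers are assumptions for illustration only: everyone simulated has diabetes, 20% have poor kidney function, metformin is rarely given to them, and poor kidney function (not metformin) triples the risk of some adverse outcome.

```python
import random

def risk_ratio_among_diabetics(n=200_000, seed=1):
    """Risk ratio for an adverse outcome, metformin users vs non-users,
    in a simulated population where everyone has diabetes.

    All probabilities are made-up illustration values.
    Metformin has NO effect on the outcome in this simulated world.
    """
    rng = random.Random(seed)
    people = {True: 0, False: 0}   # keyed by metformin use
    events = {True: 0, False: 0}
    for _ in range(n):
        poor_kidney = rng.random() < 0.20
        # Contraindication: metformin is mostly avoided with poor kidney function.
        metformin = rng.random() < (0.05 if poor_kidney else 0.90)
        # The outcome depends on kidney function only.
        event = rng.random() < (0.15 if poor_kidney else 0.05)
        people[metformin] += 1
        events[metformin] += event
    risk_users = events[True] / people[True]
    risk_non_users = events[False] / people[False]
    return risk_users / risk_non_users

rr_within_diabetes = risk_ratio_among_diabetics()
print(f"risk ratio among people with diabetes: {rr_within_diabetes:.2f}")
```

Here metformin looks strongly protective, with a risk ratio well below 1, purely because the non-users are enriched with patients who have poor kidney function.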

A theorist (or an ignorant data scientist) may dismiss this concern and say that indications and contraindications are just examples of confounding, which is a universal issue in any observational study, including a ‘real world evidence’ study. Although this is true, indications and contraindications are the unusual cases in which the confounding factor (diabetes) is unusually strongly correlated with the exposure (metformin). When this correlation is stronger than the potential association of interest, it is not appropriate to use generic methods (propensity scores, say) to imply causation. In other words, because metformin is used so overwhelmingly in diabetes and is the first-line medication for nearly all patients with diabetes, using propensity score matching to suggest that metformin causes or prevents cancer would not be appropriate if one has not considered the possible association between diabetes and cancer.

Safe to generate insights

The best contribution a real-world pharmacoepidemiology report can make to the real world is to generate insights. One can call it an ‘advanced surveillance system’: a system that detects that something might have gone wrong, after which a confirmatory investigation is needed to judge whether the signal is true or just a false alarm. Without jumping to causation, it is a great way to identify whether use of a drug is associated with any intended or unintended outcome. In other words, it can be used as a screening method to identify potential issues, without rushing to conclusions.

One danger here is reporting only selected results, so it is best to use a systematic approach when reporting them.
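A rough sketch of why selective reporting is dangerous: suppose a drug is screened against 200 outcomes and it truly affects none of them. The number of outcomes, the group sizes, the 10% baseline risk, and the choice of a simple two-proportion z-test are all assumptions made for this illustration.

```python
import math
import random

def two_sided_p(events_a, events_b, n_per_group):
    """Two-sided p-value from a pooled two-proportion z-test (equal group sizes)."""
    pooled = (events_a + events_b) / (2 * n_per_group)
    se = math.sqrt(pooled * (1 - pooled) * 2 / n_per_group)
    if se == 0:
        return 1.0
    z = (events_a - events_b) / (n_per_group * se)
    return math.erfc(abs(z) / math.sqrt(2))

def screen(n_outcomes=200, n_per_group=2000, risk=0.10, seed=2):
    """Screen one drug against many outcomes when it affects none of them."""
    rng = random.Random(seed)
    p_values = []
    for _ in range(n_outcomes):
        # Users and non-users share the same true risk for every outcome.
        users = sum(rng.random() < risk for _ in range(n_per_group))
        non_users = sum(rng.random() < risk for _ in range(n_per_group))
        p_values.append(two_sided_p(users, non_users, n_per_group))
    return p_values

p_values = screen()
n_tested = len(p_values)
n_flagged = sum(p < 0.05 for p in p_values)
print(f"{n_flagged} of {n_tested} null associations have p < 0.05")
```

Roughly one in twenty of these null associations would be expected to cross the conventional threshold by chance alone. Reporting only those would look like a list of findings, whereas reporting all 200 tested outcomes alongside the flagged ones lets the reader see them as the false alarms they probably are.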