Before approval, new therapeutic drug treatments are extensively tested in clinical trials. However, some side effects are only identified once a drug is prescribed to larger cohorts of patients, to patients with one or more medical conditions, for a sustained period of time, or in combination with other treatments.

For this reason, regulatory agencies - for example, the Food & Drug Administration or the European Medicines Agency - provide pharmacovigilance programs to monitor and survey Adverse Drug Reactions (ADRs).

Data sources

FDA Adverse Event Reporting System (FAERS)

The FDA Adverse Event Reporting System (FAERS) is a database that contains millions of public adverse event and medication error reports submitted to the FDA. The database is designed to support the FDA's post-marketing safety surveillance program for drug and therapeutic biologic products. Adverse events and medication errors are mapped to terms in the Medical Dictionary for Regulatory Activities (MedDRA) terminology.

Computational pipelines and datasets

While the recurrence of a given adverse event is relevant, it's the specificity of the event to the drug that might flag concerns. To obtain a list of significant drug-ADR associations, we have implemented an analysis similar to the one described by Maciejewski et al. (2017).

First we apply a set of filters to the reports as described below:

  • Only reports submitted by health professionals (primarysource.qualification in (1,2,3)).

  • Exclude reports that resulted in death (no entries with seriousnessdeath=1).

  • Only drugs that were considered by the reporter to be the cause of the event (drugcharacterization=1).

  • Remove events on a manually curated blacklist of uninformative events.
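The filters above can be sketched in Python as follows. This is a minimal illustration, not the pipeline's actual code: it assumes each report is an openFDA-style JSON object in which `primarysource.qualification` and `seriousnessdeath` sit at the report level and `drugcharacterization`, `medicinalproduct` and `reactionmeddrapt` sit under `patient.drug` and `patient.reaction`. The blacklist entries and the sample reports are hypothetical.

```python
# Hypothetical blacklist entries; the manually curated list is not reproduced here.
BLACKLISTED_EVENTS = {"drug ineffective", "off label use"}


def keep_report(report):
    """Report-level filters: health-professional source and no death outcome."""
    source = report.get("primarysource") or {}
    if str(source.get("qualification", "")) not in {"1", "2", "3"}:
        return False
    return str(report.get("seriousnessdeath", "")) != "1"


def suspect_drug_event_pairs(report, blacklist=BLACKLISTED_EVENTS):
    """Return (drug, event) pairs for suspect drugs and non-blacklisted events."""
    patient = report.get("patient") or {}
    drugs = [
        d.get("medicinalproduct", "").strip().lower()
        for d in patient.get("drug", [])
        if str(d.get("drugcharacterization", "")) == "1"  # suspect drug only
    ]
    events = [
        r.get("reactionmeddrapt", "").strip().lower()
        for r in patient.get("reaction", [])
    ]
    events = [e for e in events if e and e not in blacklist]
    return [(d, e) for d in drugs if d for e in events]


# Hypothetical sample reports (assumed nesting, made-up values).
reports = [
    {   # kept: physician report, one suspect drug, one informative event
        "primarysource": {"qualification": "1"},
        "patient": {
            "drug": [
                {"medicinalproduct": "DRUG A", "drugcharacterization": "1"},
                {"medicinalproduct": "DRUG B", "drugcharacterization": "2"},
            ],
            "reaction": [
                {"reactionmeddrapt": "Rash"},
                {"reactionmeddrapt": "Drug ineffective"},
            ],
        },
    },
    {   # dropped: not submitted by a health professional
        "primarysource": {"qualification": "5"},
        "patient": {"drug": [], "reaction": []},
    },
    {   # dropped: report resulted in death
        "primarysource": {"qualification": "2"},
        "seriousnessdeath": "1",
        "patient": {
            "drug": [{"medicinalproduct": "DRUG A", "drugcharacterization": "1"}],
            "reaction": [{"reactionmeddrapt": "Nausea"}],
        },
    },
]

pairs = [p for r in reports if keep_report(r) for p in suspect_drug_event_pairs(r)]
print(pairs)
```

Splitting the report-level checks from the drug- and event-level checks keeps each filter easy to test on its own.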

Next, we mapped the drugs in the FAERS reports to the drugs in the Open Targets Platform (ChEMBL IDs). A drug was mapped when any of the fields listed below gave an exact match:

FAERS drugs | Open Targets Platform drugs
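One way to picture exact-match mapping is a lookup table keyed on drug labels. The sketch below is illustrative only: the record keys (`chembl_id`, `name`, `synonyms`, `trade_names`), the helper names and the case-insensitive normalisation are assumptions, and the candidate names stand in for whichever FAERS fields are actually compared.

```python
def normalise(label):
    """Assumed normalisation: trim whitespace and ignore case."""
    return label.strip().lower()


def build_lookup(platform_drugs):
    """Map every known label of a Platform drug to its ChEMBL ID."""
    lookup = {}
    for drug in platform_drugs:
        labels = [drug["name"], *drug.get("synonyms", []), *drug.get("trade_names", [])]
        for label in labels:
            # Keep the first ID seen for a label; ambiguous labels are not resolved here.
            lookup.setdefault(normalise(label), drug["chembl_id"])
    return lookup


def map_to_chembl(candidate_names, lookup):
    """Return the ChEMBL ID of the first candidate name with an exact match."""
    for name in candidate_names:
        chembl_id = lookup.get(normalise(name))
        if chembl_id is not None:
            return chembl_id
    return None


# Illustrative records with hypothetical field names.
platform_drugs = [
    {"chembl_id": "CHEMBL112", "name": "PARACETAMOL",
     "synonyms": ["Acetaminophen"], "trade_names": ["Tylenol"]},
    {"chembl_id": "CHEMBL521", "name": "IBUPROFEN",
     "synonyms": [], "trade_names": ["Advil"]},
]

lookup = build_lookup(platform_drugs)
print(map_to_chembl(["TYLENOL", "acetaminophen"], lookup))
print(map_to_chembl(["unknown product"], lookup))
```

Drugs with no exact match return `None` and would simply be left unmapped in this sketch.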









The significance of each drug-ADR pair was then evaluated using the Likelihood Ratio Test (LRT) as previously described by Huang et al. (2011). The significance of a given drug-ADR pair is implicitly corrected for how often the drug is found in a report and how often the event is reported across drugs. This prevents the drug-ADR associations from being biased by overrepresented ADRs (e.g. headache, nausea) or drugs (e.g. paracetamol, ibuprofen). To assess significance, an LRT critical value is calculated for every drug using an empirical Monte Carlo simulation, similar to the one implemented by openFDA.
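The following sketch shows one common form of the LRT statistic and a Monte Carlo critical value for a single drug. It is a simplified illustration under stated assumptions, not the production implementation: counts are modelled as Poisson, only over-reporting is scored, the null distribution is a multinomial draw over the event marginals, and the function names, the 95% quantile, the number of simulations and the toy counts are all hypothetical.

```python
import math
import random
from collections import Counter


def log_lr(a, drug_total, event_total, grand_total):
    """Log likelihood ratio for one drug-event pair.

    a: reports of the event with the drug; the other arguments are the
    drug, event and overall totals from the same contingency table.
    """
    c = max(event_total - a, 0)          # event with other drugs (guard for toy tables)
    rest = grand_total - drug_total      # reports for all other drugs
    if drug_total == 0 or rest == 0:
        return 0.0
    p, q = a / drug_total, c / rest      # event rate with the drug vs. elsewhere
    if p <= q:                           # one-sided: score over-reporting only
        return 0.0
    p0 = event_total / grand_total       # common rate under the null hypothesis
    tail = c * math.log(q / p0) if c > 0 else 0.0
    return a * math.log(p / p0) + tail


def critical_value(drug_total, event_totals, grand_total,
                   n_sim=500, quantile=0.95, seed=0):
    """Empirical quantile of the maximum logLR under the null for one drug."""
    rng = random.Random(seed)
    events = list(event_totals)
    weights = [event_totals[e] for e in events]
    maxima = []
    for _ in range(n_sim):
        # Redistribute the drug's reports across events by their overall frequency.
        sim = Counter(rng.choices(events, weights=weights, k=drug_total))
        maxima.append(max(log_lr(sim[e], drug_total, event_totals[e], grand_total)
                          for e in events))
    maxima.sort()
    return maxima[min(int(quantile * n_sim), n_sim - 1)]


# Toy counts (made up): counts[drug][event] = number of reports.
counts = {
    "drug_a": {"headache": 30, "nausea": 25, "rash": 40},
    "drug_b": {"headache": 300, "nausea": 280, "rash": 5},
    "drug_c": {"headache": 270, "nausea": 295, "rash": 5},
}
event_totals = Counter()
for per_event in counts.values():
    event_totals.update(per_event)
grand_total = sum(event_totals.values())

drug = "drug_a"
drug_total = sum(counts[drug].values())
threshold = critical_value(drug_total, event_totals, grand_total)
scores = {e: log_lr(n, drug_total, event_totals[e], grand_total)
          for e, n in counts[drug].items()}
significant = [e for e, s in scores.items() if s > threshold]
print(significant)
```

In the toy table, headache and nausea are frequent for every drug, so they score zero for drug_a despite their raw counts, while rash stands out because it is rare elsewhere.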

Due to the nature of the surveillance reports, it's relatively common for the indication for which a drug was prescribed to appear in the list of significant ADRs. Given the current structure of the data provided in a FAERS report, we cannot distinguish whether this reflects a problem with the dosage at which the drug was prescribed or an excessive phenotypic characterisation of the patient in the report.

All pharmacovigilance data is available for download on our data downloads page.


Huang L, Zalkikar J, Tiwari RC. Likelihood ratio test-based method for signal detection in drug classes using FDA's AERS database. J Biopharm Stat. 2013;23(1):178-200. doi: 10.1080/10543406.2013.736810. PMID: 23331230.

Maciejewski M, Lounkine E, Whitebread S, Farmer P, DuMouchel W, Shoichet BK, Urban L. Reverse translation of adverse event reports paves the way for de-risking preclinical off-targets. Elife. 2017 Aug 8;6:e25818. doi: 10.7554/eLife.25818. PMID: 28786378; PMCID: PMC5548487.