Google BigQuery

To support more complex queries and advanced informatics workflows that use Google Cloud services, Open Targets Platform data is also available as a Google Cloud public dataset via our Google BigQuery instance, open-targets-prod.

What is Google BigQuery?

Google BigQuery is a data warehouse that lets researchers run fast, asynchronous SQL queries on Google's cloud infrastructure. After running a query, you can either export the results in various formats or copy them into a Google Cloud bucket for downstream analyses.

Open Targets Platform data is publicly accessible as a Google Cloud public dataset. Users only pay for the queries they perform on the data, and through this program, the first 1 TB per month is free.
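
As a rough way to check a query against the free allowance before running it, BigQuery supports dry runs, which report the number of bytes a query would process without billing for it. The sketch below assumes the google-cloud-bigquery Python package, application default credentials and a default Google Cloud project of your own. The helper names, and the reading of 1 TB as 2**40 bytes, are assumptions made for illustration.

```python
# Assumption: the monthly free allowance of 1 TB is treated as 2**40 bytes.
FREE_TIER_BYTES = 2**40


def free_tier_fraction(bytes_processed: int) -> float:
    """Return the share of the monthly free allowance a query would use."""
    return bytes_processed / FREE_TIER_BYTES


def estimate_query_bytes(sql: str) -> int:
    """Dry-run a query and return the bytes it would process.

    Needs the google-cloud-bigquery package plus credentials and a default
    project; the import is local so the helper above works without them.
    """
    from google.cloud import bigquery

    client = bigquery.Client()
    # dry_run=True validates and sizes the query without executing it.
    config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    job = client.query(sql, job_config=config)
    return job.total_bytes_processed
```

For example, free_tier_fraction(estimate_query_bytes(sql)) gives the approximate fraction of the monthly allowance that a single run of sql would consume.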

BigQuery access points

We have uploaded all of our data to Google BigQuery. You can run queries via:

  • Cloud Console

  • Command line bq tool

  • Client libraries, including Python

For more information on BigQuery, please review the BigQuery documentation.

Example BigQuery SQL queries

Below is a sample query that uses our association_overall_direct dataset to return a list of targets associated with psoriasis (EFO_0000676) and the overall association score.

SELECT
  associations.targetId AS target_id,
  targets.approvedSymbol AS target_approved_symbol,
  associations.diseaseId AS disease_id,
  diseases.name AS disease_name,
  associations.score AS overall_association_score
FROM
  `open-targets-prod.platform.association_overall_direct` AS associations
JOIN
  `open-targets-prod.platform.disease` AS diseases
ON
  associations.diseaseId = diseases.id
JOIN
  `open-targets-prod.platform.target` AS targets
ON
  associations.targetId = targets.id
WHERE
  associations.diseaseId='EFO_0000676'
ORDER BY
  associations.score DESC
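
The same query can also be run through the Python client library, with the disease ID passed as a query parameter rather than written into the SQL. This is a minimal sketch, assuming the google-cloud-bigquery package, application default credentials and a default project of your own; the function name fetch_associations and the parameter name disease_id are hypothetical.

```python
from typing import Any, Dict, List

# Same tables and columns as the query above; only the disease ID is a parameter.
ASSOCIATION_SQL = """
SELECT
  associations.targetId AS target_id,
  targets.approvedSymbol AS target_approved_symbol,
  associations.diseaseId AS disease_id,
  diseases.name AS disease_name,
  associations.score AS overall_association_score
FROM
  `open-targets-prod.platform.association_overall_direct` AS associations
JOIN
  `open-targets-prod.platform.disease` AS diseases
ON
  associations.diseaseId = diseases.id
JOIN
  `open-targets-prod.platform.target` AS targets
ON
  associations.targetId = targets.id
WHERE
  associations.diseaseId = @disease_id
ORDER BY
  associations.score DESC
"""


def fetch_associations(disease_id: str) -> List[Dict[str, Any]]:
    """Run the association query for one disease and return rows as dicts."""
    # Local import: requires google-cloud-bigquery and valid credentials.
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("disease_id", "STRING", disease_id),
        ]
    )
    rows = client.query(ASSOCIATION_SQL, job_config=job_config).result()
    return [dict(row.items()) for row in rows]
```

Calling fetch_associations("EFO_0000676") would request the psoriasis associations shown above; other disease IDs can be passed without editing the SQL text.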

Similarly, you can use our drug_molecule dataset and pass a list of drug trade names to find relevant information:

DECLARE
  my_drug_list ARRAY<STRING>;
SET
  my_drug_list = [ 'Premarin',
  'Calcium disodium versenate',
  'Keytruda',
  'Vioxx',
  'Humira' ];
SELECT
  id AS drug_id,
  name AS drug_chembl_name,
  tradeNameList.element AS drug_trade_name,
  drugType AS drug_type,
  isApproved AS drug_is_approved,
  blackBoxWarning AS drug_blackbox_warning,
  hasBeenWithdrawn AS drug_withdrawn,
FROM
  `open-targets-prod.platform.drug_molecule`,
  UNNEST (tradeNames.list) AS tradeNameList
WHERE
  (tradeNameList.element) IN UNNEST(my_drug_list)
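
When the trade names come from a Python program, the DECLARE and SET statements can be replaced with an array query parameter. The sketch below assumes the google-cloud-bigquery package and your own credentials and default project. The column and field names are copied from the query above and may differ between data releases, and the helper names are hypothetical.

```python
from typing import Any, Dict, List

# Same query as above, with the declared variable replaced by @my_drug_list.
DRUG_SQL = """
SELECT
  id AS drug_id,
  name AS drug_chembl_name,
  tradeNameList.element AS drug_trade_name,
  drugType AS drug_type,
  isApproved AS drug_is_approved,
  blackBoxWarning AS drug_blackbox_warning,
  hasBeenWithdrawn AS drug_withdrawn
FROM
  `open-targets-prod.platform.drug_molecule`,
  UNNEST (tradeNames.list) AS tradeNameList
WHERE
  tradeNameList.element IN UNNEST(@my_drug_list)
"""


def unique_names(names: List[str]) -> List[str]:
    """Drop duplicate trade names while keeping their original order."""
    return list(dict.fromkeys(names))


def fetch_drugs(trade_names: List[str]) -> List[Dict[str, Any]]:
    """Look up drugs by trade name and return rows as dicts."""
    # Local import: requires google-cloud-bigquery and valid credentials.
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ArrayQueryParameter(
                "my_drug_list", "STRING", unique_names(trade_names)
            ),
        ]
    )
    rows = client.query(DRUG_SQL, job_config=job_config).result()
    return [dict(row.items()) for row in rows]
```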

Tutorials and how-to guides

For more information on how to use BigQuery to access Platform data and example queries based on actual use cases and research questions, check out the Open Targets Community and our Google Cloud dataset homepage.


Last updated 3 months ago