refine.bio

Making vast amounts of data immediately usable and openly available to researchers across the globe.

Meet refine.bio

The vast amount of publicly available biological data can provide researchers with unique insights into complex diseases. refine.bio helps put this wealth of information to broad use by harmonizing data across many different technologies into one universal repository. This multi-organism collection of gene expression data allows researchers to search for experiments from different publicly available sources and build custom datasets that suit their research needs. The data is uniformly processed with standardized pipelines that have been selected based on their wide-ranging utility.

Researchers can avoid the painstaking work of reprocessing by downloading from this extensive collection of ready-to-use transcriptomic data.

Take me to refine.bio

refine.bio is designed to simplify things

Since refine.bio launched in 2018, users from across the globe have downloaded over 9,000 datasets, saving precious time and accelerating the pace of their research.

1.3 M Samples across 300 organisms

This online repository holds over 1.3 million samples. Researchers can discover relevant experiments across 300 different organisms and download custom datasets tailored to their project.

Flexible access

Researchers can download refine.bio datasets via their browser as well as programmatically via our Python client.

Open Source and Free

Researchers across the world can access refine.bio for free at any time. refine.bio is an open source effort, and we welcome contributors in all areas from software engineering to bioinformatics.

Better Medicine Through Machine Learning

refine.bio Compendia are designed to let researchers apply machine learning techniques to extract more information about the underlying biology. The data in refine.bio will support researchers’ efforts to better classify patients and identify which types of treatment might be most effective on a case-by-case basis, further advancing the burgeoning field of precision medicine.

Take me to refine.bio Compendia

Getting Started

refine.bio examples gives researchers access to a variety of example analyses implemented in R, such as clustering and heat maps, differential expression analysis, and pathway analysis, for use with refine.bio data.

Follow along in R Notebooks

The examples are designed so that users can download the R notebooks and follow along, performing each analysis on their own computers or on the web.

BYO Dataset

The analysis notebooks can be used with the datasets we provide, easily modified to use a different dataset more relevant to the user, or adapted further for more customized analysis.

Get the most out of refine.bio datasets

The examples enhance usability and shorten the learning curve, helping researchers get the most out of their refine.bio datasets.

Take me to the examples

Learn About Our Other Projects

OpenScPCA

OpenScPCA is an open, collaborative project to analyze data from the ScPCA Portal, which currently holds 500 samples from over 50 pediatric cancer types.

Learn More

ScPCA Portal

The Single-cell Pediatric Cancer Atlas (ScPCA) is accelerating the discovery of better treatments for pediatric solid tumors and leukemias.

Learn More

refine.bio

refine.bio is a repository of uniformly processed and normalized, ready-to-use transcriptome data from publicly available sources.

Learn More