Accelerating the Pace of Childhood Cancer Research with Big Data


The Childhood Cancer Data Lab was established by Alex’s Lemonade Stand Foundation (ALSF) in 2017. ALSF recognized that pediatric cancer researchers face hurdles that impede the pace of research. 

ALSF introduced the Data Lab to empower researchers and scientists across the globe by removing roadblocks, supporting opportunities for collaboration and sharing, and developing resources to accelerate new treatment and cure discovery.

The Data Lab's mission is to empower pediatric cancer experts poised for the next big discovery with the knowledge, data, and tools to reach it. We construct tools that make vast amounts of data widely available, easily mineable, and broadly reusable. We train researchers and scientists to better understand their own data and to advance their work more quickly.

To date, the Data Lab has trained over 200 childhood cancer researchers, harmonized over 1.3 million data samples, and made those samples easily available.



The Data Lab develops tools designed to make data and analysis widely available and broadly reusable.

Data Science Workshops

The Data Lab offers workshops to teach researchers the data science skills they need to examine their own data. Our courses focus on cutting-edge tools and analysis techniques. We ensure that participants walk away with an understanding of:

  • The R programming language, R Notebooks, and some reproducible research practices.
  • Processing bulk and single-cell RNA-seq data, from raw data all the way to downstream analyses.
  • Downstream analysis methods such as differential expression analysis and hierarchical clustering, as well as preparing publication-ready plots.

“I think anyone who is working on or near single-cell data should take this course. I am so much more confident in what I understand about single-cell analyses compared to where I was at the beginning. 10/10 recommend.”

- Jessica Elswood, Postdoctoral Associate, Baylor College of Medicine


Make a donation to support the Data Lab’s mission of putting knowledge and resources in the hands of pediatric cancer experts poised for the next big discovery. 

With your help, we can:

  • Fund innovative models to scale training workshops.
  • Offer our expertise and provide consultation on projects that will change the future for children fighting cancer.
  • Train at least 200 childhood cancer researchers over the next four years.



November 15, 2023

Git workflows for scientific projects and when we use them

Writing source code is a significant part of data-intensive biomedical research. Everything from cleaning and pre-processing data to generating publication figures can be accomplished programmatically. Increasingly, funding agencies and journals require researchers to share their code. To pick two examples, the Data Lab’s parent organization, Alex’s Lemonade Stand Foundation (ALSF), has such a requirement for awardees, and PLOS Computational Biology requires authors to make the code underlying their results and conclusions available.



September 18, 2023

I’m terrible with names…but I’m using ontologies to try to be better

There is an old joke in computer science about how there are only two hard things: cache invalidation, naming things, and off-by-one errors. I’ll leave aside the first one as beyond my own expertise, but the second comes up all the time in my work as a biological data scientist. Naming variables and functions in my code is a constant struggle, but one I have to deal with on my own or with my team. Much bigger problems come up when trying to deal with all the various ways that people across the world use names when talking about the diseases they work on, the types of cells they are looking at, the experimental methods they are using, and just about every other aspect of their studies.



August 16, 2023

Full: Data Lab Reproducible Research Practices Workshop, Philadelphia, October 24-25, 2023

Applications are open for the Data Lab's next workshop! We will be holding a Reproducible Research Practices Course in person on October 24-25, 2023. Instructors will introduce principles and techniques to achieve reproducible results in computational cancer research. We’ll show you the fundamentals of commonly used approaches in reproducibility that you can apply to increase the impact of your research by making your findings more robust and reliable! To ensure that workshop attendees have a great hands-on experience, there will be a very limited number of seats available.