Making it… thunder? Notes from the inaugural JupyterCon

UC Berkeley’s Introduction To Data Science course has just become its fastest growing course – ever. The class has 1,200 students enrolled, representing more than 60 different majors. Under the hood of the course? Jupyter notebooks, used to power the interactive textbook students are using to learn concepts of programming and data science. Classes there began in sync with the inaugural Jupyter Conference, which took place August 22-25, 2017 in New York City. Here are some of my key conference takeaways.

The community is growing rapidly

Growth in Berkeley’s data science class is mirrored by growth in the adoption of Jupyter notebooks for data science use. The conference saw about 700 attendees, exceeding organizers’ expectations, with users representing a broad variety of sectors such as aviation, IT, banking, and research. Globally, there are an estimated 6-8 million Jupyter notebook users, and applications for the platform are increasing rapidly to match this demand.

Jupyter in education

  • Demba Ba at Harvard utilized Jupyter notebooks deployed on AWS to support several classroom projects, such as one that predicted the outcome of a free-throw based on sensor data. Through improved system engineering, Demba was able to drive down the costs for students to use the cloud-based tool to $3/student/month.

  • Fast.ai used Jupyter notebooks to teach 50,000 students about deep learning.

Jupyter in research

  • Zach Sailer from the University of Oregon highlighted how features of Jupyter, such as interactive widgets, facilitate collaboration between computationalists and experimentalists.

  • UC Berkeley is using Jupyter notebooks to bring interactivity and iterative exploration to the traditional batch-based high-performance computing workflow.

Jupyter in industry

  • Corporations are adopting Jupyter notebooks to analyze, share, and publish their analyses. Bloomberg had a large presence at the conference and contributes regularly to the Jupyter project. Microsoft makes Jupyter notebooks freely available on Azure and gave a presentation on using notebooks to prototype applications with its partners.

  • A number of data analytics companies, such as Continuum Analytics and Domino, were also present as major sponsors of the conference, showcasing the integration of Jupyter notebooks into their data science platforms.

Jupyter maturing

For many attendees, the conference highlight was a presentation from the developers of JupyterLab (the next generation of Jupyter) on what users can expect. The beta version of JupyterLab should be out soon, and version 1.0 is planned for later in 2017. The demo presented several gasp-inducing new features, such as real-time collaboration and the ability to easily open and seamlessly navigate millions of data rows. JupyterLab has already had more development time put into it than the original notebooks, which should help make the user interface even more intuitive.

As Jupyter grows and users come up with an increasingly diverse set of applications for the platform, it is not yet clear what its sweet spot will be. As Fernando Perez, founder of Jupyter notebooks, pointed out, users will need to learn to balance what can be done with Jupyter notebooks against what should be done with them.

What next?

The next JupyterCon is scheduled for August 21-24, 2018 in New York City.

Anyone who would like to try out Jupyter notebooks is invited to do so by logging in at cybera.syzygy.ca. Cybera is collaborating with the Pacific Institute for the Mathematical Sciences (PIMS) and Compute Canada to deliver notebooks as broadly as possible, and several environments have been set up at post-secondary institutions across Canada. We’re also interested in deploying JupyterHub to enable custom environments for easy collaboration.

About Jupyter notebooks

When Fernando Perez started Jupyter notebooks, he wanted to create a programming and analysis platform that is human-centered and interactive. Notebooks allow users to code and produce analysis outputs directly in-line with fully marked up content. This provides an interactive environment for iterating through code and analyses. Find out more on our website.