Neurodesk is a flexible, cloud-based solution designed to create a teaching environment for neuroimaging data analysis




The Landscape
Researchers rely on a wide range of tools to analyse data and address complex research questions. However, in areas like neuroimaging analysis, training researchers to use these tools presents challenges. Many tools are Linux-based and difficult to install because of their complex, and often conflicting, libraries, versions, and dependencies.
A key challenge in workshops and teaching scenarios is creating standardised computing environments for participants. Ensuring consistent setups can be time-consuming and error-prone, often leading to delays and frustration for both instructors and learners. This setup difficulty detracts from the core learning objectives, and it can hinder follow-up work, as trainees may struggle to reproduce their workshop computing environments on their own.
Fields such as imaging analysis also demand significant computational power and storage capacity, so workshops are often forced to use simplified analyses and datasets because of the limitations of participants' computers.
What is Neurodesk?
To overcome these challenges, we developed Neurodesk: an accessible, flexible and portable data analysis environment for reproducible neuroimaging. The Neurodesk project, recently featured in Nature Methods, is led by Aswin Narayanan and Steffen Bollmann at the University of Queensland and involves more than 50 researchers worldwide. It is supported by a platform grant from the Australian Research Data Commons (ARDC) and a Wellcome Discretionary Award as part of the Chan Zuckerberg Initiative (CZI), as well as by cloud resources from organisations across the world, such as EGI, the Jetstream2 cloud, and AWS.
Neurodesk includes a browser-accessible virtual desktop (Figure 1), a command-line interface, and computational notebooks (Figure 2), allowing for accessible, flexible, portable and fully reproducible neuroimaging analysis on personal workstations, high-performance computers, and the cloud.

Figure 1: The interactive desktop environment lets students use all software tools in a full Linux desktop.

Figure 2: The JupyterLab environment lets students work through exercises in Jupyter notebooks that combine code and documentation.
Behind Neurodesk sits a modular, open analysis environment built around a continuous integration system that automatically builds neuroimaging software containers. By providing a separate container for each neuroimaging software package, Neurodesk enables a fully reproducible environment while avoiding dependency conflicts between tools. The system is kept lightweight by mounting containers on the fly from a worldwide network of distributed and scalable software distribution nodes running the CernVM File System (CVMFS). One of these CVMFS Stratum 1 nodes is provided through a collaboration with EGI.
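The one-container-per-tool idea can be illustrated with a minimal sketch. It assumes an Apptainer-style container runtime and a hypothetical repository layout with one directory per tool and version; the path `/cvmfs/example.repo/containers`, the `.simg` naming, and the helper functions are illustrative assumptions, not Neurodesk's actual layout or API.

```python
from pathlib import PurePosixPath

# Hypothetical CVMFS mount point, for illustration only.
CVMFS_ROOT = PurePosixPath("/cvmfs/example.repo/containers")


def container_image(tool: str, version: str) -> PurePosixPath:
    """Return the (assumed) image path for one tool at one pinned version."""
    name = f"{tool}_{version}"
    return CVMFS_ROOT / name / f"{name}.simg"


def build_command(tool: str, version: str, argv: list[str]) -> list[str]:
    """Build an `apptainer exec` call that runs argv inside the tool's own container."""
    return ["apptainer", "exec", str(container_image(tool, version)), *argv]


# Each tool resolves to its own image, so two tools with conflicting
# dependencies never share a software environment.
cmd = build_command("fsl", "6.0.7", ["bet", "in.nii.gz", "out.nii.gz"])
print(" ".join(cmd))
```

Because the tool version is part of the image path, rerunning the same command later resolves to the same software environment, which is the property that makes an analysis reproducible.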
For cloud deployment, we built on the Zero to JupyterHub project, which uses Kubernetes to deploy JupyterHub. We incorporated the CVMFS CSI driver to handle Neurodesk’s software distribution. This setup can also be deployed using K3s, a lightweight Kubernetes distribution that requires only a single virtual machine in the simplest case. This avoids the need for a full Kubernetes setup, which can be resource-intensive and difficult to maintain on OpenStack. We also employ the Longhorn project to create a distributed storage overlay. This overlay allows us to add multiple compute nodes to the K3s cluster, transparently handle user home directories moving between compute nodes, provide shared working directories accessible from all nodes, and use Longhorn's automated backup functionality.
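As a rough sketch of how these pieces could fit together, the snippet below assembles Helm values for the Zero to JupyterHub chart: per-user home directories on a Longhorn-backed storage class, plus two extra volumes for CVMFS and a shared working directory. The image name, claim names, and mount paths are hypothetical placeholders, and the storage class name `longhorn` is an assumption; this is not the configuration used in the deployments described here.

```python
import json


def hub_values(image: str, tag: str, shared_claim: str) -> dict:
    """Build illustrative Helm values for the Zero to JupyterHub chart."""
    return {
        "singleuser": {
            "image": {"name": image, "tag": tag},
            "storage": {
                # Home directories are provisioned per user; "longhorn" is
                # the assumed name of the Longhorn storage class.
                "dynamic": {"storageClass": "longhorn"},
                # Claim names below are hypothetical and must already exist.
                "extraVolumes": [
                    {"name": "cvmfs",
                     "persistentVolumeClaim": {"claimName": "cvmfs"}},
                    {"name": "shared",
                     "persistentVolumeClaim": {"claimName": shared_claim}},
                ],
                "extraVolumeMounts": [
                    # Software containers are read from CVMFS, never written.
                    {"name": "cvmfs", "mountPath": "/cvmfs", "readOnly": True},
                    # Shared directory visible to every user pod.
                    {"name": "shared", "mountPath": "/shared"},
                ],
            },
        }
    }


values = hub_values("example/neurodesktop", "latest", "shared-workdir")
print(json.dumps(values, indent=2))
```

Because JSON is a subset of YAML, the printed output could be saved to a file and passed to Helm with `--values`.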
Neurodesk Play
One example of this setup is Neurodesk Play, a freely available portal for learning about neuroimaging data analysis. It allows European researchers to quickly explore Neurodesk and learn about neuroimaging data analysis without installing any software on their computers. We are also developing an example gallery that collects common analyses and workflows and shows how to run them in Neurodesk.
Another example of this setup was recently deployed for a university course taught by Dr Vinod Kumar at the University of Tuebingen. In this course, 25 students will work with neuroimaging software throughout the winter semester (from October 2024 to mid-February 2025). For this deployment, we are using two of the largest available compute nodes on CESNET-MCC in a K3s cluster. In addition to their home directories, the students have a shared working directory where they can collaborate on analyses.

I am currently conducting a course, Statistical MR Imaging in Neuroscience, using NeuroDesk in conjunction with the EGI computational facility. This setup is exceptional for teaching MRI and fMRI statistical analysis, enabling students to work directly with code and real neuroimaging data. Through NeuroDesk’s environment on EGI, students can bypass typical setup challenges and dive straight into hands-on learning, significantly enhancing their understanding. They can modify, execute code, and work with datasets seamlessly, gaining direct, practical experience in statistical imaging analysis. In summary, the NeuroDesk-EGI synergy offers unique benefits, especially in enhancing learning outcomes in education.
Vinod Kumar, Research Scientist, Max Planck Institute for Biological Cybernetics
Previous workshop facilitators stated that such a cloud setup made their courses more interactive and that it enabled the use of realistic data analysis examples. Cost efficiency was another major benefit, as resources could be scaled down when not in use. A further benefit of using Neurodesk for such workshops is that participants can install Neurodesk on their own computers after the workshop and can keep using all the features they learned during the course on their own hardware and data.
The Neurodesk cloud-based platform has proven to be a valuable tool for teaching neuroimaging data analysis. Based on the success of the workshops we ran in the past year and the demand from other scientific domains, we are planning to expand the platform to support domains such as genomics, microscopy, and astronomy.