MorphoSphere Project

Smart, interactive repository for digitized morphology

Imaging techniques are among the major drivers of large-scale data production at neutron and photon facilities. Many petabytes of raw data are currently being generated, creating a backlog in the analysis process and slowing down scientific progress in many application areas. To overcome this critical bottleneck, MorphoSphere develops a powerful data management architecture for a distributed, smart and interactive Storage & Analysis Repository of Digitized Morphology. It serves as a pilot project for integrative data platforms within the triangle of the data-generating photon, neutron and ion (PNI) user facilities, large-scale data facilities (LSDF) and high-performance computing (HPC) systems, and specific scientific communities. It aims to optimize not only the management of data transfer, processing, and storage, but also computational analysis, visualization, and public availability, thereby significantly promoting scientific output.

MorphoSphere will develop innovative approaches for feeding distributed repositories from federated data generation. The further integration of memory-distributed computing and machine learning into these repositories will allow the full information contained in large 3D datasets to be exploited. This will enable smart, autonomous 3D image reconstruction as well as morphological and morphometric analysis, supported by AI based on distributed and federated learning, even for very large volumetric datasets and sample series from high-throughput experiments.

Initially focusing on the needs of the life sciences community for morphological studies, the MorphoSphere architecture can later be transferred and adapted to integrate further domain-specific requirements, e.g. from the materials research community. It will facilitate the correlation of morphological data with other types of data, e.g. genetic or environmental information, thus fostering collaboration between different scientific communities and contributing to data democratization at the national level and beyond.

In this research project, the University Computing Centre is focusing on the development and provision of a distributed data management system. In particular, the system must give researchers efficient access to the data and support high-performance data transfers. Various concepts and solutions developed in different communities are being used for this purpose. The developments are being carried out in collaboration with the project partners at KIT and DESY and are intended to be made available for research on a long-term basis.

STATUS AND DURATION

The project will run from 2025-11 to 2028-10.

PROJECT PARTNERS

  • Engineering Mathematics and Computing Lab (EMCL), Prof. Dr. Vincent Heuveline, IWR, Heidelberg University
  • Laboratorium für Applikationen der Synchrotronstrahlung (LAS), Prof. Dr. Tilo Baumbach, Karlsruhe Institute of Technology
  • Service-Bereich Future IT - Research and Education (FIRE), Dr. Martin Baumann, University Computing Centre, Heidelberg University
  • Deutsches Elektronen-Synchrotron DESY, Martin Gasthuber

FUNDING

The project is funded by the Federal Ministry of Research, Technology and Space (BMFTR).