Innovative data science harnessing the spirit of Japanese poetry

The Swiss Data Science Center, run jointly by EPFL (ETH Lausanne) and ETH Zurich, is off to a successful start. In September, scientists at the Center launched their open source platform, Renga. The first research projects have been chosen.

As in Renga, the Japanese poetry form after which it is named, collaboration is the focus of the new Swiss Data Science Center platform. (Image: Colourbox / ETH Zurich)

Renga is a form of Japanese collaborative poetry involving multiple writers who alternate verse by verse. The individual contributions are then joined together to create a full poem. Renga is also the name of a new open software platform that works according to the same principle. Researchers from natural, technical and social sciences work in collaboration with data scientists to create new solutions. The platform was developed by scientists at the Swiss Data Science Center (SDSC). This centre – jointly run by EPFL and ETH Zurich – serves as a bridge between researchers who produce data and those who develop new data analysis and data system technologies.

Sharing data and increasing knowledge

The new cloud-hosted program is designed for a broad variety of analyses. Researchers from a wide range of disciplines can use it to store and analyse organised, calibrated and – if necessary – anonymised data. The benefit of this approach is that the platform grows with each new research project, which enriches it with valuable data, methods and results that are then made available to other scientists. Data scientists can also use the data for their own research, facilitating an exchange of solutions and enabling new findings and further research projects. As well as promoting multi-disciplinary cooperation, this fosters scientific transparency and method development.

The platform will be online and available to scientists in 2018. Companies can benefit as well: during a special industry day, those in attendance discovered how it is possible to quickly apply data science in their respective sectors. In future, businesses will be able to run their data through the platform and rely on the analytical expertise of the Swiss Data Science Center’s data and IT specialists.

Uniting and supporting skills

“The platform we have built is unique in the way it promotes performance and excellence in research, in an entirely practical way while supporting the adoption of open data science,” states Olivier Verscheure, Executive Director of the SDSC. One particular challenge is to make the platform user-friendly. After all, this is a platform designed not only for IT and data specialists, but for other researchers as well. Andreas Krause, Professor of Computer Science at ETH Zurich and Co-Director of the Data Science Center, is also quick to extol the project's virtues: “We combine data science methods such as machine learning and statistics with the expertise of data-rich disciplines like life and environmental sciences.”

Selected projects

Like the Renga platform, the funding contributions of the Swiss Data Science Center are aimed at promoting interdisciplinary projects focusing on data science. Funding lasts for two years and is pegged at CHF 300,000 to 600,000 per project. The first call for proposals was recently concluded. A total of 74 research teams from the ETH Domain (ETH, EPFL, PSI, WSL, Empa, Eawag) applied, representing ten different disciplines. The SDSC selected eight projects, four of which include researchers from ETH Zurich. The ETH projects are primarily drawn from environmental sciences and health sciences, with deep learning being one of the major components.

Deep learning is a form of machine learning that uses artificial neural networks to recognise typical patterns in large data volumes. Neural networks can recognise patterns and objects in images, and they can also generate images, such as pictures of faces or interiors, that are almost impossible to tell apart from real ones.
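
To make the idea of an artificial neural network more concrete, here is a minimal Python sketch. It is not taken from the SDSC projects or the Renga platform, and the function names and weight values are illustrative assumptions. The tiny two-layer network recognises one simple pattern, "exactly one of the two inputs is switched on". Its weights are set by hand here, whereas a real deep learning system would learn them from large volumes of data.

```python
def relu(value):
    """Rectified linear unit: pass positive values through, clip negatives to zero."""
    return max(0.0, value)


def tiny_network(x1, x2):
    """Forward pass of a two-layer network with hand-picked (not learned) weights."""
    # Hidden layer: two neurons, each a weighted sum of the inputs plus a bias.
    h1 = relu(1.0 * x1 + 1.0 * x2)        # counts how many inputs are on
    h2 = relu(1.0 * x1 + 1.0 * x2 - 1.0)  # fires only when both inputs are on
    # Output layer: a weighted sum of the hidden activations.
    return 1.0 * h1 - 2.0 * h2


if __name__ == "__main__":
    for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        print(x1, x2, "->", tiny_network(x1, x2))
```

The output is 1.0 when exactly one input is on and 0.0 otherwise. Networks used for image recognition follow the same principle of layered weighted sums, only with many more layers and neurons.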

Recognising images – from the cosmic to the microscopic

For their SDSC project, Thomas Hofmann, Professor of Data Analysis at ETH Zurich, and Alexandre Refregier of the Institute for Particle Physics and Astrophysics apply these methods to the field of cosmology. This includes comparing model projections for the distribution of mass in the cosmos and checking the validity of scientific findings. In the current issue of Science Astronomy, they demonstrate how neural networks can be used to generate images of the cosmos that correctly depict its complex structures and patterns. Machine learning methods thus expand the cosmology toolkit, offering new ways to investigate the evolution of the universe.

The same method informs the project run by Ender Konukoglu, Professor of Biomedical Image Computing at ETH Zurich, and Anne Bonnin of ETH Zurich’s Chair for X-Ray Imaging. However, they are not looking to create a portrait of the universe, but rather microscopic images in the fields of medicine and biology. They are hoping to achieve detailed image analysis and a better understanding of biological relationships.

National initiative for data sciences

The Swiss Data Science Center is a joint initiative run by ETH Lausanne and ETH Zurich. It was established in January 2017 with the goal of enabling and promoting multidisciplinary collaboration in data science.

Data science is among ETH’s strategic research fields for 2017 to 2020. Since September 2017, EPFL and ETH Zurich have offered Master’s degree programmes in data science, which will result in further cooperation with the Swiss Data Science Center (SDSC).
