Pearl-fishing in the sea of data

Data is treasure: new worlds open up to organisations that understand how to scoop useful information out of unstructured data. At the President of ETH Zurich’s "Lokaltermin", researchers and business representatives revealed how data could serve a broad public in future.

ETH Zurich’s president Ralph Eichler invited guests to the Lokaltermin on big data. (Photo: Tom Kawara)

What are 130 representatives from industry, research and politics doing in a bare concrete hall five floors down in the new LEE building in Zurich’s Leonhardstrasse? Seeking proximity to data; incredible amounts of data. For after the completion of the building in June, numerous computer towers will be storing and processing data here. This is known as “big data” – and the two-hour Lokaltermin, to which the President of ETH Zurich, Ralph Eichler, and the ETH Zurich Foundation invited guests last Wednesday, examined the opportunities and risks involved in the active use of swelling seas of data.

Technological revolution

Dirk Helbing, an ETH-Zurich professor of sociology, illustrated the sheer flood of information generated worldwide these days with some impressive statistics: more data is now produced in a single year than in all of previous human history. The processing power of computers doubles every eighteen months, and in ten to fifteen years computers with the performance capability of the human brain will be available. “We are in the throes of a technological revolution,” said Helbing. “And this will also stir up the job market.”

For Helbing, this sea of data (emails, tweets, YouTube videos, business reports, sensor readings and images) harbours tremendous opportunities: simulations fed with data from financial markets and the real economy, and on epidemics, conflicts and environmental change, could show in real time how political, economic and ecological decisions affect our future world. “Big data will help us to make more informed decisions,” Helbing is convinced.

Big data in medicine

Science is already practically inconceivable without big data. Of the six research projects that competed until January 2013 to become EU flagship projects, each worth EUR one billion in research funding, three promised new results based on the analysis of huge amounts of data. Whether in neuroscience, cancer research or sociology: in future, most disciplines will at least partially become big-data sciences, too.

Medicine is taking a keen interest in big data at the moment, as Joachim M. Buhmann, an ETH-Zurich professor of computer science, explained in his opening presentation. Medical practitioners collect vast amounts of data for diagnoses and prognoses, most of which is archived after a partial evaluation and never used again. This is set to change: Buhmann is developing systems that combine patient data with physicians’ diagnostic, prognostic and therapeutic data. Algorithms could then, for instance, locate the key information in images of tissue samples and compare it with thousands of similar datasets, or search for so-called biomarkers, which provide indications of certain diseases, such as cancer. The aim is to condense the information until as many reliable statements as possible can be made about the progression of a disease and a treatment’s prospects of success. Buhmann called for the establishment of a “health data science” with experts who know how to handle such data.
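The article does not describe Buhmann’s systems in detail, but the comparison it sketches, matching a new tissue sample against thousands of archived cases and condensing their outcomes into a prognosis, resembles a nearest-neighbour search over feature vectors. A minimal illustrative sketch (the function names, the toy Euclidean similarity and the binary outcomes are all assumptions, not the actual method):

```python
import math

def nearest_outcomes(query, archive, k=3):
    """Return the outcomes of the k archived samples most similar to query.

    archive is a list of (feature_vector, outcome) pairs; similarity here
    is plain Euclidean distance over the feature vectors.
    """
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    ranked = sorted(archive, key=lambda item: dist(query, item[0]))
    return [outcome for _, outcome in ranked[:k]]

def prognosis(query, archive, k=3):
    """Fraction of the k nearest archived cases with a favourable outcome (1)."""
    outcomes = nearest_outcomes(query, archive, k)
    return sum(outcomes) / len(outcomes)
```

In a real system the feature vectors would come from image analysis of the tissue samples and the archive would hold thousands of annotated cases; the point of the sketch is only the shape of the computation.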

Technical hurdles and data protection

Martin Erkens, Vice-Director and Global Head of Pharma Research & Early Development Informatics at F. Hoffmann-La Roche, however, warned against raising hopes too high: “The reality is still lagging behind the promises.” His experience in the pharmaceutical industry has taught him that many technical problems remain to be ironed out before patient data can be used meaningfully in big data applications.

In addition to the technical problems, there are also issues of data protection: at present, the data can still be evaluated within the company itself, says Erkens. This could change, however, if data volumes grow so large that companies have to rely on external help for the analyses.

In his opening presentation, Donald Kossmann, an ETH-Zurich professor of computer science, outlined possible solutions for minimising the risks involved in handling personal data. He presented technology that can combine different datasets in a “cloud” and make them available to third parties without surrendering the raw data.
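The article does not explain how Kossmann’s technology works, but one common way to let third parties query combined data without handing over raw records is to expose only aggregate answers and to suppress any answer computed from too few records. A toy sketch under that assumption (the class, the threshold and the field names are hypothetical, not the system presented at the event):

```python
MIN_GROUP = 5  # answers drawn from fewer records than this are suppressed

class AggregateView:
    """Expose a combined dataset to third parties as aggregates only,
    never as raw records (a toy threshold rule for illustration)."""

    def __init__(self, records):
        self._records = list(records)  # raw data stays inside the wrapper

    def count(self, predicate):
        """Number of matching records, or None if the group is too small."""
        n = sum(1 for r in self._records if predicate(r))
        return n if n >= MIN_GROUP else None

    def mean(self, field, predicate=lambda r: True):
        """Average of a field over matching records, suppressed for small groups."""
        values = [r[field] for r in self._records if predicate(r)]
        if len(values) < MIN_GROUP:
            return None
        return sum(values) / len(values)
```

A simple threshold like this is only a first line of defence; production systems layer on stronger guarantees, but the sketch shows the basic idea of answering queries without releasing the underlying rows.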

Data transfer out of necessity

Confidentiality is a top priority in medicine in particular. Anyone who wishes to benefit from sophisticated prognoses, however, has to contribute their own patient data to the pool. Brigitte Tag, a professor at the University of Zurich’s Chair of Criminal Law, Criminal Proceedings and Medical Law, stressed that the transfer of medical data often takes place not voluntarily but out of an emergency. Consequently, as far as she is concerned, “Trust is good, but control is better.” On the one hand, the handling of patient data needs to be clearly regulated by law; on the other, data users have to build trust that the data is actually used for the benefit of the common good. That legislation always lags one step behind today’s rapid technological development is the nature of the beast. Wherever the law does not yet reach, ethics comes into play, explained Tag.

This was recently highlighted by an example in Great Britain: when GPs fed patient data into a national database without informing the patients, the “care.data” project was suspended due to the public outcry. “Who has what data needs to be clearly regulated,” Tag urges. “And we always have to ask the question as to what the people who receive our data want.”

One way of securing our own data sovereignty was outlined by Dirk Helbing: users could have a personal digital data account, with their data freely at their disposal and deletable if need be. Personal data could then be sold via a kind of app store, as Helbing envisages it, where it could be rated for quality and made available to companies for their business ideas.

Three new big data chairs

Towards the end of the Lokaltermin, ETH-Zurich President Ralph Eichler confirmed the intention to expand “data science” at ETH Zurich further with three new chairs in the near future: one for “medical IT” in conjunction with the University Hospital Zurich, one for “information systems” and one for “social network analysis”. The knowledge on big data generated as a result ultimately stands to benefit Switzerland as an economic hub. Moreover, Eichler announced that ETH Zurich will bring a proposal before parliament for the period 2017 to 2020 aimed at standardising the data structure across all Swiss university hospitals.
