In advance of our “Big Data Europe Workshop: The challenges of big data for societies in a changing world” taking place on 18 November 2015 in Luxembourg, CESSDA and Semantic Web Company (SwC) organised a hangout on 13 October to discuss this topic more informally. The hangout began with an introduction to BDE by Ivana Versic (CESSDA), followed by a presentation on the challenges, dimensions and opportunities of big data by Martin Kaltenböck (SwC) and more information on requirements elicitation from Timea Turdean (SwC).
Ivana explained that the BDE project would produce an integrated stack of tools to manipulate, publish and use large-scale data resources. She explained that the focus of the project was twofold: first, to engage with a diverse range of stakeholder groups representing our sixth Horizon 2020 challenge (SC6), “Europe in a changing world – Inclusive, innovative and reflective societies”, and second, to design, realise and evaluate a big data aggregator platform infrastructure.
Martin started by reminding participants that, according to a study by IBM, we create 2.5 quintillion bytes of data every day, so much that 90% of the data in the world today has been created in the last two years alone. The amount of data being produced is growing exponentially, whether it comes from mobile phones, weather sensors or social networks. Statistics are also developing, and more and more data is being produced by researchers and academia. This is both a challenge and an opportunity, and it requires attention to data efficiency and to data management approaches, mechanisms and technologies. Martin went on to present what he sees as the five dimensions of big data, the five V’s: volume (amount of data), velocity (working with real-time data), variety (of sources, formats and data types), veracity (e.g. in statistics, where you have to use several sources to establish the truth of the data) and value (finding added value in data and data management). He also explained that there was a lack of data scientists and of education in data management in Europe.
Timea presented the requirements elicited so far for societal challenge six, which stemmed from three sources: the online survey carried out at the beginning of the project, interviews, and the use case pilots (being carried out in the framework of BDE). She went through the results concerning the first four V’s of big data: volume (there is not a lot of data in place at present, but it will become increasingly important in the future), velocity (useful to have, e.g. Google’s real-time flu analysis), variety (very important for economic and social science data) and veracity (ensuring that data quality is high). Comparatively speaking (across all six societal challenges), volume is not highly important (20%), velocity is even less important, and variety is over 20% for societal challenge six, she continued.
Regarding long-term preservation of data, SC6 has the infrastructure in place. Data processing is done mostly on small samples of data, which is consistent with the finding mentioned above that volume is of little importance.
A summary written by Eleanor Smith,
Senior Communications Officer of CESSDA
- You can watch the “SC6 – Hang Out: Successful data management in the Social Sciences and Humanities” on YouTube
- Come visit us at the European Data Forum 2015 (EDF2015), from 16 to 17 November in Luxembourg:
Read about our “Big Data Europe Workshop: The challenges of big data for societies in a changing world” taking place on 18 November 2015 in Luxembourg here (fully booked, waiting list).