A prodigious amount of data is generated daily. Wearable devices are on the rise, and sensors are embedded even in miniature tracking equipment. Should the overarching goals of the Internet of Things be fully realized, meaning that every device can transmit or receive data, the rate of data generation will soar even further. This avalanche of data not only makes data analytics more complex but also threatens individuals’ privacy. This thesis therefore focuses on both knowledge discovery and knowledge protection.
The data analytics challenges are evident across an array of industries: in telecommunications, where organizations continue to lose customers and are eager to identify the patterns that characterize such customers, and in the life sciences, where researchers strive to capture vital patterns in DNA or to better understand the human body using sensors. These challenges call for new techniques that expose more hidden patterns. One approach to addressing them is predictive analytics.
This thesis presents a wide range of novel data mining solutions for predictive analytics, ranging from location prediction through activity recognition to prediction in the medical domain, using probabilistic graphical models such as Markov chains, Hidden Markov Models and Conditional Random Fields. In addition, it explores the use of wavelets to determine similarities or correlations between time series.
Furthermore, to ensure that individuals’ privacy is not infringed, this thesis provides new, enhanced privacy-preserving data mining techniques to anonymize users and huge networks and to hide locations.
|Author||Assam, Roland Amba|
Predictive Analytics and Protection of Massive Data
The volume of data generated by human social interactions is breathtaking, data from sensors and wearables worn by people is on the rise, and machines are poised to start interacting with one another soon.
This data can be analyzed to expose valuable patterns and knowledge of great importance to society.
In this thesis, we present novel data mining predictive analytics solutions, ranging from location prediction through activity recognition to prediction in the medical domain.
On the other hand, such data also contains a huge volume of sensitive information, which raises serious privacy concerns. To mitigate these concerns, we present new, enhanced privacy-preserving data mining techniques to protect data.