Event Detection and User Interest Discovery using Data Streams from Social Media
Given the rate at which social media and its influence are evolving, it is essential to monitor and study microblogs and microbloggers. Microblogs enable users to post, reply to, broadcast and share posts. Event detection, user interest discovery and data streams are three important research topics in data mining and big data, and detecting events in the enormous amount of available data is difficult. Data streams, as the name suggests, are continuously flowing data, so the domain deals with a real-time inflow and outflow of data that is not fixed. This paper compares various algorithms that can be applied to event detection in social media data streams. The HEE model (Hot Event Evolution Model) not only takes the user interest distribution into consideration but also uses microblog posts to discover user interest. TD-ATM jointly learns two sets of topics on two data sets and automatically couples the topic parameters to avoid potential inconsistencies between them. Experimental results show that the proposed approach outperforms several existing Web service clustering approaches. The EVE model comprises HITS, the PLSA model and the EM algorithm. HITS (Hypertext-Induced Topic Search) eliminates unnecessary posts and separates out the influential ones, while the PLSA model finds the latent events in the document streams. The EM algorithm is responsible for training the parameters. The BEE (Bursty Event Detection) model detects bursty events by analyzing short-text datasets. The EDO method detects events in a specific time interval. It contains five modules: tokenization, graph generation, graph pruning, clustering and event evolution.
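To make the HITS filtering step mentioned above concrete, the following is a minimal sketch of the standard HITS power iteration applied to a small interaction graph. It is not the EVE model's implementation. The edge format (a user pointing to a post they replied to or reposted), the function names hits and influential_posts, and the sample data are all assumptions made for illustration.

```python
from math import sqrt


def hits(edges, iterations=50):
    """Return (hub, authority) score dicts for a directed graph.

    `edges` is a list of (source, target) pairs. Here we ASSUME an edge
    means "this user replied to or reposted this post"; the abstract
    does not specify how the graph is built.
    """
    nodes = sorted({node for edge in edges for node in edge})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}

    for _ in range(iterations):
        # Authority update: sum the hub scores of nodes pointing at each node.
        new_auth = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_auth[dst] += hub[src]
        norm = sqrt(sum(v * v for v in new_auth.values())) or 1.0
        auth = {n: v / norm for n, v in new_auth.items()}

        # Hub update: sum the authority scores of nodes each node points at.
        new_hub = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_hub[src] += auth[dst]
        norm = sqrt(sum(v * v for v in new_hub.values())) or 1.0
        hub = {n: v / norm for n, v in new_hub.items()}

    return hub, auth


def influential_posts(edges, top_k=2):
    """Keep the top_k nodes by authority score (hypothetical filter rule)."""
    _, auth = hits(edges)
    return sorted(auth, key=lambda n: (-auth[n], n))[:top_k]


# Hypothetical sample data: users u1..u4 interacting with posts p1..p3.
sample_edges = [
    ("u1", "p1"), ("u2", "p1"), ("u3", "p1"),
    ("u3", "p2"),
    ("u4", "p3"),
]
top = influential_posts(sample_edges, top_k=2)
```

In this sketch a post counts as influential when it has a high authority score, that is, when it is referenced by users who themselves reference other well-referenced posts. A real pipeline would need to choose its own graph construction and cut-off, which the abstract leaves open.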