How RAI’s Hyper Media News aggregation system keeps staff on top of the news

Speaker(s): Maurizio Montagnuolo

  • Language: English
  • Event type: Talk
  • Date: Tuesday 10 July 2012
  • Time: 11:40
  • Duration: 40 minutes
  • Venue: Uni Mail S160
Target audience: Professionals


This presentation introduces the RAI Hyper Media News aggregation system for managing news streams from different media sources. Information streams from both television and the internet are automatically acquired, aggregated into topics and indexed to provide more integrated access to the material. The core algorithm is based on the Apache OpenNLP library for text analysis and a novel similarity function for clustering. Several models are available to suit different languages, and training tools are provided for building new language models, making the library extensible and adaptable to many kinds of needs. News topics are contextualized with automatically extracted information such as entities, temporal span, categorical topics, social network popularity and audience scores. All of this is indexed using the Apache Solr engine, providing unified search and browse services for any web user. Additional resources from professional repositories (such as broadcasters’ archives) can be accessed as well.
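
The aggregation step described above, grouping incoming news items into topics by similarity, can be illustrated with a minimal sketch. The abstract does not specify the system's similarity function, so this example substitutes plain cosine similarity over term counts and a greedy single-pass grouping. It does not use Apache OpenNLP or Apache Solr, and the names `tokenize`, `cosine`, `cluster_items` and the `threshold` value are hypothetical, chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into word tokens (a stand-in for a real NLP tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_items(items, threshold=0.3):
    """Greedy single-pass grouping (an assumption, not the system's algorithm):
    attach each item to the most similar existing topic, or start a new topic
    when no topic reaches the similarity threshold."""
    topics = []  # each topic holds summed term counts and member indices
    for index, text in enumerate(items):
        vec = Counter(tokenize(text))
        best, best_sim = None, threshold
        for topic in topics:
            sim = cosine(vec, topic["centroid"])
            if sim >= best_sim:
                best, best_sim = topic, sim
        if best is None:
            topics.append({"centroid": vec.copy(), "members": [index]})
        else:
            best["centroid"].update(vec)  # add term counts to the topic
            best["members"].append(index)
    return [topic["members"] for topic in topics]


items = [
    "Government announces new budget plan",
    "New budget plan announced by the government",
    "Football team wins the championship final",
    "Championship final won by local football team",
]
topic_members = cluster_items(items)
print(topic_members)
```

With these four sample headlines the two budget items fall into one topic and the two football items into another. A production system would add language-specific analysis and a richer similarity measure, as the abstract indicates.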


Dr. Maurizio Montagnuolo received his Laurea degree in Telecommunications Engineering from the Polytechnic of Turin in 2004, after developing his thesis at RAI. In 2008 he received his Ph.D. in "Business and Management" from the University of Turin. His initial work was in the area of artificial intelligence, and he was involved in research projects on the automatic classification and characterisation of television genres. His current research interests are Web and multimedia data mining.

Attached documents

RAI's Hyper Media News system

How RAI’s Hyper Media News aggregation system keeps staff on top of the news