All information-seeking professionals need to sift through large amounts of text to retrieve the information they need so that they can stay up to date with developments in their field. Language Technology tools can help make the analyst’s work more efficient by increasing the amount of data analysed and by speeding up the process. Software tools applied to big data may additionally provide a bird’s-eye view of trends and data distributions not easily visible to the human reader.
The European Commission’s Joint Research Centre (JRC) has developed the Europe Media Monitor (EMM) family of applications, which aims to provide solutions for the daily media monitoring needs of a large variety of users working in diverse fields. The users include EU institutions, national authorities of EU Member States, international organisations outside the EU, and more. Individuals have free access to EMM through the publicly accessible EMM applications and through apps for mobile devices. EMM gathers a daily average of 220,000 online news articles in about 70 languages, classifies them into thousands of categories, groups related articles, links related news over time and across languages, extracts and disambiguates mentions of entities (persons, organisations and locations), detects spelling variants, recognises direct speech quotations, fills specific event scenario templates, and more. Due to the large scale of the effort, EMM can track topics, detect trends and act as an early warning tool. The moderation software NewsDesk allows human analysts to produce readily formatted in-house newsletters with little effort. See http://emm.newsbrief.eu/overview.html for more details on EMM.
The speaker will present the functionality of EMM, give concrete examples of how news content in different languages is complementary (national bias), and highlight both the benefits and the potential dangers of automated large-scale media monitoring.