In this article, you will learn how news articles are tagged with companies and topics in order to generate data in the NewsFlow and Materiality modules.
Datamaran uses Natural Language Processing (NLP) to analyze the text of news articles from selected online media.
The analysis produces two main types of data:
- Articles tagged with a company that we track - the output will be the number of articles mentioning that company.
- Articles tagged with a topic that we track - the output will be the number of articles mentioning that topic.
For each company and topic, our data scientists define a list of inclusion and exclusion terms, along with rules that allow our AI to determine whether an article mentions a company in its title. When it does, the analysis treats the company as a primary subject of the article.
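To make the idea of inclusion and exclusion terms more concrete, here is a minimal Python sketch. It is an illustration only, not Datamaran's actual implementation: the `TagRule` class, the helper functions, and the example terms are all hypothetical, and the real rules are likely more elaborate than simple whole-word matching.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field


@dataclass
class TagRule:
    """Hypothetical rule for one company or topic (names are assumptions)."""
    name: str
    include: list[str]
    exclude: list[str] = field(default_factory=list)


def _contains(text: str, term: str) -> bool:
    # Whole-word, case-insensitive match, so "Amazon" does not hit "Amazonian".
    pattern = rf"\b{re.escape(term)}\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def matches(rule: TagRule, text: str) -> bool:
    """True if at least one inclusion term appears and no exclusion term does."""
    if any(_contains(text, term) for term in rule.exclude):
        return False
    return any(_contains(text, term) for term in rule.include)


def is_primary_subject(rule: TagRule, title: str) -> bool:
    """Assumed title rule: the company is a primary subject if the title matches."""
    return matches(rule, title)


# Example terms invented for illustration only.
amazon = TagRule(
    name="Amazon",
    include=["Amazon"],
    exclude=["Amazon rainforest", "Amazon river"],
)
```

With this sketch, `is_primary_subject(amazon, "Amazon announces new climate pledge")` evaluates to true, while a title containing "Amazon rainforest" is rejected by the exclusion list.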
Machine learning is also used to identify the context in which a company or topic is mentioned. For example, Amazon the company is excluded if the context shows that the article is actually about the Amazon rainforest.
The same article can be tagged with several companies or several topics. However, to be included in the results of the NewsFlow or Materiality modules, an article must be tagged with at least one company AND one topic.
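The inclusion rule and the article counts described above can be sketched as follows. This is a simplified illustration under stated assumptions: the dictionary layout, the function names, and the sample companies and topics are hypothetical, and it assumes that counts are computed only over articles that pass the "at least one company AND one topic" check.

```python
from collections import Counter


def is_eligible(article: dict) -> bool:
    """An article qualifies only if it has at least one company AND one topic tag."""
    return bool(article["companies"]) and bool(article["topics"])


def count_mentions(articles: list) -> tuple:
    """Count eligible articles per company and per topic.

    Each article contributes at most once to a given company or topic,
    even if the tag appears more than once in its tag list.
    """
    company_counts = Counter()
    topic_counts = Counter()
    for article in articles:
        if not is_eligible(article):
            continue  # skipped: missing either a company tag or a topic tag
        company_counts.update(set(article["companies"]))
        topic_counts.update(set(article["topics"]))
    return company_counts, topic_counts


# Made-up sample data for illustration only.
articles = [
    {"title": "A", "companies": ["Acme"], "topics": ["Climate change"]},
    {"title": "B", "companies": ["Acme", "Globex"], "topics": ["Climate change", "Data privacy"]},
    {"title": "C", "companies": ["Globex"], "topics": []},         # no topic: excluded
    {"title": "D", "companies": [], "topics": ["Data privacy"]},   # no company: excluded
]

company_counts, topic_counts = count_mentions(articles)
```

In this sample, articles C and D are left out of the results because each is missing one of the two required tag types, while article B counts toward two companies and two topics at once.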