Transforming News Reports into Data with Gemini

by ai-intensify

There is abundant unstructured data about historical events in news articles, government reports, and local bulletins, but extracting this information manually is not feasible at scale. Our method analyzes news reports in which flooding is a primary topic. We then use the Google Read Aloud user-agent to isolate the primary text of articles written in 80 languages and standardize it into English with the Cloud Translation API.

The most important step of the extraction process is done using the Gemini Large Language Model (LLM). We have designed a sophisticated prompt that guides Gemini through a rigorous analytical verification process:

  • Classification: The model distinguishes reports of actual floods, whether ongoing or past, from articles that only discuss future warnings, policy meetings, or general risk modeling.
  • Temporal Reasoning: Gemini anchors relative references (for example, “last Tuesday”) to the article’s publication date to determine the exact timing of the event.
  • Spatial Precision: The system identifies granular locations (neighborhoods and streets) and maps them to standardized spatial polygons using Google Maps Platform.
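
To make the temporal step concrete, here is a minimal Python sketch of the date arithmetic involved in anchoring a relative reference to a publication date. It is not how Gemini performs this step; the function name resolve_relative_date, the handful of supported phrases, and the reading of "last Tuesday" as the most recent Tuesday strictly before publication are all assumptions made for illustration.

```python
from datetime import date, timedelta
from typing import Optional

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]


def resolve_relative_date(phrase: str, published: date) -> Optional[date]:
    """Resolve a simple relative expression against the publication date.

    Hypothetical helper: handles only "today", "yesterday" and
    "last <weekday>"; anything else returns None.
    """
    text = phrase.strip().lower()
    if text == "today":
        return published
    if text == "yesterday":
        return published - timedelta(days=1)
    if text.startswith("last "):
        name = text[len("last "):]
        if name in WEEKDAYS:
            target = WEEKDAYS.index(name)
            # Assumption: "last Tuesday" means the most recent Tuesday
            # strictly before the publication date (1 to 7 days back).
            days_back = (published.weekday() - target - 1) % 7 + 1
            return published - timedelta(days=days_back)
    return None


# An article published on Friday, 15 March 2024 that mentions "last Tuesday".
event_day = resolve_relative_date("last Tuesday", date(2024, 3, 15))
print(event_day.isoformat())
```

A language model can handle far more varied phrasing than this rule-based sketch, but the underlying operation is the same: the publication date is the anchor, and the relative phrase supplies the offset.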

GroundSource’s technical validation confirms its reliability for high-stakes research. In manual reviews, we found that 60% of the extracted events were accurate in both location and time. Importantly, 82% were accurate enough to be practically useful for real-world analysis, for example by capturing the correct administrative district or by placing the event on the same day as its reported peak.
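
The two validation figures correspond to a strict rate and a more lenient one computed over the same set of reviewed events. The Python sketch below shows one way such a tally could work. The three-level grading scheme ("exact", "useful", "wrong"), the function name accuracy_rates, and the sample grades are hypothetical and are not taken from the GroundSource review.

```python
from collections import Counter
from typing import Iterable, Tuple


def accuracy_rates(grades: Iterable[str]) -> Tuple[float, float]:
    """Return (strict, practical) accuracy from manual review grades.

    Assumed grading scheme (hypothetical):
      "exact"  - location and time are both correct
      "useful" - close enough for analysis, e.g. right district
      "wrong"  - not usable
    """
    counts = Counter(grades)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reviewed events")
    strict = counts["exact"] / total
    # The practical rate counts exact matches plus near misses.
    practical = (counts["exact"] + counts["useful"]) / total
    return strict, practical


# Made-up grades for four reviewed events, for illustration only.
sample = ["exact", "useful", "wrong", "exact"]
strict, practical = accuracy_rates(sample)
print(f"strict: {strict:.0%}, practical: {practical:.0%}")
```

Because every exact match also counts as practically useful, the practical rate can never be lower than the strict rate.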

The coverage provided by GroundSource represents a massive expansion over existing archives. By converting unstructured media into data, we generated 2.6 million events, a significant increase over the records found in traditional monitoring systems. Furthermore, spatiotemporal matching shows that GroundSource captured between 85% and 100% of the severe flood events recorded by the Global Disaster Alert and Coordination System (GDACS) between 2020 and 2026, demonstrating its effectiveness in identifying smaller, localized events as well as high-impact disasters.
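
Spatiotemporal matching of this kind can be illustrated with a short Python sketch: a reference event counts as captured if any extracted event falls within a distance threshold and a time window of it. The exact matching rules used for the GDACS comparison are not described here, so the point-based Event type, the 50 km and 3 day thresholds, and the sample coordinates are all assumptions; GroundSource itself maps locations to polygons rather than points.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt
from typing import Sequence


@dataclass(frozen=True)
class Event:
    lat: float  # degrees
    lon: float  # degrees
    day: date


def haversine_km(a: Event, b: Event) -> float:
    """Great-circle distance between two events in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (a.lat, a.lon, b.lat, b.lon))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(h))


def is_match(ref: Event, cand: Event,
             max_km: float = 50.0, max_days: int = 3) -> bool:
    """True if cand is close to ref in both time and space (assumed thresholds)."""
    close_in_time = abs((ref.day - cand.day).days) <= max_days
    return close_in_time and haversine_km(ref, cand) <= max_km


def recall(reference: Sequence[Event], extracted: Sequence[Event],
           max_km: float = 50.0, max_days: int = 3) -> float:
    """Fraction of reference events matched by at least one extracted event."""
    if not reference:
        raise ValueError("reference catalogue is empty")
    hits = sum(
        any(is_match(ref, cand, max_km, max_days) for cand in extracted)
        for ref in reference
    )
    return hits / len(reference)


# Made-up events: the first reference event has a nearby match, the second does not.
reference = [Event(0.0, 0.0, date(2021, 1, 1)), Event(10.0, 10.0, date(2021, 6, 1))]
extracted = [Event(0.1, 0.1, date(2021, 1, 2))]
print(recall(reference, extracted))
```

The pairwise comparison here is quadratic in the number of events, which is fine for a sketch; matching against millions of extracted events would call for a spatial index or for bucketing by date first.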
