Qatar Computing Research Institute (QCRI) has developed a system for the World Bank that automates the geocoding of all Bank-financed projects. The system, built by QCRI’s Data Analytics team, labels the Bank’s project portfolios more efficiently and effectively and places them on a map for analysis, monitoring and evaluation.
The QCRI Geotagger system augments the World Bank’s Mapping for Results initiative, a partnership with AidData, which manually geocoded all active World Bank-financed projects in 144 countries. Mapping for Results is part of the Bank’s Open Data initiative, which allows for more transparency of its activities and access to information that promises to produce new analysis, tools and solutions to development challenges.
Patrick Meier, Director of Social Innovation at QCRI, said of the institute’s involvement:
‘The Geotagger enables the World Bank and partners to turn Open Data into useable data, which will bring more transparency and accountability to the international development space. QCRI’s Data Analytics team is well placed to support the World Bank’s effort in making sense of this development data thanks to our advanced expertise in applied analytics research.’
QCRI’s Data Analytics team has built expertise in core data management challenges such as data extraction, integration and quality, with the objective of identifying new directions and techniques for enabling the effective use of data for decision-making. Leveraging this expertise, the team developed a system that accesses the World Bank’s datasets, retrieves documents, and extracts and reports relevant information on the Bank’s projects.
The system identifies locations and place names in documents from the World Bank Projects Data API using the Stanford Named Entity Recognizer and Alchemy, a text-mining platform. The place names are then geocoded using Google Geocoder, Yahoo! PlaceFinder and GeoNames, and are visualised on a map.
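The two stages described above, place-name extraction followed by geocoding, can be illustrated with a minimal Python sketch. It is not QCRI’s implementation. The gazetteer lookup is a toy stand-in for a named entity recognizer, the lookup tables are stand-ins for external geocoding services, and trying the geocoders in order until one returns a result is an assumption, since the article does not say how the three services are combined. All names and coordinates are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

Coordinates = Tuple[float, float]  # (latitude, longitude)
Geocoder = Callable[[str], Optional[Coordinates]]

# Hypothetical stand-in for a named entity recognizer: a tiny gazetteer.
KNOWN_PLACES = ("Nairobi", "Mombasa", "Kisumu")


def extract_place_names(text: str) -> List[str]:
    """Return known place names in order of first appearance in the text."""
    found = [(text.find(name), name) for name in KNOWN_PLACES if name in text]
    return [name for _, name in sorted(found)]


def geocode_with_fallback(
    name: str, geocoders: List[Geocoder]
) -> Optional[Coordinates]:
    """Try each geocoder in order; return the first non-empty result."""
    for geocode in geocoders:
        result = geocode(name)
        if result is not None:
            return result
    return None  # no service could resolve this place name


def geotag(
    text: str, geocoders: List[Geocoder]
) -> Dict[str, Optional[Coordinates]]:
    """Extract place names from text and attach coordinates to each."""
    return {
        name: geocode_with_fallback(name, geocoders)
        for name in extract_place_names(text)
    }


# Illustrative lookup tables standing in for two geocoding services.
primary_service = {"Nairobi": (-1.2864, 36.8172)}
secondary_service = {"Mombasa": (-4.0435, 39.6682), "Nairobi": (0.0, 0.0)}

document = (
    "The project funds road upgrades between Mombasa and Nairobi, "
    "with a later phase planned in Kisumu."
)
tagged = geotag(document, [primary_service.get, secondary_service.get])
for place, coords in tagged.items():
    print(place, coords)
```

In this sketch a place that no service can resolve is kept with `None` rather than dropped, so that unresolved names remain visible for manual review.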
The system also accesses and geocodes information from the World Bank’s procurement notices in order to compare project documents with procurement data, which has been made available for the first time in the Mapping for Results initiative. The system thus combines the locations with financial data to provide a holistic view of project expenses.
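As a rough illustration of how geocoded procurement notices might be compared with the locations found in project documents, the following Python sketch totals procurement amounts per location and lists the places that appear in only one of the two sources. The record layout, the field names `location` and `amount_usd`, and the sample figures are hypothetical and do not reflect the World Bank’s actual data schema.

```python
from collections import defaultdict
from typing import Dict, List, Set


def summarize_project(doc_locations: Set[str], notices: List[dict]) -> Dict:
    """Combine document locations with geocoded procurement notices."""
    spend = defaultdict(float)
    for notice in notices:
        # Sum contract amounts for each location named in the notices.
        spend[notice["location"]] += notice["amount_usd"]

    procurement_locations = set(spend)
    return {
        "spend_by_location": dict(spend),
        "only_in_documents": sorted(doc_locations - procurement_locations),
        "only_in_procurement": sorted(procurement_locations - doc_locations),
    }


# Hypothetical sample data for a single project.
doc_locations = {"Nairobi", "Mombasa"}
notices = [
    {"location": "Nairobi", "amount_usd": 1000.0},
    {"location": "Nairobi", "amount_usd": 500.0},
    {"location": "Kisumu", "amount_usd": 250.0},
]

summary = summarize_project(doc_locations, notices)
print(summary)
```

Locations that appear in only one source are the ones such a comparison would flag for closer inspection.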
Since the launch of Mapping for Results, three generations of interns have read many thousands of pages of World Bank project documentation, safeguard documents, and results reports to identify and geocode exact project locations. Though very successful, the initiative covered only active projects because it relied heavily on manual work.
The automation and effectiveness provided by QCRI’s Geotagger system also enable the geocoding and mapping of historical projects, allowing researchers to look at the evolving World Bank portfolio from a new, more disaggregated and spatial angle.