Global Agricultural Research Data Innovation Acceleration Network

Welcome to the GARDIAN ecosystem

What’s new in GARDIAN?

The ability to map and spatially query production estimates for 30+ crops
Visualize 12+ TB of climate datasets
An analytic workbench enabling CGIAR researchers to apply machine learning analytics
A service to help flag personally identifiable information in datasets before they are made open



Explore data assets from across CGIAR and a growing set of institutional partners, including USAID, the UK’s Foreign, Commonwealth & Development Office, the World Bank, the US Department of Agriculture, the Indian Council of Agricultural Research, and the Open Government Data Platform India.


The Agronomy Field Information Management System (AgroFIMS) enables the creation of field books for digital data collection, using an ontology-based set of variables, units and protocols, hence generating FAIR data. AgroFIMS is organized around modules that represent the typical cycle of operations in agronomic trial management, and includes algorithms for statistical analysis of the collected data.
High-Resolution Population Data
Members of the CGIAR community can access the global gridded population LandScan™ dataset via a subscription from Oak Ridge National Laboratory. LandScan is a community standard for global population distribution data and is widely regarded as one of the best available population datasets. At approximately 1 km (30″ X 30″) spatial resolution, it represents an ambient population (average over 24 hours) distribution.
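The "approximately 1 km" figure follows from the 30 arc-second grid spacing. As a rough check, the sketch below converts an angular cell size to ground distance. It assumes a spherical Earth with a mean radius of 6371 km, and the function name `cell_size_km` is purely illustrative (it is not part of LandScan or GARDIAN).

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: mean Earth radius, spherical approximation


def cell_size_km(arc_seconds, latitude_deg):
    """Return (east_west_km, north_south_km) for a grid cell on a sphere."""
    angle_rad = math.radians(arc_seconds / 3600.0)
    north_south = EARTH_RADIUS_KM * angle_rad
    # East-west extent shrinks with the cosine of latitude.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return east_west, north_south


for lat in (0, 30, 60):
    ew, ns = cell_size_km(30, lat)
    print(f"lat {lat:>2} deg: {ew:.3f} km east-west x {ns:.3f} km north-south")
```

Under these assumptions a 30″ cell is roughly 0.93 km on a side at the equator, and its east-west extent narrows to about half of that at 60° latitude, which is why the 1 km figure is stated as approximate.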
Commercial Satellite Imagery
CGIAR has partnered with Maxar to accelerate machine learning solutions for agriculture. Under this partnership, CGIAR’s geospatial scientists will be able to mine Maxar’s 100 petabyte imagery library using machine learning and the computational power of the company’s Geospatial Big Data platform (GBDX) to create more sophisticated baseline datasets in agriculture, plan new projects and monitor crop health, crop yield and the environmental impacts of farming.
Gridded Global Weather Data
CGIAR researchers can access validated high-resolution gridded weather data from multiple sources, including The Weather Company, aWhere, and the European Centre for Medium-Range Weather Forecasts (ECMWF). The Big Data Platform provides these data through Application Programming Interfaces (APIs) for advanced users and also facilitates the reanalysis of weather data to serve a broad range of users (e.g., weather data in GIS and crop model-compatible formats for geospatial scientists and crop modelers).
Rural Household Multi-Indicator Survey (RHoMIS)
RHoMIS provides a modular approach to creating user-friendly and efficient surveys, pre-written data processing code, and the opportunity to contribute to a harmonized global database. RHoMIS focuses on the household level, and the data are intended to give an overview of the farm-livelihood system across a wide range of topics. Major topics include agricultural production, consumption, and sales; food security; gender dynamics; poverty dynamics; and environmental indicators. Researchers save time on survey tool design and reporting, and can draw upon the global database to broaden the findings from their own surveys.
FAIR Data workflow & upload
GARDIAN will soon include data upload functionalities backed by a workflow leveraging the FAIR Principles to make data Findable, Accessible, Interoperable, Reusable. This workflow is based on the Collaborative Open Plant Omics (COPO) tool to ease data management towards open and FAIR outcomes.
FAIR Data Guidelines
The FAIR Data Principles are a set of guiding principles to make data Findable, Accessible, Interoperable, and Reusable. However, these principles are not orthogonal and were not designed for automated, machine-based evaluation. To enable such evaluation, we have adapted metrics from the Netherlands Institute for Permanent Access to Digital Research Resources (DANS) to provide quantifiable metrics and a guide towards FAIR compliance.
Check your data for Personally Identifiable Information (PII)
PII is any information that can be used to uniquely identify, contact or locate an individual, or can be used with other sources to uniquely identify a person. It consists of a broad range of information, including names, addresses, geolocation, and much more. GARDIAN's PII Engine helps in finding where PII is located in datasets, allowing users to then decide how to deal with it.
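To make the idea of locating PII concrete, here is a minimal sketch of a rule-based scan over tabular records. It is not how GARDIAN's PII Engine works (the source does not describe its method); the patterns, the `SUSPECT_COLUMNS` list, the `flag_pii` function, and the sample rows are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical patterns; a production service would need far richer rules.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")
# Column names that often hold PII (illustrative list only).
SUSPECT_COLUMNS = {"name", "address", "latitude", "longitude", "phone", "email"}


def flag_pii(rows):
    """Return a list of (row_index, column, reason) findings, in scan order."""
    findings = []
    for i, row in enumerate(rows):
        for column, value in row.items():
            text = str(value)
            if column.lower() in SUSPECT_COLUMNS:
                findings.append((i, column, "suspect column name"))
            elif EMAIL_RE.search(text):
                findings.append((i, column, "email-like value"))
            elif PHONE_RE.search(text):
                findings.append((i, column, "phone-like value"))
    return findings


# Made-up sample records for demonstration.
rows = [
    {"farmer_id": 17, "name": "A. Example", "notes": "call +254 700 000000"},
    {"farmer_id": 18, "name": "B. Example", "notes": "yield 2.4 t/ha"},
]
for finding in flag_pii(rows):
    print(finding)
```

The sketch only reports where possible PII sits and leaves the decision about removal, masking, or aggregation to the user, mirroring the flag-then-decide approach described above.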
Responsible Data Guidelines
The Responsible Data Guidelines are intended to assist agricultural researchers with handling privacy and Personally Identifiable Information (PII) throughout a research project’s data lifecycle.
The Crop Ontology (CO)
The CO compiles validated concepts and their inter-relationships on anatomy, structure and phenotype of crops, on trait measurement and methods as well as on germplasm with multi-crop passport terms.
CG Core metadata schema
The CG Core metadata schema is a minimum set of metadata elements that aligns closely with Dublin Core, a generic and widely adopted metadata standard, and is applicable across varied disciplines, data streams, and types of information assets in the agricultural sector. CG Core facilitates data discovery, meta-searching and indexing across CGIAR repositories, and inter-linking across related resources (e.g. data with publications). It is openly available with a reference guide to help users understand and apply it.
The Agronomy Ontology (AgrO)
AgrO includes terms from the agronomy domain that are semantically organized and can facilitate the collection, storage and use of agronomic data. It enables easy interpretation, aggregation, and reuse of the data by humans and machines alike.
CG Labs
The Platform launched the Collaborative GARDIAN Labs (CGLabs), the latest offering in the GARDIAN data ecosystem. CGLabs is an open collaborative data science platform that allows researchers to work together on the same data science project using datasets securely transferred from GARDIAN and other trusted sources.

CGLabs facilitates the discovery, visualization, and analysis of datasets, as well as collaborative analytics using the R and Python programming languages. CGLabs provides secure transfer and storage of code and data files through Globus, another core Shared Service that the Platform provides.
The CGIAR Platform for Big Data in Agriculture has activated a subscription to Globus. The subscription offers comprehensive data management capabilities for CGIAR researchers, including file sharing, easy and secure transfer of large datasets, access to cloud storage, protected data management with the setting of appropriate access permissions for sensitive data, and advanced endpoint administration.


The GARDIAN Ecosystem is evolving fast, with many new features and new tools on the horizon. Stay tuned to try out these and many more new possibilities offered by GARDIAN!

Towards semantically enabled data

Imagine if different datasets could be easily aggregated, leading to new analyses and interpretations. Imagine getting new and better meaning from vast amounts of data. Short demonstrations are coming soon to offer a glimpse of the power of data semantics.

FAIRification workflow and data upload

GARDIAN will soon include data upload functionalities backed by a workflow leveraging the FAIR principles to make data Findable, Accessible, Interoperable, Reusable.
Visit the website of the CGIAR Platform for Big Data in Agriculture to learn more about what we’re up to, our Communities of Practice, our Inspire Challenge, and much more!