GSoC-2025 - Community Bonding Report - AI-ready Dataset Metadata as a Service using ZOO-Project

Name: Harsh Shinde
Project Title: AI-ready Dataset Metadata as a Service using ZOO-Project
Mentors: @chetanm, @gfenoy
Organization: OSGeo / ZOO-Project
OSGeo Profile: OSGeo Profile Link
Project Wiki: GitHub Wiki Link
Repository: Repo Link


Period: Community Bonding (May 8 – June 1, 2025)


Work Done in the Community Bonding Period:

  1. OSGeo and Project Onboarding
  • Created my OSGeo user profile and linked it on the Accepted Students wiki page.
  • Joined relevant communication platforms (OSGeo Discourse, ZOO-Chat, mailing list, etc.).
  • Met with my mentors and clarified the project deliverables and expectations.
  2. Project Wiki Setup
  • Created my project wiki page: Project Wiki Link
  • Linked the wiki on the official GSoC 2025 Accepted Students wiki page.
  • Added a basic summary of the project, goals, repository link, and weekly reporting structure to the wiki.
  • Added placeholder sections for the week-by-week updates and final deliverables.
  3. Project Preparation
  • Set up the basic folder structure in the repository for implementing AI-ready Dataset Metadata as a Service using ZOO-Project.
  • Explored the codebase of ZOO-Project to understand integration points.
  • Discussed with mentors about potential test cases and documentation strategy.

Community Interaction

  • Actively engaged with my mentors and the group communication channels to clarify goals and receive feedback.
  • Followed ongoing development discussions related to GeoCroissant and ZOO-Project, and familiarized myself with their pull request process, testing routines, and code review workflows.

Plans for the Coding Period

  • Start implementing Time Series Integration using the Earth Engine Catalog, aligned with the GeoCroissant metadata model.
  • Focus Areas:
    • Develop Python scripts to fetch and preprocess time series data from Google Earth Engine and the CEDA Portal.
    • Design and generate valid GeoCroissant-compliant JSON-LD metadata files for the retrieved datasets.
    • Set up and test Earth Engine and CEDA data ingestion workflows, including documentation for reproducibility.
  • Maintain continuous progress tracking by regularly updating the project wiki page and posting weekly reports.
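
As a rough illustration of the metadata-generation step planned above, the sketch below builds a small JSON-LD record for a time series dataset. It is only a sketch under stated assumptions: it uses plain schema.org terms (Dataset, temporalCoverage, spatialCoverage) rather than the actual GeoCroissant vocabulary, which still has to be mapped during the coding period. The function name build_dataset_metadata, the dataset name, and the bounding box values are hypothetical placeholders, and no Earth Engine or CEDA calls are made.

```python
import json


def build_dataset_metadata(name, description, start_date, end_date, bbox):
    """Return a JSON-LD dict describing a time series dataset.

    bbox is (west, south, east, north) in decimal degrees.
    Only generic schema.org terms are used here; GeoCroissant-specific
    properties are intentionally left out of this sketch.
    """
    west, south, east, north = bbox
    if south > north:
        raise ValueError("bbox must be given as (west, south, east, north)")
    return {
        "@context": {"@vocab": "https://schema.org/"},
        "@type": "Dataset",
        "name": name,
        "description": description,
        # ISO 8601 interval: start/end
        "temporalCoverage": f"{start_date}/{end_date}",
        "spatialCoverage": {
            "@type": "Place",
            "geo": {
                "@type": "GeoShape",
                # schema.org box: lower corner then upper corner, "lat lon lat lon"
                "box": f"{south} {west} {north} {east}",
            },
        },
    }


# Hypothetical example values, for illustration only.
metadata = build_dataset_metadata(
    name="example-ndvi-timeseries",
    description="Placeholder description of a vegetation index time series.",
    start_date="2020-01-01",
    end_date="2020-12-31",
    bbox=(72.0, 18.0, 74.0, 20.0),
)

print(json.dumps(metadata, indent=2))
```

Keeping the record as a plain dictionary until the last step makes it easy to add or rename properties once the GeoCroissant field mapping is agreed with the mentors.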

Am I blocked on anything?

  • No, I am not currently blocked on anything.

Outcome

I am fully prepared to begin the coding phase and have set up all required infrastructure (wiki, repo, folder structure). I’ve built a clear understanding of the task ahead and am in regular contact with my mentor. Looking forward to contributing effectively in the coming weeks!