Progress Report for Week 10 (Aug 3 – Aug 10)
1. What did I get done this week?
- Implemented the optional `mlcroissant[geo]` extension for geospatial support.
- Updated `pyproject.toml` to include all required geo dependencies (`geopandas`, `shapely`, `pyproj`, `rasterio`, `pystac`) under the `[project.optional-dependencies]` section.
- Developed a robust `stac_to_geocroissant` converter supporting both file and dictionary inputs, with comprehensive mapping of STAC fields to GeoCroissant JSON-LD.
- Validated the workflow by converting sample STAC catalogs to GeoCroissant format and confirming the output with the standard `mlcroissant validate` CLI.
- Demonstrated the complete workflow in Jupyter Lab, from installation to conversion and validation.
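To illustrate the idea of a converter that accepts either a file or a dictionary, here is a minimal sketch. It is not the code in the repository: the function name `stac_to_geocroissant_sketch`, the placeholder `geocr` namespace URL, and the output keys `geocr:BoundingBox` and `geocr:temporalExtent` are assumptions chosen for illustration, and only a handful of STAC Collection fields are mapped.

```python
import json
from pathlib import Path
from typing import Union


def stac_to_geocroissant_sketch(stac: Union[str, Path, dict]) -> dict:
    """Map a few STAC Collection fields to a Croissant-style JSON-LD dict.

    Hypothetical helper for illustration only; the real converter maps
    many more fields.
    """
    # Accept either a path to a STAC JSON file or an already-parsed dict.
    if isinstance(stac, (str, Path)):
        with open(stac, encoding="utf-8") as f:
            stac = json.load(f)

    extent = stac.get("extent", {})
    bboxes = extent.get("spatial", {}).get("bbox", [])
    intervals = extent.get("temporal", {}).get("interval", [])

    out = {
        # The geocr namespace URL below is a placeholder, not the real one.
        "@context": {
            "sc": "https://schema.org/",
            "geocr": "https://example.org/geocr/",
        },
        "@type": "sc:Dataset",
        "name": stac.get("id"),
        "description": stac.get("description"),
        "license": stac.get("license"),
    }
    if bboxes:
        # STAC lists the overall bounding box first.
        out["geocr:BoundingBox"] = bboxes[0]
    if intervals:
        start, end = intervals[0]
        out["geocr:temporalExtent"] = {"startDate": start, "endDate": end}
    return out
```

Calling the function with a parsed STAC dictionary and with a path to the same JSON on disk should give identical results, which is the behaviour the file-or-dictionary design aims for.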
2. Plan for Next Week:
- Add support for different conversion types, including `[geo]` for GeoCroissant.
- Implement comprehensive test cases for conversions.
- Add an additional validator specifically for GeoCroissant outputs to ensure geospatial metadata integrity.
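As a rough idea of what one geospatial integrity check could look like, the sketch below validates a 2D bounding box in `[west, south, east, north]` order. The function name `check_bbox` is hypothetical, and the planned validator may be structured quite differently. West greater than east is deliberately not flagged, since boxes crossing the antimeridian are written that way.

```python
from numbers import Real


def check_bbox(bbox) -> list:
    """Return a list of problems found in a [west, south, east, north] box.

    Hypothetical check for illustration; handles 2D boxes only.
    """
    if not isinstance(bbox, (list, tuple)) or len(bbox) != 4:
        return ["bbox must have exactly four values: west, south, east, north"]
    # bool is a subclass of int, so exclude it explicitly.
    if not all(isinstance(v, Real) and not isinstance(v, bool) for v in bbox):
        return ["bbox values must be numbers"]

    west, south, east, north = bbox
    problems = []
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        problems.append("longitude out of range [-180, 180]")
    if not (-90 <= south <= 90 and -90 <= north <= 90):
        problems.append("latitude out of range [-90, 90]")
    if south > north:
        problems.append("south is greater than north")
    return problems
```

An empty list means the box passed every check; otherwise each string names one problem, which makes the function easy to fold into a larger report.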
3. Am I blocked on anything?
- No, I am not currently blocked on anything.
Links to Work Done:
- Public Repository: croissant/python/mlcroissant/mlcroissant/_src/geo at harsh/geocroissant · HarshShinde0/croissant · GitHub
- Updated Project Wiki Page: GSoC 2025 AI‐ready Dataset Metadata as a Service using ZOO‐Project · HarshShinde0/ZOO-AI-DATASET-MAAS Wiki · GitHub
Best Regards,
Harsh Shinde