Progress Report for Week 2 (June 9 – June 15)
1. What did I get done this week?
- Parsed GeoCroissant metadata: Implemented a parser to convert GeoCroissant metadata (`croissant.json`) into a valid STAC `Item` object using `pystac`.
- Mapped fields between GeoCroissant and STAC: Documented and translated key fields (e.g., `identifier` → `id`, `distribution` → `assets`, etc.).
- Handled spatial and temporal metadata: Inferred geometry, bounding box, and temporal coverage (start, end, midpoint) from metadata and description.
- Asset management: Mapped Croissant `distribution` to STAC `assets`, inferred the correct `media_type`, and added roles (e.g., `data`, `metadata`, `documentation`).
- STAC extensions support: Added support for the STAC Table Extension and integrated column metadata using `pystac.extensions.table`.
- Output and validation: Generated and saved a valid STAC item (`stac_item.json`) and confirmed validity using `stac-validator`.
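
To make the mapping above concrete, here is a minimal, dependency-free sketch of the idea. It is not the code in the repository: it builds a plain dictionary shaped like a STAC Item instead of using `pystac`. The input record, its `bbox`, `start`, and `end` keys, and the helper names (`parse_utc`, `bbox_to_polygon`, `role_for`, `to_stac_item`) are all assumptions made for illustration, and the role heuristic is a guess at one reasonable rule.

```python
from datetime import datetime

# Hypothetical, heavily simplified Croissant-style record.
# Real croissant.json files are JSON-LD and considerably richer.
croissant = {
    "identifier": "example-dataset",
    "description": "Example dataset",
    "distribution": [
        {"name": "data.csv",
         "contentUrl": "https://example.com/data.csv",
         "encodingFormat": "text/csv"},
        {"name": "README.md",
         "contentUrl": "https://example.com/README.md",
         "encodingFormat": "text/markdown"},
    ],
    # Assumed keys for the spatial and temporal hints.
    "bbox": [-10.0, 35.0, 5.0, 45.0],  # west, south, east, north
    "start": "2020-01-01T00:00:00Z",
    "end": "2020-01-03T00:00:00Z",
}


def parse_utc(text):
    """Parse an ISO 8601 timestamp that ends in 'Z' into an aware datetime."""
    return datetime.fromisoformat(text.replace("Z", "+00:00"))


def bbox_to_polygon(bbox):
    """Turn [west, south, east, north] into a closed GeoJSON Polygon."""
    west, south, east, north = bbox
    ring = [[west, south], [east, south], [east, north],
            [west, north], [west, south]]
    return {"type": "Polygon", "coordinates": [ring]}


def role_for(media_type):
    """Guess an asset role from its media type (illustrative heuristic only)."""
    if media_type in ("text/markdown", "text/html", "application/pdf"):
        return "documentation"
    if media_type == "application/json":
        return "metadata"
    return "data"


def to_stac_item(record):
    """Map the simplified record onto a STAC-Item-shaped dictionary."""
    start, end = parse_utc(record["start"]), parse_utc(record["end"])
    midpoint = start + (end - start) / 2  # nominal datetime of the item
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    assets = {
        d["name"]: {
            "href": d["contentUrl"],        # contentUrl -> href
            "type": d["encodingFormat"],    # encodingFormat -> media type
            "roles": [role_for(d["encodingFormat"])],
        }
        for d in record["distribution"]     # distribution -> assets
    }
    return {
        "type": "Feature",
        "stac_version": "1.0.0",
        "id": record["identifier"],         # identifier -> id
        "geometry": bbox_to_polygon(record["bbox"]),
        "bbox": record["bbox"],
        "properties": {
            "description": record["description"],
            "datetime": midpoint.strftime(fmt),
            "start_datetime": start.strftime(fmt),
            "end_datetime": end.strftime(fmt),
        },
        "assets": assets,
        "links": [],
    }


item = to_stac_item(croissant)
```

With `pystac`, the same values would typically be passed to `pystac.Item(...)` and attached with `item.add_asset(key, pystac.Asset(href=..., media_type=..., roles=[...]))`, which also gives access to the extension helpers and validation.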
2. Plan for Next Week (June 16 – June 22):
- Create a working example to convert GeoCroissant metadata into DCAT format.
- Document how GeoCroissant fields map to DCAT properties.
- Reuse and clean up code from the STAC conversion for DCAT use.
- Explore ways to validate the DCAT output using SHACL or JSON-LD tools.
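
As a starting point for the DCAT work, the conversion could look roughly like the sketch below. Nothing here is implemented yet: the field mapping is a first guess (not the documented mapping planned above), the input record is a hypothetical simplified example, and `to_dcat_dataset` is an illustrative name. Only the `dcat:` and `dct:` namespace IRIs and property names come from the DCAT and Dublin Core vocabularies.

```python
import json

# Namespace prefixes for the JSON-LD context.
DCAT_CONTEXT = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
}

# Hypothetical, simplified Croissant-style record.
example = {
    "identifier": "example-dataset",
    "name": "Example dataset",
    "description": "A small example.",
    "distribution": [
        {"name": "data.csv",
         "contentUrl": "https://example.com/data.csv",
         "encodingFormat": "text/csv"},
    ],
}


def to_dcat_dataset(record):
    """Map the simplified record onto a dcat:Dataset expressed as JSON-LD."""
    distributions = [
        {
            "@type": "dcat:Distribution",
            "dct:title": d.get("name"),
            "dcat:downloadURL": {"@id": d["contentUrl"]},
            # Kept as a plain string here; DCAT recommends an IANA media type IRI.
            "dcat:mediaType": d.get("encodingFormat"),
        }
        for d in record.get("distribution", [])
    ]
    return {
        "@context": DCAT_CONTEXT,
        "@type": "dcat:Dataset",
        "dct:identifier": record["identifier"],
        "dct:title": record.get("name", record["identifier"]),
        "dct:description": record.get("description", ""),
        "dcat:distribution": distributions,
    }


dataset = to_dcat_dataset(example)
dcat_json = json.dumps(dataset, indent=2)
```

The resulting JSON-LD string is the kind of output that could then be checked with SHACL or JSON-LD tooling.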
3. Am I blocked on anything?
- No, I am not currently blocked on anything.
Links to Work Done:
- Project Wiki Page: GSoC 2025 AI‐ready Dataset Metadata as a Service using ZOO‐Project · HarshShinde0/ZOO-AI-DATASET-MAAS Wiki · GitHub
- Public Repository: ZOO-AI-DATASET-MAAS/GeoCroissant to STAC at main · HarshShinde0/ZOO-AI-DATASET-MAAS · GitHub
Best Regards,
Harsh Shinde