[GeoNetwork-devel] Contributions / how to improve?

Hi all,

In the last few days, a few pieces of feedback [1][2][3][4] pointed out that some (new?) rules have to be followed and that there is some kind of “correct direction”. We also have a concept of “frightening stuff”, which maybe we could rephrase as “experiments” …

Would it make sense to discuss and define what those rules/requirements and direction are, to have a clear view of what is expected from contributors? And to define what level of experimentation is acceptable or not? It would also give more visibility on the level of rigour required, and whether it matches the needs of supporting projects and the knowledge and effort people can dedicate to running the project.

In the past we tried a couple of “constraints”, e.g. jslint, without always much success. Maybe some tools, existing guidelines or new people’s savoir-faire can now help … We also know that we are facing common open-source project issues (e.g. PRs with no review, long-term support of PR work), and that’s an area where we can also make progress.

Looking forward to ideas, suggestions and contributions.

Francois

PS: For now, I am stepping down from my role of making an almost monthly maintenance release of version 4. Don’t hesitate to take over…

[1] https://github.com/geonetwork/core-geonetwork/pull/5920
[2] https://github.com/geonetwork/core-geonetwork/pull/5932#issuecomment-906206091
[3] https://github.com/geonetwork/core-geonetwork/pull/5931
[4] https://github.com/geonetwork/core-geonetwork/pull/5941

Hi,

An acceptable level of experimentation would be no experimentation at all. GeoNetwork is not a sandbox and thousands of organizations rely on it daily; experimenting on the main branch incurs tremendous migration and bugfixing costs down the line, costs which are draining potential funding for actual quality work on the project.

As for suggestions, I would strongly invite anyone and everyone to stop incentivizing customers to fund the UI in the core-geonetwork project: AngularJS will be end of life in 3 months, most libraries it relies on are deprecated as well and maintenance costs are extremely high. The https://github.com/geonetwork/geonetwork-ui project has been created as a successor with higher quality standards in mind and a vision for the future of GeoNetwork as a universal data hub. The project is advancing slowly but steadily, and will benefit from all contributions of any kind.

Ideally the current GeoNetwork UI would receive some sort of LTS/LTR status and enter a long-term stabilized state, but I’m not sure anyone would be willing to offer this kind of guarantee.

···

camptocamp
INNOVATIVE SOLUTIONS
BY OPEN SOURCE EXPERTS

Olivier Guyot

Geospatial Developer

+49 89 2620 89 924

Hi François,

Thanks for bringing this up.
I would be pleased to discuss these topics if we can arrange a meeting between developers to hear what everyone would propose to improve code quality and maintenance. It would benefit everyone.

Here are some craftsmanship principles which could be useful:

  • always leave the code in a better state than you found it
  • when fixing a bug: write a failing test which illustrates the bug, try to write passing tests that cover the whole feature, then fix the bug so all tests are green (see the sketch after this list)
  • constant refactoring: the code is legacy; when you touch a piece, instead of suffering from the legacy, make the code evolve
  • put love into the code you write :slight_smile:
  • add tests: refactoring is fundamental, and tests are the only way to be sure you don’t break things while refactoring
  • untested code is no code (garbage)
  • quality is free: implementing a quick feature or a quick fix may give the impression it’s not costly, but you’ll pay the price afterwards
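
To make the bug-fixing bullet concrete, here is a minimal JUnit 4 sketch of that workflow (the MetadataTitle class and its whitespace bug are made up, purely for illustration):

```java
// Hypothetical example, not actual GeoNetwork code: a whitespace bug in a
// title normalizer, illustrating the "failing test first" workflow.
import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class MetadataTitleTest {

    // Step 1: written while the bug exists, this test fails and documents it.
    @Test
    public void trimsSurroundingWhitespace() {
        assertEquals("My dataset", MetadataTitle.normalize("  My dataset  "));
    }

    // Step 2: cover the rest of the feature so the fix cannot regress it.
    @Test
    public void keepsInnerWhitespace() {
        assertEquals("My  dataset", MetadataTitle.normalize("My  dataset"));
    }
}

// Step 3: the fix itself, after which all tests are green.
class MetadataTitle {
    static String normalize(String raw) {
        return raw == null ? "" : raw.trim();
    }
}
```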

Actually, when we develop a feature, it’s as if we were developing it 2 or 3 times:

  • the first quick shot: “it works”
  • the first issues: “actually it does not really work when you are using it in a deeper way” => fixes
  • still not working perfectly => the full implementation with tests

It involves different people and different findings, and I think we are all losing from that (energy, time, faith); the feature should be implemented only once.

Everything I mention sounds quite obvious, but reality, deadlines and constraints push us to take shortcuts, which we should avoid. We should really make an effort to stick to better practices.

Also, take code review as a gift: it’s not there to blame people, but to keep the code good overall. Another craftsmanship guideline says “be kind with the people, but rude with the code”; in other terms, all comments that push the people and the code to be better are good, so don’t take them personally.

Hope it helps.
Cheers

···

camptocamp
INNOVATIVE SOLUTIONS
BY OPEN SOURCE EXPERTS

Florent Gravin
Technical Leader - Architect
+33 4 58 48 20 36

Hi

Some comments:

In the last few days, a few pieces of feedback [1][2][3][4] pointed out that some (new?) rules have to be followed and that there is some kind of “correct direction”. We also have a concept of “frightening stuff”, which maybe we could rephrase as “experiments” …

An acceptable level of experimentation would be no experimentation at all. GeoNetwork is not a sandbox and thousands of organizations rely on it daily; experimenting on the main branch incurs tremendous migration and bugfixing costs down the line, costs which are draining potential funding for actual quality work on the project.

I’m not sure if this refers to adding new features to GeoNetwork? If that’s the case, I’m fine with that, with a proper review of the PR, test cases (not always possible to add, but that’s another story) and documentation. And not only unit tests: the PR should also describe a test case to reproduce the issue, so that a tester can verify the fix.

I think it could help to define a release calendar; at the least, most customers will appreciate it. For example, a major release once or twice a year and monthly releases of the stable version(s), so new features go into the branch for the next release. That should lower the risk of introducing regressions, but it also requires frequent releases, otherwise new features can wait years to land in a stable release.

For 4.0.x, Francois has taken care of the monthly releases, something that should also be highly appreciated as it keeps the project alive: most users do not have the option to build it themselves.

In the past we tried a couple of “constraints”, e.g. jslint, without always much success. Maybe some tools, existing guidelines or new people’s savoir-faire can now help … We also know that we are facing common open-source project issues (e.g. PRs with no review, long-term support of PR work), and that’s an area where we can also make progress.

For me, the main problem with some of these tools has been that they weren’t well integrated into the build process, so it was pretty easy to forget to run them before committing, or they took a long time to run over all the legacy code. Findbugs required adding many exclusions for the legacy code; otherwise the report was unmanageable.

In any case, this is something to document / improve, so that jslint (or a better tool?) is executed for JavaScript code and SpotBugs for Java code, at least for new code.

We also know that we are facing common open-source project issues (e.g. PRs with no review, long-term support of PR work), and that’s an area where we can also make progress.

Fully agree: PR reviews sometimes take a long time, usually because only a few developers do them, which makes it difficult to manage. I am not sure whether obtaining funds from OSGeo is an option for this type of task? I think it would make the task easier.

As for suggestions, I would strongly invite anyone and everyone to stop incentivizing customers to fund the UI in the core-geonetwork project: AngularJS will be end of life in 3 months, most libraries it relies on are deprecated as well and maintenance costs are extremely high. The https://github.com/geonetwork/geonetwork-ui project has been created as a successor with higher quality standards in mind and a vision for the future of GeoNetwork as a universal data hub. The project is advancing slowly but steadily, and will benefit from all contributions of any kind.

That might be a good option, but the main problem I see, unless I’m really wrong, is that the project in its current state can’t even replace the GeoNetwork search UI app. Don’t take it as criticism, as it’s something that the different parties working on GeoNetwork should promote, but from my experience with customers, unless there is a clear work plan to replace all the applications, it is difficult to get them involved.

Also, take code review as a gift: it’s not there to blame people, but to keep the code good overall. Another craftsmanship guideline says “be kind with the people, but rude with the code”; in other terms, all comments that push the people and the code to be better are good, so don’t take them personally.

I fully agree, but when doing a review we must be constructive and understand that perhaps the person making a contribution is not an expert in the specific areas where the reviewer is. So we should try to be kind with the code too and propose constructive alternatives.

Everything I mention sounds quite obvious, but reality, deadlines and constraints push us to take shortcuts, which we should avoid. We should really make an effort to stick to better practices.

We could start writing these best practices in a WIKI or similar and also start supporting tools that automate some of these best practices. The Bolsena codesprint might be a good place to work on this.

Regards,
Jose García

Vriendelijke groeten / Kind regards,

Jose García


Veenderweg 13
6721 WD Bennekom
The Netherlands
T: +31 (0)318 416664

Please consider the environment before printing this email.

Thanks for the level-headed responses.

I think it could help to define a release calendar; at the least, most customers will appreciate it. For example, a major release once or twice a year and monthly releases of the stable version(s), so new features go into the branch for the next release. That should lower the risk of introducing regressions, but it also requires frequent releases, otherwise new features can wait years to land in a stable release.

Jose, this is something that was decided, I think 3 years ago, during Bolsena, and you were there. Where did that take us? Aside from more work on your shoulders and those of François, not much. Deciding that releases should happen, and when, is not enough; the problem is that so many releases are needed at all.

Most releases are just full of bugfixes, and the new features will themselves bring more bugs in the coming weeks/months/years. I am not blaming the contributors of GeoNetwork, but the project’s architecture itself.

I am utterly convinced that GeoNetwork itself has a very deep technical debt problem. The following simple chain of reasoning is enough to explain many of the woes that we are all struggling with today:

  • Project architecture is overly complex (backend and frontend mixed, very large code base…)
  • Thus, refactoring large parts of the project is very difficult and can only be done by a handful of people
  • Thus, many parts of the project are left without the refactoring/maintenance they need (hello Jeeves services, hello Ext.JS client, which is still in the code base)
  • Thus, adding new features would be way too expensive if that included a reduction of the technical debt
  • Thus, new features are added without refactoring (see https://github.com/geonetwork/core-geonetwork/pull/5920)
  • Thus, project architecture and technical debt keep steadily increasing and no one can do anything
  • Repeat.
  • Bonus points if your long-time contributors are exhausted and stressed out and no longer want to work on the project.

We could start writing these best practices in a WIKI or similar and also start supporting tools that automate some of these best practices. The Bolsena codesprint might be a good place to work on this.

The answer cannot be adding new guidelines to the wiki or new automated checks (which was also tried in the past). A code base has to be self-explanatory and welcoming for contributors (new and old alike). Rules must be inferred more than enforced.

That might be a good option, but the main problem I see, unless I’m really wrong, is that the project in its current state can’t even replace the GeoNetwork search UI app. Don’t take it as criticism, as it’s something that the different parties working on GeoNetwork should promote, but from my experience with customers, unless there is a clear work plan to replace all the applications, it is difficult to get them involved.

Having a search UI is definitely possible, although a basic one: there are search results, facets, full-text search, plus a few other things. We @ Camptocamp have a project currently running where we’re building a more complete search UI including dataviz capabilities. What I’m saying is that I think we should clearly point customers who want custom/extensible catalog UIs towards geonetwork-ui.

Below is my opinion on the direction the project should take in order to thrive and live long:

  1. all medium to large development budgets should be redirected to geonetwork-microservices and/or geonetwork-ui

  2. the catalog backend should not work with raw XML records any more, but rely on a pivot format (POJO): this is IMO absolutely mandatory to get rid of the huge XSL foundation that core-geonetwork currently sits on (and the costs that come with it), and also mandatory if GeoNetwork is to open up to the open data ecosystem and become part of it (a rough sketch of such a record object follows after this list)
    I’m not sure what the alternatives would be, but I don’t think any of them would be very desirable.
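
As an illustration, here is a rough sketch of what such a pivot record object could look like (the CatalogRecord name and its fields are just assumptions for the example, not an agreed model):

```java
// Hypothetical pivot model: a plain Java object replacing the raw XML blob.
// Field names are loosely inspired by common metadata elements.
import java.util.ArrayList;
import java.util.List;

public class CatalogRecord {
    private String identifier;
    private String title;
    private String abstractText;
    private final List<Contact> contacts = new ArrayList<>();
    private final List<OnlineResource> onlineResources = new ArrayList<>();
    // getters and setters omitted for brevity
}

class Contact {
    String organisationName;
    String email;
    String role; // e.g. "pointOfContact"
}

class OnlineResource {
    String url;
    String protocol; // e.g. "OGC:WMS"
    String description;
}
```

Storing and querying such objects (e.g. with JPA) would then be trivial compared to XSL-driven manipulation of an XML blob.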

Thanks for reading!

···

camptocamp
INNOVATIVE SOLUTIONS
BY OPEN SOURCE EXPERTS

Olivier Guyot

Geospatial Developer

+49 89 2620 89 924

Hey Francois:

Although I have been trying to take part more in core-geonetwork, I have also met the pushback you experienced on “experiments”.

Really, we need experiments to have a project that is alive. We have well-established communication patterns (proposals, steering committee) if individuals wish to have input into planning to control risk. But lack of movement is also a bigger risk; we are all aware of examples like the user interface, Elasticsearch, Java 17 compatibility, … or even small build changes to introduce version numbers.

So I do not know what encouragement you need, Francois, but anything that can be done to make a healthy environment for experimentation in core-geonetwork has my support.

As for those having trouble keeping up with changes, I have some ideas in the https://github.com/GeoCat/experiment-hnap repository on how to manage change and customizations without forking. Indeed, this direction has been quite successful, allowing GeoCat to better support our customers while taking part more directly in core-geonetwork. I have a presentation on using WAR overlays at FOSS4G next week for anyone wishing to learn more.

···


Jody Garnett

Olivier:

Can you make a proposal to this effect, i.e. putting an LTS/LTR status on the current GeoNetwork UI? Given that AngularJS is wrapping up, the proposal may include an EOL.

···


Jody Garnett

Jose:

Everything I mention sounds quite obvious, but reality, deadlines and constraints push us to take shortcuts, which we should avoid. We should really make an effort to stick to better practices.

We could start writing these best practices in a WIKI or similar and also start supporting tools that automate some of these best practices. The Bolsena codesprint might be a good place to work on this.

That is a good sprint topic, I recommend creating a pull request template with a checklist (so the information is presented right where it is needed).
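
For instance, something along these lines in a hypothetical `.github/PULL_REQUEST_TEMPLATE.md` (the checklist items are suggestions drawn from this thread, not agreed policy):

```markdown
## Description

What does this change, and why?

## Checklist

- [ ] Linked to a GitHub issue describing the bug or feature
- [ ] Tests added, or an explanation of why they are not feasible
- [ ] A manual test case described so a reviewer can verify the change
- [ ] Documentation updated where relevant
- [ ] Code left in a better state than it was found
```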

Jody

Hi Jody,

Ok, that is actually a great idea. I’ll find the time to offer something through a Pull Request. Thanks,

···

camptocamp
INNOVATIVE SOLUTIONS
BY OPEN SOURCE EXPERTS

Olivier Guyot

Geospatial Developer

+49 89 2620 89 924

Olivier:

If I understand correctly, there is a consistent theme here of more complexity and technical debt than can be managed and released. I know I personally struggled to understand Jeeves and provide javadocs for many of the GeoNetwork components this year.

I would like to learn more about the pivot (POJO) backend idea; the version of Saxon used to support the XSL is quite dated and I have been seeking ideas on how it could be replaced (I am not too fond of migrating the XSL + Java functions approach to pure XSL).

Thanks for the blunt assessment and architecture ideas.

···


Jody Garnett

Hey Jody,

No problem, happy to indulge.

So my analysis is that GeoNetwork, while presenting itself as a Java application, is at its core nothing more than an XSL engine. All operations that relate to its primary concern, namely metadata, are done through XSL, be it in plugins or not.

As such, metadata is treated as a highly complex source of truth of indeterminate shape, serialized as a huge text blob in the database. Reading its content has to be done using XSL, as does modifying it. This complexity then logically spreads to the whole GeoNetwork application, following an entropy-like rule which is commonplace in software (complexity will always increase when adding features on top of each other).

Some examples of incongruous situations that stem from this:

  • The responsibility of complying with the schema validation rules falls on the user, since he/she has full control over the XML document; then comes the frightening (again) experience of validating the document manually and painfully fixing each of the dozens of errors that logically appear; this of course varies for each schema, each set of rules and so on, much to the user’s disbelief
  • Something as simple as adding a reference to an online resource (pretty much the only reason metadata is created at all) requires convoluted XSL logic for the adding, removing and replacing cases; this has to be duplicated for single- and multi-language contexts, and of course for each schema
  • The need for converting from one schema to another must have arisen early in the life of the project (e.g. for offering a CSW service); this is simple enough when going from Dublin Core to ISO19139, but anyone can sense that this becomes unmanageable as the number of schemas goes up; indeed, now that ISO19115-3 has become much more desirable and many would like DCAT2 to become a first-class citizen in GN as well, this conversion principle becomes a severe hindrance
  • As a consequence, it appeared necessary to have some kind of common format (aka pivot format) to rely on for conversions; unfortunately, this need was never addressed directly and only organically came to a resolution, resulting in the recent OGC API Records service using the Elasticsearch documents as a source to output DCAT2; this of course resonates with the recent changes in Elastic’s license policy and the external risk it represents
  • Do I need to talk about subtemplates and templates of subtemplates? The name says it all: XML documents are so complex that we need templates, we need subtemplates, and we need templates of subtemplates, just for adding a reference to a contact in a metadata record (again, a most basic operation).

This is all, as far as I know, the product of a paradigm which states that GeoNetwork should be schema-agnostic and work with an abstraction layer made of XSL, probably so that valuable metadata documents are kept intact in its database (which they’re not, for that matter).

This paradigm has most likely stood for 10+ years, and while it might have made sense at some point, it is in my opinion entirely obsolete.

We as a community must go past this and get rid of the XSL foundation of GeoNetwork.

My proposed solution is to:

  1. Rewrite a new GeoNetwork core to revolve around a CatalogRecord object model, written simply in Java; storing these objects will most likely be trivial, even with many fields and child objects (contacts, online resources, etc.)
  2. Offer a basic CRUD API over this collection
  3. Write transformations from XML to this POJO and back, once for each schema; this is where the real value lies and a critical part of the process; for this I suggest working with Groovy which offers great capabilities both in reading and outputting XML; Groovy was used until some time ago in GeoNetwork to provide a very efficient and maintainable formatter, sadly removed since
  4. Write a separate service which receives notifications through an event bus and is in charge of indexing the records in a search engine (e.g. ElasticSearch); this means the new geonetwork-ui could seamlessly use this new backend (a rough sketch of such a listener follows after this list)
    Then the community of contributors can gradually address more and more use cases, reusing existing GN code where possible while keeping the complexity of the solution in check. Hopefully, reaching out to the open data ecosystem will bring more funds and help bring this “new GeoNetwork” (which I guess could be GeoNetwork 5) to an audience as wide as GeoNetwork currently has.
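
To illustrate point 4, here is a rough sketch of such an indexing service, assuming a Spring application event bus and the Elasticsearch 7.x high-level REST client; the RecordSavedEvent class and the index layout are hypothetical:

```java
// Sketch of a standalone indexer reacting to catalog events; the event class
// and field names are hypothetical, and error handling is omitted.
import java.io.IOException;
import java.util.Map;

import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.context.event.EventListener;
import org.springframework.stereotype.Component;

@Component
public class RecordIndexer {

    private final RestHighLevelClient client;

    public RecordIndexer(RestHighLevelClient client) {
        this.client = client;
    }

    // Fired by the core (or received from a message broker) whenever a
    // record is created or updated.
    @EventListener
    public void onRecordSaved(RecordSavedEvent event) throws IOException {
        IndexRequest request = new IndexRequest("records")
                .id(event.identifier())
                .source(Map.of(
                        "title", event.title(),
                        "abstract", event.abstractText()));
        client.index(request, RequestOptions.DEFAULT);
    }
}

// Minimal event carrying the already-converted pivot fields.
record RecordSavedEvent(String identifier, String title, String abstractText) {}
```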

I will finish my rant by saying that GeoNetwork needs Open Data as much as Open Data needs GeoNetwork. Not acknowledging this is, I think, a mistake that will eventually seal the project in its figurative grave.

Thanks again for reading this far!

···

camptocamp
INNOVATIVE SOLUTIONS
BY OPEN SOURCE EXPERTS

Olivier Guyot

Geospatial Developer

+49 89 2620 89 924

Hello,

I agree with most of the points raised here and would like to be part of efforts to rearchitect GeoNetwork to simplify it and make it more useful and sustainable.

What I would really like is if, in this new architecture, we could use RDF (DCAT or something like it) as the intermediary format.

The ease of linking metadata fragments, definitions, codelists, thesauri, etc. would be a huge advantage. And being open-world, the ability to move losslessly between the various metadata standards and profiles would be greatly simplified.

Cheers,

Byron Cochrane

OpenWork Ltd
Nelson, New Zealand
byron@anonymised.com
+64 21 794 501
https://www.openwork.nz/

“The whole problem with the world is that fools and fanatics are so certain of themselves, yet wiser people so full of doubt” - Bertrand Russell
"Never doubt that a small group of thoughtful, committed citizens can change the world; indeed, it’s the only thing that ever has.” - Margaret Mead


Hi Olivier

Some comments:

  • The responsibility of complying with the schema validation rules falls on the user, since he/she has full control over the XML document; then comes the frightening (again) experience of validating the document manually and painfully fixing each of the dozens of errors that logically appear; this of course varies for each schema, each set of rules and so on, much to the user’s disbelief

Not sure if I get this: you mean users get errors because they work directly with the XML document? Users should use the editor form instead, not edit the raw XML content.

Another thing is that XSD validation errors are not very descriptive and are hard to understand, but that is not really related to GeoNetwork. I can also agree that the distinction between XSD (structure validation) and Schematron rules (content validation) can be confusing for users, but again, this is not something that GeoNetwork invented: Schematron has been a standard way to validate the content of XML documents for a long time.

If we want to use a Java model and implement the validation rules in another way, that sounds fine to me. But I think the current implementation should also be reviewed in perspective: the technologies used when the project started 20 years ago were the standard way to handle XML at the time.

My point: let’s define what we want to do and move forward, instead of constantly blaming technologies that look bad now but were probably the best available when the project started and for several years after.

  1. Rewrite a new GeoNetwork core to revolve around a CatalogRecord object model, written simply in Java; storing these objects will most likely be trivial, even with many fields and child objects (contacts, online resources, etc.)

You mean creating a generic object model for the different metadata standards, to be stored in the database? And I guess using some conversion process to generate the XML in Dublin Core, ISO19139 or ISO19115-3, for example?

I am not against that, but it can be difficult to unify all the different properties of the metadata standards (Dublin Core, ISO19139, etc.). Also, if you check the XSD for ISO19139 or ISO19115-3, it has hundreds of properties, not counting the extensions to the schemas that some countries use. I’m just saying that the object model should be defined properly to consider all these cases.

Also, “Rewrite a new GeoNetwork core” sounds “cool” from a technical point of view, but it’s also a huge effort of months or even years of work. Again, I’m not against this, but we should plan the work and not create false expectations.

  3. Write transformations from XML to this POJO and back, once for each schema; this is where the real value lies and a critical part of the process; for this I suggest working with Groovy which offers great capabilities both in reading and outputting XML; Groovy was used until some time ago in GeoNetwork to provide a very efficient and maintainable formatter, sadly removed since

“Groovy was used until some time ago in GeoNetwork to provide a very efficient and maintainable formatter, sadly removed since”, seriously?

I’m going to try to be cautious on this, but come on … have you tried to debug the formatter code in Java when you get an error or an unexpected result? Have you spent days in a customer setup trying to get the formatter working, trying to figure out cryptic errors apparently due to issues with libraries used by Groovy, without any result?

Please check https://github.com/geonetwork/core-geonetwork/issues/1350, at least the last 2 comments in the ticket: another person spent 10 days on issues related to the Groovy formatter; for him it was “ok” as he got a fix. Unfortunately, for our customer the solution didn’t work, and after 2 days we had to give up on the Groovy formatter to get the system working in production, switching to the XSLT formatter, which worked fine.

  4. Write a separate service which receives notifications through an event bus and is in charge of indexing the records in a search engine (e.g. ElasticSearch); this means the new geonetwork-ui could seamlessly use this new backend

We have the “micro-services” projects, which seem like good research projects; that is really good, as long as we don’t lose focus on getting a production-ready system in the future. But I miss a clear architecture of the components to implement and of how to integrate them.

This is my very personal opinion, please don’t take it as criticism, as that’s not the intention. I’ve worked on some of these services and there is quite a lot of effort involved. For sure you can disagree, as I could be missing things here.

About geonetwork-ui, I haven’t checked it that much, but as indicated in a previous mail, it looks like a bunch of components, which is really good, but it requires developing different applications to replace the current system, and that is an effort of months. Which is fine, but as also indicated in a previous mail, customers are not willing to fund work around this unless there is a clear roadmap, and it is our responsibility to define one.

Regards,
Jose García


Vriendelijke groeten / Kind regards,

Jose García


Veenderweg 13
6721 WD Bennekom
The Netherlands
T: +31 (0)318 416664

Please consider the environment before printing this email.

Thanks for your responses Byron and Jose.

Jose:

Not sure if I get this: you mean users get errors because they work directly with the XML document? Users should use the editor form instead, not edit the raw XML content.

Regarding validation: even when using the editor form, the user is very close to the actual XML code and, as such, has great freedom in how he/she authors the metadata: adding and removing elements, setting attributes, etc. The last time I worked on a project which did validation, the user would still often end up with dozens of errors when requesting a validation. And what I want to emphasize is that these errors should be none of the user’s business: the editor should simply state which fields are required or not, which values are allowed, etc., and then proceed to produce a valid XML output.
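
As a sketch of how this could become declarative with a POJO-based model: required fields and allowed values declared once, checked with the standard Bean Validation API (the EditableRecord class is made up, and an implementation such as Hibernate Validator is assumed on the classpath):

```java
// Sketch only: declarative constraints on a record model, so the editor can
// map violations to form fields instead of surfacing raw XSD errors.
import java.util.Set;

import javax.validation.ConstraintViolation;
import javax.validation.Validation;
import javax.validation.Validator;
import javax.validation.constraints.NotBlank;
import javax.validation.constraints.Pattern;

class EditableRecord {
    @NotBlank(message = "A title is required")
    String title;

    @Pattern(regexp = "https?://.*", message = "The online resource must be a URL")
    String onlineResourceUrl = "ftp://legacy.example.org"; // violates the pattern
}

public class RecordValidationDemo {
    public static void main(String[] args) {
        Validator validator = Validation.buildDefaultValidatorFactory().getValidator();
        Set<ConstraintViolation<EditableRecord>> errors =
                validator.validate(new EditableRecord());
        // Each violation points at a field the editor UI can highlight.
        errors.forEach(v ->
                System.out.println(v.getPropertyPath() + ": " + v.getMessage()));
    }
}
```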

> If we want to use a Java model and implement the validation rules in another way, it sounds fine to me. But I think the current implementation should also be reviewed in perspective: the technologies used when the project started 20 years ago were the standard way to handle XML at that time.
>
> My point: let’s define what we want to do and move forward, instead of constantly blaming the technologies as bad (judged by today’s standards); they were probably the best ones when the project started and for several years after.

That is exactly my intention! Technological choices are often only relevant in a certain period or context; blaming them endlessly is pointless. Let’s move forward and reassess these choices.

> Also, “rewrite a new GeoNetwork core” sounds “cool” from a technical view, but it is also a huge effort: months or even years of work. Again, not against this, but we should plan the work and not create false expectations.

100% agreed too. As the saying goes: “the best time to plant a tree was 20 years ago. The second best time is now.”

> I’m going to try to be cautious on this, but come on … have you tried to debug the formatter code in Java when you get an error or an unexpected result? Have you spent days in a customer setup trying to get the formatter working, trying to figure out cryptic errors apparently due to issues with the libraries used by Groovy, without any result?

OK, I did not have that side of the story; your reaction is understandable then :slight_smile: Honestly though, Groovy was just a technological suggestion. IMO the only requirements are that the language used 1/ is still “active” (i.e. maintained, with a community, etc.) and 2/ allows running tests. Basically, a schema transformation should come with input/output fixtures for both directions and produce the expected result in both directions (see the sketch below). That’s the end of it.
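For illustration, a fixture-based round-trip test could be as small as the following JUnit 5 sketch; the Transformations class and the fixture file names are made up for the example:

```java
// Hypothetical round-trip tests for a schema transformation: each direction
// is checked against a stored fixture file.
import static org.junit.jupiter.api.Assertions.assertEquals;

import java.nio.file.Files;
import java.nio.file.Path;
import org.junit.jupiter.api.Test;

class SchemaTransformationRoundTripTest {

    @Test
    void internalToIso19115MatchesFixture() throws Exception {
        String input = Files.readString(Path.of("fixtures/record-internal.xml"));
        String expected = Files.readString(Path.of("fixtures/record-iso19115.xml"));
        // A real test would likely compare XML trees rather than raw strings.
        assertEquals(expected, Transformations.toIso19115(input));
    }

    @Test
    void iso19115ToInternalMatchesFixture() throws Exception {
        String input = Files.readString(Path.of("fixtures/record-iso19115.xml"));
        String expected = Files.readString(Path.of("fixtures/record-internal.xml"));
        assertEquals(expected, Transformations.fromIso19115(input));
    }
}
```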

Byron:

> What I would really like is if in this new architecture we could use RDF (DCAT or something like it) as the intermediary format,

I agree, RDF seems like a very good fit for metadata. I wouldn’t store the metadata as RDF though; that sounds like another can of worms. My idea was actually to get most of the inspiration for the POJO format from DCAT 2 (or 3) and its application profiles (DCAT-AP, StatDCAT-AP, GeoDCAT-AP…), as it feels like a very good universal baseline for describing metadata (a rough sketch is below). This of course has to be thoroughly discussed, as it is a strategic matter.
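Purely to show the flavour, and not as a proposal for the actual model, a few DCAT-inspired properties expressed as plain Java could look like this; all names are illustrative only:

```java
// Illustrative pivot model borrowing DCAT vocabulary, independent of any
// particular XML schema.
import java.net.URI;
import java.util.List;

public record DatasetRecord(
        URI identifier,
        String title,                     // dct:title
        String description,               // dct:description
        List<String> keywords,            // dcat:keyword
        List<Distribution> distributions  // dcat:distribution
) {
    /** dcat:Distribution, reduced to two properties for the sketch. */
    public record Distribution(URI accessUrl, String format) {}
}
```

Storage and XML serialization would stay separate concerns; the POJO is just the pivot model.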

Cheers,


···

camptocamp
INNOVATIVE SOLUTIONS
BY OPEN SOURCE EXPERTS

Olivier Guyot

Geospatial Developer

+49 89 2620 89 924

I really appreciate your clear communication and overall vision. I am afraid I had too little experience with the project at the last Bolsena to contribute to such planning.

I like the strong architecture direction you propose. Much of the difficulty I have found in the codebase is a side effect of not keeping very clear API boundaries, and of gradual fixes blurring the lines between systems. Nothing big, and all understandable at the time I am sure, but it adds up to a system which is error-prone to modify.

I also want to provide some positive feedback:

Although I am not good at it, I perfectly understand why XSL was an amazing architecture choice. The ability to drop in a random schema required by a different region and get something up on screen visually, and editable too, is a pretty darn cool accomplishment. I have certainly helped make worse things (the GeoTools schema-driven parsers come to mind).

I also want to recognize that folks, myself included, have an obligation to existing customers and the codebase. As someone who has struggled hard to find funding for necessary R&D (most noticeably the JAI → ImageN project), it is very hard to begrudge development continuing on a foundation that needs to be replaced. It has been very hard to watch more and more functionality be developed on JAI when that project was abandoned in 2006. While I would love to have that “investment” head into the ImageN project … I have to recognize that operational funding is not the same thing as development funding.

Personally, for core-geonetwork it means I have been fighting to introduce … version numbers and shore up service context API boundaries on an operational basis, and have not had an opportunity to join you on GeoNetwork R&D.

One thing that has impressed me about the core-geonetwork community is the ability to surface medium- and long-term issues and do some fundraising and sprints (and contracts?). Having a user meeting with affected parties and so on really goes a long way towards keeping the community together in the face of change. This is a lesson I am trying to learn and apply to my other projects that are even more limited by operational constraints.

Jody

···


Jody Garnett

Indeed Jody, a project needs experiments to stay alive, and we cannot pretend to implement a feature in one go given so much variety in our user base, from the local to the global level. User needs and expectations are probably not even equivalent for the same feature. Implementing a thesaurus manager based on RDF in 2006 was an experiment. Faceted and multilingual search in Lucene was an experiment, with still-annoying drawbacks in 3.x. Subtemplates and XLinks are experiments (with features/fixes in some forks). Even the indexing progress bar is an experiment, one which does not work well when more than one catalogue runs in the same container. The draft workflow was built through a couple of experiments and live testing over the years, and is now stabilizing…

GeoNetwork is not a library dealing with a couple of specs. If we look at the PR history, we work daily on a large scope of themes, because users are diverse and GeoNetwork is a multifaceted project (maybe too much so, and that is a difficulty). In recent days, organisations supporting versions 3.12 and 4, and updating their test (and even production) environments weekly, have been focusing on various parts of the application:

  • EEA is promoting thematic portals (and would like a simplified management interface for their configuration) https://sdi.eea.europa.eu/catalogue/srv/api/sources
  • BRGM is bridging their codelist registry with the catalogue, looking forward to better citation of datasets in scientific publications, and starting to use ISO 19115-3
  • Ifremer is noticing minor glitches in faceted search and is working on DOI management
  • Metawal is working on a thematic portal focusing on INSPIRE conformity and is also chasing ghost records in a quite old CSW harvester
  • RZA is harvesting more than 20 nodes, and we are still discovering room for improvement in OAI-PMH and Dublin Core
  • OIEau is managing more than 10 catalogues in various regions of the world, with a focus on map visualizations
  • The HNAP team is improving pluggable schemas and advanced editor features, working on Keycloak and CMIS, and proposing new ideas on bridging GeoServer and GeoNetwork security

An issue is that we do not always have people/companies to provide maintenance over the long run once a feature is done and the supporting project is over. Another difficulty is getting rid of “old stuff”. For sure, a more modular structure would help with that, but more people’s energy focused on it would also be of great help, e.g. the Jeeves footprint is decreasing but there is still part of the API to migrate …

Francois

PS: Minor details:

On Thu, 23 Sept 2021 at 17:49, Jody Garnett <jody.garnett@anonymised.com.31…> wrote:

Hey Francois:

Although I have been trying to take part more in core-geonetwork, I have also met the pushback you experienced on “experiments”.

Really, we need experiments to have a project that is alive. We have well-established communication patterns (proposals, steering committee) for individuals who wish to have input into planning and control risk. But lack of movement is an even bigger risk; we are all aware of examples like the user interface, Elasticsearch, Java 17 compatibility, … or even small build changes to introduce version numbers.

So I do not know what encouragement you need, Francois, but anything that can be done to create a healthy environment for experimentation in core-geonetwork has my support.

As for those having trouble keeping up with changes, I have some ideas in https://github.com/GeoCat/experiment-hnap on how to manage change and customizations without forking. This direction has been quite successful, allowing GeoCat to better support our customers while taking part more directly in core-geonetwork. I have a presentation on using WAR overlays at FOSS4G next week for anyone wishing to learn more.


Jody Garnett
