[Geoserver-devel] Consider switching CatalogInfo identity generation to UUID

Hi,
currently the identifier of every CatalogInfo object is generated from the object type plus a local UID:

String uid = new UID().toString();
OwsUtils.set( o, "id", o.getClass().getSimpleName() + "-" + uid );

I’m looking into a use case where it would be beneficial to use a plain UUID instead, that is,
something like:

String uuid = UUID.randomUUID().toString();
OwsUtils.set( o, "id", uuid );
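
For readers who want to compare the two id styles side by side, here is a minimal standalone sketch. The class `CatalogIdSketch` and its method names are made up for illustration and are not GeoServer code; it assumes the `UID` in the snippet above is `java.rmi.server.UID`, and it only uses JDK classes.

```java
import java.rmi.server.UID;
import java.util.UUID;

/** Hypothetical helper contrasting the two id styles; not part of GeoServer. */
final class CatalogIdSketch {

    /** Current style: simple class name, a dash, then a host-local UID (assumed java.rmi.server.UID). */
    static String prefixedId(Object o) {
        return o.getClass().getSimpleName() + "-" + new UID();
    }

    /** Proposed style: a plain random (version 4) UUID in its canonical 36-character form. */
    static String uuidId() {
        return UUID.randomUUID().toString();
    }

    public static void main(String[] args) {
        // Any object works as a stand-in for a CatalogInfo here.
        System.out.println(prefixedId(new StringBuilder()));
        System.out.println(uuidId());
    }
}
```

The first line printed carries the type prefix and a host-specific suffix, the second is a bare UUID that any UUID-aware tool can consume as is.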

The rationale for the change lies in organizations that have several GeoServer instances installed
and in use by different departments, controlled by different people, and running different
versions (the case I’m looking at has 20+ GeoServer installations maintained by separate
admins, with no short-term opportunity to consolidate them into a single install that has
the same version and plugin set for everybody).

To bring a little management to this situation, they want at least a
centralized catalog that contains info about all the servers, one that uses
RDF-style triples instead of the classic GeoNetwork harvesting.

Each GeoServer would then be equipped with a simple catalog listener that informs the central
catalog about new layers in such a way that renames do not affect the central system,
hence the idea of using UUIDs to identify each layer.
Those could be generated by a separate plugin and stored in the metadata map of each
CatalogInfo object, but it would be simpler if the ids were UUIDs in the first place.
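
To illustrate why a stable id keeps renames harmless for the central system, here is a rough sketch of a central registry keyed by layer id. Everything in it (`CentralCatalogSketch`, `layerAdded`, `layerRenamed`, the layer names) is hypothetical and stands in for whatever the real listener and central catalog would do; it does not use the GeoServer listener API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/** Hypothetical stand-in for the central catalog; all names are illustrative. */
final class CentralCatalogSketch {

    // Maps the stable layer id to the most recently reported layer name.
    private final Map<String, String> layerNamesById = new HashMap<>();

    /** Called when a server reports a new layer. */
    void layerAdded(String id, String name) {
        layerNamesById.put(id, name);
    }

    /** Called on rename: the key stays the same, only the stored name changes. */
    void layerRenamed(String id, String newName) {
        layerNamesById.replace(id, newName);
    }

    String nameOf(String id) {
        return layerNamesById.get(id);
    }

    int size() {
        return layerNamesById.size();
    }

    public static void main(String[] args) {
        CentralCatalogSketch central = new CentralCatalogSketch();
        String id = UUID.randomUUID().toString();
        central.layerAdded(id, "roads");
        central.layerRenamed(id, "highways");
        System.out.println(central.nameOf(id) + ", entries: " + central.size());
    }
}
```

Because the registry never uses the layer name as a key, a rename updates one entry instead of orphaning the old one and creating a new one.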

Since we never advertised the GeoServer internal ID format, this would not be a breaking change,
and in fact all we want is uniqueness, not a particular format.
Switching to UUIDs would also help in situations where we have a GeoServer cluster in a
multi-master configuration, as it would make it practically impossible for different masters to
come up with the same ID for different objects.

The change would affect only trunk.

Opinions?

Cheers
Andrea

Ing. Andrea Aime
GeoSolutions S.A.S.
Tech lead

Via Poggio alle Viti 1187
55054 Massarosa (LU)
Italy

phone: +39 0584 962313
fax: +39 0584 962313
mob: +39 339 8844549

http://www.geo-solutions.it
http://geo-solutions.blogspot.com/
http://www.youtube.com/user/GeoSolutionsIT
http://www.linkedin.com/in/andreaaime
http://twitter.com/geowolf


+1

_______________________________________________
Geoserver-devel mailing list
Geoserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geoserver-devel

--
Gabriel Roldan
OpenGeo - http://opengeo.org
Expert service straight from the developers.

I am OK with the change, but I don't quite understand the rationale. So a
few questions, and excuse the ignorance here.

1. What is the significance of switching from UID to UUID? Is the former
only unique to the host it is generated on?
2. What is the significance of dropping the class name prefix? Is that so
it can more easily be parsed by other tools using UUIDs?

Like you say, there is no external contract for the ids other than that they be
unique inside GeoServer, so I am fine with the change; I just want to
clarify my understanding. That said, having the class name as a prefix is nice
because it provides a bit of context about what type of
object the identifier references, although admittedly I don't spend much time
looking at catalog object ids out in the wild in isolation.

--
Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

On Tue, Apr 24, 2012 at 8:34 PM, Justin Deoliveira <jdeolive@anonymised.com> wrote:

> I am OK with the change, but I don't quite understand the rationale. So a
> few questions, and excuse the ignorance here.
>
> 1. What is the significance of switching from UID to UUID? Is the former
> only unique to the host it is generated on?

A UID is unique only within the host that generated it, whereas a UUID is globally unique.

> 2. What is the significance of dropping the class name prefix? Is that so
> it can more easily be parsed by other tools using UUIDs?

Correct, there is no need to set up particular string parsing rules to extract the
UUID out of the string.

> Like you say, there is no external contract for the ids other than that they be
> unique inside GeoServer, so I am fine with the change; I just want to
> clarify my understanding. That said, having the class name as a prefix is nice
> because it provides a bit of context about what type of
> object the identifier references, although admittedly I don't spend much time
> looking at catalog object ids out in the wild in isolation.

Yeah, me neither. Also, in some environments setting up the config by hand
is (unfortunately) the only allowed option,
and people pull out quite some hair trying to understand what the "expected
format" is. A UUID, by contrast, is pretty recognizable.
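
On the parsing point, a small sketch may make the difference concrete. It assumes a consumer that wants a `java.util.UUID` out of the id string; the class `IdParsingSketch`, the "TypeName-uuid" layout, and the `LayerInfoImpl` prefix are illustrative assumptions only (the existing ids use a UID suffix, not a UUID).

```java
import java.util.UUID;

/** Hypothetical consumer-side parsing; names and the prefixed layout are illustrative. */
final class IdParsingSketch {

    /** Plain UUID id: the JDK parser is all a consumer needs. */
    static UUID parsePlain(String id) {
        return UUID.fromString(id); // throws IllegalArgumentException if malformed
    }

    /** Assumed "TypeName-uuid" id: the consumer must know to cut at the first dash. */
    static UUID parsePrefixed(String id) {
        int dash = id.indexOf('-');
        if (dash < 0) {
            throw new IllegalArgumentException("no type prefix in: " + id);
        }
        return UUID.fromString(id.substring(dash + 1));
    }

    public static void main(String[] args) {
        UUID uuid = UUID.randomUUID();
        System.out.println(parsePlain(uuid.toString()).equals(uuid));
        System.out.println(parsePrefixed("LayerInfoImpl-" + uuid).equals(uuid));
    }
}
```

With a prefix, every external tool has to agree on the splitting rule; with a plain UUID, the standard parser is the whole contract.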

Cheers
Andrea

Cool, sounds good. +1.

What will be the upgrade strategy? Will UUIDs be used only for newly created
configuration objects, or will old ids be recoded on
startup?

--
Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

Peter Vretanos has a WFS change request "Add service id field to service identification section" (OGC 11-117):
https://portal.opengeospatial.org/files/?artifact_id=45706

This would expose an optional service identifier UUID via the capabilities document to support catalogue harvesting. Do you think this should be extended to layers / feature types?

--
Ben Caradoc-Davies <Ben.Caradoc-Davies@anonymised.com>
Software Engineer
CSIRO Earth Science and Resource Engineering
Australian Resources Research Centre

Yep, I guess that would be a good idea, provided that it is optional. Very often people forget to set unique namespaces, and in WMS those are missing anyway.

Cheers
Andrea

On 27 Apr 2012 05:44, “Ben Caradoc-Davies” Ben.Caradoc-Davies@anonymised.com wrote:

Peter Vretanos has a WFS change request “Add service id field to service identification section” (OGC 11-117):
https://portal.opengeospatial.org/files/?artifact_id=45706

This would expose an optional service identifier UUID via the capabilities document to support catalogue harvesting. Do you think this should be extended to layers / feature types?
