[GeoNetwork-users] Duplication of records

Hi
I am new at this Geonetwork thing so this may be a really dumb question,
sorry if that is so!

I have successfully got my little geonetwork site running but have
discovered it is very easy to add the same metadata records more than once,
so they then appear twice in the results list. Is there a way to prevent
this happening within the software? I am generally using the Batch Upload
for XML files. I couldn't find anything obvious in the manual for this.

Any help/suggestions would be much appreciated.

Thanks very much

Caroline Levey
Senior GIS Developer
SeaZone Solutions Limited

SeaZone provides Marine Geographic Information Solutions from instrument to
desktop, supporting decision making in the Marine Environment and Coastal
Zone

BETTER DATA, BETTER SCIENCE, IMPROVED DECISIONS

Tel: +44 (0) 870 013 0607
Fax: +44 (0) 870 013 0608

Email: caroline.levey@anonymised.com <mailto:info@anonymised.com>
Web: www.SeaZone.com <http://www.seazone.com/>

Post: Red Lion House, Bentley, Hampshire GU10 5HY, United Kingdom
Registered in England and Wales: No. 4969561.
Registered Office: Admiralty Way, Taunton, Somerset, TA1 2DN

------------------------------------------------------------------------------------
This email and any files transmitted with it are confidential
and intended only for use by the addressee. If you
receive this email by mistake please notify SeaZone
immediately.
------------------------------------------------------------------------------------

Hmm. Sounds difficult. Do you mean the content of the metadata is exactly the same (and I mean down to the last byte)?

I suppose you could think about adding a constraint in the database to prevent duplicate metadata records, but AFAIK, there is nothing to prevent you from uploading duplicate records through the user interface.

At this point, I think the only way is to sort out the duplicate records before ever trying to upload them to GeoNetwork.
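
As a minimal sketch of that kind of pre-upload check, assuming the files to import sit in one folder and that "duplicate" means identical down to the last byte, the files could be grouped by a hash of their contents. The function name and the `*.xml` pattern are assumptions made for illustration; nothing here is part of GeoNetwork.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_identical_files(folder):
    """Group the XML files in `folder` by the SHA-256 of their bytes.

    Returns a list of groups; each group is a list of file names whose
    contents are byte-for-byte identical.
    """
    by_digest = defaultdict(list)
    for path in sorted(Path(folder).glob("*.xml")):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        by_digest[digest].append(path.name)
    # Keep only digests shared by more than one file.
    return [names for names in by_digest.values() if len(names) > 1]
```

Any group that comes back holds copies of the same file, so all but one of them can be set aside before the batch import runs. Files that differ only in whitespace or encoding would not be caught by this check.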

Regards,
Jason

--
Jason P. Pickering
Consultant
WHO/Ministry of Health Zambia
email: pickeringj@anonymised.com
tel: +260968395190

------------------------------------------------------------------------------
Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day
trial. Simplify your report design, integration and deployment - and focus on
what you do best, core application coding. Discover what's new with
Crystal Reports now. http://p.sf.net/sfu/bobj-july
_______________________________________________
GeoNetwork-users mailing list
GeoNetwork-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geonetwork-users
GeoNetwork OpenSource is maintained at http://sourceforge.net/projects/geonetwork

Perhaps you could describe how it is that the duplicates came to exist in the import set? Or is this a situation where you merely wish to guard against the possibility of duplicates?

---
A. Soroka
Digital Research and Scholarship R & D
the University of Virginia Library

A possible solution would be to add a "find similar" function.
GeoNetwork uses Lucene, and Lucene has excellent facilities for this,
usually used for a "did you mean...?" function when a search returns no or
few results. But we could apply it here, if people think it would be useful.
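
To make the "find similar" idea concrete, here is a rough sketch that scores a new record against existing ones by word overlap (Jaccard similarity). It is only a stand-in to show the shape of such a check: it does not use Lucene or any GeoNetwork API, and the function names and the 0.8 threshold are assumptions.

```python
import re


def tokens(text):
    """Lower-case alphanumeric word tokens of a string, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Size of the intersection of two sets divided by the size of their union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def find_similar(candidate, existing, threshold=0.8):
    """Return (record_id, score) pairs for records in `existing` (a dict of
    id -> text) that overlap with `candidate` at or above `threshold`,
    best match first."""
    cand = tokens(candidate)
    scored = [(rid, jaccard(cand, tokens(text))) for rid, text in existing.items()]
    matches = [(rid, score) for rid, score in scored if score >= threshold]
    return sorted(matches, key=lambda pair: pair[1], reverse=True)
```

A check like this could warn the user at upload time rather than block the upload, since two genuinely different records may still share most of their wording.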

Kind regards
Heikki Doeleman

Caroline,

I think that the default database provided with GN, McKoi, lets you have
records with duplicate fileIdentifiers (uuid) in the database.

However, if you use a database like Oracle, it won't let you upload records
with a duplicate identifier. Maybe this could help you.
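
The behaviour described here is what a uniqueness constraint on the identifier column gives you. The sketch below shows the effect using SQLite, chosen only because it ships with Python. The table and column names are hypothetical and are not GeoNetwork's actual schema, and whether a given GeoNetwork database setup declares such a constraint would need to be checked.

```python
import sqlite3


def make_catalogue():
    """Create an in-memory database with a UNIQUE constraint on uuid."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE metadata ("
        " id INTEGER PRIMARY KEY,"
        " uuid TEXT NOT NULL UNIQUE,"
        " data TEXT NOT NULL)"
    )
    return conn


def insert_record(conn, uuid, xml):
    """Insert a record; return False if the uuid is already present."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO metadata (uuid, data) VALUES (?, ?)", (uuid, xml)
            )
        return True
    except sqlite3.IntegrityError:
        # Raised by the UNIQUE constraint when the uuid already exists.
        return False
```

With the constraint in place the database itself refuses the second copy, whichever route the upload takes.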

Andrew

Thanks to all who have responded so far. I will respond to each of you in this email:

I am not sure whether we are doing things very differently from how others do
it, but we create ISO 19139 compliant XML files using a separate system. These
are currently uploaded into an Oracle system and simply stored as XML objects.
However, the end user has a requirement to publish the metadata so that more
people have access to the information. Having looked around, we decided
GeoNetwork would be suitable for that, so we are working out the process to
get the data from the XML files into GeoNetwork. The batch process works well,
except that there is no check to see whether a record already exists in the db.
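
One way such a check could look, sketched outside GeoNetwork: read the gmd:fileIdentifier from each ISO 19139 file and compare it with the identifiers already loaded. The two namespace URIs are the standard ISO 19139 ones. The function names are assumptions, and how the set of known identifiers is obtained from the catalogue is left open.

```python
import xml.etree.ElementTree as ET

# Namespaces used by ISO 19139 metadata documents.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}


def file_identifier(xml_text):
    """Return the gmd:fileIdentifier of an ISO 19139 record, or None."""
    root = ET.fromstring(xml_text)
    node = root.find("gmd:fileIdentifier/gco:CharacterString", NS)
    if node is None or node.text is None:
        return None
    return node.text.strip() or None


def split_new_and_seen(records, known_ids):
    """Partition (name, xml_text) pairs by whether their identifier is new.

    Records without an identifier are treated as new, since they cannot
    be matched against anything. `known_ids` is not modified.
    """
    seen_ids = set(known_ids)
    new, duplicates = [], []
    for name, xml_text in records:
        ident = file_identifier(xml_text)
        if ident is not None and ident in seen_ids:
            duplicates.append(name)
        else:
            new.append(name)
            if ident is not None:
                seen_ids.add(ident)
    return new, duplicates
```

Only the files in the first list would then be copied into the batch import folder. This relies on the source system giving each record a stable fileIdentifier.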

Jason / A. Soroka,
It is to guard against the possibility. We are setting up a system which will
allow a variety of people to upload metadata, so two different people could
upload the same metadata (it would be the same XML file uploaded twice). As
the Batch Import just processes all files in the specified folder, someone
could have left a file in there that had already been uploaded, or someone
could accidentally click Go on the Batch Import and it would import
everything again.

Heikki,
I would certainly think it would be useful, but maybe I am the only one who
doesn't trust their users!!!

Andrew,
We are intending to use PostgreSQL in the background (which I am also a
newbie with), so maybe it has a similar setup. I will investigate that,
thanks.

Thanks all, and any further ideas are always welcome.

Cheers
Caroline
