[SAC] svnsync

A number of months ago, Frank filed a ticket for using svnsync:

http://trac.osgeo.org/osgeo/ticket/91

I spent a little time this afternoon investigating what is required to get this going, and it turns out there is absolutely nothing we need to do on osgeo1 to support this. I propose that we stand up mirrors of all osgeo repositories on a TelaScience blade, and have them do their sync every three hours. Additionally, I propose that we stop doing both the daily full svn dumps and the incremental svn dumps on osgeo1.

These can be done on a mirror if we still want them, but they generate a lot of load on osgeo1, and we are looking to offload some of our static backup costs.

I will coordinate this effort and report back when it is done. No action should be required on osgeo1, and there will be no outage (planned ones, anyway :wink: )

Howard

Hi Howard,

2007/12/12, Howard Butler <hobu.inc@gmail.com>:

A number of months ago, Frank filed a ticket for using svnsync:

http://trac.osgeo.org/osgeo/ticket/91

I spent a little time this afternoon investigating what is required to
get this going, and it turns out there is absolutely nothing we need
to do on osgeo1 to support this. I propose that we stand up mirrors

You're right. Today I successfully used the svnsync command to maintain live
read-only offsite backups of the GRASS Subversion repository. It is working
like a charm :-)

Regards, Martin Landa

--
Martin Landa <landa.martin@gmail.com> * http://gama.fsv.cvut.cz/~landa *

I can now report that we have an svn mirror:

http://svnmirror.osgeo.org

It uses svnsync to mirror all OSGeo project repositories hourly, including ones like OpenLayers and MapBuilder. QGIS and GeoTools cannot be mirrored with svnsync because the server versions of those repositories are too old (1.3 and 1.1, respectively). For svnsync to be able to pull from a repository, the server version must be at least 1.4.
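For anyone wanting to set up a similar mirror, here is a minimal sketch of the svnsync steps as a small Python wrapper. It is not the script running on svnmirror.osgeo.org: the mirror path, the source URL, and all function names are hypothetical. It assumes `svnadmin` and `svnsync` are on the PATH, and it relies on the fact that the destination repository needs a `pre-revprop-change` hook that permits revision property changes before `svnsync initialize` will succeed.

```python
import os
import subprocess

# Per the thread: the source server must be Subversion 1.4 or newer.
MIN_SOURCE_VERSION = (1, 4)


def source_supports_svnsync(version):
    """Return True if a version string such as '1.4.6' is at least 1.4."""
    major_minor = tuple(int(part) for part in version.split(".")[:2])
    return major_minor >= MIN_SOURCE_VERSION


def mirror_url(mirror_path):
    """file:// URL for a local mirror repository (POSIX paths assumed)."""
    return "file://" + os.path.abspath(mirror_path)


def init_command(mirror_path, source_url):
    """One-time command that ties the empty mirror to its source."""
    return ["svnsync", "initialize", mirror_url(mirror_path), source_url]


def sync_command(mirror_path):
    """Command to run from cron; it copies any new revisions to the mirror."""
    return ["svnsync", "synchronize", "--non-interactive",
            mirror_url(mirror_path)]


def create_mirror(mirror_path, source_url):
    """Create the mirror repository, allow revprop changes, and initialize."""
    subprocess.run(["svnadmin", "create", mirror_path], check=True)
    hook = os.path.join(mirror_path, "hooks", "pre-revprop-change")
    with open(hook, "w") as handle:
        # svnsync must be able to set revision properties on the mirror.
        handle.write("#!/bin/sh\nexit 0\n")
    os.chmod(hook, 0o755)
    subprocess.run(init_command(mirror_path, source_url), check=True)


def sync_mirror(mirror_path):
    """Run one synchronization pass and return svnsync's exit code."""
    return subprocess.run(sync_command(mirror_path)).returncode
```

After a one-time `create_mirror("/srv/svnmirror/grass", "https://svn.example.org/grass")` (both values are placeholders), an hourly cron job would only need to call `sync_mirror` for each repository.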

The mirror is running on a TelaScience machine. AFAIK, only Mateusz, John, and I are able to log in to this machine at this time (it isn't on the LDAP or anything).

I still need to do a little work so that an email is sent whenever the syncing cron job fails and writes output to stderr. If anyone has some code/ideas to make that simple, let me know.
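One possible way to do this, offered as a sketch rather than as what was eventually deployed: wrap the sync command in a small Python script that captures stderr and sends mail only when the command exits non-zero or writes anything to stderr. The addresses, the function names, and the assumption of an SMTP server listening on localhost are all hypothetical.

```python
import smtplib
import subprocess
from email.message import EmailMessage

ALERT_FROM = "svnmirror@example.org"  # hypothetical sender address
ALERT_TO = "sac-alerts@example.org"   # hypothetical recipient address


def build_alert(cmd, returncode, stderr):
    """Return an EmailMessage if the run looks unhealthy, otherwise None."""
    if returncode == 0 and not stderr.strip():
        return None
    msg = EmailMessage()
    msg["Subject"] = f"svnsync problem (exit {returncode}): {' '.join(cmd)}"
    msg["From"] = ALERT_FROM
    msg["To"] = ALERT_TO
    msg.set_content(stderr or "(no stderr output)")
    return msg


def send_via_local_smtp(msg):
    """Hand the message to an SMTP server assumed to run on localhost."""
    with smtplib.SMTP("localhost") as smtp:
        smtp.send_message(msg)


def run_and_report(cmd, send=send_via_local_smtp):
    """Run cmd, mail its stderr if something went wrong, return exit code."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    alert = build_alert(cmd, result.returncode, result.stderr)
    if alert is not None:
        send(alert)
    return result.returncode
```

The cron entry would then call `run_and_report(["svnsync", "synchronize", "--non-interactive", mirror_url])` for each mirror, where `mirror_url` is a placeholder. Keeping `send` as a parameter makes it easy to swap in another delivery method or to test without a mail server.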

I propose we let it stand for a week or so, and if things look like they are behaving to our satisfaction, we discontinue the 'svnadmin dump' full backups that we are running nightly that cause us lots of i/o and cpu grief.

Howard

Cool. I suggest you tell OSGeo-Discuss.

At Mapbuilder, we have had our svn repository go down before, and we lost some of our history as a result. This svn backup service would have been very useful.

A more difficult, but equally useful service would be the backup of our JIRA issues and Confluence wiki.

--
Cameron Shorter
Geospatial Systems Architect
Tel: +61 (0)2 8570 5050
Mob: +61 (0)419 142 254

Think Globally, Fix Locally
Commercial Support for Geospatial Open Source Solutions
http://www.lisasoft.com/LISAsoft/SupportedProducts.html

On Dec 22, 2007, at 4:04 PM, Howard Butler wrote:

I propose we let it stand for a week or so, and if things look like they are behaving to our satisfaction, we discontinue the 'svnadmin dump' full backups that we are running nightly that cause us lots of i/o and cpu grief.

I have discontinued the full backups of subversion repositories on osgeo1 by commenting out the calls to do so in the scripts. The mirror is up and syncing hourly, and it has been running without trouble for almost a week. This should free up considerable storage and cpu resources, especially during the backup time (~10am European time).

We will still maintain an incremental dump, however, which essentially ends up being a single copy of the repository to which the latest transactions are appended every three hours.
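To illustrate how such an incremental dump can work, here is a minimal sketch. It is an assumption about the general technique, not a copy of the scripts on osgeo1: it remembers the last revision already dumped, asks `svnlook youngest` for the newest one, and runs `svnadmin dump --incremental` over only the new range. Function names and paths are hypothetical, and each range is written to its own file here.

```python
import subprocess


def next_range(last_dumped, youngest):
    """Return (lower, upper) of revisions still to dump, or None if current."""
    if youngest <= last_dumped:
        return None
    return (last_dumped + 1, youngest)


def dump_command(repo_path, lower, upper):
    """svnadmin command that dumps only revisions lower..upper as deltas."""
    return ["svnadmin", "dump", repo_path, "--incremental",
            "-r", f"{lower}:{upper}"]


def youngest_revision(repo_path):
    """Ask svnlook for the newest revision number in the repository."""
    result = subprocess.run(["svnlook", "youngest", repo_path],
                            capture_output=True, text=True, check=True)
    return int(result.stdout.strip())


def dump_new_revisions(repo_path, last_dumped, out_path):
    """Dump revisions newer than last_dumped; return the new high-water mark."""
    pending = next_range(last_dumped, youngest_revision(repo_path))
    if pending is None:
        return last_dumped
    with open(out_path, "wb") as out:
        subprocess.run(dump_command(repo_path, *pending),
                       stdout=out, check=True)
    return pending[1]
```

Starting with `last_dumped = -1` covers the whole history from revision 0; each later run, for example every three hours from cron, only touches the revisions committed since the previous run, which is what keeps the load low compared with a nightly full dump.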

The subversion dumps should no longer be needed as part of the backup(s), and it hopefully will free up the space to get us under the wire. Shawn and Tyler, can you guys take care of trimming the full dump svn stuff as part of the peer1 backup purging?

Howard

On 27-Dec-07, at 11:10 AM, Howard Butler wrote:

The subversion dumps should no longer be needed as part of the backup(s), and it hopefully will free up the space to get us under the wire. Shawn and Tyler, can you guys take care of trimming the full dump svn stuff as part of the peer1 backup purging?

Okay, I think I follow but really I don't understand all the systems we have running and this is part of a pretty big picture. I'd much rather have other SAC folks prepare a list of what folders we do and do not need backed up, then we can submit a ticket to PEER1.

Regarding SVN: what has changed, i.e. which folders no longer need to be backed up? Also, which other folders should we specifically ask to have backed up? Can I please leave it to SAC to figure this out and then tell me what the plan is, so I can help make sure it gets communicated back to PEER1?

Otherwise all I can think of is backing up /var/www which is the main part I know best :wink:

Best wishes,
Tyler

Tyler,

I'll take a whack at writing notes on what needs to be backed up today.

Best regards,
--
---------------------------------------+--------------------------------------
I set the clouds in motion - turn up | Frank Warmerdam, warmerdam@pobox.com
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush | President OSGeo, http://osgeo.org