[SAC] Ideas for the upgrades - discussion

I've been pondering options for how to approach the next round of upgrades.
http://wiki.osgeo.org/wiki/Infrastructure_Transition_Plan_2014

There seem to be two major directions we could take, and they seem
somewhat incompatible with each other. Part of our current trouble is
that we are mixing the two approaches.

1. We move to software RAID 5 or RAID 10 on the machines and continue
with redundant power. The backup plan is that if something goes down,
we recreate the service fresh on another machine that is still up. With
this plan I would suggest a VPS setup using LXC (maybe with Docker) or
OpenVZ. Each project would get its own LXC instance, 100% separate from
other projects. Only the host needs kernel and OS updates; each
sub-container can run its own separate service stack. One way to make
redeployment faster after a failure is to script every site's setup
with Chef, Puppet, Juju, Docker, or some other means (see the
provisioning sketch after option 2). Single machines in this setup
would likely be bigger, with more disks and power redundancy, at
roughly $4,000-5,000 per machine.

2. We move to a cloud-oriented configuration. We didn't really do this
right last time because we didn't know how it all worked then. The
ideal is a series of relatively identical machines, usually without
RAID, where each virtual machine is live-mirrored to one of the other
machines in the cluster. If a particular disk fails, you simply switch
to the hot-copy failover while you fix the original. By distributing
the VMs' secondary disks across different machines you balance the
cluster, so that if any one machine goes down it is quick to spin up
the failovers on the remaining hardware (see the placement sketch
below). Disk contention is avoided because VMs rarely share physical
disks. This is the setup Ganeti and OpenStack are designed for. Single
machines in this setup would likely be smaller, at roughly
$2,000-3,000 per machine.
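
To make the scripted-setup idea in option 1 concrete, here is a rough
Python sketch of what per-project LXC provisioning could look like. It
assumes the LXC 1.x command-line tools (lxc-create, lxc-start,
lxc-attach) on the host; the project names, template, and package lists
are placeholders, and in practice we would more likely express this as
Chef cookbooks or Puppet manifests than a hand-rolled script.

#!/usr/bin/env python
"""Sketch: scripted per-project LXC provisioning (option 1).

Container names, the template, and the package lists below are
hypothetical placeholders, not an agreed-upon configuration.
"""
import subprocess

PROJECTS = {
    # container name -> packages for that project's service stack
    "projecta-web": ["apache2", "libapache2-mod-wsgi"],
    "projectb-trac": ["apache2", "trac"],
}

def provision(name, packages, template="debian"):
    """Create, start, and minimally configure one container."""
    subprocess.check_call(["lxc-create", "-n", name, "-t", template])
    subprocess.check_call(["lxc-start", "-n", name, "-d"])
    # Each container installs its own service stack; the host itself
    # only ever needs kernel and base OS updates.
    subprocess.check_call(
        ["lxc-attach", "-n", name, "--",
         "apt-get", "install", "-y"] + packages)

if __name__ == "__main__":
    for name, packages in PROJECTS.items():
        provision(name, packages)

The point is only that a container can be rebuilt from a script in
minutes rather than by hand, which is what makes the "recreate the
service on another machine" backup plan workable.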
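
To illustrate the balancing idea in option 2: each VM's secondary
(mirror) disk lives on a different node than its primary, and the
secondaries are spread around so that no single surviving node inherits
all the failovers when a machine dies. A toy sketch of that placement
logic, with made-up node and VM names (Ganeti's instance allocator does
this for real):

"""Sketch: spreading VM secondary disks around the cluster (option 2).
Node and VM names are made up; this only illustrates the idea."""

NODES = ["node1", "node2", "node3", "node4"]
VMS = ["web", "trac", "svn", "mail", "wiki", "downloads"]

def place(vms, nodes):
    """Give each VM a primary node and a different secondary node,
    varying the offset so VMs that share a primary do not all fail
    over to the same secondary."""
    placement = {}
    for i, vm in enumerate(vms):
        primary = nodes[i % len(nodes)]
        offset = 1 + (i // len(nodes)) % (len(nodes) - 1)
        secondary = nodes[(i + offset) % len(nodes)]
        placement[vm] = (primary, secondary)
    return placement

if __name__ == "__main__":
    for vm, (pri, sec) in place(VMS, NODES).items():
        print("%-10s primary=%s  secondary=%s" % (vm, pri, sec))

With a layout like that, losing one machine means its VMs restart on
several different nodes instead of piling onto one, which is what keeps
recovery quick on the remaining hardware.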

With either option, we could also implement large storage separate
from serving. By large storage I mean the growing collection of static
files we are accumulating now that many projects use something like
Sphinx to generate websites instead of database-driven sites. We could
then mount that storage over NFS or iSCSI onto a finely tuned front end
for serving. Why would we do this? It would let us use very different
disk configurations for storage and serving, or run a BSD box so we
could use ZFS. AstroDog can explain this method in more detail.
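
To picture the storage/serving split: the storage box would export the
static site trees, and a lean front end would mount and serve them. A
minimal sketch, assuming a hypothetical NFS export
bigstore:/export/static and a front-end mount point of /srv/static
(both names are made up):

"""Sketch: a front end mounting the static-file store over NFS.
The export path and mount point below are hypothetical."""
import subprocess

EXPORT = "bigstore:/export/static"   # big, ZFS/RAID-backed storage box
MOUNTPOINT = "/srv/static"           # served by the tuned front end

def is_mounted(mountpoint):
    """Return True if something is already mounted at mountpoint."""
    with open("/proc/mounts") as mounts:
        return any(line.split()[1] == mountpoint for line in mounts)

def ensure_static_mount():
    if not is_mounted(MOUNTPOINT):
        # Read-mostly static content: mount read-only on the front end;
        # writes happen on the storage box itself.
        subprocess.check_call(
            ["mount", "-t", "nfs", "-o", "ro,noatime",
             EXPORT, MOUNTPOINT])

if __name__ == "__main__":
    ensure_static_mount()

The same front end could mount an iSCSI volume instead if a project
needs a block device rather than a shared filesystem.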

A few things are clearer to me:
1. We will start using XFS more for Sphinx-generated sites, and we will
probably try to get the OS installs on at least ext4 if not XFS.
2. New disks are likely to be a mix of SSDs (120-256 GB each) and
7200 rpm SATA drives (probably 2.5" 1 TB).
3. More thought will go into which disks are used for what.
4. We need to leverage CDNs, EU mirrors, and tighter security (e.g.,
OWASP guidelines) to handle the nefarious traffic that drives up our
load.

Thanks,
Alex

PS: Would any volunteers like to track down the liaisons from each
OSGeo project, make sure they are aware of this planning, and request
their input on their projected 3-5 year needs?
http://wiki.osgeo.org/wiki/Project_Steering_Committees

On Mon, 02 Jun 2014 20:52:28 -0500, Alex Mandel <tech_dev@wildintellect.com> wrote:

> [...]
>
> With either option, we could also implement large storage separate
> from serving. By large storage I mean the growing collection of
> static files we are accumulating now that many projects use something
> like Sphinx to generate websites instead of database-driven sites. We
> could then mount that storage over NFS or iSCSI onto a finely tuned
> front end for serving. Why would we do this? It would let us use very
> different disk configurations for storage and serving, or run a BSD
> box so we could use ZFS. AstroDog can explain this method in more
> detail.

The basic idea is that hosts are no longer stuck with whatever disk subsystem they were originally configured with, while avoiding the nastiness of things like NFS cross-mounts.

Rather than having to, say, replace the SSD drives in a host with 7.2k 1TB drives because a particular project's needs have changed, or deal with migrating their VM onto a new host with some new storage configuration, you simply allocate a new volume from the storage pool and attach it to the VM.

Also, by periodically mirroring the VMs onto the array, projects would gain the ability to snapshot their VMs without consuming expensive SSD storage on the hosts themselves or the hassle of dealing with the full-blown backup system.

As it relates more specifically to the two options Alex outlined above, the large array lets SAC focus on I/O performance in the local configurations, rather than trying to find and maintain the right balance between speed and capacity as projects' needs change.
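
To make that concrete: if the array were the ZFS/BSD box mentioned
above, growing a project's storage and snapshotting it would look
roughly like the sketch below. The pool and volume names, sizes, and
snapshot naming are made-up placeholders, not a concrete proposal.

"""Sketch: carving project volumes out of a ZFS-backed array and
snapshotting them. Names and sizes below are hypothetical."""
import subprocess
from datetime import date

POOL = "tank"  # hypothetical ZFS pool on the storage box

def allocate_volume(project, size="200G"):
    """Create a block volume (zvol) for a project; it can then be
    exported over iSCSI and attached to the project's VM instead of
    swapping physical disks in the VM's host."""
    volume = "%s/volumes/%s" % (POOL, project)
    # -p creates the parent dataset (tank/volumes) if it doesn't exist.
    subprocess.check_call(["zfs", "create", "-p", "-V", size, volume])
    return volume

def snapshot_volume(volume):
    """Take a cheap point-in-time snapshot, so VM images mirrored onto
    the array can be kept without eating SSD space on the hosts."""
    snapname = "%s@%s" % (volume, date.today().isoformat())
    subprocess.check_call(["zfs", "snapshot", snapname])
    return snapname

if __name__ == "__main__":
    vol = allocate_volume("projectx")
    snapshot_volume(vol)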

> [...]
>
> PS: Would any volunteers like to track down the liaisons from each
> OSGeo project, make sure they are aware of this planning, and request
> their input on their projected 3-5 year needs?
> http://wiki.osgeo.org/wiki/Project_Steering_Committees

As I run into them on the OSGeo4BSD project I can ask or send them in someone's direction. Let me know.

--- Harrison