[OSGeo] #3342: tracsv went down

#3342: tracsv went down
---------------------------+---------------------------------------------
 Reporter:  robe           |       Owner:  robe
     Type:  task           |      Status:  assigned
 Priority:  normal         |   Milestone:  Sysadmin Contract 2025-I (robe)
Component:  SysAdmin/Trac  |    Keywords:
---------------------------+---------------------------------------------
Trac went down about an hour and a half ago.

Just brought it back up.

The key issue was that it ran out of disk space. My conclusion is that
postgresql was logging too much.

The last postgresql log was 38GB, so I had to delete that first and also
clear some backup snapshots.
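
To spot an oversized log like this one earlier, something along these
lines could help. This is only a sketch: the function name
largest_files is hypothetical, the -printf option assumes GNU find,
and the /var/log/postgresql path in the usage line is an assumption
(the Debian/Ubuntu default), not something confirmed in this ticket.

```shell
# Hypothetical helper: list the N largest files under a directory.
# Assumes GNU find (for -printf), which is standard on Linux hosts.
largest_files() {
    dir=$1
    count=${2:-10}
    # Print "size-in-bytes<TAB>path" for every regular file,
    # then sort numerically, largest first, and keep the top N.
    find "$dir" -type f -printf '%s\t%p\n' | sort -rn | head -n "$count"
}
```

For example, `largest_files /var/log/postgresql 5` would show the five
biggest files in that directory, if that is where the logs live.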

This impacted trac, gitea, and svn.
--
Ticket URL: <https://trac.osgeo.org/osgeo/ticket/3342>
OSGeo
OSGeo committee and general foundation issue tracker.

#3342: tracsv went down
---------------------------+----------------------------------------------
 Reporter:  robe           |       Owner:  robe
     Type:  task           |      Status:  closed
 Priority:  normal         |   Milestone:  Sysadmin Contract 2025-I (robe)
Component:  SysAdmin/Trac  |  Resolution:  fixed
 Keywords:                 |
---------------------------+----------------------------------------------
Changes (by robe):

* resolution: => fixed
* status: assigned => closed

Comment:

Clearing disk space and restarting tracsvn didn't seem to be sufficient
to fix it. I also had to restart nginx on osgeo7, so I'm not sure if it
had a block rule or something.

I waited about 20 minutes to see if trac would come back on its own.
While the services (both gitea and apache2) showed they were up, the
associated websites were still giving a 503 gateway error.
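
Since the services reported as up while the sites were still failing, it
may be worth checking the HTTP response directly rather than the service
state. A minimal sketch, assuming curl is installed; the function names
http_status and site_is_up are hypothetical, and the 10 second timeout
is an arbitrary choice.

```shell
# Hypothetical helper: print the HTTP status code a URL returns.
http_status() {
    # -s: quiet, -o /dev/null: discard the body,
    # -w '%{http_code}': print only the numeric status code.
    curl -s -o /dev/null --max-time 10 -w '%{http_code}' "$1"
}

# Succeeds only when the URL answers with a 2xx or 3xx status.
site_is_up() {
    code=$(http_status "$1")
    case "$code" in
        2??|3??) return 0 ;;
        *) echo "$1 returned HTTP $code" >&2; return 1 ;;
    esac
}
```

Running `site_is_up https://trac.osgeo.org/osgeo/ticket/3342` after a
restart would then report a failure for as long as the proxy keeps
answering with an error, even if the backend services look healthy.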

I also noticed in the logs that there seem to be hits to the old grass
trac repos, so I'm going to put a block on those. I thought we had done
that already.

Comment (by robe):

Looks like it went down again because it ran out of disk space. This time
the postgresql logs weren't that big. It could be that osgeo4 is locked on
a backup and holding on to the last one. Will check on that later.
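
Given that the disk filled up twice, a small usage check could give some
warning before it happens again. This is a sketch only: the function
names usage_percent and check_disk and the 90% threshold in the usage
line are hypothetical, and it relies on the POSIX `df -P` output format.

```shell
# Hypothetical helper: print the used-space percentage (without the %
# sign) of the filesystem that holds the given path.
usage_percent() {
    # df -P guarantees one line per filesystem; column 5 is "Capacity".
    df -P "$1" | awk 'NR == 2 { sub("%", "", $5); print $5 }'
}

# Returns non-zero (and prints a warning) when usage reaches the threshold.
check_disk() {
    path=$1
    threshold=$2
    used=$(usage_percent "$path")
    if [ "$used" -ge "$threshold" ]; then
        echo "WARNING: $path is at ${used}% (threshold ${threshold}%)" >&2
        return 1
    fi
    return 0
}
```

Something like `check_disk / 90` could be run from cron inside the
container, with the non-zero exit status used to trigger a notification.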

Comment (by robe):

Okay, it was still losing disk space. I think the space is locked by the
server, so I snapshotted the original container, stopped it, copied the
snapshot to a new container, renamed the original to tracsvn-old, and
renamed the copy to the old name, tracsvn.

{{{
# Take a snapshot and note the snapshot name (snap1688 in this case)
lxc snapshot tracsvn
lxc stop tracsvn --force
# Copy the snapshot to a new container, then swap the names
lxc cp tracsvn/snap1688 tracsvn-new
lxc mv tracsvn tracsvn-old
lxc mv tracsvn-new tracsvn
lxc start tracsvn
lxc snapshot tracsvn
# I had to do the below since the IP changed
lxc exec nginx -- systemctl restart nginx
}}}

Now:

{{{
tech_dev@osgeo7:~$ lxc exec tracsvn -- df -h
Filesystem                 Size  Used Avail Use% Mounted on
osgeo7/containers/tracsvn  1.1T  204G  883G  19% /
none                       492K  4.0K  488K   1% /dev
udev                        63G     0   63G   0% /dev/tty
tmpfs                      100K     0  100K   0% /dev/lxd
none                        63G     0   63G   0% /dev/shm
tmpfs                      100K     0  100K   0% /dev/.lxd-mounts
tmpfs                       26G  144K   26G   1% /run
tmpfs                      5.0M     0  5.0M   0% /run/lock
tmpfs                      4.0M     0  4.0M   0% /sys/fs/cgroup
}}}