[SAC] [OSGeo] #2718: osgeo4 backup is failing

#2718: osgeo4 backup is failing
---------------------------+--------------------------------------
Reporter: robe | Owner: sac@…
     Type: task | Status: new
Priority: normal | Milestone: Sysadmin Contract 2022-I
Component: Systems Admin | Keywords:
---------------------------+--------------------------------------
This might be the result of an upgrade or something similar.

{{{

Error: Failed to run: zfs destroy osgeo7/containers/secure@snapshot-for-osgeo4: cannot destroy snapshot osgeo7/containers/secure@snapshot-for-osgeo4: dataset is busy
Error: Failed to run: zfs destroy osgeo7/containers/wordpress@snapshot-for-osgeo4: cannot destroy snapshot osgeo7/containers/wordpress@snapshot-for-osgeo4: dataset is busy
Error: Failed to run: zfs destroy osgeo7/containers/dronie-server@snapshot-for-osgeo4: cannot destroy snapshot osgeo7/containers/dronie-server@snapshot-for-osgeo4: dataset is busy

}}}

It sometimes resolves itself, but sometimes it doesn't, and then either a
reboot of osgeo7 or an explicit unmount / mount of the containers is required.
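
Before rebooting, it may help to see what is keeping a snapshot busy. The
following is a minimal sketch, assuming the busy state comes from a mount that
is still alive in some process's mount namespace (this is not confirmed in
this ticket). The function name `pids_holding_mount` and its optional second
argument are hypothetical and used only for illustration; the function scans
`/proc/<pid>/mounts` for the snapshot as a mount source.

```shell
#!/bin/sh
# pids_holding_mount SOURCE [PROC_ROOT]
# Print the PIDs whose mounts table lists SOURCE as a mount source.
# PROC_ROOT is a hypothetical parameter for testing; it defaults to /proc.
pids_holding_mount() {
    src=$1
    proc_root=${2:-/proc}
    for f in "$proc_root"/[0-9]*/mounts; do
        # Skip unmatched globs and processes we are not allowed to read.
        [ -r "$f" ] || continue
        # The first field of each mounts line is the source device or dataset.
        if awk -v s="$src" '$1 == s { found = 1 } END { exit !found }' \
            "$f" 2>/dev/null; then
            pid_dir=${f%/mounts}
            echo "${pid_dir##*/}"
        fi
    done
}
```

On osgeo7 this could be called as
`pids_holding_mount osgeo7/containers/secure@snapshot-for-osgeo4`. Running
`zfs holds osgeo7/containers/secure@snapshot-for-osgeo4` alongside it would
list any user holds on the same snapshot.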

I think the script also needs some work as it looks like it deleted the
backups even though taking a snapshot failed.
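
One way the script could avoid this is to abort as soon as any step fails and
to remove the old backup only after a fresh copy exists. This is a rough
sketch, not the actual backup script: the function name `backup_container`,
the `LXC` override variable, and the temporary `-backup-new` name are
assumptions for illustration, and it assumes a previous `for-osgeo4` snapshot
and a previous `<name>-backup` already exist.

```shell
#!/bin/sh
# LXC is the client command; overridable so the flow can be stubbed in tests.
LXC=${LXC:-lxc}

# backup_container REMOTE CONTAINER
# Refresh the for-osgeo4 snapshot on REMOTE and replace the local backup.
# Every step returns early on failure, so the existing backup is deleted
# only after the new copy has completed.
backup_container() {
    remote=$1
    name=$2
    snap="for-osgeo4"

    # Drop the old snapshot so the name can be reused ("dataset is busy"
    # would stop the run here, leaving the old backup untouched).
    $LXC delete "$remote:$name/$snap" || return 1
    $LXC snapshot "$remote:$name" "$snap" || return 1

    # Copy under a temporary name first, then swap it in.
    $LXC copy "$remote:$name/$snap" "${name}-backup-new" || return 1
    $LXC delete "${name}-backup" || return 1
    $LXC move "${name}-backup-new" "${name}-backup"
}
```

The trade-off is that two copies of the container exist on osgeo4 for a short
time, so this needs enough free space for the largest container.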

--
Ticket URL: <https://trac.osgeo.org/osgeo/ticket/2718>
OSGeo <https://osgeo.org/>
OSGeo committee and general foundation issue tracker.

#2718: osgeo4 backup is failing
---------------------------+---------------------------------------
Reporter: robe | Owner: sac@…
     Type: task | Status: new
Priority: normal | Milestone: Sysadmin Contract 2022-I
Component: Systems Admin | Resolution:
Keywords: |
---------------------------+---------------------------------------

Comment (by robe):

Okay, this is a different issue than what I thought. Usually when this kind
of thing happens with making snapshots, I can't make snapshots at all and
it gives a different error.

I can snapshot these servers fine with:

{{{
lxc snapshot secure
lxc snapshot wordpress
lxc snapshot dronie-server
}}}

However, if I try to delete a snapshot:

{{{
lxc rm dronie-server/for-osgeo4
}}}

I get this error:

{{{
  Failed to run: zfs destroy osgeo7/containers/dronie-server@snapshot-for-osgeo4: cannot destroy snapshot osgeo7/containers/dronie-server@snapshot-for-osgeo4: dataset is busy
}}}

I have only seen this happen if osgeo4 is in the middle of backing up the
container in question.

Checking osgeo4: it is in the middle of backing up pretalx on osgeo3, which
shouldn't impact osgeo7.

{{{
  sudo ps -faux | grep "lxc copy"
}}}

shows:

{{{
lxc copy osgeo3:pretalx/for-osgeo4 pretalx-backup
}}}

So my only thought is that osgeo4 must still have had a hold on the snapshot
when the backup run tried to delete it to reuse the name. I suspect rebooting
osgeo4 should resolve this, but that should wait until it's done with backups.
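
The `ps` check above could be narrowed to a single container, so a script on
osgeo4 can tell whether a copy of that container's snapshot is still running
before the snapshot gets deleted. This is a minimal sketch: the function name
`copy_running_for` and its optional second argument are hypothetical, and it
would only catch a copy that is still visible as a process, not a hold left
behind by one that has already exited.

```shell
#!/bin/sh
# copy_running_for CONTAINER [PROCESS_LIST]
# Succeeds if an "lxc copy" of CONTAINER's for-osgeo4 snapshot appears in the
# process list. PROCESS_LIST is a hypothetical hook for testing; when it is
# omitted, the live process list from ps is used.
copy_running_for() {
    name=$1
    listing=${2:-$(ps -eo args)}
    # Match an optional "remote:" prefix, then the exact container name.
    printf '%s\n' "$listing" |
        grep -Eq "lxc copy ([^ ]+:)?${name}/for-osgeo4( |\$)"
}
```

With the process list shown above, `copy_running_for pretalx` would succeed,
while `copy_running_for dronie-server` would not.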

--
Ticket URL: <https://trac.osgeo.org/osgeo/ticket/2718#comment:1>

#2718: osgeo4 backup is failing
---------------------------+---------------------------------------
Reporter: robe | Owner: sac@…
     Type: task | Status: closed
Priority: normal | Milestone: Sysadmin Contract 2022-I
Component: Systems Admin | Resolution: fixed
Keywords: |
---------------------------+---------------------------------------
Changes (by robe):

* status: new => closed
* resolution: => fixed

Comment:

Closing this out since the last scheduled backup on these ran fine. I ran the
dronie-server one manually since that only backs up every 2 days.

--
Ticket URL: <https://trac.osgeo.org/osgeo/ticket/2718#comment:2>