[GRASS-dev] [GRASS GIS] #3042: Patches to make the build reproducible (fileordering, randomness)

#3042: Patches to make the build reproducible (fileordering, randomness)
----------------------+-------------------------
Reporter: sebastic | Owner: grass-dev@…
     Type: defect | Status: new
Priority: normal | Milestone: 7.0.5
Component: Default | Version: 7.0.4
Keywords: | CPU: Unspecified
Platform: Linux |
----------------------+-------------------------
As reported by Alexis Bienvenüe in [https://bugs.debian.org/825092 Debian
Bug #825092]:
> While working on the “[https://wiki.debian.org/ReproducibleBuilds
reproducible builds]” effort, we have noticed that 'grass' could not be
built reproducibly.
>
> There are several reproducibility issues:
>
> 1) File ordering issues - the build result depends on the order of the
files listed with `readdir` or equivalent.
>
> * in `tools/build_modules_xml.py` - see patch
`01-sort-build-modules-list.patch`
>
> * in `lib/db/dbmi_base/dbmscap.c` (this affects options order in the
`html/db.*.html` files) - see patch `02-sort-dbmscap.patch` that builds an
ordered list.
>
> * in `include/Make/Vars.make` (this affects the order in which object
files are merged) - see patch `03-sort-obj-files.patch`
>
> 2) Randomness issue: `html/colortables/random.png` is built using a
pseudo-random generator seeded with a build-time value. See patch
`04-srand48_auto-from-SOURCE_DATE_EPOCH.patch`, which uses the
[https://reproducible-builds.org/specs/source-date-epoch/ SOURCE_DATE_EPOCH]
environment variable (when set) to derive the seed from the date of the
last `debian/changelog` entry.
>
> 3) Makefile mistake: from
https://buildd.debian.org/status/fetch.php?pkg=grass&arch=i386&ver=7.0.4-1&stamp=1462121195,
it seems to me that the binary NAD files are not installed properly:
> {{{
> /usr/bin/install -c -m 644 OBJ.i686-pc-linux-gnu/prvi /«PKGBUILDDIR»/dist.i686-pc-linux-gnu/etc/proj/nad/prvi
> /usr/bin/install -c -m 644 OBJ.i686-pc-linux-gnu/prvi /«PKGBUILDDIR»/dist.i686-pc-linux-gnu/etc/proj/nad/hawaii
> /usr/bin/install -c -m 644 OBJ.i686-pc-linux-gnu/prvi /«PKGBUILDDIR»/dist.i686-pc-linux-gnu/etc/proj/nad/alaska
> /usr/bin/install -c -m 644 OBJ.i686-pc-linux-gnu/prvi /«PKGBUILDDIR»/dist.i686-pc-linux-gnu/etc/proj/nad/stgeorge
> /usr/bin/install -c -m 644 OBJ.i686-pc-linux-gnu/prvi /«PKGBUILDDIR»/dist.i686-pc-linux-gnu/etc/proj/nad/FL
> /usr/bin/install -c -m 644 OBJ.i686-pc-linux-gnu/prvi /«PKGBUILDDIR»/dist.i686-pc-linux-gnu/etc/proj/nad/WO
> /usr/bin/install -c -m 644 OBJ.i686-pc-linux-gnu/prvi /«PKGBUILDDIR»/dist.i686-pc-linux-gnu/etc/proj/nad/TN
> /usr/bin/install -c -m 644 OBJ.i686-pc-linux-gnu/prvi /«PKGBUILDDIR»/dist.i686-pc-linux-gnu/etc/proj/nad/stlrnc
> /usr/bin/install -c -m 644 OBJ.i686-pc-linux-gnu/prvi /«PKGBUILDDIR»/dist.i686-pc-linux-gnu/etc/proj/nad/stpaul
> /usr/bin/install -c -m 644 OBJ.i686-pc-linux-gnu/prvi /«PKGBUILDDIR»/dist.i686-pc-linux-gnu/etc/proj/nad/conus
> /usr/bin/install -c -m 644 OBJ.i686-pc-linux-gnu/prvi /«PKGBUILDDIR»/dist.i686-pc-linux-gnu/etc/proj/nad/MD
> /usr/bin/install -c -m 644 OBJ.i686-pc-linux-gnu/prvi /«PKGBUILDDIR»/dist.i686-pc-linux-gnu/etc/proj/nad/WI
> }}}
> The single `OBJ.i686-pc-linux-gnu/prvi` file is installed here as *every*
file under `/etc/proj/nad`.
> See the patch `05-binary-nad-install.patch` for a fix.
>
> 4) nad2bin issue: nad2bin has unreproducible output (see Debian Bug
#825088)
>
> Once these proposed patches are applied (and Debian Bug #825088 fixed),
grass can be built reproducibly in our current experimental framework.
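
The file-ordering problem in item 1 comes down to consuming directory listings in whatever order the filesystem returns them. A minimal Python sketch of the general fix follows; it is not the content of any of the attached patches, and `list_files_sorted` and `demo` are hypothetical names used only for illustration:

```python
import os
import tempfile


def list_files_sorted(directory):
    """Return the entries of `directory` in a stable, sorted order.

    os.listdir() returns entries in arbitrary, filesystem-dependent order,
    so sorting is what makes the result identical on every build machine.
    """
    return sorted(os.listdir(directory))


def demo():
    """Create files in non-alphabetical order and list them sorted."""
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("r.slope", "db.connect", "v.buffer"):
            # Create an empty file; only the name matters for this sketch.
            open(os.path.join(tmp, name), "w").close()
        return list_files_sorted(tmp)
```

Python compares `str` values by code point, so the result does not depend on the build machine's locale, which is one more source of variation that a reproducible build has to avoid.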

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/3042>
GRASS GIS <https://grass.osgeo.org>

Changes (by sebastic):

* Attachment "01-sort-build-modules-list.patch" added.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/3042>
GRASS GIS <https://grass.osgeo.org>

Changes (by sebastic):

* Attachment "02-sort-dbmscap.patch" added.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/3042>
GRASS GIS <https://grass.osgeo.org>

Changes (by sebastic):

* Attachment "03-sort-obj-files.patch" added.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/3042>
GRASS GIS <https://grass.osgeo.org>

Changes (by sebastic):

* Attachment "04-srand48_auto-from-SOURCE_DATE_EPOCH.patch" added.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/3042>
GRASS GIS <https://grass.osgeo.org>

Changes (by sebastic):

* Attachment "05-binary-nad-install.patch" added.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/3042>
GRASS GIS <https://grass.osgeo.org>

Comment (by sebastic):

All patches target the 7.0 branch but also apply to trunk, except
`01-sort-build-modules-list.patch`, because `build_modules_xml.py` is no
longer used there.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/3042#comment:1>
GRASS GIS <https://grass.osgeo.org>

Changes (by neteler):

* priority: normal => critical
* component: Default => Compiling

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/3042#comment:2>
GRASS GIS <https://grass.osgeo.org>

Comment (by glynn):

Replying to [ticket:3042 sebastic]:

> 2) Randomness issue: `html/colortables/random.png` is built using a
pseudo-random generator seeded with a build-time value. See patch
`04-srand48_auto-from-SOURCE_DATE_EPOCH.patch`, which uses the
[https://reproducible-builds.org/specs/source-date-epoch/ SOURCE_DATE_EPOCH]
environment variable (when set) to derive the seed from the date of the
last `debian/changelog` entry.

While I agree with the idea, G_srand48_auto() shouldn't be using this
variable directly, but something more appropriate, e.g. GRASS_RANDOM_SEED
(or GRASS_RND_SEED, which is what r.mapcalc used prior to the RNG
changes).

Debian can then set that variable from $SOURCE_DATE_EPOCH as part of their
build process.

In theory, modules which use randomness should be providing a command-line
option to get the seed, with G_srand48_auto() only being used if the user
specifically wants a non-deterministic seed. But that probably isn't going
to happen in the foreseeable future.
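
The division of labour suggested here (GRASS reads its own variable, the packaging sets it) could look roughly like this on the packaging side. This is a sketch only: `build_environment` is a hypothetical helper, not part of GRASS or of the Debian packaging, and it assumes that GRASS honours `GRASS_RANDOM_SEED` as proposed above:

```python
def build_environment(base_env):
    """Return a copy of `base_env` in which GRASS_RANDOM_SEED is derived
    from SOURCE_DATE_EPOCH.

    The seed is only filled in when SOURCE_DATE_EPOCH is set and the caller
    has not already chosen a GRASS_RANDOM_SEED explicitly.
    """
    env = dict(base_env)
    epoch = env.get("SOURCE_DATE_EPOCH")
    if epoch is not None and "GRASS_RANDOM_SEED" not in env:
        env["GRASS_RANDOM_SEED"] = epoch
    return env
```

A build wrapper would then pass the result as the environment of the build command, for example `subprocess.run(["make"], env=build_environment(os.environ), check=True)`.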

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/3042#comment:3>
GRASS GIS <https://grass.osgeo.org>

Comment (by neteler):

Added to
https://grasswiki.osgeo.org/wiki/GRASS_Community_Sprint_Bonn_2016#Bug_squashing

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/3042#comment:4>
GRASS GIS <https://grass.osgeo.org>

Comment (by martinl):

In [changeset:"69214" 69214]:
{{{
#!CommitTicketReference repository="" revision="69214"
Patches to make the build reproducible (fileordering, randomness)
Patches applied: 01-sort-build-modules-list.patch, 02-sort-dbmscap.patch,
03-sort-obj-files.patch (05-binary-nad-install.patch not needed any more)
To be solved: 04-srand48_auto-from-SOURCE_DATE_EPOCH.patch
(see #3042)
}}}

--
Ticket URL: </ticket/3042#comment:5>
GRASS GIS <https://grass.osgeo.org>

Comment (by martinl):

In [changeset:"69215" 69215]:
{{{
#!CommitTicketReference repository="" revision="69215"
Patches to make the build reproducible (fileordering, randomness)
Patches applied: 01-sort-build-modules-list.patch, 02-sort-dbmscap.patch,
03-sort-obj-files.patch, 05-binary-nad-install.patch
To be solved: 04-srand48_auto-from-SOURCE_DATE_EPOCH.patch
(see #3042)
}}}

--
Ticket URL: </ticket/3042#comment:6>
GRASS GIS <https://grass.osgeo.org>

Comment (by martinl):

In [changeset:"69216" 69216]:
{{{
#!CommitTicketReference repository="" revision="69216"
Patches to make the build reproducible (fileordering, randomness)
Patches applied: 01-sort-build-modules-list.patch, 02-sort-dbmscap.patch,
03-sort-obj-files.patch, 05-binary-nad-install.patch
To be solved: 04-srand48_auto-from-SOURCE_DATE_EPOCH.patch
(see #3042)
}}}

--
Ticket URL: </ticket/3042#comment:7>
GRASS GIS <https://grass.osgeo.org>

Comment (by martinl):

Replying to [comment:3 glynn]:
> While I agree with the idea, G_srand48_auto() shouldn't be using this
variable directly, but something more appropriate, e.g. GRASS_RANDOM_SEED
(or GRASS_RND_SEED, which is what r.mapcalc used prior to the RNG
changes).

Updated patch attached:
attachment:04-srand48_auto-from-GRASS_RANDOM_SEED.patch. Can it be applied
this way?

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/3042#comment:8>
GRASS GIS <https://grass.osgeo.org>

Changes (by martinl):

* Attachment "04-srand48_auto-from-GRASS_RANDOM_SEED.patch" added.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/3042>
GRASS GIS <https://grass.osgeo.org>

Comment (by sebastic):

Replying to [comment:8 martinl]:
> Replying to [comment:3 glynn]:
> > While I agree with the idea, G_srand48_auto() shouldn't be using this
variable directly, but something more appropriate, e.g. GRASS_RANDOM_SEED
(or GRASS_RND_SEED, which is what r.mapcalc used prior to the RNG
changes).
>
> Updated patch attached:
attachment:04-srand48_auto-from-GRASS_RANDOM_SEED.patch. Can it be applied
this way?

The variable should be renamed, because it now uses `GRASS_RANDOM_SEED`
instead of `SOURCE_DATE_EPOCH`, i.e.
`s/source_date_epoch/grass_random_seed/g`.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/3042#comment:9>
GRASS GIS <https://grass.osgeo.org>

Comment (by martinl):

In [changeset:"69217" 69217]:
{{{
#!CommitTicketReference repository="" revision="69217"
If GRASS_RANDOM_SEED is set, use it to seed the random generator when
G_srand48_auto is called. This helps make the build reproducible
(html/random.png)
Based on 04-srand48_auto-from-SOURCE_DATE_EPOCH.patch, see #3042
}}}
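
The behaviour described in this commit message can be sketched in Python as follows. The actual change is in the C function `G_srand48_auto`; the names `auto_seed` and `random_colors`, the time/PID fallback, and the handling of non-numeric values are assumptions made for illustration, not a transcription of r69217:

```python
import os
import random
import time


def auto_seed(environ=os.environ):
    """Return the seed from GRASS_RANDOM_SEED if it is set and numeric.

    Otherwise fall back to a value that differs from run to run
    (assumption of this sketch: current time XOR process ID).
    """
    value = environ.get("GRASS_RANDOM_SEED")
    if value is not None:
        try:
            return int(value)
        except ValueError:
            pass  # assumption: ignore a non-numeric value and fall back
    return int(time.time()) ^ os.getpid()


def random_colors(count, environ=os.environ):
    """Generate `count` RGB triples from a generator seeded by auto_seed()."""
    rng = random.Random(auto_seed(environ))
    return [
        (rng.randrange(256), rng.randrange(256), rng.randrange(256))
        for _ in range(count)
    ]
```

With the variable set, two runs produce the same sequence of colours, which is the property a generated image such as `random.png` needs in order to be identical across builds.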

--
Ticket URL: </ticket/3042#comment:10>
GRASS GIS <https://grass.osgeo.org>

Comment (by martinl):

Replying to [comment:9 sebastic]:
> The variable should be renamed, because it now uses `GRASS_RANDOM_SEED`
instead of `SOURCE_DATE_EPOCH`, i.e.
`s/source_date_epoch/grass_random_seed/g`.

Right, a modified version was applied in trunk as r69217. If there are no
objections, I will backport it to relbr72/70 in the next few days.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/3042#comment:11>
GRASS GIS <https://grass.osgeo.org>

Comment (by sebastic):

No objection from me, I'm happy to drop these patches from the Debian
package.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/3042#comment:12>
GRASS GIS <https://grass.osgeo.org>

Comment (by martinl):

In [changeset:"69243" 69243]:
{{{
#!CommitTicketReference repository="" revision="69243"
If GRASS_RANDOM_SEED is set, use it to seed the random generator when
G_srand48_auto is called. This helps make the build reproducible
(html/random.png)
Based on 04-srand48_auto-from-SOURCE_DATE_EPOCH.patch, see #3042
(merge r69217 from trunk)
}}}

--
Ticket URL: </ticket/3042#comment:13>
GRASS GIS <https://grass.osgeo.org>