[GRASS-dev] Backport sync: merge mess

(cc grass-dev, to avoid that it gets lost)

On Tue, Apr 7, 2020 at 3:24 PM Vaclav Petras <notifications@github.com> wrote:

Fortunately, having Squash and merge saves the day here. Couple notes:

  1. If you end up doing git pull instead of git rebase, you can try setting git config --global pull.rebase true for git pull to be automatically turned into git pull --rebase. I don’t have experience with how this plays out with the upstream+fork setup, so you need to test. However, the point here is that Git won’t do merge unless you ask it to do it, well, git pull is one of the ways you are telling Git to do merge, although it does not seem that way. I recommend the git man page generator to mediate the pain from this.
  2. Squash and merge allows editing the commit message.
  3. Unless you do or already did something to fix it, your releasebranch_7_8 (i.e., releasebranch_7_8 in the neteler fork) still contains all this and you need to clean it before you can backport again. You need to trash (assuming it’s trash) what you have in your local releasebranch_7_8 branch and force it to be whatever the upstream state of the branch is. The git reset command should do the trick with these parameters: git reset --hard upstream/releasebranch_7_8 (related SO answer). You can confirm by looking at commits on GitHub and locally with git log.
  4. If you are doing a PR to update any branch, you need to do that from a new dedicated branch in your fork. If you don’t do that, your branch (here: neteler:releasebranch_7_8) will diverge from the upstream one (here: OSGeo:releasebranch_7_8) and you will end up doing steps in number 3 in order to stop propagating the unwanted commits. This is due to the difference between doing push directly to upstream (git push upstream releasebranch_7_8) which just pushes local commit(s) you have and PR workflow where a new commit is always created in the upstream and thus you don’t have it locally.

As I mentioned previously on mailing list, note that the randomly generated Git man pages actually need to explicitly say that they are not the real documentation.


Context: view it on GitHub (#484)

Hi Vaclav,

Thanks for your detailed reply, some comments inline:

On Tue, Apr 7, 2020 at 3:24 PM Vaclav Petras <notifications@github.com> wrote:

Fortunately, having Squash and merge saves the day here. Couple notes:

1. If you end up doing git pull instead of git rebase,

Indeed I so far used this (for months!), in the separately checked our
releasebranch78:

✘-1 ~/software/grass78_git [releasebranch_7_8 L|…7⚑ 4]
22:43 $ sh -x git_update_78.sh
+ git fetch --all --prune
Fetching upstream
Fetching origin

+ git checkout releasebranch_7_8
Already on 'releasebranch_7_8'

+ git merge upstream/releasebranch_7_8
Already up to date.

+ git push origin releasebranch_7_8
To github.com:neteler/grass.git
! [rejected] releasebranch_7_8 -> releasebranch_7_8
(non-fast-forward)
error: failed to push some refs to 'git@github.com:neteler/grass.git'
hint: Updates were rejected because the tip of your current branch is behind
hint: its remote counterpart. Integrate the remote changes (e.g.
hint: 'git pull ...') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.

I already trashed my fork, updated successfully, regenerated the fork
and updated again, getting above troubles.

As mentioned, till a few days ago no such problems. I don't know what happened.

you can try setting git config --global pull.rebase true for git pull to be automatically turned into git pull --rebase. I don't have experience with how this plays out with the upstream+fork setup, so you need to test. However, the point here is that Git won't do merge unless you ask it to do it, well, git pull is one of the ways you are telling Git to do merge, although it does not seem that way. I recommend the git man page generator to mediate the pain from this (https://git-man-page-generator.lokaltog.net/#aa9b72bf4fa689dc39dc9d297f35142d).

ok, trying that:

git pull --rebase upstream releasebranch_7_8

From github.com:OSGeo/grass

* branch releasebranch_7_8 -> FETCH_HEAD
Already up to date.
Current branch releasebranch_7_8 is up to date.

Now, how to update in future this directory? I also want to be able to
cherry-pick from master as I did since the git(hib) migration.

2. Squash and merge allows editing the commit message.

Confirmed. This (single) time I didn't in order to preserve the
different authors.

3. Unless you do or already did something to fix it, your releasebranch_7_8 (i.e., releasebranch_7_8 in the neteler fork) still contains all this and you need to clean it before you can backport again.

(it was all clean...)

You need to trash (assuming it's trash) what you have in your local releasebranch_7_8 branch and force it to be whatever the upstream state of the branch is. The git reset command should do the trick with these parameters: git reset --hard upstream/releasebranch_7_8 (related SO answer, https://stackoverflow.com/a/9210705/574907). You can confirm by looking at commits on GitHub and locally with git log.

Yes, I actually did
git reset --hard upstream/releasebranch_7_8

and git log. Still encountering above error.

4. If you are doing a PR to update any branch, you need to do that from a new dedicated branch in your fork.

Yes, I always open a respective feature branch.

If you don't do that, your branch (here: neteler:releasebranch_7_8) will diverge from the upstream one (here: OSGeo:releasebranch_7_8) and you will end up doing steps in number 3 in order to stop propagating the unwanted commits.

Sure. I did that once last year (like most new-to-git-comers one time)
and then learned to always start a feature branch.

This is due to the difference between doing push directly to upstream (git push upstream releasebranch_7_8) which just pushes local commit(s) you have and PR workflow where a new commit is always created in the upstream and thus you don't have it locally.

Yes.

The things is that I just did "cherry-pick" ing. I have done so many
(also for you :slight_smile: and no problem.
Just recently I must have messed something up.

As I mentioned previously on mailing list, note that the randomly generated Git man pages actually need to explicitly say that they are not the real documentation.

Well, my problem is that I don't know any more what the procedure is.
My workflow which was ok since last year is broken.
Deleting/redoing the fork does not help.

I am using this git version, if it matters:
git version 2.25.1

Markus

On Tue, Apr 7, 2020 at 5:06 PM Markus Neteler <neteler@osgeo.org> wrote:

Thanks for your detailed reply, some comments inline:

On Tue, Apr 7, 2020 at 3:24 PM Vaclav Petras <notifications@github.com> wrote:

Fortunately, having Squash and merge saves the day here. Couple notes:

  1. If you end up doing git pull instead of git rebase,

Indeed I so far used this (for months!), in the separately checked our
releasebranch78:

✘-1 ~/software/grass78_git [releasebranch_7_8 L|…7⚑ 4]
22:43 $ sh -x git_update_78.sh

  • git fetch --all --prune
    Fetching upstream
    Fetching origin

  • git checkout releasebranch_7_8
    Already on ‘releasebranch_7_8’

  • git merge upstream/releasebranch_7_8
    Already up to date.

  • git push origin releasebranch_7_8
    To github.com:neteler/grass.git
    ! [rejected] releasebranch_7_8 → releasebranch_7_8
    (non-fast-forward)
    error: failed to push some refs to ‘git@github.com:neteler/grass.git’
    hint: Updates were rejected because the tip of your current branch is behind
    hint: its remote counterpart. Integrate the remote changes (e.g.
    hint: ‘git pull …’) before pushing again.
    hint: See the ‘Note about fast-forwards’ in ‘git push --help’ for details.

I already trashed my fork, updated successfully, regenerated the fork
and updated again, getting above troubles.

The git push origin is really optional here (alhough good for keeping things organized). Because it says that neteler:releasebranch_7_8 contains commits which are not in your local releasebranch_7_8 branch, it seems that the fork has something which the upstream:releasebranch_7_8 does not have. This is strange since you are saying you redid your fork, but looking at your fork, it has a lot of Merge… commits. I think I’m still missing something in the way these work with the upstream, but anyway in this case (your error), it seems that Git complains because these are not part of your local branch.

As mentioned, till a few days ago no such problems. I don’t know what happened.

When somebody does backport in the middle of you backporting, particularly between your update script and your subsequent push, the push will fail, git you a messge like the one above which suggests pull. However, by cherry-picking, you already commited. Hence pull does merge which creates a Merge… commit.

you can try setting git config --global pull.rebase true for git pull to be automatically turned into git pull --rebase. I don’t have experience with how this plays out with the upstream+fork setup, so you need to test. However, the point here is that Git won’t do merge unless you ask it to do it, well, git pull is one of the ways you are telling Git to do merge, although it does not seem that way. I recommend the git man page generator to mediate the pain from this (https://git-man-page-generator.lokaltog.net/#aa9b72bf4fa689dc39dc9d297f35142d).

ok, trying that:

git pull --rebase upstream releasebranch_7_8

From github.com:OSGeo/grass

  • branch releasebranch_7_8 → FETCH_HEAD
    Already up to date.
    Current branch releasebranch_7_8 is up to date.

Maybe, you can ignore this suggestion. You are not using git pull in your regular workflow, but git merge, so that’s where the Merge… commits are comming from - at least those in your current fork. Unless of course, you did some git pull somewhere at some point.

Now, how to update in future this directory? I also want to be able to
cherry-pick from master as I did since the git(hib) migration.

Same as before. Although you can get into situation in Git nobody except Git masters is able to resolve, I don’t see why you would not be able to reset things into their original state. In other words, I don’t think this is caused by any change in how Git or GitHub work.

  1. Squash and merge allows editing the commit message.

Confirmed. This (single) time I didn’t in order to preserve the
different authors.

I see. If I understand this correctly, the co-authors are really just the last text lines with special syntax. You can chnage the rest or even add or edit these lines manually. I think I did that, but I can’t recall which commit it was to show it.

  1. Unless you do or already did something to fix it, your releasebranch_7_8 (i.e., releasebranch_7_8 in the neteler fork) still contains all this and you need to clean it before you can backport again.

(it was all clean…)

Just to be clear, by clean I mean exactly the same commits as they are on the given branch in the upstream repo.

You need to trash (assuming it’s trash) what you have in your local releasebranch_7_8 branch and force it to be whatever the upstream state of the branch is. The git reset command should do the trick with these parameters: git reset --hard upstream/releasebranch_7_8 (related SO answer, https://stackoverflow.com/a/9210705/574907). You can confirm by looking at commits on GitHub and locally with git log.

Yes, I actually did
git reset --hard upstream/releasebranch_7_8

and git log. Still encountering above error.

You need to check your fork as well since you are trying to push there. For git log, you can use sth like to match what you see on GitHub:

git log --graph --oneline --abbrev-commit --max-count=10

Since you are updating the branch in your fork, you need to also trash everything there and replace it with the local state with force push:

git push --force origin releasebranch_7_8

(BTW, that’s what is used to update PRs after rebasing the branch.)

  1. If you are doing a PR to update any branch, you need to do that from a new dedicated branch in your fork.

Yes, I always open a respective feature branch.

Sorry, the what I meant to draw attention to here is that doing a PR from your neteler:releasebranch_7_8 makes your local releasebranch_7_8 and the neteler:releasebranch_7_8 itself diverge from origin:releasebranch_7_8 in the way that after merging the PR your local releasebranch_7_8 and neteler:releasebranch_7_8 contain same commits, but different than upstream:releasebranch_7_8.

If you don’t do that, your branch (here: neteler:releasebranch_7_8) will diverge from the upstream one (here: OSGeo:releasebranch_7_8) and you will end up doing steps in number 3 in order to stop propagating the unwanted commits.

Sure. I did that once last year (like most new-to-git-comers one time)
and then learned to always start a feature branch.

This is due to the difference between doing push directly to upstream (git push upstream releasebranch_7_8) which just pushes local commit(s) you have and PR workflow where a new commit is always created in the upstream and thus you don’t have it locally.

Yes.

The things is that I just did “cherry-pick” ing.

git cherry-pick does a new commit, so same rules and complexities apply. Unfortunatelly, there is no “just”. The complexities I’m thinking about are asynchronous commits and need for PR-based workflows.

I have done so many
(also for you :slight_smile: and no problem.

Thanks for that. Ironically, the fact that you do many of these yourself may have contributed to smoothness of your workflow at least based on by wild guess below.

Just recently I must have messed something up.

This might be a wild guess because I don’t have a full grasp of all authors and dates metadata in Git, but looking at commits on OSGeo:releasebranch_7_8 on GitHub and commits in #484 (backport sync: merge mess), I would say you cherry picked over a period of couple days (Mar 30 till Apr 1) and then tried to push it, but in between Anna did some backports (Mar 31). I don’t know why this would happen only now or why it would not resolve as before, so again a wild guess.

With that I must say I don’t know why there are no Merge… commits in OSGeo:releasebranch_7_8 untill now. However, if nobody actually managed to do backport in between your git merge upstream... and your git push upstream... before, then your git push upstream... always worked smoothly without asking your to incorporate remote changes resulting in no Merge… commits. In other words, git merge in absence of local changes just does fast-forward merge, so no new (Merge…) commit is created, but if you did cherry-pick and then you git push upstream... but somebody just finished backport, git push fails and you either do git pull which it suggests or you do the update again which does git merge in both cases resulting in a Merge… commit.

As I mentioned previously on mailing list, note that the randomly generated Git man pages actually need to explicitly say that they are not the real documentation.

Well, my problem is that I don’t know any more what the procedure is.

I though I would have to critically examine it before doing any backport myself, but I did not feel I’m in postion to do so since there are different views on how to use Git. I guess at this point it seems that the current procedure does not work as expected in some cases and we have established that the rebasing locally and disabling merges of PR other than squash work best, so I would say the procedure should change.

The git merge should be replaced by git rebase and when git push upstream... won’t let you (bacause of other people’s backports), you need to do git rebase and not what it says (it says do plain pull).

My workflow which was ok since last year is broken.
Deleting/redoing the fork does not help.

I see some 8 days old commits in releasebranch_7_8 of your fork which are not in upstream, so it does not seem new to me. Redoing fork is anyway more than needed here. Force pushing the branch should be enough and you should see the result imediatelly (be sure to force push to your fork, not the upstream).

Vaclav

PS: I’ve heard that Linus Torvalds uses CVS at home.

Hi Vaclav,

Thanks for your extensive explanations! Much appreciated.

I think that I have probably found the missing part: I should have
deleted the *entire* directory. By just deleting my fork and re-adding
it as a "remote" I probably still had all the clutter in my local
copy.

Today I have
- trashed the entire directory
- clones the repo
- deleted my fork and created it again
- added the fork as a "remote"

Magically (well, not really), it seems to be clean.

Might that be the explanation, that I didn't properly start from scratch?

It is true that git has a higher level of complexity compared to SVN
but I hope to be back on track now.

What might have happened:
- I cherry-picked while travelling and tried to push over very bad
network connection
- I modified the local copy without updating from repo beforehand,
then trying to push
- as you guessed, I probably cherry-picked, didn't commit due to my
network issues on the road, then other backports happened and I tried
to push on top of that

Question: would the (somewhere) existing backport bot help here?

thanks again for your support,
Markus

On Wed, Apr 8, 2020 at 5:13 PM Markus Neteler <neteler@osgeo.org> wrote:

Hi Vaclav,

Thanks for your extensive explanations! Much appreciated.

I think that I have probably found the missing part: I should have
deleted the entire directory. By just deleting my fork and re-adding
it as a “remote” I probably still had all the clutter in my local
copy.

Today I have

  • trashed the entire directory
  • clones the repo
  • deleted my fork and created it again
  • added the fork as a “remote”

This does not sound like recommended workflow: clone fork as origin, add OSGeo as upstream. Are you aware of that?

Related to that, I’m not sure why most projects using this workflow use the name origin for one’s own fork. Calling it fork, myfork or seems more explicit. Perhaps something to consider.

Magically (well, not really), it seems to be clean.

Might that be the explanation, that I didn’t properly start from scratch?

Since I saw old Merge… commits in your fork, they were preserved locally or in the fork and propagated through pull/fetch or push.

What might have happened:

  • as you guessed, I probably cherry-picked, didn’t commit due to my
    network issues on the road, then other backports happened and I tried
    to push on top of that

…which is hard to avoid which in turn creates the Merge… commits which we now consider unnecessary. So I updated the wiki (how to backport a single commit and made a mess sections including some explanations):

https://trac.osgeo.org/grass/wiki/HowToGit#Backportingtoreleasebranches

I did not test the rebase workflow myself, but it turned out, I suggested the workflow to Anna who was successfully using it for some time now (although I think she did not get into the same situation as you did, so not really tested for that, you can test with two clones if you want).

Question: would the (somewhere) existing backport bot help here?

The new workflow with rebase should give us now the expected behavior, so I don’t think it is needed for that.

As far as automating backports, with GitHub Actions automating should be quite possible.

https://github.com/marketplace/actions/backport-commit
https://github.com/marketplace/actions/backporting
(and there is more)

For anything like that we need to spell out our backporting policies. For example, I’m just against immediate backports because I think the change needs to settle. We had several cases of immediate backport and then quick series of fixes or reverts which does not seem right as it lowers the stability of a the stable branch. I don’t see a setting of a delay in settings of the above actions which would make me happy. Using two labels one for humans and another for the bot would work in the same way without the possibly confusing delay and including some human consideration. Anyway, my point is that it is possible, but it requires more research. On top of that, the GitHub Actions (both the infrastructure and the available actions) are a moving target.

Vaclav