The lower-post-volume people behind the software in Debian. (List of feeds.)

I just encountered an interesting cherry-pick failure.

The change I was trying to cherry-pick was to remove a hunk of text. Its patch conceptually looked like this:

@@ ... @@
 A
-B
 C

even though the pre-context A, removed text B, and post-context C are all multi-line blocks.
After doing a significant rewrite to the same original codebase (i.e. one that had A, B and then C next to each other), the code I wanted to cherry-pick the above commit onto had moved the text around, and the block corresponding to B now appears a lot later. A diff between that state and the original perhaps looked like this:

@@ ... @@
 A
-B
 C
@@ ... @@
 D
+B
 E

And cherry-picking the above change succeeded without doing anything (!?!?).

Logically, this behaviour "makes sense", in the sense that it can be explained. The change wants to make A and C adjacent by removing B, and the three-way merge noticed that the updated codebase already had that removal, so there is nothing that needs to be done. In this particular case, I did not remove B but moved it elsewhere, so what cherry-pick did was wrong, but in other cases I may indeed have removed it without adding an equivalent anywhere else, so it could have been correct. We simply cannot say. I wonder if we should at least flag this "both sides appear to have removed" case as conflicting, but I am not sure how that should be implemented (let alone implemented efficiently). After all, the moved block B might have gone to a completely different file. Would we scan the entire working tree for the matching block of text?

This is why you should always look at the output from "git show" for the commit being cherry-picked and the output from "git diff HEAD" before concluding the cherry-pick to see if anything is amiss.
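
As a rough sketch of the "scan for the matching block" idea wondered about above (this is not something Git does, and `find_block` is a hypothetical helper name), a naive check for whether the removed block B still lives somewhere in the rewritten text could look like this. It only handles an exact, contiguous match within a single file, which is precisely the limitation raised above.

```python
def find_block(block, lines):
    """Return every 0-based index at which `block` occurs as a contiguous run in `lines`."""
    n = len(block)
    if n == 0:
        return []
    return [i for i in range(len(lines) - n + 1) if lines[i:i + n] == block]


# Made-up stand-ins: the original had A, B, C adjacent; the rewrite moved B
# so that it now sits between D and E.
removed_by_pick = ["b1", "b2"]
rewritten = ["a1", "a2", "c1", "d1", "b1", "b2", "e1"]

hits = find_block(removed_by_pick, rewritten)
if hits:
    # The cherry-pick "succeeded", yet the text it meant to delete is still here.
    print("removed block still present at line(s):", [i + 1 for i in hits])
```

A hit does not prove the cherry-pick was wrong; it only marks the case as worth a second look with "git diff HEAD".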


Posted Sun Apr 26 05:56:00 2015 Tags:

It’s Penguicon 2015 at the Westin in Southfield, Michigan, and time for the 2015 Friends of Armed & Dangerous party.

9PM tonight, room 314. Nuclear ghost-pepper brownies will be featured.

Posted Fri Apr 24 13:49:43 2015 Tags:
Screenshot of an old CDK-based
JChemPaint, from the first CDK paper.
CC-BY :)
Already a while ago, the American Chemical Society (ACS) decided to allow the Creative Commons Attribution license (version 4.0) to be used on their papers, via their Author Choice program. ACS members pay $1500, which is low for a traditional publisher. While I would rather see them move to a gold Open Access journal, it is a very welcome option! For the ACS business model it means a guaranteed sale of some 40 copies of this paper (at about $35 each), because it will not immediately affect the sale of the full journal (much). Some papers may have sold more than that had they remained closed access, but for many papers that sounds like a smart move money-wise. Of course, they also buy themselves some goodwill and green Open Access is just around the corner anyway.

Better, perhaps, is that you can also use this option to make a past paper Open Access under a CC-BY license! And that is exactly what Christoph Steinbeck did with five of his papers, including two on which I am co-author. And these are not the least papers either. The first is the first CDK paper from 2003 (doi:10.1021/ci025584y), which featured a screenshot of JChemPaint shown above. Note that in those days, the print journal was still the target, so the screenshot is in gray scale :) BTW, given that this paper is cited 329 times (according to ImpactStory), maybe the ACS could have sold more than 40 copies. But for me, it means that finally people can read this paper about Open Science in chemistry, even after so many years. BTW, there is little chance the second CDK paper will be freed in a similar way.

The second paper that was liberated this way is the first Blue Obelisk paper (doi:10.1021/ci050400b), which was cited 276 times (see ImpactStory):


This screenshot nicely shows how readers can see the CC-BY license for this paper. Note that it also lists that the copyright is with the ACS, which is correct, because in those days you commonly gave away your copyright to the publisher (I have stopped doing this, bar some unfortunate recent exceptions).

So, head over to your email client and email support@services.acs.org and let them know you also want your JCICS/JCIM paper available under a CC-BY license! No excuse anymore not to make your seminal work in cheminformatics available as gold Open Access!

Of course, submitting your new work to the Journal of Cheminformatics is cheaper and has the advantage that all papers are Open Access!
Posted Sat Apr 18 10:11:00 2015 Tags:

Back in 2012, Poul-Henning Kamp wrote a disgruntled article in ACM Queue, A Generation Lost in the Bazaar.

It did not occur to me to respond in public at the time, but someone else’s comment on a G+ thread about the article revived the thread. Rereading my reaction, I think it is still worth sharing for the fundamental point about scaling and chaos.

There are quite a lot of defects in the argument of this piece. One is that Kamp (rightly) complains about autoconf, but then leaps from that to a condemnation of the bazaar model without establishing that one implies the other.

I think, also, that when Kamp elevates control by a single person as a necessary way to get quality he is fooling himself about what is even possible at the scale of operating systems like today’s *BSD or Linux, which are far larger than the successful cathedrals of programming legend.

No single person can be responsible at today’s scale; the planning problem is too hard. It isn’t even really possible to “create architecture” because the attempt would exceed human cognitive capacity; the best we can do is make sure that the components of plannable size are clean, hope we get good emergent behavior from the whole system, and try to nudge it towards good outcomes as it evolves.

What this piece speaks of to me is a kind of nostalgia, and a hankering for the control (or just the illusion of control) that we had when our software systems were orders of magnitude smaller. We don’t have the choice that Kamp wants to take anymore, and it may be we only fooled ourselves into thinking we ever had it.

Our choices are all chaos – either chaos harnessed by a transparent, self-correcting social process, or chaos hidden and denied and eating at the roots of our software.

Posted Fri Apr 17 00:57:55 2015 Tags:
In case some readers of this blog would be interested in working with Open Source software and VoIP technologies, Be IP (http://www.beip.be) is hiring a developer. Please see http://www.beip.be/BeIP-Job-Offer.pdf for the job description. You can contact me directly.
Posted Wed Apr 15 09:58:06 2015 Tags:

I’ve been sent my panel schedule for Penguicon 2015.

Building the “Great Beast of Malvern” – Saturday 5:00 pm

One of us needed a new computer. One of us kicked off the campaign to
fund it. One of us assembled the massive system. One of us installed the
software. We were never all in the same place at the same time. All of us
blogged about it, and had a great time with the whole folderol. Come hear
how Eric “esr” Raymond got his monster machine, with ‘a little help from
his friends’ scattered all over the Internet.

Dark Chocolate Around The World – Sunday 12:00 pm

What makes one chocolate different from others? It’s not just how much
cocoa or sugar it contains or how it’s processed. Different varieties of
cocoa are grown in different parts of the world, and sometimes it’s the type of
beans that makes for different flavor qualities. Join Cathy and Eric Raymond for
a tasting session designed to show you how to tell West African chocolate
from Ecuadorian.

Eric S. Raymond: Ask Me Anything – Sunday 3:00 pm

Ask ESR Anything. What’s he been working on? What’s he shooting?
What’s he thinking about? What’s he building in there?

We do also intend to run the annual “Friends of Armed & Dangerous” party, but don’t yet know if we’re in a party-floor room.

“Geeks With Guns” is already scheduled.

Posted Sun Apr 12 00:28:28 2015 Tags:

This is the fourth part of my series of posts explaining the bitcoin Lightning Networks 0.5 draft paper.  See Part I, Part II and Part III.

The key revelation of the paper is that we can have a network of arbitrarily complicated transactions, such that they aren’t on the blockchain (and thus are fast, cheap and extremely scalable), but at every point are ready to be dropped onto the blockchain for resolution if there’s a problem.  This is genuinely revolutionary.

It also vindicates Satoshi’s insistence on the generality of the Bitcoin scripting system.  And though it’s long been suggested that bitcoin would become a clearing system on which genuine microtransactions would be layered, it was unclear that we were so close to having such a system in bitcoin already.

Note that the scheme requires some solution to malleability to allow chains of transactions to be built (this is a common theme, so likely to be mitigated in a future soft fork), but Gregory Maxwell points out that it also wants selective malleability, so transactions can be replaced without invalidating the HTLCs which are spending their outputs.  Thus it proposes new signature flags, which will require active debate, analysis and another soft fork.

There is much more to discover in the paper itself: recommendations for lightning network routing, the node charging model, a risk summary, the specifics of the softfork changes, and more.

I’ll leave you with a brief list of requirements to make Lightning Networks a reality:

  1. A soft-fork is required, to protect against malleability and to allow new signature modes.
  2. A new peer-to-peer protocol needs to be designed for the lightning network, including routing.
  3. Blame and rating systems are needed for lightning network nodes.  You don’t have to trust them, but it sucks if they go down as your money is probably stuck until the timeout.
  4. More refinements (eg. relative OP_CHECKLOCKTIMEVERIFY) to simplify and tighten timeout times.
  5. Wallets need to learn to use this, with UI handling of things like timeouts and fallbacks to the bitcoin network (sorry, your transaction failed, you’ll get your money back in N days).
  6. You need to be online every 40 days to check that an old HTLC hasn’t leaked, which will require some alternate solution for occasional users (shut down channel, have some third party, etc).
  7. A server implementation needs to be written.

That’s a lot of work!  But it’s all simply engineering from here, just as bitcoin was once the paper was released.  I look forward to seeing it happen (and I’m confident it will).

Posted Wed Apr 8 03:59:37 2015 Tags:

This is the third part of my series of posts explaining the bitcoin Lightning Networks 0.5 draft paper.

In Part I I described how a Poon-Dryja channel uses a single in-blockchain transaction to create off-blockchain transactions which can be safely updated by either party (as long as both agree), with fallback to publishing the latest versions to the blockchain if something goes wrong.

In Part II I described how Hashed Timelocked Contracts allow you to safely make one payment conditional upon another, so payments can be routed across untrusted parties using a series of transactions with decrementing timeout values.

Now we’ll join the two together: encapsulate Hashed Timelocked Contracts inside a channel, so they don’t have to be placed in the blockchain (unless something goes wrong).

Revision: Why Poon-Dryja Channels Work

Here’s half of a channel setup between me and you where I’m paying you 1c: (there’s always a mirror setup between you and me, so it’s symmetrical)

Half a channel: we will invalidate transaction 1 (in favour of a new transaction 2) to send funds.

The system works because after we agree on a new transaction (eg. to pay you another 1c), you revoke this by handing me your private keys to unlock that 1c output.  Now if you ever released Transaction 1, I can spend both the outputs.  If we want to add a new output to Transaction 1, we need to be able to make it similarly stealable.

Adding a 1c HTLC Output To Transaction 1 In The Channel

I’m going to send you 1c now via a HTLC (which means you’ll only get it if the riddle is answered; if it times out, I get the 1c back).  So we replace transaction 1 with transaction 2, which has three outputs: $9.98 to me, 1c to you, and 1c to the HTLC: (once we agree on the new transactions, we invalidate transaction 1 as detailed in Part I)

Our Channel With an Output for an HTLC

Note that you supply another separate signature (sig3) for this output, so you can reveal that private key later without giving away any other output.

We modify our previous HTLC design so that your revealing sig3 would allow me to steal this output. We do this the same way we did for that 1c going to you: send the output via a timelocked mutually signed transaction.  But there are two transaction paths in an HTLC: the got-the-riddle path and the timeout path, so we need to insert those timelocked mutually signed transactions in both of them.  First let’s append a 1 day delay to the timeout path:

Timeout path of HTLC, with locktime so it can be stolen once you give me your sig3.

Similarly, we need to append a timelocked transaction on the “got the riddle solution” path, which now needs my signature as well (otherwise you could create a replacement transaction and bypass the timelocked transaction):

Full HTLC: If you reveal Transaction 2 after we agree it’s been revoked, and I have your sig3 private key, I can spend that output before you can, down either the settlement or timeout paths.

Remember The Other Side?

Poon-Dryja channels are symmetrical, so the full version has a matching HTLC on the other side (except with my temporary keys, so you can catch me out if I use a revoked transaction).  Here’s the full diagram, just to be complete:

A complete lightning network channel with an HTLC, containing a glorious 13 transactions.

Closing The HTLC

When an HTLC is completed, we just update transaction 2, and don’t include the HTLC output.  The funds either get added to your output (R value revealed before timeout) or my output (timeout).

Note that we can have an arbitrary number of independent HTLCs in progress at once, and open and/or close as many in each transaction update as both parties agree to.
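
To make the closing step concrete, here is a minimal sketch of the bookkeeping only (no signatures, scripts or revocation). The dictionary layout, the output names and the `close_htlc` helper are assumptions made for illustration; amounts are in cents and match the $9.98 / 1c / 1c example above.

```python
def close_htlc(outputs, r_revealed):
    """Return the outputs of the replacement transaction, without the HTLC output.

    If R was revealed before the timeout the HTLC amount goes to "you",
    otherwise it returns to "me". All names here are illustrative.
    """
    updated = {name: amount for name, amount in outputs.items() if name != "htlc"}
    recipient = "you" if r_revealed else "me"
    updated[recipient] += outputs["htlc"]
    return updated


# Transaction 2 from the example: $9.98 to me, 1c to you, 1c locked in the HTLC.
tx2 = {"me": 998, "you": 1, "htlc": 1}

settled = close_htlc(tx2, r_revealed=True)     # riddle answered in time
timed_out = close_htlc(tx2, r_revealed=False)  # timeout reached first

print(settled)
print(timed_out)
```

Either way the total stays at $10.00; only the split between the two remaining outputs changes.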

Keys, Keys Everywhere!

Each output for a revocable transaction needs to use a separate address, so we can hand the private key to the other party.  We use two disposable keys for each HTLC[1], and every new HTLC will change one of the other outputs (either mine, if I’m paying you, or yours if you’re paying me), so that needs a new key too.  That’s 3 keys, doubled for the symmetry, to give 6 keys per HTLC.

Adam Back pointed out that we can actually implement this scheme without the private key handover, and instead sign a transaction for the other side which gives them the money immediately.  This would permit more key reuse, but means we’d have to store these transactions somewhere on the off chance we needed them.

Storing just the keys is smaller, but more importantly, Section 6.2 of the paper describes using BIP 32 key hierarchies so the disposable keys are derived: after a while, you only need to store one key for all the keys the other side has given you.  This is vastly more efficient than storing a transaction for every HTLC, and indicates the scale (thousands of HTLCs per second) that the authors have in mind.

Next: Conclusion

My next post will be a TL;DR summary, and some more references to the implementation details and possibilities provided by the paper.

 


[1] The new sighash types are fairly loose, and thus allow you to attach a transaction to a different parent if it uses the same output addresses.  I think we could re-use the same keys in both paths if we ensure that the order of keys required is reversed for one, but we’d still need 4 keys, so it seems a bit too tricky.

Posted Mon Apr 6 11:21:26 2015 Tags:

I’ve released shipper 1.7. The main new feature in this release is that it now knows how to play nice with repository collections managed by gitolite and browseable through gitweb, like this one.

What’s new is that shipper (described in detail here shortly before I shipped the 1.0 version) now treats a gitolite/gitweb collection as just another publishing channel. When you call shipper to announce an update on a project in the collection, it updates the ‘description’ and ‘README.html’ files in the repository from the project control file, thus ensuring that the gitweb view of the collection always displays up-to-date metadata.

This is yet more fallout from the impending Gitorious shutdown. I don’t know if my refugee projects from Gitorious will be hosted on thyrsus.com indefinitely; I’m considering several alternatives. But while they’re there I might as well figure out how to make updates as easy as possible so nobody else has to solve this problem and everyone’s productivity can go up.

Actually, I’m a little surprised that I have received neither bug reports nor feature requests on shipper since issuing the beta in 2013. This hints that either the software is perfect (highly unlikely) or nobody else has the problem it solves – that is, having to ship releases of software so frequently that one must either automate the process details or go mad.

Is that really true? Am I the only hacker with this problem? Or is there something I’m missing here? An enquiring mind wants to know.

Posted Sun Apr 5 11:50:22 2015 Tags:

Last Sunday I was informed by email that I have been nominated for the 2015 John W. Campbell award for best new science-fiction writer. I was also asked not to reveal this in public until 4 April.

This is a shame. I had a really elaborate April Fool’s joke planned where I was going to announce my nomination in the style of a U.S. presidential campaign launch. Lots of talk about a 50-state strategy and my hopes of appealing to swing voters disaffected with both the SJW and Evil League of Evil extremists, invented polling results, and nine yards of political bafflegab.

The plan was to write it so over-the-top that everyone would go “Oh, ha ha, great AFJ but you can’t fool us”…and then, three days later, the other shoe drops. Alas, I checked in with the organizers and they squelched the idea.

It is, of course, a considerable honor to be nominated, and one I am somewhat doubtful I actually deserve. But after considering the ramifications, I have decided not to decline the nomination, but rather to leave the decision on the merits up to the voters.

I make this choice because, even if I myself doubt that my single story is more than competent midlist work, and I want no part of the messy tribal politics in which I seem to have become partly swept up, there is something I don’t mind representing and giving people the opportunity to vote for.

That something is the proud tradition of classic SF, the Golden Age good stuff and its descendants today. It may be that I am among the least and humblest of those descendants, but I think both the virtues and the faults of Sucker Punch demonstrate vividly where I come from and how much that tradition has informed who I am as a writer and a human being.

If you choose to vote for Sucker Punch as a work which, individually flawed as it may be, upholds that tradition and carries it forward, that will make me happy and proud.

Posted Sat Apr 4 07:47:13 2015 Tags:
This release has a few changes in the user-visible output from Porcelain commands. These are not meant to be parsed by scripts, but the users still may want to be aware of the changes.
  • Output from "git log --decorate" (and the "%d" format specifier used in the "--format=<string>" parameter that the "git log" family of commands takes) used to list "HEAD" just like other branch names, separated with a comma in between. E.g.

         $ git log --decorate -1 master
         commit bdb0f6788fa5e3cacc4315e9ff318a27b2676ff4 (HEAD, master)
         ...
    This release updates the output slightly when HEAD refers to the tip of a branch whose name is also shown in the output.  The above is shown as:

         $ git log --decorate -1 master
         commit bdb0f6788fa5e3cacc4315e9ff318a27b2676ff4 (HEAD -> master)
         ...

  • The phrasing "git branch" uses to describe a detached HEAD has been updated to match that of "git status". When the HEAD is at the same commit as it was originally detached, they now both show "detached at <commit object name>". When the HEAD has moved since it was originally detached, they now both show "detached from <commit object name>". Earlier "git branch" always used "from", even when the user hasn't moved HEAD since it was detached.
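
The at/from rule in the last item can be written down as a tiny sketch. This is an illustration of the wording rule only, not Git's code; `describe_detached` is a hypothetical name, and both arguments are assumed to be commit object names.

```python
def describe_detached(head, detached_at):
    """Pick "at" when HEAD has not moved since it was detached, "from" otherwise."""
    word = "at" if head == detached_at else "from"
    return f"detached {word} {detached_at}"


print(describe_detached("bdb0f67", "bdb0f67"))  # HEAD still where it was detached
print(describe_detached("1a2b3c4", "bdb0f67"))  # HEAD has moved since then
```
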
Otherwise, there are only minor fixes and documentation updates everywhere, and an unusually low number of new and shiny toys ;-)
  • "git log --invert-grep --grep=WIP" will show only commits that do not have the string "WIP" in their messages.
  • "git push" has been taught a "--atomic" option that makes push to update more than one ref an "all-or-none" affair.
  • Extending the "push to deploy" added in 2.3, the behaviour of "git push" when updating the branch that is checked out can now be tweaked by push-to-checkout hook. The "push to deploy" implementation in 2.3 has a bug that makes it impossible to bootstrap an empty repository (or an unborn branch), but it can be worked around by using this hook.
  • "git send-email" used to accept a mistaken "y" (or "yes") as an answer to "What encoding do you want to use [UTF-8]? " without questioning.  Now it asks for confirmation when the answer looks too short to be a valid encoding name.
  • "git archive" can now be told to set the 'text' attribute in the resulting zip archive.
  • "git -C '' subcmd" used to refuse to work in the current directory, unlike "cd ''" which silently behaves as a no-op.
  • The versionsort.prerelease configuration variable can be used to specify that v1.0-pre1 comes before v1.0.
  • A new "push.followTags" configuration turns the "--follow-tags" option on by default for the "git push" command.
Please give it a good beating so that we can ship a successful v2.4 final at around the end of the month without regressions compared to v2.3 series.

Thanks.

Posted Thu Apr 2 22:47:00 2015 Tags:

Gitorious – which I preferred to GitHub for being totally open-source – is shutting down sometime in May. I had no fewer than 26 projects on there, including reposurgeon, cvs-fast-import, doclifter, and INTERCAL.

Now they’ve moved. This won’t affect most of my users, as the web pages and distribution tarballs are still in their accustomed locations at catb.org. If you’re a committer on any of these Gitorious repos, of course, the move actually matters.

Temporarily the repositories are on thyrsus.com; here’s the entire list. They may not stay there, but moving them to thyrsus.com was 90% of the work of moving them anywhere else and now I can consider options at my leisure.

Posted Thu Apr 2 20:26:53 2015 Tags:

In Part I, we demonstrated Poon-Dryja channels; a generalized channel structure which used revocable transactions to ensure that old transactions wouldn’t be reused.

A channel from me<->you would allow me to efficiently send you 1c, but that doesn’t scale since it takes at least one on-blockchain transaction to set up each channel. The solution to this is to route funds via intermediaries;  in this example we’ll use the fictitious “MtBox”.

If I already have a channel with MtBox’s Payment Node, and so do you, that lets me reliably send 1c to MtBox without (usually) needing the blockchain, and it lets MtBox send you 1c with similar efficiency.

But it doesn’t give me a way to force them to send it to you; I have to trust them.  We can do better.

Bonding Unrelated Transactions using Riddles

For simplicity, let’s ignore channels for the moment.  Here’s the “trust MtBox” solution:

I send you 1c via MtBox; simplest possible version, using two independent transactions. I trust MtBox to generate its transaction after I send it mine.

What if we could bond these transactions together somehow, so that when you spend the output from the MtBox transaction, that automatically allows MtBox to spend the output from my transaction?

Here’s one way. You send me a riddle question to which nobody else knows the answer: eg. “What’s brown and sticky?”.  I then promise MtBox the 1c if they answer that riddle correctly, and tell MtBox that you know.

MtBox doesn’t know the answer, so it turns around and promises to pay you 1c if you answer “What’s brown and sticky?”. When you answer “A stick”, MtBox can pay you 1c knowing that it can collect the 1c off me.

The bitcoin blockchain is really good at riddles; in particular “what value hashes to this one?” is easy to express in the scripting language. So you pick a random secret value R, then hash it to get H, then send me H.  My transaction’s 1c output requires MtBox’s signature, and a value which hashes to H (ie. R).  MtBox adds the same requirement to its transaction output, so if you spend it, it can get its money back from me:

Two Independent Transactions, Connected by A Hash Riddle.
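
The riddle itself is easy to sketch outside of bitcoin. The snippet below is only an illustration: SHA-256 is an assumed choice of hash, the 20-byte size of R is taken from the contract terms the paper uses, and `make_riddle` and `answers_riddle` are made-up helper names. Real outputs would express the same check in the scripting language, together with the signature requirement.

```python
import hashlib
import secrets


def make_riddle():
    """You pick a random secret R and hand out only its hash H."""
    r = secrets.token_bytes(20)
    return r, hashlib.sha256(r).digest()


def answers_riddle(candidate, h):
    """The hash condition on the output: does `candidate` hash to H?"""
    return hashlib.sha256(candidate).digest() == h


r, h = make_riddle()
print(answers_riddle(r, h))           # the real R unlocks the output
print(answers_riddle(b"a stick", h))  # a guess does not
```

Because both my output and MtBox's output carry the same H, the R you reveal to collect from MtBox is exactly what MtBox needs to collect from me.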

Handling Failure Using Timeouts

This example is too simplistic; when MtBox’s PHP script stops processing transactions, I won’t be able to get my 1c back if I’ve already published my transaction.  So we use a familiar trick from Part I, a timeout transaction which after (say) 2 days, returns the funds to me.  This output needs both my and MtBox’s signatures, and MtBox supplies me with the refund transaction containing the timeout:

Hash Riddle Transaction, With Timeout

MtBox similarly needs a timeout in case you disappear.  And it needs to make sure it gets the answer to the riddle from you within that 2 days, otherwise I might use my timeout transaction and it can’t get its money back.  To give plenty of margin, it uses a 1 day timeout:

MtBox Needs Your Riddle Answer Before It Can Answer Mine

Chaining Together

It’s fairly clear to see that longer paths are possible, using the same “timelocked” transactions.  The paper uses 1 day per hop, so if you were 5 hops away (say, me <-> MtBox <-> Carol <-> David <-> Evie <-> you) I would use a 5 day timeout to MtBox, MtBox a 4 day to Carol, etc.  A routing protocol is required, but if some routing doesn’t work two nodes can always cancel by mutual agreement (by creating a timeout transaction with no locktime).
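
The decrementing timeouts along a path can be computed mechanically. A small sketch, assuming the 1 day per hop mentioned above; `hop_timeouts` is a hypothetical helper, not something from the paper.

```python
def hop_timeouts(path, days_per_hop=1):
    """Return (payer, payee, timeout_days) per hop, counting down towards the recipient."""
    hops = len(path) - 1
    return [(path[i], path[i + 1], (hops - i) * days_per_hop) for i in range(hops)]


route = ["me", "MtBox", "Carol", "David", "Evie", "you"]
for payer, payee, days in hop_timeouts(route):
    print(f"{payer} -> {payee}: {days} day timeout")
```

Each node's outgoing timeout is one day shorter than its incoming one, so it always has a day of margin to learn R downstream and then claim its funds upstream.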

The paper refers to each set of transactions as contracts, with the following terms:

  • If you can produce to MtBox an unknown 20-byte random input data R from a known H, within two days, then MtBox will settle the contract by paying you 1c.
  • If two days have elapsed, then the above clause is null and void and the clearing process is invalidated.
  • Either party may (and should) pay out according to the terms of this contract in any method of the participants choosing and close out this contract early so long as both participants in this contract agree.

The hashing and timelock properties of the transactions are what allow them to be chained across a network, hence the term Hashed Timelock Contracts.

Next: Using Channels With Hashed Timelock Contracts.

The hashed riddle construct is cute, but as detailed above every transaction would need to be published on the blockchain, which makes it pretty pointless.  So the next step is to embed them into a Poon-Dryja channel, so that (in the normal, cooperative case) they don’t need to reach the blockchain at all.

Posted Wed Apr 1 11:46:29 2015 Tags:
Your push may fail due to “non fast-forward”. You start from a history that is identical to that of your upstream, commit your work on top of it, and then by the time you attempt to push it back, the upstream may have advanced because somebody else was also working on his own changes.


For example, between the upstream and your repositories, histories may diverge this way (the asterisk denotes the tip of the branch; the time flows from left to right as usual):


Upstream                                You


---A---B---C*      --- fetch -->        ---A---B---C*


                                                    D*
                                                   /
---A---B---C---E*                       ---A---B---C


            D?                                      D*
           /                                       /
---A---B---C---E?   <-- push ---        ---A---B---C


If the push moved the branch at the upstream to point at your commit, you will be discarding other people’s work. To avoid doing so, git push fails with “Non fast-forward”.
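
The fast-forward condition can be sketched on a toy model of the history. This is a simplified illustration, not how Git implements the check: commits are plain strings, `parents` maps each commit to its list of parents, and the helper names are made up.

```python
def ancestors(parents, tip):
    """Return the set of commits reachable from `tip`, including `tip` itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents.get(commit, []))
    return seen


def is_fast_forward(parents, old_tip, new_tip):
    """Moving a branch is a fast-forward when the old tip is reachable from the new one."""
    return old_tip in ancestors(parents, new_tip)


# Histories from the illustration: D (yours) and E (somebody else's) both sit on top of C.
parents = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"], "E": ["C"]}

print(is_fast_forward(parents, "C", "D"))  # True: upstream is still at C
print(is_fast_forward(parents, "E", "D"))  # False: upstream advanced to E, push is refused
```
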


The standard recommendation when this happens is to “fetch, merge and then push back”. The histories will diverge and then converge like this:


Upstream                                You


                                                    D*
                                                   /
---A---B---C---E*  --- fetch -->        ---A---B---C---E


                                                       1
                                                    D---F*
                                                   /   /2
---A---B---C---E*                       ---A---B---C---E


               1                                       1
            D---F*                                  D---F*
           /   /2                                  /   /2
---A---B---C---E    <-- push ---        ---A---B---C---E


Now, the updated tip of the branch has the previous tip of the upstream (E) as its parent, so the overall history does not lose other people’s work.


The resulting history, however, is not what the majority of the project participants would appreciate. The merge result records D as its first parent (denoted with 1 on the edge to the parent), as if what happened on the upstream (E) were done as a side branch while F was being prepared and pushed back. In reality, E in the illustration may not be a single commit but can be many commits and many merges done by many people, and these many commits may have been observed as the tips of the upstream’s history by many people before F got pushed.

Even though Git treats all parents of a merge equally at the level of the underlying data model, the users have come to expect that the history they will see by following the first-parent chain tells the overall picture of the shared project history, while second and later parents of merges represent work done on side branches. From this point of view, "fetch, merge and then push" is not quite the right suggestion for how to proceed after a push fails due to "non fast-forward".
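
To see what following the first-parent chain means for the merge F above, here is a toy sketch. It assumes the same simplified model of history as a mapping from each commit to its ordered list of parents; `first_parent_chain` is a made-up helper.

```python
def first_parent_chain(parents, tip):
    """Walk from `tip` back to the root, always taking the first parent."""
    chain = [tip]
    while parents.get(chain[-1]):
        chain.append(parents[chain[-1]][0])
    return chain


# F is the merge made by "fetch, merge and then push": first parent D, second parent E.
parents = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"], "E": ["C"], "F": ["D", "E"]}

print(" ".join(first_parent_chain(parents, "F")))  # F D C B A
```

E, which stands for everything everybody else did upstream, does not appear on the chain at all.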


It is tempting to recommend “fetch, merge backwards and then push back” as an alternative, and it almost works for a simple history:


Upstream                                You


                                                    D*
                                                   /
---A---B---C---E*  --- fetch -->        ---A---B---C---E


                                                       2
                                                    D---F*
                                                   /   /1
---A---B---C---E*                       ---A---B---C---E


               2                                       2
            D---F*                                  D---F*
           /   /1                                  /   /1
---A---B---C---E    <-- push ---        ---A---B---C---E


Then, if you follow the first-parent chain of the history, you will see how the tip of the overall project progressed. This is an improvement over the “fetch, merge and then push back”, but it has a few problems.


One reason why “merge backwards” is wrong becomes apparent when you consider what should happen when the push fails for the second time after the backward merge is made:


Upstream                                You


                                                    D*
                                                   /
---A---B---C---E*  --- fetch -->        ---A---B---C---E


                                                       2
                                                    D---F*
                                                   /   /1
---A---B---C---E*                       ---A---B---C---E


               2                                       2
            D---F?                                  D---F*
           /   /1                                  /   /1
---A---B---C---E---G    <-- push ---    ---A---B---C---E


               2                                       2   2
            D---F?                                  D---F---H*
           /   /1                                  /   /1  /1
---A---B---C---E---G    --- fetch -->   ---A---B---C---E---G


If the upstream side gained another commit G while F was being prepared, “fetch, merge backwards and then push” will end up creating a history like this, hiding D, the only real change you did in the repository, as the tip of the side branch of a side branch!
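To make the "side branch of a side branch" point concrete, here is a minimal sketch that models the last illustration as a parent table and walks it. The table and both helper functions are illustrative assumptions, not anything Git itself provides; the commit letters and parent order are taken from the picture above.

```python
# Toy model of the final "merge backwards" picture.
# Each commit maps to its parents; index 0 is the first parent.
PARENTS = {
    "A": [], "B": ["A"], "C": ["B"],
    "E": ["C"], "G": ["E"],   # what happened upstream
    "D": ["C"],               # the only real change you made
    "F": ["E", "D"],          # first backward merge: E is parent 1, D is parent 2
    "H": ["G", "F"],          # second backward merge: G is parent 1, F is parent 2
}


def first_parent_chain(tip):
    """Follow only first parents from tip, newest commit first."""
    chain = []
    while tip is not None:
        chain.append(tip)
        parents = PARENTS[tip]
        tip = parents[0] if parents else None
    return chain


def side_branch_depth(tip, target):
    """Fewest non-first-parent edges needed to reach target from tip.

    Returns None when target is not an ancestor of tip.
    """
    if tip == target:
        return 0
    best = None
    for index, parent in enumerate(PARENTS[tip]):
        depth = side_branch_depth(parent, target)
        if depth is None:
            continue
        cost = depth + (0 if index == 0 else 1)
        if best is None or cost < best:
            best = cost
    return best


print(first_parent_chain("H"))
print(side_branch_depth("H", "D"))
```

The first-parent chain from H never visits D, and reaching D takes two hops through second parents, which is the nesting the illustration shows.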

It also does not solve the problem if the work you did in D is not a single strand of pearls, but has merges from side branches. If D in the above series of illustrations were a few merges X, Y and Z from side branches of independent topics, the picture on your side, after fetching E from the updated upstream, may look like this:


    y---y---y   .
   /         \   .
  .   x---x   \   \
 .   /     \   \   \
.   /       X---Y---Z*
   /       /
---A---B---C---E


That is, hoping that the other people will stay quiet, starting from C, you merged three independent topic branches on top of it with merges X, Y and Z, and hoped that the overall project history would fast-forward to Z. From your perspective, you wanted to make A-B-C-X-Y-Z to be the main history of the project, while x, y, ... were implementation details of X, Y and Z that are hidden behind merges on side branches. And if there were no E, that would indeed have been the overall project history people would have seen after your push.

Merging backwards and pushing back would however make the history’s tip F, with its first parent E, and Z becomes a side branch. The fact that X, Y and Z (more precisely, X^2 and Y^2 and Z^2) were independent topics is lost by doing so:


    y---y---y   .
   /         \   .
  .   x---x   \   \
 .   /     \   \   \
.   /       X---Y---Z
   /       /         \2
---A---B---C---E-------F*
                     1



So "merge backwards" is not the right solution in general. It is only valid if you are building a topic directly on top of the shared integration branch, which is something you should not be doing in the first place. In the earlier illustration of creating a single D on top of C and pushing it, if there were no work from other people (i.e. E), the push would have fast-forwarded, making D a normal commit directly on the first-parent chain. If there were work from other people like E, “merge in reverse” would instead have recorded D on a side branch. If D is a topic separate and independent from other work being done in parallel, you would consistently want to see such a change appear as a merge of a side branch.

A better recommendation might be to “fetch, rebuild the first-parent chain, and then push back”. That is, you would rebuild X, Y and Z (i.e. “git log --first-parent C..”) on top of the updated upstream E:


    y---y-------y   .
   /             \   .
  .   x-------x   \   \
 .   /         \   \   \
.   /           X’--Y’--Z’*
   /           /
---A---B---C---E


Note that this will work well naturally even when your first-parent chain has non-merge commits. For example, X and Y in the above illustration may be merges while Z is a regular commit that updates the release notes with descriptions of what was recently merged (i.e. X and Y). Rebuilding such a first-parent chain on top of E will make the resulting history very easy to understand when the reader follows the first-parent chain.

The reason why “rebuild the first-parent chain on the updated upstream” works the best is tautological. People do care about the first-parenthood when viewing the history, and you must have cared about the first-parent chain, too, when building your history leading to Z. That first-parenthood you and others care about is what is being preserved here. By definition, we cannot go wrong ;-)

And of course, this will work against a moving upstream that gained new commits while we were fixing things up on our end, because we won't be piling new merges on top, but will be rebuilding X', Y' and Z' into X'', Y'', and Z'' instead.

To make this work on the pusher’s end, after seeing the initial “non fast-forward” refusal from “git push”, the pusher may need to do something like this:


$ git push ;# fails
$ git fetch
$ git rebase --first-parent @{upstream}


Note that “git rebase --first-parent” does not exist yet; it is one of the topics I would like to see resurrected from old discussions.

But before "rebase --first-parent" materialises, in the scenario illustrated above, the pusher can do these instead of that command:


$ git reset --hard @{upstream}
$ git merge X^2
$ git merge Y^2
$ git merge Z^2


And then, inspect the result thoroughly. As carefully as you checked your work before you attempted your first push that was rejected. After that, hopefully your history will fast-forward the upstream and everybody will be happy.
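The manual sequence above can be pictured as a small transformation on the first-parent chain. The following is only a sketch of the idea on a toy data structure, not of how any Git command is implemented; the function name and the side-branch tip names (`x_tip`, `y_tip`, `z_tip`) are hypothetical stand-ins for X^2, Y^2 and Z^2.

```python
def rebuild_first_parent_chain(chain, new_base):
    """Replay a first-parent chain on top of new_base.

    chain is a list of (name, side_parents) pairs, oldest first.
    A regular (non-merge) commit simply has no side parents.
    Returns a dict mapping each rebuilt commit to its parent list,
    with the first parent at index 0.
    """
    rebuilt = {}
    parent = new_base
    for name, side_parents in chain:
        new_name = name + "'"
        rebuilt[new_name] = [parent] + list(side_parents)
        parent = new_name
    return rebuilt


# X, Y and Z each merged one topic; the tip names are made up for illustration.
chain = [("X", ["x_tip"]), ("Y", ["y_tip"]), ("Z", ["z_tip"])]
history = rebuild_first_parent_chain(chain, "E")
for commit, parents in history.items():
    print(commit, parents)
```

Every rebuilt commit keeps its topic as the second parent while the first parent now leads back to E, so the side branches stay side branches.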


Posted Mon Mar 30 22:09:00 2015 Tags:

I finally took a second swing at understanding the Lightning Network paper.  The promise of this work is exceptional: instant reliable transactions across the bitcoin network. The implementation is complex and the draft paper reads like a grab bag of ideas, but it truly rewards close reading!  It doesn’t involve novel crypto, nor fancy bitcoin scripting tricks.

There are several techniques which are used in the paper, so I plan to concentrate on one per post and wrap up at the end.

Revision: Payment Channels

I open a payment channel to you for up to $10

A Payment Channel is a method for sending microtransactions to a single recipient, such as me paying you 1c a minute for internet access.  I create an opening transaction which has a $10 output, which can only be redeemed by a transaction input signed by you and me (or me alone, after a timeout, just in case you vanish).  That opening transaction goes into the blockchain, and we’re sure it’s bedded down.

I pay you 1c in the payment channel. Claim it any time!

Then I send you a signed transaction which spends that opening transaction output, and has two outputs: one for $9.99 to me, and one for 1c to you.  If you want, you could sign that transaction too, and publish it immediately to get your 1c.

Update: now I pay you 2c via the payment channel.

Then a minute later, I send you a signed transaction which spends that same opening transaction output, and has a $9.98 output for me, and a 2c output for you. Each minute, I send you another transaction, increasing the amount you get every time.

This works because:

  1. Each transaction I send spends the same output; so only one of them can ever be included in the blockchain.
  2. I can’t publish them, since they need your signature and I don’t have it.
  3. At the end, you will presumably publish the last one, which is best for you.  You could publish an earlier one, and cheat yourself of money, but that’s not my problem.
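The bookkeeping behind these three points can be sketched in a few lines. This is a toy model of the amounts only, with no real transactions or signatures; the `Update` class, `make_update` helper and the cents-based constant are assumptions made for the illustration.

```python
from dataclasses import dataclass

CAPACITY_CENTS = 1000  # the $10 output of the opening transaction


@dataclass(frozen=True)
class Update:
    to_payer: int      # cents that come back to me
    to_recipient: int  # cents that go to you


def make_update(paid_cents):
    """Build the pair of outputs for a given total paid so far."""
    if not 0 <= paid_cents <= CAPACITY_CENTS:
        raise ValueError("payment must fit inside the opening output")
    return Update(CAPACITY_CENTS - paid_cents, paid_cents)


# One update per minute: 1c, then 2c, then 3c.
updates = [make_update(minute) for minute in range(1, 4)]

# All updates spend the same opening output, so only one can be published.
# The recipient's best choice is the one that pays them the most.
best = max(updates, key=lambda update: update.to_recipient)
print(best)
```

Because the amount owed to the recipient only ever grows, the most recent update is always the one they prefer, which is why nothing needs to be revoked in this one-way case.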

Undoing A Promise: Revoking Transactions?

In the simple channel case above, we don’t have to revoke or cancel old transactions, as the only person who can spend them is the person who would be cheated.  This makes the payment channel one way: if the amount I was paying you ever went down, you could simply broadcast one of the older, more profitable transactions.

So if we wanted to revoke an old transaction, how would we do it?

There’s no native way in bitcoin to have a transaction which expires.  You can have a transaction which is valid after 5 days (using locktime), but you can’t have one which is valid until 5 days have passed.

So the only way to invalidate a transaction is to spend one of its inputs, and get that input-stealing transaction into the blockchain before the transaction you’re trying to invalidate.  That’s no good if we’re trying to update a transaction continuously (a-la payment channels) without most of them reaching the blockchain.

The Transaction Revocation Trick

But there’s a trick, as described in the paper.  We build our transaction as before (I sign, and you hold), which spends our opening transaction output, and has two outputs.  The first is a $9.99 output for me.  The second is a bit weird–it’s 1c, but needs two signatures to spend: mine and a temporary one of yours.  Indeed, I create and sign such a transaction which spends this output, and send it to you, but that transaction has a locktime of 1 day:

The first payment in a lightning-style channel.

Now, if you sign and publish that transaction, I can spend my $9.99 straight away, and you can publish that timelocked transaction tomorrow and get your 1c.

But what if we want to update the transaction?  We create a new transaction, with a $9.98 output to me and a 2c output which needs signatures from both me and another temporary address of yours.  I create and sign a transaction which spends that 2c output, has a locktime of 1 day and has an output going to you, and send it to you.

We can revoke the old transaction: you simply give me the temporary private key you used for that transaction.  Weird, I know (and that’s why you had to generate a temporary address for it).  Now, if you were ever to sign and publish that old transaction, I can spend my $9.99 straight away, and create a transaction using your key and my key to spend your 1c.  Your transaction (1a below) which could spend that 1c output is timelocked, so I’ll definitely get my 1c transaction into the blockchain first (and the paper uses a timelock of 40 days, not 1).

Updating the payment in a lightning-style channel: you sent me your private key for sig2, so I could spend both outputs of Transaction 1 if you were to publish it.

So the effect is that the old transaction is revoked: if you were to ever sign and release it, I could steal all the money.  Neat trick, right?
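The incentive this creates can be shown with a toy payoff model. It tracks only who ends up with which amount, and it assumes I notice the publication and act before your 1 day timelock expires; the `Commitment` class and `settle` function are hypothetical names for the sketch, not anything from the paper.

```python
from dataclasses import dataclass


@dataclass
class Commitment:
    to_me_cents: int
    to_you_cents: int
    # True once you have handed me the temporary private key for this one.
    temp_key_revealed: bool = False


def settle(commitment):
    """Return (my cents, your cents) if you sign and publish this commitment."""
    if commitment.temp_key_revealed:
        # I hold both keys for the two-signature output, and your own spend
        # of it is timelocked, so my spend reaches the blockchain first.
        return (commitment.to_me_cents + commitment.to_you_cents, 0)
    return (commitment.to_me_cents, commitment.to_you_cents)


old = Commitment(to_me_cents=999, to_you_cents=1)
new = Commitment(to_me_cents=998, to_you_cents=2)

# Updating the channel: you reveal the temporary key for the old commitment.
old.temp_key_revealed = True

print(settle(old))
print(settle(new))
```

Publishing the revoked commitment leaves you with nothing, while the current one pays out as agreed, so you have no reason to reach for the old one.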

A Minor Variation To Avoid Timeout Fallback

In the original payment channel, the opening transaction had a fallback clause: after some time, it is all spendable by me.  If you stop responding, I have to wait for this to kick in to get my money back.  Instead, the paper uses a pair of these “revocable” transaction structures.  The second is a mirror image of the first, in effect.

A full symmetric, bi-directional payment channel.

So the first output is $9.99 which needs your signature and a temporary signature of mine.  The second is 1c for you.  You sign the transaction, and I hold it.  You create and sign a transaction which has that $9.99 as input, a 1 day locktime, and send it to me.

Since both your and my “revocable” transactions spend the same output, only one can reach the blockchain.  They’re basically equivalent: if you send yours you must wait 1 day for your money.  If I send mine, I have to wait 1 day for my money.  But it means either of us can finalize the payment at any time, so the opening transaction doesn’t need a timeout clause.

Next…

Now we have a generalized transaction channel, which can spend the opening transaction in any way we both agree on, without trust or requiring on-blockchain updates (unless things break down).

The next post will discuss Hashed Timelock Contracts (HTLCs) which can be used to create chains of payments…

Notes For Pedants:

In the payment channel open I assume OP_CHECKLOCKTIMEVERIFY, which isn’t yet in bitcoin.  It’s simpler.

I ignore transaction fees as an unnecessary distraction.

We need malleability fixes, so you can’t mutate a transaction and break the ones which follow.  But I also need the ability to sign Transaction 1a without a complete Transaction 1 (since you can’t expose the signed version to me).  The paper proposes new SIGHASH types to allow this.

[EDIT 2015-03-30 22:11:59+10:30: We also need to sign the other symmetric transactions before signing the opening transaction.  If we released a completed opening transaction before having the other transactions, we might be stuck with no way to get our funds back (as we don’t have a “return all to me” timeout on the opening transaction)]

Posted Mon Mar 30 10:47:32 2015 Tags:
Following up to the previous post, I computed a few numbers for each development cycle in the recent past.

In all the graphs in this article, the horizontal axis counts the number of days into the development cycle, and the vertical axis shows the number of non-merge commits made.

  • The bottom line in each graph shows the number of non-merge commits that went to the contemporary maintenance track.
  • The middle line shows the number of non-merge commits that went to the release but not to the maintenance track (i.e. shiny new toys, oops-fixes to them, and clean-ups that were too minor to be worth merging to the maintenance track), and
  • The top line shows the total number of non-merge commits in the release.

Even though I somehow have a fond memory of v1.5.3, the beginning of the modern Git was unarguably the v1.6.0 release. Its development cycle started in June 2008 and ended in August 2008. We can see that we were constantly adding a lot more new shiny toys (this cycle had the big "no more git-foo in user's $PATH" change) than we were applying fixes to the maintenance track during this period. This cycle lasted for 60 days, 731 commits in total among which 120 went to the maintenance track of its time.


During the development cycle that led to v1.8.0 (August 2012 to October 2012), the pattern is very different. We cook our topics longer in the 'next' branch and we can clearly see that the topics graduate to 'master' in batches, which appear as jumps in the graph.  This cycle lasted for 63 days, 497 commits in total among which 182 went to the maintenance track of its time.


The cycle that led to v2.0.0 (February 2014 to June 2014) has a similar pattern, but as another "we now break backward compatibility for ancient UI wart" release, we can see that a large batch of changes were merged in the early part of the cycle, hoping to give them better and longer exposure to the testing public; on the other hand, we did not do too many fixes to the maintenance track.  This cycle lasted for 103 days, 475 commits in total among which 90 went to the maintenance track of its time.



The numbers for the current cycle leading to v2.4 (February 2015 to April 2015) are not finalized yet, but we can clearly see that this cycle is more about fixing old bugs than introducing shiny new toys from this graph.  This cycle as of this writing is at its 50th day, 344 commits in total so far among which 115 went to the maintenance track.
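For readers who want the four cycles side by side, the figures quoted above can be turned into two derived numbers each: the share of commits that also went to the maintenance track, and the average number of non-merge commits per day. The sketch below only redoes the arithmetic on the numbers already given; the helper names are made up for the example.

```python
# (release, days in cycle, non-merge commits, commits also on the maintenance track)
CYCLES = [
    ("v1.6.0", 60, 731, 120),
    ("v1.8.0", 63, 497, 182),
    ("v2.0.0", 103, 475, 90),
    ("v2.4 (so far)", 50, 344, 115),
]


def maint_share(total, maint):
    """Percentage of the cycle's commits that went to the maintenance track."""
    return round(100 * maint / total, 1)


def commits_per_day(total, days):
    """Average number of non-merge commits per day of the cycle."""
    return round(total / days, 1)


SUMMARY = {
    name: (maint_share(total, maint), commits_per_day(total, days))
    for name, days, total, maint in CYCLES
}

for name, (share, pace) in SUMMARY.items():
    print(f"{name}: {share}% to maint, {pace} commits/day")
```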



Note that we should not be alarmed by the sharp rise at the end of the graph. We just entered the pre-release freeze period and the jump shows the final batch of topics graduating to the 'master' branch. We will have a few more weeks until the final, and during that period the graph will hopefully stay reasonably flat (any rise from this point on would mean we would be doing last-minute "oops" fixes).

Posted Fri Mar 27 21:14:00 2015 Tags:
Earlier in the day, an early preview release for the next release of Git, 2.4-rc0, was tagged. Unlike many major releases in the past, this development cycle turned out to be relatively calm, fixing many usability warts and bugs, while introducing only a few new shiny toys.

In fact, the ratio of changes that are fixes and clean-ups in this release is unusually high compared to recent releases. We keep a series of patches around each topic, whether it is a bugfix, a clean-up, or a new shiny toy, on its own topic branch, and each branch is merged to the 'master' branch after reviewing and testing, and then fixes and trivial clean-ups are also merged to the 'maint' branch. Because of this project structure, it is relatively easy to tell fixes and enhancements apart. Among new commits in release X since release (X-1), the ones that appear also in the last maintenance track for release (X-1) are fixes and clean-ups, while the remainder are enhancements.

Among the changes that went into v1.9.0 since v1.8.5, 23% of them were fixes that got merged to v1.8.5.6, for example, and this number has been more or less stable throughout the last year. Among the changes in v2.3.0 since v2.2.0, 18% of them were also in v2.2.2. Today's preview v2.4.0-rc0, however, has 333 changes since v2.3.0, among which 110 are in v2.3.4, which means that 33% of the changes are fixes and clean-ups.
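The classification rule and the 33% figure can be reproduced with a short sketch. The set-based split mirrors the rule described above (a new commit that also appears on the maintenance track counts as a fix); the commit identifiers in the toy example are invented, while 333 and 110 are the counts quoted for v2.4.0-rc0.

```python
def split_changes(new_in_release, in_maint):
    """Split new commits into (fixes, enhancements) by maintenance-track membership."""
    fixes = new_in_release & in_maint
    enhancements = new_in_release - in_maint
    return fixes, enhancements


def fix_share(total_changes, also_in_maint):
    """Percentage of changes that are fixes and clean-ups, rounded to a whole number."""
    return round(100 * also_in_maint / total_changes)


# Toy example with made-up commit names.
fixes, enhancements = split_changes({"c1", "c2", "c3"}, {"c2", "c9"})
print(sorted(fixes), sorted(enhancements))

# The numbers quoted for v2.4.0-rc0: 333 changes since v2.3.0, 110 also in v2.3.4.
print(fix_share(333, 110))
```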

These fixes came from 33 contributors in total, but changes from only a few usual suspects dominate and most other contributors have only one or two changes on the maintenance track. It is illuminating to compare the output between

$ git shortlog --no-merges -n -s ^maint v2.3.0..master
$ git shortlog --no-merges -n -s v2.3.0..maint

to see who prefers to work on new shiny toys and who works on product quality by fixing other people's bugs. The first command sorts the contributors by the number of commits since v2.3.0 that are only in the 'master', i.e. new shiny toys, and the second command sorts the contributors by the number of commits since v2.3.0 that are in the 'maint', i.e. fixes and clean-ups.

The output matches my perception (as the project maintainer, I at least look at, if not read carefully, all the changes) of each contributor's strength and weakness fairly well. Some are always looking for new and exciting things while being bad at tying loose ends, while others are more careful perfectionists.

Posted Fri Mar 27 05:34:00 2015 Tags:
Christian Couder (who is known for his work enhancing the "git bisect" command several years ago) and Thomas Ferris Nicolaisen (who hosts a popular podcast GitMinutes) started producing a newsletter for the Git development community and named it Git Rev News.

Here is what the newsletter is about in their words:

Our goal is to aggregate and communicate some of the activities on the Git mailing list in a format that the wider tech community can follow and understand. In addition, we'll link to some of the interesting Git-related articles, tools and projects we come across.

This edition covers what happened during the month of March 2015.

As one of the people who still remembers "Git Traffic", which was meant to be an ongoing summary of the Git mailing list traffic but disappeared after publishing its first and only issue, I find this a very welcome development. Because our mailing list is a fairly high-volume one, it is almost impossible to keep up with everything that happens there, unless you are actively involved in the development process.

I hope their effort will continue and benefit the wider Git ecosystem. You can help them out in various ways if you are interested.

  • They are not entirely happy with how the newsletter is formatted. If you are handy with HTML, CSS or some blog publishing platforms, they would appreciate help in this area.
  • They are not paid full-time editors but doing this as volunteers. They would appreciate editorial help as well.
  • You can contribute by writing your own articles that summarize the discussions you found interesting on the mailing list.

Posted Wed Mar 25 20:58:00 2015 Tags:
