Differences between revisions 8 and 9
Revision 8 as of 2008-06-27 16:24:49
Size: 7687
Editor: GlennTarbox
Comment:
Revision 9 as of 2008-06-27 16:37:18
Size: 8046
Editor: GlennTarbox
Comment:


Developing with DVCS

Distributed Version Control Systems are a huge improvement to the development process for a number of reasons, but those are easily researched on Google, so I'm not going to get into all the issues. Some thoughts:

  1. There are only branches and revisions.
    • Think of code development as a process through time and space.
    • Space is discretized into developers
    • Time is discretized into commits
    • Add a little metadata linking points in the developer / commit sequence and you have a graph describing the evolution of the code. Throw on some effectively collision-free identifiers (SHA-1) and you have the complete genetic history of every line of code.
    • Repositories are a cache / detail without significance in and of themselves
    • Don't forget to push often so others can import your changes
  2. Only push to repos you own
    • This is probably the least understood aspect of DVCS. The key concept to understand here is that as long as you can get other people's work, you should be pulling and merging from their repos, and only pushing to your own. An example is the repo on your laptop being pushed to your repo on the server for publishing.

    • When your repo is published, others can pull / merge / publish. By convention, certain people's repos, specifically certain branches on well-known repos, will have meaning.
    • There are no permissions required, which turns version control on its head. You publish your work and make it known (the usual suspects), and it can be pulled by anyone with interest, as part of whatever process a specific project uses.
  3. You are not your code - http://blog.red-bean.com/sussman/?p=96

  4. Two good videos of the concept and Linus in action. The second is What Linus means to say is

  5. Patching is less useful than publishing your repo
    • Much finer-grained visibility. Proper branch naming communicates the author's intended state of readiness
    • For developers without a way to publish a repo, patches can be submitted to Trac
    • Check the patch for collisions with the "blessed" repo, as discussed below
    • The patch will be merged into the repo on a branch that depends on its state of testing / modification

Clone Toggling

There are always developers for whom keeping multiple clones on their local machine is a practical necessity due to the rebuild times when doing major revisions. Git, Bzr, and Hg all support this. All that's really required is a push / merge to a single local repo as an intermediate step before publishing your repo.

On Hg

Hg is the least capable of the three real DVCS candidates: Git, Bzr, and Hg. At its foundation, it lacks a proper storage architecture, and the entire discussion really ends there. Its developers chose the “file append” approach to object management because they were more interested in the user-facing layers. All that is fine, but it limited where Hg could go...

As DVCS typically becomes religion, Hg has its own cult built up around the “Hg way”, including the idea that one can't delete history. Of course, this is not a philosophical position; it's a technical necessity based on the file-append approach to storage.

  1. The implications are huge and force most development activities into the “clone first, ask questions later” approach. In effect, hg clone is needed because Hg can't do real branching. “Named branching” in Hg is entirely different from branching in Git and Bzr. In Hg's case, a branch is intended to be around for a really long time... in fact, you can never delete a branch without the usual routine: clone back to somewhere in history, then apply a patchset to adjust history.

    • Of course, this has spawned various adjunct toolsets, like patch queues, that try to overcome Hg's fundamental limitations.
  2. Projects tend to devolve to the tarball / patch distributed development approach. Maybe Linus would be happy.

Hg for sage-finance and dsageng

The current model is that you push to your own Hg repos. For most, this will mean pushing to a repo in your home directory on sage.math. I'll pull revisions / branches into my repo, which is, by convention only, the official state for finance and dsageng.

  1. It is critical to push to your public repo often. We'll come up with a naming convention, but there will be branches marked as your current working state. You shouldn't merge your working branch into your "ready to merge with ghtdak" branch unless you've checked for collisions with the blessed trunk at a minimum... and it would help if you also checked for collisions with my primary working branch.
  2. When a feature is ready for review, an aggregate patch should be assembled and posted on Trac, consistent with the Sage review process
    • This applies to just the finance and dsageng activities at this early stage. (See below)
    • Adjustments made during review can be posted as patches or, preferably, as a branch URL indicating the commit. The advantage of the latter is the removal of ambiguity should bugs arise later. It becomes straightforward to reconstruct the state of the entire tree for that developer when the upgrade was checked in.

To facilitate this process, my repo can be found at http://tarbox.org:9000, which can be browsed and "pulled" from using hg. It's also an easy way to determine whether you need to merge or whether I've been asleep...

I'll also clone my repo on sage.math from time to time, but that's really only useful for those with accounts on that machine and as a backup / disaster mechanism... of course, as we'll be replicating all the changes amongst ourselves, the implied backup strategy of distributed development is very useful.

You should read the Mercurial docs, but here are some hints.

If you want a clean, standalone repo with all the history from the beginning of time:

hg clone http://tarbox.org:9000 sage-ght

Remember the branches. Some helpful commands:

  • hg help branches
  • hg help branch
  • hg help clone
  • hg update -C finpatch
  • Figure out how to get hg view to work
    • Hint: get hgk from the Mercurial site and adjust your .hgrc
    • This will likely be part of the Sage distro soon enough.
  • hg view is nice but doesn't show branch names... I don't get it. Fortunately, there's hg glog, which draws a pretty nice ASCII graph that includes branch info. You need to enable the extension; it's all documented.
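
For the extension hints above, the .hgrc entries would look roughly like this. This is a sketch based on the extension names Mercurial shipped at the time (hgk and graphlog); hgk may need further setup described in the Mercurial docs, and newer releases have the graph built in as hg log -G. The script writes to a scratch file rather than your real ~/.hgrc, so copy the section over by hand.

```shell
#!/bin/sh
# Sketch only: the [extensions] entries for "hg view" (hgk) and "hg glog"
# (graphlog). Written to a scratch file so the real ~/.hgrc is untouched.
set -e
scratch=$(mktemp)

cat > "$scratch" <<'EOF'
[extensions]
# enables "hg view"
hgk =
# enables "hg glog"
graphlog =
EOF

echo "Copy this section into ~/.hgrc:"
cat "$scratch"
```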

On Git

I won't even attempt to describe Git, but suffice it to say that Git and Hg exist at opposite ends of the DVCS spectrum. Git was entirely about the foundation, the object model, and base capabilities. Unfortunately, given the nature of the target audience, the Linux kernel, fancy front ends were less of a concern.

As Git's underlying capabilities are infinitely more flexible, there is an inherent complexity to the documentation. Recent efforts, layers of “porcelain” etc., have largely mitigated this difficulty, but there will continue to be a lingering doubt about people being able to figure Git out.

For the reasons mentioned above, I do all my development in Git. Fortunately, we can overlay Git- and Hg-managed directories (a couple of ignore rules and we're good to go). I imagine this is true for Bzr as well.
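
The ignore rules in question might look like the following sketch: each tool is told to ignore the other's metadata directory. The overlay directory is a throwaway created for illustration; in practice the two files go in the root of the shared working tree, and whether each ignore file is itself tracked by the other tool is a matter of taste.

```shell
#!/bin/sh
# Sketch only: ignore rules that let Git and Hg share one working directory.
# "$overlay" is a hypothetical stand-in for the shared working tree.
set -e
overlay=$(mktemp -d)

# Git side: do not track Hg's metadata directory.
cat > "$overlay/.gitignore" <<'EOF'
.hg/
EOF

# Hg side: do not track Git's metadata directory.
cat > "$overlay/.hgignore" <<'EOF'
syntax: regexp
^\.git/
EOF

cat "$overlay/.gitignore" "$overlay/.hgignore"
```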

So, when coordinating with Hg folks, I simply update and publish to Hg. The final step in the code review process is the production of a patch for posting and review in Trac.

My live Git repo is at: http://tarbox.org:8080


CategoryDevelopment