Differences between revisions 12 and 13
Revision 12 as of 2008-06-28 08:37:45
Size: 8873
Editor: GlennTarbox
Comment:
Revision 13 as of 2008-06-28 18:04:11
Size: 5752
Editor: GlennTarbox
Comment:
Line 62: Line 62:
= On Hg =

Hg is one of the three realistic DVCS choices: Git, Bzr, and Hg. Unfortunately, Hg lacks a proper storage architecture, and the entire discussion really ends there. Hg developers chose the “file append” approach to object management because they were more interested in the user-facing layers. And that's a good thing because they were trying to fix the true badness that lies at the heart of CVS / SVN.

DVCS provokes strong opinions, and Hg has its own cult built up around the “Hg way”, including that one can't delete history etc. Of course, this is not a philosophical position; it's a technical necessity based on the ''file append'' approach to storage.

 1. The implications of the lack of a proper repo store are huge and force most development activities into the “clone first, ask questions later” approach to development. In effect, Hg Clone is needed because they can't do real branching. “Named Branching” in Hg is entirely different from branching in Git and Bzr and is currently under review by the Hg development team. It's simply not a useful mechanism (for example, you can't delete a branch... even after merging). In Hg's case, a branch is intended to be around for a really long time... There are workarounds, and the existence of these workarounds is based entirely on the lack of a correct technical foundation.

  * of course, this has spawned various adjunct toolsets like patch queues to try and overcome Hg's fundamental limitations.

 1. Larger projects tend to devolve to the tarball / patch distributed development approach.

  * Maybe Linus would be happy.

  * Of course, it's better than SVN as, at least, there's DVCS supporting local development. But the true power of DVCS is lost because the merge process requires getting around Hg's core limitations.
Line 118: Line 102:
I won't even attempt to describe Git, but suffice it to say that Git and Hg exist at opposite ends of the DVCS spectrum. Git was entirely about the foundation, the object model, and base capabilities. Unfortunately, given the nature of the target audience, the Linux kernel, fancy front ends were less of a concern.

As Git's underlying capabilities are infinitely more flexible, there is an inherent complexity to the documentation, exacerbated by the same types of difficulties often caused by too many smart guys: they can't explain to those who don't already know. Recent efforts, layers of “porcelain” etc., have largely mitigated this difficulty, but there will continue to be a lingering doubt about ''the masses'' being able to ''get Git''.

In reality, Git is currently quite easy to use and its capabilities pay for themselves virtually instantly.

For the reasons mentioned above, I do all my development in Git. Fortunately, we can overlay Git and Hg managed directories (couple of ignore rules and we're good to go). I imagine this is true for bzr as well.

So, when coordinating with Hg folks, I simply update and publish to Hg. The final step in the code review process is the production of a patch for posting and review in Trac.


Developing with DVCS

Distributed Version Control Systems are a huge improvement in the system development process for a number of reasons, but those are easily researched on Google, so I'm not going to get into all the issues. Some thoughts:

  1. There are only branches and revisions.
    • Think of code development as a process through time and space.
    • Space is discretized into developers
    • Time is discretized into commits
    • Add a little metadata linking points in the developer / commit sequence and you have a graph describing the evolution of the code. Throw on some effectively collision-free identifiers (SHA-1) and you have the complete genetic history of every line of code.
    • Repositories are a cache / detail without significance in and of themselves
    • don't forget to push often so others can import your changes
  2. Only push to repos you own
    • This is probably the least understood aspect of DVCS. The key concept to understand here is that as long as you can get other people's work, you should be pulling and merging from their repos, and only pushing to your own. An example is the repo on your laptop being pushed to your repo on the server for publishing.

    • When your repo is published, others can pull / merge / publish. By convention, certain people's repos, specifically certain branches on well-known repos, will have meaning.
    • There are no permissions required which turns version control on its head. You publish your work, make it known (the usual suspects), and it can be pulled by anyone with interest and as part of the process used by a specific project
  3. You are not your code - http://blog.red-bean.com/sussman/?p=96

  4. Two good videos of the concept and Linus in action. The second is What Linus means to say is

  5. Patching is less useful than publishing your repo
    • Much finer-grained visibility. Proper branch naming communicates the author's intended state of readiness
    • For developers without a way to publish a repo, patches can be submitted to Trac
    • check the patch for collisions with the "blessed" repo as discussed below
    • the patch will be merged with the repo on a branch depending on its state of testing / modification
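
A rough sketch of points 1 and 2 above, in Python rather than in any real DVCS. It is only meant to show why a repo is "just a cache": commits are identified by a SHA-1 over their content plus their parents' ids, so copying them between repos loses nothing. The names `Commit`, `Repo`, `alice` and `bob` are made up for illustration, and this is not how Git, Bzr, or Hg actually store their objects.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    author: str          # the "space" axis: which developer
    message: str         # describes one step on the "time" axis
    content: str         # stand-in for the snapshot of the tree
    parents: tuple = ()  # ids of parent commits; these links form the graph

    @property
    def id(self) -> str:
        # Hash the content together with the parent ids, so the id
        # depends on the whole history leading up to this commit.
        payload = "\n".join([self.author, self.message, self.content, *self.parents])
        return hashlib.sha1(payload.encode("utf-8")).hexdigest()


class Repo:
    """Hypothetical repo: nothing more than a cache of commits keyed by id."""

    def __init__(self):
        self.commits = {}

    def commit(self, author, message, content, parents=()):
        c = Commit(author, message, content, tuple(parents))
        self.commits[c.id] = c
        return c.id

    def pull(self, other):
        # Copy the commits we are missing; equal ids mean equal commits.
        for cid, c in other.commits.items():
            self.commits.setdefault(cid, c)

    def history(self, head):
        # Walk the parent links back to the root(s).
        seen, stack = [], [head]
        while stack:
            cid = stack.pop()
            if cid not in seen:
                seen.append(cid)
                stack.extend(self.commits[cid].parents)
        return seen


alice, bob = Repo(), Repo()
root = alice.commit("alice", "initial import", "v0")
bob.pull(alice)                                    # bob pulls from alice's repo
a1 = alice.commit("alice", "add pricing", "v0+a", [root])
b1 = bob.commit("bob", "add tests", "v0+b", [root])
alice.pull(bob)                                    # alice pulls bob's work...
merge = alice.commit("alice", "merge bob", "v0+a+b", [a1, b1])  # ...and merges locally

print(len(alice.history(merge)))  # number of commits reachable from the merge
```

Note that nobody ever writes into someone else's repo here: `alice` and `bob` only pull from each other and commit to their own, which is the "only push to repos you own" idea in miniature.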

Clone Toggling

There are always developers for whom keeping multiple clones on their local machine is a practical necessity due to the rebuild times when doing major revisions. Git, Bzr, and Hg all support this. All that's really required is a push / merge to a single local repo as an intermediate step before publishing your repo.

Version Control Systems Compared

Hg for sage-finance and dsageng

The current model is that you push to your own Hg repos. For most, this will mean pushing to a repo in your home directory on sage.math. I'll pull revisions / branches into my repo which is, by convention only, the official state for finance and dsageng.

  1. It is critical to push to your public repo often. We'll come up with a naming convention but there will be branches marked as your current working state. You shouldn't merge your working branch into your "ready to merge with ghtdak branch" unless you've checked for collisions with the blessed trunk at a minimum... and it would help if you would check for collisions with my primary working branch.
  2. When a feature is ready for review, an aggregate patch should be assembled and posted on trac consistent with the Sage review process
    • This applies to just the finance and dsageng activities at this early stage. (See below)
    • Adjustments made during review can be posted as patches or, preferably, a branch url indicating the commit. The advantage of the latter is removal of ambiguity should bugs arise later. It becomes straightforward to reconstruct the state of the entire tree for that developer when the upgrade was checked in.

To facilitate this process, my Hg repo can be found at

http://tarbox.org:9000 which can be browsed and "pulled" from using hg....

I'll also clone my repo on sage.math from time to time, but that's really only useful for those with accounts on that machine and as a backup / disaster mechanism... of course, as we'll be replicating all the changes amongst ourselves, the implied backup strategy of distributed development is very useful.

You should read the mercurial docs... but some hints.

If you want a clean all by itself repo with all the history from the beginning of time:

hg clone http://tarbox.org:9000 sage-ght

Remember about branches. Some helpful commands:

  • hg help branches
  • hg help branch
  • hg help clone
  • hg update -C finpatch
  • figure out how to get hg view to work
    • hint: get hgk from the mercurial site, adjust your .hgrc
    • this will likely be part of the sage distro soon enough.
  • hg view is nice but doesn't show branch names... I don't get it... but, fortunately, there's hg glog which makes a pretty nice ascii graph which includes branch info. You need to enable the extension... all documented.

On Git

My live Git repo is at: http://tarbox.org:8080


CategoryDevelopment