Sunday, October 11, 2009

Version numbers and (D)VCS

I've spent a lot of time this weekend trying to adapt apply-version.nant, originally written for oopsnet and svn, to ormar and mercurial. It wasn't so easy to find guidance from others with similar goals, so hopefully this post makes the information more accessible.

Definitions


The primary purpose of software version numbers is to identify the code. When we distribute to users, the version number gives them a concise reference for use in communicating about bugs and available features.

A secondary purpose is to define a temporal order on distributions. Versions are normally numeric and monotonically increasing; higher versions are newer and hopefully better than older versions.

Conventionally, there is at least some human involvement in assigning version numbers. In proprietary software, sales concerns usually weigh in. Often the technical issue of compatibility is difficult to treat formally, so human judgment is used to encode compatibility information in version numbers.

But it's also conventional to let the low-order bits of a version number be determined automatically. Whenever build inputs change, the behavior of the software may change, but nobody wants to be bothered incrementing version numbers for routine small changes.

Besides version numbers, it's common to hear of "build numbers." Sometimes the terms are used interchangeably, but I think it's useful to distinguish between (a) an identifier for the build input and (b) an identifier for the build event. Some people use (b) as a more convenient proxy for (a), and some people apparently really care about (b) itself, although I'm not sure why. Maybe it's because on some teams deployment is part of the build process*, and it's nice to have a formal record of deployments.

Theory and practice


I've used Subversion for nearly my whole career so far. It's a centralized version control system and sensibly enough it identifies historical events with natural numbers; revision 0 is repository creation. So, svn revision numbers are a very convenient basis for version numbers. Just check for consistency of the working copy (wc) with a revision from the repository and use that revision number for the low-order version bits. Consistency between wc and repository for svn is a question of (a) uncommitted changes and (b) files or directories within the wc at different revisions.
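Here's a minimal sketch of that consistency check in Python, assuming you feed it the one-line output of the `svnversion` tool. As I understand its output, a single number means a uniform working copy, `4123:4168` means mixed revisions, and trailing letters such as `M` flag local modifications. The function name `consistent_revision` is my own invention for illustration.

```python
import re

# svnversion-style output: "4168", "4123:4168", optionally followed by
# flag letters such as M (modified), S (switched), or P (partial).
_SVNVERSION = re.compile(r"^([0-9]+)(?::([0-9]+))?([A-Z]*)$")


def consistent_revision(svnversion_output):
    """Return the revision number if the wc is consistent, else None.

    Hypothetical helper: "consistent" here means a single revision
    and no flag letters at all.
    """
    match = _SVNVERSION.match(svnversion_output.strip())
    if match is None:
        return None  # not a recognizable revision summary
    low, high, flags = match.groups()
    if high is not None:
        return None  # (b) files or directories at different revisions
    if flags:
        return None  # (a) uncommitted changes, or a switched/partial wc
    return int(low)
```

A build script could refuse to stamp a version whenever this returns `None`, and otherwise use the returned number for the low-order field.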

This is a bit harder to do with some other centralized version control systems. In SCCS, revision numbers (SIDs) apply not to repositories but to individual files. Microsoft TFS has SVN-style revision numbers that they call "changeset numbers," but their implementation choices and tools make it difficult and expensive to answer the wc-repository consistency question. But fundamentally, in a cvcs, there's a global clock and that can serve as a basis for version numbering. In every cvcs I've seen, it's practical to use it that way although it might be easier in some cases (svn) and harder in others (tfs).

For distributed version control systems, we have no global clock. Fundamentally, events are temporally ordered only by causal relationships, so you can really only establish the primary property for version numbers: identifying build inputs. There's no general way to establish the secondary property that allows users to compare version numbers and decide which is newer. And yet, the mercurial project itself produces monotonic version numbers! How do they do it? Apparently by manually tagging.
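To make the "ordered only by causal relationships" point concrete, here's a small Python sketch. It models history as a mapping from each changeset to its parents and treats one changeset as older than another only when it is an ancestor. The changeset names and the `compare` helper are made up for illustration; they aren't part of any hg API.

```python
def ancestors(parents, node):
    """Collect every changeset reachable from node via parent links."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def compare(parents, a, b):
    """Say how a relates to b: 'same', 'older', 'newer', or 'unordered'."""
    if a == b:
        return "same"
    if a in ancestors(parents, b):
        return "older"
    if b in ancestors(parents, a):
        return "newer"
    return "unordered"  # concurrent work: no causal path either way


# Hypothetical history: two people commit on top of "root", then merge.
HISTORY = {
    "root": [],
    "alice1": ["root"],
    "bob1": ["root"],
    "merge": ["alice1", "bob1"],
}
```

With this history, `compare(HISTORY, "root", "merge")` gives `"older"`, while `compare(HISTORY, "alice1", "bob1")` gives `"unordered"`, which is exactly the case a monotonic version number can't express.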

How important is temporal ordering really? Certainly the most important thing is the capability for repeatable builds. Some DVCS projects have concise identifiers for build inputs; in hg and git we have hashcodes. Unfortunately for those of us on the .NET platform, Microsoft provides only 4 * 16 bits of space for version numbers in their assembly metadata specification. This isn't nearly enough for hg's 160-bit changeset ids (though it could accommodate the potentially ambiguous 48-bit short form), especially if we want to use one or more of those four fields for encoding compatibility data.
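The arithmetic can be sketched in Python: a 12-hex-digit short id is 48 bits, which fills exactly three 16-bit fields and leaves a single field for everything else. The function name and the choice to keep the first field for a hand-assigned major number are my assumptions, and real toolchains may impose further limits on individual fields that this sketch ignores.

```python
import string


def pack_short_id(major, short_id):
    """Pack a 48-bit short changeset id into the last three of four fields.

    Hypothetical layout: major.<bits 47-32>.<bits 31-16>.<bits 15-0>
    """
    if not 0 <= major <= 0xFFFF:
        raise ValueError("major must fit in 16 bits")
    if len(short_id) != 12 or any(c not in string.hexdigits for c in short_id):
        raise ValueError("expected 12 hex digits (48 bits)")
    value = int(short_id, 16)
    # Slice the 48-bit value into three 16-bit chunks, high bits first.
    fields = [(value >> shift) & 0xFFFF for shift in (32, 16, 0)]
    return ".".join(str(n) for n in [major] + fields)
```

For example, `pack_short_id(1, "0123456789ab")` yields `1.291.17767.35243`. The result identifies the build input, but nothing about it tells you which of two builds is newer.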

A common special case


There's a very common special case of projects using dvcs for which we can establish an objective order on versions. There's often an official repository, whose event history meets our need.

Well, that's fine in theory, but is it practical? Unfortunately for me, hg doesn't allow remote queries of a given repository's local timestamps ("revision numbers"). I hope that's due to an efficiency trade-off and not just a pedantic effort ("these revision numbers aren't guaranteed to match those in your clone; use changeset ids instead!").

The good news is that in hg, revision number consistency is preserved under clone and pull operations. If you commit in your repository, you may irreconcilably lose consistency, but as long as you abstain from making changes you and the other repo will agree on the bijective function between revision numbers and changeset ids. So my plan for .NET assembly versioning in my googlecode hg repositories is to use a pristine clone for official builds and at least one separate clone for feature development and bug fixes.
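A rough Python sketch of how the official build might turn the pristine clone's revision number into a version string follows. It assumes the input is the output of `hg identify --num` run in that clone, which as far as I know prints the local revision number with a trailing `+` when there are uncommitted changes. The function name and the `major.minor.rev.0` layout are hypothetical choices of mine.

```python
import re


def official_version(major, minor, hg_num_output):
    """Build 'major.minor.rev.0' from a clean clone's revision number.

    Hypothetical helper: rejects anything that is not a bare number,
    such as '42+' (uncommitted changes) or a two-parent merge state.
    """
    text = hg_num_output.strip()
    if re.fullmatch(r"[0-9]+", text) is None:
        raise ValueError("clone is not pristine: %r" % text)
    rev = int(text)
    if rev > 0xFFFF:
        raise ValueError("revision does not fit in a 16-bit field")
    return "%d.%d.%d.0" % (major, minor, rev)
```

Failing loudly on a dirty clone is deliberate here: once the clone has local commits or edits, its revision numbers can no longer be trusted to match the official repository.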


*For the IT web apps I've worked on, we had automated deployment as part of our CI routine, but we were satisfied to have an svn branch per deployment target. Actually, we had one svn branch per deployment target equivalence class representative. Really we had a small number of user communities (e.g., end-users, beta-testers, trainees, programmers), and we had a branch for each of them and the server clusters for each.

Thursday, October 1, 2009

What I did on my summer vacation

I spent the second week of September 2009 in Yellowstone Country. This is my trip report.

Sean, Mike, Aubrey, Colin
My traveling companions were my parents, my brother, and his girlfriend. We met at BZN, where we rented a minivan for the drive to Yellowstone House. It was a rainy afternoon but between showers we caught glimpses of that signature Montana light that plays over the foothills and dramatically highlights the mountain peaks. We ate dinner at the Chico Dining Room, always a safe bet for fine dining. Yellowstone House has wifi and a PowerMac G4.

Barb et al.
For the next two days we went flyfishing with Grossenbacher Guides Bill, Bo, Brad Ehrnman, Brett Seng, and Brian Grossenbacher.

The thing to love or hate about flyfishing is that it emphasizes technique. The motion of the fly is determined not by its own weight (as it is in baitfishing) but by the weight distributed along your line; there is physical complexity here that corresponds to a high degree of choice in casting. Beyond that, there is a pretty large space of materials and configurations in the flies, and the trout discriminate, taking into account the season, time of day, and past experience. I'm a novice, but more experienced flyfishermen explicitly use ecological knowledge in addition to recent observations of insect activity and fish feeding patterns. So, I can recommend flyfishing if you enjoy skill acquisition.

Someone once expressed surprise that I'd participate in an activity that involves cruelty to fish. Well, I've not noticed a strong trend either way among baitfishermen, but every flyfisherman I've encountered has expressed concern for the health and well-being of the fish. They remove barbs from hooks in order to minimize injury during catch-and-release fishing, they take care to handle fish gently without damaging gills, and they release carefully. Those practices satisfy my moral compass.

We were not the only ones enjoying the river those days; we also observed eagles, ospreys, otters, and mergansers fishing.

On Monday we ate at the Pine Creek Lodge, and it was good. I will say though that the cinnamon fajita tastes like it sounds.

On Tuesday we drove to the North entrance of Yellowstone National Park at Gardiner, MT. On the way through Mammoth Hot Springs to Old Faithful Inn we saw elk, mule deer, bison, and a black bear but very sadly no moose. We stopped off to hike up to the Mount Washburn Fire Lookout, where we saw many chipmunks, a couple of marmots, and some bighorn sheep. Unfortunately Dad stopped about a half mile from the summit due to pain in his knees.

At Old Faithful Inn we had drinks and bar food at the bar and then dinner in the Dining Room. I was disappointed that the Crow's Nest is off-limits, but I did see Old Faithful erupt. Our rooms were in a renovated section (early 1990s), so we had private bathrooms.

On Wednesday we hiked halfway to Artist Point along the Grand Canyon of the Yellowstone. Dad stayed back. The canyonlands supposedly harbor moose, but not surprisingly we didn't see them along the heavily trafficked trail. We drove to view a petrified tree and then we drove on to the Lamar Valley. There we photographed a buffalo at close range and did some fishing in the Lamar River, a tributary of the Yellowstone. At one point I misjudged the depth of the river and waded into chest-high water, drowning my phone.

We tried to eat at Helen's, home of the infamous Hateful Burger, but it was closed and for sale. Instead we ate at Zona Rosa in Livingston, where we had superb Latin American food at very reasonable prices. They were new at that time and did not accept credit cards or serve alcoholic beverages. Nearby landmarks include an enormous Taco John's sign and a car/truck wash.

Thursday we did guided fishing on the Madison River. The fish are generally larger and smarter than those of the Yellowstone. We caught very few whitefish, but a number of large trout. We ate at the Chico Saloon, which is definitely not served by the same kitchen as the dining room, while we observed some sporting event on TV.

Yellowstone House living room
Friday we hung around Yellowstone House, doing a little fishing from the bank of the Yellowstone River and reading and puzzle-solving. We had dinner again at Chico Hot Springs Dining Room, and then back to Yellowstone House for our last night in Montana.

We didn't see any grizzly bears this year, and I still have yet to see wild wolves or moose. Maybe next time.