Exit CIA.vc, enter irker

Following CIA.vc’s untimely demise, ESR and a small ad hoc group of coders and testers including nenolod (from Atheme) and our very own AI0867 (who has led Wesnoth-UMC-Dev in my absence) finally completed the work required to get irker 1.0 out. irker itself has been a work in progress for a while since the last CIA outage in August.

It’s advertised as a CIA.vc replacement, but in reality it is something far less ambitious in scope: a write-only IRC bot that serves its own message bus. From its own README:

irkerd is a specialized IRC client that runs as a daemon, allowing other programs to ship IRC notifications by sending JSON objects to a listening socket.

It is meant to be used by hook scripts in version-control repositories, allowing them to send commit notifications to project IRC channels. A hook script, irkerhook.py, supporting git and Subversion is included in the dustribution (sic); see the install.txt file for installation instructions.

The author’s intention is for existing code forges to adopt this service, and perhaps optionally run it on their own facilities alongside their VCSes, allowing repository admins to opt in for using hooks that deliver notifications to those internal irker instances. irker’s pipeline is extremely flexible and can be summed up as follows:

Repository hook → irker instance → IRC server

CIA.vc’s pipeline is not entirely clear to me and I did not have the opportunity to inspect it from inside, unlike ESR. However, there’s enough evidence suggesting that it was more or less like the following:

Repository hook → CIA.vc XML-RPC or mail provider → CIA.vc database manager → IRC front-end → IRC server

Note that there was also a web front-end, which was integral to CIA’s mission as it was the only way to define projects and bots. A commit notification occurred for a given project; say, Wesnoth-UMC-Dev. The IRC portion of the pipeline made sure that all relevant bots (each one associated to a single channel from the model standpoint) would report the same commit. A less relevant Web front-end in the pipeline took care of adding the commit to the project page (including statistics and an XML feed).

The IRC portion was flexible enough to accommodate the simplest use case (notifying a single project’s commits in a single channel), and more elaborate yet still reasonable use cases (notifying commits from multiple projects in a single channel) without much hassle, all done by tweaking the bots’ configuration in the web-based configuration front-end. Even more advanced use cases were possible by choosing the Advanced Filtering option in the same front-end. This allowed me to have a bot in ##shadowm on freenode report commits as follows:

  • Commits from wesnoth-umc-dev (already reported in #wesnoth-umc-dev) with paths matching */After_the_Storm/* and */Invasion_from_the_Unknown/*, regardless of author
  • Commits from my own CIA-registered projects (morningstar, weldyn, dorset, etc.) regardless of author
  • Commits from any other CIA-registered project (such as Wesnoth or Frogatto) with an author matching my real name or any of my preferred screen names (fun fact: never got any false positives since I set it up a couple of years ago)

I should emphasize that this required no changes to hooks in each repository. Hooks delivered just a minimal set of information, including the commit hash or number, the commit message, affected path, affected branch (when applicable), affected module (when applicable), the author name, and the project name. Everything else was done on CIA’s side, including deciding which channels should get notified of individual commits.

irker does not do this.

irker’s perceived elegance stems from its very basic and versatile design. Essentially, it serves as a mechanism for a repository hook to interact with IRC without having to establish a short-lived connection to a server for every individual commit or commit batch — an approach that GitHub currently allows via a separate, seldom used IRC service hook. irker is not novel in design by any means, and the hype around it is only justified by the fact that nobody bothered to create and advertise any other service that could properly replace CIA.vc before and be inherently extensible maintainable over time.<

irker’s extensibility and maintainability stems from the fact that a good portion of the work is done by the repository hooks, and irker is near completely stateless — the obvious opposite of CIA.vc’s architecture.

Unfortunately, this renders advanced use cases such as the above ##shadowm CIA ruleset completely incompatible with the irker pipeline.

From ESR’s post on CIA.vc’s design and its shortcomings:

[...] the original designer fell in love with the idea of data-mining and filtering the notification stream. It is quite visible on the CIA site how much of the code is concerned with automatically massaging the commit stream into pretty reports. I’m told there is a complicated and clever feature involving XML rewrite rules that allows one to filter commit reports from any number of projects by the file subtrees they touch, then aggregate the result into a synthetic notification channel distinct from any of the ones those projects declared themselves.

(He somehow got this part slightly wrong. Incidentally, it was me who brought it up in #cia around August 25th in the first place. The projects’ own notification channels were as synthetic as any others from CIA.vc’s point of view. That is to say, not at all. Additionally, they weren’t XML rewrite rules, but rather commit matching rules.)

His opinion is, naturally...

Bletch! Bloat, feature creep, and overkill!

Yes, I admit that it is overkill, but it was a nice thing from our point of view as users of the system. There’s a line between using a service, and administrating it.

On the plus side, seeing as how irker aims to become an actual standard for IRC feeds of any sort (not just for VCSes), it is good that it only implements the most basic functionality by itself. This should later allow us to come up with ingenious applications such as nenolod’s CIA proxy for irker (delivers CIA.vc XML-RPC requests in a format suitable for irker). Some people have even proposing building new services using irker’s protocol, adding an authentication layer on top and integrating it to IRC networks as a hosted service!

But replicating the end-user functionality a few people like me enjoyed will invariably take some additional effort. ESR suggests:

Filtering? Aggregation? As previously noted, they don’t need to be in the transmission path. One or more IRC bots could be watching #commits, generating reports visible on the web, and aggregating synthetic feeds. The only agreement needed to make this happen is minimal regularity in the commit message formats that the hooks ship to IRC, which is really no more onerous than the current requirement to gin up an XML-RPC blob in a documented format.

Of course, if the #commits channel on freenode ever regains its former glory, this would require a bot to listen to and filter possibly thousands of messages per minute, all coming from multiple clients. I don’t think I am fit to become the pioneer who’ll conquer this new land.

Furthermore, since the task of formatting messages for individual commits is exclusive to individual hooks, we may end up with a highly fragmented and inconsistent ecosystem. For example, as things stand right now, no-one is required to include #commits on freenode as a destination for commit notifications, and I imagine very few people will bother to do so in the future.

All in all, it was our own incompetence that allowed CIA.vc to die prematurely despite the multiple calls for replacing it, and the obviously deplorable service conditions. We can’t really complain now.