Wesbreak, IRC-break

Just wanted to note that I am pausing my online Wesnoth and IRC related activities for an indeterminate amount of time while I tend to some matters of greater priority. In case you are wondering, yes, everything is fine, but I do need a break.

You can still message me through Twitter, email (preferably!), or private message through the Wesnoth forums (which results in email notifications) if something absolutely important crops up in the meantime other than the forums being down. I have provided the other Wesnoth.org administrators with instructions in case I’m needed for something specific, and for other issues you can always PM the Forum Moderators or Administrators groups or email the forums support address, which is displayed whenever things go completely wrong.

And of course, I may continue to post about non-Wesnoth things here as I see fit.

irker-svnpoller: Subversion poller and mail filter application for irker

Just as irker’s adoption rate is increasing, I have just completed work on a very simple application for Subversion repositories — two applications, in fact.

irker-svnpoller is a very simple script that polls a single commit log (not data) from a Subversion repository and delivers notifications to any number of channels using an irkerd running on the same host. It mimics the CIA bots’ formatting, much like nenolod’s irker CIA proxy, from which I borrowed a small amount of code.

irker-svnpoller → irkerd
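The polling step can be sketched roughly as follows. This is not the actual irker-svnpoller code: the function names and the returned dictionary layout are made up for illustration, and the only thing taken as given is the `svn log --xml -v -r REV URL` command line and the XML it prints.

```python
import subprocess
import xml.etree.ElementTree as ET


def fetch_log_xml(repo_url, revision):
    """Run `svn log` for a single revision and return its XML output (bytes)."""
    result = subprocess.run(
        ["svn", "log", "--xml", "-v", "-r", str(revision), repo_url],
        check=True,
        capture_output=True,
    )
    return result.stdout


def parse_log_entry(xml_data):
    """Pull the revision, author, message and changed paths out of the XML.

    The dictionary layout is a hypothetical choice for this sketch.
    """
    entry = ET.fromstring(xml_data).find("logentry")
    if entry is None:
        raise ValueError("no <logentry> element in svn output")
    return {
        "revision": int(entry.get("revision")),
        "author": entry.findtext("author", default=""),
        "message": entry.findtext("msg", default="").strip(),
        "paths": [path.text for path in entry.findall("paths/path")],
    }
```

Only the commit log is fetched here, never the diff, which keeps each probe cheap for the repository provider.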

But exactly how is this supposed to be useful to anyone, you may be wondering right now? Well, irker-svnpoller is not really intended to be used stand-alone. A timed poller script that tracks the last notified revision could come in handy, but people could get impatient waiting for their commits to appear in their IRC channels minutes later. I am well acquainted with the defects, quirks, and virtues of my primary audience (the Battle for Wesnoth and Wesnoth-UMC-Dev projects), and this approach would simply not scale well over time.

Enter the first companion script, svnmail-filter. It reads email message headers from stdin to determine a commit’s revision number and the pertinent repository to probe using irker-svnpoller. Configuration is mostly done through a ruleset file using the JSON format.
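To give an idea of what the filtering step involves, here is a minimal sketch. The ruleset layout (`rules`, `list_id`, `subject_pattern`, `repository`), the example addresses, and the function name are all invented for this example and are not the real svnmail-filter schema; the repository's documentation is the reference for that.

```python
import json
import re
from email.parser import HeaderParser

# Hypothetical ruleset, as it might be read from a JSON file.
EXAMPLE_RULESET = json.loads(r"""
{
  "rules": [
    {
      "list_id": "example-commits.lists.example.org",
      "subject_pattern": "\\br(\\d+)\\b",
      "repository": "svn://svn.example.org/example/trunk"
    }
  ]
}
""")


def match_commit(header_text, ruleset):
    """Return (repository, revision) for the first matching rule, else None."""
    headers = HeaderParser().parsestr(header_text)
    list_id = headers.get("List-Id", "")
    subject = headers.get("Subject", "")
    for rule in ruleset["rules"]:
        # Only consider mail that came through the expected mailing list.
        if rule["list_id"] not in list_id:
            continue
        found = re.search(rule["subject_pattern"], subject)
        if found:
            return rule["repository"], int(found.group(1))
    return None
```

In a real setup the header text would come from `sys.stdin.read()`, and a match would be handed over to irker-svnpoller.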

Of course, svnmail-filter is not that useful on its own either. The idea is that procmail or some other MDA should pipe incoming email headers through svnmail-filter — and preferably, only those from legitimate sources, such as subscribed commit mailing lists. This is actually simpler than it sounds, and it is more or less inspired by CIA.vc’s perpetually broken mail-based SVN poller.

MDA → svnmail-filter → irker-svnpoller → irkerd

Since no service in the pipeline other than irkerd runs persistently in the background, this should be significantly more fault-resilient than CIA.vc’s approach, which apparently required a poller service to listen and act upon local requests. The downside is that the host running irker-svnpoller may need to create many short-lived SVN repository connections for individual commits in a chain. In Wesnoth’s case, SVN commit chains are rare enough, but they often run to a dozen individual commits or so. Regardless, this shouldn’t be terribly concerning for a production server with a decent low-latency uplink, and the overhead on the repository provider should be rather small compared to pushing massive commit diffs across the network.

Right now, the Wesnoth and Wesnoth-UMC-Dev projects are using this service as a stopgap measure until their respective providers—Gna.org and SourceForge.net—allow installing a hook that can either talk directly to an irkerd server, or to an instance of the aforementioned CIA proxy using the CIA XML-RPC method.

I am not all that keen on other people using a piece of software I developed and tested within less than three days without any prior experience working with Python. There are also various problems inherent to any application depending upon Subversion and its incompetent network transport layer.

Nonetheless, I published a Git repository on GitHub including a small amount of documentation to get started:

I am open to possible improvements coming from people intending to use this on production servers. In particular, if someone out there works with a commit mailing list where revision numbers can’t be found in mail subjects it would be necessary to adapt svnmail-filter a little to handle that case. Perhaps it might even be possible to skip the irker-svnpoller step for mailing lists where the notification message structure is constant and well documented.

Exit CIA.vc, enter irker

Following CIA.vc’s untimely demise, ESR and a small ad hoc group of coders and testers including nenolod (from Atheme) and our very own AI0867 (who has led Wesnoth-UMC-Dev in my absence) finally completed the work required to get irker 1.0 out. irker itself has been a work in progress for a while since the last CIA outage in August.

It’s advertised as a CIA.vc replacement, but in reality it is something far less ambitious in scope: a write-only IRC bot that serves its own message bus. From its own README:

irkerd is a specialized IRC client that runs as a daemon, allowing other programs to ship IRC notifications by sending JSON objects to a listening socket.

It is meant to be used by hook scripts in version-control repositories, allowing them to send commit notifications to project IRC channels. A hook script, irkerhook.py, supporting git and Subversion is included in the dustribution (sic); see the install.txt file for installation instructions.
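As a rough illustration of what “sending JSON objects to a listening socket” means in practice, a client could look like the sketch below. It assumes an irkerd on its default port (6659) accepting newline-terminated requests with `to` and `privmsg` fields over TCP; the function names and the channel URL in the usage note are placeholders.

```python
import json
import socket

IRKERD_PORT = 6659  # irkerd's default listening port


def build_request(channel_url, text):
    """Serialize one notification as a newline-terminated JSON object."""
    request = {"to": channel_url, "privmsg": text}
    return json.dumps(request).encode("utf-8") + b"\n"


def send_notification(channel_url, text, host="localhost", port=IRKERD_PORT):
    """Open a short-lived TCP connection to irkerd and ship one request."""
    with socket.create_connection((host, port), timeout=10) as conn:
        conn.sendall(build_request(channel_url, text))
```

A hook would then call something like `send_notification("irc://chat.freenode.net/#example", "r1234: Fix a typo.")`, leaving the IRC connection itself to the daemon.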

The author’s intention is for existing code forges to adopt this service, and perhaps optionally run it on their own facilities alongside their VCSes, allowing repository admins to opt in for using hooks that deliver notifications to those internal irker instances. irker’s pipeline is extremely flexible and can be summed up as follows:

Repository hook → irker instance → IRC server

CIA.vc’s pipeline is not entirely clear to me and I did not have the opportunity to inspect it from inside, unlike ESR. However, there’s enough evidence suggesting that it was more or less like the following:

Repository hook → CIA.vc XML-RPC or mail provider → CIA.vc database manager → IRC front-end → IRC server

Note that there was also a web front-end, which was integral to CIA’s mission as it was the only way to define projects and bots. A commit notification occurred for a given project; say, Wesnoth-UMC-Dev. The IRC portion of the pipeline made sure that all relevant bots (each one associated with a single channel from the model standpoint) would report the same commit. A less relevant web front-end in the pipeline took care of adding the commit to the project page (including statistics and an XML feed).

The IRC portion was flexible enough to accommodate the simplest use case (notifying a single project’s commits in a single channel), and more elaborate yet still reasonable use cases (notifying commits from multiple projects in a single channel) without much hassle, all done by tweaking the bots’ configuration in the web-based configuration front-end. Even more advanced use cases were possible by choosing the Advanced Filtering option in the same front-end. This allowed me to have a bot in ##shadowm on freenode report commits as follows:

  • Commits from wesnoth-umc-dev (already reported in #wesnoth-umc-dev) with paths matching */After_the_Storm/* and */Invasion_from_the_Unknown/*, regardless of author
  • Commits from my own CIA-registered projects (morningstar, weldyn, dorset, etc.) regardless of author
  • Commits from any other CIA-registered project (such as Wesnoth or Frogatto) with an author matching my real name or any of my preferred screen names (fun fact: never got any false positives since I set it up a couple of years ago)

I should emphasize that this required no changes to hooks in each repository. Hooks delivered just a minimal set of information, including the commit hash or number, the commit message, affected path, affected branch (when applicable), affected module (when applicable), the author name, and the project name. Everything else was done on CIA’s side, including deciding which channels should get notified of individual commits.
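Expressed as code, the three rules above amount to something like the following sketch. This is not CIA’s actual ruleset syntax: the commit dictionary layout and the `MY_NAMES` placeholder are assumptions made for the example, and the “etc.” in the project list is left out.

```python
from fnmatch import fnmatchcase

OWN_PROJECTS = {"morningstar", "weldyn", "dorset"}
WATCHED_PATHS = ("*/After_the_Storm/*", "*/Invasion_from_the_Unknown/*")
MY_NAMES = {"example-author"}  # placeholder for real and screen names


def should_report(commit):
    """Decide whether a commit belongs in the personal channel.

    `commit` is a hypothetical dict with "project", "author" and "paths".
    """
    project = commit["project"]
    if project == "wesnoth-umc-dev":
        # Rule 1: only selected add-on paths, regardless of author.
        return any(
            fnmatchcase(path, pattern)
            for path in commit["paths"]
            for pattern in WATCHED_PATHS
        )
    if project in OWN_PROJECTS:
        # Rule 2: everything from my own projects.
        return True
    # Rule 3: any other project, but only my own commits.
    return commit["author"] in MY_NAMES
```

The point is that all of this lived on the service’s side, evaluated against the small set of fields every hook already delivered.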

irker does not do this.

irker’s perceived elegance stems from its very basic and versatile design. Essentially, it serves as a mechanism for a repository hook to interact with IRC without having to establish a short-lived connection to a server for every individual commit or commit batch, an approach that GitHub currently allows via a separate, seldom used IRC service hook. irker is not novel in design by any means, and the hype around it is only justified by the fact that nobody had bothered to create and advertise any other service that could properly replace CIA.vc and be inherently extensible and maintainable over time.

irker’s extensibility and maintainability stem from the fact that a good portion of the work is done by the repository hooks, and irker is almost completely stateless, the obvious opposite of CIA.vc’s architecture.

Unfortunately, this renders advanced use cases such as the above ##shadowm CIA ruleset completely incompatible with the irker pipeline.

From ESR’s post on CIA.vc’s design and its shortcomings:

[...] the original designer fell in love with the idea of data-mining and filtering the notification stream. It is quite visible on the CIA site how much of the code is concerned with automatically massaging the commit stream into pretty reports. I’m told there is a complicated and clever feature involving XML rewrite rules that allows one to filter commit reports from any number of projects by the file subtrees they touch, then aggregate the result into a synthetic notification channel distinct from any of the ones those projects declared themselves.

(He somehow got this part slightly wrong. Incidentally, it was me who brought it up in #cia around August 25th in the first place. The projects’ own notification channels were as synthetic as any others from CIA.vc’s point of view. That is to say, not at all. Additionally, they weren’t XML rewrite rules, but rather commit matching rules.)

His opinion is, naturally...

Bletch! Bloat, feature creep, and overkill!

Yes, I admit that it is overkill, but it was a nice thing from our point of view as users of the system. There’s a line between using a service, and administrating it.

On the plus side, seeing as how irker aims to become an actual standard for IRC feeds of any sort (not just for VCSes), it is good that it only implements the most basic functionality by itself. This should later allow us to come up with ingenious applications such as nenolod’s CIA proxy for irker (delivers CIA.vc XML-RPC requests in a format suitable for irker). Some people have even proposed building new services using irker’s protocol, adding an authentication layer on top and integrating it into IRC networks as a hosted service!

But replicating the end-user functionality a few people like me enjoyed will invariably take some additional effort. ESR suggests:

Filtering? Aggregation? As previously noted, they don’t need to be in the transmission path. One or more IRC bots could be watching #commits, generating reports visible on the web, and aggregating synthetic feeds. The only agreement needed to make this happen is minimal regularity in the commit message formats that the hooks ship to IRC, which is really no more onerous than the current requirement to gin up an XML-RPC blob in a documented format.

Of course, if the #commits channel on freenode ever regains its former glory, this would require a bot to listen to and filter possibly thousands of messages per minute, all coming from multiple clients. I don’t think I am fit to become the pioneer who’ll conquer this new land.

Furthermore, since the task of formatting messages for individual commits is exclusive to individual hooks, we may end up with a highly fragmented and inconsistent ecosystem. For example, as things stand right now, no-one is required to include #commits on freenode as a destination for commit notifications, and I imagine very few people will bother to do so in the future.

All in all, it was our own incompetence that allowed CIA.vc to die prematurely despite the multiple calls for replacing it, and the obviously deplorable service conditions. We can’t really complain now.

CIA.vc is dead

I normally don’t comment or report on other sites’ statuses in here since this is my personal blog, but this situation actually impacts Wesnoth, Wesnoth-UMC-Dev, and me directly; especially me, considering I went to ridiculous lengths the other day to solve a related issue on GitHub.

The point is literally the title of this post: CIA.vc is dead.

You know, CIA.vc; that amazing service which provided real-time VCS commit notifications on various IRC networks and that everyone took for granted. This is by no means the first time it has bitten the dust, but this time it’s suspected that nobody really bothered to make backups.

nenolod (who was merely hosting the instance running CIA.vc) explained the situation in freenode’s #cia channel about an hour ago.

Assuming the other people who had admin access don’t have their own recent backups, CIA.vc’s future looks particularly bleak right now. Here’s hoping that a dedicated team of competent coders with access to a suitable server for hosting will quickly build a better replacement within the next few days. (Ha, ha, ha. Right.)

GitHub and long CIA commit notifications

Ever since I started using GitHub I have been greatly bothered by the questionable design decision of sending only the first line of a commit message to CIA.vc — the service that allows us to get instant commit notifications on IRC channels. For people using Git like it was intended to be used, this means you will only see the subject line for each commit in your CIA notifications and the web feed.

That’s not my only reservation about GitHub’s custom CIA hook’s design, in fact. It also limits the number of notifications sent per push to five; more commits than that get notified as a single commit along with a “(+n more commits)” notice in the message. While everyone knows that CIA is broken by design and all, it just doesn’t feel right to me that GitHub should be in charge of manipulating notifications to avoid flooding. Whatever, I say.

Regardless of CIA’s perpetual state of brokenness, there is something that it currently does right. CIA bots won’t flood a channel with more than a few message lines per commit. One could then assume that this renders GitHub’s single-line commit notifications entirely pointless, but there might be people who really want their CIA bot to behave like a continuous git log --format=oneline run without figuring out a complicated CIA formatting ruleset specification.

I am definitely not part of that crowd, but I respect people’s right to choose, so I decided to provide them with the choice to get full commit messages sent to CIA.vc. The relevant pull request has just been accepted.
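For the curious, the behaviour described above boils down to something like this sketch. It is not the hook’s real code (which is Ruby); the function and parameter names are made up, and which commit stands in for a collapsed push is an assumption on my part (the last one here).

```python
MAX_NOTIFICATIONS = 5  # per-push limit described above


def build_notifications(commit_messages, full_message=False):
    """Turn a push's commit messages into the texts sent to CIA."""

    def render(message):
        if full_message:
            return message.strip()
        lines = message.strip().splitlines()
        # Default behaviour: only the subject line survives.
        return lines[0] if lines else ""

    if len(commit_messages) <= MAX_NOTIFICATIONS:
        return [render(message) for message in commit_messages]

    # Too many commits: collapse the push into a single notification.
    # Assumption: the last commit represents the whole push.
    extra = len(commit_messages) - 1
    return [f"{render(commit_messages[-1])} (+{extra} more commits)"]
```

With `full_message=True`, a properly formatted Git commit message reaches CIA with its body intact, and CIA’s own line limit takes care of the flooding.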

This is what the CIA hook configuration looks like in production now:

[Screenshot: CIA hook configuration]

I have gone and enabled the option for every repository I currently host at GitHub that’s already using the hook. If you really care about proper Git commit message formatting (or merging commits from people who do) and you are also using CIA, I urge you to do the same.

It should be noted that the single commit in that pull request is my very first attempt at writing code in Ruby at all.

Resolutions!

I have been thinking about some stuff to do during this year for a while, actually. It’s really hard to decide because I’m a person who runs into all sorts of trouble while trying to get projects accomplished (including procrastination).

One thing I’m already doing is learning some Japanese, for no particular reason at all — although you’ve got to admit that having multiple languages in your curriculum is worth a bunch of coolness points. 😛 Espreon is helping me along the way with his own recently gained knowledge. It seems quite fun to learn a language in a non-Latin alphabet, if not a tad overwhelming at times, especially with kanji.

It’d be a good idea to lose some weight this year, too. My addiction to sugary stuff isn’t quite compatible with my heart condition! (Nor is coffee, but… meh.)

[Image: screenshot of AtS]

Then there’s Wesnoth. I intend to finish the Second Act™ of After the Storm Episode I as soon as I may, even through the means of placeholders — I’m willing to do anything to rescue AtS out of Development Hell before the end of 2011.

Wesnoth RCX’s development is halted right now due to a lack of interest on my part in investing energy in writing the rest of the new functionality (i.e. definition of custom ranges and palettes), but I know that once I touch Qt Creator’s awesome interface I can’t stop working for a while, so I may eventually get some inspiration to redesign the main window, which should inevitably lead me to tinker around with the rest of the code.

C# was the first “major” programming language I learned, not counting Visual Basic. I have some fond memories of my first experiments with C#, but after I embarked upon learning and using C++ I kind of forgot about it. I have been considering the possibility of writing an IRC client of sorts using C# just for kicks, and to not forget this language in case I ever need it again. Why IRC? Because clients for this protocol are simple and challenging to implement, both at the same time!

I’ve already started to learn a bit of Lua for my work on the aforementioned Wesnoth campaign — in fact, there’s already some released code within it written in this language, particularly in scenario 5! I have plans to rewrite parts of Invasion from the Unknown in Lua to clean it up a little, thus paving the road for future maintenance work by me or other people (don’t forget that IftU is still abandoned!).

Another software project I intend to tackle in the short term is Rei 2. Sure, she’s doing well and her main command handlers are many and useful enough for channels such as ##shadowm and #wesnoth-umc-dev, but her dependence on Irssi’s core might well be a curse for one of our use cases: Shikadibot (the Second), which runs on a resource-limited VPS where every drop of RAM has got gold value. I’m already brainstorming a possible abstraction layer (codenamed “API 3”) which could allow the Irssi core to be swappable with a custom, native IRC client core (codenamed “Anya”). There’s really not much in the current Irssi-based implementation of the internal interfaces (“API 2”) that makes a dependency switch unfeasible.

Finally, I’m not going to stop producing useless updates for my website! Dorset5 0001 is already a reality, although there’s still much I want to do before replacing the current layout. This time I have placed an emphasis on readability and elegance that I don’t think the previous revisions have lived up to so far.

• • •

All in all, I always have so many ideas floating in my mind that I rarely carry to realization, so this can’t be considered a definitive list. There are other possibilities I’m contemplating for the long term regarding my personal life, but that’s a much more volatile subject to discuss so I’m avoiding it for now.

Preparing for yet another year

It’s almost over. Time flies even faster as we get closer to the end of 2010, and apparently there’s a lot to summarize even though we’re not at the finish line yet!

This has been a particularly difficult year for me in a more personal sense, and I’ve faced some trials I won’t speak about and then some, but I’ve also learned new things along the way, things that may be of greater use to me in the future. There’s really a lot that could be said about this year but I’ll restrict it to computer stuff to avoid boring the audience too much (or rather, to bore the audience as much as possible).

Throwaway code

It has become increasingly common for me to come up with a program for an amazing task one day, only to rewrite it the next day.

umcdist, the Wesnoth-UMC-Dev Distribution Tool, has been in development hell for a year mostly for this reason; the other reason is that it seems like it will perform worse than build-external-archive.sh a.k.a. “Scrappy” due to an excessive usage of Perl's system function. I cannot make up my mind and choose between performance and maintainability; Espreon knows that build-external-archive.sh is broken, but I can't be bothered to try to understand that unholy abomination again to fix it.

Meanwhile, umcstat (Wesnoth-UMC-Dev Statistics Service) is still a work in progress, but with more emphasis on progress; there's actual code written already, and I've been using freenode's Eir bot framework to test it.

While Eir could possibly be a nice way to get rid of umcreg's Net::IRC dependency and code, it's actually a C++ program that can be compiled only with GCC 4.4 at minimum, due to at least one C++0x feature used throughout the code: auto. The target machine runs Debian lenny, unlike my laptop (Squeeze), and therefore doesn't have GCC 4.4!

Instead of sticking with Eir, I'm refactoring umcreg's IRC code into a custom Perl-only framework, umcbotd, and, making creative use of eval, writing an abomination code-named “Naia”, which I'm rewriting again because the first version I wrote, which worked, was very badly designed and ugly. I know it's a problem when I have three classes or modules all depending upon each other's internals.

The goal of umcbotd/Naia is producing a Net::IRC-based abstraction layer for our bots that treats them as “services” with multiple “modules” (not in the Perl sense, though) that can be easily inserted and removed from the system by adding/removing their files. umcreg already runs with a prototype implementation of this mechanism, but it needs to be generalized further before it can be usable with Naia.

Writing our bots could be this simple, if Naia gets completed:

#!/bin/true
CO_SERVICE_COMPONENT('umcreg');

sub ctcp_version_reply () { "Wesnoth-UMC-Dev Registry Service, using " . Naia::version_string() }

sub eh_ctcp_version
{
    my ($self, $event) = @_;
    $self->ctcp_reply($event->nick, 'VERSION ' . ctcp_version_reply());
}

# ...

my $bot = Naia::get_bot('umcreg');
my %eh = (
    'cversion' => \&eh_ctcp_version,
    'msg'      => \&eh_msg_private,
    '330'      => \&eh_whoisloggedin,
    '318'      => \&eh_endofwhois,
    '331'      => \&eh_notopic,
    '332'      => \&eh_topic,
    '403'      => \&eh_nosuchchannel,
);
$bot->connect();
$bot->register_event_handlers(%eh);

And their modules would be like this, more or less the same as umcreg's modules already are:

#!/bin/true
# token, subroutine, privilege level (A: admin, H: half-admin, U: user)
CO_MODULE('RAW', \&co_raw, 'A');

sub co_raw
{
    my ($parent, $nick, $hostmask, $svaname) = (shift, shift, shift, shift);
    if(!@_) {
        $parent->notice($nick, "Not enough arguments for \002RAW\002.");
        return;
    }
    $parent->sl(join(' ', @_));
    broadcast_to_log("RAW [$hostmask]");
}

... And of course it would still involve lots of ugly stuff under the hood (eval magic), but if done right I shouldn't have to touch it whenever I wanted to add or remove a feature from either of our two services.

umcreg

At last, umcreg, Wesnoth-UMC-Dev's Registry Service, is finished, deployed and announced in the forums, thus completing the first part of introducing the Registry system to the project.

There were some changes from its original incarnation, but everything turned out pretty well for a bot written from scratch in approximately 6 days by a Perl fanatic with no knowledge of object-oriented Perl — I actually learned some object-oriented Perl while at this and I feel like I can do anything with it now. 😄

I was in a hurry to get this done right before freenode deployed ircd-seven today (yays!), for no particular reason; this resulted in a security feature not working with hyperion-ircd until I introduced a quickie hack that I'll be retiring later today.

I have already registered our current members using some basic, known information about them — even including their join date. Old projects' registration will be a little slow as I need to research the current structure of the repository, of which I lost track a year ago, and retrieve original timestamps. The registry's web interface, provided by the umcreg::Web and Thoria::Web packages, can be found here.

umcreg is already working at the project's admin channel on freenode too. It only obeys the project staff's orders, though, so there's no point in trying to send messages to it.

The next step in implementing the Registry model is writing the Statistics service, codenamed “Listra”, which will most likely be named umcstat. That will definitely take much longer than umcreg's development. Meanwhile, Espreon is trying to convince me to take care of umcdist (codename “Blackmore”) first.