Subversion blows

More than a year ago I commented on the consequences of interrupted commit transactions with the Subversion version control system. Back then, SVN was the only VCS I was familiar with, but nowadays I also have a basic grasp of Git for local and remote repository management.

The thing is, SVN is pretty simple and easy to learn for novice users, which is one of the reasons I haven't decided yet, as the founder admin of the Wesnoth-UMC-Dev Project, to switch to Git. A distributed version control system such as Git or Mercurial is not “better” than SVN, just like Linux cannot be “better” than Windows: they are completely different models for both users and site admins. Switching your version control system isn't as easy as switching from KDE to GNOME as your desktop environment or buying a new printer, especially when you have lots of users and the model conversion isn't easily reversible.

But let's not forget that there's more to SVN, or any other revision tracking system, than just the philosophy and the model behind it. There is an official client which ships in major Linux distributions such as Debian GNU/Linux, Ubuntu and openSUSE, and which also has shared library code used by third-party GUI front-ends such as kdesvn, or other SVN clients such as the git-svn infrastructure.

I have not seen the code, and I believe I do not want to see it with my eyes, but SVN's network code seems to be crap.

The issue mentioned in my second blog post remains the same after several versions of the vanilla SVN client. Then there are other issues that have been here for a long time, and an issue I only discovered some days ago:

  • It is possible in the middle of a networked transaction (commit, update) for the Subversion client process to get stuck if a network error occurs. Subversion normally traps SIGTERMs (and apparently SIGHUPs and SIGINTs too) to perform cleanup routines after such interruptions. However, when it gets stuck, the SIGTERM handler becomes useless and the client ignores the signal forever. This means that SVN can get stuck and sit idle on the terminal (most likely waiting for data from the remote host) until a SIGKILL is sent to force the destruction of its process. Since SIGHUPs are also trapped, killing the terminal leaves a hidden, waiting SVN process. In other words, you can get a dead SVN client running for months if your power source is stable enough. Wonderful.
  • Clients that terminate abnormally (SIGKILL and such) may leave random crap hidden in your SVN checkout's control directories that is normally cleaned up by the SIGTERM handler. While this is often inoffensive and svn cleanup or svn cleanup .. can handle it all, there are times when this is not enough and SVN gets confused by missing/extra files or directory metadata and refuses to update or clean up a path. In such cases, removing the path and its contents and re-checking it out with svn update (or, if it was the whole working dir, svn checkout) is necessary.
  • There seems to be a lot of overhead in the SVN subprotocol on any transport class, be it http, https, svn or svn+ssh. Commits containing simple changes to file/dir object properties can take as long as a regular commit diff when they should probably contain less data (if they contain as much data, then...?!). This is very noticeable on low-bandwidth connections for me. In comparison, an SVN commit of about 20 property changes can take longer than a Git push through SSH of about 10 large commits introducing whole new files.
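One way to keep a wedged client from lingering, as in the first item above, is to run it under a deadline. What follows is only a minimal sketch, assuming timeout from GNU coreutils is installed; the run_bounded name and the durations are made up for illustration, and this is a workaround from the outside, not something SVN itself provides.

```shell
#!/bin/sh
# run_bounded LIMIT GRACE COMMAND [ARGS...]
# Hypothetical helper: sends SIGTERM to COMMAND after LIMIT seconds and,
# if it is still alive GRACE seconds later, follows up with SIGKILL.
# Relies on timeout(1) from GNU coreutils.
run_bounded() {
    limit=$1
    grace=$2
    shift 2
    timeout -k "$grace" "$limit" "$@"
}

# Example (durations are arbitrary): give an update ten minutes, then
# thirty more seconds to clean up before it is killed outright.
# run_bounded 600 30 svn update
```

timeout exits with status 124 when the limit is hit and SIGTERM was enough, and with 137 when it had to resort to SIGKILL. A client killed this way can still leave the working copy in need of svn cleanup, as the second item describes.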

Then there are some odd things with the SVN client library (libsvn), specifically the Perl bindings, namely the issues I mentioned at the start of this month. They leave me very disappointed at this version control system, rumored to be better than CVS (which I haven't ever used...imagine!). The Debian version in Squeeze, and possibly Lenny or upstream too, has a really nasty bug which caused a massive memory leak with Wesnoth-UMC-Dev's umcpropfix tool when I ran it to set properties on a version of Extended Era for Wesnoth, manipulating over 1300 files in multiple directories. The parent Perl interpreter process allocated over 2 GB of virtual memory, making Linux page most memory out to swap and hurting performance.

The cause? Pool management. libsvn's Perl bindings are supposed to do automatic memory management unless the client wants to do their own pool management with the library's facilities, but that somehow causes the aforementioned leak instead. The solution turns out to be doing custom pool management by allocating a new pool for every libsvn call, and forgetting the old one. r6567 in the Wesnoth-UMC-Dev repository applies this workaround for our SVN property setting tool.

Honestly, I'm tired of SVN. I use git-svn wherever I can, but this isn't a magic solution for the crappy design of SVN's innards. git-svn can skip upstream commits if the connection is interrupted during a fetch operation, and it forgets about local commits during pushes (git svn dcommit) after it has sent the first commit and fetched missing ones, which can cause loss of commits and history if the connection to the remote host breaks at that point or git-svn exits in any other fashion.

git-svn doesn't replace SVN's network code either (it uses it instead), so it's still subject to the perceived overhead, but at least it doesn't get stuck forever ignoring SIGTERMs.

What Wesnoth-UMC-Dev needs at the moment is a distributed (yes) version control system that's as user-friendly as SVN and has a nice, well documented Windows front-end that's easy to set up, learn and understand. If we can't find that, I'll continue complaining about SVN at any time and in every place where I see fit, here or on IRC.

Stop the planet Earth, I want to get off!

The Wi-Fi router I use for connecting to the Internet stopped working last night (that dreaded “No route to host”). It's back now, but I'd have liked to post an update yesterday. Anyway, no news is good news, no? Er, well, assuming that the communication systems work, that is...which sadly seems not to be the case for the location of the epicenter, Cobquecura.

It's been more than 30 hours since the earthquake that caught me in the bathroom some minutes after 3:30 am yesterday. The aftershocks, while generally small, continue every 5-90 minutes, and at this point it just feels like one long ride on a bus through a badly deteriorated road, and it's getting fucking annoying.

I'd said that we were okay before. First I was scared when all this happened, and then I was sad after the news. Next, I was worried about the aftershocks. Now I'm just annoyed. Really annoyed.

Nonetheless, I'm not letting an earthquake and a never-ending sequence of aftershocks stop me. There's some groundbreaking work going on here — which you can't see, but I can, and I must say it looks really nice.

Now, could we just stop this madness and allow the rescue groups to do something in the most affected locations near the epicenter without having the ruins of the buildings shaken every few minutes by a whimsical force of nature?

Earthquake in Chile

As you probably know (if not, Google “earthquake in Chile”), Chile's been struck by an earthquake of approx. 8.8 on the Richter scale in the region near Concepción. I live in Santiago, and we have also been affected by this unfortunate event.

(Following comes from this forum post, now missing due to the former autopruning mechanism.)

I am OK right now, but I got trapped in the house during the earthquake (I was in the bathroom and some really heavy objects were in the way to the kitchen, which is the closest way to exit from my bedroom) and thought I wouldn't be able to get out in time. I did (about right at the end of the earthquake, not knowing at the moment if its intensity would continue increasing...), but we are not sure whether the house deteriorated further or not and do not really believe it'd resist a real earthquake with epicenter near Santiago.

There's a good distance between Concepción and Santiago, so it was about 7 on the Richter scale in Santiago according to the authorities last time I checked, at about 10 am via FM radio in the car. We didn't have electricity, tap water or Internet at all until around 2 pm, and I fell asleep around 1:16 pm after being unable to sleep the whole night with the strong and continuous tremors that followed. I originally posted this around 6:40 pm.

While everything's fine for us here right now, sadly, other areas of this same region didn't have this luck. Including areas where some of our family lives.

Naturally, everything to the south is chaos according to the news and there are still isolated people in coastal areas closer to the epicenter.


There are still tremors as of this writing. There was a small one which cut the power lines for 1 second some minutes ago, followed by a stronger one with lots of underground noise. The movement pattern continues being the same as the original earthquake. Internet is flaky.

If anyone's really interested in speaking to me, I've temporarily opened ##shadowm on

Rewriting the past, and the woes of SVN

Long ago, I wrote a little Bash script — set-properties — for the Wesnoth-UMC-Dev project, to ensure the correctness of SVN properties such as svn:keywords and svn:executable on files. It was pretty simple:

#! /bin/sh
# Note: paths containing whitespace are not handled by these loops.
# Set properties on PNG files
for f in $(find . -iname '*.png'); do
    svn propdel svn:executable "$f"
    svn propset svn:mime-type image/png "$f"
done
# Set properties on Ogg files
for f in $(find . -iname '*.ogg'); do
    svn propdel svn:executable "$f"
    svn propset svn:mime-type audio/x-vorbis "$f"
done
# Set properties on PCM files
for f in $(find . -iname '*.wav'); do
    svn propdel svn:executable "$f"
    svn propset svn:mime-type audio/x-wav "$f"
done
# Set properties on JPEG files
for f in $(find . -iname '*.jpg'); do
    svn propdel svn:executable "$f"
    svn propset svn:mime-type image/jpeg "$f"
done
for f in $(find . -iname '*.jpe'); do
    svn propdel svn:executable "$f"
    svn propset svn:mime-type image/jpeg "$f"
done
for f in $(find . -iname '*.jpeg'); do
    svn propdel svn:executable "$f"
    svn propset svn:mime-type image/jpeg "$f"
done
# Set properties on CFG files
for f in $(find . -iname '*.cfg'); do
    svn propdel svn:executable "$f"
done
# Set properties on scripts
for f in $(find . -iname '*.sh'); do
    svn propset svn:executable '*' "$f"
done
for f in $(find . -iname '*.cmd'); do
    svn propset svn:executable '*' "$f"
done
for f in $(find . -iname '*.bat'); do
    svn propset svn:executable '*' "$f"
done
for f in $(find . -iname '*.py'); do
    svn propset svn:executable '*' "$f"
done

Some time after I learned Perl with my work on Shikadibot 0314, I rewrote that script in Perl to arrange the “ideal” property values in a neat table (hash), check current properties instead of blindly overwriting them in the working copy, and cover plenty of other file types. It also gained a blinking progress bar to display the search progress for some reason.

To have an idea of how known file types are defined in the source, let's take a look at these bits:

# proptab:
# extension => [svn:executable, svn:mime-type, svn:eol-style, svn:keywords]
# properties set to the empty string '' (except svn:executable) are left unchanged;
my %proptab = (
    cfg => [FALSE, '', 'native', ''],
    ign => [FALSE, '', 'native', ''],
    "map" => [FALSE, '', 'native', ''],
    # [...]
    pl => [TRUE, '', 'native', 'Author Date Id Revision'],
    # [...]
    gif => [FALSE, 'image/gif', '', ''],
    png => [FALSE, 'image/png', '', ''],
    # [...]
    xcf => [FALSE, '', '', ''],
    # [...]
);

The table we have used since then (around Sept. 2008) has always contained more than 10 extensions with their minimum required property sets. As of this writing, it covers approximately 57 file types. Keep this in mind.

It would be overkill to fork-exec find processes to discover paths that could require SVN property changes, right? So, instead, I used find2perl to generate File::Find client code to embed into set-properties. So far, so good. But how about running that code (scalar keys %proptab) times (i.e. number-of-extensions times) anyway? Overkill?

No! It's plain stupid. But definitely less stupid than what you are about to read.

I apparently decided, for some reason, that any matching paths in each cycle should be added to a plain scalar (a text string to be exact) separating individual paths with newlines, of all things. Then, another cycle is performed at the end, (scalar keys %proptab) times again, to put each array of paths matching a certain extension into another hash, then iterating over the newly inserted hash element (array of paths) checking and fixing SVN properties in the same cycle.

Very roughly summarized as the following pseudocode:

FOREACH extension FROM file_extensions
    FIND IN ./ AS file
        ; all while displaying a cute blinking bar!
        IF file MATCHES extension
            APPEND file TO file_list_string
        END IF

FOREACH extension FROM file_extensions
    ; split 'file_list_string' every newline
    FOREACH path FROM (SPLIT /\n/ file_list_string)
        INSERT path IN file_index[extension]

    FOREACH path FROM file_index[extension]
        fix_properties( extension, path )

Careful readers will quickly realize that something is horribly wrong with this pseudocode. I wish I was making it up. This algorithm has actually been in use by the Wesnoth-UMC-Dev set-properties script for one year and 4 months! I must have been on something when I wrote this shit. I've honestly never seen any program as awful as this in my life. Not even (a.k.a. “Scrappy”) can compete with this abomination.

So, yesterday, I took a look at set-properties after noticing how much CPU time it ate working with very scarcely populated directories of the project, and I was blaming the usage of backticks (svn propset foo bar baz and such) as a possible cause of overhead. Then I slowly realized what I had written. There isn't any emoticon here for the expression on my face at that moment.

Thus, set-properties got rewritten with a much cleaner and simpler algorithm:

    FIND IN ./ AS file
        ; no more useless cute blinking bar!
        FOREACH extension FROM file_extensions
            IF file MATCHES extension
                fix_properties( extension, file )
            END IF

So yeah. 🙁
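The single-pass idea can be sketched in plain shell as follows. This is only an illustration and not the actual umcpropfix code (which is Perl): mime_for_ext and plan_tree are hypothetical names, the table holds just three entries taken from the examples in this post, and the function prints the svn commands it would run instead of running them.

```shell
#!/bin/sh
# Hypothetical lookup table: prints the wanted svn:mime-type for a
# lowercase extension, or nothing when the type should be left alone.
mime_for_ext() {
    case "$1" in
        png) echo image/png ;;
        gif) echo image/gif ;;
        jpg|jpe|jpeg) echo image/jpeg ;;
        *) : ;;
    esac
}

# plan_tree DIR: walk DIR exactly once, skipping .svn control
# directories, and print the svn command each matching file would need.
# Dry run only; file names containing newlines are not handled.
plan_tree() {
    find "$1" -name .svn -prune -o -type f -print |
    while IFS= read -r path; do
        name=${path##*/}
        case "$name" in
            *.*) ext=${name##*.} ;;
            *) continue ;;
        esac
        # Lowercase the extension so that FOO.PNG matches too.
        ext=$(printf '%s' "$ext" | tr '[:upper:]' '[:lower:]')
        mime=$(mime_for_ext "$ext")
        if [ -n "$mime" ]; then
            printf 'svn propset svn:mime-type %s %s\n' "$mime" "$path"
        fi
    done
}
```

Each file is visited once and its extension is looked up directly, so the cost no longer grows with the number of known extensions. Calling plan_tree . from the top of a working copy lists the planned changes.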

But wait, there's more! While replacing the script I also replaced svn foobar backtick code with invocations of libsvn, via the SVN::Client module. This worked very well in the end, but I discovered a few things in the process:

  • Those SVN::Client methods I used choke on non-absolute path specifications for some reason, causing an assertion failure in libsvn's C back-end and terminating the execution of Perl and the script with a SIGABRT.
  • Despite the documentation's claim that SVN::Client::url_from_path() returns undef if the specified path is not under version control, it actually causes the module to invoke die and get the client script terminated. Which means that I cannot even check safely if a file is versioned or not (i.e. without resorting to .svn/text-base/<FileName>.svn-base existence checks). What the hell?
$ set-properties
perl: /tmp/buildd/subversion-1.6.9dfsg/subversion/libsvn_subr/path.c:114: svn_path_join: Assertion `svn_path_is_canonical(base, pool)' failed.

Turns out the solution is to wrap SVN::Client method calls in eval blocks and handle whatever crap SVN comes up with. Oh, and make sure all paths are absolute using Cwd::realpath so that libsvn doesn't hit an assertion failure killing us, eval or no eval. How nice.

# No point in working with unversioned files.
my $svn_ret = undef;
eval { $svn_ret = $svn->url_from_path($path) };
if($@) {
    # fucking libsvn dies if $path isn't under version control;
    # the documentation says it should return undef above instead!
    # [...]
}

With this, I have absolutely lost my faith in Subversion's excellence as a version control system not only as a normal user, but also as a programmer integrating it into my own client applications. And I know I lost my faith as a user after seeing svn sit for two days doing fucking nothing because the damned connection died after 1 minute of running time — and then svn spent the next days ignoring SIGTERMs all the time, no less. It's been like this for so many versions that I'm almost convinced it's intentional.

(Ah, that was relaxing. I should do this more often.)

Wesnoth-UMC-Dev has to continue using SVN because we have several add-on authors using Windows and the few alternatives for using git (I love git, I don't think I could try anything else at the moment) on Windows seem to be rather awkward to install and/or use for the average Non-Computer Person. That's a pity. It's really a pity.

Finally, set-properties got renamed to umcpropfix for a change, to mark its rebirth after I solved the algorithmic mess above last night. It doesn't have a nifty codename like the rest, though, but the new umc-prefixed name is still something to be celebrated now that we are going to have umcdist, umcstat, umcreg and umcbotd. Yes, I know I'm crazy, thank you.

Throwaway code

It has become increasingly common for me to come up with a program for an amazing task one day, only to rewrite it the next day.

umcdist, the Wesnoth-UMC-Dev Distribution Tool, has been in development hell for a year mostly for this reason; the other reason is that it seems like it will perform worse than a.k.a. “Scrappy” due to an excessive usage of Perl's system function. I cannot make up my mind and choose between performance and maintainability; Espreon knows that is broken, but I can't be bothered to try to understand that unholy abomination again to fix it.

Meanwhile, umcstat (Wesnoth-UMC-Dev Statistics Service) is still a work in progress, but with more emphasis on progress; there's actual code written already, and I've been using freenode's Eir bot framework to test it.

While Eir could possibly be a nice way to get rid of umcreg's Net::IRC dependency and code, it's actually a C++ program that can be compiled only with GCC 4.4 at minimum, due to at least one C++0x feature used throughout the code: auto. The target machine runs Debian lenny, unlike my laptop (Squeeze), and therefore doesn't have GCC 4.4!

Instead of sticking with Eir, I'm refactoring umcreg's IRC code into a custom Perl-only framework, umcbotd, and making creative use of eval to write an abomination code-named “Naia”, which I'm rewriting again because the first version I wrote, which worked, was very badly designed and ugly. I know it's a problem when I have three classes or modules all depending upon each other's internals.

The goal of umcbotd/Naia is producing a Net::IRC-based abstraction layer for our bots that treats them as “services” with multiple “modules” (not in the Perl sense, though) that can be easily inserted and removed from the system by adding/removing their files. umcreg already runs with a prototype implementation of this mechanism, but it needs to be generalized further before it can be usable with Naia.

Writing our bots could be this simple, if Naia gets completed:

sub ctcp_version_reply () { "Wesnoth-UMC-Dev Registry Service, using " . Naia::version_string() }
sub eh_ctcp_version
{
    my ($self, $event) = @_;
    $self->ctcp_reply($event->nick, 'VERSION ' . ctcp_version_reply());
}
# ...
my $bot = Naia::get_bot('umcreg');
my %eh = (
    'cversion' => \&eh_ctcp_version,
    'msg' => \&eh_msg_private,
    '330' => \&eh_whoisloggedin,
    '318' => \&eh_endofwhois,
    '331' => \&eh_notopic,
    '332' => \&eh_topic,
    '403' => \&eh_nosuchchannel,
    # ...
);

And their modules would be like this, more or less the same as umcreg's modules already are:

# token, subroutine, privilege level (A: admin, H: half-admin, U: user)
CO_MODULE('RAW', \&co_raw, 'A');
sub co_raw
{
    my ($parent, $nick, $hostmask, $svaname) = (shift, shift, shift, shift);
    if(!@_) {
        $parent->notice($nick, "Not enough arguments for \002RAW\002.");
        return;
    }
    $parent->sl(join(' ', @_));
    broadcast_to_log("RAW [$hostmask]");
}

... And of course it would still involve lots of ugly stuff under the hood (eval magic), but if done right I shouldn't have to touch it whenever I wanted to add or remove a feature from either of our two services.


At last, umcreg, Wesnoth-UMC-Dev's Registry Service, is finished, deployed and announced in the forums, thus completing the first part of introducing the Registry system to the project.

There were some changes from its original incarnation, but everything turned out pretty well for a bot written from scratch in approximately 6 days by a Perl fanatic with no knowledge of object-oriented Perl — I actually learned some object-oriented Perl while at this and I feel like I can do anything with it now. 😄

I was in a hurry to get this done right before freenode deployed ircd-seven today (yays!), for no particular reason; this resulted in a security feature not working with hyperion-ircd until I introduced a quickie hack that I'll be retiring later today.

I have already registered our current members using some basic, known information about them — even including their join date. Old projects' registration will be a little slow as I need to research the current structure of the repository, of which I lost track a year ago, and retrieve original timestamps. The registry's web interface, provided by the umcreg::Web and Thoria::Web packages, can be found here.

umcreg is already working at the project's admin channel on freenode too. It only obeys the project staff's orders, though, so there's no point in trying to send messages to it.

The next step in implementing the Registry model is writing the Statistics service, codenamed “Listra”, and most likely going to be named umcstat. That will definitely take much longer than umcreg's development. Meanwhile, Espreon is trying to convince me to take care of umcdist (codename “Blackmore”) first.

Building arcs

After some weeks of inactivity, I have finally completed the first arc (not the first episode, though) of After the Storm in the Wesnoth-UMC-Dev SVN repository, comprising 7 scenarios out of a planned 11; this means that a 0.3.0 release is coming soon. It was about time!

With this new arc completed, I have introduced new background story elements that could be considered controversial if any mainline purist is actually paying attention to the campaign — that's okay, I never intended IftU or any sequels to be mainlined. I still struggle to keep everything as fuzzy as possible to give a certain degree of flexibility to any content authors who decide to take what is said in IftU and AtS as “canon”. It's harder than it sounds, particularly because it must still be clear enough to allow the plot to progress; so I cannot just throw a bunch of nonsense into the campaign and say “hey, look at this, this is our vague excuse for this pathetic plotline!”.

Now that the characters have a decidedly vague excuse for the plot of the next arc, I face a problem that I knew I'd have to handle sooner or later: artwork.

I am not a good pixel artist, but I don't have any loyal slav- pixel artist that could help me either. And even if I could get one, I'm not completely sure I could describe the concepts I have in mind in plain words to tell them what unit sprites I require. There's also the spoileriffic factor; there's a reason that I removed the original (clumsy) storyboard from the SVN repository and departed from the original plans, most notably by removing a main character and introducing two new sidekicks instead. So, it's up to me to create any pixel art needed to make the campaign work; baseframes are enough for this purpose, although I still wish IftU and AtS' original units had animations.

At least I don't need to write a game engine from scratch too, thanks to Wesnoth's scripting flexibility.

Kalari at last

It took me much less time than I expected to put the new layout of the Wesnoth-UMC-Dev website together. Observe.

Okay, that's basically because most of the design was already made a long time ago, in the form of the site's earlier incarnation, codenamed “Soradoc”, which looked rather busy and useless with the sidebar and other design elements. The new design, “Kalari”, removes the sidebar, clears the site banner a bit, and blends the site with as far as appearance is concerned. It's not the same design, but it's similar, which should be a good thing considering the purpose of Wesnoth-UMC-Dev.

That site also had a Blosxom-based blog, but I removed it since nobody was making actual use of the space.

The greatest thing about all this is that most of the PHP, “Poison Ivy”, was finished in 1 night, while the rest took me just a few additional hours. Now that Poison Ivy is completed, I can reuse its code for the next incarnation of this very blog.

It's all for teaching some web design and programming basics to myself, really.

Mozilla Firefox 3.5

Long, long ago, I talked about several issues I had with Mozilla Failfox Firefox 3.0 and openSUSE 10.3 for the AMD64/EM64T architecture.

Ever since then, I have learned several things:

  1. Debian's Iceweasel fork doesn't seem to be much ahead of mainline Firefox in terms of bugfixes, as far as I can see. This might be not true for security fixes and such; I admit I haven't done any actual research on this and I'm basing this statement on my user experience.
  2. The Download Day was a trap.
  3. Other people who I have talked to regarding Firefox's stability on Linux claim that it never/rarely crashes, but all of them use x86 kernels and userspace.
  4. Iceweasel 3.0 taints the Debian GNU/Linux 5.0 Lenny distribution on the AMD64/EM64T architecture, with no differences in either of my laptops. This Linux distribution is remarkably stable otherwise, and has lived up to my expectations since I originally switched to it when it was the Testing distribution (that is, compared to the released openSUSE 10.3).
  5. Off-line browsing is truly, horribly underestimated, to the point that one of the major web browsers does not support it at all; probably in favor of simplicity and ease of use, and “permanently connected people”. But, STILL... 😕
  6. It's not a good idea to leave a chainsaw and a newspaper near the reach of a cow.

I recently switched to Debian Squeeze, which is still under development (i.e. Testing) as of this writing. Originally, I just got a newer revision of Iceweasel 3.0 with the set. Some weeks ago, I got upgraded to 3.5.

As I mentioned in my previous post in this series, the status bar does still glitch a lot — no, wait — the status bar glitches even more than in 3.0. Scrolling is less laggy but only with smooth scrolling disabled, although I am not exactly using a well-supported video configuration at the moment and I probably should not complain about performance issues with any 2D application unless I'm willing to use the unaccelerated VESA driver for benchmarking or shut up.

The Live Bookmarks feature stopped working after the upgrade until I went and manually reloaded every single Atom/RSS feed I had linked in a neat folder in the bookmarks toolbar. It took me a while to realize that nobody posting anything near Christmas was a bad sign. I didn't miss much anyway, since my feed sources aren't really chatty. Yes, I know I'd be better off using an actual feed reader, but I'm just that lazy, which is also why I don't use Opera as much as I want.

However, this version of Firefox is much, much more stable than 3.0 — as far as this AMD64/EM64T architecture user is concerned, that is. Firefox just got better, really. But it's still rather odd because I've heard comments on IRC of people claiming that it got more unstable instead. Hmm... Well, maybe Windows or x86 Linux users are less lucky this time?

Firefox 3.5 also supports the CSS text-shadow property, which was introduced in the CSS level 2 specification, removed in revision 1 (CSS 2.1), and seems to have been picked up again for CSS 3. No version of Internet Explorer before and including 8.0 supports this (although ISTR that they support a shadow filter using a custom extension to CSS that didn't even follow the specification for naming vendor-specific properties), and current Opera, Safari and Chrome support this property well. That means that I must make more use of it in this site's stylesheets from now on. 😉

Creativity drops

At this point, I should know better than tempting fate in a blog post:

[...] Some days before Xmas, my creativity returned from its long, chaotic journey and my Wesnoth add-on, After the Storm (sequel to Invasion from the Unknown) has seen steady progress and two new releases were published in less than two weeks. Keep in mind that this add-on had not seen any public releases for almost a year. [...]

Two days afterwards, my creativity disappeared like a drop of water in the sea, again, which means that After the Storm has seen practically no progress since then. I hope I get better next week, because this work needs to be completed as soon as possible.