Rewriting the past, and the woes of SVN
Long ago, I wrote a little Bash script, set-properties, for the Wesnoth-UMC-Dev project, to ensure the correctness of SVN properties such as svn:keywords and svn:executable on files. It was pretty simple:
    #! /bin/sh

    # Set properties on PNG files
    for f in $(find -iname *.png); do
        svn propdel svn:executable $f
        svn propset svn:mime-type image/png $f
    done

    # Set properties on Ogg files
    for f in $(find -iname *.ogg); do
        svn propdel svn:executable $f
        svn propset svn:mime-type audio/x-vorbis
    done

    # Set properties on PCM files
    for f in $(find -iname *.wav); do
        svn propdel svn:executable $f
        svn propset svn:mime-type audio/x-wav
    done

    # Set properties on JPEG files
    for f in $(find -iname *.jpg); do
        svn propdel svn:executable $f
        svn propset svn:mime-type image/jpeg $f
    done
    for f in $(find -iname *.jpe); do
        svn propdel svn:executable $f
        svn propset svn:mime-type image/jpeg $f
    done
    for f in $(find -iname *.jpeg); do
        svn propdel svn:executable $f
        svn propset svn:mime-type image/jpeg $f
    done

    # Set properties on CFG file
    for f in $(find -iname *.cfg); do
        svn propdel svn:executable $f
    done

    # Set properties on scripts
    for f in $(find -iname *.sh); do
        svn propset svn:executable '*' $f
    done
    for f in $(find -iname *.cmd); do
        svn propset svn:executable '*' $f
    done
    for f in $(find -iname *.bat); do
        svn propset svn:executable '*' $f
    done
    for f in $(find -iname *.py); do
        svn propset svn:executable '*' $f
    done
Some time after I learned Perl with my work on Shikadibot 0314, I rewrote that script in Perl to arrange the “ideal” property values in a neat table (hash), check current properties instead of blindly overwriting them in the working copy, and cover plenty of other file types. It also gained a blinking progress bar to display the search progress for some reason.
To get an idea of how known file types are defined in the source, let's take a look at these bits:
    #
    # proptab:
    #   extension => [svn:executable, svn:mime-type, svn:eol-style, svn:keywords]
    #   properties set to the empty string '' (except svn:executable) are left unchanged;
    #
    my %proptab = (
        cfg => [FALSE, '', 'native', ''],
        ign => [FALSE, '', 'native', ''],
        "map" => [FALSE, '', 'native', ''],
        # [...]
        pl => [TRUE, '', 'native', 'Author Date Id Revision'],
        # [...]
        gif => [FALSE, 'image/gif', '', ''],
        png => [FALSE, 'image/png', '', ''],
        # [...]
        xcf => [FALSE, '', '', ''],
        # [...]
    );
The table we have used since then (around Sept. 2008) has always contained more than 10 extensions with their minimum required property sets. As of this writing, it covers approximately 57 file types. Keep this in mind.
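To make the table's convention concrete, here is a minimal sketch of how one row might be turned into a list of property changes. Only the column order and the "empty string means unchanged" rule come from the comment above; the TRUE/FALSE constants, the @columns array and the planned_changes helper are assumptions made up for illustration, not the script's actual code.

```perl
use strict;
use warnings;

# Assumption: TRUE/FALSE are plain constants, as the barewords in the table suggest.
use constant { TRUE => 1, FALSE => 0 };

# Column order taken from the proptab comment.
my @columns = ('svn:executable', 'svn:mime-type', 'svn:eol-style', 'svn:keywords');

# A three-row excerpt of the table.
my %proptab = (
    cfg => [FALSE, '',          'native', ''],
    pl  => [TRUE,  '',          'native', 'Author Date Id Revision'],
    png => [FALSE, 'image/png', '',       ''],
);

# Hypothetical helper: returns a list of [action, property, value] triples
# for a path, or an empty list if the extension is not in the table.
sub planned_changes {
    my ($path) = @_;
    my ($ext) = $path =~ /\.([^.\/]+)$/ or return;
    my $row = $proptab{lc $ext} or return;

    my @changes;
    # svn:executable is always enforced: either set it or delete it.
    push @changes, $row->[0]
        ? ['propset', $columns[0], '*']
        : ['propdel', $columns[0], undef];
    # The remaining columns are applied only when they are non-empty.
    for my $i (1 .. $#columns) {
        push @changes, ['propset', $columns[$i], $row->[$i]] if $row->[$i] ne '';
    }
    return @changes;
}

# Usage: list the changes planned for a PNG file.
for my $change (planned_changes('units/orc.png')) {
    my ($action, $prop, $value) = @$change;
    print join(' ', $action, $prop, defined $value ? $value : ()), "\n";
}
```

For a .png path this plans a propdel of svn:executable and a propset of svn:mime-type to image/png, and nothing else, since the eol-style and keywords columns are empty.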
It would be overkill to fork-exec find processes to discover paths that could require SVN property changes, right? So, instead, I used find2perl to generate File::Find client code and embedded it into set-properties. So far, so good. But how about running that code (scalar keys %proptab) times (i.e. once per extension) anyway? Overkill?
No! It's plain stupid. But definitely less stupid than what you are about to read.
I apparently decided, for some reason, that any matching paths in each cycle should be appended to a plain scalar (a text string, to be exact), separating individual paths with newlines, of all things. Then another cycle runs at the end, (scalar keys %proptab) times again, to put the array of paths matching each extension into another hash, and then iterates over the newly inserted hash element (the array of paths), checking and fixing SVN properties in the same cycle.
Very roughly summarized as the following pseudocode:
    FOREACH extension FROM file_extensions
        FIND IN ./ AS file ; all while displaying a cute blinking bar!
            IF file MATCHES extension
                APPEND file TO file_list_string
            END IF
        END FIND
    END FOREACH

    FOREACH extension FROM file_extensions
        ; split 'file_list_string' every newline
        FOREACH path FROM (SPLIT /\n/ file_list_string)
            INSERT path IN file_index[extension]
        END FOREACH
        FOREACH path FROM file_index[extension]
            fix_properties( extension, path )
        END FOREACH
    END FOREACH
Careful readers will quickly realize that something is horribly wrong with this pseudocode. I wish I were making it up. This algorithm was actually in use by the Wesnoth-UMC-Dev set-properties script for one year and four months! I must have been on something when I wrote this shit. I've honestly never seen a program as awful as this in my life. Not even build-external-archive.sh (a.k.a. “Scrappy”) can compete with this abomination.
So, yesterday, I took a look at set-properties after noticing how much CPU time it ate even in very sparsely populated directories of the project, and I was blaming the use of backticks (svn propset foo bar baz and such) as a possible cause of overhead. Then I slowly realized what I had written. There isn't any emoticon here for the expression on my face at that moment.
Thus, set-properties got rewritten with a much cleaner and simpler algorithm:
    FOREACH file FROM (FIND IN ./)
        ; no more useless cute blinking bar!
        FOREACH extension FROM file_extensions
            IF file MATCHES extension
                fix_properties( extension, file )
            END IF
        END FOREACH
    END FOREACH
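For the curious, a single-pass traversal along these lines can be sketched with File::Find roughly as follows. This is not the actual umcpropfix code: the scan_tree name, the callback interface and the two-entry table are made up for illustration, skipping .svn directories is my own assumption, and the inner FOREACH over extensions is swapped for a direct hash lookup on the file's extension.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical two-entry table; the values stand in for real property rows.
my %proptab = (png => 'image row', cfg => 'config row');

# Walk $root exactly once and call $fix->($extension, $path) for every
# regular file whose extension appears in the table.
sub scan_tree {
    my ($root, $fix) = @_;
    find({
        no_chdir => 1,    # keep $_ equal to the full path
        wanted   => sub {
            # Assumption: Subversion's administrative directories are skipped.
            if (-d $_ && $_ =~ m{(?:^|/)\.svn$}) {
                $File::Find::prune = 1;
                return;
            }
            return unless -f $_;
            my ($ext) = $_ =~ /\.([^.\/]+)$/ or return;
            $ext = lc $ext;
            $fix->($ext, $_) if exists $proptab{$ext};
        },
    }, $root);
}

# Usage: print what would be fixed under the current directory.
scan_tree('.', sub { my ($ext, $path) = @_; print "$ext\t$path\n" });
```

The tree is visited once no matter how many extensions the table holds, so growing the table from 10 to 57 entries costs nothing extra per file.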
So yeah. 🙁
But wait, there's more! While optimizing the script, I also replaced the svn foobar backtick code with invocations of libsvn via the SVN::Client module. This worked very well in the end, but I discovered a few things in the process:
- Those SVN::Client methods I used choke on non-absolute path specifications for some reason, causing an assertion failure in libsvn's C back-end and terminating the execution of Perl and the script with a SIGABRT.
- Despite the documentation's claim that SVN::Client::url_from_path() returns undef if the specified path is not under version control, it actually causes the module to invoke die and terminates the client script. This means that I cannot even safely check whether a file is versioned (i.e. without resorting to .svn/text-base/<FileName>.svn-base existence checks). What the hell?
    $ set-properties
    perl: /tmp/buildd/subversion-1.6.9dfsg/subversion/libsvn_subr/path.c:114: svn_path_join: Assertion `svn_path_is_canonical(base, pool)' failed.
    Aborted
Turns out the solution is to wrap SVN::Client method calls in eval blocks and handle whatever crap SVN comes up with. Oh, and make sure all paths are absolute using Cwd::realpath so that libsvn doesn't hit an assertion failure and kill us, eval or no eval. How nice.
    # No point in working with unversioned files.
    my $svn_ret = undef;
    eval { $svn_ret = $svn->url_from_path($path) };
    if($@) {
        # fucking libsvn dies if $path isn't under version control;
        # the documentation says it should return undef above instead!
        return;
    }
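Putting both workarounds together, the pattern looks roughly like the sketch below. To keep it runnable without Subversion installed, it uses a made-up FakeClient class that merely mimics the die-on-unversioned-path behavior described above; it is not SVN::Client, and the versioned_url helper name is likewise hypothetical. The realpath call is there because an assertion failure inside libsvn aborts the whole process, which no eval can catch.

```perl
use strict;
use warnings;
use Cwd qw(realpath);

# Hypothetical stand-in for an SVN::Client object (NOT the real module).
# It dies on paths it does not know about, mimicking the behavior above.
{
    package FakeClient;
    sub new { my ($class, %known) = @_; return bless {%known}, $class }
    sub url_from_path {
        my ($self, $path) = @_;
        die "'$path' is not under version control\n" unless exists $self->{$path};
        return $self->{$path};
    }
}

# Returns the URL for $path, or undef if the path cannot be resolved
# or the client dies while looking it up.
sub versioned_url {
    my ($client, $path) = @_;
    my $abs = realpath($path);    # absolute path with symlinks resolved
    return undef unless defined $abs;
    # eval yields undef if url_from_path() died.
    my $url = eval { $client->url_from_path($abs) };
    return $url;
}

# Usage: pretend the current directory is a checkout of a made-up URL.
my $here   = realpath('.');
my $client = FakeClient->new($here => 'https://example.invalid/repo/trunk');
my $url    = versioned_url($client, '.');
print defined $url ? $url : 'unversioned', "\n";
```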
With this, I have absolutely lost my faith in Subversion's excellence as a version control system, not only as a normal user, but also as a programmer integrating it into my own client applications. And I know I lost my faith as a user after seeing svn sit for two days doing fucking nothing because the damned connection had died after 1 minute of running time, and then svn spent the next days ignoring SIGTERMs all the time, no less. It's been like this for so many versions that I'm almost convinced it's intentional.
(Ah, that was relaxing. I should do this more often.)
Wesnoth-UMC-Dev has to continue using SVN because we have several add-on authors using Windows, and the few alternatives for using git (I love git, I don't think I could try anything else at the moment) on Windows seem to be rather awkward to install and/or use for the average Non-Computer Person. That's a pity. It's really a pity.
Finally, set-properties got renamed to umcpropfix for a change, to mark its rebirth after I solved the algorithmic mess above last night. It doesn't have a nifty codename like the rest, but the new umc-prefixed name is still something to be celebrated now that we are going to have umcdist, umcstat, umcreg and umcbotd. Yes, I know I'm crazy, thank you.