Windows Installer is evil

So, last night, I was trying to uninstall some software while using Windows Vista in “safe mode”.

[Screenshot: Windows Installer warning in safe mode]

... but apparently you cannot uninstall Windows Installer-based software when running in safe mode. What the hell.

The screenshot comes from Windows 7 RC 1, but this problem also applies to earlier versions. Windows XP and earlier won't display any warning at all, though; you will just be mercilessly thrown back to the Add/Remove Programs control panel.

Apparently Windows is built upon the assumption that I won't want to save disk space and improve overall performance by disabling System Restore, and so I cannot uninstall problematic software while in safe mode. Why would one want to do that anyway? Well, some software components may run automatically whenever one logs into their account, and maybe the software isn't working properly and locks up the Windows desktop, or maybe the entire system, when it runs. Granted, I could stop the software from running automatically at login using the Registry Editor or msconfig.exe, and hope that it's not a shell extension that locks up the Windows desktop; those are harder to remove by hand, but I guess there are third-party tools out there (um, hopefully certified by Microsoft to be safe?!) to help in such cases.

Then again, why can't I enjoy mass-removing software with the improved performance that safe mode implies? Why shouldn't the Windows Installer service be able to run in safe mode? (Why is it supposed to be a system service that runs in the background stealing precious system resources anyway? Okay, Windows doesn't start it automatically unless some application demands it, but it'll continue running in the background after it's finished.)

Maybe all these limitations are useful or bearable to the regular “non-computer person”, but when the software I use doesn't let me do what I want, the way I want, I feel like it's secretly laughing at me as if it was some sort of evil trap.

Fortunately there are still sane software developers out there that don't use the Windows Installer system, even if that means that the user and application programming interfaces for managing software packages in Windows will never be as consistent as they are in GNU/Linux distributions such as Debian (dpkg) and Red Hat (rpm) based ones.

And talking about Debian, dpkg and apt-get run perfectly fine here in the single-user runlevel. YaST2 (which has, amongst other nice things, an rpm front-end) also works under such conditions on openSUSE.

Windows 9x on VirtualBox 3.0.8

Defying the laws of Common Sense™, I have created some virtual machines with VirtualBox 3.0.8 to run Windows 95 OSR 2.0, Windows 98 SE and Windows Me. There's no practical use for them whatsoever, except maybe testing how good modern websites look on old operating systems and browsers or something, or pushing the emulator/virtualizer to its limits, out of the "safe area".

The thing is, they actually work, to various degrees:

  • Windows 95 OSR 2.0 cannot run with hardware (AMD-V) virtualization enabled; otherwise, it will halt the system complaining about a "Windows protection error" when loading some component.
  • Windows 98 SE works somewhat slowly in software emulation mode, and crashes very frequently during the boot process when using hardware virtualization.
  • Windows Me is unusably slow in software emulation mode; however, it works mostly fine and fast with hardware virtualization enabled, bar some occasional boot-time BSODs.

The network card drivers included with each one work just fine, out of the box even, except that Windows 95 OSR 2.0 doesn't install the TCP/IP stack by default and it's necessary to add it by hand with the Network control panel. And we are talking about software that included Internet Explorer 3, forcibly preinstalled even.

For the video controller I am using the VBEMP x86 driver for Windows 9x (more specifically the 2008.10.21 build) which supports extended graphic modes, up to 1600x1200 with a color depth of 32 bpp. It is mostly stable, if a little slow for some operations, especially when running without hardware virtualization; reducing the color depth and/or resolution should help. The only major problem I have spotted so far is that the screen gets garbled when opening a command (DOS) prompt, but that is easily solved by switching the prompt's fullscreen mode a few times. It also happens at random times when running Windows Me because of some stupid background process that runs from time to time with a minimized console; in such cases, I use CTRL+ESC and then R to bring up the Run dialog, start a DOS prompt (command.com) from there and toggle fullscreen mode as required.

It goes without saying that performance may be increased by disabling some features such as window animations, showing window contents while moving, etc.

There's a somewhat detailed tutorial on the VirtualBox forums about running Windows 9x as well, but I didn't follow it, and I found out about the VBE drivers from a qemu-related FAQ instead.

Obviously, this is just experimenting with VirtualBox a little too much, and for real work it's better to use Windows 2000 or XP instead. The main motivation for trying these operating systems in it despite Sun's recommendations is that qemu's Cirrus Logic emulation has gone downhill ever since 0.10.0, making the VBE drivers almost a requirement, and the introduction of a resizable window frame in 0.11.0 is more a bug than a feature for me. It is only annoying when running operating systems in resolutions greater than what my KDE 3.5 setup supports because it doesn't play nice with KWin, and when the window gets resized there's no apparent easy way to get rid of the blurry appearance.

I'm not trying VirtualBox 3.0.10 yet since there are no changelog items of my interest at the moment; it's a fairly large download after all.

UPDATE 2009/11/07: updated to 3.0.10, no new problems with Windows 95 OSR 2, 98 SE or Me so far.

RadeonHD and kompmgr, Part II

With EXA support enabled in the xorg.conf for radeonhd, I started to get X.org lock-ups when resuming from s2disk on the 2.6.28.8 kernel at random times; sometimes after 2 days of uptime, sometimes after just 15 minutes. Since I'm using a laptop with that funny Fn key layout, pressing alt-sysrq-k (SAK), alt-sysrq-u (remount read-only) and alt-sysrq-i (kill all tasks but init IIRC) makes the kernel believe I have pressed alt-sysrq-<insert numpad key here>, which doesn't do anything but set the kernel loglevel. Not being able to see anything on the pitch-black screen, all I have left then is alt-sysrq-s (sync system call) and alt-sysrq-b (reboot), and to let fsck replay the ext3 journals while restarting.

Hearing that the drm kernel modules at the freedesktop.org repositories are not updated with the mainline kernel changes, I decided to try out the kernel 2.6.30-rc2. Fear. Despair. It locked up forever while suspending to disk.

Either it apparently did something more than merely locking up, or the freedesktop.org drm and radeon kernel modules are more broken than I expected, but some hours after rebooting to 2.6.28.8 and doing my homework on the laptop, most of the running GUI programs crashed with random SIGSEGVs and SIGABRTs. Yes, you read that right: even the KDE window manager got busted. Interestingly, I could bring it up again with some effort without restarting X, but then I noticed that one of the apps I was working with, Eclipse (3.4.x), didn't want to start again. When launched from the console, it would exit around 2 seconds after being invoked, with no explanation. After fiddling around with my ~/.eclipse dir and restarting X various times just to see knotify and artsd crashing like mad on KDE startup, I started a session as another (dummy) user in X and tried to launch Eclipse 3.2 instead. Its explanation was a bit more satisfactory: missing libraries.

"Of course", I thought, "the ldconfig cache may have become corrupted".

I kind of hit the nail on the head:

bluecore:~# ldconfig
ldconfig: Cannot mmap file /usr/lib/libartsmidi.so.0.0.0.
ldconfig: Cannot lstat /usr/lib/libartsgui.so.0.0.0: Input/output error
ldconfig: Cannot lstat /usr/lib/libpangox-1.0.so.0.2002.3: Input/output error
ldconfig: Cannot mmap file /usr/lib/libwine.so.1.
ldconfig: Cannot mmap file /usr/lib/libpangox-1.0.so.
ldconfig: Cannot lstat /usr/lib/libpangoft2-1.0.so.0.2002.3: Input/output error
ldconfig: Cannot lstat /usr/lib/libakregatorprivate.so: Input/output error
ldconfig: Cannot mmap file /usr/lib/libartsmidi.so.0.
ldconfig: Cannot mmap file /usr/lib/libpangoxft-1.0.so.0.2002.3.
ldconfig: Cannot lstat /usr/lib/libartsgui_idl.so.0.0.0: Stale NFS file handle
ldconfig: Cannot lstat /usr/lib/libartsbuilder.so.0.0.0: Input/output error
ldconfig: Cannot mmap file /usr/lib/libpangoft2-1.0.so.0.
ldconfig: /usr/lib/libwine.so.1 is not a symbolic link

The 2.6.28.8 kernel was not amused by that attempt:

Apr 21 00:21:56 bluecore kernel: init_special_inode: bogus i_mode (3000)
Apr 21 00:22:44 bluecore kernel: init_special_inode: bogus i_mode (473)
Apr 21 00:26:40 bluecore kernel: init_special_inode: bogus i_mode (55000)
Apr 21 00:26:40 bluecore kernel: init_special_inode: bogus i_mode (0)
Apr 21 00:26:40 bluecore kernel: init_special_inode: bogus i_mode (71165)

My thoughts at that moment: WHAT THE F* HELL HAS HAPPENED!!!?!?! Considering that I hadn't run Wine for months, and it wasn't running at that moment either... I ran some ls's to see what had become of those libraries... it wasn't pretty. They had become character devices, block devices and FIFOs. 😕 I rebooted, and fsck still ignored the "clean" /usr (/dev/sda5) filesystem until I forced it to check all filesystems again by setting their mount counts to some funny values using tune2fs.
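
For the record, the tune2fs trick can be sketched roughly as below. This is a minimal sketch, not necessarily the exact commands used at the time: the mount count of 100 is an arbitrary "funny value", check_due is a hypothetical helper name, and /dev/sda5 is simply the /usr partition mentioned above. According to the tune2fs man page, e2fsck checks the filesystem at the next reboot when the mount count is greater than the maximum mount count.

```shell
# Hypothetical helper: read `tune2fs -l` output on stdin and succeed if the
# mount count is greater than the maximum mount count (a check is then due).
check_due() {
    awk -F': *' '
        /^Mount count:/         { count = $2 + 0 }
        /^Maximum mount count:/ { max = $2 + 0 }
        END { exit !(max > 0 && count > max) }   # max <= 0 means "never check"
    '
}

# As root, on the real device (assumed to be /dev/sda5 here):
#   tune2fs -C 100 /dev/sda5      # pretend it has been mounted 100 times
#   tune2fs -l /dev/sda5 | check_due && echo "fsck will run at next boot"
```

If the maximum mount count has been disabled on a filesystem, bumping the mount count alone will not trigger anything, which is why the helper treats a non-positive maximum as "no check due".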

Indeed, /usr was corrupted.

After init dropped me to an emergency console to run fsck manually on sda5, running e2fsck threw some interesting stuff, which I forgot to save to a file, but it was something like this:

inode ####### has 'compressed' flag set on an unsupported filesystem
inode ####### has invalid size
inode ####### (fifo) has non-zero size
inode ######## has 'compressed' flag set on an unsupported filesystem
((ad infinitum, ad nauseam))

After it finished, I noticed that the /usr/lost+found directory got populated with files and directories resembling parts of the installed Python 2.4 and 2.5 trees. Of course, missing entire directories is a really bad sign. But I could recover the affected packages I noticed with the help of apt-get install --reinstall. Those were anything aRts, Python, wine, Pango and Akregator (which I actually remembered to reinstall right now, while writing this), although I also reinstalled the Samba server I had installed just some hours before the crash, just in case. Apparently I didn't lose anything else. None of the other filesystems were affected by whatever smashed parts of /usr.

The obvious course of action then was blacklisting drm and radeon, and removing the Option "AccelMethod" "exa" line in xorg.conf, and nuking the 2.6.30-rc2 kernel image just in case. 😛 Eclipse and artsd worked again, and nothing else has shown any symptoms of being affected by fsck's decisions since then. It's been just 5 days though, and I have many apps lying around that I only use when asked/forced to, so I don't even remember their names.

I didn't say goodbye to kompmgr, however; it has pretty decent performance and marginal CPU usage even while using a shadow framebuffer, with just one CPU core enabled at half of its maximum frequency, and even if the system is under load, so I figured that I really didn't need DRI or EXA all the time (radeonhd still doesn't have an OpenGL implementation, so having DRI enabled made no difference for clients). Still, it was nice to be able to scroll a huge konsole or Iceweasel/Firefox window without experiencing flickering.

RadeonHD and kompmgr

Thanks to the help of the guys at the #radeonhd channel on Freenode.net, I could quickly find out how to make DRI (and thus, EXA) work with my laptop's onboard graphics... thingy... I'll let lspci speak for itself; it's much easier that way:

shadowm@bluecore:~$ lspci
00:00.0 Host bridge: Advanced Micro Devices [AMD] RS780 Host Bridge
[...]
01:05.0 VGA compatible controller: ATI Technologies Inc RS780M/RS780MN [Radeon HD 3200 Graphics]
[...]

Let's not forget that this is my current laptop (it was new in December), not the old, broken lappy, and I'm running Debian stable.

The thing is, with their help I installed a newer-than-Debian (or actually, newer-than-the-kernel-that-is-newer-than-Debian's) radeon drm module, and also solved a little mistake that made X lock up after resuming from suspend-to-disk with DRI and EXA enabled. Now I can enjoy some real 2D acceleration at last!

Although it seems, judging from the git repository's logs, that the official EXA and XRender acceleration support for this chipset is pretty recent, it is very stable compared to a certain other driver for another onboard chipset which I won't mention here. I could even enable KWin's composition manager (kompmgr)!

I have not tried the other big composition/window managers (e.g. compiz), nor do I intend to, as they require a hardware OpenGL interface which is not available yet for this chipset, and since they are labeled "window managers", I'm guessing that I'll definitely lose the KWin look-and-feel that kompmgr doesn't intend to replace.

Nonetheless, my very limited candy set-up has an incredibly low CPU overhead (either that or the radeonhd driver is f***ing awesome), and I can even leave it enabled while in powersave mode, and no matter what I do, I don't see any impact on other programs' performance. I couldn't say the same thing about radeonhd before getting EXA.

One really odd thing I noticed, though, is that kompmgr doesn't seem to want to have anything to do with translucency and pop-up menus. As a side effect, it is KDE (Qt3?) itself that must provide an option for that in the Control Center -> Styles page. But even if I tell it to use XRender acceleration, something doesn't fit: changes going on in windows below a pop-up menu are not seen, i.e. the pop-up menu rendering is completely static. This is unlike regular translucent windows. I don't know what benefits this design decision may bring, but it's KDE 3.5.10 anyway, and KDE 4 users would laugh at me for my obsession with not switching to 4.x.

Note that I am aware that 4.2.x has fixed a lot of the stunningly awful usability and configurability regressions seen in earlier 4.x versions. But I am still pondering why all Qt 4 widget engines are pretty slow, compared to Qt 3 - so, until I get an answer of the kind "it's just you" or "can be solved" or "has been solved already", I'll not consider switching to KDE 4 an option.

Build systems strike again

Since the introduction of the testing SCons-based build system to the Battle for Wesnoth (http://www.wesnoth.org) project, I have been annoyed repeatedly by its decreased performance in comparison with the old autotools-based system we were using, and its increased power consumption on my laptop.

I felt alone in this world... since March IIRC, until I stumbled upon our Debian packager's blog entry about it just yesterday.

I cannot deny that Loonycyborg and ESR have done a great job in making the SCons build recipe for Wesnoth better over time, but there are these tiny issues that they cannot overcome without modifying SCons' source code itself and requesting all our users to use a patched version of SCons for that. 😕 Nonetheless, it annoys me that users have to install a non-GNU tool to be able to even see the build options for the software. This is certainly one of the good things about autotools: you generate the processed recipe and it will run on any machine with the UNIX or GNU coreutils, a compatible sh* and make! Of course, assuming the author of the raw recipe (configure.ac/in and friends) did not use what is called "bad practice" in it (bashisms, silent environment requirements, etc.). I have yet to find a processed autotools recipe (Makefile.in, configure) that fails to run and do its job from a released source code distribution in any FLOSS project.

So with autotools it is rarely needed to install the recipe-processor tools (aclocal, autoconf, automake, autoheader, autopoint) if one wants to run software, not develop it. Yet with SCons and... CMake (the other candidate replacement for autotools at Wesnoth), it is necessary to install the equivalent of Apache server in the client machines for the equivalent of downloading a set of files from a public area of the server. Why?

That said, I'm personally sticking to autotools for managing builds (and have fun tailoring them to temporary needs at times!) in my personal project, Mesiga, until the GNU project comes up with a better solution... should that be possible. Moreover, I'd personally maintain the autotools recipe in Wesnoth if our Release Manager hadn't been so persistent with the "let it rot" policy.

The fact that the SCons project's homepage is filled with propaganda from big names in the software industry such as id Software and ESR is even more disturbing to me. It makes me think it... it... IT IS A TRAP! 😮 The "What makes SCons better?" section is fearsome... to me it looks like 'featuritis'. There are so many features built into SCons that they are overwhelming to me. 😕 I mean, why do users have to install this big piece of software on their machines if they just want to build some software?

Thank goodness they didn't make adding new source code files to targets harder than with autotools.

Half-assed commits

During my work on the Coordinated Wesnoth User-Made Content Development Project (which we dub "wesnoth-umc-dev" for short), I came up with an interesting concept related to Subversion's standard workflow. Half-assed commits are revision commits to the Subversion repository that are not completed due to the Subversion client (or server!) process dying unexpectedly, usually due to anything but a SIGTERM.

The obvious symptom of a half-assed commit in your local file system is a bunch of 'L' flags in the svn st command output. These can be removed with svn cleanup. So, most half-assed commits are harmless to you. However, according to the (holy) Subversion Book, they may leave garbage, half-assed transactions in the repository. These are not visible to anyone but the repository admin, of course, and should not harm anyone provided the filesystem on which the repository resides does not run out of space.
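
A quick way to spot those leftovers, as a minimal sketch: the helper below (locked_paths is a made-up name) filters svn st output for entries with an 'L' in the third column, which is where Subversion marks a locked working copy directory. It assumes the status flags end by the seventh column, which may vary between Subversion versions.

```shell
# Hypothetical helper: read `svn st` output on stdin and print the paths
# whose third status column is 'L' (working copy directory locked).
locked_paths() {
    awk 'substr($0, 3, 1) == "L" {
        path = substr($0, 8)      # assumption: status flags end by column 7
        sub(/^ +/, "", path)      # drop any padding before the path
        print path
    }'
}

# Usage, from the top of a working copy:
#   svn st | locked_paths
#   svn cleanup                   # removes the stale locks
```

Running svn st | locked_paths again after svn cleanup should print nothing if the cleanup worked.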

😐 Yesterday afternoon I ran into a more harmful and painful sort of half-assed commit. I renamed some files in my working copy, invoked svn ci, and my crappy Wireless LAN connection burped just when it was about to update the working copy with the changes introduced to the repository:

Transmitting file data ...svn: Commit failed (details follow):
svn: MERGE request failed on '/svnroot/wesnoth-umc-dev/trunk/Invasion_from_the_Unknown'
svn: MERGE of '/svnroot/wesnoth-umc-dev/trunk/Invasion_from_the_Unknown': Could not read status line: Connection reset by peer
(https://wesnoth-umc-dev.svn.sourceforge.net)
svn: Your commit message was left in a temporary file:
svn: '/home/shadowm/src/wesnoth-umc-dev/trunk/Invasion_from_the_Unknown/svn-commit.2.tmp'

Unsurprisingly, I was left with my files in an awful state that caused local conflicts with the repository. That is, the next svn update failed because the commit above was successful for the server, leaving the renamed files in the repository. SVN just didn't like that at my end, because I already had those renamed files in the working copy as a result of the svn move I had just (half-ass) committed.

Thanks for nothing, SVN! Seriously, the protocol should have the server request a final confirmation from the client before checking in the transaction, once its changes have been merged into the client's working copy. Or the inverse: have the client react in a smarter fashion to these situations that people like me often run into.

Mozilla Firefox 3.0

openSUSE 10.3 ships with Firefox 2. I switched to Firefox 3 from the "Mozilla" repository a few weeks after it came out (missed the download day/party). So far so good. Most user interface changes are nifty, except for the change of the History menu layout - the history sidebar cannot be enabled from there unlike in previous versions. CTRL-H or the View menu must be used instead. Awkward, but I can live with it thanks to keyboard shortcuts.

Flash-embedding pages (YouTube amongst others) work well. No crashes when watching Flash videos, although I use a crashy X.org display adapter driver ('radeon'... don't even ask about 'fglrx').

The trouble starts when I use seemingly simpler features that I have known since at least Firefox 1.0. I am a laptop user, and I'm often disconnected from the Internet. I use the browser's cache to read pages that I had already skimmed since I can't be bothered to keep zillions of HTML downloads in my home dir. Then problems arise.

Seemingly this version of Firefox crashes at random, especially when its session has run for a long time (t > 30 min.) and one does stuff in the page view area such as scrolling or clicking on text while the sidebar is active. What a pity, because I like the history sidebar much better than this new separate "Full History" window. A bigger pity is that the cache gets invalidated after a session crash and its contents get wiped out. Of course... I'm not the best person to judge whether this is a bug or a feature, since I don't know much about computer security; but I can tell it annoys me to the point that I have to make backups of the cache et al. after closing Firefox successfully:

$ rm -rf ~/.mozilla2 && cp -rf ~/.mozilla ~/.mozilla2

Then if Firefox 3 crashes, I restart it, close it again, and restore the Cache directory from .mozilla2/firefox/SeeminglyHashedSessionId so I can continue reading from it when I'm offline.
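
The backup-and-restore routine could be wrapped up along these lines. It is only a sketch under assumptions: backup_profile and restore_cache are hypothetical helper names, the directories are passed as arguments instead of being hard-coded, and the profile is assumed to keep its cache in a subdirectory called Cache.

```shell
# Hypothetical helper: replace the backup DEST with a fresh copy of SRC.
# Meant to be run only after Firefox has been closed cleanly.
backup_profile() {   # usage: backup_profile SRC DEST
    rm -rf -- "${2:?missing destination}" && cp -Rp -- "$1" "$2"
}

# Hypothetical helper: copy the Cache directory from a backed-up profile
# over the one in the live profile (assumes the cache is PROFILE/Cache).
restore_cache() {    # usage: restore_cache BACKUP_PROFILE LIVE_PROFILE
    rm -rf -- "${2:?missing profile}/Cache" && cp -Rp -- "$1/Cache" "$2/Cache"
}

# Usage (PROFILE stands for the seemingly hashed profile directory name):
#   backup_profile ~/.mozilla ~/.mozilla2
#   restore_cache ~/.mozilla2/firefox/PROFILE ~/.mozilla/firefox/PROFILE
```

Both helpers delete their target before copying, so the argument order matters; the ${2:?} guards only protect against a missing second argument, not against swapping the two.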

By the way, the offline cache (for any browser) seems to be often underestimated. Some time ago, after the www.wesnoth.org server crash, I got some forum pages from Firefox's cache and uploaded them to this website in a hidden directory to serve as a partial, temporary mirror, so that people had a guide to get back to work after the 2-month rollback of the forum's database.

The cache issues apart, the fact that Firefox gets crashy for nothing disturbs me. Wasn't this version supposed to be more stable than 2.0 according to the announcements? The rendering also got some performance regressions. Some pages take longer to be rendered than downloaded, especially those with heavy use of scripts. Those affected pages usually are also sluggish to scroll up/down, no matter if I disable "smooth" scrolling. I never experienced any of this with the same pages on the same OS (openSUSE 10.3), the same architecture (x86_64) and an earlier version (2.0.0.x) of Firefox.

Perhaps this whole download-day thing was just a trap. Or they paid more attention to the Windows and Mac OS X ports than to the GNU/Linux one. Or I am cursed to have bad luck for the rest of my life. Whatever it is, I don't like it, and I'm seriously considering switching to a better open source browser for Linux; Iceweasel may be it if it is a fork that is being developed on its own. I have yet to make the switch to Debian.