Christian Reis lives here

I know you know. But well, just in case you forgot..

Since 2004, I've been actively involved in development of Launchpad, and in 2005 I became application manager for the project, together with Steve Alexander. These days I lead a team of over 30 people at Canonical working on building a platform for the future of open source development and collaboration.

In 2003, I somehow managed an MSc degree from USP São Carlos, where I wrangled out my dissertation on defining a Process Model for Free Software Projects. My MSc project is described in two long documents (in Portuguese). I graduated in Computer Engineering from UFSCar in 1997, though most of that time evaporated into swimming pools and bike trails.

A couple of years ago (just as I had decided I wanted nothing to do with computers) I discovered Free Software and Unix, and I've been working on both ever since. I've contributed to dozens of free software projects, and I am currently an active developer for Bugzilla, PyGTK, ZODB, Kiwi and IndexedCatalog. In the past few years I've also worked with Web development (who hasn't?) and Usability.

I am a partner at Async Open Source, a company that provides development and consulting services focused on Free Software. I helped found Async in early 1999.

When I'm not pretending to be a software engineering manager I engage in outdoor sports, travelling, language and vain philosophy. I've raced mountain bikes for a couple of years now, and from 1999 to 2003 I raced a number of national-level adventure races, including the multi-day EMA 2000 and 2001.

Getting in touch with me

Online: Homepage (~kiko)
<kiko at async.com.br>
Phones: +55 16 3376 0125 work
+55 16 9112 6430 mobile
Home: (map) Rua Rui Barbosa 1977
Sao Carlos, SP
Brazil 13560-330

What he's been up to

15.12.2009 Evolution and Google Calendar
  • This works the way you'd expect: you just add a new calendar in Evolution, and it syncs automatically to the calendar applet. But there are a few caveats:
  • Google provides you with https:// URLs; Evolution expects them to start with webcal:// and will unhelpfully tack on the method in front of the URL for you, so you end up with something that starts with [https] -- yuck. Just delete the https:// in front of the address and everything works (though I am surprised it does, as I'd expect it to be SSL-only -- oh well). I didn't use Secure connection and it still worked -- don't know if it should but I'm not pretending to know what I'm doing here!
  • If it doesn't work, I bet your system provides settings for a network proxy. For some reason, this doesn't work for calendaring -- perhaps [bugs.edge.launchpad.net] has something to do with it -- but the workaround is pretty easy, just don't use a proxy for Evolution, or provide details there manually.
  • Whatever you change, it appears you have to kill the additional processes Evolution spawns for them to actually pick up the changes. At least I did knock out evolution-data-server and evolution-alarm-notify for good measure.
  • If something goes wrong, you get notified of a problem with the ics URL through the statusbar, which might be a bit unexpected -- if you click on the icon in the statusbar you get a dialog with more details -- in my case, a connection problem.
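The URL fix-up from the first caveat can be sketched in a few lines of Python. This is only an illustration of the string manipulation, not anything Evolution provides: the function name and the example.com address are made up, and the assumption (taken from the description above) is that Evolution prepends webcal:// to whatever you paste.

```python
def strip_scheme(url):
    """Return the address without a leading http:// or https://.

    Assumption: Evolution adds webcal:// itself, so the scheme
    Google hands you has to go before pasting the address in.
    """
    for scheme in ("https://", "http://"):
        if url.startswith(scheme):
            return url[len(scheme):]
    # Nothing to strip; hand the address back untouched.
    return url


# Hypothetical calendar address, for illustration only.
bare = strip_scheme("https://example.com/calendar/basic.ics")
print(bare)
print("webcal://" + bare)
```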
10.12.2009 WCDMA and UMTS
  • For Brazilians, the important thing to remember is that 3G frequencies vary according to operator and location, and [www.gdhpress.com.br] tells you more about that sort of thing. In general, for my operator, Claro, North American versions work, but that means no E52 for me, since the US version isn't out yet.
09.12.2009 mdadm, USB and SCSI, UUIDs and initramfs-tools
  • We've had a problem with our Ubuntu server since last year, when I bought an external USB drive to handle rdiff-backups: if the USB drive is plugged in when the bootup process starts, the RAID-5 arrays we have get confused because the drive names change. Normally sda is the first SCSI drive, but since probing is asynchronous, when the USB drive is detected first it becomes sda, and mdadm fails into an initramfs shell. That's not nice!
  • A separate problem we were also running into was that the spare partitions, which normally live in /dev/sde, were not being automatically added to the array (even when the USB drive wasn't connected and the bootup succeeded) so I had to add an ugly rc.local command to add them in. This is similar to the issues found at [bugs.edge.launchpad.net] and [ubuntuforums.org] but not quite the same, as mdadm.conf's DEVICES entry actually included the spare drive.
  • So from 7pm to 11pm yesterday Johan and I worked on figuring out exactly what was causing this. And it turns out that it was a combination of two things, both related: the mdadm.conf ARRAY definitions, and a race condition during the initramfs process between the RAID autostart and the module-based kernel hardware probing. Here's how it works.
  • Ubuntu starts and mounts its RAID arrays in an initramfs. The initramfs image is packed with a set of shell scripts; the image is assembled from a bunch of code in /usr/share/initramfs-tools, primarily scripts with additional hooks that are run during the initramfs image generation and which modify the image itself. Anyway, while in the initramfs, the following happens:
  • Between initramfs' init-top and init-premount phases, it loads all essential modules. It knows to load the RAID modules through the work of hooks/mdadm (installed by the mdadm package), which adds them to the essential module list inside the initramfs. It knows to load the USB and SCSI modules because by default Ubuntu uses MODULES=most in its initramfs configuration.
  • Now, module loading order is deterministic, but since in Ubuntu the SCSI probing is set to asynchronous (via the kernel option CONFIG_SCSI_SCAN_ASYNC=y; see [lwn.net] for details), the actual drives only show up as the bus scans are finalized.
  • Meanwhile, in init-premount, a udev script fires off udevd which goes away and kicks off 85-mdadm.rules' RAID assembly, using mdadm --incremental. This is a pretty magical mdadm mode, and a read through the manpage section is pretty interesting.
  • Also in init-premount, an mdadm script checks for degraded RAID and LVM devices, and figures out whether or not to try and run degraded arrays. It does this by scanning the RAID superblocks using mdadm --misc --scan --detail.
  • After loading the modules, a script is run to mount the local filesystems; on non-NFS systems this script is called "local", and this is where we see if the root device is present and mountable.
  • After all this is done, the init-bottom scripts are run; here is where the udevd process is killed and whatever handling of the available hardware stops until udevd is started again.
  • The problem we are running into is that the initial SCSI bus probe is done in a flurry of asynchronous activity; if you stop and read your kern.log after a reboot you'll find out just how random the ordering is. What can happen on systems with lots of drives (such as ours, which has 6) is a race between the SCSI bus scanning, udev's triggering of 85-mdadm.rules and the killing of udev after the root filesystem is mounted. The race happens because the bus scan is asynchronous, which in turn means that the devices (which, if you noticed above, are being added via mdadm --incremental) might take too long to show up -- long enough that udevd is killed and the real init starts trying to mount the rest of the (still incomplete) local filesystems.
  • To solve this, we simply added a script which sleeps for 15s in initramfs' local-premount; this is enough time for the hardware probing and udev rules to complete firing, ensuring that all RAID devices have been assembled and are available for mounting once /sbin/init kicks in. Simple but does the trick. It is likely that changing the scsi_mod scan option to "sync" would also solve this part of the problem. Yet another option would be to load the scsi_wait_scan module in local-premount.
  • We had an additional problem, which was that our mdadm.conf file specified sdX-style partition names for each ARRAY:
     ARRAY /dev/md2 devices=/dev/sda5,/dev/sdb5,/dev/sdc5,/dev/sdd5
    This doesn't play well with mdadm's --incremental mode when the drive order is changing around -- so when the USB drive appeared as /dev/sda the array could never be initialized and we failed to mount the root device. Changing the ARRAY line to refer to the MD device through a UUID solved this problem:
     ARRAY /dev/md2 UUID=b3b855d3:d8ee85f4:2ee19fc3:ff71564e
    As a nice side-effect once we made this change, mdadm --incremental started assembling our spare devices into our arrays: I had never been able to specify spares using the devices= syntax.
  • I suspect that in part [bugs.edge.launchpad.net] is caused by the on-demand nature of the module loading, but I'm not entirely sure -- perhaps specifying an explicit DEVICE line in mdadm.conf causes each device to be hit, which in turn means the right modules are probed. But I think it's actually a red herring, and that in fact the problem is with the modules Jan was missing in the initramfs.
  • Finally, it is strange but true that in the cases where mounting non-root local filesystems failed because of the udevd race, udevd in userspace never kicked the incremental mdadm again to finalize the running arrays, and running cat /proc/mdstat from the prompt showed they were left incomplete or lacking spares. I'm not sure why this happens.
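For anyone with several arrays to convert, the devices= to UUID= change described above could be scripted roughly like this. It is a minimal sketch, not something we actually ran: the function name is made up, the UUID mapping is assumed to be collected by hand (for instance from the ARRAY lines that mdadm --detail --scan prints), and only ARRAY lines listed in the mapping are touched.

```python
def use_uuids(conf_text, uuids):
    """Rewrite mdadm.conf ARRAY lines to identify arrays by UUID.

    uuids maps an md device name (e.g. "/dev/md2") to its UUID string.
    Any devices=... field on a rewritten line is dropped; other fields
    and all non-ARRAY lines are kept as they are.
    """
    result = []
    for line in conf_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "ARRAY" and fields[1] in uuids:
            # Keep every field except the sdX-style device list.
            kept = [f for f in fields[2:] if not f.startswith("devices=")]
            line = " ".join(["ARRAY", fields[1], "UUID=" + uuids[fields[1]]] + kept)
        result.append(line)
    return "\n".join(result)


old = "ARRAY /dev/md2 devices=/dev/sda5,/dev/sdb5,/dev/sdc5,/dev/sdd5"
print(use_uuids(old, {"/dev/md2": "b3b855d3:d8ee85f4:2ee19fc3:ff71564e"}))
```

Remember that the initramfs carries its own copy of mdadm.conf, so the image presumably needs regenerating after a change like this.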
08.12.2009 Bad source addresses on my local server
  • Since a few reboots ago we ran into a pretty annoying issue on our server: packets originating on the server that were delivered to the server itself, handled by lo, were using the wrong source address. In more detail: our internal address for the server is 192.168.99.4. If, while sitting on the server, I did:
     kiko@anthem:~$ telnet 192.168.99.4 80
     Trying 192.168.99.4...
    the connection was never completed. If I looked at a tcpdump trace
     17:18:17.439479 IP 189.x.x.x.42826 > 192.168.99.4.80: S 2220677816:2220677816(0) win 32792 mss 16396,sackOK,timestamp 6978218 0,nop,wscale 6
    it became clear that the wrong source address was being selected. But why?
  • I spent a lot of time reading and rereading the source address selection descriptions at [linux-ip.net] and couldn't figure it out. Everything in my routing tables was kosher -- and even if I cleared out all our fancy ip balancing rules and used plain turkey kernel and default gateway rules, the address was wrong. The src hints were there. There was no funky ordering issue. So why was the address wrong?
  • I finally read a slightly unrelated post at [lists.unix-ag.uni-kl.de] that gave me another idea: NAT! I hadn't thought of this possibility, nor did I recall changing anything there, but it was worth a try.
  • Turns out it was exactly that. I had a rule which said that:
     -t nat -A POSTROUTING -s 192.168.99.0/24 -o !eth1 -j MASQUERADE
    This would have worked perfectly if the packets were actually going to eth1. However, in the case where you are on the server connecting to itself, the packets go to the lo interface, but with eth1's source address 192.168.99.4, which matches that rule -- oops.
  • There are various solutions to this problem -- the one I went with was simple, specifying a destination network of ! -d 192.168.99.0/24 -- in other words, we only masquerade packets that are actually meant to be routed elsewhere. I could also have specified two ACCEPT rules that came before the MASQUERADE rule, avoiding masquerading for eth1 and lo simultaneously.
  • The trickiest thing with masquerading is that changing the rules takes a while to actually kick in -- it's not instantaneous. So if you change the rule and run the test immediately, it will fail -- but if you wait a little bit you'll see it's actually fixed. Gar!
  • The morals to the story are
  • a) source address selection is also affected by IP masquerading even if the documentation doesn't remind you of that
  • b) when a host connects to itself, the lo interface is always used, even if the address being used to connect is not 127.0.0.1, and
  • c) it takes a while for iptables rule changes to actually take effect, so wait a while before actually testing them!
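To see why the original rule caught loopback traffic, here is a toy model of the two match conditions in Python. It only mimics the matching logic described above and has nothing to do with how netfilter is implemented. The function names are made up, eth0 stands in for whatever the external interface is called, and I'm assuming the fixed rule keeps the original -s and -o conditions and simply adds the ! -d test.

```python
from ipaddress import ip_address, ip_network

LAN = ip_network("192.168.99.0/24")


def old_rule_matches(src, dst, out_iface):
    """Model of: -s 192.168.99.0/24 -o !eth1 -j MASQUERADE"""
    return ip_address(src) in LAN and out_iface != "eth1"


def new_rule_matches(src, dst, out_iface):
    """Same rule with ! -d 192.168.99.0/24 added (assumed form of the fix)."""
    return old_rule_matches(src, dst, out_iface) and ip_address(dst) not in LAN


# The server talking to itself: the packet leaves via lo, not eth1.
print(old_rule_matches("192.168.99.4", "192.168.99.4", "lo"))  # masqueraded: the bug
print(new_rule_matches("192.168.99.4", "192.168.99.4", "lo"))  # left alone

# A LAN host reaching the outside world is still masqueraded
# (192.0.2.1 is a documentation address, eth0 a hypothetical interface).
print(new_rule_matches("192.168.99.7", "192.0.2.1", "eth0"))
```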
03.12.2009 Google Calendar on the Ubuntu Desktop n more
  • [johnnyjacob.wordpress.com]
  • When your sound goes bad, "sudo alsa force-reload" to the rescue!
  • When your xchat completion is weird, use "/set completion_amount 0"
23.11.2009 Git crack HEADs
  • Maybe git knows that I don't actually want to be using it:
     kiko@baratinha:~$ git clone [gitorious.org]
     Initialized empty Git repository in /home/kiko/x11-maemo/.git/
     remote: Counting objects: 241288, done.
     remote: Compressing objects: 100% (73222/73222), done.
     remote: Total 241288 (delta 191071), reused 214443 (delta 165447)
     Receiving objects: 100% (241288/241288), 186.11 MiB | 64 KiB/s, done.
     Resolving deltas: 100% (191071/191071), done.
     warning: remote HEAD refers to nonexistent ref, unable to checkout.
  • I wonder what it means when I have just spent 30 minutes downloading revisions to end up with no working tree! #$!@#@
(Read older diary entries)

Complain to me if anything's broken, please?