arla-0.35.9 FreeBSD crash

Nickolai Zeldovich kolya at mit.edu
Tue Sep 3 05:36:36 CEST 2002


I recently upgraded my FreeBSD remote-login server to arla 0.35.9,
and today found that arlad had crashed on a failed assert.  The gdb
output is attached.  The right fix is probably to avoid the assert
in collectstats_stop().  In general, I've always been surprised by
the number of asserts in arla on data received from the network --
a misbehaving or malicious fileserver could easily supply inconsistent
records that would crash arlad.
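
To illustrate the kind of change meant here, a minimal sketch follows. It is
not arla's actual code: struct vol_entry, find_partition() and
stats_stop_checked() are hypothetical names, and only the nServers,
serverNumber and serverPartition fields mirror what the gdb dump shows. The
idea is to keep assert for caller bugs and return an error for anything
derived from network data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SERVERS 13  /* the gdb dump shows 13-element arrays */

/* Hypothetical, trimmed-down stand-in for the cached volume entry. */
struct vol_entry {
    int32_t nServers;
    int32_t serverNumber[MAX_SERVERS];
    int32_t serverPartition[MAX_SERVERS];
};

/* Return the partition the volume has on `host`, or -1 if `host`
 * is not among the servers listed in the entry. */
static int
find_partition(const struct vol_entry *e, uint32_t host)
{
    int i;

    assert(e != NULL);  /* a caller bug, not network data: assert is fine */

    /* nServers came off the wire, so bounds-check it before use. */
    if (e->nServers < 0 || e->nServers > MAX_SERVERS)
        return -1;

    for (i = 0; i < e->nServers; i++)
        if ((uint32_t)e->serverNumber[i] == host)
            return e->serverPartition[i];
    return -1;
}

/* Returns 0 if the sample would be recorded, -1 if it was skipped. */
static int
stats_stop_checked(const struct vol_entry *e, uint32_t host)
{
    int partition = find_partition(e, host);

    if (partition == -1)
        return -1;  /* drop the sample instead of assert(partition != -1) */

    /* ... record the measurement for (host, partition) here ... */
    return 0;
}
```

Losing one statistics sample when the server list no longer matches the
connection seems a better trade than aborting the whole daemon.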

My guess is that the volume was moved between the collectstats_start()
and collectstats_stop() calls?

-- kolya

orbit# gdb -q ../libexec/arlad arlad.core.0
Core was generated by `arlad'.
Program terminated with signal 6, Abort trap.
Reading symbols from /usr/lib/libkvm.so.2...done.
Reading symbols from /usr/lib/libc.so.4...done.
Reading symbols from /usr/libexec/ld-elf.so.1...done.
#0  0x280fa9a4 in kill () from /usr/lib/libc.so.4
(gdb) frame 3
#3  0x804e967 in collectstats_stop (p=0x866eb30, entry=0x82e8f98, 
    conn=0x810a268, measure_type=1, measure_items=1) at fcache.c:309
309	    assert(partition != -1);
(gdb) p/x conn->host
$1 = 0x6c0e40ab
(gdb) p entry->volume->entry
$2 = {name = "user.ricka", '\000' <repeats 54 times>, nServers = 1, 
  serverNumber = {-1421865308, 0 <repeats 12 times>}, serverPartition = {2, 
    0 <repeats 12 times>}, serverFlags = {4, 0, 0, 0, 1521464, 337444, 2, 
    1472696, 509880, 510272, 1, 18511816, 4096}, volumeId = {2003439208, 
    2003439209, 2003439210}, cloneId = 0, flags = 20480, spares1 = 473088, 
  spares2 = 1, spares3 = 9, spares4 = 1, spares5 = 1, spares6 = 3379088, 
  spares7 = 777872, spares8 = 7, spares9 = 0}
(gdb) p/x entry->volume->entry.serverNumber[0]
$3 = 0xab400ea4
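
The two hex values above can be read as IPv4 addresses. The sketch below
assumes conn->host is held in network byte order on a little-endian machine
and that serverNumber[0] is in host byte order; the dump does not confirm
either assumption, and swap32() and fmt_ipv4() are made-up helper names.

```c
#include <stdint.h>
#include <stdio.h>

/* Reverse the byte order of a 32-bit value. */
static uint32_t
swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Format a host-order IPv4 address as a dotted quad.
 * `buf` should hold at least 16 bytes. */
static void
fmt_ipv4(uint32_t h, char *buf, size_t len)
{
    snprintf(buf, len, "%u.%u.%u.%u",
             (unsigned)((h >> 24) & 0xff), (unsigned)((h >> 16) & 0xff),
             (unsigned)((h >> 8) & 0xff), (unsigned)(h & 0xff));
}
```

Under those assumptions, fmt_ipv4(swap32(0x6c0e40ab), ...) gives
171.64.14.108 and fmt_ipv4(0xab400ea4, ...) gives 171.64.14.164: two
different hosts, which would be consistent with the volume entry no longer
listing the server the connection was opened to.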




