arla-0.35.9 FreeBSD crash
Nickolai Zeldovich
kolya at mit.edu
Tue Sep 3 05:36:36 CEST 2002
I upgraded my FreeBSD remote-login server to arla 0.35.9 recently,
and today found that arlad had crashed on a failed assert. The gdb output
is attached. Probably the right thing to do is to avoid an assert
in collectstats_stop(). In general, I've always been surprised by
the number of asserts in arla on data received from the network --
a misbehaving or malicious fileserver could easily supply inconsistent
records that could crash arlad.
At a guess, perhaps what happened here is that the volume got moved
between the collectstats_start() and _stop() calls?
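
To make the suggestion concrete, here is a minimal sketch of a stats path that tolerates a server missing from the volume entry instead of asserting. It is not arla's actual code: struct vol_entry, find_partition() and stats_stop_tolerant() are hypothetical names, the struct is a cut-down stand-in for the entry in the gdb dump, and both addresses are assumed to be in the same byte order.

```c
#include <stdint.h>
#include <stdio.h>

#define MAX_SERVERS 13  /* the arrays in the gdb dump have 13 slots */

/* Hypothetical, cut-down stand-in for the volume entry in the dump. */
struct vol_entry {
    int32_t nServers;
    int32_t serverNumber[MAX_SERVERS];
    int32_t serverPartition[MAX_SERVERS];
};

/* Return the partition for `host`, or -1 if the entry does not list it.
 * nServers comes from the network, so it is range-checked first. */
static int
find_partition(const struct vol_entry *e, uint32_t host)
{
    int i;

    if (e->nServers < 0 || e->nServers > MAX_SERVERS)
        return -1;
    for (i = 0; i < e->nServers; i++)
        if ((uint32_t)e->serverNumber[i] == host)
            return e->serverPartition[i];
    return -1;
}

/* Return 1 if a sample would be recorded, 0 if it was skipped.
 * A missing server is logged and ignored rather than treated as fatal. */
static int
stats_stop_tolerant(const struct vol_entry *e, uint32_t host)
{
    int partition = find_partition(e, host);

    if (partition == -1) {
        fprintf(stderr, "stats: host 0x%08lx not in volume entry, skipping\n",
                (unsigned long)host);
        return 0;
    }
    /* ... record the measurement for (host, partition) here ... */
    return 1;
}
```

With an entry like the one in the dump (nServers = 1, serverNumber[0] = -1421865308, serverPartition[0] = 2), a lookup for any other host returns -1 and the sample is dropped, so a moved volume costs one statistic rather than the whole daemon.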
-- kolya
orbit# gdb -q ../libexec/arlad arlad.core.0
Core was generated by `arlad'.
Program terminated with signal 6, Abort trap.
Reading symbols from /usr/lib/libkvm.so.2...done.
Reading symbols from /usr/lib/libc.so.4...done.
Reading symbols from /usr/libexec/ld-elf.so.1...done.
#0 0x280fa9a4 in kill () from /usr/lib/libc.so.4
(gdb) frame 3
#3 0x804e967 in collectstats_stop (p=0x866eb30, entry=0x82e8f98,
    conn=0x810a268, measure_type=1, measure_items=1) at fcache.c:309
309 assert(partition != -1);
(gdb) p/x conn->host
$1 = 0x6c0e40ab
(gdb) p entry->volume->entry
$2 = {name = "user.ricka", '\000' <repeats 54 times>, nServers = 1,
  serverNumber = {-1421865308, 0 <repeats 12 times>}, serverPartition = {2,
    0 <repeats 12 times>}, serverFlags = {4, 0, 0, 0, 1521464, 337444, 2,
    1472696, 509880, 510272, 1, 18511816, 4096}, volumeId = {2003439208,
    2003439209, 2003439210}, cloneId = 0, flags = 20480, spares1 = 473088,
  spares2 = 1, spares3 = 9, spares4 = 1, spares5 = 1, spares6 = 3379088,
  spares7 = 777872, spares8 = 7, spares9 = 0}
(gdb) p/x entry->volume->entry.serverNumber[0]
$3 = 0xab400ea4
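
The two hex values above can be turned into dotted quads for comparison. The sketch below does that under assumptions the dump does not confirm: that the core came from a little-endian machine, that conn->host holds the address in network byte order, and that serverNumber[] holds it in host byte order. The helper names are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* Format a host-order value: the most significant byte is the first octet. */
static void
fmt_host_order(uint32_t v, char *buf, size_t len)
{
    snprintf(buf, len, "%u.%u.%u.%u",
             (unsigned)((v >> 24) & 0xffu), (unsigned)((v >> 16) & 0xffu),
             (unsigned)((v >> 8) & 0xffu), (unsigned)(v & 0xffu));
}

/* Format a network-order field as printed by gdb on a little-endian
 * machine: the least significant byte of the printed value is the
 * first octet on the wire. */
static void
fmt_net_order_le(uint32_t v, char *buf, size_t len)
{
    snprintf(buf, len, "%u.%u.%u.%u",
             (unsigned)(v & 0xffu), (unsigned)((v >> 8) & 0xffu),
             (unsigned)((v >> 16) & 0xffu), (unsigned)((v >> 24) & 0xffu));
}
```

Under those assumptions, fmt_net_order_le(0x6c0e40ab, ...) gives 171.64.14.108 and fmt_host_order(0xab400ea4, ...) gives 171.64.14.164. If the byte-order assumptions hold, the connection's host and the only server listed in the volume entry differ in the last octet, which would be consistent with the lookup finding no match and leaving partition at -1.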