arla-0.90 on linux-2.6.19 (debian)
Tomas Olsson
tol at stacken.kth.se
Tue Jun 19 13:20:20 CEST 2007
nisse at lysator.liu.se (Niels Möller) writes:
> > Try running arlad with --tracefile=tracefile, the message trace is very
> > helpful if it happens again.
>
> Now I've observed the same (or at least a similar) problem again.
> arlad consuming lots of cpu, and stopping arla failed. Unlike the
> previous time, I can't find any "NNPFS PANIC WARNING" messages, and
> unfortunately I didn't try strace before rebooting.
>
> But I did capture a message trace file of some 4 GB. What should I do
> with that? It's a binary file, so I guess I need some special tool to
> see what's happening.
>
Something like this, IIRC:

  python arla-src/nnpfs/readtrace.py < tracefile > readable-trace

Usually, I first look at the basic chain of RPCs etc., to see what the user
is doing and how arlad responds. If I find anything odd (unexpected messages,
or warnings in syslog at some specific point in time), I try to figure out
how and when the node ended up in the unexpected state. The cause could be
something like an earlier gc message; just search backwards for the fid (or
its parent's).
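
With a trace of several GB, the backwards search is easier to script than to do in a pager. Below is a minimal sketch of one way to do it, assuming that the converted trace is plain text with one message per line and that the fid appears in it as a literal string. I have not checked this against the actual readtrace.py output, and the function name and parameters are made up for illustration.

```python
from collections import deque


def matches_before(lines, needle, stop_line, keep=20):
    """Return up to `keep` (line_number, text) pairs that contain `needle`
    and occur at or before the 1-based line `stop_line`, newest first.

    Assumption: `lines` yields one trace message per line of text, and
    `needle` is the fid written exactly as it appears in that text.
    """
    # A bounded deque keeps memory use flat even for a multi-GB trace.
    recent = deque(maxlen=keep)
    for number, text in enumerate(lines, start=1):
        if number > stop_line:
            break
        if needle in text:
            recent.append((number, text.rstrip("\n")))
    # Reverse so the result reads backwards from the point of interest.
    return list(reversed(recent))


# Hypothetical usage; the file name, fid string and line number are
# placeholders:
#
#   with open("readable-trace", errors="replace") as trace:
#       for number, text in matches_before(trace, "<fid>", 1234567):
#           print(f"{number}: {text}")
```

Here stop_line would be the line where the odd message or syslog warning shows up. Running it again with the parent's fid covers the other case.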
thanks
/t