Debugging arla on FreeBSD 4.0
Robert Ricci
ricci at siren.eng.utah.edu
Sat Apr 1 05:24:46 CEST 2000
Well, I traced the function calls back to process_message(), and
discovered that:
At the point where xfs_message_receive() was called (leading to
the eventual assertion failure), header->size was 40, but
msg_len was only 16. To me this would suggest that arlad was
somehow passed an invalid message by the kernel module. Looking
at the calling stack frame, process_message() was called with a
msg_len of 65536. Coincidentally, this is MAX_XMSG_SIZE.
Hmm....
Looks to me like the queue for the channel filled up, with more
than 2^16 bytes of messages to send. The message that caused the
core dump was sent by xfs_reclaim, so maybe this happened during
a period of cache cleaning?
I'm still looking at the code, so I'll let you know if I have
any other insights.
Thus spake Assar Westerlund on Sat, Apr 01, 2000 at 01:47:34AM +0200:
> Robert P Ricci <ricci at eng.utah.edu> writes:
> > I've had arlad 0.32 crash on me a few times in the past weeks.
> > After examining a core dump from the latest crash, I was able
> > to track the problem down to a failed assertion in volcache.c,
> > line 588:
> >
> >     if (db_servers == NULL || num_db_servers == 0) {
> >         arla_warnx (ADEBWARN,
> >                     "Cannot find any db servers in cell %d(%s) while "
> >                     "getting data for volume `%s'",
> >                     cell, cell_num2name(cell), name);
> > ----->  assert (cell_is_sanep (cell)); <-----
> >         return ENOENT;
> >     }
> > }
> >
> > Any ideas on how arlad could get into this state, or what code I
> > should look at to investigate it further? Thanks!
>
> I assume that `cell' (and the rest of the parameters) are garbage? If
> that's the case, I'm afraid that my theory is that something is
> sending down a bogus Fid to the volume cache and that's why it's
> crashing there. Can you tell us from where the bogus information has
> been propagated, i.e. where is the source of the bogus information
> that get_info_loop() has gotten?
>
> /assar
--
/-----------------------------------------------------------
| Robert Ricci - <ricci at eng.utah.edu>
| University of Utah - CADE Lab operator
| "Boredom comes to those who wait" - The Pietasters
\-----------------------------------------------------------