arla-devel port for FreeBSD (was: Patches to get Arla running on FreeBSD 8-CURRENT)
Tomas Olsson
tol at stacken.kth.se
Fri Mar 7 07:43:46 CET 2008
On Thu, 2008-03-06 at 20:49 -0600, Alec Kloss wrote:
> Anyway, Tomas, or others, do you have any hints for me about how
> best to start diagnosing and maybe fixing issues? The most
> repeatable way I've found to get bad behavior is to rsync -a
> /usr/src and /usr/obj into AFS. After 30 seconds or so of this,
> I'll start getting messages like these:
>
> lockmgr: thread 0xc6970840 unlocking unheld lock
> lockmgr: thread 0xc6970840 unlocking unheld lock
> lockmgr: thread 0xc6970840 unlocking unheld lock
> lockmgr: thread 0xc6970840 unlocking unheld lock
> lockmgr: thread 0xc6970840 unlocking unheld lock
>
> on the console. Eventually, rsync will block and generally things
> will decay. Overnight, I'm going to script the console while
> attempting this with nnpfsdeb almost-all set. This is, of course,
> a lot slower than arla normally runs, but I'm hoping someone may be
> able to see the source of the trouble. I'll post the console
> somewhere tomorrow.
>
> Anyway, any hints about debugging arla would be welcome.
>
Some random thoughts:
* If you don't have it yet, get a debug kernel with full vfs sanity
checking etc.
* Set a breakpoint (or panic) at the lockmgr printf and inspect stack
trace and other live threads.
* See if you can run into similar problems using arla's tests; if
you're lucky, there will be a faster way to trigger it.
* Perhaps you can cut down on almost-all. Not sure how much. Of course,
there's always the risk that timing changes with nnpfsdebug on.
* Try arlad --tracefile=foo.trace (in the cache dir) and, when you're
done, pipe the trace file through nnpfs/readtrace.py to decipher it.
It's fast and gives a complete log of arlad-nnpfs communication.
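
For the debug kernel, a rough sketch of the kind of options meant, as
additions to a custom kernel config file. The exact set is an assumption
about what "full vfs sanity checking" should cover, so check each option
against sys/conf/NOTES for your source tree before relying on it:

```
# Assumed additions to a custom kernel config; verify against sys/conf/NOTES
makeoptions     DEBUG=-g            # build the kernel with debug symbols
options         KDB                 # kernel debugger framework
options         DDB                 # in-kernel debugger, for stack traces
options         INVARIANTS          # extra run-time consistency checks
options         INVARIANT_SUPPORT   # support code required by INVARIANTS
options         WITNESS             # lock order verification
options         DEBUG_LOCKS         # record extra lockmgr debugging state
options         DEBUG_VFS_LOCKS     # vnode locking assertions
```

With DDB compiled in, a panic or breakpoint at the lockmgr printf drops
to the debugger, where the stack trace and other threads can be inspected.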
Hope this helps
/t