arla-related hangs/pauses

Love lha at stacken.kth.se
Wed Oct 9 12:07:14 CEST 2002


Nickolai Zeldovich <kolya at mit.edu> writes:

> I've been seeing occasional hangs in accesses to files/directories in
> AFS lately on my FreeBSD 4.6.2 machine running arla 0.35.10pre4.  Any
> process that tries to access AFS hangs for a long period of time, from
> 10-15 seconds to a few minutes.  Then all the processes un-freeze and
> everything returns to normal for a while.  I suspect this is related
> to the fact that Google is indexing my AFS cell through this machine,
> but I hoped arla could handle this..

If you do a ps axlwww, what do the processes sleep on (xfs or xfs_lock)?
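
A rough sketch of how that check could be tallied, in case it is easier than
reading the listing by eye. It assumes the ps output starts with a header line
containing a WCHAN column and that every column before WCHAN is non-empty on
each row, so that splitting on whitespace lines up. count_wait_channels is a
hypothetical helper (not part of arla), and the sample rows are made up for
illustration only.

```python
from collections import Counter

# Made-up sample in the shape of `ps axl` output; not real data.
SAMPLE = """\
  UID   PID  PPID CPU PRI NI   VSZ  RSS WCHAN  STAT  TT       TIME COMMAND
    0   101     1   0   2  0  1234  567 select Ss    ??    0:01.00 arlad
 1001   202   150   0  -4  0   900  300 xfs    D     p0    0:00.01 ls /afs
 1001   203   150   0  -4  0   900  300 xfs    D     p1    0:00.01 cat /afs/x
"""


def count_wait_channels(ps_output: str) -> Counter:
    """Count processes per wait channel (WCHAN) in `ps axl`-style text."""
    lines = ps_output.strip().splitlines()
    if not lines:
        return Counter()
    # Locate the WCHAN column from the header; raises ValueError if absent.
    wchan_idx = lines[0].split().index("WCHAN")
    counts = Counter()
    for line in lines[1:]:
        fields = line.split()
        # Skip rows too short to carry a WCHAN field.
        if len(fields) > wchan_idx:
            counts[fields[wchan_idx]] += 1
    return counts


counts = count_wait_channels(SAMPLE)
for channel in ("xfs", "xfs_lock"):
    print(channel, counts[channel])
```

Feeding it the text captured from ps axlwww during a hang should show
whether the stuck processes pile up on xfs or on xfs_lock.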

> At first I tried increasing --workers from 16 to 64 (I got syslogs
> about running out of workers), but it didn't help, and I don't get
> such syslogs anymore.  I'm getting different messages now (see below)
> but I suspect they aren't actually relevant (AFAICT they're because
> something tried to fetch a file larger than the cache).

That would fail directly.

I think the problem is that the cleaner isn't aggressive enough, or that it
fails because it is too quick to abort. It might help to insert a
xfs_send_message_version() between the state changes, instead of doing
LWP_Dispatch/IOMGR_Poll.

> Any thoughts on what might cause such lock-up behavior, and if there
> are any good ways to work around it (aside from "don't access AFS
> so much")?

How many fcache-nodes are you using?

Love

More information about the Arla-drinkers mailing list