Error 'Software caused connection abort' in arla 0.26 and 0.25

Love lha at stacken.kth.se
Sat Jul 24 19:57:02 CEST 1999


Jeffrey Hutzelman <jhutz at cmu.edu> writes:

> Indeed, in arlad/fcache.c:try_next_fs(), the handling of VMOVED and VNOVOL
> changed between 0.22 and 0.25 (versions I happen to have on hand).
> Previously, if a fileserver returned VMOVED or VNOVOL, arla would try the
> next fileserver, if any.  Now, it gives up on the call immediately, but
> then updates its volume cache and tries again.  The new behaviour is
> correct for VMOVED, but IMNSHO try_next_fs() should still return TRUE for
> VNOVOL, since we could be talking about an RO site which doesn't have an
> online copy of the volume, and I believe the current code will retry such
> a site forever.

The thing is that I have seen VNOVOL directly after a volume moved. I think
the correct way of handling it is to choose the next volume, if one exists,
and to try to avoid talking to servers with known bad volumes.
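
A minimal sketch of that idea, assuming a small per-volume list of sites with
a "known bad" flag. The struct, the function names and MAX_SITES are all
hypothetical and are not taken from arlad/fcache.c:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SITES 8   /* hypothetical limit, not from arla */

/* Hypothetical per-volume site list; not arla's real data structure. */
struct site_list {
    size_t nsites;            /* number of fileservers holding the volume */
    bool   bad[MAX_SITES];    /* true once a site has answered VNOVOL */
};

/* Remember that site `i` does not have a usable copy of the volume. */
static void mark_bad(struct site_list *sl, size_t i)
{
    assert(i < sl->nsites);
    sl->bad[i] = true;
}

/*
 * Return the index of the next site after `cur` that is not known bad,
 * or -1 when no candidate is left, so that the caller can give up
 * instead of retrying the same site forever.
 */
static int next_good_site(const struct site_list *sl, size_t cur)
{
    assert(sl->nsites <= MAX_SITES);
    for (size_t i = cur + 1; i < sl->nsites; i++)
        if (!sl->bad[i])
            return (int)i;
    return -1;
}
```

Returning -1 once every remaining site is marked bad is what bounds the retry;
when (or whether) a bad flag should be cleared again is left open here.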

I know that the loop exists; I just haven't gotten around to fixing it.
 
> In any case, I don't think that's your problem -- if that code were broken
> _and_ leaked an error code, it would likely leak ARLA_VNOVOL (4103), not
> VNOVOL (103).  I believe the real problem in this case is that the error
> code translation is not happening, and so the special handling for VNOVOL
> is not happening.  I'll forward more details when I'm more sure of what's
> going on.

We were missing the conversion in a couple of places (mostly rx_Write); that
is fixed (hopefully) in the current code.
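
To illustrate why a missed conversion matters, here is a small sketch using
only the two values quoted above (VNOVOL = 103 on the wire, ARLA_VNOVOL =
4103 after translation). The function names are hypothetical and do not
correspond to arla's real conversion routine:

```c
#include <stdbool.h>

/* Values as quoted in this thread; the SKETCH_ prefix marks them as local. */
enum {
    SKETCH_VNOVOL      = 103,   /* raw code returned by the fileserver */
    SKETCH_ARLA_VNOVOL = 4103   /* code the rest of the client checks for */
};

/* Hypothetical translation step: map the raw code to the internal one.
 * Codes this sketch does not know about are passed through unchanged. */
static int sketch_to_arla_error(int wire_error)
{
    switch (wire_error) {
    case SKETCH_VNOVOL:
        return SKETCH_ARLA_VNOVOL;
    default:
        return wire_error;
    }
}

/* The special handling only recognises the translated code. */
static bool sketch_is_volume_missing(int err)
{
    return err == SKETCH_ARLA_VNOVOL;
}
```

If any call path returns the raw 103 without passing through the translation
step, sketch_is_volume_missing() is false for it, and the untranslated code
can leak up to the user.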

Love




