two bugs and a patch
Nickolai Zeldovich
kolya at mit.edu
Fri Jan 5 05:41:16 CET 2001
It looks like rxi_DecongestionEvent decrements the refcount on its
rx_peer too early, so the peer can potentially be recycled while the
event handler is still running. (arlad crashed on me once earlier today,
and the rx_peer was filled with 0xAA, which my osi_Free memsets all
freed memory to.)
Attached below is a patch which should fix this.
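
To make the ordering problem concrete, here is a minimal sketch of the
pattern the patch aims for: the handler drops its reference only after
its last access to the peer. Everything in it is a simplified stand-in
and an assumption for illustration (struct peer, poison_free,
peer_try_reap and decongestion_event are hypothetical names, not the
real rx code), and it ignores locking entirely.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct rx_peer; only the fields used here. */
struct peer {
    int refCount;   /* references held by pending events and others */
    int burst;
    int burstSize;
};

/* Poisoning free in the spirit of the osi_Free described above:
 * fill the memory with 0xAA before releasing it, so that any later
 * use of a stale pointer is easy to spot in a debugger. */
static void poison_free(void *p, size_t size)
{
    memset(p, 0xAA, size);
    free(p);
}

/* Hypothetical reaper: recycles a peer only when nobody holds a
 * reference to it. Returns 1 if the peer was freed, 0 otherwise. */
static int peer_try_reap(struct peer *p)
{
    if (p->refCount > 0)
        return 0;
    poison_free(p, sizeof(*p));
    return 1;
}

/* Event handler sketch. Whoever scheduled the event is assumed to have
 * bumped refCount; the handler gives that reference back on its way
 * out, after it has finished touching the peer. */
static void decongestion_event(struct peer *p, int nPackets)
{
    assert(p->refCount > 0);

    p->burst += nPackets;
    if (p->burst > p->burstSize)
        p->burst = p->burstSize;

    /* ... work that may let other threads (and the reaper) run ... */

    p->refCount--;  /* last access to p in this function */
}
```

With refCount held at 1 for the pending event, peer_try_reap() refuses
to free the peer; once decongestion_event() has returned, refCount is 0
and the peer can be recycled safely.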
I also seem to have run into a bug with arla on FreeBSD 4.2, where
the contents of a directory don't seem to be revalidated, even after
the callback from the fileserver has been broken. While it doesn't seem
to be perfectly deterministic, at the moment I'm seeing this:
freebsd% cd /afs/zepa.net/user/kolya
freebsd% ls | grep -c -w Q
0
aix% cd /afs/zepa.net/user/kolya
aix% touch Q ; ls | grep -c -w Q
1
freebsd% ls | grep -c -w Q ; ls Q
0
Q
freebsd% cd / ; ls /afs/zepa.net/user/kolya | grep -c -w Q
1
Another bug I've run into so far is a concurrency problem on
rx_connHashTable, in particular in rxi_ReapConnections. Here's some
information on the crash (as I mentioned above, my osi_Free memsets
all freed memory to 0xAA):
(gdb) bt
#0 0x8086b79 in rxi_CheckCall (call=0xaaaaaaaa) at rx.c:3237
#1 0x8087669 in rxi_ReapConnections () at rx.c:3596
#2 0x808b80b in rxevent_RaiseEvents (next=0x80f5f10) at rx_event.c:213
#3 0x808bc34 in rxi_Listener () at rx_user.c:283
[...]
(gdb) frame 1
#1 0x8087669 in rxi_ReapConnections () at rx.c:3596
3596 rxi_CheckCall(conn->call[i]);
(gdb) print *conn
$1 = {next = 0xaaaaaaaa, peer = 0xaaaaaaaa, ...
Should there be some per-entry locking on rx_connHashTable, and maybe
some other hash tables in arla as well? It looks like LWP is preemptive
so something of this sort would be required, unless I'm confused.
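
For illustration only, this is roughly what I mean by per-entry (here,
per-bucket) locking. It is a sketch under loud assumptions: it uses
POSIX mutexes rather than LWP primitives, and struct conn, struct
bucket, table_init, table_insert and table_reap are hypothetical names
that do not exist in rx. The point is just that the reaper unlinks and
frees entries while holding the bucket's lock, so nobody else can be
walking the same chain at that moment.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define NBUCKETS 8  /* hypothetical table size */

/* Hypothetical, heavily reduced connection entry. */
struct conn {
    struct conn *next;
    unsigned id;
    int refCount;
};

/* One lock per hash bucket protects that bucket's chain. */
struct bucket {
    pthread_mutex_t lock;
    struct conn *head;
};

static struct bucket table[NBUCKETS];

static void table_init(void)
{
    for (int i = 0; i < NBUCKETS; i++) {
        pthread_mutex_init(&table[i].lock, NULL);
        table[i].head = NULL;
    }
}

/* Insert a new entry at the head of its bucket. Returns 0 on success,
 * -1 if allocation fails. */
static int table_insert(unsigned id, int refCount)
{
    struct conn *c = malloc(sizeof(*c));
    if (c == NULL)
        return -1;
    c->id = id;
    c->refCount = refCount;

    struct bucket *b = &table[id % NBUCKETS];
    pthread_mutex_lock(&b->lock);
    c->next = b->head;
    b->head = c;
    pthread_mutex_unlock(&b->lock);
    return 0;
}

/* Free every unreferenced entry; returns how many were freed.
 * Each chain is only modified while its bucket lock is held. */
static int table_reap(void)
{
    int freed = 0;
    for (int i = 0; i < NBUCKETS; i++) {
        struct bucket *b = &table[i];
        pthread_mutex_lock(&b->lock);
        struct conn **link = &b->head;
        while (*link != NULL) {
            struct conn *c = *link;
            if (c->refCount == 0) {
                *link = c->next;    /* unlink before freeing */
                free(c);
                freed++;
            } else {
                link = &c->next;
            }
        }
        pthread_mutex_unlock(&b->lock);
    }
    return freed;
}
```

A lookup path would take the same bucket lock and bump refCount before
releasing it, so that the entry it returns cannot be reaped underneath
the caller.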
-- kolya
--- rx.c 2000/11/25 22:36:28 1.16
+++ rx.c 2001/01/04 18:16:26
@@ -3398,7 +3402,6 @@
struct rx_call *call;
struct rx_call *nxcall; /* Next pointer for queue_Scan */
- peer->refCount--; /* It was bumped by the callee */
peer->burst += nPackets;
if (peer->burst > peer->burstSize)
peer->burst = peer->burstSize;
@@ -3415,8 +3418,10 @@
*/
rxi_Start((struct rxevent *) 0, call);
if (!peer->burst)
- return;
+ goto done;
}
+done:
+ peer->refCount--; /* It was bumped by the callee */
}
/*