arla-0.38 on FreeBSD/amd64
Garrett Wollman
wollman at khavrinen.lcs.mit.edu
Mon Jan 10 20:51:54 CET 2005
<<On Sat, 08 Jan 2005 14:44:22 +0100, Love <lha at stacken.kth.se> said:
> What types of benchmarks are you running? We have the last few years
> concentrated on maintenance, bugfixes and functionality. I have no doubt
> there could be improvements done to performance.
I've done a few different sorts of benchmarks, both simple
microbenchmarks with `dd' and also more complicated benchmarks like
bonnie++. Here's an interesting test of the `dd' variety:
wollman@wollman-random-testing(37)$ pwd
/afs/csail.mit.edu/u/w/wollman
wollman@wollman-random-testing(38)$ rm foo
rm: foo: No such file or directory
wollman@wollman-random-testing(39)$ time dd if=/dev/zero of=foo bs=1024k count=100
100+0 records in
100+0 records out
104857600 bytes transferred in 2.429002 secs (43169004 bytes/sec)
real 0m22.322s
user 0m0.000s
sys 0m0.543s
[We can see here that the kernel side of arla is doing exactly the
right thing and passing the writes through to the underlying cache
file as fast as it can; the reported speed is very close to the speed
of writing to a local file. Then we wait 20 seconds at close for the
write-back to the server. I waited about ten to twenty seconds after
this before executing the next command.]
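To put the two figures above side by side: dd reports the rate for the
2.43 seconds it spent writing into the cache file, while the `real' time
also covers the write-back at close. The following is only a sketch of
that arithmetic; the `rate' helper is made up for illustration and
prints decimal megabytes per second.

```shell
# rate BYTES SECONDS: print throughput in MB/s (1 MB = 1000000 bytes).
# Hypothetical helper, not part of arla or dd.
rate() {
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.2f\n", b / s / 1000000 }'
}

rate 104857600 2.429002   # dd's own figure, write into the cache: 43.17
rate 104857600 22.322     # wall-clock figure, including close:    4.70
```

Measured over the whole command, then, the write runs at roughly a
ninth of the speed dd reports.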
wollman@wollman-random-testing(40)$ time dd if=foo of=/dev/null bs=1024k count=100
100+0 records in
100+0 records out
104857600 bytes transferred in 30.001047 secs (3495131 bytes/sec)
real 0m30.013s
user 0m0.000s
sys 0m0.162s
[Why is it so slow? It is clear from network traces that arla had
already turned in its callback on this file -- and did so within a few
seconds of the last close.]
wollman@wollman-random-testing(41)$ time dd if=foo of=/dev/null bs=1024k count=100
100+0 records in
100+0 records out
104857600 bytes transferred in 0.140142 secs (748224137 bytes/sec)
real 0m0.150s
user 0m0.000s
sys 0m0.149s
[But if we give it less than a second from close to open, we still
have the callback and don't have to drag the whole file over the
network again.]
wollman@wollman-random-testing(42)$ time dd if=foo of=/dev/null bs=1024k count=100
100+0 records in
100+0 records out
104857600 bytes transferred in 29.780244 secs (3521046 bytes/sec)
real 0m29.791s
user 0m0.000s
sys 0m0.162s
[Oops, waited too long and had to drag the whole file back over the
network again!]
wollman@wollman-random-testing(47)$ /usr/local/arla/bin/fs getcacheparms
Arla is using 26 of the cache's available 1851392 1K byte blocks
(and 2 of the cache's available 10000 vnodes)
I don't have an OpenAFS machine with comparable cache parameters to
compare against, so this may be the server's fault. My feeling,
though, is that Arla is a little too eager to return callbacks,
particularly for large files which are expensive to drag across the
network.
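One way to narrow down how long the callback is actually held would be
to sweep the idle time between reads and watch where the re-read time
jumps from well under a second to about thirty seconds. The following
is only a sketch: `probe_reread' is a made-up helper, it assumes a
`date' that supports `+%s' (FreeBSD and Linux both do), and its
one-second resolution is coarse but enough to separate the two cases
seen above.

```shell
# probe_reread FILE DELAY...: read FILE once to warm the cache, then for
# each DELAY (in seconds) sit idle that long and time a re-read.
# Hypothetical helper for illustration only.
probe_reread() {
    file=$1; shift
    dd if="$file" of=/dev/null bs=1024k 2>/dev/null   # warm-up read
    for delay in "$@"; do
        sleep "$delay"
        start=$(date +%s)
        dd if="$file" of=/dev/null bs=1024k 2>/dev/null
        end=$(date +%s)
        echo "idle ${delay}s: re-read took $((end - start))s"
    done
}
```

For the file used above it could be run as
    probe_reread /afs/csail.mit.edu/u/w/wollman/foo 1 2 5 10 20
Each delay is counted from the end of the previous read, since every
slow re-read fetches the file, and presumably the callback, again.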
>> Any advice as to whether lwp or pthreads is preferred, going forward?
>> pthreads is at least SMP-capable.
> The pthread code in arla is to emulate lwp-threads, so it has the
> same problem as LWP threads, even when using it on a smp machine.
I don't quite understand this. If you are using pthreads,
pthread_create() will give you multiple real threads; if you are using
"true" LWP, the best you can do is multiple simulated threads inside a
single real thread.
-GAWollman