Setting the Record Straight:
Where Salon Got It Right and Wrong
By Bruce Weiner
May 4, 1999
In an April 27, 1999 article
entitled "Microsoft's Flawed Linux vs. NT Shootout," Salon presented a
biased report with innuendos impugning my honesty and Mindcraft's
reputation. We want to set the record straight with this rebuttal.
Unfortunately, it takes more words to right a wrong than it does to make
someone look wrong, so please bear with me.
What's Right
Mr. Leonard had several points right in his
article:
- Our tests did find that Windows NT Server is "2.5 times faster than
Linux as a file server and 3.7 times faster as a Web server" as he
states.
- Mindcraft did the tests stated in the article under contract with
Microsoft and we did them in a Microsoft lab. We couldn't divulge where
the tests were being done because we were under a non-disclosure
agreement at the time.
Many have tried to imply that
something is wrong with Mindcraft's tests because they were done in a
Microsoft lab. You should know that Mindcraft verified the clients
were set up as we documented in our report and that Mindcraft, not
Microsoft, loaded the server software and tuned it as documented in our
report. In essence, we took over the lab we were
using and verified it was set up fairly.
- Mindcraft did conduct a second test
with support from Linus Torvalds, Alan Cox, Jeremy Allison, Dean Gaudet,
and David Miller. Andrew Tridgell provided only one piece of input before
he left on vacation. Mindcraft received excellent support from these leading
members of the Linux, Samba, and Apache communities. I thank them for
their help and very much appreciate it.
- The response from the Linux community was more than convulsions. It
was net rage.
- Will did post a request for tuning information. He got one response
telling him to use FreeBSD if he wanted a high-speed Web server for
static files. Will is not a Mindcraft employee. He is someone who made
a posting to a newsgroup about Linux on the system we were going to use
for testing. He wanted to remain as anonymous as possible because he
didn't want to get a ton of flaming email (based on the email Mindcraft
has received, he underestimated the response he would get). I see no
need to reveal who he is now because doing so would make his worst
nightmare come true and because he had nothing to do with our test.
What's Wrong
Unfortunately, Mr. Leonard did not contact
Mindcraft to get information from us. I was at Microsoft conducting the
second test of Linux and Windows NT Server myself and was difficult to
reach. When I called Mr. Leonard back, he was not in and I left a message.
There was no intention to duck his questions as one might infer
from his article. The following points will give you the other
side of his story.
- We did not make any posting under false pretenses as Mr. Leonard
states. Given the anti-Microsoft
sentiments in the Linux community, what kind of response do you think we
would have received if we said we were benchmarking Linux and Windows NT
Server under contract with Microsoft? Take another look at a small part
of the net rage e-mail we did
receive. A search for Mindcraft at dejanews
yielded 764 postings
since we published our report. Doing what Mr. Leonard suggested would
only have started the net rage and newsgroup postings earlier.
- For our second test, as Mr. Leonard points out, we
sought the tuning suggestions of the people in the Linux community who
should really know the answers including Linus Torvalds. But his
attribution to Linus that "...but [Mindcraft] isn't giving him the info
he needs to do
the job right," is wrong.
All of the Linux experts helping us knew the exact configuration of
the system we were
testing and knew the benchmarks we were running. The NetBench and
WebBench benchmarks are readily available on the Web for free and are
probably some of the best documented benchmarks available. We withheld
no technical details from him
or the other Linux experts.
Jeremy Allison directly contradicts Mr. Leonard's attribution in
a Linux Today article
when he says "...I can confirm
that we have reproduced Mindcraft's NT server numbers here in our lab." Clearly, Jeremy was
tracking what we were doing and we provided him with the information needed to reproduce
our tests.
We exchanged several emails with the Linux experts supporting us and they made
suggestions on tunes for Linux, Apache, and Samba. They also provided a
kernel patch that was not readily available. We applied all tunes they
suggested and the kernel patch. Here are some of the things that
happened:
- Red Hat provided version 1.0 of the MegaRAID driver during our tests
and we used it, even
though it meant retesting.
- We sent out our Apache and Samba configuration files for review and
received approval of them before we tested. (We actually got better
performance in Apache when we made some changes to the approved
configuration file on our own.) For readers who have never seen this
kind of tuning, an illustrative sketch follows this list.
- Whenever we got poor performance we sent a description of how the
system was set up and the performance
numbers we were measuring. The Linux experts and Red Hat told us what
to check out, offered tuning changes, and provided patches to try. We
had several rounds of messages between us in which Mindcraft answered
the questions they posed.
- We found the Linux experts who supported our
second test to be extremely helpful. However, when we saw only one
response to Will's posting during the first
test, we expected that we would not get much of a response
to further queries. I doubt that we would have received much constructive help
if we said we were doing a test for Microsoft.
- Alan Cox is wrong when he says, according to the
article, "They [Mindcraft] seem solely intent on trying to re-create
their existing pro-Microsoft results and hoping, by attaching some kind
of 'Linux top mind' credibility to it, they can do more damage." We're
not trying to damage anything. We're reporting the truth of what we
find. I asked Mr. Cox to help with Linux tuning because Linus
Torvalds sent me his name. It was my hope to get Linux performing at an
optimal level. I hope that he will participate in the Open Benchmark
Mindcraft has proposed.
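To give readers a sense of what such "tunes" look like in practice, here
is a minimal, purely illustrative smb.conf sketch. The option names are
genuine Samba options of that era, but the values are hypothetical; this
is not the configuration file we actually used or that the Linux experts
reviewed:

    # Illustrative smb.conf tuning sketch -- hypothetical values only,
    # not Mindcraft's actual configuration
    [global]
        # disable Nagle's algorithm and enlarge the socket buffers
        socket options = TCP_NODELAY SO_RCVBUF=65536 SO_SNDBUF=65536
        # allow large raw SMB reads and writes to cut round trips
        read raw = yes
        write raw = yes
        # let clients cache file data locally via opportunistic locks
        oplocks = yes
        # cache the result of getwd() calls
        getwd cache = yes
        # largest SMB packet the server will negotiate
        max xmit = 65535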
The Linux community will gain a real benefit from our benchmark report - a
new Linux performance documentation
project was created in
response to our reporting a lack of such documentation.
Mr. Leonard's attempts to defame Mindcraft's
reputation are indicative
of a biased and unfounded position. It's not
clear if he is trying to gain favor with
Linux proponents by his attacks on Mindcraft, if he is simply biased against Microsoft,
or if he has something to gain personally by seeing Linux
outperform Windows NT Server. I had expected more from a reputable
organization like Salon.
- Putting "independent test lab" in quotes as Mr.
Leonard did in the first paragraph of his article immediately calls into
question our independence. That position is most clearly stated in his
concluding paragraph on the first page where he states, "... the story
underlines the essential worthlessness of commercially sponsored comparison
tests. The purpose of these tests is to please the customer who
commissions them." So he concludes Mindcraft is not independent and our
tests are worthless. Then as a final smear Mr. Leonard quotes
an unnamed "engineer familiar with the testing
business" in the second to last paragraph on page two to
bolster his unfounded views.
It's clear Mr. Leonard did not do his homework.
He obviously knows nothing about the real business of testing and knows
nothing about Mindcraft. I have included some background
on Mindcraft so
you and Mr. Leonard can learn the truth about us.
As for Mr. Leonard's unnamed "engineer familiar with the testing
business," he certainly knows nothing about the professional testing
business or he just gave Mr. Leonard the quotation he wanted or he just
does not exist. Or, maybe he is the kind of engineer who produces
"... numbers that are favorable to the customer" so he can get paid off.
Such an engineer could not get or keep a job at Mindcraft.
To believe this possibly non-existent engineer,
you would have to think that we are dishonest and idiots and that our
clients are stupid. If our clients get trashed over benchmarks we did
for them, how long do you think we'd be able to stay in business? Not
14 years, that's for sure.
The Crux of The Matter
The whole controversy over Mindcraft's benchmark report
is about three things: we showed that Windows NT Server was faster than
Linux on an enterprise-class server, Apache did not outperform IIS, and
we didn't get the same performance measurements for Samba that Jeremy Allison
got in the
PC Week article or in his lab. Let's look at these
issues.
Comparing the performance of a resource-constrained desktop PC with an
enterprise-class server is like saying a go-kart beat a grand prix race
car on a go-kart race course.
- Smart Reseller reported a head-to-head test of Linux and Windows
NT Server in a January
25, 1999 article; they tested performance on a resource-constrained
266 MHz desktop PC. One cannot reasonably extrapolate the
performance of a resource-constrained desktop PC to an unconstrained,
enterprise-class server with four 400 MHz Xeon processors.
- In a February
1, 1999 article, PC Week tested the file server performance
of Linux and Samba on an enterprise-class system. They did not compare
it to Windows NT Server on the same system. Jeremy Allison helped with
these tests comparing the Linux 2.2 kernel with the Linux 2.0 kernel.
I'll show you below
what he thinks about Windows NT Server on an enterprise-class
server.
- If you doubt our published Apache performance, Dean Gaudet, who wrote
the Apache Performance Notes and who provided tuning help for our
testing, gives some insights in a recent
newsgroup posting. In response to a request for tuning
Apache for Web benchmarks,
Dean wrote:
"Unless by tuning you mean 'replace apache with something that's
actually fast' ;)
"Really, with the current multiprocess apache
I've never really been able to see more than a handful of percentage
improvement from all the tweaks. It really is a case of needing a
different server architecture to reach the loads folks want to see in
benchmarks."
In other words, Apache cannot achieve the
performance that companies want to see in benchmarks. That's probably
why none of the Unix benchmark results reported at SPEC
use Apache.
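To show what the "tweaks" Dean mentions actually look like, here is a
small hypothetical sketch of common Apache 1.3 tuning directives. The
directives are real; the values are illustrative only and are not our
test configuration:

    # Illustrative Apache 1.3 tuning directives -- hypothetical values,
    # not Mindcraft's actual httpd.conf
    HostnameLookups Off         # skip a DNS lookup on every request
    StartServers         64     # pre-fork children before load arrives
    MinSpareServers      32     # keep idle children ready for bursts
    MaxSpareServers      64
    MaxClients          256     # cap on simultaneous child processes
    MaxRequestsPerChild   0     # never recycle children; forks are costly
    KeepAlive            On     # let clients reuse connections
    MaxKeepAliveRequests 100

As Dean says, knobs like these buy only a few percent with a
multiprocess server; the architecture itself is the limit.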
Jeremy Allison believes, according to an April 27, 1999
Linux Today
article, that if we do another benchmark with his help,
"...this doesn't mean Linux will neccessarily [sic] win, (it doesn't when serving Win95
clients here in my lab, although it does when serving NT clients)..." In other words, in a fair test we should
find Windows NT Server outperforming Linux and Samba on the same system.
That's what we found.
Jeremy's statement in the Linux Today article that "It is a shame that they
[Mindcraft] cannot reproduce the PC Week Linux numbers ..." shows a lack of
understanding of the NetBench benchmark. If he looked at the NetBench
documentation, he would find a very significant reason why
Mindcraft's measured Samba performance was lower:
We used 133 MHz Pentium clients while Jeremy and PC Week used
faster clients, although we don't know how much faster because neither
documented the client speed. Because their clients were faster and
because so much of a NetBench measurement is affected by the clients,
this can account for most of the difference in the reported
measurements.
"You can only compare results if you used the same testbed each time you ran
that test suite."
Understanding and Using NetBench 5.01
In addition, the following testbed and server differences
add to the measured performance variances:
- Mindcraft used a server with 400 MHz Xeon
processors while PC Week used one with 450 MHz Xeon processors. Jeremy
did not disclose what speed processor he was using.
- Mindcraft used a server with a MegaRAID controller with a
beta driver (which was the latest version available at the time
of the test) for our first test while the PC Week
server used an
eXtremeRAID controller with a fully released driver. The MegaRAID
driver was single threaded while the eXtremeRAID driver was
multi-threaded.
- Mindcraft used Windows 9x clients while Jeremy and PC Week
used Windows NT clients. According to Jeremy, he gets faster
performance with Windows NT clients than with Windows 9x clients.
Given these differences in the testbeds and
servers, is it any wonder we got lower performance than Jeremy and
PC Week
did?
If you scale up our numbers to account for their speed advantage, we
get essentially the same results.
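To make the scaling point concrete, consider a toy calculation. Every
number below is invented for illustration (we do not know the actual
speed of the PC Week clients), and it assumes, purely for the sake of
the sketch, that throughput scales roughly linearly with client speed
when the clients are the bottleneck:

    # Toy scaling sketch in Python -- all numbers are hypothetical,
    # none of them are measured results
    our_client_mhz = 133.0     # Mindcraft's Pentium clients
    their_client_mhz = 200.0   # assumed speed of the PC Week clients
    our_throughput = 100.0     # hypothetical throughput on our testbed

    # If the clients are the bottleneck, scaling our result by the
    # client-speed ratio estimates what their testbed would report.
    scaled = our_throughput * (their_client_mhz / our_client_mhz)
    print("Scaled throughput: %.0f" % scaled)   # prints 150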
The only reason to use Windows NT clients is to give
Linux and Samba an advantage, if you believe Jeremy's claim. In
the real world, there are many more Windows 9x clients connected to file servers than
Windows NT clients. So benchmarks that use Windows NT clients are unrealistic and
should be viewed as benchmark-special configurations.
The fact that Jeremy did not publish the details
of the testbed he used and the tunes he applied to Linux and Samba is a
violation of the NetBench license. If he had published the tunes he
used, we would have tried them. What's the big secret?
- Jeremy states in the article "The essense [sic] of
scientific testing is *repeatability* of the experiment..." I concur
with his assertion. But a scientific test would use the same test apparatus setup and would use the same
initial conditions. Jeremy's unscientific test did not use the same testbed
or even one with client computers of the same speed we used. We
reported enough information in our report so that someone could do
a scientific test to determine the accuracy of our findings. Jeremy did not.
Given the warning in the NetBench documentation against comparing results from different
testbeds, it is Jeremy and Linus who are being unscientific in
their trashing of Mindcraft's results. Mindcraft never compared its NetBench
results to those produced on a different testbed.
Mindcraft has been in business for over 14 years doing various kinds of
testing. For example, from May 1, 1991 through September 30, 1998
Mindcraft was accredited as a POSIX Testing Laboratory by the National
Voluntary Laboratory Accreditation Program (NVLAP),
part of the National Institute of Standards and Technology (NIST).
During that time, Mindcraft did more POSIX FIPS certifications than all
other POSIX labs combined. All of those tests were paid for by the client
seeking certification. NIST saw no conflict of interest in our being paid
by the company seeking certification and NIST reviewed and validated each
test result we submitted. We apply the same honesty to our performance
testing that we do for our conformance testing. To do otherwise would be
foolish and would put us out of business quickly.
Some may ask why we decided not to renew our NVLAP accreditation. The
reason is simple: NIST stopped its POSIX FIPS certification program on
December 31, 1997. That program was picked up by the IEEE and on November
7, 1997 the IEEE announced that they recognized Mindcraft as an Accredited
POSIX Testing Laboratory. We still are IEEE accredited and are still
certifying systems for POSIX FIPS conformance.
We've received many emails and there have been many postings in
newsgroups accusing us of lying in our report about Linux and Windows NT
Server because Microsoft paid for the tests. Nothing could be further from
the truth. No Mindcraft client, including Microsoft, has ever asked us to deliver a report that lied or misrepresented the results of a
test. On the contrary, all of our clients ask us to get the best performance for their
product and for their competitor's
products. They want to know where they really stand. If a client ever
asked us to rig a test, to lie about test results, or to misrepresent test
results, we would decline to do the work.
A few of the emails we've received asked us why the company that
sponsored a comparative benchmark always came out on top. The answer is
simple. When that was not the case our client exercised a clause in the
contract that allowed them to refuse us the right to publish the results.
We've had several such cases.
Mindcraft works much like a CPA hired by a company to audit its books. We give an independent, impartial assessment based on our
testing. Like a CPA, we're paid by our client. NVLAP-approved test
labs that measure everything from asbestos to the accuracy of scales are
paid by their clients. It is a common practice for test labs
to be paid by their clients.
Considering the defamatory innuendos and bias in the Salon article written
by Mr. Leonard, we believe that Salon should take the following actions
in fairness to Mindcraft and its readers:
- Remove the article from its Web site and put an apology in
its place. If you do not do that, at least provide a link to this
rebuttal at the top of the article so that your readers can get both
sides of the story.
- Provide fair coverage from an unbiased reporter
of Mindcraft's Open Benchmark
of Windows NT Server and Linux. For this benchmark, we have
invited Linus Torvalds, Jeremy Allison, Red Hat, and all of the other Linux
experts we were in contact with to tune Linux, Apache, and Samba and
to witness all tests. We have also invited Microsoft to tune Windows
NT and to witness the tests. Mindcraft will participate
in this benchmark at its own expense.
The NetBench document entitled Understanding
and Using NetBench 5.01
states on page 24, "You
can only compare results if you used the same testbed each time you ran
that test suite [emphasis added]."
Understanding and Using NetBench 5.01
clearly gives another reason why the performance measurements
Mindcraft reported are so different from the ones Jeremy and PC Week
found. Look at what's stated on
page 236, "Client-side caching occurs when the client is able to place
some or all of the test workspace into its local RAM, which it then uses
as a file cache. When the client caches these test files, the client can
satisfy locally requests that normally require a network access. Because a
client's RAM can handle a request many times faster than it takes that
same request to traverse the LAN, the client's throughput scores show a
definite rise over scores when no client-side caching occurs. In
fact, the client's throughput numbers with client-side caching can
increase to levels that are two to three times faster than is possible
given the physical speed of the particular network [emphasis added]."
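A toy model makes the caching effect easy to see. The Python sketch
below is ours, not Ziff-Davis code; the timings are invented solely for
illustration and merely assume that a read satisfied from local RAM
costs far less than one that crosses the LAN:

    # Toy model of client-side caching in a NetBench-style client.
    # All timings are invented for illustration only.
    NETWORK_READ_MS = 1.0    # time to fetch a block over the LAN
    CACHED_READ_MS = 0.01    # time to read the same block from local RAM

    def reads_per_second(reads, cache_hit_rate):
        # total time = cached reads + reads that must cross the network
        hits = reads * cache_hit_rate
        misses = reads - hits
        total_ms = hits * CACHED_READ_MS + misses * NETWORK_READ_MS
        return reads / (total_ms / 1000.0)

    print("No caching: %.0f reads/sec" % reads_per_second(100000, 0.0))
    print("90%% cached: %.0f reads/sec" % reads_per_second(100000, 0.9))

With a high hit rate the client reports throughput far beyond what the
wire could carry, which is exactly the inflation the documentation
warns about.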