Proving a Point with Statistics
By Loren Bolinger, Tuesday, August 01, 2006
Finding enlightenment or not, dazzling with
data and/or engaging in costly self-delusion
Solving a problem requires
enough understanding to formulate the right questions. If you cannot
ask the right questions, you will never find the right answers, just as
you will never find the solution if you don't look in the right
places. So, if you're intellectually curious, you count stuff,
measure various things with various types of yardsticks, classify into
different groups based on objective similarity, make labels for your
groups, and massage the data with all sorts of sophisticated mathematical
algorithms - hopefully to attain a bit of enlightenment. Oh yeah, and
while doing all this, don't forget to include contemplation,
common sense, and logic - you know, using the brains you were given.
If your goal is objective and
pragmatic enlightenment that can be usefully applied, as for example
in horse breeding, the accuracy and integrity of the collected data
must be as close to beyond reproach as possible. I mean, if
you're just trying to win your argument at any cost, just
manipulate or ignore the data altogether.
Proving your point often
implies that the answer is already known and you are merely selecting
data that supports your position. That is neither scientific inquiry
nor objective. If your motives are less than pure, you could just
select the data that proves your point and ignore the rest.
Failing to record all relevant data, or recording flawed, incomplete,
or incorrect data, is worse than useless; it is unconscionable.
Inaccurately classifying data sets into useless [incompatible] groups
and manipulating the data incorrectly or mistakenly both produce
corrupted results that prove nothing. There is no place for overt or
hidden agendas that might exert undue influence on the creation of a
living creature. Objectivity must sweep aside all spin. Arbitrary
selection based on intuition is acceptable only if it leads to
objective results. Failure to be objective and accurate in evaluating
breeding data and pedigree information not only leads to erroneous
interpretations but also invalidates the justifications for any
selective matings based on such erroneous data. You risk making
meaningless and absurd the goals, objectives, and selective matings
guided by such erroneous interpretations during your career as a
breeder. [A breeder deliberately creates a living
creature partly to his own specifications, drawing largely on his
thoughtful consideration of the process of selective mating and on the
principles, guidelines, and practices of breeders who have gone
before. He grapples with the harmonies of anatomical proportions,
various measures of compatibility between sire and dam, powers in near
and distant ancestors, prepotency in the sire, proper proportions of
inbreeding, combinations of bloodlines, ratios of sex-balance,
the contribution of matrilineal descent, and environmental
influences. Albert Einstein's observation that "whoever undertakes to set
himself up as judge in the field of truth and knowledge, is shipwrecked
by the laughter of the gods" could very well be applied to Jack Werk
and others like him. Whether they manipulate or misuse racing and
breeding statistics, or misunderstand or misinterpret the numbers,
they stand or fall based on their faithfulness to eternal
objectivity.
Blind, ignorant reliance on
statistical studies without verification of the study itself or its
mathematics leads to the laughter of the gods.
Such a horse breeder may find
enlightenment or not, may be dazzled with data, attempt to dazzle
others with the same data, and/or engage in costly self-delusion.
While some indices such as
"dosage" are without biological merit or usefulness in the selective
mating of racehorses, even those measures specifically developed by
knowledgeable horsemen and scientists to objectively gauge racing
performance must be taken with a grain of salt. As I have pointed out
many times, these measures, derived from population statistics, do not
apply to the individual except as a general guide. It is wrong to give
them more credence than that. Among those most relevant to breeding and
selective mating, many of the most credible indices commonly used to
gain insight into the genetic worth of horses break down or fail under
certain circumstances. For example, the Standard Starts Index (SSI) and
the Average Earnings Index (AEI) are much less useful or applicable as
measures of performance merit in regional areas, where purse structure,
racing styles, etc. are often quite different from those at major racing
venues. These indices were developed for, and are useful as, a
performance gauge for major runners racing for major purses at the
most prestigious racetracks.] The indices were constructed on a
weighted basis, but the weighting scheme breaks down as a valid
comparison when applied across all racing venues.
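
To illustrate how an earnings-based index can mislead across venues,
here is a minimal sketch in Python. It assumes a simplified index (a
sire's average progeny earnings divided by the average earnings of all
runners in the comparison group, so that 1.00 is average) and leaves out
the details of how AEI or SSI are actually calculated. The sires,
circuits, and earnings figures are all invented for illustration.

```python
from statistics import mean

# Hypothetical single-year earnings in dollars; every figure is invented.
# Each record is (sire, circuit, earnings of one runner).
runners = [
    ("Sire A", "major", 120_000),
    ("Sire A", "major", 40_000),
    ("Sire B", "major", 60_000),
    ("Sire B", "major", 20_000),
    ("Sire C", "regional", 18_000),
    ("Sire C", "regional", 14_000),
    ("Sire D", "regional", 6_000),
    ("Sire D", "regional", 2_000),
]


def earnings_index(records, sire):
    """Simplified index: the sire's average progeny earnings divided by
    the average earnings of every runner in `records`."""
    overall = mean(e for _, _, e in records)
    own = mean(e for s, _, e in records if s == sire)
    return own / overall


def circuit_only(records, circuit):
    """Keep only the runners that raced on the given circuit."""
    return [r for r in records if r[1] == circuit]


# Index measured against the pooled population of all venues.
pooled_b = earnings_index(runners, "Sire B")
pooled_c = earnings_index(runners, "Sire C")

# Index measured only against runners on the sire's own circuit.
within_b = earnings_index(circuit_only(runners, "major"), "Sire B")
within_c = earnings_index(circuit_only(runners, "regional"), "Sire C")

print(f"Sire B: pooled {pooled_b:.2f}, within major circuit {within_b:.2f}")
print(f"Sire C: pooled {pooled_c:.2f}, within regional circuit {within_c:.2f}")
```

With these made-up figures, Sire C scores about 0.46 against the pooled
population but 1.60 against regional runners only, while Sire B scores
about 1.14 pooled and 0.67 within the major circuit. The pooled number
mostly reflects where the progeny raced and how rich the purses were,
not how they performed against their actual competition.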
Depending on your integrity
and/or your ability to set up the intellectual conditions, situations,
or queries that will provide as much as possible of the answer to the
question, you may or may not reach enlightenment. As I said before, if
you don't know how to ask the questions, you will never find the
answers. Even supposing that the data is clean, there remain the problems
of interpretation: just what does it mean, and just how may it be
applied as a selection factor in selective mating? There are the
difficulties of not understanding what information you've
obtained, misunderstanding the meaning of the information, coming to
the wrong conclusion through misinterpretation, or even ignoring the
obvious conclusions.
Then there is the complexity of
inheritance. Multifactorial traits such as performance traits may
involve thousands of genes. On the surface, it seems impossible that
anything a horse breeder could do would have much of an effect on
inheritance or on triggering the expression of performance genes. Yet the
results of the early English founder breeders are there for all to see.
Wise, selective mating by expert agriculturists [early breeders] using
methods that may go back to the time of the Roman Empire revealed that
some traits can be approximately manipulated through selection. They
invented a breed of animal whose phenotype and genotype visibly
responded to such breeding selection in less than one hundred years.
The proof of this lies in the dramatic changes in the rapidly developing
Thoroughbred from its founder stock. Some may argue that a performance
plateau may have been reached for the breed, but to many horsemen
that's irrelevant in light of their love for the horse, the
wonderful nature of the modern Thoroughbred, and the dramatic
differences between today's horse and its origins.
Some critics suggest that
analyzing the inheritance of the good horses without the control of
also analyzing the bad horses is bad science. I disagree;
failure is failure regardless of the cause. Often the causes of failure
cannot be determined. I don't need to learn how to breed bad
horses; I need to discover the principles of breeding good horses. Even
under the best of conditions, with the very best horses, there will be
inexplicable failures. A breeder must glean what he can from each
failure, disregard any other obsession, and move on. Each breeder has
only one lifetime to affect the gene pool of the Thoroughbred, so if
your intent is to breed a good horse, you need to sort through the
chaff, find the right needle in the haystack, and execute your planned
selective matings.