Proving a Point with Statistics
By Loren Bolinger, Tuesday, August 01, 2006

Finding enlightenment or not, dazzling with data, and/or engaging in costly self-delusion

Solving a problem requires enough understanding to formulate the right questions. If you cannot ask the right questions, you will never find the right answers, just as you will never find the solution if you don't look in the right places. So, if you're intellectually curious, you count stuff, measure various things with various types of yardsticks, classify into different groups based on objective similarity, make labels for your groups, and massage the data with all sorts of sophisticated mathematical algorithms, hopefully to attain a bit of enlightenment. Oh yeah, and while doing all this, don't forget to include contemplation, common sense, and logic: you know, using the brains you were given.

If your goal is objective and pragmatic enlightenment that can be usefully applied, as for example in horse breeding, the accuracy and integrity of the collected data must be as close to beyond reproach as possible. I mean, if you're just trying to win your argument at any cost, just manipulate or ignore the data altogether.

Proving your point often implies that the answer is already known and you are merely selecting data that supports your position. That is neither scientific inquiry nor objective. If your motives are less than pure, you could just select the data that proves your point and ignore the rest. Failing to record all relevant data, or recording flawed, incomplete, or incorrect data, is worse than useless; it is unconscionable. Classifying data sets inaccurately into useless [incompatible] groups and manipulating the data incorrectly or mistakenly both produce corrupted results that prove nothing. There is no place for overt or hidden agendas that might exert undue influence on the creation of a living creature. Objectivity must sweep aside all spin. Arbitrary selection based on intuition is acceptable only if it leads to objective results. Failure to be objective and accurate in evaluating breeding data and pedigree information not only leads to erroneous interpretations but also invalidates the justification for any selective matings based on them. You risk making meaningless and absurd the goals, objectives, and selective matings guided by such erroneous interpretations over your career as a breeder.

[A breeder deliberately creates a living creature partly to his own specifications, drawing largely on his thoughtful consideration of the process of selective mating and on the principles, guidelines, and practices of the breeders who have gone before. He grapples with the harmonies of anatomical proportions, various measures of compatibility between sire and dam, powers in near and distant ancestors, prepotency in the sire, proper proportions of inbreeding, combinations of bloodlines, the weighing of ratios of sex-balance, the contribution of matrilineal descent, and environmental influences.

Albert Einstein's observation that "whoever undertakes to set himself up as judge in the field of truth and knowledge is shipwrecked by the laughter of the gods" could very well be applied to Jack Werk and others like him. Whether they manipulate and/or misuse racing and breeding statistics, or misunderstand or misinterpret the numbers, they stand or fall based on their faithfulness to eternal objectivity.

Blind, ignorant reliance on statistical studies without verification of the study itself or its mathematics leads to the laughter of the gods.

Such a horse breeder may find enlightenment or not, may be dazzled with data, attempt to dazzle others with the same data, and/or engage in costly self-delusion.

While some indices such as "dosage" are without biological merit or usefulness in the selective mating of racehorses, even those measures specifically developed by knowledgeable horsemen and scientists to objectively gauge racing performance must be taken with a grain of salt. As I have pointed out many times, these measures, derived from population statistics, apply to the individual only as a general guide. It is wrong to give them more credence than that. Many of the most credible indices commonly used to gain insight into the genetic worth of horses, including those most relevant to breeding and selective mating, break down or fail under certain circumstances. For example, Standard Starts Index (SSI) and Average Earnings Index (AEI) are much less useful as measures of performance merit in regional areas where purse structure, racing styles, etc. are often quite different from those at major racing venues. These indices were developed for, and are useful as, a performance gauge for major runners racing for major purses at the most prestigious race tracks.] The indices were constructed on a weighted basis, but the weighting scheme breaks down as a valid comparison when applied across all racing venues.
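
To make the venue problem concrete, here is a minimal sketch of a ratio-style earnings index. It assumes a simplified form (the average earnings of a sire's runners divided by the average earnings of all runners in a reference population), which is not necessarily the exact published weighting of AEI or SSI. The function name `earnings_index` and every dollar figure are hypothetical, invented only for illustration.

```python
def earnings_index(progeny_earnings, population_average):
    """Average progeny earnings divided by the reference population's average.

    Simplified, assumed form of a ratio-style index: 1.0 means the
    progeny earned exactly the average of the reference population.
    """
    if not progeny_earnings:
        raise ValueError("need at least one runner")
    if population_average <= 0:
        raise ValueError("population average must be positive")
    progeny_average = sum(progeny_earnings) / len(progeny_earnings)
    return progeny_average / population_average


# Hypothetical figures, not real racing data.
progeny_earnings = [12_000, 30_000, 18_000]  # three runners, one season
regional_average = 10_000  # assumed average per runner on a small circuit
national_average = 40_000  # assumed average per runner at major venues

regional_index = earnings_index(progeny_earnings, regional_average)
national_index = earnings_index(progeny_earnings, national_average)

print(f"index against regional runners: {regional_index:.2f}")
print(f"index against national runners: {national_index:.2f}")
```

The same three runners look well above average against one reference population and well below average against the other, although nothing about the horses has changed. That is the sense in which a single index cannot be read the same way across all racing venues.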

Depending on your integrity and/or your ability to set up the intellectual conditions, situations, or queries that will provide as much of the answer to the question as possible, you may or may not reach enlightenment. As I said before, if you don't know how to ask the questions, you will never find the answers. Supposing that the data is clean, there remain the problems of interpretation: just what does it mean, and just how may it be applied as a selection factor in selective mating? There are the difficulties of not understanding what information you've obtained, misunderstanding the meaning of the information, coming to the wrong conclusion through misinterpretation, or even ignoring the obvious conclusions.

Then there is the complexity of inheritance. Multifactorial traits such as performance traits may involve thousands of genes. On the surface, it seems impossible that anything a horse breeder could do would have much of an effect on inheritance or on triggering the expression of performance genes. Yet the results of the early English founder breeders are there for all to see. Wise, selective mating by expert agriculturists [early breeders] using methods that may go back to the time of the Roman Empire revealed that some traits can be approximately manipulated through selection. They invented a breed of animal whose phenotype and genotype visibly responded to such breeding selection in less than one hundred years. The proof of this is the dramatic change in the rapidly developing Thoroughbred from its founder stock. Some may argue that a performance plateau may have been reached for the breed, but to many horsemen that's irrelevant in light of their love for the horse, the wonderful nature of the modern Thoroughbred, and the dramatic differences between today's horse and its origins.

Some critics suggest that analyzing the inheritance of the good horses without the control of also analyzing the bad horses is bad science. I disagree; failure is failure regardless of the cause. Often the causes of failure cannot be determined. I don't need to learn how to breed bad horses; I need to discover the principles of breeding good horses. Even under the best of conditions, with the very best horses, there will be inexplicable failures. A breeder must glean what he can from each failure, set aside any obsession with it, and move on. Each breeder has only one lifetime to affect the gene pool of the Thoroughbred, so if your intent is to breed a good horse, you need to sort through the chaff, find the right needle in the haystack, and execute your planned selective matings.