# The Laws of Averages: Part 1, Fruit Salad

By Kip Hansen – Re-Blogged From http://www.WattsUpWithThat.com

### Averages: A Primer

Both the word and the concept “average” are subject to a great deal of confusion and misunderstanding among the general public, and both have seen an overwhelming amount of “loose usage” even in scientific circles, not excluding peer-reviewed journal articles and scientific press releases. So let’s have a quick primer (correctly pronounced “primer”), or refresher, on averages (the cognoscenti can skip this bit and jump directly to Fruit Salad).

“Average” is also, of course, a verb meaning to mathematically calculate an average, as in “to average”.

Since there are three major types of “averages” — the mode, the median, and the mean — a quick look at these:
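As a minimal sketch of the three, the snippet below uses Python's standard-library `statistics` module on a small list of ages. The ages are made up for illustration and are not taken from any study.

```python
import statistics

# Hypothetical ages, invented purely for illustration.
ages = [38, 44, 52, 58, 61, 63, 63, 64, 66]

mean_age = statistics.mean(ages)      # sum of the values divided by their count
median_age = statistics.median(ages)  # middle value of the sorted list
mode_age = statistics.mode(ages)      # most frequently occurring value

print(f"mean={mean_age:.1f} median={median_age} mode={mode_age}")
```

Even on nine values the three "averages" disagree, which is why it matters which one is being quoted.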

Several of these definitions refer to “a set of data”… In mathematics, a set is a well-defined collection of distinct objects, considered as an object in its own right. (For example, the numbers 2, 4, and 6 are distinct objects when considered separately, but when they are considered collectively they form a single set of size three, written {2,4,6}.)

This image summarizes the three different common “averages”:

Here we see the Ages at which patients develop Stage II Hypertension (severe HBP, or high blood pressure) along the bottom (x-axis) and the Number of Patients along the left vertical axis (y-axis). This bar graph, or histogram, shows that some patients develop HBP fairly young, in their late 30s and 40s; after 45 the incidence increases more or less steadily with advancing age to peak in the mid-60s, falling off after that age. We see what is called a skewed distribution, with the bulk of cases piled up on the right and a tail stretching left toward younger ages. This skewness (right or left) is typical of many real-world distributions.

What we would normally call the average, the mean, is calculated by adding together all the patients’ ages at which they developed HBP and dividing by the total number of patients. Though mathematically correct, it is not very clinically informative. While it is true that the Mean Age for Developing HBP is around 52, it is far more common to develop HBP in one’s late 50s to mid-60s. There are medical reasons for this skewing of the data, but for our purposes it is enough to know that those outlying patients who develop HBP at younger ages skew the mean; ignoring the outliers at the left would bring the mean more in line with the actual incidence figures.

For the medically inclined, this histogram hints that there may be two different causes or disease paths for HBP, one that causes early-onset HBP and one related to advancing age, sometimes known as late High Blood Pressure.

(In this example, the Median Age for HBP is not very informative at all.)

Our HBP example can be read as “Generally, one begins their real risk of developing late HBP in their mid-40s and the risk continues to increase until their mid-60s. If you haven’t developed HBP by 65 or so, your risk decreases with additional years, though you still must be vigilant.”

Different data sets have different information values for the different types of averages.

Housing prices for an area are often quoted as Median Housing Costs.  If we looked at the mean, the average would be skewed upward by the homes preferred by the wealthiest 1% of the population, homes measured in millions of dollars (see here, and here, and here).
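To see why the median is preferred here, consider a rough sketch with ten hypothetical sale prices, nine ordinary homes and one mansion. The figures are invented for illustration only.

```python
import statistics

# Hypothetical sale prices in dollars: nine ordinary homes and one mansion.
prices = [180_000, 195_000, 210_000, 220_000, 235_000,
          240_000, 255_000, 270_000, 295_000, 6_500_000]

mean_price = statistics.mean(prices)      # pulled upward by the single mansion
median_price = statistics.median(prices)  # midpoint of the sorted prices

print(f"mean:   ${mean_price:,.0f}")
print(f"median: ${median_price:,.0f}")
```

In this made-up neighborhood the mean comes out higher than the price of every ordinary home on the list, while the median sits among them.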

Stock markets are often judged by things like the Dow Jones Industrial Average (DJIA) [which is a price-weighted average of 30 significant stocks traded on the New York Stock Exchange (NYSE) and the NASDAQ and was invented by Charles Dow back in 1896]. A weighted average is a mean calculated by giving values in a data set more influence according to some attribute of the data. It is an average in which each quantity to be averaged is assigned a weight, and these weightings determine the relative importance of each quantity on the average. The S&P 500 is a stock market index that tracks the 500 most widely held stocks on the New York Stock Exchange or NASDAQ. [A stock index … is a measurement of the value of a section of the stock market. It is computed from the prices of selected stocks, typically a weighted average.]
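The general idea of a weighted average can be sketched in a few lines. This is an illustration of the concept only, not the actual formula used for the DJIA or the S&P 500; the helper name `weighted_mean` and the share prices and share counts are hypothetical.

```python
def weighted_mean(values, weights):
    """Return sum(w * v) / sum(w) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight

# Hypothetical share prices and share counts for three companies.
prices = [10.0, 50.0, 200.0]
shares = [900, 90, 10]

simple = sum(prices) / len(prices)         # every price counts equally
by_shares = weighted_mean(prices, shares)  # large share counts dominate

print(f"simple mean:   {simple:.2f}")
print(f"weighted mean: {by_shares:.2f}")
```

The same three prices give very different "averages" depending on the weights chosen, so the weighting scheme is part of what any quoted index actually means.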

Family incomes are reported by the US Census Bureau annually as the Median Household Income for the United States [\$55,775 in 2015].

Life Expectancy is reported by various international organizations as “average life expectancy at birth” (worldwide it was 71.0 years over the period 2010–2013).  “Mathematically, life expectancy is the mean number of years of life remaining at a given age, assuming age-specific mortality rates remain at their most recently measured levels. … Moreover, because life expectancy is an average, a particular person may die many years before or many years after the “expected” survival.” (Wiki).

Using any of the major internet search engines to search phrases including the word “average” such as “average cost of a loaf of bread”, “average height of 12-year-old children”  can keep one entertained for hours.

However, it is doubtful that you will be more knowledgeable as a result.

This series of essays is an attempt to answer this last point: Why studying averages might not make you more knowledgeable.

We are all familiar with the concept of comparing Apples and Oranges.

Sets to be averaged must be homogeneous, as in comparable and not so heterogeneous as to be incommensurable.

Problems arise, both physically and logically, when attempts are made to find “averages” of non-comparable or incommensurable objects — objects and/or  measurements, which do not logically or physically (scientifically) belong in the same “set”.

The discussion of sets can be confusing for Americans schooled in the 40s and 50s, but later, younger Americans were exposed to the concepts of sets early on. For our purposes, we can use a simple definition: a set is a collection of data regarding a number of similar, comparable, commensurable, homogeneous objects and, in the case of a data set, the data themselves must be comparable and in compatible measurement units. (Many data sets contain many sub-sets of different information about the same set of objects. A data set about a study of Eastern Chipmunks might include sub-sets such as height, weight, estimated age, etc. The sub-sets must be internally homogeneous, as in “all weights in grams”.)

One cannot average the weight and the taste of a basket of apples.  Weight and taste are not commensurable values.  Nor can one average the weight and color of bananas.

Likewise, one cannot logically average the height/length of a set like “all animals living in the contiguous North American continent (considered as USA, Canada, and Mexico)”. Why? Besides the difficulty in collecting such a data set, even though one’s measurements might all be in centimeters (whole or fractional), “all animals” is not a logical set of objects when considering height/length. Such a set would include all animals from bison, moose and Kodiak bears down through cattle, deer, dogs, cats, raccoons, rodents, worms, insects of all descriptions, multi-cellular but microscopic animals, and single-celled animals. In our selected geographical area there are (very, very roughly) an estimated one quintillion five hundred quadrillion (1,500,000,000,000,000,000) insects alone. There are only 500 million humans, 122 million cattle, 83 million pigs and 10 million sheep in the same area. Insects are small and many in number, and some mammals are comparatively large but few in number. Uni- and multicellular microscopic animals? Each of the 500 million humans has, on average, over 100 trillion (100,000,000,000,000) microbes in and on their body. By any method (mean, median, or mode), the average height/length of all North American animals would be literally vanishingly small: so small that “on average” you wouldn’t expect to be able to see any animals with unaided eyes.
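A back-of-the-envelope sketch shows how completely the insects dominate. The population counts are the rough figures quoted above; the typical body lengths are hypothetical round numbers chosen only for illustration, and microbes are left out entirely.

```python
# Population counts are the rough figures quoted in the text.
# Typical body lengths (cm) are hypothetical values for illustration only.
populations = {
    "insects": (1_500_000_000_000_000_000, 0.5),
    "humans": (500_000_000, 170.0),
    "cattle": (122_000_000, 240.0),
    "pigs": (83_000_000, 150.0),
    "sheep": (10_000_000, 130.0),
}

total_count = sum(count for count, _ in populations.values())
total_length = sum(count * length for count, length in populations.values())

# Mean length over every individual animal in the combined set.
mean_length_cm = total_length / total_count

# Fraction of all individuals that are insects.
insect_share = populations["insects"][0] / total_count

print(f"mean length: {mean_length_cm:.6f} cm")
print(f"insect share of individuals: {insect_share:.10f}")
```

Under these assumptions the mean is indistinguishable from the assumed insect length, and because insects make up more than half of all individuals, the median would be an insect length too. The large mammals leave no visible trace in the result.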

To calculate an average of any type that will be physically, scientifically meaningful as well as logical and useful, the set being averaged must itself make sense as a comparable, commensurable, homogeneous collection of objects, with data about those objects being comparable and commensurable.

As I will discuss later, there are cases where the collection (the data set) seems proper and reasonable, the data about the collection seems to be measurements in comparable units and yet the resulting average turns out to be non-physical — it doesn’t make sense in terms of physics or logic.

These types of averages of disparate, heterogeneous data sets, in which either the measurements or the objects themselves are incommensurable (like comparing Apples and Oranges and Bananas), give results which can be labelled Fruit Salad and have applicability and meaning that range from very narrow through nonsensical to none at all.

“Climate Change Rapidly Warming World’s Lakes”

This is claimed as the major finding of a study by Catherine M. O’Reilly, Sapna Sharma, Derek K. Gray, and Stephanie E. Hampton, titled “Rapid and highly variable warming of lake surface waters around the globe” [ .pdf here; poster here, AGU Meeting video presentation here ]. It is notable that the study is a result of the Global Lake Temperature Collaboration (GLTC), which states: “These findings, the need for synthesis of in situ and remote sensing datasets, and continued recognition that global and regional climate change has important impacts on terrestrial and aquatic ecosystems are the motivation behind the Global Lake Temperature Collaboration.”

The AGU Press Release regarding this study begins thus: “Climate change is rapidly warming lakes around the world, threatening freshwater supplies and ecosystems, according to a new study spanning six continents.”

“The study, which was funded by NASA and the National Science Foundation, found lakes are warming an average of 0.61 degrees Fahrenheit (0.34 degrees Celsius) each decade. That’s greater than the warming rate of either the ocean or the atmosphere, and it can have profound effects, the scientists say.”
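As a quick arithmetic check on the two figures in that quote, a temperature change (as opposed to an absolute temperature) converts from Celsius to Fahrenheit by the factor 9/5 alone. The function name below is a hypothetical helper for illustration.

```python
def celsius_delta_to_fahrenheit(delta_c):
    """Convert a temperature *difference* from Celsius to Fahrenheit degrees.

    Differences scale by 9/5 only; the +32 offset applies to absolute
    temperatures, not to changes or rates of change.
    """
    return delta_c * 9 / 5

rate_c_per_decade = 0.34  # figure quoted in the press release
rate_f_per_decade = celsius_delta_to_fahrenheit(rate_c_per_decade)

print(f"{rate_f_per_decade:.2f} °F per decade")
```

The two quoted numbers are consistent with each other; what they are an average of is a separate question.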

All this is followed by scary “if this trend continues” scenarios.

Nowhere in the press release do they state what is actually being measured, averaged and reported.  [See “What Are They Really Counting?”]

So, what is being measured and reported? Buried in the AGU Video presentation, Simon Hook, of JPL and one of the co-authors, reveals in the Q&A session that “these are summertime nighttime surface temperatures.” Let me be even clearer on that: these are summertime nighttime skin surface water temperatures, as in “The SST directly at the surface is called skin SST and can be significantly different from the bulk SST especially under weak winds and high amounts of incoming sunlight …. Satellite instruments that observe in the infrared part of the spectrum in principle measure skin SST.” [source] When pressed, Hook goes on to clarify that the temperatures in the study are greatly influenced by satellite measurement, as the data is in large part satellite data; very little of it comes from in situ measurements [“in its original place or in position”, by hand or buoy, for instance]. This information is, of course, available to those who read the full study and carefully go through the supplemental information and data sets, but it is obscured by the reliance on stating, repeatedly, “lakes are warming an average of 0.61 degrees Fahrenheit (0.34 degrees Celsius) each decade.”

What kind of average?  Apples and Oranges and Bananas.  Fruit Salad.

Here is the study’s map of the lakes studied:

One does not need to be a lake expert to recognize that these lakes range from the Great Lakes of North America and Lake Tanganyika in Africa to Lake Tahoe in the Sierra Nevada Mountains on the border of California and Nevada. Some lakes are smaller and shallow, some lakes are huge and deep, some lakes are in the Arctic and some are in the deserts, some lakes are covered by ice much of the year and some lakes are never iced over, some lakes are fed from melting snow and some are fed by slow-moving equatorial rivers.

Naturally, we would assume that, like Land Surface Temperature and Sea Surface Temperature, the Lake Water Temperature average in this study is weighted by lake surface area. No, it is not. Each lake in the study is given equal value, no matter how small or large, how deep or how shallow, snow fed or river fed. Since the vast majority of the study’s data is from satellite observations, the lakes are all “larger”; small lakes, like the reservoir for my town water supply, are not readily discerned by satellite.
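The difference between the two weighting choices can be sketched with three hypothetical lakes. None of the areas or trends below come from the study; they are invented to show how far an equal-weighted mean and an area-weighted mean can diverge.

```python
# Hypothetical lakes: (surface area in km^2, warming trend in °C per decade).
# None of these numbers come from the study; they are invented for illustration.
lakes = {
    "large deep lake": (60_000.0, 0.10),
    "mid-sized lake": (500.0, 0.40),
    "small shallow lake": (50.0, 0.70),
}

areas = [area for area, _ in lakes.values()]
trends = [trend for _, trend in lakes.values()]

# Each lake counts once, regardless of its size.
equal_weight_mean = sum(trends) / len(trends)

# Each lake counts in proportion to its surface area.
area_weight_mean = sum(a * t for a, t in zip(areas, trends)) / sum(areas)

print(f"equal-weighted: {equal_weight_mean:.3f} °C/decade")
print(f"area-weighted:  {area_weight_mean:.3f} °C/decade")
```

With these made-up values the equal-weighted figure is several times the area-weighted one. Neither is "the" right answer; the point is that the headline number depends on a weighting decision the reader is never told about.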

So what do we have when we “average” the [summertime nighttime skin surface] water temperature of 235 heterogeneous lakes? We get a Fruit Salad — a metric that is mathematically correct, but physically and logically flawed beyond any use [except for propaganda purposes].

This is freely admitted in the conclusion of the study, which we can look at piecemeal: [quoted Conclusion in italics]

“The high level of spatial heterogeneity in lake warming rates found in this study runs counter to the common assumption of general regional coherence.”

Lakes are not regionally responding to a single cause, such as “global warming”. Lakes near one another or in a defined environmental region are not necessarily warming in similar manners or for the same reason, and some neighboring lakes have opposite signs of temperature change. The study refutes the researchers’ expectation that regional surface air temperature warming would correspond to regional lake warming. Not so.

“Lakes for which warming rates were similar in association with particular geomorphic or climatic predictors (i.e., lakes within a “leaf”) [see the study for the leaf chart] showed weak geographic clustering (Figure 3b), contrary to previous inferences of regional-scale spatial coherence in lake warming trends [Palmer et al., 2014; Wagner et al., 2012].”

Lakes are warming for geomorphic reasons (having to do with the form of the landscape and other natural features of the Earth’s surface) and local climate reasons: not regionally, but individually. This heterogeneity implies lack of a single or even similar causes within regions. Lack of homogeneity means that these lakes should not be considered a single set and thus should not be averaged together to find a mean.

“In fact, similarly responding lakes were broadly distributed across the globe, indicating that lake characteristics can strongly mediate climatic effects.”

Globally, lakes are not a physically meaningful set in the context of surface water temperature.

“The heterogeneity in surface warming rates underscores the importance of considering interactions among climate and geomorphic factors that are driving lake responses and prevents simple statements about surface water trends; one cannot assume that any individual lake has warmed concurrently with air temperature, for example, or that all lakes in a region are warming similarly.”

Again, their conclusion is that, globally, lakes are not a physically meaningful set in the context of surface water temperature, yet they insist on finding a simple average, the mean, and basing conclusions and warnings on that mean.

“Predicting future responses of lake ecosystems to climate change relies upon identifying and understanding the nature of such interactions.”

The surprising conclusion shows that if they want to find out what is affecting the temperature of any given lake, they will have to study that lake and its local ecosystem for the causes of any change.

A brave attempt has been made at saving this study with ad hoc conclusions — but most are simply admitting that their original hypothesis of “Global Warming Causes Global Lake Warming” was invalidated.  Lakes (at least Summertime Nighttime Lake Skin Surface Temperatures) may be warming, but they are not warming even in step with air temperatures, not reliably in step with any other particular geomorphic or climatic factor, and not necessarily warming even if air temperatures in the locality are rising.  As a necessary outcome, they fall back on the “average” lake warming metric.

This study is a good example of what happens when scientists attempt to find the averages of things that are dissimilar — so dissimilar that they do not belong in the same “set”.    One can do it mathematically — all the numbers are at least in the same units of degrees C or F — but such averaging gives results that are non-physical and nonsensical — a Fruit Salad resulting from the attempt to average Apples and Oranges and Bananas.

Moreover, Fruit Salad averages not only can lead us astray on a topic but they obscure more information than they illuminate, as is clearly shown by comparing the simplistic Press Release statement “lakes are warming an average of 0.61 degrees Fahrenheit (0.34 degrees Celsius) each decade” to the actual,  more scientifically valid findings of the study which show that each lake’s temperature is changing due to local, sometimes even individual, geomorphic and climate causes specific to each lake and casting doubt on the idea of global or regional causes.
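How much a single mean can hide is easy to sketch. The per-lake trends below are hypothetical values, chosen so that their mean happens to match the headline 0.34 °C per decade; they are not data from the study.

```python
import statistics

# Hypothetical per-lake trends in °C per decade, invented for illustration.
trends = [-0.40, -0.10, 0.05, 0.30, 0.55, 0.90, 1.08]

mean_trend = statistics.mean(trends)
spread = statistics.stdev(trends)          # sample standard deviation
cooling = sum(1 for t in trends if t < 0)  # lakes with a negative trend

print(f"mean trend: {mean_trend:.2f} °C/decade")
print(f"std dev:    {spread:.2f} °C/decade")
print(f"range:      {min(trends):.2f} to {max(trends):.2f}")
print(f"cooling:    {cooling} of {len(trends)} lakes")
```

In this invented set the spread is larger than the mean itself and two of the seven lakes are cooling, none of which survives in the single number.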

Another example of a Fruit Salad metric was shown in my long-ago essay “Baked Alaska?”, which highlighted the logical and scientific error of averaging temperatures for Alaska as a single unit, the “State of Alaska”, a political division, when Alaska, which is very large, consists of 13 distinct, differing climate regions, which have been warming and cooling at different rates (and obviously with different signs) over differing time periods. These important details are all lost, obscured, by the State Average.

### Bottom Line:

It is not enough to correctly mathematically calculate the average of a data set.

It is not enough to be able to defend the methods your Team uses to calculate the [more-often-abused-than-not] Global Averages of data sets.

Data sets must be homogeneous, physically and logically.  They must be data sets of like-with-like, not apples-and-oranges. Data sets, even when averages can be calculated with defensible methods, must have plausible meaning,  both physically and logically.

Careful critical thinkers will be on the alert for numbers which, though the results of simple addition and division,   are in fact Fruit Salad metrics, with little or no real meaning or with meanings far different than the ones claimed for them.

Great care must be taken before accepting that any number presented as an average actually represents the idea being claimed for it.  Averages most often have very narrow applicability, as they obscure the details that often reveal the much-more-important actuality [which is the topic of the next essay in this series].

# # # # #

Note on LOTI, HadCRUT4, etc.: It is my personal opinion that all combined Land and Sea Surface Temperature metrics, by all their various names, including those represented as indexes, anomalies and ‘predictions of least error’, are just this sort of Fruit Salad average. In physics, if not Climate Science, temperature change is an indicator of change in thermal energy of an object (such as of a particular volume of air or sea water). In order to calculate a valid average of mixed air and water temperatures, the data set must first be in equal units for equivalent volumes of the same material (which automatically excludes all data sets of sea surface skin temperatures, which are volume-less). The temperatures of different volumes of different materials, even air with differing humidity and density, cannot be validly averaged without being converted into a set of temperature-equivalent-units of thermal energy for that material by volume. Air and water (and stone and road surfaces and plowed fields) have very different specific heat capacities; thus a 1 °C temperature change of equal volumes of these differing materials represents greatly differing changes in thermal energy. Sea Surface (skin or bulk) Temperatures cannot be averaged with Surface Air Temperatures to produce a physically correct representation claimed as a change in thermal (heat) energy: the two data sets are incommensurable and such averages are Fruit Salad.
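The size of that difference can be estimated with the usual relation Q = ρ·V·c·ΔT. The sketch below uses approximate textbook values for density and specific heat near room temperature and sea-level pressure; these are round reference figures I am assuming for illustration, not numbers from any of the temperature data sets discussed, and the function name is hypothetical.

```python
# Approximate textbook values near room temperature and sea-level pressure.
# These are assumed round reference figures, not values from the essay.
MATERIALS = {
    # name: (density in kg/m^3, specific heat capacity in J/(kg*K))
    "dry air": (1.2, 1005.0),
    "liquid water": (1000.0, 4186.0),
}

def energy_for_warming(material, volume_m3=1.0, delta_t_kelvin=1.0):
    """Return the heat in joules needed to warm a volume: Q = rho * V * c * dT."""
    density, specific_heat = MATERIALS[material]
    return density * volume_m3 * specific_heat * delta_t_kelvin

q_air = energy_for_warming("dry air")
q_water = energy_for_warming("liquid water")

print(f"air:   {q_air:,.0f} J per m^3 per °C")
print(f"water: {q_water:,.0f} J per m^3 per °C")
print(f"ratio: {q_water / q_air:,.0f}x")
```

On these assumed values, warming a cubic meter of water by 1 °C takes a few thousand times the energy needed for a cubic meter of air, which is why averaging the two temperature changes as if they were equivalent says little about energy.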

And yet, we see every day, these surface temperature metrics represented in exactly that non-physical way — as if they are quantitative proof of increasing or decreasing energy retention of the Earth climate system.  This does not mean that correctly measured air temperatures at 2 meters above the surface and surface sea water temperatures (bulk — such as Argo floats at specific depths) cannot tell us something, but we must be very careful in our claims as to what they tell us.  Separate averages of these data sets individually are nonetheless still subject to all the pitfalls and qualifications being presented in this series of essays.

Our frequent commenter, Steven Mosher, recently commented that:

“The global temperature exists. It has a precise physical meaning. It’s this meaning that allows us to say…

The LIA was cooler than today…it’s the meaning that allows us to say the day side of the planet is warmer than the nightside…The same meaning that allows us to say Pluto is cooler than earth and mercury is warmer.”

I must say I agree with his statement, and if Climate Scientists would limit their claims for various Global Temperature averages to these three concepts, their claims would be far more scientifically correct.

NB: I do not think it is correct to say “It has a precise physical meaning.”   It may have a precise description but what it means for the Earth’s climate is far from certain and does not approach precise by any measure.

I expect opinions may vary on this issue.