Part II: Statistical Connections
between El Niño/La Niña Events and West Coast Precipitation
The last two lab meetings of the semester, on Wednesdays, Dec. 5 and 12 (Lab Section #1) or Fridays, Dec. 7 and 14 (Lab Section #2), will be devoted to supporting the final project, which will be worth 15% of your final course grade. (Some of each of the last three lecture class meetings on the Mondays of Dec. 3, 10, and 17 will also support this assignment.)
The final project is a research project broken into
four distinct parts:
- Part I: Analysis of
precipitation data (done for you, but you need to understand how it was done in order to interpret it properly).
- Part II: Analysis of Pacific equatorial
sea-surface temperature and statistical connections to precipitation
data (to be completed in lab session on Wednesday, Dec. 12 or Friday, Dec. 14)
- Part III: Jet Stream Patterns during El Niño/La Niña Events (supported in lecture class on Mon., Dec. 17)
- A final, summative report (due on Friday, Dec. 21; I will provide you with a template for this)
- Conduct a realistic research project to investigate possible connections, both statistical and physical, between variations in sea surface temperatures near the equator in the Pacific Ocean and variations in winter precipitation on the west coast of the U.S.
- Access, analyze, interpret, and present data to test the assertion that there are such connections.
Objectives for Part II:
- Using Oceanic Niño Index data, identify years when there were El
Niño and La Niña events of various strengths from winter of 1950-51
through winter of 2011-12.
- Using precipitation data analyzed in Part I, determine for that period how many El
Niño events and how many La Niña events occurred during "wet" years and how many of each occurred during "dry" years, for each of four
stations on the West Coast.
- For each type of El Niño/La Niña event, estimate the probability
that at least that many events would have occurred during wet or dry
years if there were no systematic connection between them; and conclude
whether there likely is or perhaps isn't a connection, at least
As you probably discovered in Part I of the Final Project, the amount
of rainfall that West Coast weather stations receive through the five
wettest months of the year (which are typically October through February
or November through March, depending on the station), can vary quite a bit from year to year.
Very low rainfall years, especially several in a row, result in drought,
which stresses people, plants, animals, and industry. Very high
rainfall years are often associated with increased likelihood of
flooding (though compared to drought, flooding isn't as simply related
to total rainfall over periods of months).
What could account for this inter-annual (that
is, year to year) variability in precipitation? There is
probably more than one cause. Research meteorologists try to identify
the most important causes, and operational meteorologists (that is,
forecasters) apply that understanding to try to predict whether the
upcoming winter rainfall season will likely be relatively wet, "normal",
or relatively dry. Being able to anticipate rainfall a season or two in
advance can have major economic and other benefits.
To look for possible causes for something like
inter-annual variability in precipitation, a common first step is to
look for statistical connections between it and possible causes
of it. However, statistical connections aren't by themselves enough to
establish a cause-effect relationship because (a) a statistical
connection could occur simply by random chance; and (b) two types of events that have a statistical connection (correlation) might not actually
have a direct causal connection, but instead might both be caused by
another cause entirely. However, statistical connections do suggest
further investigation to see if there is a physical connection
by which one phenomenon might in fact cause the other.
In the second half of the 20th Century, research
meteorologists began to notice a possible connection between
semi-oscillatory behavior of sea-surface temperatures in the tropical
Pacific Ocean, and patterns of rainfall in many places around the world,
including parts of the West Coast of North America. The phenomenon is
called El Niño/Southern Oscillation (ENSO). In Part II of the Final
Project, you will look for statistical connections between the two
different phases of El Niño/Southern Oscillation (ENSO), namely El Niño
and its opposite phase, La Niña, and some of the interannual variability
of precipitation at each of the West Coast stations that you have
To do this, you will need to do the following:
- Identify the particular rainy seasons in which El Niño or La
Niña events of various strengths have occurred.
- Count how many El Niño events of various strengths occurred
during "wet" years and during "dry" years, and similarly for La Niña
events, for each station that you are analyzing.
- Test the hypothesis that the observed number of El Niño or La Niña events
that occurred during wet or dry years could have occurred solely by
random chance, meaning that there is no statistical connection between El
Niño or La Niña events and wet or dry years. If the odds are low that
the observed number could have occurred by chance, you'll reject the
hypothesis and conclude that there is likely a connection between the
two. That will raise the question about whether there is a physical,
causal connection, which we'll pursue further in Part III of the Final
Project, if warranted.
II. What You Did in Part I
In Part I, you were assigned a set of four weather
stations with continuous precipitation records since 1950,
including one from each of the following four regions (see map):
- Southern California on or near the coast (San
Diego, Los Angeles International Airport [KLAX], or Santa Barbara)
- Central California (Watsonville, Mission Dolores in San Francisco,
- Northern California or southern Oregon (Eureka, CA; Ashland, OR;
or Medford, OR)
- Washington state (Olympia, WA; Palmer, WA; or Bellingham, WA)
You were given rainfall data for your assigned stations, which were analyzed (mostly for you) as follows:
- For each station, the average observed
precipitation for each month, from 1950 through the most recent year with a complete precipitation record, was calculated and plotted.
- Using these results, you identified a single, five-month "rainy
season" that more or less described the wettest months for all
- For each station, the total precipitation for each
five-month rainy season, from the season ending in 1951 through the one
ending in the most recent year with sufficient data, was computed.
- For each station, the rainfall records were sorted (ranked) by total 5-month rainy season
precipitation, from highest to lowest, and the rainy seasons
were divided into "wet" years, "normal" years, and "dry" years, each representing
about one-third of the total number of rainy seasons since 1951.
- For each station, the wet and dry years were color coded, and then
re-sorted by year (that is, chronologically).
- For each station, the analysis was repeated independently and the results carefully checked against the results of the first analysis to reduce the chance that errors were made, and any errors
in the analysis were corrected.
Our goal now is to see if there is a statistical connection
between ENSO events (El Niños and La Niñas) and wet or dry years at
each of your selected stations. To do this, we first need to identify
when ENSO events occurred since 1950, and to do that we need a criterion
for defining the occurrence of these events.
III. Defining and Classifying El Niño and
La Niña Events
The Oceanic Niño
Index (ONI). The ONI
has become the de facto standard that the National Oceanic
and Atmospheric Administration (NOAA) uses for identifying El Niño
(warmer than normal sea-surface temperature) and La Niña (cooler
than normal) events in the eastern and central tropical Pacific
Ocean. The ONI is defined as the
3-month running mean sea-surface temperature (SST)
anomaly for the "Niño 3.4" region (i.e., in the region from
5°N – 5°S latitude and 120° –
170°W longitude), as shown in the figure below.
What is a 3-month
running mean? For any particular month, it
consists of that month's observations averaged with the previous and
next month's observations. For example, the 3-month running mean for
November consists of the average of October, November, and December
(OND), while the 3-month running mean for December consists of the
average of November, December, and January (NDJ).
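As a rough illustration of the running mean just described, the short Python sketch below averages each month's value with the values for the month before and the month after. The function name `running_mean_3` and the anomaly values are made up for illustration; they are not actual ONI data.

```python
def running_mean_3(values):
    """Centered 3-month running mean.

    The first and last entries have no neighbor on one side,
    so they are left as None.
    """
    out = [None] * len(values)
    for i in range(1, len(values) - 1):
        out[i] = (values[i - 1] + values[i] + values[i + 1]) / 3
    return out


# Hypothetical monthly SST anomalies (deg C), NOT real data.
months = ["Oct", "Nov", "Dec", "Jan"]
anomalies = [0.3, 0.6, 0.9, 1.2]

for month, mean in zip(months, running_mean_3(anomalies)):
    label = "n/a" if mean is None else f"{mean:.2f}"
    print(month, label)
```

Here the "Nov" value is the OND average and the "Dec" value is the NDJ average, matching the naming used above.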
What is an SST
anomaly? The sea-surface temperature (SST) anomaly is the
difference between the observed SST at any particular time and
the long-term average SST. A positive anomaly (that
is, an anomaly greater than zero) means that the SST is warmer
than average, while a negative anomaly means that the SST is colder
than average.
Definition of El Niño events.
For our purposes, we'll define El Niño events to occur when there
are four or more consecutive months in which the ONI equals or
exceeds +0.5°C, and at least one of the four
consecutive ONI values overlaps with one of the months of the five-month
rainy season for your four stations.
[An example: to clarify what we mean
by "...at least one ... ONI value overlaps...", suppose that
your five-month rainy season consists of November through March
(NDJFM). Each ONI value consists of the average of three months' worth
of SST anomalies, such as September, October, and November (SON), which
we would call the "October" ONI value because the middle month is
October. Although October isn't one of the five NDJFM rainy season
months, November is both an NDJFM rainy season month and one of
the "October" (SON) ONI months. Hence, we say that the October ONI
overlaps with the NDJFM rainy season.
(The rationale here is that an El Niño
event that begins in November is going to contribute to the October ONI
value and potentially influence precipitation during much of
the NDJFM rainy season, so we should consider October ONI values when
defining El Niño
events that might influence the NDJFM rainy season.)
Similarly, the "April" ONI, which consists
of the average SST anomalies for March, April, and May (MAM), includes
March SST anomalies and hence overlaps with the NDJFM rainy season.]
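The overlap rule in the example above can be sketched in a few lines of Python. This is only an illustration: the helper names `oni_months` and `overlaps` are invented here, and the rainy season is assumed to be NDJFM as in the example.

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def oni_months(center):
    """Return the three months averaged into the ONI value named for `center`."""
    i = MONTHS.index(center)
    # The modulo wraps around the calendar (e.g., "Jan" covers Dec, Jan, Feb).
    return {MONTHS[(i - 1) % 12], center, MONTHS[(i + 1) % 12]}


def overlaps(center, rainy_season):
    """True if the ONI value named for `center` shares a month with the rainy season."""
    return bool(oni_months(center) & set(rainy_season))


ndjfm = ["Nov", "Dec", "Jan", "Feb", "Mar"]  # assumed rainy season
for center in ["Sep", "Oct", "Apr", "May"]:
    print(center, overlaps(center, ndjfm))
```

Under these assumptions, the "October" (SON) and "April" (MAM) values overlap with NDJFM, while the "September" (ASO) and "May" (AMJ) values do not.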
We can further subclassify El Niño events as follows:
- "Strong" (at least three of the four+ consecutive SST anomalies equal
or exceed +1.5°C)
- "Moderate" (the conditions for a "strong" event aren't met, but
at least three of the SST anomalies equal or exceed +1.0°C)
- "Weak" (the conditions for a "moderate" event aren't met, but all
four+ SST anomalies equal or exceed +0.5°C)
La Niña events. Similarly, we'll
define La Niña events as four or more consecutive months in which
the ONI is at or below –0.5°C, and at least one of
those four months overlaps with one of the months of the
five-month rainy season for your four stations.
We can further subclassify La Niña events as follows:
- "Strong" (at least three of the four+ consecutive SST anomalies are at or below –1.5°C)
- "Moderate" (the conditions for a "strong" event aren't met, but
at least three of the SST anomalies are at or below –1.0°C)
- "Weak" (the conditions for a "moderate" event aren't met, but all
four+ SST anomalies are at or below –0.5°C)
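One way to apply the criteria above is sketched below in Python. It rests on several assumptions that are not part of the assignment: the input is the list of ONI values for a single rainy season (for example, the seven Oct–Apr values), so the rainy-season overlap condition is taken as already satisfied; "at least three" is counted anywhere within the run of qualifying months; and the function names and all sample values are invented for illustration, not real ONI data.

```python
def longest_run(flags):
    """Return (start, end) of the longest run of True values (end exclusive)."""
    best = (0, 0)
    start = None
    # A trailing False closes any run that reaches the end of the list.
    for i, flag in enumerate(list(flags) + [False]):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start > best[1] - best[0]:
                best = (start, i)
            start = None
    return best


def classify_season(oni):
    """Classify one season's ONI values (deg C) as an ENSO event, if any."""
    for name, sign in (("El Niño", 1.0), ("La Niña", -1.0)):
        # Flip the sign for La Niña so both phases use ">=" thresholds.
        signed = [sign * value for value in oni]
        start, end = longest_run(value >= 0.5 for value in signed)
        if end - start < 4:
            continue  # fewer than four consecutive qualifying months
        run = signed[start:end]
        if sum(value >= 1.5 for value in run) >= 3:
            return f"Strong {name}"
        if sum(value >= 1.0 for value in run) >= 3:
            return f"Moderate {name}"
        return f"Weak {name}"
    return "No event"


# Hypothetical seasons (Oct-Apr ONI values), NOT real data.
print(classify_season([-0.6, -0.8, -1.0, -1.1, -0.7, -0.5, -0.3]))
print(classify_season([0.3, 0.5, 0.6, 0.7, 0.4, 0.2, 0.1]))
print(classify_season([1.0, 1.5, 1.8, 2.0, 1.6, 1.2, 0.8]))
```

The first sample season has six qualifying months but only two at or below –1.0°C, so it comes out as a weak La Niña; the second has only three qualifying months in a row, so it is not an event.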
IV. Instructions for Part II
We will use Oceanic
Niño Index (ONI) data that we have adapted from data downloaded
from the Climate Prediction Center, which also provides a graph
showing Oceanic Niño Index (ONI) vs. time from 1950 to the
current year. [Note, though, that the graph identifies "Strong" and
"Moderate" ENSO events based on a slightly different criterion than the
one described in Section III above, so you should not rely entirely on
these particular classifications to check your own.]
- Identify El Niño and La Niña events and
classify them as "Strong", "Moderate", or "Weak".
- Using the data in the table, "Three-Month
Running Average Oceanic Niño Index (ONI) (Oct–Apr)",
apply the criteria defined in Section III above to identify El Niño and
La Niña events and classify them as "Strong", "Moderate", or "Weak".
[Example: In the 1950-1951 NDJFM rainy season,
there were six months in a row with ONI values at or below –0.5°C,
so this was a La Niña event. Only two of those values were as low as
–1.0°C, so it was a "weak" event. In the 1951-1952 NDJFM
rainy season, there were three months in a row with ONI values
at or above +0.5°C, not enough to qualify as an El Niño event.
In the 1954-1955 NDJFM rainy season, there were seven months in a row
with ONI values at or below –0.5°C, with four in a row at
or below –1.0°C but none at or below –1.5°C,
so this was a "moderate" La Niña event.]
- For each event that you identify and classify,
enter the year in the appropriate "Year" column in all four of the
accompanying blank tables, "El
Niño/La Niña Classification and Probabilities".
(At this point, all four stations should have identical classification
and probabilities tables.)
- To reduce the chances of mistakes, compare
your results with someone else's and make any needed corrections.
- For each station, count the number of each
type of event that occurred during "wet" years and during "dry" years.
- Refer to the precipitation analyses
for your four stations. Pick a station. For that
station, for each weak, moderate, and strong El Niño and La Niña event,
determine whether the event occurred during a "wet" year, a "dry" year,
or neither. In the appropriate column of the "El Niño/La Niña Classification
and Probabilities" table for the chosen station, enter a
"W", a "D", or leave blank, respectively, for the particular event.
- When you've finished classifying events as
"wet" year or "dry" year events for the chosen station, count the total
number of each type of event that occurred in wet years and in dry
years, and enter the totals in the station's "El Niño/La Niña Classification
and Probabilities" table.
- Determine the combined
number of moderate and strong events in wet years and enter the total
in the table. Repeat for moderate plus
strong events in dry years.
- To reduce the odds that you've made a mistake,
compare your results with someone else analyzing the same station, and
make any necessary corrections.
- Repeat the previous steps for each of the
other three stations.
- For each station and each type of ENSO
event, test the hypothesis that the number of events (or more) actually
observed in wet or dry years could have occurred by random chance.
- Refer to the accompanying Tables of Probabilities.
Pick a station. For each type of event, determine the probability that,
out of all weak El Niños observed to occur since 1951, at least as many
as were actually observed in wet years could have occurred by random
chance. (See below for more detailed instructions.) Repeat for moderate
and for strong El Niños. Enter the results in the appropriate cells of
the "El Niño/La Niña
Classification and Probabilities" table for the chosen station.
[Example: Suppose that a total of eight "weak"
La Niñas have occurred since 1950, and suppose that five of them
occurred during years that were "wet" at a particular station. According to the Tables of Probabilities,
the probability that five or more out of eight weak La Niñas could have
occurred during wet years by random chance is only 8.8%. (Note that the
probability that exactly five out of eight could have occurred
during wet years by random chance is even lower, but we're giving the
"random chance" hypothesis the benefit of the doubt to increase our
confidence that we're right if we reject the hypothesis.)]
- Repeat for El Niños of each type that occurred
in dry years.
- Repeat for weak, moderate, strong, and
combined moderate plus strong La Niñas in wet years and in dry years.
- Which type(s) of El Niño and/or La Niña
event(s) would you say had a high likelihood of being statistically
connected to the occurrence of wet or dry years? (See below for
guidance about how to decide.)
- Repeat for each of your other three stations.
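The counting in step 2 of the instructions above could be sketched in Python as follows, if you wanted to double-check your hand tallies. Everything here is illustrative: the function name `count_events`, the event classifications, and the wet/dry year sets are invented and do not correspond to any real station or to the actual ONI record.

```python
from collections import Counter


def count_events(events, wet_years, dry_years):
    """Tally events by (type, "W" or "D"); events in neither kind of year are skipped."""
    counts = Counter()
    for year, kind in events.items():
        if year in wet_years:
            counts[(kind, "W")] += 1
        elif year in dry_years:
            counts[(kind, "D")] += 1
    return counts


# Invented classifications and year sets, NOT real results.
events = {
    1951: "weak La Niña",
    1955: "moderate La Niña",
    1958: "strong El Niño",
    1966: "moderate El Niño",
    1973: "strong El Niño",
}
wet_years = {1952, 1958, 1973}
dry_years = {1951, 1966}

counts = count_events(events, wet_years, dry_years)

# Combined moderate plus strong El Niño events in wet years.
combined_wet = counts[("moderate El Niño", "W")] + counts[("strong El Niño", "W")]
print(dict(counts))
print("moderate + strong El Niño in wet years:", combined_wet)
```

A `Counter` returns zero for combinations that never occurred, which keeps the combined moderate-plus-strong totals simple to compute.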
Using the Tables of Probabilities.
Suppose that you picked a year at random from the period from 1951
through 2012. The chances that it would be a "wet" year, a "dry" year, or
neither (using the definitions in Part I) would each
be about 1/3 = 0.333... (that is, about 33%).
If you picked not one but two years at
random, the probability that both are wet years is 1/9 =
0.111... (that is, 11.1%). (The probability that two independent events both
occur is just the product of their individual probabilities,
which in this case is 1/3 × 1/3 = 1/9, or 11.1%.) The probability that
both years are dry years is the same (1/9, or 11.1%).
If you picked three years at random, the
probability that all three are wet years (or, likewise, that all three are
dry years) is 1/3 × 1/3 × 1/3 = 1/27 = 0.037 (that is, 3.7%). Similarly,
if you picked three years at random, we could (with a little more
effort) calculate the probability that at least two of those
three years (that is, either two years or three years)
are wet years. (That turns out to be 25.9%.)
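The "little more effort" can be handed to a few lines of Python. The sketch below assumes that each randomly chosen year is independently wet (or dry) with probability 1/3, so that the number of wet years follows a binomial distribution; that the Tables of Probabilities were built this way is an assumption, although it reproduces the values quoted in this handout. The function name `prob_at_least` is invented for illustration.

```python
from math import comb


def prob_at_least(k, n, p=1/3):
    """Probability that at least k of n independent picks land in a wet (or dry) year.

    Assumes each pick has the same probability p of being a wet (or dry) year.
    """
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))


print(f"{prob_at_least(2, 2):.1%}")  # both of two years wet: 11.1%
print(f"{prob_at_least(3, 3):.1%}")  # all three of three years wet: 3.7%
print(f"{prob_at_least(2, 3):.1%}")  # at least two of three years wet: 25.9%
print(f"{prob_at_least(5, 8):.1%}")  # at least five of eight events: 8.8%
```

The last line matches the 8.8% quoted in the weak La Niña example in the instructions, which is the sum of the probabilities of exactly five, six, seven, and eight events in wet years.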
As a result of your analysis of ENSO events and
the precipitation analyses, you know how many El Niño or La Niña events of each
type have occurred at a particular station since 1950-51, and you know
how many of each have occurred in "wet" rainy seasons and in "dry" rainy
seasons. If there is no connection between any particular type of ENSO
event and wet or dry rainy seasons, then the association between them
would be purely random. In that case, we can calculate the probability
that what we observed could have happened by random accident.
However, if the probability is low enough, we might
justifiably conclude that there is probably a (non-random) connection
between that type of ENSO event and wet or dry rainy seasons. In the
language of statistics, we say that we would "reject the hypothesis"
that at that station, ENSO events of that type occur in wet or dry years
solely by random chance.
To test the hypothesis that ENSO events of a
particular type occur in wet or dry years solely by random accident,
proceed as follows:
- For a particular station,
in its "El Niño/La Niña
Classification and Probabilities" table, look up
the total number of ENSO events of a particular type (or combination
of types) that have occurred during the period from 1951-2012. On the
accompanying Tables of
Probabilities, find the particular column (A) (labeled "#
of Events") that corresponds to that total number of events.
- In the table for a particular station, "El Niño/La Niña Classification
and Probabilities", look up the number of events of that
type that occurred in "wet" or in "dry" years. On the section of the Tables of Probabilities
that you located in Step (a) above, locate the row in column (B)
(labeled "# Wet or Dry Years") containing the number of events that you
counted in wet or dry years.
- In column (C) (labeled "Probability"), look up the
probability that, from among the observed total number of ENSO
events, at least that many would have occurred by
random chance in wet or in dry years.
- If the probability is low enough, then reject
the hypothesis that there are only random, accidental associations
between wet or dry rainy seasons and ENSO events of that type. This
encourages us to pursue questions about whether, and how, ENSO events
might physically lead to (that is, cause) wet or dry rainy seasons, which
in turn might help us predict the occurrence of wet and dry rainy seasons
with greater accuracy than we could otherwise.
How low is "low enough"? It really depends on how sure you want to
be that you're not reaching a wrong conclusion. A probability less than
5% (or even 1%) is best, but for our purposes we'll settle for less
than 15%. If the probability is less than 15% that a particular type of
event could have occurred in wet or dry years as often as it actually
did by random chance alone, then we could be at least 85% sure that the
association wasn't actually random.
[Example: In the example given in 3(a) above, we
noted that the probability that at least five out of eight weak La Niñas
could have occurred during wet years by random chance alone was only
8.8%. If the probability that this could happen by random chance is only
8.8%, then the probability that it didn't happen by random
chance is 100% – 8.8% = 91.2%.
Since 8.8% is less than the 15% threshold that we decided upon, we would
reject the initial hypothesis and say that we're 91.2%
confident that since 1951, weak La Niña events did not occur in wet
years by random chance. Rather, there is probably some sort of
(non-random) connection between them.
We have to be cautious in the claims we make, though, because if there
were errors in the data, or if some of our underlying assumptions were
faulty, then our conclusion would not be justified. (Note that in such a
case the conclusion might still be right, but we simply couldn't claim
to have offered acceptable evidence supporting it.) We can't even
conclude that La Niña events might cause wet years (or vice versa) at
the chosen station because it's possible that both are caused by some
other, third type of event or phenomenon. However, if the data are good
and our assumptions reasonable, then the probability that these events
are connected in some way seems high enough to justify looking for a
physical, causal connection between them.]