METR 104:
Our Dynamic
(Lecture w/Lab)
(An Investigation):
Why Does West Coast Precipitation
Vary from Year to Year?
Dr. Dave Dempsey
Dept. of Geosciences
SFSU, Spring 2012

Part II: Statistical Connections
between El Niño/La Niña Events and West Coast Precipitation

Two lab sessions, on Wednesdays, April 25 and May 2 (Lab Section #1), or Fridays, April 27 and May 4 (Lab Section #2), will support the final project, which will be worth 15% of your final course grade. (Some of the lectures on Mondays, April 30 and May 7 will also support this assignment.)

The final project is a research project broken into four distinct parts:

  1. Part I: analysis of precipitation data (done for you, but you need to understand how it was done in order to interpret it properly).
  2. Part II: analysis of Pacific equatorial sea-surface temperature and statistical connections to precipitation data (to be completed in lab session on Wednesday, May 2 or Friday, May 4)
  3. Part III: Jet Stream Patterns during El Niño/La Niña Events (supported in lecture class on Mon., May 7)
  4. A final, summative report (due on Friday, May 18; I will provide you with a template for this)

Overall Objectives:

Objectives for Part II:


I. Introduction

As you probably discovered in Part I of the Final Project, the amount of rainfall that West Coast weather stations receive through the five wettest months of the year (which are typically October through February or November through March, depending on the station) can vary quite a bit from year to year. Very low rainfall years, especially several in a row, result in drought, which stresses people, plants, animals, and industry. Very high rainfall years are often associated with increased likelihood of flooding (though compared to drought, flooding isn't as simply related to total rainfall over periods of months).

What could account for this inter-annual (that is, year to year) variability in precipitation? There is probably more than one cause. Research meteorologists try to identify the most important causes, and operational meteorologists (that is, forecasters) apply that understanding to try to predict whether the upcoming winter rainfall season will likely be relatively wet, "normal", or relatively dry. Being able to anticipate rainfall a season or two in advance can have major economic and other benefits.

To look for possible causes of something like inter-annual variability in precipitation, a common first step is to look for statistical connections between it and its possible causes. Statistical connections aren't by themselves enough to establish a cause-effect relationship, because (a) a statistical connection could occur simply by random chance; and (b) an event and a possible cause that have a statistical connection might not actually have a direct causal connection, but instead might both be caused by something else entirely. However, statistical connections do suggest further investigation to see if there is a physical connection by which one phenomenon might in fact cause the other.

In the second half of the 20th Century, research meteorologists began to notice a possible connection between semi-oscillatory behavior of sea-surface temperatures in the tropical Pacific Ocean, and patterns of rainfall in many places around the world, including parts of the West Coast of North America. The phenomenon is called El Niño/Southern Oscillation (ENSO). In Part II of the Final Project, you will look for statistical connections between the two different phases of ENSO, namely El Niño and its opposite phase, La Niña, and some of the interannual variability of precipitation at each of the West Coast stations that you have analyzed.

To do this, you will need to do the following:

  1. Identify the particular rainy seasons in which El Niño or La Niña events of various strengths have occurred.
  2. Count how many El Niño events of various strengths occurred during "wet" years and during "dry" years, and similarly for La Niña events, for each station that you are analyzing.
  3. Test the hypothesis that the observed number of El Niño or La Niña events that occurred during wet or dry years could have occurred solely by random chance; that is, that there is no statistical connection between El Niño or La Niña events and wet or dry years. If the odds are low that the observed number could have occurred by chance, you'll reject the hypothesis and conclude that there is likely a connection between the two. That will raise the question about whether there is a physical, causal connection, which we'll pursue further in Part III of the Final Project, if warranted.

II. What You Did in Part I

In Part I, you were assigned a set of four weather stations with continuous precipitation records since 1950, including one from each of the following four regions (see map):

  1. Southern California on or near the coast (San Diego, Los Angeles International Airport [KLAX], or Santa Barbara)
  2. Central California (Watsonville, Mission Dolores in San Francisco, or Sacramento)
  3. Northern California or southern Oregon (Eureka, CA; Ashland, OR; or Medford, OR)
  4. Washington state (Olympia, WA; Palmer, WA; or Bellingham, WA)

You then did the following:

  1. For each station, calculated the average observed precipitation for each month, from 1950 through the most recent year with a complete precipitation record, and plotted the twelve monthly averages.
  2. Using these results, identified a single, five-month "rainy season" that more or less described the wettest months for all four stations.
  3. For each station, computed the total precipitation for each five-month rainy season, from the season ending in 1951 through the one ending in the most recent year with sufficient data.
  4. For each station, sorted the rainfall records by rainy season total precipitation, from highest to lowest, and divided the seasons into "wet" years, "normal" years, and "dry" years, each representing one-third of the total number of rainy seasons since 1951.
  5. For each station, color coded the wet and dry years, and then re-sorted the data by year.
  6. For each station, checked your results against the results of someone else to make sure there were no errors, and corrected any errors in the analysis that you found.
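For anyone who would like to see steps 4 and 5 spelled out, here is a minimal Python sketch. It is only an illustration: the function name classify_seasons and the sample totals are made up, and it assumes that when the number of seasons isn't evenly divisible by three, the leftover seasons are counted as "normal" (the steps above don't say how leftovers or ties were handled).

```python
def classify_seasons(totals):
    """Label each rainy season "wet", "normal", or "dry" by thirds.

    totals: dict mapping the year in which a rainy season ends to that
            season's total precipitation.
    """
    # Step 4: sort the seasons from highest to lowest total precipitation.
    ranked = sorted(totals, key=totals.get, reverse=True)
    third = len(ranked) // 3  # number of seasons in the wet and dry groups

    labels = {}
    for rank, year in enumerate(ranked):
        if rank < third:
            labels[year] = "wet"
        elif rank >= len(ranked) - third:
            labels[year] = "dry"
        else:
            labels[year] = "normal"  # assumption: leftovers land here

    # Step 5: re-sort by year.
    return dict(sorted(labels.items()))


# Made-up rainy-season totals (inches), for illustration only.
example = {1951: 20.1, 1952: 31.4, 1953: 12.7, 1954: 18.0, 1955: 25.3, 1956: 9.8}
print(classify_seasons(example))
```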

Our goal now is to see if there is a statistical connection between ENSO events (El Niños and La Niñas) and wet or dry years at each of your selected stations. To do this, we first need to identify when ENSO events occurred since 1950, and to do that we need a criterion for defining the occurrence of these events.

III. Defining and Classifying El Niño and La Niña Events

    The Oceanic Niño Index (ONI). The ONI has become the de facto standard that the National Oceanic and Atmospheric Administration (NOAA) uses for identifying El Niño (warmer than normal sea-surface temperature) and La Niña (cooler than normal) events in the eastern and central tropical Pacific Ocean. The ONI is defined as the 3-month running mean sea-surface temperature (SST) anomaly for the "Niño 3.4" region (i.e., in the region from 5°N – 5°S latitude and 120° – 170°W longitude), as shown in the figure below.

    What is a 3-month running mean? For any particular month, it consists of that month's observations averaged with the previous and next month's observations. For example, the 3-month running mean for November consists of the average of October, November, and December (OND), while the 3-month running mean for December consists of the average of November, December, and January (NDJ).

    What is an SST anomaly? The sea-surface temperature (SST) anomaly is the difference between the observed SST at any particular time and the long-term average SST. A positive anomaly (that is, an anomaly greater than zero) means that the SST is warmer than average, while a negative anomaly means that SST is colder than average.
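The two definitions above (anomaly, then 3-month running mean) can be combined in a short Python sketch. The function names and the temperatures are invented for illustration, and the sketch assumes you already have a long-term average SST for each month. It is not NOAA's actual procedure for computing the ONI.

```python
def sst_anomalies(observed, long_term_average):
    """Anomaly = observed SST minus long-term average SST, month by month."""
    return [obs - avg for obs, avg in zip(observed, long_term_average)]


def running_mean_3(values):
    """Average each month with the previous and next month.

    The first and last months have no complete three-month window,
    so they are returned as None.
    """
    means = [None] * len(values)
    for i in range(1, len(values) - 1):
        means[i] = (values[i - 1] + values[i] + values[i + 1]) / 3
    return means


# Made-up SSTs (°C) for October, November, December, and January.
observed = [27.0, 27.5, 28.0, 28.5]
long_term_average = [26.5, 26.5, 26.5, 26.5]

anomalies = sst_anomalies(observed, long_term_average)
print(anomalies)                  # anomalies for Oct, Nov, Dec, Jan
print(running_mean_3(anomalies))  # the Nov entry is the OND mean, the Dec entry the NDJ mean
```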

    Definition of El Niño events. For our purposes, we'll define El Niño events to occur when there are four or more consecutive months in which the ONI equals or exceeds +0.5°C, and at least one of the four consecutive ONI values overlaps with one of the months of the five-month rainy season for your four stations.

[An example: to clarify what we mean by "at least one ... ONI value overlaps...", suppose that your five-month rainy season consists of November through March (NDJFM). Each ONI value consists of the average of three months' worth of SST anomalies, such as September, October, and November (SON), which we would call the "October" ONI value because the middle month is October. Although October isn't one of the five NDJFM rainy season months, November is both a NDJFM rainy season month and one of the "October" (SON) ONI months. Hence, we say that the October ONI overlaps with the NDJFM rainy season.

(The rationale here is that an El Niño event that begins in November is going to contribute to the October ONI value and potentially influence precipitation during much of the NDJFM rainy season, so we should consider October ONI values when defining El Niño events that might influence the NDJFM rainy season.)

Similarly, the "April" ONI, which consists of the average SST anomalies for March, April, and May (MAM), includes March SST anomalies and hence overlaps with the NDJFM rainy season.]
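The overlap rule in the example above can be checked mechanically. The Python sketch below (the function name is made up for illustration) lists every ONI value, named by its middle month, whose three-month window shares at least one month with a given rainy season.

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def overlapping_oni_months(rainy_season):
    """Return the ONI middle months whose 3-month window overlaps the season."""
    season = set(rainy_season)
    overlapping = []
    for i, middle in enumerate(MONTHS):
        # The window is the previous month, the middle month, and the next
        # month, wrapping around the end of the calendar year.
        window = {MONTHS[(i - 1) % 12], middle, MONTHS[(i + 1) % 12]}
        if window & season:
            overlapping.append(middle)
    return overlapping


# For an NDJFM rainy season, the result (in calendar order) is the
# October through April ONI values.
print(overlapping_oni_months(["Nov", "Dec", "Jan", "Feb", "Mar"]))
```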

We can further subclassify El Niño events as follows:

    Definition of La Niña events. Similarly, we'll define La Niña events as four or more consecutive months in which the ONI is at or below –0.5°C, and at least one of those four months overlaps with one of the months of the five-month rainy season for your four stations.
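As a rough illustration of the two definitions, the Python sketch below looks for a run of four or more consecutive ONI values at or beyond ±0.5°C. It assumes the input holds only the ONI values that overlap the rainy season (for NDJFM, the seven values for October through April), so any run it finds automatically satisfies the overlap requirement. The function names and the sample values are invented for illustration.

```python
def longest_run(values, condition):
    """Length of the longest stretch of consecutive values meeting condition."""
    best = current = 0
    for value in values:
        current = current + 1 if condition(value) else 0
        best = max(best, current)
    return best


def enso_phase(oni_values, threshold=0.5, min_months=4):
    """Return "El Niño", "La Niña", or None for one rainy season.

    oni_values: ONI values (°C) that overlap the rainy season, in time order.
    """
    if longest_run(oni_values, lambda v: v >= threshold) >= min_months:
        return "El Niño"
    if longest_run(oni_values, lambda v: v <= -threshold) >= min_months:
        return "La Niña"
    return None


# Made-up October-through-April ONI values, for illustration only.
print(enso_phase([0.6, 0.8, 1.0, 0.9, 0.7, 0.4, 0.2]))         # five warm months in a row
print(enso_phase([-0.5, -0.8, -1.0, -1.1, -0.9, -0.6, -0.3]))  # six cool months in a row
print(enso_phase([-0.4, -0.6, -0.7, -0.6, -0.3, -0.1, 0.0]))   # only three in a row
```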

We can further subclassify La Niña events as follows:

IV. Instructions for Part II

We will use Oceanic Niño Index (ONI) data that we have adapted from data downloaded from NOAA's Climate Prediction Center, which also provides a graph showing Oceanic Niño Index (ONI) vs. time from 1950 to the current year. [Note, though, that the graph identifies "Strong" and "Moderate" ENSO events based on a slightly different criterion than the one described in Section III above, so you should not rely entirely on these particular classifications to check your own.]

  1. Identify El Niño and La Niña events and classify them as "Strong", "Moderate", or "Weak".

    1. Using the data in the table, "Three-Month Running Average Oceanic Niño Index (ONI) (Oct–Apr)", apply the criteria defined in Section III above to identify El Niño and La Niña events and classify them as "Strong", "Moderate", or "Weak".

      [Example: In the 1950-1951 NDJFM rainy season, there were six months in a row with ONI values at or below –0.5°C, so this was a La Niña event. Only two of those values were as low as –1.0°C, so it was a "weak" event. In the 1951-1952 NDJFM rainy season, there were three months in a row with ONI values exceeding 0.5°C, not enough to qualify as an El Niño event. In the 1954-1955 NDJFM rainy season, there were seven months in a row with ONI values at or below –0.5°C, with four in a row at or below –1.0°C but none at or below –1.5°C, so this was a "moderate" La Niña event.]

    2. For each event that you identify and classify, enter the year in the appropriate "Year" column in all four of the accompanying blank tables, "El Niño/La Niña Classification and Probabilities". (At this point, all four stations should have identical classification and probabilities tables.)

    3. To reduce the chances of mistakes, compare your results with someone else's and make any needed corrections.

  2. For each station, count the number of each type of event that occurred during "wet" years and during "dry" years.

    1. Refer to the precipitation analyses for your four stations. Pick a station. For that station, for each weak, moderate, and strong El Niño and La Niña event, determine whether the event occurred during a "wet" year, a "dry" year, or neither. In the appropriate column of the "El Niño/La Niña Classification and Probabilities" table for the chosen station, enter a "W", a "D", or leave blank, respectively, for the particular event.

    2. When you've finished classifying events as "wet" year or "dry" year events for the chosen station, count the total number of each type of event that occurred in wet years and in dry years, and enter the totals in the station's "El Niño/La Niña Classification and Probabilities" table.

    3. Determine the combined number of moderate and strong events in wet years and enter the total in the table. Repeat for moderate plus strong events in dry years.

    4. To reduce the odds that you've made a mistake, compare your results with someone else analyzing the same station, and make any necessary corrections.

    5. Repeat the previous steps for each of the other three stations.

  3. For each station and each type of ENSO event, test the hypothesis that the number of events actually observed in wet or dry years (or more) could have occurred by random chance.

    1. Refer to the accompanying Tables of Probabilities. Pick a station. For each type of event, determine the probability that, out of all weak El Niños observed to occur since 1951, at least as many as were actually observed in wet years could have occurred by random chance. (See below for more detailed instructions.) Repeat for moderate and for strong El Niños. Enter the results in the appropriate cells of the "El Niño/La Niña Classification and Probabilities" table for the chosen station.

      [Example: Suppose that a total of eight "weak" La Niñas have occurred since 1950, and suppose that five of them occurred during years that were "wet" at a particular station. According to the Tables of Probabilities, the probability that five or more out of eight weak La Niñas could have occurred during wet years by random chance is only 8.8%. (Note that the probability that exactly five out of eight could have occurred during wet years by random chance is even lower, but we're giving the "random chance" hypothesis the benefit of the doubt to increase our confidence that we're right if we reject the hypothesis.)]

    2. Repeat for El Niños of each type that occurred in dry years.

    3. Repeat for weak, moderate, strong, and combined moderate plus strong La Niñas in wet years and in dry years.

    4. Which type(s) of El Niño and/or La Niña event(s) would you say had a high likelihood of being statistically connected to the occurrence of wet or dry years? (See below for guidance about how to decide.)

    5. Repeat for each of your other three stations.
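The counting in step 2 above can be sketched in Python as follows. The function name, the dictionary layout, and the sample years are all invented for illustration. The sketch assumes you have already labeled each rainy season "wet", "normal", or "dry" for one station, and that you have a list of event years for each of the three strengths.

```python
from collections import Counter


def count_events(event_years, season_labels):
    """Tally, for each event strength, how many events fell in wet and dry years.

    event_years:   dict mapping "weak", "moderate", and "strong" to lists of
                   rainy seasons (identified by ending year)
    season_labels: dict mapping ending year to "wet", "normal", or "dry"
    """
    counts = {}
    for strength, years in event_years.items():
        tally = Counter(season_labels[year] for year in years)
        counts[strength] = {
            "total": len(years),
            "wet": tally["wet"],
            "dry": tally["dry"],
        }
    # Combined moderate-plus-strong counts.
    counts["moderate+strong"] = {
        key: counts["moderate"][key] + counts["strong"][key]
        for key in ("total", "wet", "dry")
    }
    return counts


# Made-up labels and event years for one station, for illustration only.
labels = {1951: "wet", 1952: "dry", 1953: "normal",
          1954: "wet", 1955: "dry", 1956: "wet"}
events = {"weak": [1951, 1953], "moderate": [1954, 1955], "strong": [1956]}

for strength, row in count_events(events, labels).items():
    print(strength, row)
```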

    Using the Tables of Probabilities. Suppose that you picked a year at random from the period from 1951 through 2012. The chances that it would be a "wet" year, a "dry" year, or neither (using the definitions in Part I) would each be about 1/3 = 0.333... (that is, about 33.3%).

If you picked not one but two years at random, the probability that both are wet years is 1/9 = 0.111... (that is, 11.1%). (The probability that two independent events both occur is just the probability of each event multiplied together, which in this case is 1/3 × 1/3 = 1/9, or 11.1%.) The probability that both years are dry years is the same (1/9, or 11.1%).

If you picked three years at random, the probability that all three are wet years (or, equally, that all three are dry years) is 1/3 × 1/3 × 1/3 = 1/27 = 0.037 (that is, 3.7%). Similarly, if you picked three years at random, we could (with a little more effort) calculate the probability that at least two of those three years (that is, either two years or three years) are wet years. (That turns out to be 25.9%.)
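The "little more effort" can be written out explicitly. Assuming, as in the numbers above, that each randomly picked year is independently a wet year with probability 1/3, "at least two of three" means either exactly two wet years (which can happen in three different arrangements) or exactly three wet years:

```latex
P(\text{at least 2 of 3 wet})
  = \binom{3}{2}\left(\tfrac{1}{3}\right)^{2}\left(\tfrac{2}{3}\right)
  + \binom{3}{3}\left(\tfrac{1}{3}\right)^{3}
  = \tfrac{6}{27} + \tfrac{1}{27}
  = \tfrac{7}{27} \approx 25.9\%
```

Under the same assumption, the general pattern for "at least k out of n" is:

```latex
P(\text{at least } k \text{ of } n)
  = \sum_{j=k}^{n} \binom{n}{j}\left(\tfrac{1}{3}\right)^{j}\left(\tfrac{2}{3}\right)^{n-j}
```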

As a result of your analysis of ENSO events and the precipitation analyses, you know how many El Niño or La Niña events of each type have occurred since 1950-51, and you know how many of each have occurred in "wet" rainy seasons and in "dry" rainy seasons at a particular station. If there is no connection between any particular type of ENSO event and wet or dry rainy seasons, then the association between them would be purely random. In that case, we can calculate the probability that what we observed could have happened by random accident.

However, if the probability is low enough, we might justifiably conclude that there is probably a (non-random) connection between that type of ENSO event and wet or dry rainy seasons. In the language of statistics, we say that we would "reject the hypothesis" that at that station, ENSO events of that type occur in wet or dry years solely by random chance.

To test the hypothesis that ENSO events of a particular type occur in wet or dry years solely by random accident, proceed as follows:

  1. For a particular station, in its "El Niño/La Niña Classification and Probabilities" table, look up the total number of ENSO events of a particular type (or combination of types) that have occurred during the period from 1951-2012. On the accompanying Tables of Probabilities, find the particular column (A) (labeled "# of Events") that corresponds to that total number of events.

  2. In the table for a particular station, "El Niño/La Niña Classification and Probabilities", look up the number of events of that type that occurred in "wet" or in "dry" years. On the section of the Tables of Probabilities that you located in Step 1 above, locate the row in column (B) (labeled "# Wet or Dry Years") containing the number of events that you counted in wet or dry years.

  3. In column (C) (labeled "Probability"), look up the probability that, out of the observed total number of ENSO events, at least that many would have occurred in wet or in dry years by random chance.

  4. If the probability is low enough, then reject the hypothesis that there are only random, accidental associations between wet or dry rainy seasons and ENSO events of that type. This encourages us to pursue questions about whether, and how, ENSO events might lead physically to (that is, cause) wet or dry rainy seasons, which in turn might help us predict the occurrence of wet and dry rainy seasons with greater accuracy than we could otherwise.
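The Tables of Probabilities aren't reproduced here, but the 8.8% and 25.9% figures quoted in this handout are what you get from a binomial calculation in which each event independently has a 1/3 chance of landing in a wet (or dry) year. Assuming the tables were built that way, the lookup in steps 1 through 3 can be sketched in Python. The function name prob_at_least is made up for illustration.

```python
from math import comb


def prob_at_least(k, n, p=1/3):
    """Probability that at least k of n events land in wet (or dry) years
    by random chance, if each event independently has probability p of
    doing so (assumed here to be 1/3)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))


# Column (A): total number of events; column (B): number in wet or dry years.
total_events = 8
events_in_wet_years = 5

# Column (C): the probability, printed as a percentage.
probability = prob_at_least(events_in_wet_years, total_events)
print(f"{probability:.1%}")
```

With eight events in total and five of them in wet years, this gives the 8.8% quoted in the weak La Niña example earlier.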

    How low is "low enough"? It really depends on how sure you want to be that you're not reaching a wrong conclusion. A probability less than 5% (or even 1%) is best, but for our purposes we'll settle for less than 15%. If the probability is less than 15% that a particular type of event could have occurred in wet or dry years as often as it actually did by random chance alone, then we could be at least 85% sure that the association wasn't actually random.
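The decision rule just described is simple enough to write as a few lines of Python. This is only a sketch: the function name is invented, the 15% threshold is the one chosen for this project, and the "confidence" it returns is simply 100% minus the probability, following the reasoning in the paragraph above.

```python
def evaluate_random_chance(probability, threshold=0.15):
    """Decide whether to reject the "random chance" hypothesis.

    probability: value looked up in column (C), as a fraction (0.088 for 8.8%)
    Returns (reject, confidence), where confidence = 1 - probability.
    """
    reject = probability < threshold
    confidence = 1 - probability
    return reject, confidence


reject, confidence = evaluate_random_chance(0.088)
print(reject, f"{confidence:.1%}")
```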

    [Example: In the example given under Step 3 above, we noted that the probability that at least five out of eight weak La Niñas could have occurred during wet years by random chance alone was only 8.8%. If the probability that this could happen by random chance is only 8.8%, then the probability that it didn't happen by random chance is 100% – 8.8% = 91.2%.

    Since 8.8% is less than the 15% threshold that we decided upon, we would reject the initial hypothesis and say that we're 91.2% confident that since 1951, weak La Niña events did not occur in wet years by random chance. Rather, there is probably some sort of (non-random) connection between them.

    We have to be cautious in the claims we make, though, because if there were errors in the data, or if some of our underlying assumptions were faulty, then our conclusion would not be justified. (Note that in such a case the conclusion might still be right, but we simply couldn't claim to have offered acceptable evidence supporting it.) We can't even conclude that La Niña events might cause wet years (or vice versa) at the chosen station because it's possible that both are caused by some other, third type of event or phenomenon. However, if the data are good and our assumptions reasonable, then the probability that these events are connected in some way seems high enough to justify looking for a physical, causal connection between them.]