
Just some feedback and a question or two from Lessons 4 and 5 of the course. These lessons are difficult, but they ultimately went well and were highly informative in my opinion.

The Einstein summation convention (ESC) took a while to get used to, for me anyway, as this is the first time I have used it, and it required some outside resources, which was to be expected. By the end of Session 5 it wasn’t that bad. For anyone who has worked a lot with matrices there is an intuitive basis on which to build an understanding of the ESC, but it still gets tricky. The thing is, a number of steps in the notes involve matrix/vector calculations based on indicial notation that require correct interpretation to get to the final result, and this interpretation might not be clear if one does not have a good understanding of (tensor) indicial notation.

It would be interesting to hear other students’ comments on the approach with the ESC and index notation.

It was useful for me to have the rules for the ESC and index notation on hand in one spot as I worked through the equations in the lecture. One paper that was of some limited use with the rules was this one:

I can see why the ESC is a worthwhile method to study, given the notational advantages you mention in the lectures; it actually becomes an interesting topic in its own right. Having the same matrix relations from what you describe as “the usual method” alongside the method presented in the slides (Lesson 4) was very useful, because it helped clarify what the correct interpretation of the ESC and the associated matrix algebra had to be in order to produce the same result. I found myself writing in the “usual method” results in a number of instances to help get the ESC interpretation right. The linear regression example in Lesson 4 was also very helpful.
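As a small illustration of the point about the ESC and “the usual method” agreeing, here is a sketch (not from the course materials) that solves the normal equations of a linear regression both ways with `np.einsum`, whose subscript strings mirror the index notation; the data are made up for the example:

```python
import numpy as np

# Hypothetical straight-line data, just to compare the two notations
rng = np.random.default_rng(0)
t = np.arange(10.0)
X = np.column_stack([np.ones_like(t), t])      # columns: intercept, time
x = 2.0 + 0.5 * t + rng.normal(0, 0.1, t.size)

# Index notation: (X^T X)_{jk} = X_{ij} X_{ik}, (X^T x)_j = X_{ij} x_i,
# with summation over the repeated index i
XtX = np.einsum('ij,ik->jk', X, X)
Xtx = np.einsum('ij,i->j', X, x)
beta = np.linalg.solve(XtX, Xtx)

# The "usual" matrix method gives the same coefficients
beta_usual = np.linalg.solve(X.T @ X, X.T @ x)
assert np.allclose(beta, beta_usual)
```

The `'ij,ik->jk'` string is exactly the ESC statement of the calculation: the repeated index `i` is summed, the free indices `j` and `k` survive.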

Lesson 5 was very informative regarding the various types of regression and the Theil-Sen example provided. I think, however, there may be an error on slide 30 of the Lesson 5 slides, pertaining to the Theil-Sen slope confidence interval. After the two indexes L and U are calculated, the slide indicates using the Dth and Uth slope estimates as the 1-alpha CI. In the audio I believe you refer to the Lth and Uth estimates, which would seem like the right values. Is that correct? [**Response**: *Yes. My bad.*]
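For reference, the Lth/Uth construction can be sketched as below. This is one common formulation (a normal approximation based on the variance of Kendall’s S statistic); the exact index formulas for L and U in the lesson slides may differ slightly, so treat this as an illustration rather than the course’s method:

```python
import numpy as np
from itertools import combinations
from scipy.stats import norm

def theil_sen_ci(t, x, alpha=0.05):
    """Theil-Sen slope and a (1 - alpha) CI from the ordered pairwise slopes."""
    slopes = np.array([(x[j] - x[i]) / (t[j] - t[i])
                       for i, j in combinations(range(len(t)), 2)])
    slopes.sort()
    N = slopes.size
    n = len(t)
    # Normal approximation: variance of Kendall's S statistic
    var_S = n * (n - 1) * (2 * n + 5) / 18.0
    C = norm.ppf(1 - alpha / 2) * np.sqrt(var_S)
    L = int(round((N - C) / 2))        # lower index into the sorted slopes
    U = int(round((N + C) / 2)) + 1    # upper index
    return np.median(slopes), slopes[max(L, 0)], slopes[min(U, N - 1)]
```

The point estimate is the median of the pairwise slopes, and the CI endpoints are the Lth and Uth ordered slopes, as in the audio.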

One takeaway from Lesson 5 might be that weighted least squares is an effective way to compensate parameter estimates for heteroscedasticity, provided the variances (in the case discussed, the measurement errors) are known or can be estimated. Would that be correct, and is that typically how heteroscedasticity is handled? [**Response**: Yes. *In fact you don’t need to know the actual variances; it suffices to know the *relative* variances so you get the weights right.*]
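The point in the response about relative variances can be seen in a small sketch (synthetic data, not from the course): scaling all the weights by a constant leaves the WLS estimates unchanged, so only the variance ratios matter.

```python
import numpy as np

# Heteroscedastic straight-line data: error size grows with time
rng = np.random.default_rng(1)
t = np.arange(50.0)
sigma = 0.1 + 0.02 * t                   # assumed (relative) error sizes
x = 1.0 + 0.03 * t + rng.normal(0, sigma)

X = np.column_stack([np.ones_like(t), t])
W = np.diag(1.0 / sigma**2)              # weights proportional to 1/variance

# Weighted normal equations: (X^T W X) beta = X^T W x
beta_w = np.linalg.solve(X.T @ W @ X, X.T @ W @ x)

# Multiplying every weight by a constant gives identical estimates
beta_w2 = np.linalg.solve(X.T @ (10 * W) @ X, X.T @ (10 * W) @ x)
assert np.allclose(beta_w, beta_w2)
```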

Personally I am enjoying the lessons very much, and this is proving to be an excellent course on the subject, without a doubt!

[**Response**: *Thanks. I *love* the Einstein summation convention, and I think its advantages are legion. But it does take getting used to, as does the index notation for vectors/matrices. Those who stick with it, and get accustomed to it, will reap many benefits.*

*Perhaps I don’t fully remember how difficult it was for me when I was first introduced to it. That’s a common problem with teaching in general — the teacher thinks it’s “obvious” but it’s only so if you already know it!*]


This series is another headliner often seen in discussions of the impacts of AGW: Arctic Sea Ice Area. Actually it is its close relative, Arctic Sea Ice Extent, that is probably seen more often. The series here is Arctic Sea Ice Area for September, the yearly minimum area, over the satellite observation period 1979-2015. Arctic sea ice data analysis has been, and continues to be, a subject of intense investigation at Open Mind, which provides an excellent compendium of time series analysis on this topic.

Anyway the plots for the data:

Measurements and regression curve:

beta_2 = -0.0027093 million sq km per yr^2, beta_1 = 10.775 million sq km per yr, beta_0 = -10708.344 million sq km

Regression Residuals:

First 20 lags of the autocorrelation function of the residuals with the +2 sigma thresholds:

The visual test shows the residuals may not be white, as the autocorrelation at the 10th lag (of 20) is over the 2-sigma threshold. Also, the Ljung-Box test produced a p-value of .0187, thus rejecting the null hypothesis of no autocorrelation at the .05 level of significance. So it can be concluded that the residuals are not white at the .05 significance level.
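The two whiteness checks described here can be sketched as follows. This is a plain-numpy implementation of the Ljung-Box statistic applied to stand-in residuals (not the actual sea-ice regression residuals), together with the rough 2/sqrt(n) threshold used in the visual ACF test:

```python
import numpy as np
from scipy.stats import chi2

def ljung_box_pvalue(resid, max_lag=20):
    """p-value of the Ljung-Box test for autocorrelation at lags 1..max_lag."""
    resid = np.asarray(resid, dtype=float)
    resid = resid - resid.mean()
    n = resid.size
    denom = np.sum(resid**2)
    # Sample autocorrelations r_k for k = 1..max_lag
    r = np.array([np.sum(resid[:-k] * resid[k:]) / denom
                  for k in range(1, max_lag + 1)])
    Q = n * (n + 2) * np.sum(r**2 / (n - np.arange(1, max_lag + 1)))
    return chi2.sf(Q, df=max_lag)       # chi-squared tail probability

# Stand-in residuals (37 points, like the 1979-2015 series)
resid = np.random.default_rng(0).normal(size=37)
print(ljung_box_pvalue(resid))   # white noise should usually give p > .05
print(2 / np.sqrt(resid.size))   # rough +/- 2 sigma ACF threshold
```

A small p-value rejects whiteness, matching the .0187 result quoted above.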

For Problem 5 the 5 regression fits were performed on the Sept Arctic Sea Ice data centered at the time origin.

The results for the 5 fits are contained in the following table:

Based on the significance of the coefficients, it would appear the quadratic fit is the best. The Ljung-Box test rejected white residuals in all cases, so perhaps there is a variable that still needs to be accounted for in the regression.
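The coefficient-significance comparison across the five fits can be sketched like this. The series below is stand-in data shaped roughly like a September sea-ice record, not the real data, and the t statistic of the highest-order coefficient is used as the significance measure:

```python
import numpy as np

rng = np.random.default_rng(3)
t = np.arange(1979, 2016, dtype=float)
tc = t - t.mean()                                  # centered time origin
x = 7.0 - 0.05 * tc - 0.003 * tc**2 + rng.normal(0, 0.3, t.size)

tstats = {}
for deg in range(5):                               # the 5 polynomial fits
    X = np.vander(tc, deg + 1, increasing=True)
    beta = np.linalg.lstsq(X, x, rcond=None)[0]
    resid = x - X @ beta
    dof = len(x) - (deg + 1)
    s2 = resid @ resid / dof                       # residual variance
    cov = s2 * np.linalg.inv(X.T @ X)              # coefficient covariance
    tstats[deg] = beta[-1] / np.sqrt(cov[-1, -1])  # t stat, top coefficient
    print(deg, round(tstats[deg], 2))
```

With data that are genuinely quadratic plus noise, the degree-2 term comes out significant while higher-order terms generally do not, which is the kind of pattern that points to the quadratic fit.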


The following time series has a bit of a story with it, so it was used for this problem.

The time series data that resulted in the plots below is the HadCru global temperature anomaly data set over the period 1970 to 2015. But it is not the standard data referenced to the period 1961-1990; instead this data is referenced to the period 1850-1900. Now one might wonder why anyone would be interested in a data set referenced to that period. Well, as this article from the British Met Office

http://www.metoffice.gov.uk/research/news/2015/global-average-temperature-2015

indicates, this period serves (at least in some sense) as the pre-industrial level that we often hear cited in news articles or discussions of global warming science, policy, etc. In the notes and references they give their rationale for using this period. So the data in the plot below is the temperature anomaly above the pre-industrial level defined in the article.

The British economic historian T. S. Ashton put the Industrial Revolution at 1760-1830, so pre-industrial is taken as before 1760. That being the case, there may be model-based methods used to determine the commonly cited pre-industrial levels, but I was unable to find any. The IPCC AR5 had a couple of references to the 1850-1900 reference period and many references to pre-industrial levels, but I found no explicit mention of how they determine temperatures relative to the pre-industrial era, though maybe there is something in there.

This data was kindly provided to me by the good people at the Met Office, as it is not typically available at their data website. It included monthly anomalies as well, and all their standard uncertainty measures. Actually, the anomalies referenced to 1850-1900 could be derived from those referenced to 1961-1990. Methods for doing this were discussed at Open Mind a few years ago, when questions arose as to how to find a common baseline for temperature data sets with different baselines.

Anyway, the plots for the data. The results are similar to those for the NASA data given in the Session 3 presentation:

Measurements and regression line:

beta_1 = .016996 degC/yr, beta_0 = -33.316 degC

For 2015 the measured value was 1.06 degrees above pre-industrial, and the estimated value was about .931 degrees above.
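The .931 figure follows directly from the reported coefficients, as a quick check shows:

```python
# Fitted value for 2015 from the reported regression coefficients
beta_1 = 0.016996   # degC/yr
beta_0 = -33.316    # degC
x_2015 = beta_0 + beta_1 * 2015
print(round(x_2015, 3))   # -> 0.931
```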

Regression Residuals:

First 20 lags of the autocorrelation function of the residuals with the +2 sigma thresholds:

The visual test shows there is little reason to believe that the residuals are not white. The Ljung-Box test produced a p-value of .4056, thus failing to reject the null hypothesis of no autocorrelation, so it is consistent with the visual test of the autocorrelation function.


For Problem 4, shifting the time scale by the mean of the time variable, E(t):

For a linear regression, one thing shifting the origin does is put the mean of x and the mean of t_new at the center of the plot, making it easy to see what the data is doing relative to the mean.

It was shown in the lecture that E(x) = beta_0 + beta_1*E(t). With t_new = t - E(t) we have E(t_new) = 0, so the regression line intercept becomes the mean of the data.
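This intercept-equals-mean property is easy to verify numerically; the sketch below uses a made-up stand-in series rather than the HadCru data:

```python
import numpy as np

# Stand-in trending series over 1970-2015 (not the HadCru anomalies)
rng = np.random.default_rng(4)
t = np.arange(1970, 2016, dtype=float)
x = 0.017 * t - 33.3 + rng.normal(0, 0.1, t.size)

t_new = t - t.mean()                   # centered time: E(t_new) = 0
beta_1, beta_0 = np.polyfit(t_new, x, 1)

# With a zero-mean regressor, the fitted intercept is the data mean
assert np.isclose(beta_0, x.mean())
```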

Here are the HadCru global temperature anomalies and the regression line with time centered at the origin:

For comparison here is the unshifted HadCru global temperature anomaly data together with the regression line:



If anything, a few more examples would be fine. Also, sometimes right after you point to a bit of algebra, the slide shifts immediately, so it takes pausing the video and going back to follow it.

My husband started following the course too, so we are discussing time series at home and having fun.

Thank you very much for the course.




[**Response**: *See the UPDATE at the end of this post.*]

