|
|
|
|
|
|
|
Instructions and materials
Linear regression in Excel
Excel is our basic tool for conducting linear regression in the course. We will also introduce Igor in order to introduce you to a more elegant software package. But, Excel gets the job done. There are two basic tools in Excel. Trendline is a way to obtain a linear regression of data in a plot. This is a good quick way to get an estimate of the slope and intercept. Trendline also gives you the correlation coefficient, R2. However, if you want the errors themselves (i.e. in the slope and intercept) then you need to use LINEST. The segment below shows both of these methods. It also reviews the matrix steps that are behind LINEST (and Trendline). These are given in Lecture 2 as well.
The LS command worksheet
To use the LS command worksheet you will simply need to download the commands
First paste your data as x,y pairs in columns A and B of an Excel worksheet. Then follow the directions exactly by making a rectangle for the LINEST output and AVERAGE, SUMSQ (sum of squares) and TINV (t-test). Then you will input 5 lines of code that will generate the regression line (column C), the lower 95% confidence limit (column D), the upper 95% confidence limit (column E), the lower 95% prediction limit (column F) and the upper 95% prediction limit (column G). These steps are all shown in the PDF in the link below.
The Q-test worksheet: treating a potential outlier in 1-D data set
The Dixon Q-test is based on a comparison between a calculated value Q = gap/range with the Q value for a given number of data points within the 95% confidence limit. The test assumes that the data conform to a normal distribution. Thus, the idea is that one is comparing a point to the distribution predicted by the remaining points. Given this assumption the test is relatively direct. Calculate the range = highest - lowest and the gap = highest - next highest and take the absolute value of the ratio |gap/range|. This is done mostly automatically in the worksheet below. In the worksheet you still need to paste in the data in column A (starting on A2 and going down). You need to change the number of points in cell F2 so that the MAX and MIN functions go from A2 to A# where # is the number of data points. MAX($A$2:$A$#)-MIN($A$2:$A$#). As long as the number of data points is greater than 10 you do not need to change cell I2, which is set to 0.466, the value appropriate for all data sets greater than 10 points. One case is not considered here. If the outlier is a large negative value then you would need to order the data set from lowest to highest (instead of highest to lowest). The calculation in the DixonQ worksheet is set to use absolute values so that the case of a negative outlier work the same way as a positive one.
The Q-test worksheet: treating a potential outlier in 2-D data set
The Dixon Q-test for a line is based on the values in the residuals (residuals = calculated line - data). The column of y values can then be used in a Dixon Q-test in the same way as above. First, calculate Q = gap/range and then compare to the appropriate value for a 95% confidence limit. To test for a negative outlier the worksheet is expanded as shown in the downloads below.
The limit of detection (LOD)
The LOD is determined by drawing a horizontal line from the intersection of the upper 95% confidence limit with the y-axis until it intersects the lower 95% confidence limit. But, often the data do not extend to zero. The worksheet below gives you two ways to use the LS data to extrapolate to zero and obtain an estimate for the LOD.
The RLS worksheet
The Excel worksheet has been designed as tool for your use throughout the course. It provides several options for error analysis. Perhaps the most important is a linear regression that includes outputs of the standard errors using both least squares and robust statistics. By the end of the lab and Lecture 1 you should understand that robust statistics is useful for determining outliers. Outliers can be excluded from the data set provided you have gone through the appropriate statistical analysis. The RLS worksheet does this, but you should not use it as a "black box". You should understand the various tools provided and make sure you can justify the applications of this worksheet as you perform analyises of the data in various laboratories and experiments in this course.
|
|
|