Instructions for this demo are below the graph.
Click on the Go button to generate 100 data sets, just like the data for assignment 1. For each data set, the graph shows the line that connects the first and last data points. I call these "End Point Regression" lines. (The data points are not shown, to avoid cluttering the graph.) These 100 End Point Regression lines cluster around the true line.
Click the Go button again to show in blue the least squares regression lines for each of those same 100 data sets. The least squares lines cluster more tightly around the true line than the endpoint lines do. This shows that, for data generated this way, the least squares estimator is more efficient than the connect-the-endpoints estimator.
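For readers who want to reproduce the comparison outside the demo, here is a minimal Python sketch of the same experiment. It is not the demo's own code. The true line, the x values, and the error standard deviation are all hypothetical choices of mine, since the settings used for assignment 1 are not stated here. The sketch compares only the slopes of the two kinds of lines, using the spread of the 100 slope estimates as a rough measure of efficiency.

```python
import random
import statistics

# Hypothetical settings; the demo's actual true line and x values may differ.
TRUE_INTERCEPT = 10.0
TRUE_SLOPE = 2.0
X_VALUES = list(range(1, 11))   # ten observations at x = 1, 2, ..., 10
ERROR_SD = 5.0                  # same error standard deviation for every observation


def endpoint_slope(xs, ys):
    """Slope of the line connecting the first and last data points."""
    return (ys[-1] - ys[0]) / (xs[-1] - xs[0])


def least_squares_slope(xs, ys):
    """Slope of the ordinary least squares line."""
    x_bar = statistics.mean(xs)
    y_bar = statistics.mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


def simulate(n_sets=100, seed=0):
    """Generate n_sets data sets and return both slope estimates for each one."""
    rng = random.Random(seed)
    endpoint_slopes, ls_slopes = [], []
    for _ in range(n_sets):
        # Errors are independent, normal, mean 0, and equal in variance.
        ys = [TRUE_INTERCEPT + TRUE_SLOPE * x + rng.gauss(0.0, ERROR_SD)
              for x in X_VALUES]
        endpoint_slopes.append(endpoint_slope(X_VALUES, ys))
        ls_slopes.append(least_squares_slope(X_VALUES, ys))
    return endpoint_slopes, ls_slopes


endpoint_slopes, ls_slopes = simulate()
print("SD of endpoint slopes:     ", statistics.stdev(endpoint_slopes))
print("SD of least squares slopes:", statistics.stdev(ls_slopes))
```

Under these assumed settings, the least squares slopes should typically show the smaller standard deviation, which is the numerical counterpart of the blue lines clustering more tightly on the graph.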
The least squares estimator wins this efficiency matchup because the data here are generated in a way that conforms with the assumptions behind using least squares: an expected value of 0 for all observations' errors, the same variance for all observations' errors, and a normal distribution for each observation's error. It is possible to devise a way of generating the data that would make the connect-the-endpoints estimator more efficient. Judgements about the efficiency of an estimator therefore depend on assumptions about how the data were generated.
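One possible way to generate data that favors the connect-the-endpoints estimator is sketched below in Python. This is my own illustration, not something built into the demo. It keeps the errors normal with expected value 0, but drops the equal-variance assumption: the first and last observations get very small error variance, and the interior observations get very large error variance. All numeric settings are hypothetical.

```python
import random
import statistics

# Hypothetical settings chosen to break the equal-variance assumption.
TRUE_INTERCEPT = 10.0
TRUE_SLOPE = 2.0
X_VALUES = list(range(1, 11))
ENDPOINT_ERROR_SD = 0.5    # first and last observations are measured precisely
INTERIOR_ERROR_SD = 20.0   # interior observations are very noisy


def endpoint_slope(xs, ys):
    """Slope of the line connecting the first and last data points."""
    return (ys[-1] - ys[0]) / (xs[-1] - xs[0])


def least_squares_slope(xs, ys):
    """Slope of the ordinary (unweighted) least squares line."""
    x_bar = statistics.mean(xs)
    y_bar = statistics.mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


def simulate_unequal_variance(n_sets=100, seed=0):
    """Generate data sets whose error variance depends on position in the sample."""
    rng = random.Random(seed)
    last = len(X_VALUES) - 1
    endpoint_slopes, ls_slopes = [], []
    for _ in range(n_sets):
        ys = []
        for i, x in enumerate(X_VALUES):
            sd = ENDPOINT_ERROR_SD if i in (0, last) else INTERIOR_ERROR_SD
            ys.append(TRUE_INTERCEPT + TRUE_SLOPE * x + rng.gauss(0.0, sd))
        endpoint_slopes.append(endpoint_slope(X_VALUES, ys))
        ls_slopes.append(least_squares_slope(X_VALUES, ys))
    return endpoint_slopes, ls_slopes


endpoint_slopes, ls_slopes = simulate_unequal_variance()
print("SD of endpoint slopes:     ", statistics.stdev(endpoint_slopes))
print("SD of least squares slopes:", statistics.stdev(ls_slopes))
```

With these assumed settings the ranking reverses: the endpoint slopes vary far less than the least squares slopes, because ordinary least squares gives weight to the noisy interior points while the endpoint line ignores them.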
Click on the Clear button to clear the screen for a new demonstration.
Click on the Show True button to show or hide the true line.