List of data sets used in this online book
The Intuition Behind Correlation
Automobile MPG Data Set: Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science. Download curated data set

Average monthly maximum temperatures recorded in Boston, Massachusetts. Data is taken from National Centers for Environmental Information. Download curated data set

Understanding Partial Auto-correlation And The PACF
Southern Oscillation Index (SOI) data is downloaded from the Weather Indices page of the United States National Weather Service’s Climate Prediction Center. Download curated data set.
Average monthly maximum temperatures recorded in Boston, Massachusetts. Data is taken from National Centers for Environmental Information. Download curated data set

How To Adjust For Inflation In Monetary Data Sets
Wages and salaries by Occupation: Total wage and salary earners (series id: CXU900000LB1203M). U.S. Bureau of Labor Statistics

How To Isolate Trend, Seasonality And Noise From A Time Series
U.S. Census Bureau, Retail Sales: Used Car Dealers [MRTSSM44112USN], retrieved from FRED, Federal Reserve Bank of St. Louis; https://fred.stlouisfed.org/series/MRTSSM44112USN, June 17, 2020, under FRED copyright terms. Download curated data set.
U.S. Census Bureau, E-Commerce Retail Sales [ECOMNSA], retrieved from FRED, Federal Reserve Bank of St. Louis; https://fred.stlouisfed.org/series/ECOMN, under FRED copyright terms.
SILSO, World Data Center — Sunspot Number and Long-term Solar Observations, Royal Observatory of Belgium, on-line Sunspot Number catalogue: http://www.sidc.be/SILSO/, 1818–2020 (CC BY-NC)
Samuel H. Williamson, “Daily Closing Values of the DJA in the United States, 1885 to Present,” MeasuringWorth, 2020 URL: http://www.measuringworth.com/DJA/

The White Noise Model
Restaurant decibel levels data is copyright Sachin Date under CC-BY-NC-SA. Download curated data set

The Assumptions Of Linear Regression, And How To Test Them
Combined Cycle Power Plant Data Set: downloaded from the UCI Machine Learning Repository and used in accordance with the following citation requests:
- Pınar Tüfekci, Prediction of full load electrical power output of a base load operated combined cycle power plant using machine learning methods, International Journal of Electrical Power & Energy Systems, Volume 60, September 2014, Pages 126–140, ISSN 0142-0615.
- Heysem Kaya, Pınar Tüfekci, Sadık Fikret Gürgen: Local and Global Learning Methods for Predicting Power of a Combined Gas & Steam Turbine, Proceedings of the International Conference on Emerging Trends in Computer and Electronics Engineering ICETCEE 2012, pp. 13–18 (Mar. 2012, Dubai).

Introduction to Heteroskedasticity
U.S. Census Bureau, Retail Sales: Beer, Wine, and Liquor Stores [MRTSSM4453USN], retrieved from FRED, Federal Reserve Bank of St. Louis, June 19, 2021.
U.S. Bureau of Labor Statistics, Export Price Index (End Use): Non-monetary Gold [IQ12260], retrieved from FRED, Federal Reserve Bank of St. Louis, June 19, 2021. Download curated data set

Building Robust Linear Models For Nonlinear, Heteroscedastic Data
U.S. Bureau of Labor Statistics, Export Price Index (End Use): Non-monetary Gold [IQ12260], retrieved from FRED, Federal Reserve Bank of St. Louis, June 19, 2021. Curated version for download

The Poisson Regression Model
Bicycle Counts for East River Bridges. Daily total of bike counts conducted monthly on the Brooklyn Bridge, Manhattan Bridge, Williamsburg Bridge, and Queensboro Bridge. From NYC Open Data under Terms of Use. Curated data set for download.

The Negative Binomial Regression Model
Bicycle Counts for East River Bridges. Daily total of bike counts conducted monthly on the Brooklyn Bridge, Manhattan Bridge, Williamsburg Bridge, and Queensboro Bridge. From NYC Open Data under Terms of Use. Curated data set for download.

The Generalized Poisson Regression Model
Bicycle Counts for East River Bridges. Daily total of bike counts conducted monthly on the Brooklyn Bridge, Manhattan Bridge, Williamsburg Bridge, and Queensboro Bridge. From NYC Open Data under Terms of Use. Curated data set for download.

Fitting Linear Regression Models on Count Based Data Sets
Bicycle Counts for East River Bridges. Daily total of bike counts conducted monthly on the Brooklyn Bridge, Manhattan Bridge, Williamsburg Bridge, and Queensboro Bridge. From NYC Open Data under Terms of Use. Curated data set for download.

Introduction to Regression With ARIMA Errors Model
The Air Quality data set is from the UCI Machine Learning Repository and is available for research purposes. Curated data set download link
Paper citation for original data set: S. De Vito, E. Massera, M. Piga, L. Martinotto, G. Di Francia, On field calibration of an electronic nose for benzene estimation in an urban pollution monitoring scenario, Sensors and Actuators B: Chemical, Volume 129, Issue 2, 22 February 2008, Pages 750–757, ISSN 0925-4005.

Holt-Winters Exponential Smoothing
U.S. Census Bureau, Retail Sales: Used Car Dealers [MRTSSM44112USN], retrieved from FRED, Federal Reserve Bank of St. Louis; https://fred.stlouisfed.org/series/MRTSSM44112USN, June 17, 2020, under FRED copyright terms. Download link to curated data set.
Merck & Co., Inc. (MRK), NYSE — Historical Adjusted Closing Price. Currency in USD, https://finance.yahoo.com/quote/MRK/history?p=MRK, 23-Jul-2020. Copyright Yahoo Finance and NYSE
SILSO, World Data Center — Sunspot Number and Long-term Solar Observations, Royal Observatory of Belgium, on-line Sunspot Number catalogue: http://www.sidc.be/SILSO/, 1818–2020 (CC BY-NC)

Poisson Regression Models For Time Series Data
The STRIKES Data set. Source: R data sets

The Binomial Regression Model
The Titanic data set has been downloaded from Stanford’s CS109 class website. Curated data set download link.

Introduction to Survival Analysis – Concepts, Techniques and Regression models
The Stanford heart transplant data set is taken from https://statistics.stanford.edu/research/covariance-analysis-heart-transplant-survival-data and available for personal/research purposes only. Curated data set download.

The Stratified Cox Proportional Hazards Regression Model
The VA lung cancer data appears in the following book: The Statistical Analysis of Failure Time Data, Second Edition, by John D. Kalbfleisch and Ross L. Prentice.

Testing For Normality of Residual Errors Using Skewness And Kurtosis Measures
Wages and salaries by Occupation: Total wage and salary earners (series id: CXU900000LB1203M). U.S. Bureau of Labor Statistics under US BLS Copyright Terms. Curated data set link for download

The Nonlinear Least Squares (NLS) Regression Model
The Bike sharing data set has been downloaded from UCI Machine Learning Repository. Curated data set download link.
Data set cited in paper: Fanaee-T, Hadi, and Gama, Joao, “Event labeling combining ensemble detectors and background knowledge”, Progress in Artificial Intelligence (2013): pp. 1–15, Springer Berlin Heidelberg, doi:10.1007/s13748-013-0040-3.

Estimation of Vaccine Efficacy Using a Logistic Regression Model
COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University. URL: https://github.com/CSSEGISandData/COVID-19.
Dong E, Du H, Gardner L. An interactive web-based dashboard to track COVID-19 in real time. Lancet Infect Dis. 2020;20(5):533–534. doi: 10.1016/S1473-3099(20)30120-1

The Akaike Information Criterion
Average monthly maximum temperatures recorded in Boston, Massachusetts. Data is taken from National Centers for Environmental Information. Download link for the curated data set

R-squared, Adjusted R-squared and Pseudo-R-squared
The Taiwan house prices data set was retrieved from the UCI Machine Learning Repository. Curated data set download
Data set citation: I-Cheng Yeh, Tzu-Kuang Hsu, Building real estate valuation models with comparative approach through case-based reasoning, Applied Soft Computing, Volume 65, 2018, Pages 260-271, ISSN 1568-4946

The Chi Squared Test
The TAKEOVER BIDS data set is taken from the following paper: Jaggia, S., Thosar, S. Multiple bids as a consequence of target management resistance: A count data approach. Rev Quant Finan Acc 3, 447–457 (1993). https://doi.org/10.1007/BF02409622 (PDF download link). Download link for the curated data set.

Testing The Assumptions Of the Cox Proportional Hazards Model Using Schoenfeld Residuals
The Stanford heart transplant data set is taken from https://statistics.stanford.edu/research/covariance-analysis-heart-transplant-survival-data and available for personal/research purposes only. Download curated data set.

Estimating The Range Of A Population Parameter: A Guide To Interval Estimation
The DOHMH Beach Water Quality Data is taken from NYC OpenData under their Terms of Use. Download curated data set.

Estimator Bias, And The Bias — Variance Tradeoff
The North East Atlantic Real Time Sea Surface Temperature data set was downloaded from data.world under CC BY 4.0.

The Auto-Regressive Poisson Model
The Poisson INAR(1) Regression Model
The Manufacturing strikes data set is one of several data sets available for public use and experimentation in statistical software, most notably as an R package. The data set has been made accessible for use in Python by Vincent Arel-Bundock via vincentarelbundock.github.io/rdatasets under a GPL v3 license.

Hidden Markov Models
U.S. Bureau of Labor Statistics, Unemployment Rate [UNRATE], retrieved from FRED, Federal Reserve Bank of St. Louis; https://fred.stlouisfed.org/series/UNRATE, October 29, 2021. Available under Public license.

The Markov Switching Dynamic Regression Model
U.S. Bureau of Economic Analysis, Personal Consumption Expenditures [PCE], retrieved from FRED, Federal Reserve Bank of St. Louis; https://fred.stlouisfed.org/series/PCE, November 11, 2021. Available under Public license.

University of Michigan, Survey Research Center, Surveys of Consumers. The Index of Consumer Sentiment. Available under public license.

Hamilton, James, Dates of U.S. recessions as inferred by GDP-based recession indicator [JHDUSRGDPBR], retrieved from FRED, Federal Reserve Bank of St. Louis; https://fred.stlouisfed.org/series/JHDUSRGDPBR, November 12, 2021.

The Pooled OLS Regression Model for Panel Data Sets
The Fixed Effects Regression Model For Panel Data Sets
The Random Effects Regression Model for Panel Data Sets
World Development Indicators data from World Bank under CC BY 4.0 license. Download curated dataset
