
Python Statistics Fundamentals
Description
Book Introduction
"Python Statistics Fundamentals" explains the concepts of important statistical topics and provides step-by-step explanations of application methods and execution procedures through examples and Python solutions.
Table of Contents
1. Statistics and Python
1.1 Statistics and Data
1.1.1 Concepts and basic elements of statistics
1.1.1.1 Concepts of Statistics
1.1.1.2 Decision Making and Statistical Thinking
1.1.1.3 Basic Elements of Statistics
1.1.1.4 Challenges in the Application of Statistics
1.1.2 Data
1.1.2.1 Data Concepts and Types
1.1.2.2 Concept and Characteristics of Big Data
1.1.2.3 Data Utilization Plan
1.2 Python
1.2.1 Python's Strengths and Specialized Libraries
1.2.1.1 Strengths of Python
1.2.1.2 Python's specialized modules and libraries
1.2.1.3 Python's specialized statistical library
1.2.2 Installing Python and Related Packages
1.2.2.1 Installing Python
1.2.2.2 PIP Installation
1.2.2.3 Installing ANACONDA
1.2.2.4 Installing Jupyter Notebook
1.2.2.5 Installing required libraries using PIP
2. Python language, data frames, and basic operations
2.1 Python Language and Data Frames
2.1.1 Python Language and Standard Data Types
2.1.1.1 Python Language
2.1.1.2 Python Standard Data Types
2.1.2 Python functions and list and string processing
2.1.2.1 Python functions
2.1.2.2 Python data type conversion and annotation
2.1.2.3 List processing and related functions
2.1.2.4 String Handling
2.1.2.5 Using f-strings
2.1.3 Data Frame Structure and Utilization
2.1.3.1 Data frame structure
2.1.3.2 Setting up df using NumPy arrays and saving and reading files
2.1.3.3 Convenient operations and modifications of NumPy arrays
2.1.3.4 Merging Data Frames
2.1.3.5 Creating a Contingency Table and Pivot Table
2.1.3.6 Creating a cross-tabulation
2.2 Python's basic operation functions
2.2.1 Basic operators and applications of Python
2.2.1.1 Python Basic Operators
2.2.2 Basic Python functions related to numbers and statistics
2.2.2.1 Python Basic Functions for Numbers
2.2.2.2 math module functions
2.2.2.3 Basic Python Date-Related Functions
2.2.2.4 Python basic functions related to statistics
2.2.2.5 Command Statement Flow Control
3. Descriptive statistics and data visualization, and frequency distribution generation
3.1 Descriptive Statistics and Data Visualization
3.1.1 Components and characteristics of descriptive statistics
3.1.1.1 Components of descriptive statistics
3.1.1.2 Characteristics of descriptive statistics
3.1.2 Exploratory Data Analysis and Data Visualization
3.1.2.1 Exploratory Data Analysis and Key Tools
3.1.2.2 Purpose and Objectives of Visualization
3.1.2.3 Roles and types of charts and how to apply them
3.1.3 Principles and Preparation of Frequency Distribution
3.1.3.1 Principle of frequency distribution
3.1.3.2 Procedure for creating a frequency distribution
3.2 Creating frequency distributions for qualitative and quantitative data
3.2.1 Creating a frequency distribution and chart for qualitative data
3.2.1.1 Creating a frequency distribution of qualitative data
3.2.1.2 Creating a frequency distribution chart for qualitative data
3.2.2 Creating a frequency distribution and chart of quantitative data
3.2.2.1 Creating a frequency distribution of quantitative data
3.2.2.2 Creating a frequency distribution chart for quantitative data
4. Measures of central tendency, dispersion, and shape
4.1 Central tendency measures and location measures
4.1.1 Concept of central tendency measures
4.1.1.1 Concept of average
4.1.1.2 Mode, Median, and Midpoint
4.1.2 Concept of relative position measurement
4.1.2.1 Standard score
4.1.2.2 Percentile
4.1.2.3 Quartiles and Deciles
4.1.3 Python Computation of Central Tendency Measures and Location Measures
4.2 Variability Measurement and Distribution Shape Analysis
4.2.1 Concept and Types of Variability Measures
4.2.1.1 Concept of variability measure
4.2.1.2 Range
4.2.1.3 Mean deviation
4.2.1.4 Variance and standard deviation
4.2.1.5 Using standard deviation
4.2.2 Python Computation of Variability Measures
4.2.3 Concept of distribution shape measurement
4.2.3.1 The concept of skewness
4.2.3.2 The concept of kurtosis and the shape of the distribution
4.2.4 Python Computation of Distribution Shape Measures
4.2.4.1 Pandas and Scipy calculations of skewness
4.2.4.2 Calculating Kurtosis with Pandas and Scipy
4.2.4.3 Checking the distribution shape by creating a frequency distribution curve
5. Probability calculations, contingency tables, and Bayes' theorem
5.1 Events and probability calculations
5.1.1 Events and Sets
5.1.1.1 Experiments and Results
5.1.1.2 Concept and types of sets
5.1.1.3 Sample space and sample points
5.1.1.4 Concept and Types of Events
5.1.2 Concept and calculation of probability
5.1.2.1 Concept of probability
5.1.2.2 Approach to probability calculation
5.1.3 Counting Rules and Probability Calculation Rules
5.1.3.1 Counting Rule
5.1.3.2 Probability Calculation Rules
5.1.4 Calculating Probability Using Python
5.1.4.1 Calculating the probability of heads on a random coin toss
5.1.4.2 Simulating a Random Coin Toss and Calculating the Probability of Heads
5.1.4.3 Creating a Dot Plot for a Simulation
5.1.4.4 Factorial, Permutation, and Combination Calculations
5.2 Conditional probability, contingency tables, and Bayes' theorem
5.2.1 Conditional probability and joint probability
5.2.1.1 Conditional probability and joint probability
5.2.1.2 Conditional probability and joint probability of independent events
5.2.2 Contingency Table and Bayes' Rule
5.2.2.1 Contingency Table
5.2.2.2 Joint probability table and marginal probability
5.2.2.3 Bayes' Rule and Diachronic Interpretation
5.2.3 Creating a contingency table and solving Bayes' rule using Python
5.2.3.1 Creating a contingency table using Python
5.2.3.2 Bayes' Rule Solution Using Python
6. Random variables and probability distributions
6.1 Random variables, probability distributions, and expected values
6.1.1 Concept and types of random variables
6.1.1.1 Concept of random variable
6.1.1.2 Types of random variables
6.1.1.3 Random Number Generation in Python
6.1.2 Probability distribution and expected value of random variables
6.1.2.1 Concept of probability distribution
6.1.2.2 Expectation and Variance of Random Variables
6.2 Discrete and continuous probability distributions
6.2.1 Discrete probability distribution
6.2.1.1 Concept of discrete probability distribution
6.2.1.2 Bernoulli Trial and Bernoulli Process
6.2.1.3 Binomial distribution
6.2.1.4 Poisson distribution
6.2.1.5 Hypergeometric distribution
6.2.1.6 Negative binomial distribution and geometric distribution
6.2.2 Continuous probability distribution
6.2.2.1 Concept of continuous probability distribution
6.2.2.2 Uniform distribution
6.2.2.3 Exponential distribution
6.2.2.4 Normal distribution
6.2.2.5 Central Limit Theorem and Monte Carlo Simulation
7. Sample surveys, sampling, and sampling distributions
7.1 Sample surveys and sampling
7.1.1 Knowledge Exploration and Sample Surveys
7.1.1.1 Scientific Method
7.1.1.2 Investigation Approach and Requirements
7.1.1.3 Advantages and procedures for conducting sample surveys
7.1.2 Sampling
7.1.2.1 Sampling Concept
7.1.2.2 Sampling Bias and Sampling Error
7.1.2.3 Probability Sampling
7.1.2.4 Non-probability sampling
7.2 Statistical Inference and Sampling Distributions
7.2.1 Statistical Inference and Sample Statistics
7.2.1.1 Concepts of statistical inference
7.2.1.2 Sample statistics
7.2.2 Sampling distribution
7.2.2.1 Concept of sampling distribution
7.2.2.2 Sampling distribution of the mean
7.2.2.3 Expected value and standard deviation of the mean sampling distribution
7.2.2.4 Method for determining sample size and number of samples
7.2.2.5 Normal approximation and continuity correction of the binomial distribution
7.2.3 Other sampling distributions
7.2.3.1 Other sample statistics and sampling distributions
7.2.3.2 t distribution
7.2.3.3 Chi-square distribution
7.2.3.4 F distribution
8. Estimation and confidence intervals
8.1 Point estimation and interval estimation
8.1.1 Point Estimation and Point Estimators
8.1.1.1 Point Estimation and Point Estimators and Point Estimates
8.1.1.2 Requirements for Estimators
8.1.1.3 Problems with Using Point Estimation
8.1.2 Interval Estimation
8.1.2.1 Concept and components of interval estimation
8.1.2.2 Concept and estimation procedure of confidence intervals
8.2 Mean, point estimation, and interval estimation
8.2.1 Interval Estimation of Population Mean and Determining Sample Size Using Python
8.2.1.1 Interval Estimation of the Population Mean: When the Population Variance is Known
8.2.1.2 Interval Estimation of the Population Mean: When the Population Variance is Unknown
8.2.1.3 Determining the minimum sample size for estimating the population mean
8.2.1.4 Bootstrap
8.2.2 Interval Estimation of Population Proportions and Variances
8.2.2.1 Interval estimation of population proportions and determination of sample size
8.2.2.2 Interval estimation of population variance
9. Hypothesis testing
9.1 Hypothesis Testing Principles
9.1.1 Hypothesis Testing Principles
9.1.1.1 Concept of hypothesis testing
9.1.1.2 Classification of hypotheses
9.1.2 Hypothesis Testing Decision-Making and Implementation Procedures
9.1.2.1 Decision-making in hypothesis testing
9.1.2.2 Errors and rejection regions in hypothesis testing decisions
9.1.2.3 Hypothesis testing procedure
9.2 Sample Testing
9.2.1 Single-sample test
9.2.1.1 One-sample t-test
9.2.1.2 Proportion z-test
9.2.1.3 Hypothesis Testing of Variance and Standard Deviation
9.2.1.4 Power and error matrix
9.2.2 Principles and procedures of two-sample hypothesis testing
9.2.2.1 Principles of Two-Sample Hypothesis Testing
9.2.2.2 Test for difference in means of independent samples
9.2.2.3 Test for difference in means of two dependent samples
9.2.2.4 Test for difference in proportions of two independent samples
9.2.2.5 Test for difference in two sample variances
9.2.3 Python Example Solution for Two-Sample Hypothesis Testing
9.2.3.1 Two-Sample t-Test: Equal Population Variances Two-Sided Test
9.2.3.2 Two-Sample t-Test: Two-Tail Test with Different Population Variances
9.2.3.3 Two Independent Samples t-Test: Equal Population Variances Right-Tailed Test
9.2.3.4 Two Independent Samples t-Test: Equal Population Variances Left-Tailed Test
9.2.3.5 Two independent sample mean difference z-test
9.2.3.6 Using Python ztest to Test the Difference Between Two Independent Sample Means
9.2.3.7 Using Python rp.ttest to Test the Difference Between Two Independent Sample Means
9.2.3.8 Two-Sample t-Test Using Python rp.ttest
9.2.3.9 Two-Sample Proportions Z-Test Using Python's proportions_ztest
9.2.3.10 F-test for Difference in Two-Sample Variances Using Python
10. ANOVA
10.1 Experimental design and ANOVA principles
10.1.1 Statistical Experiments and Experimental Design
10.1.1.1 Statistical Experiments
10.1.1.2 Experimental Design
10.1.2 ANOVA Principles and Types
10.1.2.1 Principles of ANOVA
10.1.2.2 Types of ANOVA
10.2 One-way ANOVA
10.2.1 Concept and implementation procedure of one-way ANOVA
10.2.1.1 Concept and characteristics of one-way ANOVA
10.2.1.2 Procedure for performing one-way ANOVA
10.2.2 One-Way ANOVA Python Solution
10.2.2.1 One-Way ANOVA Python Solution
10.2.2.2 Comparison of means of data with different sample sizes
10.2.2.3 Comparison of means of data with the same sample size
10.2.2.4 Splitting data by category and comparing means
10.3 Two-way ANOVA
10.3.1 Principles and methods of non-repeated two-way ANOVA
10.3.1.1 Principles of Two-Way ANOVA
10.3.1.2 How to perform non-repeated two-way ANOVA
10.3.2 Non-repeated Two-Way ANOVA Python Solution
10.4 Repeated Two-Way ANOVA
10.4.1 Principles and implementation method of repeated two-way ANOVA
10.4.1.1 Principles of repeated two-way ANOVA
10.4.1.2 How to perform repeated two-way ANOVA
10.4.2 Python solution for repeated two-way ANOVA
11. Correlation analysis and scatter plot
11.1 Correlation Analysis and Covariance
11.1.1 Correlation Analysis and Covariance Concepts
11.1.1.1 Correlation and Linear Relationship
11.1.1.2 Covariance
11.1.2 Correlation Coefficient and Correlation Test
11.1.2.1 Principle of calculating correlation coefficient
11.1.2.2 Interpretation of correlation coefficients
11.1.2.3 Correlation Test
11.2 Scatterplot
11.2.1 Scatterplot Concepts and Interpretation
11.2.1.1 Scatterplot concept and preparation procedure
11.2.1.2 Scatterplot shape and direction and strength of correlation coefficient
11.2.2 Creating a Scatterplot Using Python
11.2.2.1 Creating a scatter plot comparing two correlation coefficients
11.2.2.2 Creating a scatterplot excluding outliers
11.2.2.3 Creating a Single Chart with Two Scatterplots
12. Regression analysis
12.1 Regression Analysis Principles and Simple Regression Analysis
12.1.1 Regression Analysis Principles
12.1.1.1 Concept and Purpose of Regression
12.1.1.2 The Essence of Regression Analysis
12.1.1.3 Types of Regression Analysis
12.1.2 Simple regression analysis
12.1.2.1 Simple linear regression
12.1.2.2 Exploring Regression Fit and Outlier Influence
12.1.2.3 Goodness-of-fit diagnostic plot
12.1.2.4 Performing Simple Linear Regression Using Python
12.1.2.5 Confidence intervals and prediction intervals in regression analysis
12.1.2.6 Creating Confidence Interval and Prediction Interval Charts Using Python
12.2 Multiple Regression
12.2.1 Principles of Multiple Regression Analysis
12.2.1.1 The Essence of Multiple Regression Analysis
12.2.1.2 Setting up a multiple regression analysis model
12.2.1.3 Test statistics for multiple regression analysis
12.2.2 Multiple Regression Analysis Using Python
12.2.2.1 Individual Effects Multiple Regression Analysis
12.2.2.2 Performing Standard and Stepwise Multiple Regression Analysis Using Python
13. Time series analysis
13.1 Principles of time series analysis and time series decomposition
13.1.1 Time Series Data Nature and Plot Creation
13.1.1.1 The Nature of Time Series Data
13.1.1.2 Creating a Time Series Plot Using the Python Altair Module
13.1.1.3 Creating a Python Series Plot
13.1.2 Decomposition of time series data
13.1.2.1 Four major determinants of time series
13.1.2.2 Python Date Display
13.1.2.3 Trend Analysis Python Solution
13.1.2.4 Python solution for cyclical variation analysis
13.1.2.5 Seasonal Variation Analysis
13.2 Time Series Forecasting
13.2.1 Nature and Performance of Time Series Forecasting
13.2.1.1 Concepts and Types of Time Series Forecasting
13.2.1.2 Time series forecasting performance and requirements
13.2.1.3 Characteristics of Univariate Time Series Forecasting
13.2.2 Exponential Smoothing Forecasting
13.2.2.1 Smoothing and Pandas rolling & mean method
13.2.2.2 Exponential Smoothing and Python Utilization
13.2.3 Stationarity of Time Series Data and ARIMA Models
13.2.3.1 Stationarity of time series data
13.2.3.2 Autocorrelation Test of Time Series
13.2.3.3 ARIMA Model and SARIMAX
13.2.3.4 Assessing the Accuracy of Time Series Forecasts
13.2.3.5 Python Solutions for ARIMA and SARIMA Models
14. Chi-square test and nonparametric test
14.1 Chi-square test
14.1.1 Goodness-of-Fit Test
14.1.1.1 Goodness-of-Fit Test Principle
14.1.1.2 Characteristics of the chi-square test of goodness of fit
14.1.1.3 Procedure for performing chi-square test for goodness of fit
14.1.1.4 Python solution for chi-square test of goodness of fit
14.1.2 Independence Test
14.1.2.1 Concept of Independence Test
14.1.2.2 Procedure for performing the chi-square test of independence
14.1.2.3 Python solution for the chi-square test of independence
14.2 Nonparametric tests
14.2.1 Concept and performance principles of nonparametric tests
14.2.1.1 Concepts and advantages of nonparametric tests
14.2.1.2 Disadvantages of nonparametric tests
14.2.1.3 Principles and assumptions of nonparametric tests
14.2.2 Python solution for nonparametric tests
14.2.2.1 Runs test
14.2.2.2 Sign Test
14.2.2.3 Wilcoxon rank-sum test
14.2.2.4 Wilcoxon signed-rank test
14.2.2.5 Kruskal-Wallis test
14.2.2.6 Friedman Fr-test
14.2.2.7 McNemar's test
14.2.2.8 Spearman's rank correlation test
References
Index
Publisher's Review
Preface
This book explains the concepts of important topics in statistics and provides step-by-step explanations of how to apply them through examples and procedures, along with Python solutions.
Advances in computing and in information and communication technology led, in the early 1990s, to the emergence of Python, a new kind of analysis tool.
Since the 2010s, Python has established itself as an essential tool for data science, including machine learning and deep learning, and is widely used for data collection, data processing, and decision-making.
Python provides hundreds of thousands of libraries, which are collections of reusable code, and libraries such as Matplotlib, Seaborn, Pandas, NumPy, Scikit-learn, Pingouin, and statsmodels can be used to visualize and process data.
Python is now establishing itself as an essential tool for statistical thinking in the era of big data.
For statistical problem solving and data processing, people commonly use Excel and R, along with commercial packages such as SPSS, SAS, Minitab, and Stata.
To use these packages, you need to learn the functions for solving each problem and the unique functions for each package, and this learning process is often quite difficult.
Additionally, it is often difficult to verify the accuracy and validity of the results obtained using the package.
Unlike these packages, Python offers ready-made functions for most problems, which makes it simple to use.
This advantage is reinforced by the fact that a function's arguments can be adjusted to suit the conditions at hand and the results displayed immediately, so an analysis can be revised and improved at any time.
This book consists of 14 chapters.
Many books relegate the basics of Python to an appendix, but this book introduces the Python language, data frames, and basic operations in Chapter 2 so that you can learn the main tools and how to apply them before they are needed.
Chapter 1 explains statistical applications and data, the strengths of Python, and how to install related packages.
The remaining chapters are Chapter 3 Descriptive Statistics, Chapter 4 Measures of Central Tendency and Variance, Chapter 5 Probability Calculation, Contingency Tables, and Bayes' Theorem, Chapter 6 Random Variables and Probability Distributions, Chapter 7 Samples, Sampling, and Sampling Distributions, Chapter 8 Estimation and Confidence Intervals, Chapter 9 Hypothesis Testing, Chapter 10 ANOVA, Chapter 11 Correlation Analysis and Scatterplots, Chapter 12 Regression Analysis, Chapter 13 Time Series Analysis, and Chapter 14 Chi-Square Test and Nonparametric Tests. Together they span the core content found in statistics textbooks across various fields.
In particular, this book provides Python solutions along with explanations of the principles of major statistical tools and related examples.
Additionally, we provide a variety of charts to help you identify key factors and understand their effects through visualization.
Chapters 3 and 4, on descriptive statistics, build frequency distributions, histograms, ogives, combined box-and-swarm plots, and stem-and-leaf diagrams for quantitative and qualitative data, and provide examples and comparative analyses of measures of central tendency and measures of dispersion.
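As a rough illustration of the measures mentioned above, the following minimal sketch uses only Python's standard `statistics` module. It is not taken from the book: the sample values and the helper name `describe` are hypothetical and chosen only for demonstration.

```python
import statistics

# Hypothetical sample data (not from the book).
scores = [2, 4, 4, 4, 5, 5, 7, 9]


def describe(values):
    """Return common measures of central tendency and dispersion."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        # Population SD divides by n; sample SD divides by n - 1.
        "population_sd": statistics.pstdev(values),
        "sample_sd": statistics.stdev(values),
        "range": max(values) - min(values),
    }


summary = describe(scores)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Reporting the population and sample standard deviations side by side makes the effect of the n versus n - 1 divisor visible on a small sample.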
Chapters 5 through 7 explain the concepts and application principles of basic inference tools such as probability, random variables, probability distributions, samples, sampling, and sampling distributions, and provide Python example solutions to demonstrate how to apply them.
Chapters 8 through 10 cover the estimation and comparison of parameters such as the population mean, proportion, and variance, and Chapters 11 through 13 present the data characteristics, forecasting methods, and accuracy assessment methods needed to make predictions from general data and time series data.
Chapter 14 explains the analytical methods for categorical variables and the principles of application of major statistical techniques, and identifies the application methods and related characteristics through Python solutions.
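To give a sense of the categorical-data analysis listed for Chapter 14, here is a minimal sketch of a chi-square goodness-of-fit statistic computed in plain Python. It is not the book's solution: the die-roll counts, the function name `chi_square_gof`, and the uniform expected frequencies are assumptions made for illustration. The critical value 11.070 is the standard chi-square table value for 5 degrees of freedom at the 5% significance level.

```python
def chi_square_gof(observed, expected):
    """Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts from 120 rolls of a six-sided die (not from the book).
observed = [18, 22, 20, 25, 15, 20]
n = sum(observed)

# Null hypothesis: the die is fair, so each face is expected n / 6 times.
expected = [n / len(observed)] * len(observed)

statistic = chi_square_gof(observed, expected)
df = len(observed) - 1  # categories minus one

# Table value of the chi-square distribution for df = 5, alpha = 0.05.
CRITICAL_VALUE = 11.070
reject_null = statistic > CRITICAL_VALUE

print(f"chi-square = {statistic:.3f}, df = {df}, reject H0: {reject_null}")
```

In practice a statistics library would also return a p-value; the sketch compares against a table value only to stay within the standard library.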
Even if you are unfamiliar with programming languages, Python is a mature tool that lets you obtain accurate, multifaceted analysis results more easily and enjoyably than tools such as Excel, R, or SPSS.
It provides particularly powerful charting capabilities, allowing you to achieve next-level results in visualization.
Going forward, Python will continue to play a leading role in effective decision-making through data processing and statistical thinking, as it continues to develop and provide improved functions.
In keeping with this growing need to learn Python, we hope this book will help you acquire Python skills while learning the key statistical principles needed in real-life situations.
August 2025, The Authors
The example solution files used in the book are provided at 〈https://cafe.naver.com/purpleowzkj〉.
Product Details
- Date of issue: August 11, 2025
- Page count, weight, size: 544 pages | 188*257*35mm
- ISBN13: 9791193595336
- ISBN10: 1193595339