Statistics classes for people who are weak in numbers
Description
Book Introduction
Statistics is the new essential liberal arts for the big data era.
From predicting home prices to tracking disease and analyzing murder patterns.

In a world overflowing with information, all the answers are hidden in statistics!

- They say that ham and sausage increase the risk of cancer by 18%. Does that mean that one in five people will get cancer?
- Did traffic accidents decrease 'because' speed cameras were installed in accident-prone areas?
- How many more deaths than expected would have to occur before a serial killer could be suspected?

Statistics analyze individual data to infer the overall situation and express it in specific numbers.
Statistics are used in a variety of ways, from stock charts to population distributions, because they reveal trends and patterns in things that would otherwise appear complex and confusing when viewed individually.
Especially in the era of big data, where even personal tastes and interests are quantified, statistical analysis and thinking skills are as important as reading and writing skills.


"Statistics for People Weak in Numbers" presents statistics as more than calculating a mean or standard deviation: it is the study of meaningful patterns and relationships in data.
It also uses real-world data and examples to demonstrate that statistics is a useful tool for solving a wide range of problems, from everyday curiosities to social, economic, scientific, and medical issues.
Through this book, readers can gain the power of statistical thinking to understand the whole from the parts and correctly interpret the flow of the world beyond numbers.

Table of Contents
Introduction
Chapter 1: Displaying Proportions: Categorical Data and Percentages
Chapter 2: Summarizing and Communicating Numbers: Location, Spread, and Relationships in Data
Chapter 3: Inferring the Whole from a Part: Populations and Measurement
Chapter 4: What Causes What?: Causality
Chapter 5: Modeling Relationships: Regression Models
Chapter 6: Analyzing and Predicting: Algorithms
Chapter 7: How Confident Can You Be in Your Estimates?: Sample Size and Uncertainty Intervals
Chapter 8: The Language of Uncertainty and Variability: Probability Laws and Theory
Chapter 9: When Probability and Statistics Meet: Statistical Inference Based on Probabilistic Models
Chapter 10: Answering Questions and Claiming Findings: Hypothesis Testing and Statistical Significance
Chapter 11: Learning from Experience: Bayesian Methods
Chapter 12: How Things Go Wrong: Errors and Deception
Chapter 13: Toward Better Statistics: Reliability and Ethical Issues
Chapter 14: Conclusion: 10 Ways to Become a Good Statistician
Acknowledgements
Glossary
Notes
Index


Into the book
This book is written for both students looking for a non-technical introduction to statistics and the general public who wants to gain more knowledge about statistics encountered in their work and daily lives.
My focus is on enabling them to handle statistics skillfully and meticulously.
The world of numbers may seem cold and rigid, but as we have seen in previous attempts to measure things like trees, happiness, and death, numbers must always be handled with care.
--- p.22

Statistics can bring clarity and insight into the issues we face, but they can also be misused to promote opinions or simply grab attention.
Therefore, the ability to judge which statistical claims are credible is becoming increasingly important in modern society.
I hope this book empowers you to ask questions about the numbers you encounter in your daily life.
--- p.22

Let's learn a little more about framing.
In 2011, an advertisement appeared on the London Underground claiming that “99% of young people in London do not commit serious violence.”
This ad may have been intended to reassure passengers, but two simple changes could produce the opposite effect.
First, saying that 99% of young people do not commit serious violence means that 1% of them do.
Second, London has a population of about 9 million, of which about 1 million are aged between 15 and 25.
If we consider this age group as youth, there are approximately 10,000 seriously violent young people.
How can we be reassured by this? Two techniques were used here to change the impact of the statistic.
First, the positive frame was changed to a negative frame.
Second, we converted the percentages into specific numbers.
Therefore, to provide information that is not biased towards either side, it is desirable to provide both positive and negative frames.

--- p.34
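
The reframing in this excerpt is simple arithmetic. A minimal Python sketch, using only the figures quoted above (treating ages 15 to 25 as "youth" is the excerpt's own assumption, and the variable names are illustrative):

```python
# Figures taken from the excerpt; variable names are illustrative.
youth_population = 1_000_000   # Londoners aged 15-25, per the excerpt
share_nonviolent = 0.99        # the positive frame used in the advertisement

# Step 1: flip the positive frame into a negative one.
share_violent = 1 - share_nonviolent

# Step 2: convert the percentage into a count of people.
violent_count = round(youth_population * share_violent)

print(f"Positive frame: {share_nonviolent:.0%} do not commit serious violence")
print(f"Negative frame: {share_violent:.0%} do")
print(f"As a count: about {violent_count:,} young people")
```

The underlying data never change; only the presentation does, which is why showing both frames side by side is the less biased choice.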

In the UK, approximately 6 in 100 people will develop bowel cancer.
What would happen if those 100 people ate bacon sandwiches every day? Applying the 18% relative risk from the International Agency for Research on Cancer report, that number would increase from six to seven. In other words, among 100 people who eat bacon every day for life, one additional person would develop bowel cancer.
This is not as impressive as the relative risk (18% increase).
We need to distinguish between what sounds scary and what is real danger.
--- p.42
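
The gap between relative and absolute risk can be checked in a few lines. This is a minimal Python sketch using the two figures from the excerpt (a baseline of 6 cases per 100 people and an 18% relative increase); the variable names are illustrative:

```python
# Figures from the excerpt; names are illustrative.
baseline_cases_per_100 = 6     # lifetime bowel cancer cases per 100 people (UK)
relative_increase = 0.18       # the 18% relative risk increase

# Relative risk multiplies the baseline; it does not replace it.
exposed_cases_per_100 = baseline_cases_per_100 * (1 + relative_increase)

# Absolute risk increase: extra cases per 100 daily bacon eaters.
extra_cases_per_100 = exposed_cases_per_100 - baseline_cases_per_100

print(f"Baseline: {baseline_cases_per_100} in 100")
print(f"With daily bacon: about {exposed_cases_per_100:.1f} in 100")
print(f"Absolute increase: about {extra_cases_per_100:.1f} in 100")
```

An 18% increase sounds large, but applied to a 6% baseline it amounts to roughly one percentage point.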

The arithmetic mean can be quite misleading when data are not distributed symmetrically around the median but are skewed to one side, or when most cases are typical but a few very high or very low values form long tails.
For example, compared to people of your age and gender, your chances of dying next year will be much lower than the average (arithmetic mean).
According to UK life expectancy tables, 1% of people currently aged 63 will die before their 64th birthday.
However, most of these people are already in serious condition, and the chances of death for the majority of reasonably healthy people are less than this average risk.
--- p.59

To generalize the results of a sample to the population, the sample must be representative.
Just because there is a lot of data does not necessarily mean that it is a good sample, and it can even give false confidence.
For example, in the 2015 British general election, polling firms surveyed thousands of potential voters but still failed to predict the result.
This was because the sampling process was inadequate.
Agencies primarily called landlines, with response rates of less than 10%.
Such samples are hardly representative.
--- p.97

Because this was a large study using the records of a relevant population, the conclusion that people with more education actually have a slightly higher risk of brain tumors is somewhat credible.
So, did all that blood and sweat pouring out of the library overheat my brain and cause my cells to mutate into something strange? I doubt it.
The authors of the paper also add that “these conclusions may be due to the incompleteness of cancer registries and detection bias.”
In other words, wealthier, more educated people may have received more diagnostic tests.
--- pp.113~114

The strong psychological tendency to attribute change to intervention makes before-and-after comparisons unreliable.
A typical example is a speed camera.
These cameras are often installed in places where accidents have recently occurred.
When accident rates go down after installation, people think it's because of the cameras.
But wouldn't the accident rate have gone down anyway? A string of good or bad luck doesn't last forever, and eventually, the situation will subside.
This is regression to the mean.
It's like a tall father having a shorter son.
However, if we believe that this continued good or bad luck indicates a permanent state of affairs, we will be led to the mistaken conclusion that the return to normalcy is the result of some intervention.

--- pp.149~150
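
Regression to the mean is easy to reproduce in a simulation. The Python sketch below is a toy model, not the analysis from the book: every site is given the same hypothetical underlying accident rate, "cameras" are assigned to the sites with the worst first-year record, and nothing else changes. All names and numbers are assumptions chosen for illustration.

```python
import random

random.seed(42)  # fixed seed so the sketch is repeatable

N_SITES = 1000
TRUE_RATE = 5.0  # hypothetical: identical expected accidents per year at every site
TOP_N = 50       # number of "accident-prone" sites that get a camera


def yearly_accidents(rate):
    """Draw a noisy yearly count around `rate` (sum of 100 yes/no trials)."""
    return sum(random.random() < rate / 100 for _ in range(100))


before = [yearly_accidents(TRUE_RATE) for _ in range(N_SITES)]
after = [yearly_accidents(TRUE_RATE) for _ in range(N_SITES)]

# "Install cameras" where year one looked worst; the cameras do nothing here.
worst = sorted(range(N_SITES), key=lambda i: before[i], reverse=True)[:TOP_N]

mean_before = sum(before[i] for i in worst) / TOP_N
mean_after = sum(after[i] for i in worst) / TOP_N

print(f"Camera sites, year before: {mean_before:.1f} accidents on average")
print(f"Camera sites, year after:  {mean_after:.1f} accidents on average")
```

Because the sites were selected on an unusually bad year, their average falls back toward the common rate in the following year even though the intervention has no effect in this model.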

There are formulas for the size and power of a test depending on the type of experiment, and they depend crucially on the sample size.
However, with a fixed sample size, there is an inevitable trade-off.
To increase the power of the test, the significance threshold must be relaxed.
This increases the likelihood of detecting a true effect, but it also increases the likelihood of making a Type I error.
To use a legal analogy, the bar for a guilty verdict can be lowered by weakening the requirement of proof beyond a reasonable doubt.
As a result, more criminals are correctly convicted, but more innocent people are wrongly convicted as well.
--- p.315
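
The trade-off can be illustrated with the simplest textbook case, a one-sided z-test with known standard deviation. This is a sketch under stated assumptions: the effect size, standard deviation, and sample size are hypothetical, and `power_one_sided_z` is an illustrative helper, not a formula quoted from the book.

```python
from statistics import NormalDist


def power_one_sided_z(alpha, effect, sigma, n):
    """Power of a one-sided z-test to detect a mean shift of `effect`.

    alpha is the significance threshold, i.e. the Type I error rate.
    """
    std_normal = NormalDist()
    critical = std_normal.inv_cdf(1 - alpha)  # rejection cutoff under the null
    shift = effect * n ** 0.5 / sigma         # how far the true effect moves the test statistic
    return 1 - std_normal.cdf(critical - shift)


# Hypothetical experiment: fixed sample size, only the threshold varies.
for alpha in (0.01, 0.05, 0.10):
    power = power_one_sided_z(alpha, effect=0.3, sigma=1.0, n=50)
    print(f"alpha = {alpha:.2f} -> power = {power:.2f}")
```

With the sample size held fixed, each relaxation of the threshold buys more power at the cost of a higher Type I error rate, which by construction equals `alpha`.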

The article, titled “Can Watching TV Without a Mind Kill You?”, comes from an epidemiological study.
The study estimated that people who watched more than five hours of TV per night had a relative risk of 2.5 for developing an embolism compared with those who watched less than two and a half hours per night.
However, the absolute risk in the high-risk group was 13 cases per 158,000 people per year, which means you would have to watch more than five hours of TV every night for about 12,000 years before you could expect to experience the event.
This weakens the dramatic impact of the article title.
--- pp.390~391
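
The "12,000 years" figure follows directly from the absolute risk. A minimal Python sketch using the numbers in the excerpt (variable names are illustrative):

```python
# Figures from the excerpt; names are illustrative.
events_per_year = 13
people_in_high_risk_group = 158_000

# Absolute annual risk for one heavy TV watcher.
annual_risk = events_per_year / people_in_high_risk_group

# Expected number of years of heavy watching per event.
years_per_expected_event = 1 / annual_risk

print(f"Annual risk: {annual_risk:.5%}")
print(f"Expected wait: about {years_per_expected_event:,.0f} years")
```

A relative risk of 2.5 describes how two groups compare; the absolute risk describes how rare the event remains even in the higher-risk group.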

Table 13.1 shows the predictions and final results for the June 2017 elections.
The predicted number of seats was surprisingly close to the actual result.
The difference was at most four seats.
Table 13.1 shows that these statistical predictions have been quite accurate in the last three British elections.
In 2015, when statisticians predicted a major defeat for the Liberal Democrats, dropping their seats from 57 to 10, a prominent Liberal Democrat politician said in a live TV interview that if they were right, he would "eat his hat".
In fact, the party won only eight seats.
--- p.411

Publisher's Review
Avoid lies and coincidences in the overflow of information
The power of statistical thinking to guide wise choices


In 2015, the WHO shocked the world by announcing that processed meats such as ham and sausages were classified as a "Group 1 carcinogen," along with tobacco and asbestos, and that eating 50 grams of processed meat per day could increase the risk of developing bowel cancer by 18%.
So does that mean that almost one in five people will get cancer?
The reaction was immediate.
Sales of ham and sausages at supermarkets dropped noticeably.
Calls grew to remove processed meats such as ham and sausage from school lunch menus entirely.
The restaurant industry, including hamburger chains, was hit hard by the sharp decline in consumption.

As it turns out, this scary 18% figure is a relative increase in risk.
In the UK, about 6 in 100 people develop bowel cancer.
Applying a relative increase of 18% turns these 6 people into about 7.
In other words, only one additional person in 100 actually gets cancer.

"Statistics for People Weak in Numbers" teaches you how to reasonably doubt the numbers you encounter in everyday life and distinguish between truth and falsehood, coincidence and necessity.
The author, David Spiegelhalter, is a world-renowned statistician who served as Professor of Statistics at the University of Cambridge and President of the Royal Statistical Society.
He explains core statistical concepts such as relative and absolute risk, correlation and causality, the wisdom of crowds, and regression to the mean with surprising clarity and ease, without resorting to complex mathematics.

We already talk and think about almost everything in terms of statistics.
This book will help readers develop the practical and productive statistical skills needed in the era of big data.

Even beginners can read it!
Study statistics easily and effectively


Recently, statistics have been a hot topic in Korean society.
There is ongoing debate about the basis for the government's economic indices on employment, income, and real estate.
As public criticism grows over weather forecasts that were badly off target during this rainy season, the Korea Meteorological Administration is struggling to provide an explanation.

People are so sensitive to statistics because they rely heavily on statistical evidence in everyday decision-making.
We check the weather forecast and decide whether or not to bring an umbrella.
When deciding which restaurant to go to or which movie to see, we refer to the user ratings on web portals.
In addition, opinion poll results influence voting, and economic indicators influence household financial decisions such as investments.

The ability to interpret these numbers and critically analyze statistical conclusions is called data literacy.
Especially in the era of big data, as data grows larger and more complex, it has become more important than ever to determine how valuable a piece of evidence really is.
The author argues that to develop data literacy, we must learn statistics for solving real-world problems, not statistics for solving mathematical problems.
This book takes readers on an intellectual journey to discover the true world from data through 48 fascinating questions.
Here are some of them summarized:

√ How many more deaths than the normal death toll would be required before serial killings could be detected?
Harold Shipman is Britain's most prolific murderer.
A family doctor, he administered massive opiate overdoses to at least 215 of his patients between 1975 and 1998, causing their deaths.
His family was shocked that he could commit crimes for so long without being suspected.
The investigation team concluded that statistical monitoring would have found evidence of significant excess deaths in 1984, and that if Shipman had been reported then, about 175 people could have been saved.


√ Can prayer help patients recover?
Researchers divided 1,800 patients undergoing heart surgery into three groups.
The first group received prayer but did not know it.
The second group did not receive the prayer and was likewise unaware of it.
The third group received the prayer and knew it.
The results of the experiment showed no significant difference between the group that received prayer and the group that did not.
However, in the third group, there was a slight increase in patients suffering from complications.
One researcher said:
“The patients may have been anxious, wondering if they were so sick that they needed to be prayed for.”

√ Is it really true that speed cameras have reduced traffic accidents?
When accident rates go down after speed cameras are installed, people attribute it to the cameras.
But wouldn't accident rates have gone down anyway? To determine the true effectiveness of speed cameras, researchers randomly placed them.
As a result, it was estimated that about two-thirds of the camera installation effect was due to regression to the mean.
In other words, a string of good or bad luck doesn't last forever, and eventually things settle back to normal.

√ What is the probability that a person with a positive cancer test result, which is 90% accurate, actually has cancer?
Let's assume that 1,000 people are tested and 1% (10 people) actually have cancer.
Nine of those ten (90%) test positive.
Meanwhile, 99 out of 990 people (10%) who do not have cancer will receive a false positive test result.
Therefore, the probability that a person who tests positive actually has cancer is only 9/108, or about 8%.
This exercise on conditional probability is counterintuitive.
Even though the test is 90% accurate, the majority of people who test positive do not actually have cancer.
We need to distinguish between the probability that a test result will be positive given that we have cancer and the probability that we have cancer given that we get a positive result.
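
The counting argument above can be written out step by step. This minimal Python sketch reads "90% accurate" the way the passage does, as 90% of people with cancer testing positive and 10% of people without cancer testing positive; the variable names are illustrative:

```python
# Figures from the passage; names are illustrative.
tested = 1000
prevalence = 0.01            # 1% of those tested actually have cancer
sensitivity = 0.90           # share of people with cancer who test positive
false_positive_rate = 0.10   # share of people without cancer who test positive

with_cancer = tested * prevalence                                 # 10 people
true_positives = with_cancer * sensitivity                        # 9 people
false_positives = (tested - with_cancer) * false_positive_rate    # 99 people

# Probability of cancer GIVEN a positive result.
prob_cancer_given_positive = true_positives / (true_positives + false_positives)

print(f"Positive results: {true_positives + false_positives:.0f}")
print(f"P(cancer | positive) = {prob_cancer_given_positive:.1%}")
```

The probability of a positive result given cancer is 90%, but the probability of cancer given a positive result is about 8%, because the healthy group is so much larger than the sick one.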


How are the algorithms that govern our lives created?
Statistics as a Liberal Arts Course for Data Science and Artificial Intelligence


Francis Somerton, an Englishman, bought a third-class ticket on the Titanic for eight pounds and one shilling, intending to leave his wife and young daughter behind in England and go to America to earn money.
Meanwhile, Karl Dahl, a 45-year-old Norwegian cabinetmaker, boarded the ship alone, paying the same fare as Somerton.
They were equally poor and desperate, but on the night of April 14, 1912, their fates diverged.
Somerton sank to the bottom of the cold sea, but Dahl managed to climb onto lifeboat number 15.
Was Somerton simply unlucky? Or were his chances of survival truly slim? One might simply assume, based on intuition, that he was male and traveling in third class, and thus unlikely to survive.
However, by designing various prediction algorithms for Titanic passengers' survival using data such as title, gender, age, and ticket price, we can obtain more specific survival rates.
Today, we are surrounded by algorithms that collect real-time traffic information to guide us through optimal routes and deliver personalized ads based on our online shopping history.
Moreover, as data science utilizing big data advances, reliance on algorithms in fields such as healthcare, finance, and public safety will increase.

Therefore, even if you are not an expert, you need to understand the structure of the algorithm and its limitations.
This book provides an easy-to-understand explanation of statistical theories that form the basis of algorithms, such as regression, overfitting, cross-validation, and Bayesian inference.
This book will serve as a valuable guide not only for beginners studying statistics for the first time, but also for those who wish to learn statistics as a liberal arts course for data science and artificial intelligence.

10 Ways to Think Like a Statistics Master

Finally, the author presents 10 tips for those who want to become wise statistics experts.

1. Statistical methods should enable data to answer scientific questions. Rather than focusing on a specific technique, ask why you are doing this.
2. Signals always come with noise. Trying to separate the two is what makes statistics interesting. Variability is unavoidable, and probabilistic models are useful as abstractions.
3. Plan ahead, really plan ahead. In confirmatory studies, pre-specify your analysis to avoid researcher bias.
4. Pay attention to the quality of your data. Everything depends on the data.
5. Statistical analysis is more than just calculation. Don't plug data into a formula or run software without knowing why.
6. Keep it simple. Important communications should be as basic as possible. Don't show off with unnecessarily complicated modeling.
7. Provide an assessment of variability, with the caveat that margins of error are generally larger than claimed.
8. Check your assumptions, and make it clear when this was not possible.
9. If possible, replicate! Or encourage others to do so.
10. Make your analysis reproducible. Other people should be able to access your data and code.

Statistics plays a vital role in all aspects of our lives, and it is constantly evolving as more and more data becomes available.
It affects not only society but also individuals.
Preparing this book made me realize how much statistics has enriched my life, and I hope you will come to feel the same.
If not now, then someday in the future. (p. 418)

“I was so shocked by this book.
I think I learned more from this book than I did from my master’s and doctoral studies.” _Blogger Review

A must-read for every politician, journalist, medical professional, and everyone else _Popular Science

It clearly shows how data can help us better understand the world. _Nature

A Great Correction for Fake News and Fake Thinking _Kirkus Reviews
GOODS SPECIFICS
- Date of issue: August 12, 2020
- Page count, weight, size: 460 pages | 586g | 152*225*21mm
- ISBN13: 9788901244488
- ISBN10: 8901244489
