
Why do machines learn?
Description
Book Introduction
- A note from the MD
Artificial intelligence is mathematics. We live in an era where generative artificial intelligence is indispensable.
This book helps you use artificial intelligence wisely by explaining how it works.
Artificial intelligence identifies patterns in existing data and produces new information.
This process is mathematical.
The book is not easy.
But we live in an age where we must read this masterpiece.
March 18, 2025. Natural Science PD Son Min-gyu
Strongly recommended by Geoffrey Hinton, winner of the 2024 Nobel Prize in Physics
Core mathematics that will support the amazing world AI creates and shape the future!
The magical arrival of ChatGPT in late 2022 quickly captured everyone's attention. It seemed as if AI had leaped beyond providing directions, recommending music, drawing pictures, and organizing documents, to finding information "on its own" and answering questions "by thinking."
Has AI truly become a thinking machine? There's a mixture of rosy hopes and dark concerns about how AI will transform our society and our lives.
This book by science writer Anil Ananthaswamy delves into the core mathematics behind the algorithms that power today's AI, providing a clear picture of the processes at work inside the machines.
This book explains the sophisticated mathematical principles and algorithms that have fueled and excited machine learning researchers for decades.
The algorithms used in the vast field of machine learning use relatively simple mathematics.
This mathematics, which dates back hundreds of years, is what you learn in high school or early college: linear algebra, calculus, Bayes' theorem, the Gaussian distribution (and the bell curve), etc.
This book charts the journey from Rosenblatt's perceptron to modern deep neural networks—sophisticated networks of computational units called artificial neurons—through the lens of the core mathematical concepts underpinning the field of machine learning.
After warming up on the relatively simple concepts of the 1950s, the difficulty gradually increases as the book examines the specialized mathematical principles and algorithms that underpin today's machine learning systems.
The book presents the minimum theoretical and conceptual knowledge necessary to understand the immense power we are handing to machines.
By explaining the operating principles and foundations of learning machines, it helps readers finally understand the true nature of AI and confidently embrace a future filled with it.
- You can preview some of the book's contents.
Preview
Table of Contents
Prologue
Chapter 1: Let's Find the Pattern
Chapter 2: Here, Everyone Is Just a Number
Chapter 3: The Bottom of the Bowl
Chapter 4: Nine out of Ten
Chapter 5: Birds of a Feather Flock Together
Chapter 6: There's Magic in Matrices
Chapter 7: Kernel Rope Escape Show
Chapter 8: With a Little Help from Physics
Chapter 9: The Man Who Held Deep Learning Back (Not Really)
Chapter 10: Algorithms That Shatter Old Myths
Chapter 11: The Eye of the Machine
Chapter 12: Unknown Land
Epilogue
Acknowledgements
Notes
Translator's Note
Biographical Index

Publisher's Review
In Chapter 1, we meet the perceptron, developed by Frank Rosenblatt in the late 1950s.
The perceptron, often called the beginning of modern AI, is the first useful "brain-based" algorithm that can discover hidden patterns in data within a finite amount of time, that is, learn patterns simply by examining the data.
The perceptron traces its roots to a 1943 paper by Warren McCulloch, a neuroscientist in his mid-40s with a philosophical bent, and Walter Pitts, a teenage genius.
The two believed that if the brain is a computational device, it should be possible to imitate its neurons with simple computational models, artificial neurons they called "neurodes" (neuron + node).
Rosenblatt, a psychologist, built on this idea to develop and announce the perceptron, which allows such artificial neurons to learn directly from data.
Given a linearly separable data set, the perceptron is guaranteed to find a separating hyperplane that divides it.
Because perceptrons modeled how human neurons worked, as neuroscientists had thought, they held a certain mystique and raised hopes that the bright future of AI would one day be realized.
For example, when a perceptron is given data on people's heights and weights, it learns from them a criterion for classifying those people as obese or not obese; given the height and weight of a new person, it can then classify that person as well.
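The learning rule behind this example can be sketched in a few lines. This is a minimal illustration, not code from the book; the heights, weights, and labels are invented, and the update shown is the classic perceptron rule: whenever a point is misclassified, nudge the hyperplane toward (or away from) it.

```python
# A toy perceptron. Data (height in cm, weight in kg) and labels are
# invented for illustration; label +1 = "obese", -1 = "not obese".

def perceptron_train(points, labels, epochs=1000):
    """Learn weights w and bias b so that sign(w.x + b) matches each label."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for (x1, x2), y in zip(points, labels):
            pred = 1 if w[0] * x1 + w[1] * x2 + b > 0 else -1
            if pred != y:  # misclassified: move the hyperplane toward the point
                w[0] += y * x1
                w[1] += y * x2
                b += y
                errors += 1
        if errors == 0:  # no mistakes left: a separating hyperplane is found
            break
    return w, b

points = [(160, 80), (165, 90), (170, 95), (170, 60), (175, 70), (180, 75)]
labels = [1, 1, 1, -1, -1, -1]
w, b = perceptron_train(points, labels)
predict = lambda x1, x2: 1 if w[0] * x1 + w[1] * x2 + b > 0 else -1
```

Because this toy data set is linearly separable, the loop is guaranteed to terminate, which is exactly the convergence property the chapter describes.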
Chapter 2 takes us to Broome Bridge over Dublin's Royal Canal in October 1843 to meet the Irish mathematician William Rowan Hamilton.
There, we encounter the mathematical concepts of scalar and vector, which are central to machine learning.
Understanding vectors will help you understand how perceptrons learn.
A perceptron finds weights and a separating hyperplane in the data, which involves finding the relative distance between a data point and the hyperplane using the concept of vectors.
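The distance computation mentioned here can be made concrete. The small sketch below (not from the book; the hyperplane and points are invented) computes the signed distance from a point to a hyperplane w.x + b = 0; the sign tells us which side of the hyperplane the point lies on, which is exactly what a classifier needs.

```python
import math

def signed_distance(w, b, x):
    """Signed distance from point x to the hyperplane w.x + b = 0."""
    dot = sum(wi * xi for wi, xi in zip(w, x))
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (dot + b) / norm

# The line x1 + x2 - 2 = 0 in the plane, i.e. x2 = 2 - x1.
w, b = [1.0, 1.0], -2.0
d1 = signed_distance(w, b, [2.0, 2.0])  # above the line: positive
d2 = signed_distance(w, b, [0.0, 0.0])  # below the line: negative
```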
Chapter 3 examines the process of finding the shortest path down to the bottom of a valley.
That is, it follows the process of finding a way to minimize the errors that occur.
After returning from the landmark 1956 Dartmouth workshop on artificial intelligence, Bernard Widrow and his graduate student Ted Hoff were thinking about how to reduce the noise in adaptive filters when they realized they could train a neuron using the least mean squares (LMS) algorithm, a method for stepping ever closer to the minimum of a function.
Today's deep neural networks use a huge number of weights, all trained using gradient descent.
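The descent to the bottom of the bowl fits in a few lines. Below is a minimal sketch (not from the book) of gradient descent on a one-dimensional bowl, the parabola f(w) = (w - 3)^2, whose minimum sits at w = 3.

```python
# Gradient descent on f(w) = (w - 3)^2. The derivative f'(w) = 2(w - 3)
# points uphill, so repeatedly stepping against it walks down the bowl.
def gradient_descent(start, lr=0.1, steps=100):
    w = start
    for _ in range(steps):
        grad = 2 * (w - 3)  # derivative of (w - 3)^2
        w -= lr * grad      # step downhill, scaled by the learning rate
    return w

w = gradient_descent(start=0.0)  # converges toward w = 3
```

Modern networks do the same thing, only over millions of weights at once, with the gradient computed by backpropagation.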
In Chapter 4, which deals with probability, we meet Thomas Bayes, who is said to have been born in 1701 with a probability of 0.8, and examine Bayes' theorem, a mathematically rigorous method for drawing conclusions in uncertain situations.
Bayes' theorem helped shape the field of probability and statistics, which, nearly 250 years later, remains a tremendous driving force in machine learning.
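Bayes' theorem itself is one line of arithmetic: P(H|E) = P(E|H) P(H) / P(E). The sketch below (the scenario and all numbers are invented for illustration, not from the book) applies it to a classic setting, updating belief in a rare condition after a positive test.

```python
# Bayes' theorem: posterior = likelihood * prior / evidence.
# Invented numbers: a condition affecting 1% of people; the test detects
# it 90% of the time and gives a false positive 5% of the time.
p_h = 0.01              # prior: P(condition)
p_e_given_h = 0.90      # P(positive test | condition)
p_e_given_not_h = 0.05  # P(positive test | no condition)

# Total probability of a positive test (the evidence).
p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)
# Posterior: probability of the condition given a positive test.
posterior = p_e_given_h * p_h / p_e
```

Even with a positive result, the posterior stays well under 20 percent, a counterintuitive consequence of the small prior that Bayes' theorem makes precise.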
Chapter 5 covers the nearest neighbor (NN) algorithm, which performs remarkably close to the optimal classifier derived from Bayes' theorem.
First, we explore another conceptual core of powerful machine learning algorithms through the work of John Snow, a physician who made a significant contribution to epidemiology by mapping the cholera outbreaks that swept through 19th-century London.
John Snow mapped the locations of cholera deaths and, by studying their distribution, determined that the outbreak originated from a single water pump on Broad Street.
Stated more generally, Snow's problem is one of finding nearest neighbors, and this has become an extremely important idea in pattern recognition.
Pattern recognition is how companies recommend books, watches, or movies to us: they represent us as vectors in a high-dimensional space (based on our tastes in, say, books or movies), find our nearest neighbors, figure out what those neighbors like, and recommend it to us.
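That recommendation pipeline is just distance comparisons between vectors. The sketch below (the user names and ratings are invented, not from the book) finds the user whose taste vector lies nearest to ours.

```python
import math

# Each user is a vector of film ratings on a 0-5 scale (invented data).
def distance(u, v):
    """Euclidean distance between two rating vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def nearest_neighbor(target, others):
    """Name of the user whose ratings are closest to target's."""
    return min(others, key=lambda name: distance(target, others[name]))

users = {
    "alice": [5, 1, 4],
    "bob":   [1, 5, 2],
}
me = [4, 2, 5]
closest = nearest_neighbor(me, users)  # alice's tastes are nearest to ours
```

Whatever the nearest neighbor rated highly but we have not seen yet becomes the recommendation.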
In Chapter 6, we meet the outstanding German mathematician David Hilbert and experience the magic of matrices.
We'll also explore a simple, elegant, and robust method for handling massive amounts of data called principal component analysis—a method that projects high-dimensional data onto a much smaller number of axes to find the dimensions along which the data varies most.
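Principal component analysis really is that compact. The sketch below (not from the book; the data points are invented) centers the data, takes the eigenvectors of its covariance matrix, and projects onto the directions of greatest variance.

```python
import numpy as np

def pca(X, k):
    """Project data X onto its k directions of greatest variance."""
    Xc = X - X.mean(axis=0)                  # center each column
    cov = np.cov(Xc, rowvar=False)           # covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    top = eigvecs[:, ::-1][:, :k]            # top-k principal directions
    return Xc @ top                          # data expressed on those axes

# Points lying close to the line y = 2x: one axis captures almost everything.
X = np.array([[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.9]])
Z = pca(X, k=1)  # the 2-D data compressed to a single coordinate each
```

Because the points hug a single line, the one retained axis preserves nearly all of the variation, which is precisely why PCA is so effective on high-dimensional data with hidden low-dimensional structure.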
In Chapter 7, we meet researchers who have found ways to get better at finding linear boundaries, or separating hyperplanes, that divide the data.
Vladimir Vapnik, together with his colleagues Bernhard Boser and Isabelle Guyon, pioneered a breakthrough in modern algorithm development: a method for finding optimal separating hyperplanes, even in infinite-dimensional spaces, using the "kernel trick."
The perceptron algorithm invented by Frank Rosenblatt can find a separating hyperplane.
But when there are infinitely many separating hyperplanes, how can we find the best one? This is the problem their optimal margin classifier, combined with the kernel technique, set out to solve.
The combination of Vapnik's 1964 optimal margin classifier and kernel techniques proved to be extremely powerful.
Suddenly, there was hardly a data set that could not be separated.
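The kernel trick can be seen in miniature with one identity. In the sketch below (an illustrative example, not from the book), the polynomial kernel k(x, y) = (x.y)^2 equals an ordinary dot product in a higher-dimensional feature space, computed without ever visiting that space.

```python
import numpy as np

# Feature map phi sends 2-D points into 3-D: (x1^2, sqrt(2) x1 x2, x2^2).
def phi(x):
    return np.array([x[0] ** 2, np.sqrt(2) * x[0] * x[1], x[1] ** 2])

# The kernel evaluates the 3-D dot product while staying in 2-D.
def kernel(x, y):
    return np.dot(x, y) ** 2

x = np.array([1.0, 2.0])
y = np.array([3.0, 0.5])
lhs = kernel(x, y)            # computed cheaply in the original space
rhs = np.dot(phi(x), phi(y))  # computed explicitly in the feature space
# lhs equals rhs: the two computations agree exactly
```

Replace this toy map with one whose feature space is infinite-dimensional and the same bookkeeping still works; that is what lets an optimal margin classifier operate in spaces it could never write down.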
Chapter 8 tells the story of John Hopfield, a physicist who crossed over into neurobiology and revived the study of neural networks by creating the Hopfield network.
First, though, the book delves into the world of physics to explain magnetic phenomena and introduce magnetic moments and spin.
From there emerge concepts such as associative memory, symmetry, perturbation, and energy stability, leading ultimately to methods for training multilayer neural networks that overcome the limitations of single-layer networks like the perceptron.
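Associative memory by energy descent can be demonstrated in a tiny network. The sketch below (an illustrative toy, not from the book) stores one pattern of spin-like +1/-1 units with the Hebbian rule, then recovers it from a corrupted copy by flipping each unit to lower the network's energy.

```python
import numpy as np

# One stored pattern of +/-1 "spins" (invented for illustration).
pattern = np.array([1, -1, 1, -1, 1, -1, 1, -1])
n = len(pattern)

W = np.outer(pattern, pattern).astype(float)  # Hebbian weight matrix
np.fill_diagonal(W, 0)                        # no self-connections

state = pattern.copy()
state[0] = -state[0]  # corrupt two of the eight units
state[3] = -state[3]

# Asynchronous updates: each unit aligns with the field from the others,
# which can only lower (never raise) the network's energy.
for _ in range(5):
    for i in range(n):
        state[i] = 1 if W[i] @ state >= 0 else -1
# state has relaxed back to the stored pattern, an energy minimum
```

The corrupted memory falls back into the energy valley carved by the stored pattern, which is the physical intuition behind the Hopfield network.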
In Chapter 9, we meet George Cybenko, who is both praised and criticized for his work related to deep learning, that is, deep neural networks with three or more layers (input, hidden, and output layers).
In his 1989 paper, he showed that given enough neurons in a neural network with just one hidden layer, it could approximate any function, meaning it could transform inputs into the outputs we wanted.
However, his paper was sometimes misread as implying that a single hidden layer is all anyone would ever need.
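The flavor of the result can be shown numerically. In this sketch (an illustrative setup, not from the book), a single hidden layer of tanh units with randomly chosen input weights approximates sin(x) once the output weights are fit by least squares.

```python
import numpy as np

# One hidden layer of tanh units approximating sin(x) on [0, pi].
rng = np.random.default_rng(0)
x = np.linspace(0, np.pi, 200)

hidden = 50
w = rng.normal(size=hidden)       # random input-to-hidden weights
b = rng.normal(size=hidden)       # random hidden biases
H = np.tanh(np.outer(x, w) + b)   # hidden activations, shape (200, 50)

target = np.sin(x)
out, *_ = np.linalg.lstsq(H, target, rcond=None)  # fit output weights
approx = H @ out

max_error = np.abs(approx - target).max()  # small: one hidden layer suffices
```

Fitting only the output layer already drives the error down, hinting at the approximation power Cybenko proved, even though depth turned out to matter enormously in practice.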
Chapter 10 covers “backpropagation,” the algorithm that enables training deep neural networks.
We'll explore the collaborative work of Geoffrey Hinton, a key figure in the deep learning revolution, with psychologist David Rumelhart and computer scientist Ronald Williams. After dipping your toes into the gentle river of differentiation, derivatives, and the chain rule, you'll see how the chain rule lets a network propagate its errors backward, revealing the power of backpropagation.
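The chain rule at the heart of backpropagation can be traced by hand on a one-neuron "network." In this sketch (toy values, not from the book), the output is a = sigmoid(w * x), the error is E = (a - t)^2, and the gradient dE/dw is assembled from local derivatives, exactly as backpropagation does layer by layer.

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

x, t, w = 2.0, 1.0, 0.5  # input, target, weight (invented toy values)

# Forward pass.
z = w * x
a = sigmoid(z)

# Backward pass: multiply the local derivatives along the chain.
dE_da = 2 * (a - t)  # derivative of the squared error w.r.t. the output
da_dz = a * (1 - a)  # derivative of the sigmoid
dz_dw = x            # derivative of z = w * x w.r.t. w
grad = dE_da * da_dz * dz_dw

# Numerical check: the chain-rule gradient matches a finite difference.
eps = 1e-6
E = lambda w_: (sigmoid(w_ * x) - t) ** 2
numeric = (E(w + eps) - E(w - eps)) / (2 * eps)
```

The agreement between the analytic and numerical gradients is the same sanity check researchers still use when implementing backpropagation today.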
Chapter 11 follows the journey to give machines sight.
This journey leads to Yann LeCun, who introduced convolutional neural networks for image recognition, a line of work inspired by experiments in which electrodes implanted in the brains of cats revealed how their visual neurons respond to what they see.
Today, with deep neural networks accelerated by graphics processing units (GPUs), such techniques are widely used not only in computer vision but also in natural language processing, machine translation, medical image analysis, and financial pattern recognition.
The final chapter, Chapter 12, “Unknown Territory,” examines the current state of deep neural networks and forecasts future challenges and the future they will bring.
This book delves into the simple yet elegant and powerful mathematical principles behind the incredible results AI delivers.
To help readers understand, many graphs are used to visualize mathematical concepts, and the section called “Mathematical Coda” concisely summarizes and proves mathematical principles.
Navigating this sea of graphs and formulas, readers will come away with a firm grasp of the core mathematical concepts and principles that drive AI, and with it a clearer understanding of a field developing at an incredible pace.
Product Details
- Date of issue: February 28, 2025
- Page count, weight, size: 464 pages | 668g | 152*225*25mm
- ISBN13: 9788972918660
- ISBN10: 8972918660