How to Create Artificial Intelligence That Coexists with Humans
Description
Book Introduction
The current state of AI, its possibilities and risks, and the direction of human-friendly development
A masterpiece by Professor Stuart Russell of UC Berkeley, a leading authority on artificial intelligence.


In this new golden age of AI research, how far has AI come? Is superintelligent, general-purpose AI truly feasible? Will machines come to dominate humans? What future awaits humanity? Going beyond both irresponsible optimism about the bright future AI will bring and dystopian visions of doom, this book examines the current state of AI, its potential and risks, and the various perspectives on these issues from a realistic and broad vantage point. It proposes directions and principles for creating AI that benefits humanity.
UC Berkeley's Stuart Russell, the author of the standard AI textbook, has been among the most thoughtful voices on the risks of AI.
As a first-rate researcher with deep expertise in the field, he founded the Center for Human-Compatible AI and has long pondered and explored these questions. This book contains a discussion so wide and deep that it makes the 'trolley problem' seem trivial.
This is a must-read for understanding the current AI debate and an invaluable guide for assessing the present and future of AI development, a technology that developers and citizens alike must think through, since it is likely to have the greatest impact on humanity's future.




Contents
Introduction
Why this book? Why now?
Overview of this book

1. If We Succeed

How did we get here?
What happens next?
What went wrong?
Can we fix it?

2. Human and Machine Intelligence

Intelligence
Computers
Intelligent computers

3. How Will AI Develop in the Future?

The near future
When will superintelligent AI emerge?
Upcoming conceptual breakthroughs
Imagining superintelligent machines
The limits of superintelligence
How will AI benefit humanity?

4. Misuse of AI

Surveillance, persuasion, and control
Lethal autonomous weapons
Eliminating work as we know it
Usurping other human roles

5. Overly Intelligent AI

The gorilla problem
The King Midas problem
Fear and greed: instrumental goals
Intelligence explosion

6. The Not-So-Great AI Debate

Denial
Deflection
Tribalism
Can't we just…
The debate, restarted

7. AI: A Different Approach

Principles of Beneficial Machines
Grounds for optimism
Reasons to be cautious

8. Provably Beneficial AI

Mathematical guarantees
Learning preferences from behavior
Assistance games
Requests and instructions
Wireheading
Recursive self-improvement

9. What Complicates Things: Us

Different people
Many people
Nice, nasty, and envious people
Stupid, emotional people
Do people really have preferences?

10. Has the Problem Been Solved?

Beneficial machines
Governance of AI
Misuse
Enfeeblement and human autonomy

Appendix A: Exploring Solutions
Appendix B: Knowledge and Logic
Appendix C: Uncertainty and Probability
Appendix D: Learning through Experience

Acknowledgements
Notes
Image credits
Translator's note
Index


From the Book
So whenever you read an article claiming that some AI technique “works like the human brain,” it’s safe to assume that it’s just someone’s guess or fiction.

We really know next to nothing about consciousness, so I will say nothing about it. No one in the AI field is working on making machines conscious, no one would know where to begin, and no behavior has consciousness as a prerequisite.

--- pp.36-37

Focusing on raw computational power misses the point entirely. Speed alone will not give us AI. Running a poorly designed algorithm on a faster computer does not make the algorithm better; it just gets you the wrong answer faster. (And with more data, there are more opportunities to get the wrong answer!) The main benefit of faster machines is that experiments take less time, so research progresses faster. It is not hardware that is holding AI back; it is software. We do not yet know how to make a machine truly intelligent, and a machine the size of the universe would not change that.

--- p.65

When AlphaGo defeated Lee Sedol and then all the other top Go players, many assumed it meant the beginning of the end, since a machine that had learned the game from scratch had beaten humanity in a game known to be extremely difficult even for the most intelligent of people.
In other words, it seemed only a matter of time before AI overtook humans everywhere.
When AlphaZero went on to win not only at Go but also at chess and shogi, even some skeptics may have come around to that view.
However, AlphaZero has clear limitations.
It works only for discrete, fully observable, two-player games with known rules.
This approach will never work for driving, educating, running a government, or conquering the world.
--- p.80

This is how Google came to have its own unfortunate encounter with the gorilla problem.
In 2015, a software engineer named Jacky Alciné complained on Twitter that Google Photos' image-labeling service had tagged him and his friend as gorillas.
It is unclear exactly how the error occurred, but it is almost certain that Google's machine learning algorithm was designed to minimize a loss function whose values were specified explicitly.
Moreover, every error would have been assigned the same cost.
In other words, it implicitly assumed that the cost of misclassifying a person as a gorilla was the same as the cost of misclassifying a Norfolk terrier as a Norwich terrier.
As the subsequent public outcry clearly demonstrated, Google's (or its users') true loss function was not like that.

--- p.97
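To see why a uniform loss invites exactly this kind of error, here is a minimal sketch in Python. It is purely illustrative, not Google's code and not from the book; the label names and cost values are invented. It contrasts a standard 0-1 loss, under which all misclassifications look equally bad to the learner, with a cost-sensitive loss that penalizes socially damaging confusions far more heavily.

```python
# Illustrative sketch only (not Google's code and not from the book): the labels
# and cost numbers below are invented to contrast a uniform 0-1 loss, under which
# every misclassification is equally bad, with a cost-sensitive loss.

def uniform_cost(true_label: str, predicted_label: str) -> float:
    """Standard 0-1 loss: every error counts the same."""
    return 0.0 if true_label == predicted_label else 1.0

# Hypothetical cost table: socially damaging confusions get a much larger penalty.
HIGH_COST_ERRORS = {("person", "gorilla"): 1000.0}

def weighted_cost(true_label: str, predicted_label: str) -> float:
    """Cost-sensitive loss: the penalty depends on which confusion was made."""
    if true_label == predicted_label:
        return 0.0
    return HIGH_COST_ERRORS.get((true_label, predicted_label), 1.0)

if __name__ == "__main__":
    # Under the uniform loss, both mistakes look identical to the learner.
    print(uniform_cost("person", "gorilla"))                   # 1.0
    print(uniform_cost("norfolk_terrier", "norwich_terrier"))  # 1.0
    # Under the weighted loss, the damaging confusion dominates the objective.
    print(weighted_cost("person", "gorilla"))                   # 1000.0
    print(weighted_cost("norfolk_terrier", "norwich_terrier"))  # 1.0
```

A classifier trained to minimize the uniform version has no reason to treat the person-as-gorilla confusion differently from the terrier mix-up; the weighted version encodes unequal costs directly, which is roughly what the public reaction showed Google's true loss function to be.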

From the perspective of AI researchers, the real breakthrough happened 30 or 40 years before Deep Blue burst into the public consciousness.
Likewise, deep convolutional networks had been around and mathematically well-understood for over 20 years before they made headlines.
The AI breakthroughs the public sees in the media (landmark victories over humans, robots gaining Saudi Arabian citizenship, and so on) have little to do with what is actually happening in labs around the world.
Research in the lab involves a lot of thinking and discussion, including writing mathematical formulas on a whiteboard.
New ideas are constantly emerging, being discarded, and rediscovered.
Sometimes a good idea (a true breakthrough) goes unnoticed at the time, and only later is it recognized as the foundation for a significant advance in AI, often when someone rediscovers it at a more opportune moment.

--- p.101

The problems of tactile sensing and hand configuration appear likely to be solved by 3D printing.
Boston Dynamics is already using this technology in some complex parts of its humanoid robot, Atlas.
Robotic manufacturing technology is advancing rapidly, thanks in part to deep reinforcement learning.
The ultimate advancement—putting it all together to create something that even begins to resemble the wondrous physical abilities of the robots in the movies—will likely come from the not-so-romantic warehouse industry.
Amazon alone employs hundreds of thousands of employees who remove products from bins in its massive warehouses, package them, and ship them to customers.
From 2015 to 2017, Amazon held an annual "Picking Challenge" to encourage the development of robots capable of doing this job.
Although there is still a long way to go, we can expect to see the rapid emergence of highly capable robots once key research questions are solved (perhaps within the next decade).
--- p.115

What we want is a robot that discovers for itself what it means to stand up.
That is, a robot that discovers that standing up is a useful abstract action because it establishes the preconditions for walking, running, shaking hands, or looking over a wall, and therefore forms part of many abstract plans for all sorts of goals.
Likewise, we want robots to discover behaviors such as moving from place to place, picking up objects, opening doors, tying knots, cooking, and finding one's keys, as well as many other behaviors that we humans have not yet discovered and that therefore have no name in any human language.
I see this ability as the most important step toward achieving human-level AI.

--- pp.136-137

Anyone advocating an end to AI research will have to make a very compelling case. Ending AI research would mean abandoning not only one of the key avenues to understanding how human intelligence works, but also a golden opportunity to improve the human condition, to create a far better civilization.
The economic value of human-level AI could reach trillions of dollars, so the incentives for businesses and governments to pursue AI research are likely to be enormous.
That power will overwhelm the philosopher's vague objections, no matter how great his "reputation for expertise," to borrow Butler's phrase.

--- pp.200-201

If that purpose clashes with human preferences, we find ourselves in a situation identical to the plot of the movie 2001: A Space Odyssey.
In the film, the computer HAL 9000 kills four of the five crew members on board the spaceship to prevent them from interfering with its mission.
...
The third of Isaac Asimov's Three Laws of Robotics, which begins "A robot must protect its own existence," is actually completely unnecessary.
Self-preservation is an instrumental goal, so there is no need to build it in.
Instrumental goals are subgoals that are useful for achieving almost any original objective.
Any entity with a definite objective will automatically act as if it also has these instrumental goals.

--- pp.208-209

The reader might think that the most eminent thinkers today are already pondering this.
That is, that they are engaged in serious debate, weighing the risks and benefits, seeking solutions, and probing those solutions for flaws.
But as far as I know, that's not the case yet.

--- p.216

The pro-technology tribe denies or downplays the risks and accuses those who raise them of Luddism.
The anti-technology tribe is convinced that the risks are insurmountable and the problem insoluble.
Within the pro-technology tribe, anyone who is too candid about the problems is branded a traitor.
That is a shame, because most of the people capable of solving these problems belong to that very tribe.
Within the anti-technology tribe, anyone who suggests that the risks might be mitigated is likewise a traitor, because that tribe regards technology itself as the evil, not its possible effects.
In this way, only the most extreme members of each tribe (those least likely to listen to the other side) end up speaking out.

--- pp.235-236

As we have already seen in discussing instrumental goals, it does not matter whether we build "emotions" or "desires" such as self-preservation, resource acquisition, knowledge discovery, or, in the extreme case, world domination into an AI.
The machine will have those "emotions" anyway, as subgoals of whatever objective we do give it, and regardless of its gender.
From the machine's point of view, death is not bad in itself.
Death is nevertheless to be avoided, because it is hard to fetch the coffee if you are dead.

--- p.244

The impact that AI researchers can have on the development of global policy regarding AI is limited.
We can suggest possible applications that will yield economic and social benefits; we can warn about possible misuses, such as surveillance and weapons; and we can point out likely future development paths and their implications.
Perhaps the most important thing we can do is to design AI systems that are, to the extent possible, provably safe and beneficial to humans; only then will attempts at general AI regulation make sense.

--- p.268

The renowned cognitive scientist Steven Pinker offers a more optimistic argument than Atkins.
He believes that humanity's highly developed "safety culture" will eliminate all serious risks from AI, and that paying attention to such risks is therefore misguided and counterproductive.
Even setting aside the fact that this vaunted safety culture gave us Chernobyl, Fukushima, and runaway global warming, Pinker's argument misses the point entirely.
A safety culture arises precisely because there are people who point out possible modes of failure and look for ways to prevent them.
(And in the AI field, the standard model is precisely such a failure mode.) Saying that it is foolish to point out a failure mode because the safety culture will correct it anyway is like saying there is no need to call an ambulance when you witness a hit-and-run because someone will call one anyway.

--- p.314

So I propose that a beneficial machine is one like this.
A machine that performs actions that we can expect to achieve our goals.
Since these goals are ours and not the machines', machines will need to learn more about what we truly want by observing what choices we make and how we make them.
Machines designed this way will defer to humans: they will ask permission, they will act cautiously when instructions are unclear, and they will allow themselves to be switched off.

--- p.361

With all this activity taking place, can we expect any real progress toward control? Perhaps surprisingly, the answer is yes, at least in some modest respects.
Many governments around the world have advisory bodies that assist in developing regulatory tools.
Among them, the European Union's High Level Expert Group on Artificial Intelligence (AI HLEG) is probably the most famous.
Agreements, regulations, and standards are also emerging on issues such as user privacy, data exchange, and the avoidance of racial bias.
Governments and companies are working hard to establish rules for self-driving cars.
These are rules that will inevitably have elements that transcend national borders. There is a consensus that for AI systems to be trustworthy, their decision-making must be explainable, and this consensus is already partially implemented through the European Union's GDPR law.
In California, a new law has been passed that prohibits AI systems from impersonating humans in certain circumstances.
These two issues, explainability and impersonation, are clearly somewhat relevant to the current issues of AI safety and governance.
Currently, there are no actionable recommendations available to governments or other agencies grappling with the issue of maintaining control over AI systems.
Legal provisions like “AI systems must be safe and controllable” will carry no weight.
Not only do these terms not yet have precise meanings, but there are no widely accepted engineering methodologies to ensure safety and controllability.
--- pp.366-367

Publisher's Review
“The most important book I’ve read in recent times” - Daniel Kahneman
"A robust and promising solution to the dangers of AI" - Max Tegmark

"The most important AI book of the year" - The Guardian
#1 in Amazon AI/Robotics / Best Books in Technology by Financial Times and Forbes

“AI is now on the front page of the media every day.
Fueled by an influx of venture capital, countless startups are emerging.
Millions of students are taking AI and machine learning courses online, and professionals in these fields command salaries in the millions of dollars.
Hundreds of billions of dollars are invested annually by venture funds, governments, and large corporations.
More money has poured into the field in the last five years than in its entire previous history.
Advances already made in this field, such as self-driving cars and personal assistants, are likely to have a significant global impact within the next decade or so. The economic and social benefits of AI are enormous, and these benefits will further fuel AI research.” (p. 21)

People and money are pouring into artificial intelligence research.
“The United States, China, France, the United Kingdom, and the European Union are competing to announce that they will invest tens of billions of dollars in AI research” (p. 267), and the situation in Korea is not much different.
We often see articles about companies complaining about a shortage of developers, or about prestigious universities establishing AI graduate schools and working hard to secure talented faculty.
This is a new golden age for artificial intelligence research.
So, how far has AI research come? How far will it advance? If AI surpasses humans, will it solve humanity's pressing problems and instantly improve our quality of life? Or will it threaten humanity's very survival, as movies so often depict? Professor Stuart Russell of UC Berkeley, co-author of the standard AI textbook "Artificial Intelligence: A Modern Approach," has been among the most thoughtful voices on the risks of AI.
As a first-rate researcher with deep expertise in the field, he has long pondered and explored these questions, and in "How to Create Artificial Intelligence that Coexists with Humans" he offers a discussion so wide and deep that it makes the trolley problem seem trivial.
Going beyond both irresponsible optimism about the bright future AI will bring and dystopian visions of doom, this book examines the challenges posed by AI development and the various positions on superintelligent AI from a realistic and broad perspective, and proposes, on solid grounds, directions and principles for creating AI that benefits humanity.
It is a must-read for understanding the current AI debate, a guide to forecasting the future direction of AI development, and a help in understanding how the technology works.


The current state of AI, its possibilities and risks, and the direction of human-friendly development
A masterpiece by Professor Stuart Russell of UC Berkeley, a leading authority on artificial intelligence.

“Encountering an intelligence far superior to our own would be the greatest event in human history. The purpose of this book is to explain why it could also be the last event in human history, and what we can do to prevent that from happening.” (p. 10)

The future AI that appears in science fiction novels, movies, and the public imagination is a being with terrifying abilities, a being that threatens not only jobs and human relationships, but also civilization and the very survival of humanity.
The conflict between humans and machines is inevitable, and a catastrophic future awaits humanity.
But Stuart Russell argues that such a scenario can be avoided.
To do that, however, we need to fundamentally rethink AI.
The author begins by examining the concept of intelligence in humans and machines (Chapter 2), surveys how far AI has progressed in capabilities ranging from intelligent personal assistants, self-driving cars, and home robots to global-scale sensing, and outlines the conceptual breakthroughs that must still occur before human-level general AI emerges.
He then examines what will become possible in areas such as living standards, education, and health once general-purpose artificial intelligence moves beyond today's narrow, specialized AI (Chapter 3).

Of course, as artificial intelligence's capabilities grow, problems arise.
The book then covers the ways in which AI has already been misused by humans, as well as the dangers that lie ahead, from the surveillance and behavioral control that corporations and states will employ to lethal autonomous weapons and the elimination of jobs (Chapter 4).
When superintelligent artificial intelligence emerges, the relationship between artificial intelligence and humans will be similar to the relationship between humans and gorillas today.
Russell calls this the “gorilla problem,” which is “specifically, the question of whether humanity can maintain its superiority and autonomy in a world where machines with significantly greater intelligence exist” (pp. 196-197).
In the novels "Erewhon" and "Dune," a world is depicted where machines are banned to combat the dangers that threaten humanity's existence, but in today's reality, it is not easy to ban general AI research.
Ultimately, a superintelligent AI will do "things we did not intend" in order to "carry out the objectives we have given it," and it will be impossible to control (Chapter 5).
Chapter 6 examines the ongoing debate over the risks of AI.
From denying the possibility of superintelligent AI to labeling those who criticize the risks of AI as Luddites, to arguing that we should avoid assigning problematic types of goals to machines, this book critically examines the positions and arguments of prominent and influential figures in the field of AI, including leading researchers, philosophers, thinkers, and entrepreneurs.
It features prominent figures such as Facebook's Mark Zuckerberg, Tesla's Elon Musk, deep learning pioneer Yann LeCun, machine learning and natural language researcher Oren Etzioni, renowned cognitive scientist Steven Pinker, and Oxford University philosopher Nick Bostrom.
Unfortunately, however, in the author's view, the current discussion is not taking place at a high enough level to reflect the importance of the issue.

The latter part of the book presents a new perspective on AI and how we can ensure that machines remain permanently helpful to humanity (chapters 7-10).
Drawing on the utilitarian philosophers Jeremy Bentham and John Stuart Mill, on G. E. Moore, Robert Nozick, Henry Sidgwick, and Derek Parfit, and on economists from Adam Smith to John Harsanyi, it considers how artificial intelligence might deal with human preferences.
The key is that the AI should not assume it already knows what we prefer.
We ask AI to satisfy our preferences, but the idea is to design machines that are inherently uncertain about human preferences.
Then the machine becomes a humble and altruistic being, devoted to pursuing our goals rather than its own.
If we build AI on this new foundation, we can create machines that respect us and benefit us.

A leading AI researcher's cool-headed, big-picture insight and distinctive solution.


This book is particularly valuable because it was written not by a businessman, philosopher, or journalist but by an AI expert, from a cool-headed, big-picture, and realistic perspective. In Life 3.0, MIT physicist Max Tegmark pointed out that mainstream researchers are focused on the immediate challenge of making the AI systems currently under development smarter rather than on the long-term consequences, and that even those who do consider them are generally reluctant to speak out.
In contrast, Stuart Russell has been researching and speaking on the threat of autonomous weapons, the long-term future of artificial intelligence, and its relationship with humanity for quite some time.
He served as vice-chair of the World Economic Forum's Council on AI and Robotics and as an advisor to the United Nations on arms control. In 2016, he founded the Center for Human-Compatible AI, a research institute centered at UC Berkeley that collaborates with several universities and institutions, and he has been developing the conceptual and technical tools needed to redirect AI research toward provably beneficial AI systems.

Compared with mainstream researchers, who expect superintelligent AI by around the middle of this century, his prediction that it will arrive within this century, in roughly 80 years, is somewhat conservative (p. 120).
However, he argues that if we start discussing this issue when the technology becomes visible, it will already be too late, so we must address this issue immediately and work together to find a solution as soon as possible.
And since “quitting AI research is not only unlikely to actually happen (because it would require giving up too many benefits) but also extremely difficult to implement” (p. 213), he regards changing the direction of research as the more realistic alternative.


From the 'Standard Model' to 'Provably Beneficial Machines'


Russell proposes a shift away from the standard model of AI development, that is, “build an optimizing machine, give it an objective, and set it running,” toward machines that are “provably beneficial” to humans.
In fact, the standard model is not a problem when the operating range is limited.
If we give it the wrong objective, there is ample opportunity to switch it off, fix the problem, and try again.
However, as the intelligence of machines designed according to the standard model increases and their scope of action becomes more global, this approach becomes unsustainable.
Because such a machine will pursue its purpose, no matter how wrong that purpose may be.
His approach is summarized in three principles: “1. The machine's only objective is to maximize the realization of human preferences. 2. The machine is initially uncertain about what those preferences are. 3. The ultimate source of information about human preferences is human behavior.” (p. 254) These seemingly simple principles are backed by solid reasoning.
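To make the three principles a bit more concrete, here is a minimal toy sketch in Python. It is not from the book and not Russell's algorithm; the numbers, names, and decision rule are invented, loosely in the spirit of the off-switch and assistance-game analyses the book describes. A machine holds an uncertain belief about how much the human values an action; when that belief admits the action might be harmful, deferring to the human has higher expected value than acting.

```python
# Toy sketch only (not from the book): an assistance-game-flavored decision rule.
# The belief samples, the asking cost, and the decision rule are all invented.

import random

def expected_value(samples):
    """The robot's estimate of how much the human values the action."""
    return sum(samples) / len(samples)

def decide(belief_samples, ask_cost=0.1):
    """Choose between acting directly, asking the human first, or doing nothing.

    Asking costs a little, but it lets the human veto a harmful action, so its
    expected value only counts the outcomes the human would actually approve.
    """
    act_value = expected_value(belief_samples)
    ask_value = sum(v for v in belief_samples if v > 0) / len(belief_samples) - ask_cost
    best = max(act_value, ask_value, 0.0)
    if best == ask_value:
        return "ask the human first"
    if best == act_value:
        return "act directly"
    return "do nothing"

if __name__ == "__main__":
    random.seed(0)
    confident_belief = [random.uniform(0.8, 1.0) for _ in range(1000)]   # clearly beneficial
    uncertain_belief = [random.uniform(-1.0, 1.0) for _ in range(1000)]  # might be harmful
    print(decide(confident_belief))  # "act directly": asking only adds cost
    print(decide(uncertain_belief))  # "ask the human first": the human's veto is valuable
```

The point is only the qualitative behavior the principles predict: confident benefit leads to action, while genuine uncertainty about human preferences makes asking permission (and accepting a veto or shutdown) the rational choice.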

“I would argue that the standard model of AI—that is, the model where machines optimize for set goals given by humans—is a kind of dead end.
The problem is not that we might fail at building AI systems.
We could be a huge success.
The problem is that the very definition of success in the AI field is flawed.” (p. 32)

“The uncertainty of purpose means that machines will inevitably defer to humans.
That is, the machine will ask for permission, accept corrections, and allow itself to be switched off.
Removing the assumption that machines should have a definite objective means tearing out and replacing part of the foundations of artificial intelligence, the very definition of what we are trying to do.
It also means rebuilding much of the superstructure, the accumulation of concepts and methods actually used to build AI.
If we do so, humans and machines will form a new relationship.” (pp. 28-29)

A book that provides important points of discussion for social debate on the direction of artificial intelligence development.

To what extent are the principles he proposed resonating and being applied in current research settings? If, as he and many other AI experts worry, superintelligent AI is highly likely to misbehave and have significant consequences, then this cannot be left solely to businesses and researchers.
It is time for developers and citizens alike to collectively consider this technology, which has the potential to have the most profound impact on the future of humanity.
The expert's deep and broad perspective, clear exposition, and careful yet firm proposals will serve as a useful guide for readers grappling with these questions.


“The right time to worry about something that could cause serious problems for humanity depends not only on when the problem will occur, but also on how long it will take to prepare and implement a solution.
If we knew that a large asteroid was going to hit the Earth in 2069, would we say it was too early to worry? Quite the opposite! A worldwide emergency project would be launched to develop the means to deflect the threat.
“We won’t wait until 2068 to find a solution” (p. 224)

Stuart Russell is one of the leading experts in the field of artificial intelligence, and this book is a truly masterful work that provides an authoritative and entertaining overview of the risks of increasingly powerful AI.
Russell believes that our current approach to designing intelligent machines is fundamentally flawed, and that if the dreams of AI evangelists come to fruition, they could lead to truly dystopian outcomes.
He is very adept at explaining how we got to where we are now, and he also makes a compelling case for how we can avoid catastrophic superintelligence and how we can ensure that machines augment human capabilities rather than render them useless.
_〈Guardian〉

A thought-provoking and readable account of AI's past, present, and future.
Russell's discussion is grounded in real-world technological realities, including their many limitations, and does not veer into the heated language of science fiction.
If you're looking for a book that provides a serious overview of the topic without being dismissive of non-technical readers, this is a good place to start.
The intellectually rigorous yet concise writing style and clever humor make it easily digestible for the general reader.

_〈Financial Times〉

A fascinating and important book.
Russell warns not of the dangers of conscious machines, but of superintelligent machines that could be misused or could themselves go astray.
_〈The Times〉

An interesting book that goes into real depth, with a natural wit that shines through.

_〈The Wall Street Journal〉
Product Details
- Publication date: June 30, 2021
- Page count, weight, size: 488 pages | 712g | 152*225*23mm
- ISBN13: 9788934988434
- ISBN10: 8934988436
