Python data processing
Description
Book Introduction
In data science, there is a concept called the 80/20 rule.
Strictly speaking, this is more of an empirical observation than a law: of the resources (time and manpower) consumed in the process of performing data analysis, only 20% go to the actual modeling and analysis work, while 80% of the resources are spent on finding, refining, and reconstructing data.
Data is always lacking, and even the data we do collect is often missing information or not formatted correctly.
When analysis results contradict intuition, the likely cause is strange values in the data.
Anyone who has ever done real data analysis work will agree with this sentiment.

The example data used in most classes is usually "clean" data that has been preprocessed to reduce trial and error.
This textbook covers the basics of preprocessing: collecting more realistic "dirty" data and refining it to suit the purpose of the analysis.
Since covering each element in depth (web scraping, API programming, missing value filling, etc.) would require several books, we introduce the most common and basic techniques and provide appropriate examples so that you can apply them to real data.

This textbook contains a lot of Python example code.
You can download the code directly from the author's website, but since data processing is more of a practical skill than a systematic academic discipline, I recommend that you do not simply run it. Type it in yourself, work through trial and error, and develop it into tools that are useful to you.

Table of Contents
Chapter 1 Python and the Practice Environment
1 Introduction
2 Introduction and Setup of Google Colab
3 Review of basic Python syntax

Chapter 2 Data ETL
1 Introduction to Data ETL
2 Concept of Data ETL
3 Data Extraction
4 Data Conversion
5 Data loading

Chapter 3 Pandas
1 What is Pandas?
2 Pandas data structures
3 Select data
4 Data processing
5 Data Analysis
6 Data Editing

Chapter 4 Web Scraping
1 Introduction to Web Scraping
2 Web Scraping Tools
3 The Web Scraping Process
4 HTML Basics

Chapter 5 Web Scraping Libraries
1 Web scraping methods
2 BeautifulSoup
3 Selenium

Chapter 6 Data Collection Using Web APIs
1 Understanding API Concepts and REST APIs
2 Preparing to Use Public APIs
3 Calling the API with Python

Chapter 7 Data Cleaning (1) - Missing Value Detection and Resolution
1 Concept of data cleaning
2 Types of missing values
3 Detecting missing values
4 Handling missing values

Chapter 8 Data Cleaning (2) - Outlier Detection and Resolution
1 The meaning and importance of outliers
2 Detecting outliers
3 Handling outliers
4 Handling duplicate data

Chapter 9 Data Cleaning (3) - Data Merging and Transformation
1 The concept and necessity of data type conversion
2 Data type conversion using Pandas
3 Merge data frames

Chapter 10 Data Cleaning (4) - Data Scaling and Encoding
1 The concept and necessity of scaling
2 Major Scaling Techniques
3 Categorical data encoding
Product Details
- Publication date: July 25, 2025
- Page count, size: 236 pages | 188 × 257 × 20 mm
- ISBN13: 9788920053382
- ISBN10: 8920053383
