Open source-based data analysis
Description
Book Introduction
“In the end you should only measure and look at the numbers that drive action, meaning that the data tells you what you should do next.” -Alexander Peiniger

These words read as both an insight and a warning about how we, living in an era of information-driven innovation, should view data and analysis.
The word "data" has been tossed around endlessly in recent years, yet our ability to properly collect, refine, and interpret it, and to distill it into an actionable form, remains severely limited.
Simply storing data is one thing; interpreting meaning from it and putting it into action is another.

This textbook, "Open Source-Based Data Analysis," is an educational effort to bridge that gap.
Modern society has become a data-driven one, with millions of data points generated every second from social media, sensors, web logs, e-commerce, and more.
For example, companies analyze consumer behavior to suggest products, governments predict the spread of infectious diseases and design quarantine policies, and hospitals analyze patients' genetic information and medical records to provide precision medicine.
Data is no longer just a supplementary tool; it has become a core resource for explaining facts and designing the future.
But just having data doesn't automatically create meaning.
Meaning is born through analysis.
Analysis is not simply a technique; it is a way of interpreting, and how we analyze shapes how accurately we see the world.
This textbook is designed to develop an eye for data, a hand for handling data, and a mind capable of interpreting data.
In particular, this textbook is different from other data analysis textbooks in that it is 'open source-based.'

Open source goes beyond the concept of simply being a "free tool" to embrace the philosophy of sharing, reusing, and evolving technology together.
The analytical tools we use in the exercises (Linux, Python, Jupyter Notebook, Pandas, GitHub, VS Code) all grew out of the open source ecosystem.
These tools, created through the voluntary contributions of countless anonymous developers, researchers, and students, have become the standard for data analysis today. This textbook is designed to help you understand and participate in this open technology ecosystem.

Table of Contents
Chapter 1: Understanding Data Analysis and Open Source
1.1. Understanding Data Analysis
1.2. Data Analysis Process
1.3. Classification of Data
1.4. Open Source Environment

Chapter 2: Python for Data Analysis
2.1. Lists and Dictionaries
2.2. Data Input/Output
2.3. Python Syntax Elements
2.4. Functional Programming

Chapter 3: Data Collection and Storage
3.1. What Is Data Collection?
3.2. Types of Data
3.3. Data Collection Methods
3.4. Data Storage
3.5. Understanding and Using Pandas' DataFrame
3.6. Concept Organization Practice

Chapter 4: Data Preprocessing
4.1. What Is Data Preprocessing?
4.2. Data Measurement with Pandas
4.3. Data Cleaning Using Pandas
4.4. Solving Data Problems with Pandas
4.5. Concept Organization Practice

Chapter 5: Data Analysis
5.1. Understanding Data Analysis
5.2. Advances in Data Analysis Methodologies
5.3. Exploratory Data Analysis (EDA)
5.4. Statistical Analysis Methodology
5.5. Machine Learning-Based Analysis Methodology
5.6. Deep Learning-Based Analysis Methodology

Chapter 6: Data Visualization
6.1. What Is Data Visualization?
6.2. Matplotlib Library
6.3. Data Visualization Techniques
6.4. Data Visualization Techniques and Styling
6.5. Advanced Data Visualization Techniques
6.6. Concept Practice Summary

Chapter 7: Structured Data Analysis: Utilizing Seoul City Public Bicycle Data
7.1. Practical Project Overview
7.2. Data Collection and Preprocessing
7.3. Data Analysis
7.4. Linkage Analysis

Chapter 8: Semi-structured Data Analysis: Utilizing Bulletin Board Post Data
8.1. Practical Project Overview
8.2. Data Collection
8.3. Text Data Preprocessing
8.4. Keyword Analysis
8.5. Text Classification and Visualization

Chapter 9: Unstructured Data Analysis: Using Fashion Photo Data
9.1. Practical Project Overview
9.2. Data Collection
9.3. Image Analysis Using VLM
9.4. Keyword Analysis and Report Writing Using LLM

Chapter 10: Time Series Data Analysis: Using Stock Price Data
10.1. Practical Project Overview
10.2. Data Collection
10.3. Financial Time Series Analysis
10.4. Prophet-Based Prediction
10.5. GBRT-Based Prediction
PRODUCT DETAILS
- Date of issue: July 25, 2025
- Page count, weight, size: 540 pages | 188 × 257 × 35 mm
- ISBN13: 9788920052859
- ISBN10: 8920052859
