In the era of big data, technologies such as data warehousing and data mining are urgently needed to aggregate and mine data resources and to unlock the value that data contains, supporting national development through data competitiveness and advancing social progress through data productivity. In this course you will learn the basic theory and related engineering techniques of data warehousing and data mining, and how to collect, clean, store, analyze, and mine massive data.
-- Course Team
Course Overview
The online course "Data Warehouse and Data Mining" emphasizes linking theory with practice, with theory as the warp and application as the weft. Taking data as its starting point, it introduces data warehouse and data mining technology within a unified framework. Topics include data concepts, data warehouse models, knowledge types, data preprocessing, classification, regression, association mining, clustering, anomaly detection, and data visualization, as well as the design and implementation of a big data mining platform. After completing the course, students will understand the basic principles of storing and mining massive data in a data warehouse, and will be able to use algorithms for data preprocessing, association rule mining, cluster analysis, classification, and anomaly detection to develop software tools that address the efficient management and in-depth use of massive data in real engineering projects. The course provides the basic theory and skills needed for future scientific research or other data-related work.
Course Syllabus
Introduction
1 What Is Data Mining and Why Data Mining
2 Data Mining Process
3 Data to be Mined
4 Data Mining Tasks
5 Evaluation of Knowledge
Data
1 Data Objects and Attribute Types
2 Basic Statistical Descriptions of Data
3 Measuring Data Similarity and Dissimilarity
Data Preprocessing
1 Overview
2 Data Cleaning
3 Data Integration
4 Data Transformation
5 Data Reduction
Association Rule Mining
1 Basic Concept
2 Frequent Itemset Generation
3 Rule Generation
4 Factors Affecting Complexity of Apriori
5 Compact Representation of Frequent Itemsets
6 Pattern Evaluation
Classification
1 Classification: Basic Concepts
2 Decision Tree Induction
3 Bayes Classification Methods
4 Techniques to Improve Classification Accuracy: Ensemble Methods
5 Classification of Class-Imbalanced Data Sets
6 Model Evaluation and Selection
Cluster Analysis
1 An Introduction
2 Partitioning Methods
3 Hierarchical Methods
4 Density- and Grid-Based Methods
5 Evaluation of Clustering
Outlier Analysis
1 Outlier and Outlier Analysis
2 Outlier Detection Methods
3 Statistical Approaches
4 Proximity-Based Approaches
5 Clustering-Based and Classification-Based Approaches
Data Visualization
1 Introduction
2 Function of Data Visualization
3 Data Visualization Methods
4 Tools of Data Visualization
Data Warehouse
1 Introduction
2 Data Warehouse and Related Technology Development
3 The Data Model of the Data Warehouse
4 Data ETL
5 Data Reorganization
6 Conclusion
Perspective
1 Data Resource
2 Data Utilization
3 Data Ecology
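Certificate Requirements
In response to national low-carbon and environmental protection policies, starting from the fall 2021 semester the China University MOOC (中国大学MOOC) platform no longer issues paper certificates and provides electronic certificates only. The application method and process remain unchanged.
Electronic certificates can be verified by scanning the QR code on the certificate, or by visiting https://www.icourse163.org/verify and entering the certificate number. Students can download and print their electronic certificates from the "Personal Center - Certificates - View Certificate" page.
Students who complete the course content and assessments and whose grades meet the course's assessment standard (the standard differs from course to course; see the grading criteria within each course) are eligible to apply for a certificate. They can complete a paid online application while the application window is open (the dates shown on the application page apply).
Notes on applying for a certificate:
1. As required by relevant national laws and regulations, real-name verification is required when applying for a certificate. Please make sure the real-name information you submit is true, complete, and valid.
2. After real-name verification and payment are complete, the system automatically generates and sends the electronic certificate. Once the electronic certificate has been generated, refunds are not supported.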
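References
Hanning Yuan, Shuliang Wang, Yong Cheng, Fusheng Jin, Hong Song, 2015, Data Warehouse and Data Mining (数据仓库与数据挖掘), Posts & Telecom Press (人民邮电出版社)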
Shuliang Wang, Hanning Yuan, 2014, Spatial data mining: a perspective of big data, International Journal of Data Warehousing and Mining, 10(4):50-70
Deren Li, Shuliang Wang*, Hanning Yuan*, Deyi Li, 2016, Software and Applications of Spatial Data Mining, WIREs Data Mining and Knowledge Discovery, 6(3):84-114
Deren Li, Shuliang Wang, Deyi Li, 2015, Spatial Data Mining: Theory and Application, Springer
Jiawei Han, Micheline Kamber, Jian Pei, 2017, Data Mining: Concepts and Techniques (3rd Edition), Morgan Kaufmann
William Inmon, 2005, Building the Data Warehouse (4th Edition), Wiley