Data Mining for Transportation
2nd offering
Course dates: December 9, 2020 to December 31, 2020
Workload: 3-5 hours per week
This offering has ended; 25 learners enrolled.
The motivation for this course is rooted in the development of information technology. The amount of traffic data collected is growing at an increasing rate. At the same time, users of these data expect more sophisticated analysis of these large data sets. The field of data mining has developed over the last decade to address this problem. Data mining is one of the popular fields in computer science and is particularly relevant to intelligent transportation systems. It can help traffic researchers and managers solve traffic problems. Data mining is a good field to study not only for computer science students but also for transportation students, because the same techniques can be applied to many traffic problems that may arise during their future careers.
-- Course team
Course Overview

Data mining is often defined as discovering useful but hidden patterns or relationships in a database, and it is one of the popular fields in computer science. Finding patterns, trends, and outliers in large datasets, and summarizing them with simple quantitative models, is one of the grand challenges of the information age: turning data into knowledge.

Data mining programs search through data for hidden relationships and patterns. This approach is particularly relevant to intelligent transportation systems, where it can help traffic researchers and managers solve traffic problems.

This course provides an introduction to data mining as applied to transportation systems. It covers the basic concepts of data mining as well as specific applications to transportation systems.

Course Objectives

The objectives of the course are to present the basic concepts of data mining and the principles and ideas underlying its practice, including data preprocessing, instance-based learning, decision trees, support vector machines, outlier mining, and ensemble learning.

After completing this course, students will have the ability to understand the fundamental terms and concepts of data mining, and to use the methods taught in class for the analysis and processing of real transportation data. 
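
As a rough illustration of what applying one of these methods to traffic data can look like, the sketch below implements a plain k-nearest-neighbor classifier (the instance-based learning method covered in Week 3). It is not taken from the course materials: the function name `knn_predict`, the two features (speed and occupancy), and the sample values are all made up for the example.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples.

    train: list of (features, label) pairs; features are equal-length tuples.
    """
    # Sort training samples by Euclidean distance to the query point.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    # Majority vote over the labels of the k closest samples.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical samples: (speed in km/h, occupancy in %) -> traffic state.
samples = [
    ((95.0, 8.0), "free"),
    ((88.0, 12.0), "free"),
    ((90.0, 10.0), "free"),
    ((25.0, 45.0), "congested"),
    ((18.0, 55.0), "congested"),
    ((30.0, 40.0), "congested"),
]

print(knn_predict(samples, (28.0, 42.0)))  # congested
print(knn_predict(samples, (92.0, 9.0)))   # free
```

The features here are used unscaled for brevity. With real data, features measured in different units would normally be normalized first so that no single feature dominates the distance.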


Syllabus

Week 1. Introduction to Data Mining

1.1 What is Data Mining?

1.2 Data Mining Functionality

1.3 Data Mining Techniques

1.4 Summary

Courseware


Week 2. Data Pre-processing

2.1 Why Preprocess the Data?

2.2 Data Cleaning

2.3 Data Integration

2.4 Data Reduction

2.5 Data Transformation

2.6 Summary

Courseware
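
To make the Week 2 topics concrete, here is a minimal sketch of one cleaning step (filling missing readings with the mean) followed by one transformation step (min-max scaling to [0, 1]). These are common textbook choices, not necessarily the ones used in the lectures; the helper names and the flow counts are hypothetical.

```python
def fill_missing_with_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


def min_max_scale(values):
    """Linearly rescale values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: map everything to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical 5-minute flow counts (vehicles); None marks a missing reading.
flows = [100, None, 300, 200]
cleaned = fill_missing_with_mean(flows)
scaled = min_max_scale(cleaned)
print(cleaned)  # [100, 200.0, 300, 200]
print(scaled)   # [0.0, 0.5, 1.0, 0.5]
```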


Week 3. Instance-based Learning

3.1  Overview of IBL

3.2  Components of kNN

3.3  Variants of kNN

3.4  Summary

Courseware


Week 4. Decision Trees

4.1  Decision Tree Representation

4.2  Constructing a Decision Tree

4.3  Overfitting and Tree Pruning

4.4  Pros and Cons of DTs

Courseware


Week 5. Support Vector Machine

5.1  Linear SVMs

5.2  Non-linear SVMs

5.3  Multiclass

5.4  Support Vector Regression

5.5  Summary

Courseware


Week 6. Outlier Mining

6.1 Background of Outlier Detection

6.2 Statistics-based Method

6.3 Distance-based Method

6.4 Density-based Method

6.5 Conclusions

Courseware
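
One simple example of a statistics-based approach to outlier detection is the z-score rule: flag any reading that lies more than a chosen number of standard deviations from the mean. The sketch below is an illustration under that assumption, not the course's own method; the function name, the threshold of 2.0, and the speed readings are hypothetical.

```python
import statistics


def zscore_outliers(values, threshold=2.0):
    """Return the indices of values whose |z-score| exceeds `threshold`."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        # All values identical: nothing can be flagged.
        return []
    return [i for i, v in enumerate(values) if abs(v - mean) / std > threshold]


# Hypothetical spot speeds (km/h) from one detector; one reading looks faulty.
speeds = [62, 60, 61, 59, 63, 60, 61, 15, 62, 60]
print(zscore_outliers(speeds))  # [7]
```

The threshold is deliberately lower than the often-quoted value of 3. In a sample this small, a single extreme reading inflates the standard deviation so much that its own z-score cannot reach 3.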


Week 7. Ensemble Learning

7.1 General Idea of Ensemble Methods

7.2 Popular Ensemble Methods

7.3 Class-Imbalanced Data

7.4 Summary

Courseware

Prerequisites
  • Knowledge of probability, statistics and linear algebra at the undergraduate level

  • Basic knowledge of traffic engineering and basic programming skills


References
  • Jiawei Han, Micheline Kamber, and Jian Pei, Data Mining: Concepts and Techniques, 3rd ed., Morgan Kaufmann, 2011.

  • Ian H. Witten and Eibe Frank, Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations, 3rd ed., San Francisco: Morgan Kaufmann Publishers, 2011.

  • Charu C. Aggarwal, Data Mining: The Textbook, Springer, May 2015.

  • Pang-Ning Tan, Michael Steinbach, and Vipin Kumar, Introduction to Data Mining, 1st ed., Pearson, 2005.

  • Christopher M. Bishop, Pattern Recognition and Machine Learning, Information Science and Statistics series, Springer, 2006.

Southeast University
2 instructors
Shuyan CHEN

Professor

Wenbo ZHANG

Lecturer
