Data Mining for Transportation (交通数据挖掘技术)
9th offering
Course dates: January 8, 2024 to September 28, 2024
Workload: 3-5 hours per week
This offering has ended. 165 learners enrolled.
This course grew out of developments in information technology. The amount of traffic data collected is growing at an increasing rate, and the users of these data expect more sophisticated analysis of these large data sets.
-- Course team
Course Overview

This course grew out of developments in information technology. The amount of traffic data collected is growing at an increasing rate. At the same time, the users of these data expect more sophisticated analysis of these large data sets. The field of data mining has developed over the last decade to address this problem.

Data mining, one of the most active fields in computer science, is often defined as discovering useful but hidden patterns or relationships in a database. It is a good field to study not only for computer science students but also for transportation students and many other engineering students, because the same techniques can be used to solve many data-related problems that may arise during their future careers.

This course covers the basic concepts of data mining as well as specific applications to transportation systems, including data preprocessing, instance-based learning, decision trees, support vector machines, neural networks, outlier detection, and ensemble learning. The instructors will introduce what the techniques are, what they can do, how they are used, and how they work.

We welcome you to join us.

Course Objectives

The objectives of the course are to present the basic concepts of data mining and the principles and ideas underlying its practice, including data preprocessing, instance-based learning, decision trees, support vector machines, outlier mining, and ensemble learning.

After completing this course, students will be able to understand the fundamental terms and concepts of data mining and to apply the methods taught in class to the analysis and processing of real transportation data.

Syllabus

Week 1. Introduction to data mining

1.1 What is data mining?

1.2 Data mining functionality

1.3 Data Mining Techniques

1.4 Summary

Slides

Topic for Discussion: Week 1

Test 1

Term Project

Week 2. Data pre-processing

2.1 Why preprocess the data?

2.2 Data cleaning

2.3 Data integration

2.4 Data reduction

2.5 Data transformation

2.6 Summary

Slides

Topic for Discussion: Week 2

Test 2

Week 3. Instance-based learning

3.1 Overview of IBL

3.2 Components of kNN

3.3 Variants of kNN

3.4 Summary

Slides

Topic for Discussion: Week 3

Course Record 3.1.1

Course Record 3.1.2

Slides by Bilal Farooq 3.1.1

Slides by Bilal Farooq 3.1.2

Test 3

Week 4. Decision Trees

4.1 Decision Tree Representation

4.2 Construct Decision Tree

4.3 Overfitting and Tree Pruning

4.4 Pros and Cons of DTs

Slides

Topic for Discussion: Week 4

Course Record 4.1.1

Course Record 4.1.2

Course Record 4.2.1

Course Record 4.2.2

Slides by Bilal Farooq 4.1

Slides by Bilal Farooq 4.2

Test 4

Week 5. Support Vector Machine

5.1 Linear SVMs

5.2 Non-linear SVMs

5.3 Multiclass

5.4 Support Vector Regression

5.5 Summary

Slides

Topic for Discussion: Week 5

Test 5

Week 6. Outlier Mining

6.1 Background of Outlier Detection

6.2 Statistic-based Method

6.3 Distance-based Method

6.4 Density-based Method

6.5 Conclusions

Slides

Topic for Discussion: Week 6

Test 6

Week 7. Ensemble Learning

7.1 General Idea on Ensemble Methods

7.2 Popular methods for ensemble

7.3 Class-Imbalanced Data

7.4 Summary

Slides

Topic for Discussion: Week 7

Test 7

Week 8. Clustering

8.1 Introduction to Clustering

8.2 K-means and K-medoids

8.3 DBSCAN

8.4 Model-Based Clustering

Test 8

Extension 1 - Neural Network

Course Record - Neural Network 1

Course Record - Neural Network 2

Slides by Bilal Farooq - Neural Network

Codes for Course

Prerequisites

Undergraduate-level knowledge of probability, statistics, and linear algebra; basic knowledge of traffic engineering; and basic programming skills.

References


Textbook:

陈淑燕, 马永锋, 乔凤翔. Machine Learning for Transportation, Southeast University Press (东南大学出版社), 2022.

Reference books:

A. Ian H. Witten, Eibe Frank, Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations, 3rd ed., San Francisco: Morgan Kaufmann Publishers, 2011.

B. Charu C. Aggarwal, Data Mining: The Textbook, Springer, 2015.

C. Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Introduction to Data Mining, 1st ed., Pearson, 2005.

D. Christopher M. Bishop, Pattern Recognition and Machine Learning, Information Science and Statistics series, Springer, 2006.

E. Jiawei Han, Micheline Kamber, and Jian Pei, Data Mining: Concepts and Techniques, 3rd ed., Morgan Kaufmann, 2011.

F. Required handouts will be provided by the instructors.

Southeast University
1 instructor

Chen Shuyan (陈淑燕), Professor
