The motivation for this course comes from the rapid development of information technology. Traffic data are being collected at an ever-increasing rate, and the users of these data expect more sophisticated analysis of the resulting large data sets. The field of data mining has developed over the last decade to address this problem.
Data mining is often defined as the discovery of useful but hidden patterns or relationships in a database, and it is one of the most active fields in computer science. It is worth studying not only for computer science students but also for transportation students and other engineering students, because the same techniques can be used to solve many data-related problems that may arise during their careers.
This course covers the basic concepts of data mining and their specific applications to transportation systems. Topics include data preprocessing, instance-based learning, decision trees, support vector machines, neural networks, outlier detection, and ensemble learning. The instructors will introduce what these techniques are, what they can do, how they are used, and how they work.
We welcome you to join us.
The objective of the course is to present the basic concepts of data mining and the principles and ideas underlying its practice, including data preprocessing, instance-based learning, decision trees, support vector machines, outlier mining, and ensemble learning.
After completing this course, students will understand the fundamental terms and concepts of data mining and be able to apply the methods taught in class to the analysis and processing of real transportation data.
Knowledge of probability, statistics, and linear algebra at the undergraduate level; basic knowledge of traffic engineering; and basic programming skills.
Textbook:
Shuyan Chen, Yongfeng Ma, Fengxiang Qiao. Machine Learning for Transportation, Southeast University Press, 2022.
Reference books:
A. Ian H. Witten, Eibe Frank, Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations, San Francisco: Morgan Kaufmann Publishers, 3rd ed., 2011.
B. Charu C. Aggarwal, Data Mining: The Textbook, Springer, May 2015.
C. Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Introduction to Data Mining, Pearson, 1st Edition, 2005.
D. Christopher M. Bishop, Pattern Recognition and Machine Learning, Information Science and Statistics series, Springer, 2006.
E. Jiawei Han, Micheline Kamber and Jian Pei, Data Mining: Concepts and Techniques, Morgan Kaufmann, 3rd edition, 2011.
F. Required handouts will be provided by the instructors.