Data mining is often defined as the discovery of useful but hidden patterns or relationships in a database, and it is one of the popular fields in computer science. Finding patterns, trends, and outliers in large datasets, and summarizing them with simple quantitative models, is one of the grand challenges of the information age: turning data into knowledge.
Data mining programs are intended to search through data for hidden relationships and patterns. This approach is particularly relevant to intelligent transportation systems, where it can help traffic researchers and managers solve traffic problems.
This course provides an introduction to data mining as applied to transportation systems. It intends to cover the basic concepts of data mining as well as specific applications to transportation systems.
The objectives of the course are to present the basic concepts of data mining and the principles and ideas underlying its practice, including data preprocessing, instance-based learning, decision trees, support vector machines, outlier mining, and ensemble learning.
After completing this course, students will be able to understand the fundamental terms and concepts of data mining and to apply the methods taught in class to the analysis and processing of real transportation data.
Knowledge of probability, statistics and linear algebra at the undergraduate level
Basic knowledge of traffic engineering and basic programming skills
Jiawei Han, Micheline Kamber and Jian Pei, Data Mining: Concepts and Techniques, Morgan Kaufmann, 3rd edition, 2011.
Ian H. Witten, Eibe Frank, Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations, San Francisco: Morgan Kaufmann Publishers, 3rd edition, 2011.
Charu C. Aggarwal, Data Mining: The Textbook, Springer, 2015.
Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Introduction to Data Mining, Pearson, 1st Edition, 2005.
Christopher M. Bishop, Pattern Recognition and Machine Learning, Information Science and Statistics series, Springer, 2006.