This course covers the methodology, major software tools, and applications of data mining. By introducing the principal ideas of statistical learning, it helps students understand the conceptual underpinnings of data mining methods. The emphasis is on using existing software packages (mainly in R) rather than on having students implement the algorithms themselves. Topics include statistical learning; resampling methods; linear regression; variable selection; regression shrinkage; dimension reduction; non-linear methods; logistic regression; discriminant analysis; nearest neighbors; decision trees; bagging; boosting; support vector machines; principal components analysis; and clustering. Suitable for students who want to learn this topic and for teachers looking for course materials on it.