Statistical and Machine Learning Methods for Classification Using R - June 17th to 21st

Date:

June 17th to 21st. Morning, 9:00 AM to 12:00 PM

Classroom:

PC2

Instructor

Pankaj Choudhary
Department of Mathematical Sciences, University of Texas at Dallas, USA

Pankaj Choudhary is a Professor of Statistics at the University of Texas at Dallas. His current research interests include analysis of method comparison data, functional data analysis, and risk prediction. He is co-author of a book titled Measuring Agreement: Models, Methods, and Applications that was published by Wiley in 2017. At present, he is developing an R package to accompany the book. He also teaches a semester-long statistical and machine learning course at his institution.

Language

English

Description

Knowing how to classify a categorical response variable based on information provided by predictor variables is a necessary skill for any data scientist. This course will teach statistical and machine learning methods for classification and illustrate their application by analyzing a variety of real datasets using R. Participants will be given a similar dataset to analyze so that they gain hands-on experience.

Course goals

This course will provide hands-on training in statistical and machine learning methods for classification, their implementation using R, and their application.

Course contents

The course will be taught using the book, An Introduction to Statistical Learning with Applications in R, by James, Witten, Hastie, and Tibshirani, Springer, 2013. A free PDF copy of the book can be downloaded from http://www-bcf.usc.edu/~gareth/ISL/. The following is a tentative schedule and list of topics to be covered:

Part I: Introduction (Basic concepts; evaluating accuracy of a classification method; resampling methods; bias-variance tradeoff; K-nearest neighbors (KNN) method)
Part II: Logistic regression and discriminant analysis
Part III: Tree-based methods
Part IV: Support vector machines

Prerequisites

Basic knowledge of linear regression methods
Intermediate-level experience with R

Targeted at

Students and researchers interested in statistical and machine learning methods for classification

Evaluation

Each day, students will analyze one or two datasets using the statistical methods discussed that day.

Computer class or student's laptop?

Student's laptop

Software requirements

R, freely available from https://www.r-project.org