Scientific Python for Data Analysis - June 17th to 21st

Date:

June 17th to 21st. Afternoon, 3:00 PM to 6:00 PM

Classroom:

PC2

Instructor

Alexandre Perera Lluna

Alexandre Perera Lluna (1973) holds degrees in Physics (1996, UB) and Electronic Engineering (2001, UB) and a PhD in Physical Sciences (2003, UB). He was a postdoctoral fellow at Texas A&M University (TX, USA, 2003-2004) and at EADS European Aeronautic Defence and Space Company (CRC Forschung, München, DE, 2005), and a Ramon y Cajal Fellow (2007), and is currently tenured at the Polytechnic University of Catalonia (2013). He is also affiliated as a researcher with the Institut de Recerca de Sant Joan de Déu. He is the author of more than 60 papers in peer-reviewed journals, six patents and more than 60 contributions to national and international conferences. He is currently the coordinator of the research group B2Slab (http://b2slab.upc.edu), the Bioinformatics and Biomedical Signals Laboratory, a member of the board of directors of the Biomedical Research Center and deputy head for research of the automatic control department at UPC. His research covers artificial intelligence algorithms, multivariate statistics and machine learning applied to bioinformatics and bioengineering.

Language

English

Description

This 15-hour crash course covers scientific Python for data analysis. It is organized in three main stages:
    • Introduction to the Python language as a tool. Workflow, IPython, IPython notebook (Jupyter), basic types, mutability and immutability, and object-oriented programming.
    • Short introduction to numerical Python and matplotlib for graphical visualization.
    • Introduction to scientific kits for data analysis with machine learning. Principal components analysis, clustering and supervised analysis with multivariate data.
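
As a taste of the first stage, the distinction between mutable and immutable types can be sketched in a few lines of standard Python. This is only an illustrative example, not material taken from the course; the variable names are arbitrary.

```python
# Lists are mutable: two names bound to the same list see the same changes.
a = [1, 2, 3]
b = a              # assignment binds a second name; it does not copy the list
b.append(4)        # modifies the single shared list in place
shared = a is b    # both names refer to the same object

# Tuples are immutable: item assignment raises TypeError.
t = (1, 2, 3)
try:
    t[0] = 99
    tuple_modified = True
except TypeError:
    tuple_modified = False

# Strings are immutable too: methods return new objects instead of
# changing the original.
s = "python"
u = s.upper()

print(a, shared, tuple_modified, s, u)
```

To get an independent copy of a list rather than a second name for the same object, `list(a)` or `a.copy()` can be used.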

Course goals

    Learn Python from scratch.
    Learn to use NumPy and the machine learning scientific kits in Python.

Course contents

    Lecture plan
    1. Introduction

    a. Why Python?
    b. Python History
    c. Installing Python
    d. Python resources

    2. Working with Python

    a. Workflow
    b. ipython vs. CLI
    c. Text Editors
    d. IDEs
    e. Notebook

    3. Getting started with Python

    a. Introduction
    b. Getting Help
    c. Basic types
    d. Mutable and immutable
    e. Assignment operator
    f. Controlling execution flow
    g. Exception handling

    4. Functions and Object Oriented Programming

    a. Defining Functions
    b. Input and Output
    c. Standard Library
    d. Object-oriented programming

    5. Introduction to NumPy

    a. Overview
    b. Arrays
    c. Operations on arrays
    d. Advanced arrays (ndarrays)
    e. Notes on Performance (%timeit in IPython)

    6. Matplotlib

    a. Introduction
    b. Figures and Subplots
    c. Axes and Further Control of Figures
    d. Other Plot Types
    e. Animations

    7. Python scikits

    a. Introduction
    b. scikit-timeseries

    8. scikit-learn

    a. Datasets
    b. Sample generators
    c. Unsupervised Learning
    d. Supervised Learning

    i. Linear and Quadratic Discriminant Analysis
    ii. Nearest Neighbors
    iii. Support Vector Machines

    e. Feature Selection

    9. Practical Introduction to Scikit-learn

    a. Solving an eigenfaces problem

    i. Goals
    ii. Data description
    iii. Initial Classes
    iv. Importing data

    b. Unsupervised analysis

    i. Descriptive Statistics
    ii. Principal Component Analysis
    iii. Clustering

    c. Supervised Analysis

    i. k-Nearest Neighbors
    ii. Support Vector Classification
    iii. Cross validation

Prerequisites

No prior experience with Python is required.

Targeted at

Individuals who want an introduction to Python in a nutshell.

Evaluation

Through a questionnaire in the final session.

Computer class or student's laptop?

Computer room

Software requirements

Standard Python and machine learning scientific kits, all open source.