Python, more Python, pandas and sklearn

Date:

June 26 to 30, mornings from 9h to 12h

Instructor

Alexandre Perera Lluna

Alexandre Perera Lluna (1973) holds degrees in Physics (1996, UB) and Electrical Engineering (2001, UB) and a PhD in Physics (2003, UB). He was a postdoctoral fellow at Texas A&M University (TX, USA, 2003-2004) and at EADS, the European Aeronautic Defence and Space Company (CRC Forschung, München, DE, 2005), and a Ramon y Cajal Fellow (2007). He is currently a tenured full professor at the Polytechnic University of Catalonia and is also affiliated as a researcher with the Institut de Recerca de Sant Joan de Déu.

He leads the B2Slab (http://b2slab.upc.edu), the Bioinformatics and Biomedical Signals Laboratory, heads the Biomedical Research Center, and is a cofounder of the start-ups eheus.com and vincer.ai. His interests cover data science in primary care, bioinformatics and open source in general.

Language

English

Description

This 15-hour crash course covers scientific Python for data analysis. It includes the following main stages:

  • Introduction to the Python language as a tool: workflow, IPython, the IPython notebook (Jupyter), basic types, mutability and immutability, and object-oriented programming.
  • Creation of CLI tools and APIs with Python.
  • Short introduction to numerical Python and to libraries for graphical visualization such as seaborn or plotnine, including interactivity in Jupyter notebooks.
  • Introduction to scientific kits for data analysis with machine learning: principal component analysis, clustering and supervised analysis of multivariate data.

Course goals

    • To gain proficiency in coding Python and understand its basic types
    • To understand decorators and generators
    • To learn how to build an API and expose prediction models through web services
    • To learn how to build and manage data-frame-based representations of data
    • To learn how to use the scikit-learn (sklearn) machine learning toolkit
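
As a taste of the decorators and generators goal, here is a minimal sketch using only the standard library. The names count_calls, square and running_mean are illustrative assumptions, not course material.

```python
import functools


def count_calls(func):
    """Decorator that counts how many times func has been called."""
    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper


@count_calls
def square(x):
    return x * x


def running_mean(values):
    """Generator that lazily yields the mean of the values seen so far."""
    total = 0.0
    for n, value in enumerate(values, start=1):
        total += value
        yield total / n


squares = [square(x) for x in range(4)]   # calls square four times
means = list(running_mean([2, 4, 6]))     # consumes the generator
```

The decorator adds behaviour around a function without changing its body, while the generator produces values one at a time instead of building a full list in advance.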

Course contents

1. Introduction

  • a. Why Python?
  • b. Python History
  • c. Installing Python
  • d. Python resources

2. Working with Python

  • a. Workflow
  • b. IPython vs. CLI
  • c. Text Editors
  • d. IDEs
  • e. Notebook

3. Getting started with Python

  • a. Introduction
  • b. Getting Help
  • c. Basic types
  • d. Mutable and immutable types
  • e. Assignment operator
  • f. Controlling execution flow
  • g. Exception handling
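
The interplay between assignment, mutability and exception handling in this block can be sketched as follows. This is an illustrative example with made-up variable names, not an excerpt from the course notes.

```python
# Assignment binds a name to an object; it does not copy the object.
a = [1, 2, 3]        # lists are mutable
b = a                # b refers to the same list as a
b.append(4)          # the change is visible through both names

t = (1, 2, 3)        # tuples are immutable
try:
    t[0] = 99        # item assignment is not allowed on a tuple
except TypeError as err:
    message = f"cannot modify a tuple: {err}"

c = list(a)          # an explicit shallow copy creates a new object
c.append(5)          # a is left unchanged by this append
```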

4. Functions and Object Oriented Programming

  • a. Defining Functions
  • b. Input and Output
  • c. Standard Library
  • d. Object-oriented programming

5. Python as a tool

  • a. Creating an API for serving data services
  • b. Building CLI tools with Python
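
A command-line tool of the kind covered in this block might look like the sketch below, which uses the standard library argparse module. The tool name colstats and its options are hypothetical, chosen only for illustration.

```python
import argparse
import statistics


def build_parser():
    parser = argparse.ArgumentParser(
        prog="colstats",  # hypothetical tool name
        description="Print a summary statistic for a list of numbers.",
    )
    parser.add_argument("values", type=float, nargs="+",
                        help="numbers to summarize")
    parser.add_argument("--stat", choices=["mean", "median"], default="mean",
                        help="statistic to compute")
    return parser


def main(argv=None):
    # With argv=None, argparse reads the arguments from sys.argv.
    args = build_parser().parse_args(argv)
    func = statistics.mean if args.stat == "mean" else statistics.median
    result = func(args.values)
    print(f"{args.stat}: {result}")
    return result


# Passing an explicit list makes the tool easy to try and to test;
# a real script would normally call main() with no arguments.
result = main(["1", "2", "6", "--stat", "median"])
```

Keeping the parsing in main(argv) rather than at module level lets the same function be called from tests, from a notebook, or from the shell.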

6. Introduction to NumPy

  • a. Overview
  • b. Arrays
  • c. Operations on arrays
  • d. Advanced arrays (ndarrays)
  • e. Notes on Performance

7. Matplotlib

  • a. Introduction
  • b. Figures and Subplots
  • c. Axes and Further Control of Figures
  • d. Other Plot Types
  • e. Animations

8. Python scikits

  • a. Introduction
  • b. Pandas
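
For orientation, a data-frame-based representation in pandas can be as small as the sketch below. It assumes pandas is installed, and the column names and values are made up for illustration.

```python
import pandas as pd

# A small, made-up table: one row per observation, one column per variable.
df = pd.DataFrame({
    "patient": ["a", "b", "c", "d"],
    "group": ["control", "treated", "control", "treated"],
    "score": [1.0, 3.0, 2.0, 5.0],
})

# Boolean indexing keeps the rows that satisfy a condition.
high = df[df["score"] > 1.5]

# Split-apply-combine: mean score per group.
group_means = df.groupby("group")["score"].mean()
```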

9. scikit-learn

  • a. Datasets
  • b. Sample generators
  • c. Unsupervised Learning
  • d. Supervised Learning
    • i. Linear and Quadratic Discriminant Analysis
    • ii. Nearest Neighbors
    • iii. Support Vector Machines
  • e. Feature Selection
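
The topics in this block combine naturally in a scikit-learn pipeline. The sketch below assumes scikit-learn is installed and uses its bundled iris dataset; the choice of two principal components and five neighbors is arbitrary and only for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Load a bundled dataset as a feature matrix X and a label vector y.
X, y = load_iris(return_X_y=True)

# Hold out 30% of the samples to estimate performance on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Standardize, project onto two principal components (unsupervised),
# then classify with nearest neighbors (supervised).
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=2),
    KNeighborsClassifier(n_neighbors=5),
)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"test accuracy: {accuracy:.2f}")
```

Wrapping the steps in a pipeline means the scaler and the PCA projection are fitted on the training split only, which avoids leaking information from the test split.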

10. Practical work with Python

Prerequisites

Basic coding skills are preferable.

Targeted at

People aiming to learn Python from scratch, up to data management in Python.

Evaluation

Final short assignment

Software requirements

Preferably a GNU/Linux machine, or a VirtualBox instance of GNU/Linux, with administrator rights