Python

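# Run inside an R session: install the R kernel and its dependencies from source
install.packages(c('rzmq','repr','IRkernel','IRdisplay'),
                 repos = c('http://irkernel.github.io/', getOption('repos')),
                 type = 'source')
# Register the kernel so that Jupyter can find it
IRkernel::installspec()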

Scala Kernel

github: alexarchambault: jupyter-scala

install:
$ cd ~/Downloads
$ wget https://oss.sonatype.org/content/repositories/snapshots/com/github/alexarchambault/jupyter/jupyter-scala-cli_2.11.6/0.2.0-SNAPSHOT/jupyter-scala_2.11.6-0.2.0-SNAPSHOT.tar.xz
$ tar xf jupyter-scala_2.11.6-0.2.0-SNAPSHOT.tar.xz
$ cd ~/Downloads/jupyter-scala_2.11.6-0.2.0-SNAPSHOT
$ ./bin/jupyter-scala
$ jupyter console --kernel scala211

Zen of Python: https://www.python.org/dev/peps/pep-0020/

Django

surveys

github: jessykate: django-survey

Machine Learning

AI: building decision rules
80’s machine learning: learn these rules from observations
90’s statistical learning: model the noise in the observations
big data: many observations, simple rules

hasher.fit_transform: transforms a list of strings into a matrix of word counts
estimator.partial_fit: updates the model incrementally, one batch of observations at a time
www.wendelin.io: Wendelin Industrial Big Data
Microsoft Benjamin: benguin@microsoft.com
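
The hasher.fit_transform note above can be made concrete with a small standard-library sketch of the hashing trick: each word is hashed to a column index, so a list of strings becomes a fixed-width count matrix without storing a vocabulary. This is an illustration under assumptions, not scikit-learn's implementation: the function names, the 16-column width, and the use of crc32 are all choices made here (scikit-learn's hashers use a different hash function and much wider matrices).

```python
import zlib
from collections import Counter

N_FEATURES = 16  # hypothetical and tiny on purpose; real hashers use far more columns


def hash_features(tokens, n_features=N_FEATURES):
    """Map a list of string tokens to a fixed-length count vector."""
    row = [0] * n_features
    for token, count in Counter(tokens).items():
        # crc32 is stable across runs, unlike the built-in hash() on str
        index = zlib.crc32(token.encode("utf-8")) % n_features
        row[index] += count
    return row


def fit_transform(documents, n_features=N_FEATURES):
    """Turn a list of whitespace-separated strings into a matrix (list of rows)."""
    return [hash_features(doc.split(), n_features) for doc in documents]


docs = ["big data simple rules", "simple rules from observations"]
matrix = fit_transform(docs)
for row in matrix:
    print(row)
```

Because the column index depends only on the word itself, new batches of text can be transformed without refitting anything, which is what makes this representation a natural fit for incremental training with partial_fit. Distinct words may collide in the same column; a wider matrix makes that less likely.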

scikit-learn

ENSAE course material

sklearn_ensae_course
clone the repository, navigate to the “rendered notebooks” folder and run ipython notebook
alternatively, view the notebooks online by filling in the GitHub user and repository in http://nbviewer.ipython.org/github/[name]/[repo]
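
The nbviewer address pattern above can be filled in programmatically. A minimal sketch, assuming only the URL pattern given in the note; the helper name nbviewer_url and the user name "some-user" are hypothetical placeholders, not the actual owner of the course repository.

```python
NBVIEWER_BASE = "http://nbviewer.ipython.org/github"


def nbviewer_url(name, repo):
    """Build the nbviewer address for a GitHub user (name) and repository (repo)."""
    return "{}/{}/{}".format(NBVIEWER_BASE, name, repo)


# "some-user" is a placeholder; substitute the real GitHub account
print(nbviewer_url("some-user", "sklearn_ensae_course"))
```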

IPython notebooks

Data Science with Hadoop - predicting airline delays - part 1

Webscraping

Crawling part

Scrapy

Extraction part

Books

Python for Data Analysis
Author: McKinney, Wes
Subtitle: Agile Tools for Real-World Data
Publisher: O’Reilly
ISBN: 978-1-449-31979-3
Year: 2013
Tags: NumPy, pandas, matplotlib, IPython, SciPy
GitHub: git://github.com/pydata/pydata-book.git

Python for Informatics
Author: Charles Severance
Subtitle: Exploring Information

Gavin Hackeling - Mastering Machine Learning with scikit-learn

Think Series by Allen B. Downey

Think Bayes
Subtitle: Bayesian Statistics in Python
Publisher: O’Reilly
ISBN: 978-1-449-37078-7
Year: 2013

Think Complexity
Publisher: Green Tea Press
Year: 2012
URL: greenteapress.com/complexity

Think Python (v3)
Subtitle: How to Think Like a Computer Scientist
Publisher: Green Tea Press
Year: 2008
URL: thinkpython.com

Think Stats (2ed)
Subtitle: Exploratory Data Analysis in Python
Publisher: Green Tea Press
Year: 2014
URL: thinkstats2.com

Mailing Lists

pydata: a Google Group list for questions related to Python for data analysis and pandas
pystatsmodels: for statsmodels or pandas-related questions
numpy-discussion: for NumPy-related questions
scipy-user: for general SciPy or scientific Python questions

Programming Concepts

stupidpythonideas.blogspot.fr: If you don’t like exceptions, you don’t like Python
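
As an illustration of why exceptions are hard to avoid in Python (a sketch written for these notes, not code taken from the linked post), the example below shows two everyday places where they appear: the "try it and handle the failure" style of updating a dictionary, and the StopIteration signal that ends every for loop. The function names are made up for this example.

```python
def count_word(counts, word):
    """Increment a counter by trying first and handling the missing key."""
    try:
        counts[word] += 1
    except KeyError:
        # first time this word is seen
        counts[word] = 1
    return counts


def manual_for_loop(iterable):
    """Spell out what a for loop does: call next() until StopIteration."""
    items = []
    iterator = iter(iterable)
    while True:
        try:
            items.append(next(iterator))
        except StopIteration:
            # the iterator is exhausted; this is how a for loop knows to stop
            break
    return items


print(count_word({"rules": 1}, "rules"))
print(manual_for_loop("abc"))
```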

IDEs

PyCharm

Published

13 February 2015

Modules

Sample Project Structure

os

XlsxWriter

coursera-dl

SimpleHTTPServer

IPython / Jupyter

JupyterLab

R Kernel