A Biomed Data Analyst Training Program
Modern Statistics: A Computer Based Approach with Python
Publisher: Springer International Publishing; 1st edition (September 15, 2022)
Errata: See known errata here
Slides
- Introduction
- Data types and data integration
- Supervised learning
- Model performance
- Time series
- Data visualization
- Causality and experimental design
Code and data files
This part of the repository contains:
- notebooks: Python code of individual chapters in Jupyter notebooks (download notebooks and data as notebooks.zip)
The Python package mistat contains data files and utility functions referred to in the Modern Statistics book. It is available for installation from the Python Package Index at https://pypi.org/project/mistat/.
The mistat package is maintained in a GitHub repository at https://github.com/gedeck/mistat.
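To confirm that mistat is visible to the Python environment you are working in, you can query the installed distribution metadata. The following is a minimal sketch that uses only the standard library; the helper name `installed_version` is hypothetical and is not part of mistat.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent.

    `installed_version` is a hypothetical helper for illustration only.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("mistat")
    if version is None:
        print("mistat is not installed in this environment")
    else:
        print(f"mistat {version} is installed")
```

If the helper reports that mistat is missing, the package is most likely installed into a different environment than the one running the notebook.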
Try the code
You can explore the code on Binder.
Installation instructions
Instructions on installing Python and required packages are here.
These Python packages are used in the code examples of Modern Statistics:
- mistat (for access to data sets and additional functionality)
- numpy
- scipy
- scikit-learn
- statsmodels
- pingouin
- xgboost
- KDEpy
- networkx
- scikit-fda
- pgmpy
- dtreeviz
- svglib
- pydotplus
The notebook InstallPackages.ipynb contains the pip command to install the required packages. Note that some of the packages may need to be pinned to specific versions.
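As a rough illustration of what such an install step might look like outside the notebook, the sketch below checks which of the packages listed above are missing and builds a pip command for them. It is not the content of InstallPackages.ipynb; the helper names `missing_packages` and `pip_install_command` are hypothetical, and no version pins are included because the source does not state them.

```python
import sys
from importlib import metadata

# Package names as listed in this README (PyPI distribution names).
REQUIRED = [
    "mistat", "numpy", "scipy", "scikit-learn", "statsmodels", "pingouin",
    "xgboost", "KDEpy", "networkx", "scikit-fda", "pgmpy", "dtreeviz",
    "svglib", "pydotplus",
]


def missing_packages(packages):
    """Return the distributions from `packages` that are not installed."""
    missing = []
    for name in packages:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


def pip_install_command(packages):
    """Build a pip command that targets the currently running interpreter."""
    return [sys.executable, "-m", "pip", "install", *packages]


if __name__ == "__main__":
    todo = missing_packages(REQUIRED)
    if todo:
        # Print the command instead of running it, so nothing is installed
        # without the user seeing it first.
        print(" ".join(pip_install_command(todo)))
    else:
        print("All listed packages are already installed")
```

Calling pip through `sys.executable -m pip` installs into the same interpreter that runs the script, which avoids the common problem of installing into a different environment. To pin a package, a version specifier such as `name==<version>` can be used in place of the bare name.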
If you have a problem visualizing the decision tree or creating a network graph, follow the installation instructions for Graphviz on the dtreeviz GitHub site. On Windows, the problem is usually resolved by adding the path to the Graphviz binaries to the PATH system variable.
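A quick way to diagnose this is to check whether the Graphviz `dot` executable can be found on PATH. The sketch below does that with the standard library and shows how a directory could be prepended to PATH for the current Python process only. The helper names are hypothetical, and the directory in the usage example is a placeholder that depends on where Graphviz was installed on your machine.

```python
import os
import shutil


def graphviz_available(executable="dot"):
    """Return True if the Graphviz executable can be found on PATH."""
    return shutil.which(executable) is not None


def prepend_to_path(directory, environ=None):
    """Prepend `directory` to PATH in `environ` (defaults to os.environ).

    This only affects the current process and its children; it does not
    change the system-wide PATH setting.
    """
    env = os.environ if environ is None else environ
    current = env.get("PATH", "")
    parts = current.split(os.pathsep) if current else []
    if directory not in parts:
        env["PATH"] = os.pathsep.join([directory] + parts)
    return env["PATH"]


if __name__ == "__main__":
    if not graphviz_available():
        # Placeholder path: replace with the actual Graphviz bin directory.
        prepend_to_path(r"C:\path\to\Graphviz\bin")
    print("dot found:", graphviz_available())
```

Changing PATH inside a notebook this way is a temporary workaround. Setting the system variable, as described above, makes the fix permanent.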