Biomedical Data Science
Welcome
Preface
Introduction for readers
What you will learn from this course/book
What we recommend you do while reading this book
Other reference books
Acknowledgements
I Data Science Foundations
1
Introduction to R
1.1
Intro to R programming (Session 1)
1.1.1
R as a basic calculator
1.1.2
Variables
1.1.3
Data types
1.1.4
Functions
1.1.5
R scripts
1.1.6
Resource links
1.2
Intro to R programming (Session 2)
1.2.1
Data structures
1.2.2
Vector
1.2.3
Matrix
1.2.4
List
1.2.5
Data Frame
1.2.6
Reading/writing data from/to R
1.3
Intro to R programming (Session 4)
1.3.1
Operators
1.3.2
If/else
1.3.3
Loop
2
Introduction to Hypothesis testing
2.1
Hypothesis testing and p value
2.1.1
Example 1: probability of rolling a six?
2.2
Permutation test
2.2.1
Example 2: difference in birth weight
2.2.2
Null distribution approximated by resampling
2.3
t test
2.3.1
Derivation of t distribution
2.3.2
Direct use of t.test()
2.4
Regression-based test
2.5
Multiple testing
2.5.1
Null distribution (of test statistic)
2.5.2
Null distribution of p value
2.5.3
Minimal p values in 10 tests
2.6
Explore power and sample size (optional)
3
Introduction to Linear Regression
3.1
Linear Regression Using Simulated Data
3.1.1
Simulating data
3.1.2
Model efficacy
3.1.3
R-Squared
3.2
Least Squares Using Simulated Data
3.3
Diagnostic check of a fitted regression model
3.3.1
Residual Standard Errors (RSE)
3.3.2
p-values
3.3.3
F-statistics
3.4
Simple Linear Regression with lm function
3.5
Multiple Regression with lm function
4
Introduction to Classification
4.1
Visualise logistic and logit functions
4.1.1
Logistic function
4.1.2
Logit function
4.1.3
Visualise the distribution
4.2
Logistic regression on Diabetes
4.2.1
Load Pima Indians Diabetes Database
4.2.2
Fit logistic regression
4.2.3
Assess on test data
4.2.4
Model selection and diagnosis
4.3
Cross-validation
4.4
More assessment metrics
4.4.1
Two types of error
4.4.2
ROC curve
4.4.3
Homework
II Biomedical Data Modules
5
Medical Image and Digital Health
6
Cancer genomics
6.1
Case study 1: analysis of cBioportal mutation data
6.1.1
Exploratory analysis
6.1.2
Statistical analysis
6.1.3
Literature search
7
Epidemiology
8
Population Genetics
8.1
Case study 1: Heritability and human traits
8.1.1
Part 1
8.1.2
Part 2
8.1.3
References
III Modules in previous years
9
Cancer Epidemiology
9.0.1
Scenario
9.0.2
Hong Kong population
9.0.3
Cancer registry data
9.0.4
Existing cancer funding and publication data
9.0.5
Open discussion
10
Genetic sequence
10.1
Case study 1: Genetic sequence analysis
10.1.1
Sequence motif and k-mer
10.1.2
Functional mapping
10.1.3
References
11
Genomics and prediction
11.1
Case study 1: splicing fidelity prediction
11.1.1
Questions for Discussion
11.1.2
Hands-on with regression model
11.1.3
Open question
IV Appendix
Appendix A: Install R & RStudio
A.1 Install R (>=4.3.1)
R on Windows
R on macOS
R on Linux (Ubuntu)
A.2 Install RStudio
A.3 Use R inside RStudio
RStudio
Set working directory
Some general knowledge
Install packages
A.4 Cloud computing
References
Biomedical Data Science - introduction with case studies
References