Level – 3M

Course units effective from academic year 2016/2017 to date


STA301M3: Advanced Design of Experiments

Course Code: STA301M3
Course Title: Advanced Design of Experiments
Credit Value: 03
Prerequisite: STA203G3
Hourly Breakdown: Theory 45 Hours, Practical -, Independent Learning 105 Hours
Objectives:
  • Provide an introduction to block and factorial experimental designs
  • Introduce and explain the design aspects of experiments
Intended Learning Outcomes:
  • Explain the mathematical models and issues such as interaction and confounding
  • Construct factorial experiments confounded with blocks
  • Design fractional factorial experiments
  • Analyze experimental data from incomplete block designs
Course Contents:
  • Factorial Designs: 2^k Factorial Designs, 3^k Factorial Designs, Yates’ Algorithm, Blocking, Confounding, Partial Confounding, Fractional Factorial Designs, Design Resolution, Blocking Fractional Factorials, Alias Structure.
  • Block Designs: Randomized Complete Block Design, Latin Square Design, Graeco-Latin Square Design, Youden Square Design, Balanced Incomplete Block Design, Partially Balanced Incomplete Block Design.
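As an illustration of Yates’ Algorithm listed above, the following is a minimal Python sketch for an unreplicated 2^k design. It assumes the responses are supplied in standard (Yates) order, for example (1), a, b, ab for k = 2. The function names and the sample responses are hypothetical and are not part of the syllabus.

```python
def yates(responses):
    """Return Yates contrasts for a 2^k design.

    `responses` must be in standard order, e.g. (1), a, b, ab for k = 2.
    The first value returned is the grand total; the remaining values are
    the contrast totals in the same standard order (A, B, AB, ...).
    """
    n = len(responses)
    k = n.bit_length() - 1
    if n < 2 or 2 ** k != n:
        raise ValueError("number of responses must be a power of two (2^k)")
    column = list(responses)
    for _ in range(k):
        # Each pass: sums of adjacent pairs, followed by their differences.
        sums = [column[i] + column[i + 1] for i in range(0, n, 2)]
        diffs = [column[i + 1] - column[i] for i in range(0, n, 2)]
        column = sums + diffs
    return column


def effects(responses):
    """Effect estimates (contrast / 2^(k-1)) for an unreplicated design."""
    contrasts = yates(responses)
    half = len(responses) // 2
    return [c / half for c in contrasts[1:]]


# Hypothetical 2^2 example with responses for (1), a, b, ab.
print(yates([20, 30, 40, 52]))    # grand total, then A, B, AB contrasts
print(effects([20, 30, 40, 52]))  # estimated A, B, AB effects
```

For the sample data the A contrast is a + ab - (1) - b = 22, which is what the second entry of the returned list should show.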
Teaching Methods:
  • Lectures and tutorial discussions
Assessment/ Evaluation Details:
  • In-course Assessments: 30%
  • End-of-course Examination: 70%
Recommended Readings:
  • Douglas C. Montgomery, Design and Analysis of Experiments, Wiley Series, 2013.
  • H. R. Lindman, Analysis of Variance in Experimental Design, Springer Series, 1992.
  • R. A. Fisher, The Design of Experiments, Oliver and Boyd, 1960.
  • G. W. Cobb, Design and Analysis of Experiments, Springer Series, 1998.

STA302M3: Medical Statistics

Course Code: STA302M3
Course Title: Medical Statistics
Academic Credits: 03
Hourly Breakdown: Theory 45 Hours, Practical -, Independent Learning 105 Hours
Objective:
Introduce the statistical methods used in medical science
Intended Learning Outcomes:
  • Discuss common terms used in epidemiology
  • Discuss direct and indirect methods of adjustment of overall rates
  • Compare disease occurrence between two groups
  • Evaluate the common odds ratio with its confidence interval
  • Discuss observational and experimental studies in the medical field
  • Compare the analytical studies used in the medical field
  • Test the possible effects in a crossover trial
  • Express the relationship among the survival function, distribution function, hazard function and cumulative hazard function
  • Estimate the survival function using parametric and non-parametric methods
  • Illustrate two-sample comparison for survival data using common statistical procedures
  • Apply the Cox proportional hazards model to real-life data
Course Contents:
  • Epidemiology: definition of epidemiology; measuring disease frequency: population at risk, incidence, prevalence, case fatality, birth rate, death rate, life expectancy, direct and indirect standardized rates; comparing disease occurrence: absolute and relative comparison; common odds ratio: Cochran-Mantel-Haenszel and logit methods, confidence interval for the common odds ratio, Cochran-Mantel-Haenszel test
  • Types of studies: observational studies: descriptive and analytical studies, ecological, cross-sectional, case-control and cohort studies; experimental studies: clinical trial, parallel group design, in-series design and crossover design
  • Survival Analysis: censoring, survival function, hazard function, cumulative hazard function, mean and median survival time, mean residual lifetime; estimation of the survival function: parametric and non-parametric methods: Kaplan-Meier estimator, life table, cumulative hazard estimator; two-sample comparison: log-rank test, maximum likelihood test and likelihood ratio test; Cox proportional hazards model
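To make the Kaplan-Meier estimator named above concrete, here is a minimal Python sketch of the product-limit calculation for right-censored data. The function name, argument names and the sample data are hypothetical. The sketch follows the usual convention that subjects censored at an event time are still counted as at risk at that time.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where an event occurred.

    times  : observed follow-up times
    events : 1 if the event was observed at that time, 0 if right-censored
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Product-limit step: multiply by (1 - d_t / n_t).
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Hypothetical data: five subjects, two of them censored (event = 0).
for t, s in kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0]):
    print(t, round(s, 3))
```

With these sample data the estimated survival probabilities at times 1, 2 and 3 are approximately 0.8, 0.6 and 0.3.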
Teaching Methods:
Lectures and tutorial discussions
Assessment/ Evaluation Details:
  • In-course Assessments: 30%
  • End-of-course Examination: 70%
Recommended Readings:
  • Hosmer, D. W., Lemeshow, S. and May, S., Applied Survival Analysis: Regression Modeling of Time-to-Event Data, John Wiley and Sons, New Jersey, 2008.
  • Bonita, R., Beaglehole, R. and Kjellstrom, T., Basic Epidemiology, 2nd Edition, World Health Organization, 2006.
  • Kleinbaum, D. G., Survival Analysis: A Self-Learning Text, Springer, New York, 1996.

STA303M3: Categorical Data Analysis

Course Code: STA303M3
Course Title: Categorical Data Analysis
Academic Credits: 03
Hourly Breakdown: Theory 45 Hours, Practical -, Independent Learning 105 Hours

Objective:

Provide knowledge for analyzing categorical data.

Intended Learning Outcomes:
  • Discuss major types of categorical data and their probability distributions
  • Apply appropriate descriptive and inferential statistical methods for contingency tables
  • Build appropriate statistical models for different types of categorical response data
  • Analyze repeated/longitudinal categorical response data
Course Contents:
  • Introduction: Categorical response data, Probability distributions for categorical data: Bernoulli, binomial, multinomial, and Poisson, Likelihood function and maximum likelihood estimate, Likelihood-based inference methods: Wald test, score test.
  • Contingency tables:
    Two-way contingency tables: Table structure, comparing proportions, relative risk, odds ratio, Pearson’s Chi-square test, Likelihood ratio test, testing independence for ordinal data, Fisher’s exact test for small samples.
    Three-way contingency tables: Conditional versus marginal tables, Simpson’s paradox, Conditional versus marginal odds ratios, Conditional versus marginal independence, Cochran-Mantel-Haenszel (CMH) test, Homogeneous association for tables.
  • Generalized Linear Model (GLM): Components of generalized linear models, GLMs for binary and count data: Logistic regression and Log-linear model, Statistical inference for GLM, Comparing models, Model selection, Model diagnostics, Logit models, Probit models, Analysis of repeated responses.
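The two-way table measures listed above (relative risk, odds ratio, Pearson’s Chi-square test) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name, the cell layout [[a, b], [c, d]] (rows are groups, columns are outcome yes/no), the Wald interval on the log odds ratio, and the sample counts are assumptions made for the example.

```python
import math


def two_by_two_summary(a, b, c, d, z=1.96):
    """Summarise a 2x2 table [[a, b], [c, d]] with all cell counts positive."""
    if min(a, b, c, d) <= 0:
        raise ValueError("all cell counts must be positive for this sketch")
    n = a + b + c + d
    relative_risk = (a / (a + b)) / (c / (c + d))
    odds_ratio = (a * d) / (b * c)
    # Wald interval built on the log odds ratio (z = 1.96 gives about 95%).
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    ci = (math.exp(log_or - z * se_log_or), math.exp(log_or + z * se_log_or))
    # Pearson chi-square statistic for independence (1 degree of freedom).
    chi_square = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return {
        "relative_risk": relative_risk,
        "odds_ratio": odds_ratio,
        "odds_ratio_ci": ci,
        "chi_square": chi_square,
    }


# Hypothetical counts: 10 of 30 exposed and 5 of 30 unexposed have the outcome.
print(two_by_two_summary(10, 20, 5, 25))
```

For the sample counts the relative risk is 2.0 and the odds ratio is 2.5.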
Teaching Methods:
Lectures and tutorial discussions
Assessment/ Evaluation Details:
  • In-course Assessments: 30%
  • End-of-course Examination: 70%
Recommended Readings:
  • Agresti, A., Categorical Data Analysis, 3rd Edition, New York: Wiley, 2012.
  • McCullagh, P. and Nelder, J. A., Generalized Linear Models, 2nd Edition, London: Chapman and Hall, 1989.
  • Powers, D. A. and Xie, Y., Statistical Methods for Categorical Data Analysis, San Diego, CA: Academic Press, 2000.

STA304M3: Computational Statistics

Course Code: STA304M3
Course Title: Computational Statistics
Credit Value: 03
Hourly Breakdown: Theory 15 Hours, Practical 60 Hours, Independent Learning 75 Hours
Objectives:
  • Provide an introduction to computational statistics
  • Introduce software for statistical computing
Intended Learning Outcomes:
  • Formulate simple functions for data management
  • Develop algorithms for simulation of random numbers
  • Apply Monte Carlo simulation techniques to real-world problems
  • Apply Bootstrap methods to real-world problems
  • Develop the ability to use statistical software in real-world situations
Course Contents:
  • Introduction: Use of statistical software to write simple functions for data management and analysis
  • Simulation of random numbers: Box-Muller Algorithm, Inverse Transformation Method, Acceptance-Rejection Method, Polar Algorithm, Composition
  • Monte Carlo methods: Monte Carlo integration, Markov chain Monte Carlo methods, Metropolis-Hastings algorithms, the Gibbs sampler
  • Bootstrap methods: Bootstrap re-sampling techniques, Bootstrap confidence intervals, Bootstrap estimate of bias
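Two of the topics above, the Box-Muller Algorithm and the Bootstrap estimate of bias, can be illustrated with a short Python sketch. It uses only the standard library; the function names, the seed values and the sample data are hypothetical, and the course itself may use different software (the recommended readings refer to MATLAB and S).

```python
import math
import random


def box_muller(rng):
    """Return two independent N(0, 1) draws built from two uniform draws."""
    u1 = 1.0 - rng.random()  # shift [0, 1) to (0, 1] so log(u1) is defined
    u2 = rng.random()
    radius = math.sqrt(-2.0 * math.log(u1))
    angle = 2.0 * math.pi * u2
    return radius * math.cos(angle), radius * math.sin(angle)


def bootstrap_bias(data, statistic, n_resamples, rng):
    """Bootstrap bias estimate: mean resampled statistic minus observed one."""
    observed = statistic(data)
    n = len(data)
    total = 0.0
    for _ in range(n_resamples):
        # Resample n observations from the data with replacement.
        resample = [data[rng.randrange(n)] for _ in range(n)]
        total += statistic(resample)
    return total / n_resamples - observed


def sample_mean(xs):
    return sum(xs) / len(xs)


rng = random.Random(2016)  # fixed seed so the run is reproducible
draws = [z for _ in range(5000) for z in box_muller(rng)]
print("mean of simulated normals:", sample_mean(draws))

data = [float(x) for x in range(1, 21)]  # hypothetical sample
print("bootstrap bias of the mean:", bootstrap_bias(data, sample_mean, 2000, rng))
```

The simulated mean should be close to 0 and, because the sample mean is unbiased, the bootstrap bias estimate should also be close to 0 up to Monte Carlo error.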
Teaching Methods:
  • Lectures, laboratory practicals, group assignments and e-resources
Assessment/ Evaluation Details:
  • In-course Assessments: 30%
  • End-of-course Practical Examination: 70%
Recommended Readings:
  • Wendy L. Martinez and Angel R. Martinez, Computational Statistics Handbook with MATLAB, Chapman and Hall/CRC, 2015.
  • Venables, W. N. and Ripley, B. D., Modern Applied Statistics with S, Springer Series, 1999.
  • MoonJung Cho and Wendy L. Martinez, Statistics in MATLAB: A Primer, Chapman and Hall/CRC, 2014.

STA305M3: Time Series Analysis


STA306M3: Multivariate Analysis I

Course Code: STA306M3
Course Title: Multivariate Analysis I
Academic Credits: 03
Hourly Breakdown: Theory 45 Hours, Practical -, Independent Learning 105 Hours
Objective:
Introduce multivariate techniques and their applications to real world problems
Intended Learning Outcomes:
  • Distinguish univariate and multivariate data
  • Find the mean vector, covariance matrix and correlation matrix for multivariate data
  • Determine the distribution of a linear combination of random variables
  • Discuss the use of the Wishart distribution in multivariate data
  • Discuss the properties of the multivariate normal and Wishart distributions
  • Apply Hotelling’s T^2 statistic for testing a plausible value for the mean vector
  • Compare several covariance matrices
  • Apply statistical tests to multivariate normal distributions
  • Construct confidence intervals for the mean vector and treatment effects
Syllabus Outline
Course Contents:
  • Introduction: Multivariate data, Multivariate marginal and conditional distributions, mean vector, variance-covariance and correlation matrices, properties of covariance and correlation matrices, linear combination of random variables
  • Multivariate distributions: Multivariate Normal distribution: probability density of the multivariate Normal distribution and its properties, transforming multivariate observations, multivariate likelihood estimation of the mean vector and covariance matrix; Wishart distribution: probability density of the Wishart distribution and its properties; Sampling distribution of the sample mean vector and sample covariance matrix
  • Inference about the mean vector: Hotelling’s T^2 distribution, Hotelling’s T^2 test for a plausible value for the mean vector, confidence region, Comparisons of component means: Simultaneous and Bonferroni confidence intervals, large sample inference about the population mean vector, profile analysis
  • Comparison of several multivariate means: Comparing mean vectors from two populations, simultaneous and Bonferroni confidence intervals, Large sample inference for comparing mean vectors, profile analysis, Box’s M test for comparing several covariance matrices, Paired comparisons, One-way MANOVA, Two-way MANOVA, Simultaneous and Bonferroni confidence intervals for treatment effects
Teaching Methods:
Lectures, demonstrations and tutorial discussions
Assessment/ Evaluation Details:
  • In-course Assessments: 30%
  • End-of-course Examination: 70%
Recommended Readings:
  • Chatfield, C. and Collins, A. J., Introduction to Multivariate Analysis, New York: Chapman and Hall, 1980.
  • Johnson, R. A. and Wichern, D. W., Applied Multivariate Statistical Analysis, 6th Edition, Englewood Cliffs, N.J.: Prentice Hall, 2006.
  • Everitt, B. S. and Hothorn, T., An Introduction to Applied Multivariate Analysis with R, Springer, 2011.



Level – 4M

Course units effective from academic year 2016/2017 to date


STA401M4: Measure Theory

Course Code: STA401M4

Course Title: Measure Theory

Credit Value: 04

Prerequisites: PMM202G2 and PMM203G3

Hourly Breakdown: Theory 60 Hours, Practical -, Independent Learning 140 Hours

Objectives:

  • Introduce the fundamental concepts of Lebesgue measure spaces and abstract measure spaces
  • Develop clear ideas on the concept of Lebesgue measurable functions, integrals, and their convergence properties
  • Discuss the fundamental connection between differentiation and integration

Intended Learning Outcomes:

  • Construct Lebesgue measures on the real line
  • Define an abstract measure space
  • Illustrate the properties of abstract measure spaces
  • Discuss the properties of measurable functions and the convergence of sequences of measurable functions
  • Explain the simple function approximation of measurable functions
  • Formulate integrals in a measure space
  • Discuss the convergence of integrals
  • Extend measures from algebras/semialgebras to σ-algebras
  • Formulate product measures
  • Prove Fubini’s theorem and Tonelli’s theorem
  • Discuss the fundamental connection between differentiation and integration

Course Contents:

Measure Spaces: Preliminaries: Algebras and σ-algebras of sets, Borel sets; Lebesgue measure: Outer measure, Measurable sets and Lebesgue measure, Properties, Example of a non-measurable set, Borel measures; General measure: Definition of measure, Measure space, Complete measure space, Examples, Properties.

Measurable Functions: Basic properties of measurable functions, Examples, Borel measurable functions, Approximation Theorem; Littlewood’s three principles: Egoroff’s theorem.

Integration: Integral of nonnegative functions, Integrability of a nonnegative function, Fatou’s Lemma, Monotone Convergence Theorem, Lebesgue Convergence Theorem, Generalized Convergence Theorem.

Extension of Measure: Measure on an algebra, Extension of measures from algebras to σ-algebras, Carathéodory’s theorem, and Lebesgue-Stieltjes integral.

Product Measure: Measurable rectangle, Semialgebra, Construction of product measures, Fubini’s theorem, and Tonelli’s theorem.

Differentiation and Integration: Differentiation of monotone functions: Vitali’s lemma, Functions of bounded variation; Differentiation of an integral: Indefinite integral and Absolutely continuous functions.

Teaching Methods:

  • Lectures, Tutorials, Handouts, Problem solving, Use of e-resources

Assessment/ Evaluation Details:

  • In-course Assessments: 30%
  • End-of-course Examination: 70%

Recommended Readings:

  • Halsey Royden and Patrick Fitzpatrick, Real Analysis, 4th Edition, 2010.
  • Walter Rudin, Real and Complex Analysis, 3rd Edition, 1986.
  • G. de Barra, Measure Theory and Integration, 2nd Edition, 2003.
  • Gerald B. Folland, Real Analysis: Modern Techniques and Their Applications, 2nd Edition, 2007.

STA402M2: Advanced Statistical Computing

Course Code: STA402M2

Course Title: Advanced Statistical Computing

Academic Credits: 02

Hourly Breakdown: Theory -, Practical 60 Hours, Independent Learning 40 Hours

Objective:

Introduce the statistical concepts and principles needed to perform numerical computation using statistical software

Intended Learning Outcomes:

  • Utilize built-in functions to analyze categorical data sets
  • Analyze survival data by applying built-in functions
  • Develop time series models using statistical software
  • List summary statistics for given multivariate data sets
  • Examine the effects of factors by applying built-in functions
  • Explore standard statistical methods using statistical software
  • Write computer programs to accomplish a task

Syllabus Outline

Contents:

  • Analysis of large data sets: Built-in functions for categorical data, survival data, time series data and multivariate data.
  • Analysis of experimental data sets: Analysis of variance (ANOVA), multivariate hypothesis tests, Multivariate analysis of variance (MANOVA).
  • Simple functions: writing simple functions to perform specific tasks.

Teaching Methods:

Laboratory practical

Assessment/ Evaluation Details:

  • In-course Assessments (practical): 30%
  • End-of-course Examination (practical): 70%

Recommended Readings:

  • Pierre Lafaye de Micheaux, Rémy Drouilhet and Benoit Liquet, The R Software: Fundamentals of Programming and Statistical Analysis, Springer, 2013.
  • Michael J. Crawley, The R Book, Second Edition, John Wiley and Sons, Ltd, 2013.
  • Dirk F. Moore, Applied Survival Analysis Using R, Springer, 2016.
  • Daniel Zelterman, Applied Multivariate Statistics with R, Springer, 2015.
  • Robert H. Shumway and David S. Stoffer, Time Series Analysis and Its Applications: With R Examples, Springer, 2011.

STA403M3: Markov Processes for Stochastic Modelling

Course Code: STA403M3

Course Title: Markov Processes for Stochastic Modelling

Academic Credits: 03

Prerequisite: STA302G3

Hourly Breakdown: Theory 45 Hours, Practical -, Independent Learning 105 Hours

Objectives:

  • Impart a sound understanding of Markov processes and their properties
  • Introduce the basic concepts and modelling methods of birth and death processes
  • Provide rigorous knowledge of queueing theory and its applications

Intended Learning Outcomes:

  • Recall basic characteristics of Markov processes
  • Discuss important properties of Markov chains
  • Evaluate the first passage and absorption probabilities
  • Find the stationary distribution of a Markov chain
  • Illustrate the canonical form of a Markov chain
  • Model the relevant birth and death processes for randomly varying dynamic systems
  • Apply the Chapman-Kolmogorov equation to formulate the forward differential equations
  • Construct probability distributions of random processes
  • Explain the probability generating function for stochastic models
  • Find the average waiting time and queue length of queueing systems
  • Determine the steady state distribution of a queueing system

Syllabus Outline

Contents:

  • Markov processes in discrete parameter space:

Basic properties of Markov chains, Transition probability matrix, Classification of states (recurrent and transient classes), Periodicity of a class, Irreducible Markov chains, Ergodic Markov chains, First passage and recurrence times, Probabilities of absorption of transient states in one of the recurrent classes, Expected value and standard deviation of the number of transitions until absorption, Stationary distributions, Canonical form, The fundamental matrix, Random walk with absorbing and reflecting barriers.

  • Markov processes in continuous parameter space:

Markov pure jump process, Chapman-Kolmogorov equation, Birth and death process, pure birth process, pure death process, Forward and backward Kolmogorov differential equations, transition rate matrix, Analysis of random processes using the probability generating function, expected value and variance, probability of extinction.

  • Queueing processes:

Arrival and service processes, single and multiple server queueing systems, Steady state distribution, Traffic intensity, mean waiting time, Network of queues, Martingales, Stochastic differential equations.
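The queueing quantities named above (traffic intensity, steady state distribution, mean waiting time) have closed forms in the simplest case. The sketch below assumes a single-server queue with Poisson arrivals and exponential service times (the M/M/1 model), which the syllabus does not name explicitly; the function names and the rates in the example are hypothetical.

```python
def mm1_metrics(arrival_rate, service_rate):
    """Steady-state summary measures for an M/M/1 queue (assumed model)."""
    if arrival_rate <= 0 or service_rate <= 0:
        raise ValueError("rates must be positive")
    rho = arrival_rate / service_rate  # traffic intensity
    if rho >= 1:
        raise ValueError("no steady state: traffic intensity must be below 1")
    return {
        "traffic_intensity": rho,
        "mean_number_in_system": rho / (1 - rho),
        "mean_number_in_queue": rho ** 2 / (1 - rho),
        "mean_time_in_system": 1 / (service_rate - arrival_rate),
        "mean_wait_in_queue": rho / (service_rate - arrival_rate),
    }


def mm1_state_probability(n, arrival_rate, service_rate):
    """Steady-state probability of n customers in the system: (1 - rho) * rho^n."""
    rho = arrival_rate / service_rate
    if not 0 < rho < 1:
        raise ValueError("traffic intensity must lie strictly between 0 and 1")
    return (1 - rho) * rho ** n


# Hypothetical rates: 2 arrivals and 4 service completions per unit time.
print(mm1_metrics(2.0, 4.0))
print(mm1_state_probability(0, 2.0, 4.0))
```

With these rates the traffic intensity is 0.5, the mean number in the system is 1, and the server is idle half of the time.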

Teaching Methods:

  • Lectures, Tutorials, Handouts, Problem solving, Use of e-resources

Assessment/ Evaluation Details:

  • In-course Assessments: 30%
  • End-of-course Examination: 70%

Recommended Readings:

  • Parzen, E., Stochastic Processes, SIAM Edition, Society for Industrial and Applied Mathematics, Philadelphia, 1999.
  • Sheldon M. Ross, Introduction to Probability Models, 10th Edition, Academic Press, Elsevier, 2013.
  • Jones, P. W. and Smith, P., Stochastic Processes: An Introduction, 1st Edition, Arnold (a member of the Hodder Headline Group), London; co-published in the USA by Oxford University Press Inc., New York, 2001.

STA404M3: Generalized Linear Models for Familial Longitudinal Data

Course Code: STA404M3

Course Title: Generalized Linear Models for Familial Longitudinal Data

Academic Credits: 03

Prerequisite: -

Hourly Breakdown: Theory 45 Hours, Practical -, Independent Learning 105 Hours

Objective:

Provide knowledge in fitting models to familial longitudinal data and apply these models to real life problems.

Intended Learning Outcomes:

  • Distinguish familial and longitudinal models
  • Formulate marginal and conditional models for the analysis of familial longitudinal data
  • Compare different types of parameter estimation techniques
  • Apply standard correlation structures for familial longitudinal data
  • Build appropriate statistical models for count/binary familial longitudinal data

Syllabus Outline

Contents:

  • Overview of linear fixed models

Estimation of parameters: Method of moments, Ordinary Least Squares (OLS) method, Generalized Least Squares (GLS) method, OLS vs GLS estimation performance; Estimation under a stationary general autocorrelation structure: A class of autocorrelations

  • Familial models for count data

Poisson mixed models and basic properties; Estimation for single random effect based parametric mixed models: Exact likelihood estimation, Method of moments, Generalized Estimating Equation (GEE) approach, Generalized Quasi-likelihood (GQL) approach

  • Familial models for binary data

Binary mixed models and basic properties: Computational formulas for binary moments; Estimation for single random effect based parametric mixed models: Method of moments, Generalized Quasi-likelihood approach, Maximum likelihood estimation (MLE)

  • Longitudinal models for count data

Marginal model; Marginal model based estimation of regression effects; Correlation models for stationary count data: Poisson AR(1) model, Poisson MA(1) model, Poisson Equicorrelation (EQC) model; Inferences for stationary correlation models; Nonstationary correlation models

  • Longitudinal models for binary data

Marginal model; Marginal model based estimation of regression effects; Some selected correlation models for longitudinal binary data; Low-order autocorrelation models for stationary binary data: Binary AR(1) model, Binary MA(1) model, Binary EQC model; Inferences in nonstationary correlation models for repeated binary data

Teaching Methods:

Lecture demonstrations and tutorial discussions

Assessment/ Evaluation Details:

  • In-course Assessments: 30%
  • End-of-course Examination: 70%

Recommended Readings:

  • Diggle, P., Heagerty, P., Liang, K. Y. and Zeger, S. L., Analysis of Longitudinal Data, 2nd Edition, Oxford University Press, Oxford, 2002.
  • Brajendra C. Sutradhar, Dynamic Mixed Models for Familial Longitudinal Data, Springer, 2011.
  • McCullagh, P. and Nelder, J. A., Generalized Linear Models, Chapman and Hall, 1989.

STA405M3: Advanced Statistical Theory

Course Code: STA405M3

Course Title: Advanced Statistical Theory

Academic Credits: 03

Prerequisite: -

Hourly Breakdown: Theory 45 Hours, Practical -, Independent Learning 105 Hours

Objective:

Introduce the concepts of advanced statistical theory

Intended Learning Outcomes:

  • Recall sufficient statistics, minimal sufficient statistics and complete sufficient statistics
  • Identify the exponential families of distributions
  • Prove Basu’s theorem and use it to show independence of statistics
  • Obtain point estimators using various estimation techniques
  • Prove the Cramer-Rao inequality and the Rao-Blackwell and Lehmann-Scheffé theorems
  • Find the minimum variance of an unbiased estimator using the Cramer-Rao inequality
  • Obtain the minimum variance unbiased estimator for various probability distributions
  • Analyze point estimators in terms of consistency, asymptotic normality and efficiency properties
  • Determine interval estimators
  • Evaluate the efficiency of interval estimators
  • Apply statistical methods for hypothesis testing
  • Prove the Neyman-Pearson lemma

Course Contents:

  • Data Reduction: Scale and location families, sufficiency, factorization theorem, minimal sufficiency, ancillary statistics, complete statistics, Basu’s theorem, exponential families: one-parameter case, multi-parameter case
  • Point Estimation: Method of moments, maximum likelihood, properties of the maximum likelihood estimator, Bayesian point estimation: prior and posterior distributions; bias, variance, mean square error, minimum variance unbiased estimator, Fisher information, Cramer-Rao lower bound, the Rao-Blackwell theorem, the Lehmann-Scheffé theorem; large sample theory: consistency, asymptotic normality and related properties, asymptotic efficiency and optimality
  • Interval Estimation: Methods of finding interval estimators: inverting a test statistic, pivotal quantities, pivoting the cumulative distribution function, Bayesian intervals; Methods of evaluating interval estimators: size and coverage probability, test-related optimality, loss function optimality
  • Hypothesis Testing: Simple hypothesis, composite hypothesis, Neyman-Pearson lemma, uniformly most powerful test, likelihood ratio test, the sequential probability ratio test, Bayesian testing procedures

Teaching Methods:

Lecture demonstration and tutorial discussions

Assessment/ Evaluation Details:

  • In-course Assessments: 30%
  • End-of-course Examination: 70%

Recommended Readings:

  • Casella, G. and Berger, R. L., Statistical Inference, 2nd Edition, Pacific Grove, CA: Wadsworth, 2001.
  • Knight, K., Mathematical Statistics, 1st Edition, Chapman and Hall/CRC, 1999.
  • Hogg, R. V., McKean, J. W. and Craig, A. T., Introduction to Mathematical Statistics, 7th Edition, Pearson, 2012.
  • Bickel, P. J. and Doksum, K. A., Mathematical Statistics: Basic Ideas and Selected Topics, 6th Edition, San Francisco: Holden-Day, 1977.

STA406M3: Multivariate Analysis II

Course Code: STA406M3

Course Title: Multivariate Analysis II

Academic Credits: 03

Prerequisite: -

Hourly Breakdown: Theory 45 Hours, Practical -, Independent Learning 105 Hours

Objective:

Introduce further multivariate techniques and their application to real world problems

Intended Learning Outcomes:

  • Use principal component analysis effectively for data exploration and dimension reduction
  • Apply factor analysis effectively for exploratory and confirmatory data analysis
  • Apply multivariate regression to real-world data
  • Classify observations into groups using discriminant functions
  • Apply discriminant functions to separate groups
  • Find groupings and associations using cluster analysis

Syllabus Outline

Course Contents:

  • Principal Component Analysis: Derivation of principal components: Covariance matrix and Correlation matrix, loading matrix, Scree plot, principal component scores.
  • Factor Analysis: Orthogonal factor model, Methods of Estimation: Principal Component Method and Maximum Likelihood Method, Factor Rotation: Graphical method, Varimax and Oblique rotation, Factor Scores.
  • Multivariate Regression: Multivariate linear regression model, Assumptions of multivariate linear regression, Least squares method of parameter estimation, Statistical inference on regression coefficients.
  • Canonical Correlation Analysis: Canonical variates and canonical correlation, test for significant canonical correlation.
  • Discrimination and Classification: Separation and classification for two populations, Fisher’s discriminant function, Classification with several populations.
  • Cluster Analysis: Similarity measures: Pairs of items and Pairs of variables, Clustering methods: Single Linkage, Complete Linkage, Average Linkage and K-means method.

Teaching Methods:

Lecture demonstrations and tutorial discussions

Assessment/ Evaluation Details:

  • In-course Assessments: 30%
  • End-of-course Examination: 70%

Recommended Readings:

  • Chatfield, C. and Collins, A. J., Introduction to Multivariate Analysis, New York: Chapman and Hall, 1980.
  • Johnson, R. A. and Wichern, D. W., Applied Multivariate Statistical Analysis, Englewood Cliffs, N.J.: Prentice Hall, 1992.
  • Everitt, B. S. and Hothorn, T., An Introduction to Applied Multivariate Analysis with R, Springer, 2011.

STA407M4: Advanced Probability Theory

 

Course Code: STA407M4

Course Title: Advanced Probability Theory

Academic Credits: 04

Prerequisite: PMM202G2 and PMM203G3

Hourly Breakdown: Theory 60 Hours, Practical -, Independent Learning 140 Hours

Objectives:

  • Introduce basic concepts of probability theory using a measure-theoretic approach
  • Develop clear ideas on the concept of integration in a probability measure space and the expectation of random variables
  • Impart profound knowledge of, and application methods for, distribution functions, modes of convergence and characteristic functions
  • Provide a sound theoretical basis for further studies in mathematical statistics

Intended Learning Outcomes:

  • Recall the concepts of probability theory
  • Construct probability measures and measurable spaces
  • Illustrate the properties of random variables and expectation
  • Formulate integrals with respect to probability measures
  • Express probability and moment inequalities
  • Discuss the modes of convergence
  • Apply Fatou’s lemma, monotone and dominated convergence theorems
  • Explain Borel-Cantelli lemmas and Kolmogorov zero-one law
  • Discuss the properties of distribution functions and characteristic functions
  • Explain weak and complete convergence of sequence of distribution functions
  • Apply decomposition theorem, Helly-Bray lemma and theorem, uniqueness theorem, inversion theorem, Levy continuity theorem and central limit theorem

Syllabus Outline

Contents:

Mathematical Foundation of Probability Theory:

Sets and operations, Collections of sets, Algebras and sigma-algebras of sets, limits of sets, monotone sequences of sets, Probability spaces and properties, Construction of probability measures and continuity theorem, Conditional probability and independent events, Borel sets.

Random Variables:

Basic properties of random variables and vectors, random elements, induced probability measures and spaces, measurability and limits, Functions of random variables, simple random variables, induced sigma-algebras.

Expectation and Convergence:

Definitions and properties of expectation, Convergence concepts: uniform and point-wise, Modes of convergence: almost surely, in probability, in r-th mean, in distribution. Convergence of functions of random variables, Markov’s and Chebyshev’s inequalities. Moment inequalities: Hölder’s, Minkowski’s and Jensen’s. Fatou’s Lemma, Monotone and Dominated Convergence Theorems, Product measures. Independence of functions of random variables and sigma-algebras. Borel-Cantelli Lemmas, Kolmogorov zero-one Law, Strong Law of Large Numbers.

Distribution Functions: Properties of distribution functions, Decomposition theorem, weak and complete convergence, Helly-Bray lemma, extended lemma and theorem, Convolution, Conditional distributions and expectations.

Characteristic Functions: Definition and basic properties, Uniqueness theorem, Inversion theorem, Levy Continuity theorem, Examples of characteristic functions, Law of Large Numbers, Stirling’s formula, Central Limit Theorem, Martingales.

Teaching Methods:

  • Lectures, Tutorial discussions, Handouts, Use of e-resources

Assessment/ Evaluation Details:

  • In-course Assessments: 30%
  • End-of-course Examination: 70%

Recommended Readings:

  • Alan F. Karr, Probability, 1st Edition, Springer-Verlag New York, Inc., 1993.
  • B. Ramdas Bhat, Modern Probability Theory: An Introductory Text Book, 2nd Edition, Wiley Eastern Limited, 1985.
  • Kai Lai Chung, A Course in Probability Theory, 3rd Edition, Elsevier (USA), 2000.
  • Clarke, L. E., Random Variables, 1st Edition, Longman Mathematical Texts, Longman Inc., New York, 1975.
  • Allan Gut, Probability: A Graduate Course, Springer Texts in Statistics, Springer-Verlag New York, Inc., 2005.

STA408M3: Theory of Linear Models

Course Code: STA408M3

Course Title: Theory of Linear Models

Academic Credits: 03

Prerequisite: -

Hourly Breakdown: Theory 45 Hours, Practical -, Independent Learning 105 Hours

Objective:

 

Provide in-depth knowledge of the theory of linear models and its applications

Intended Learning Outcomes:

  • Prove basic results related to the statistical theory of linear models
  • Discuss different types of parameter estimation in linear models
  • Perform hypothesis tests related to different characteristics of a linear model
  • Assess the fit of a linear model to data and the validity of its assumptions
  • Develop theoretical knowledge of the concepts behind robust regression

Syllabus Outline

Contents:

  • Introduction

Multivariate Normal Distribution, Distribution of Quadratic forms,  Estimation by Least Squares, Orthonormal Bases, Q-R decompositions, Hat Matrices

  • Variances and Covariances

Gauss- Markov Theorem,  Estimation of variance, Generalized Least Squares, Collinearity in Least square estimation, Consequences and Identification, Biased Estimation, Ridge Regression, Sensitivity Analysis of Least Squares using Residuals

  • Statistical Inference for Normal Errors

Chi-square, t and F distributions, Distribution theory, Hypothesis testing, Robustness of F-tests, Non-central Chi-square and Power of tests, Power and Size of F-tests

  • Non-Full Rank Models

Analysis of Variance Models, Singular Value Decompositions, Estimable Functions and their Properties, Hypothesis Testing, Analysis of Variance Models with Covariates

  • Robust Regression

Influence Curves, Sensitivity Analysis based on the Influence Curve, M-Estimation, GM-Estimation, Influence Curves of Estimators (GLS and GM)
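
As an illustration of the least squares and hat matrix topics listed above, the following is a minimal sketch and not part of the official course material. It assumes a simple linear regression with an intercept, for which the diagonal of the hat matrix has a closed form. The function name and the data are hypothetical.

```python
def simple_ols_with_leverage(x, y):
    """Fit y = b0 + b1*x by least squares; return (b0, b1, leverages)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                 # slope estimate
    b0 = y_bar - b1 * x_bar        # intercept estimate
    # Diagonal of the hat matrix H = X (X'X)^{-1} X' for this model:
    # h_ii = 1/n + (x_i - x_bar)^2 / Sxx
    leverages = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    return b0, b1, leverages


# Made-up data: the last x value lies far from the others.
x = [1.0, 2.0, 3.0, 4.0, 10.0]
y = [1.1, 1.9, 3.2, 3.9, 10.2]
b0, b1, h = simple_ols_with_leverage(x, y)
```

The leverages sum to the number of fitted parameters (two here), and the isolated point at x = 10 receives the largest leverage, which is the kind of observation that sensitivity analysis and robust regression are concerned with.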

Teaching Methods:

Lecture demonstrations and tutorial discussions

Assessment/ Evaluation Details:

  • In-course assessments              30%
  • End-of-course Examination     70%

Recommended Readings:

  • James H. Stapleton, Linear Statistical Models, John Wiley and Sons, 2009.
  • George A. F. Seber and Alan J. Lee, Linear Regression Analysis, Second Edition, John Wiley and Sons, 2011.
  • Alvin C. Rencher and G. Bruce Schaalje, Linear Models in Statistics, Second Edition, John Wiley and Sons, 2007.
  • R. D. Cook and S. Weisberg, Residuals and Influence in Regression, Taylor and Francis, 1982.
  • John Fox, Regression Diagnostics, Sage Publications, 1991.

STA409M6: Research Project

Course Code: STA409M6
Course Title: Research Project
Academic Credits: 06
Hourly Breakdown (Theory / Practical / Independent Learning): 300 Hours

Objective:

Provide training in the scientific skills of problem analysis, research design, evaluation of empirical evidence, and dissemination.

Intended Learning Outcomes:

  • Identify a research problem
  • List appropriate literature to discuss the research findings
  • Plan a proper research methodology
  • Formulate a suitable hypothesis for the research problem
  • Apply suitable statistical techniques to make decisions
  • Develop skills of scientific writing and presenting results

Syllabus Outline

Course Description:

Students are expected to carry out an independent research project in the field of Statistics under the supervision of a senior staff member in the department. Students need to give presentations at the beginning, middle, and end of their research. During the research, students are expected to maintain a research diary. On completion of the research project, students are expected to write a comprehensive report.

Teaching Methods:

Guided independent study, Discussion with the supervisor, Use of e-resources

Assessment/ Evaluation Details:

  • Presentation               30%
  • Project Report        70%

Recommended Readings:

  • Kothari, C. R., Research Methodology: Methods and Techniques, Second Edition, New Age International (P) Limited, Publishers, 2004.
  • McMillan, K. and Weyers, J., How to Write Dissertations and Project Reports, Prentice Hall, 2011.
  • Denicolo, P. and Becker, L., Developing a Research Proposal, SAGE Publications, 2012.

STA410M2: Bayesian Statistics

Course Code: STA410M2
Course Title: Bayesian Statistics
Academic Credits: 02
Prerequisite: STA201G3 and STA204G2
Hourly Breakdown: Theory 30 Hours; Practical -; Independent Learning 70 Hours

Objectives:

  • Introduce the basic concepts of Bayesian theory.
  • Apply Bayesian statistics to real-world problems.

Intended Learning Outcomes:

  • Distinguish classical and Bayesian approaches
  • Recall various priors such as conjugate, non-informative, and Jeffreys’ priors
  • Determine the posterior and predictive distributions for standard prior distributions
  • Find the mean and variance of the posterior distributions
  • Evaluate the Bayes’ estimate of a parameter from the posterior distribution
  • Construct the credible interval and highest posterior density interval
  • Test simple hypotheses using the Bayes’ factor
  • Formulate linear hierarchical models
  • Use the Bayes’ risk to select the best decision

Syllabus Outline

Contents:

Fundamentals of Bayesian Analysis:

Definitions of classical and Bayesian approaches to inference about parameters. Bayes’ theorem for parametric inference, likelihood functions, exponential families and conjugate priors. Mixtures of conjugate priors, non-informative priors, Jeffreys’ prior. Prior and posterior analysis of standard distributions: binomial-beta, Poisson-gamma, exponential-gamma, uniform-Pareto, normal(mean)-normal, normal(precision)-gamma, normal(mean and precision)-normal-gamma. Predictive distributions. Exchangeability. Point and interval estimation: maximum a posteriori (MAP) estimators, credible intervals and highest posterior density intervals. Bayes’ factors, Bayesian hypothesis testing. Two-sample problems.
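
As an illustration of the binomial-beta conjugate analysis named above, the following is a minimal sketch and not part of the official course material. It assumes a Beta(a, b) prior on a binomial success probability. The function names and the data (7 successes in 10 trials with a uniform Beta(1, 1) prior) are hypothetical.

```python
def beta_binomial_posterior(a, b, successes, trials):
    """Posterior Beta parameters for a Beta(a, b) prior and binomial data."""
    return a + successes, b + (trials - successes)


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


def beta_variance(a, b):
    """Variance of a Beta(a, b) distribution."""
    return a * b / ((a + b) ** 2 * (a + b + 1))


def beta_mode(a, b):
    """Mode of Beta(a, b); this is the MAP estimate when a > 1 and b > 1."""
    return (a - 1) / (a + b - 2)


# Uniform prior Beta(1, 1), then 7 successes observed in 10 trials.
a_post, b_post = beta_binomial_posterior(1, 1, successes=7, trials=10)
posterior_mean = beta_mean(a_post, b_post)
posterior_var = beta_variance(a_post, b_post)
map_estimate = beta_mode(a_post, b_post)
```

With a uniform prior the MAP estimate coincides with the sample proportion, while the posterior mean is pulled slightly towards the prior mean of 1/2.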

Bayesian Linear Models:

Uniform priors, normal priors, hierarchical models: two- and three-stage models.

Statistical Decision Theory:

Loss functions, Bayes’ risk, Bayes’ rule, Minimax and Bayes’ procedures.

Teaching Methods:

  • Lectures, tutorial discussions, handouts, use of e-resources

Assessment/ Evaluation Details:

  • In-course Assessments:             30%     
  • End-of-course Examination:     70% 

Recommended Readings:

  • Peter M. Lee, Bayesian Statistics: An Introduction, 4th Edition, John Wiley and Sons Limited, U.K., 2012.
  • Peter D. Hoff, A First Course in Bayesian Statistical Methods, Springer, Heidelberg, London, New York, 2009.
  • Vladimir P. Savchuk and Chris P. Tsokos, Bayesian Theory and Methods with Applications, Atlantis Press, Paris, France, 2011.
  • James O. Berger, Statistical Decision Theory and Bayesian Analysis, Second Edition, Springer-Verlag, New York, 1988.

STA411M3: Data Mining

Course Code: STA411M3
Course Title: Data Mining
Academic Credits: 03
Prerequisite: -
Hourly Breakdown: Theory 45 Hours; Practical -; Independent Learning 105 Hours

Objectives:

Provide knowledge of the concepts behind various data mining techniques and techniques for learning from data, as well as data analysis and modelling

Intended Learning Outcomes:

  • Plan pre- and post-processing operations for data mining
  • Describe a range of supervised and unsupervised learning algorithms
  • Use machine learning algorithms on data to identify new patterns or concepts
  • Evaluate the performance of learning algorithms

Course Contents:

  • Introduction to data mining: Data mining and its applications; Data handling: instances, attributes and their types
  • Data mining process: Data preparation/cleansing, sparse data, missing data, inaccurate values, task identification, use of the Weka tool
  • Supervised learning: Introduction to classification and regression, rule-based learning, decision tree learning, Naive Bayes, k-nearest neighbour, support vector machines, neural networks, linear regression, introduction to boosting
  • Unsupervised learning: K-means clustering, Gaussian mixture models (GMMs), Hierarchical clustering, Latent Dirichlet Allocation (LDA)
  • Dimensionality reduction: Principal Component Analysis, Multidimensional Scaling, Filter methods
  • Evaluation of learning algorithms: Training and testing, Error rates, over- and under-fitting, Cross-validation, Confusion matrices and ROC graphs

Teaching/Learning Methods:

Lecture demonstrations, tutorial discussions, and laboratory experiments

Assessment Strategy:

  • In-course Assessments                  30%
  • End-of-course Examination         70%

References:

  • Bishop, C. M., Pattern Recognition and Machine Learning, 2007.
  • Duda, R. O., Hart, P. E. and Stork, D. G., Pattern Classification, 2nd Edition, Wiley, 2000.
  • Mitchell, T., Machine Learning, McGraw Hill, 1997.
  • Witten, I. H., Frank, E. and Hall, M. A., Data Mining: Practical Machine Learning Tools and Techniques, 3rd Edition, Morgan Kaufmann Series, 2011.

STA412M3: Biostatistical Techniques

Course Code: STA412M3
Course Title: Biostatistical Techniques
Academic Credits: 03
Hourly Breakdown: Theory 45 Hours; Practical -; Independent Learning 105 Hours

Objective:

Introduce applied biostatistical techniques used in statistical collaboration on clinical trials.

Intended Learning Outcomes:

  • Distinguish kinds and sources of data
  • Apply the best tools and approaches for data collection or retrieval
  • Identify pitfalls and best practices for turning raw data into analyzable data
  • Examine data quality
  • Identify the correct use of models under different response domains
  • Use observational data for comparative studies
  • Discuss different types of follow-up and time to event responses
  • Apply basic actuarial and parametric approaches for time to event responses
  • Outline different types of follow-up and longitudinal data
  • Develop basic modeling approaches for longitudinal response

Course Contents:

  • Data: Kinds of raw data: unstructured, semi-structured, structured; Sources of data: active versus passive data, clinical databases, registries, administrative data; Data collection tools: spreadsheets, databases, text mining; Analysis dataset: event-based, longitudinal, unique record versus multiple records; Data screening: manual review, descriptive summary, exploratory data analysis; best practices when using tables and figures.
  • Study Initiation: Introduction to types of studies; power calculation; measures of agreement: kappa statistic, concordance correlation coefficient; diagnostic tests: sensitivity, specificity, positive predictive value, negative predictive value; Simple statistical tests: parametric and nonparametric tests: comparison tests, trend tests, tests for correlated responses and assumptions.
  • Modelling: Basic principles of different kinds of statistical models and their applicability to the analysis of clinical data; Models for continuous response, categorical response, count data, zero-inflated data; Methods used in comparative studies: weighting, stratification, adjustment, matching.
  • Time-related responses: Introduction to time-to-event data: left-censored data, competing risk data, repeated events; Introduction to longitudinal responses: continuous, binary, ordinal and nominal. Brief introduction to marginal and conditional (mixed-effects) models.
  • Machine Learning Methods: Use of the bootstrap to estimate standard errors and confidence intervals, Introduction to Random Forests: continuous response, categorical response, and time-to-event response.
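
As an illustration of the diagnostic test measures listed under Study Initiation, the following is a minimal sketch and not part of the official course material. It assumes the results of a diagnostic test are summarised as a 2x2 table of true/false positives and negatives. The function name and the counts are hypothetical.

```python
def diagnostic_measures(tp, fp, fn, tn):
    """Compute standard diagnostic accuracy measures from 2x2 table counts.

    tp: diseased, test positive      fp: healthy, test positive
    fn: diseased, test negative      tn: healthy, test negative
    """
    return {
        "sensitivity": tp / (tp + fn),  # P(test positive | diseased)
        "specificity": tn / (tn + fp),  # P(test negative | healthy)
        "ppv": tp / (tp + fp),          # P(diseased | test positive)
        "npv": tn / (tn + fn),          # P(healthy | test negative)
    }


# Made-up counts: 100 diseased and 200 healthy subjects.
m = diagnostic_measures(tp=90, fp=30, fn=10, tn=170)
```

Sensitivity and specificity are properties of the test itself, whereas the predictive values also depend on the proportion of diseased subjects in the sample.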

Teaching Methods:

Lecture demonstrations, quizzes, and tutorial discussions

Assessment/ Evaluation Details:

  • In-course assessments             30%
  • End-of-course Examination     70%

Recommended Readings:

  • Forthofer, R. N., Lee, E. S. and Hernandez, M., Biostatistics: A Guide to Design, Analysis and Discovery, 2nd Edition, Elsevier Academic Press, Boston, 2007.
  • Gordis, L., Epidemiology, 5th Edition, Elsevier Academic Press, Philadelphia, 2014.
  • Harrell, F. E., Regression Modeling Strategies: With Applications to Linear Models, Logistic Regression, and Survival Analysis, 2nd Edition, Springer, New York, 2001.