# NESS2015sc

## Transcription

Three One-day Short Courses at the 29th New England Statistics Symposium

Friday, April 24, 2015, University of Connecticut, 8:30am to 5pm, at Student Union, Room TBA

To register, please visit http://merlot.stat.uconn.edu/ness15/?info=reg. To know more, please visit http://merlot.stat.uconn.edu/ness15/?info=shortcourse.

### Course 1: Bayesian Biostatistics: Design of Clinical Trials and Subgroup Analysis

#### Instructor

Dr. Peter Müller is Professor of Statistics and Mathematics at the University of Texas, Austin. He works on Bayesian inference, with a focus on nonparametric Bayesian methods, simulation based methods, optimal design and multiple comparison procedures. He is interested in applications in biostatistics and bioinformatics, including in particular Bayesian clinical trial design, hierarchical models, population PK/PD models, inference for histone modifications and tumor heterogeneity. Dr. Müller is a Fellow of the ASA, a Fellow of the IMS, and served as president of ISBA.

#### Outline

This short course is an introduction to Bayesian inference methods that are commonly used in biomedical applications, with a focus on two important problems: Bayesian adaptive clinical trial design and subgroup analysis. The course is organized in 4 classes:

1) Review of Bayesian inference, including basics of MCMC, posterior asymptotics and frequentist operating characteristics; linear regression and hierarchical models.
2) Clinical trial design (phase I and II): CRM, TITE-CRM and more.
3) Clinical trial design (phase II): predictive probability designs, proper Bayes designs, adaptive randomization and decision theoretic designs.
4) Subgroup analysis and multiplicities: how posterior inference adjusts for multiplicities, but solves only half the problem.

The emphasis will be on data analysis and practical implementation issues.
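
As a rough illustration of the kind of design named in class 2, the following is a minimal Python sketch of a continual reassessment method (CRM) update. It is not taken from the course material. It assumes a one-parameter power model, p_i(a) = skeleton_i^exp(a), with a normal prior on a approximated on a grid; the skeleton values, the prior standard deviation of 1.0, the grid range, the target toxicity rate of 0.25, and the function names are all illustrative assumptions. Practical safeguards such as rules against skipping doses are omitted.

```python
import math


def crm_posterior_toxicity(skeleton, dose_idx, toxic, prior_sd=1.0, n_grid=1001):
    """Posterior mean toxicity at each dose under p_i(a) = skeleton[i] ** exp(a).

    The prior a ~ Normal(0, prior_sd^2) is approximated on an equally spaced
    grid over [-5, 5] (an assumption made for this sketch).
    """
    grid = [-5.0 + 10.0 * k / (n_grid - 1) for k in range(n_grid)]
    weights = []
    for a in grid:
        # Unnormalised prior density at this grid point.
        w = math.exp(-0.5 * (a / prior_sd) ** 2)
        # Multiply in one Bernoulli likelihood term per treated patient.
        for d, y in zip(dose_idx, toxic):
            p = skeleton[d] ** math.exp(a)
            w *= p if y else 1.0 - p
        weights.append(w)
    total = sum(weights)
    # Average the dose-toxicity curve over the (discretised) posterior of a.
    return [
        sum(w * s ** math.exp(a) for w, a in zip(weights, grid)) / total
        for s in skeleton
    ]


def recommend_dose(skeleton, dose_idx, toxic, target=0.25):
    """Index of the dose whose posterior mean toxicity is closest to target."""
    post = crm_posterior_toxicity(skeleton, dose_idx, toxic)
    return min(range(len(skeleton)), key=lambda i: abs(post[i] - target))


# Hypothetical trial state: three patients treated at dose index 2,
# all three with a dose-limiting toxicity.
skeleton = [0.05, 0.12, 0.25, 0.40, 0.55]
prior_curve = crm_posterior_toxicity(skeleton, [], [])
post_curve = crm_posterior_toxicity(skeleton, [2, 2, 2], [True, True, True])
next_dose = recommend_dose(skeleton, [2, 2, 2], [True, True, True])
```

After three toxicities the posterior toxicity curve shifts upward at every dose, so the recommendation moves to a lower dose than the one just tried.
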

#### Prerequisites

The course is accessible to anyone with a knowledge of statistical inference at the level of introductory graduate courses in mathematical statistics and probability. An appreciation of statistical inference in biomedical research is desirable, but not strictly required. An important part of the course is the problem sets, which students are asked to work on independently, outside the lectures. Familiarity with R or a comparable computing environment is essential for the problem sets (but not for the lectures). Working through the problem sets is important for an optimal learning experience.

### Course 2: Modern Multivariate Statistical Learning: Methods and Applications

#### Instructors

Dr. Kun Chen is Assistant Professor, and Dr. Jun Yan is Associate Professor in the Department of Statistics, University of Connecticut. Dr. Chen is interested in multivariate analysis, dimension reduction, variable selection, time series analysis and statistical computing, with a focus on analyzing large-scale multivariate data. He has experience working on a variety of statistical applications in ecology, genetics, medical imaging, and health sciences. Dr. Yan works on multivariate dependence, survival analysis, clustered data analysis, spatial data analysis, spatial extremes, estimating functions, and statistical computing. He is committed to making his statistical methods available via open source software and has authored and is actively maintaining a collection of R packages in the public domain.

#### Outline

This short course focuses on state-of-the-art developments in multivariate statistical learning, which exemplify the successful marriage between statistical modeling and optimization. It targets many applications in various fields where the essential goal is to decode the underlying associations between and within a possibly large number of features and outcomes.
The challenges in dealing with noisy multivariate data of high dimensionality or large volume have pushed a genuine refinement and expansion of the classical multivariate analysis toolkit. Several classes of multivariate tools for simultaneous structured dimension reduction and model estimation will be introduced, in which multiple indispensable data attributes and modeling elements (e.g., low rank, sparsity, variable grouping, multi-view data) are seamlessly integrated. Taking such complex structures into account in an integrative yet manageable way significantly enhances model predictive power, improves model interpretation, and enables data analysts to gain critical insights from the data.

The course consists of 5 sessions:

1) overview of multivariate learning;
2) principal component analysis and new variants;
3) canonical correlation analysis and new variants;
4) multivariate regression and new variants; and
5) other multivariate methods and recent developments.

Case studies are provided with examples in finance, insurance, ecology, imaging analysis, genetics, health science, and industrial engineering.

#### Prerequisites

Entry-level graduate courses in statistics or exposure to regression and multivariate analysis are desirable. Participants are encouraged to bring their own laptop computers to the session and to have the latest version of R installed. The participants will have the opportunity to go through several real data examples and case studies together with the instructors.

### Course 3: Boosting R Skills and Automating Statistical Reports

#### Instructor

Dr. Yihui Xie is a software engineer at RStudio, Inc. He is interested in interactive statistical graphics, statistical computing, and web applications. He is an active R user and the author of several R packages, such as animation, formatR, Rd2roxygen, and knitr, among which the animation package won the 2009 John M. Chambers Statistical Software Award of the ASA.
He is also the author of the book “Dynamic Documents with R and knitr”. In 2006, he founded the Capital of Statistics, which has grown into a large online community on statistics in China. He initiated the first Chinese R conference in 2008, and has been organizing R conferences in China since then. During his PhD training at Iowa State University, he won the Vince Sposito Statistical Computing Award (2011) and the Snedecor Award (2012) in the Department of Statistics.

#### Outline

This intermediate level short course consists of Part I “R Programming” and Part II “Dynamic Reporting with R”. Part I aims to improve your R programming skills. It covers some basic and advanced topics in R, as well as R package development. Part II is a tutorial on two packages for automatic reporting, knitr (Xie, 2013) and rmarkdown (Allaire et al., 2014). It covers the basic idea of literate programming as well as its role in reproducible research. A variety of document formats supported by knitr will be introduced, including R LaTeX (.Rnw) and R Markdown (.Rmd). We will show useful features of knitr, such as creating tables and plots from data, caching, and cross references. We will also provide examples of advanced features such as chunk hooks and calling foreign languages (shell scripts, Python, C++, Julia, etc.). Finally we will introduce the simple Markdown language, as well as how to convert R Markdown to many other document formats, such as LaTeX/PDF, HTML, and Word.

Hopefully R Markdown will make it much easier for data analysts to prepare reports and for authors to publish their work related to data analysis. Many people agree that reproducible research is important (see, for example, the Duke Saga, http://www.economist.com/node/21528593), but have the impression that it also implies more work. We will show that this is wrong. Generating reports from knitr dynamic documents is not only a better approach to reproducible research, but also easier and even fun!
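
The basic idea behind a dynamic document can be sketched in a few lines: text and code live in one source, and every number in the report is computed when the report is rendered. The Python sketch below is only an illustration of that idea under stated assumptions. It is not how knitr or rmarkdown are implemented, and the `{{ ... }}` delimiter, the `weave` function, and the sample data are hypothetical.

```python
import re
import statistics

# Hypothetical inline-code delimiter for this sketch: {{ expression }}
INLINE = re.compile(r"\{\{(.+?)\}\}")


def weave(template, env):
    """Replace each {{ expression }} with its value computed from env."""

    def run(match):
        # eval is acceptable for a sketch but unsafe for untrusted templates.
        value = eval(match.group(1).strip(), {"__builtins__": {}}, env)
        # Format floats consistently so the report text is stable.
        return f"{value:.2f}" if isinstance(value, float) else str(value)

    return INLINE.sub(run, template)


template = "We observed {{ n }} values with mean {{ mean(x) }}."
x = [2.0, 4.0, 9.0]
report = weave(template, {"x": x, "n": len(x), "mean": statistics.mean})
print(report)
```

If the data in `x` change, rerunning the script regenerates the sentence with the updated count and mean, with no manual copying of results into the text.
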

#### Prerequisites

Attendees should have some familiarity with programming (not necessarily with R). Some prior knowledge of LaTeX and HTML can be helpful but is not required for this tutorial. Potential attendees include those who write reports that involve data analysis, ranging from homework and project reports to papers, books, and websites. Please have the latest versions of R and the rmarkdown package installed on your laptop, along with RStudio (http://www.rstudio.com). The RStudio IDE is optional but recommended. I will be using RStudio for demo purposes. If you have already been using another text editor such as Emacs + ESS, you are free to stay with your own choice, and I will explain how things work outside RStudio.