Thursday February 13, 2014
at 3:30pm
Sidney Smith Hall, Room 1074
**Refreshments will be served at 3:15pm**
Computational Foundations of Bayesian Inference and Probabilistic Programming
Dr. Daniel Roy, University of Cambridge
The complexity, scale, and variety of data sets we now have access to have grown enormously, and present exciting opportunities for new applications. Just as high-level programming languages and compilers empowered experts to solve computational problems more quickly, and made it possible for non-experts to solve them at all, a number of high-level probabilistic programming languages with computationally universal inference engines have been developed with the potential to similarly transform the practice of Bayesian statistics. These systems provide formal languages for specifying probabilistic models compositionally, and general algorithms for turning these specifications into efficient algorithms for inference.
In this talk, I will address three key questions at the theoretical and algorithmic foundations of probabilistic programming—and probabilistic modeling more generally—that can be answered using tools from probability theory, computability and complexity theory, and nonparametric Bayesian statistics. Which Bayesian inference problems can be automated, and which cannot? Can probabilistic programming languages represent the stochastic processes at the core of state-of-the-art nonparametric Bayesian models? And if not, can we construct useful approximations? I’ll close by relating these questions to other challenges and opportunities ahead at the intersections of computer science, statistics, and probability.
http://www.utstat.toronto.edu/wordpress/?page_id=18
Speaker: Dr. Andrea Howard, Carleton University
Department of Psychology
Title: Integrating Ratings of Child Psychopathology across Multiple Informants
Abstract: One of the most significant challenges facing researchers and practitioners who assess child psychopathology is how to integrate information about a child’s symptoms from multiple sources when those sources provide discrepant ratings (De Los Reyes & Kazdin, 2004). It is common to obtain ratings for a single target child from informants such as parents, teachers, and peers, but it is less clear how to combine the information provided by multiple informants to derive an integrated measure of the psychopathology trait of interest that is not confounded with informants’ unique perspectives. A new approach to this problem stipulates a trifactor measurement model to analytically disaggregate informants’ unique perspectives of children’s symptoms from a cross-informant consensus rating of their true symptoms (Bauer, Howard et al., 2013). Preliminary results from a new study expand the trifactor model to a three-informant, multi-trait assessment of inattention and hyperactivity/impulsivity symptoms using data drawn from baseline assessments of children enrolled in a randomized controlled trial study of treatments for Attention-Deficit/Hyperactivity Disorder (ADHD).
Suggested Readings:
Bauer, D. J., Howard, A. L., Baldasaro, R. E., Curran, P. J., Hussong, A. M., Chassin, L., & Zucker, R. A. (2013). A trifactor model for integrating ratings across multiple informants. Psychological Methods, 18(4), 475-493.
Title: Data mining with R: Let R ‘rattle’ you
Description:
This hands-on workshop will provide training in the rattle data mining package for R. rattle is a graphical user interface to transform, visualize and analyze data.
For hands-on exercises, please bring a laptop installed with R and rattle.
**R is available at http://probability.ca/cran/. Instructions for the rattle installation are available at http://rattle.togaware.com/rattle-install-mswindows.html or http://rattle.togaware.com/rattle-install-mac.html
Instructor:
Murtaza Haider
Associate Dean of Research & Graduate Programs
Ted Rogers School of Management
Ryerson University
Date and Time: Thursday, February 27, 2014 (2:00 pm — 4:00 pm)
Location:
Ted Rogers School of Management
55 Dundas Street West
Room TRS 2-166 (8th floor)
RSVP: Nik Ashton (nashton@ryerson.ca)
SEMINAR
Thursday March 6, 2014 at 3:30pm
Sidney Smith Hall, Room 1074
**Refreshments will be served at 3:15pm**
Dr. Alexander Kreinin, IBM
Head of Quantitative Research, RFE Risk Analytics, Business Analytics
Backward Simulation of Poisson Processes
Multivariate Poisson processes have many applications in financial modeling. In particular, in the area of Operational Risk they are used for description and simulation of the frequencies of operational events. Practitioners often model the operational events independently despite observed correlations between the components. In this talk we discuss simulation of the multivariate Poisson model based on the “Poisson Bridge” idea, extreme correlations and their dependence on the intensities of the processes.
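As a rough illustration of the backward idea for a single homogeneous Poisson process (an assumption about the simplest case, not the multivariate method of the talk): draw the terminal count N(T) first, then fill in the event times, which, conditional on N(T) = n, are distributed as n sorted Uniform(0, T) draws. The function names and the Knuth-style count sampler are hypothetical choices made for this sketch.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw a Poisson(mean) count with Knuth's multiplication method.

    Adequate for small means only; chosen here to avoid external libraries.
    """
    threshold = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def bridge_arrival_times(terminal_count, horizon, rng):
    """Event times on [0, horizon] given the terminal count.

    Conditional on N(horizon) = terminal_count, the arrival times of a
    homogeneous Poisson process are sorted independent uniforms.
    """
    return sorted(rng.uniform(0.0, horizon) for _ in range(terminal_count))


def simulate_backward(rate, horizon, rng):
    """Sample the terminal count first, then fill in the path backward."""
    terminal_count = sample_poisson(rate * horizon, rng)
    return bridge_arrival_times(terminal_count, horizon, rng)


if __name__ == "__main__":
    rng = random.Random(2014)
    print(simulate_backward(rate=3.0, horizon=1.0, rng=rng))
```

In the multivariate setting described in the abstract, the terminal counts of the components would be drawn jointly with the desired correlation before each path is filled in; that joint draw is not shown here.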
Quantitative Methods Forum @ Norm Endler Room (BSB 164)
Speaker: Carrie Smith, York University
Department of Psychology
Title: Evaluation of Intimate Partner Risk Assessment Inventories
Abstract: In the interest of improving transparency, replicability, and validity, many jurisdictions and agencies are now favouring the use of empirically validated measures in violence risk assessment. To date, dozens of risk assessment inventories have been proposed for use in the domestic violence context, but none have been properly validated and their predictive efficacy remains limited. In this talk, I will discuss the methodological limitations of the existing literature, the challenges in producing defensible research in this domain, and the ethical implications of actuarial style risk assessment in the domestic violence context. I hope to inspire discussion about ways to improve the quality of future research in this important domain.
SEMINAR
Thursday March 13, 2014 at 3:30pm
Sidney Smith Hall, Room 1074
**Refreshments will be served at 3:15pm**
EMVS: The EM Approach to Bayesian Variable Selection
Veronika Rockova, University of Pennsylvania
Despite rapid developments in stochastic search algorithms, the practicality of Bayesian variable selection methods has continued to pose challenges. High-dimensional data are now routinely analyzed, typically with many more covariates than observations. To broaden the applicability of Bayesian variable selection for such high-dimensional linear regression contexts, we propose EMVS, a deterministic alternative to stochastic search based on an EM algorithm which exploits a conjugate mixture prior formulation to quickly find posterior modes.
Combining a spike-and-slab regularization diagram for the discovery of active predictor sets with subsequent rigorous evaluation of posterior model probabilities, EMVS rapidly identifies promising sparse high posterior probability submodels. External structural information such as likely covariate groupings or network topologies is easily incorporated into the EMVS framework. Deterministic annealing variants are seen to improve the effectiveness of our algorithms by mitigating the posterior multi-modality associated with variable selection priors.
The usefulness of the EMVS approach is demonstrated on real high-dimensional data, where computational complexity renders stochastic search less practical. (Joint work with Edward George)
Seminars 2013-14 http://www.utstat.toronto.edu/wordpress/?page_id=18
Quantitative Methods Forum @ Norm Endler Room (BSB 164)
Speaker: Dr. Dave Flora, York University
Department of Psychology
Title: Two-part Models for Semicontinuous Variables in Psychological Research
Abstract: “Semicontinuous” outcome variables arise regularly in psychological research, such as substance use research and developmental psychopathology, and applications in experimental psychology are also possible (e.g., response-time data). Such variables are characterized by a strictly non-negative continuous distribution coupled with a high frequency of observations equal to zero. Although models for zero-inflated count data are relatively well understood, models for continuous outcomes with a preponderance of zeros are less commonly used. I will describe both cross-sectional and longitudinal two-part models for such variables. Part 1 is a model for a binary outcome (whether a zero is observed) while Part 2 is a model for a continuous outcome, given that a non-zero value was observed in the first part. In the longitudinal case, model choice for Part 1 has implications for the interpretation of parameters in Part 2. These models will be illustrated with an application to adolescent alcohol use.
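The two-part structure can be sketched in its simplest cross-sectional form, with no covariates (an assumption made for brevity; the models in the talk presumably include predictors and longitudinal structure). Part 1 estimates the probability of a non-zero observation, Part 2 estimates the mean among non-zero observations, and their product recovers the overall mean. The function names are hypothetical.

```python
from statistics import mean


def fit_two_part(values):
    """Fit an intercept-only two-part model to semicontinuous data.

    Part 1: proportion of observations that are non-zero.
    Part 2: mean of the outcome among the non-zero observations.
    """
    if not values:
        raise ValueError("values must be non-empty")
    if any(v < 0 for v in values):
        raise ValueError("semicontinuous outcomes must be non-negative")
    positives = [v for v in values if v > 0]
    prob_nonzero = len(positives) / len(values)
    conditional_mean = mean(positives) if positives else 0.0
    return prob_nonzero, conditional_mean


def overall_mean(prob_nonzero, conditional_mean):
    """Combine both parts: E[Y] = P(Y > 0) * E[Y | Y > 0]."""
    return prob_nonzero * conditional_mean


if __name__ == "__main__":
    drinks_per_week = [0, 0, 0, 2.0, 4.0]  # made-up illustrative values
    p, m = fit_two_part(drinks_per_week)
    print(p, m, overall_mean(p, m))
```

With covariates, Part 1 would typically become a binary regression (for example logistic) and Part 2 a regression for the positive values, often on a log scale; those choices are not specified in the abstract.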