Speaker: Dr. Augustine Wong, York University
Department of Mathematics and Statistics
Title: Overview of Likelihood-Based Inference
Abstract: Obtaining a confidence region or performing a significance test for a parameter based on the likelihood function is common practice in statistics. In last year's presentation, Professor Pek introduced two likelihood-based methods: the Wald method (based on the maximum likelihood estimate of the parameter) and the Wilks method (the likelihood ratio method). In this talk, the accuracy of these two methods is examined. When the parameter of interest is a scalar, a special way of combining the Wald method and the Wilks method is proposed. The proposed method gives extremely accurate inference results even when the sample size is very small.
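The abstract does not spell out the proposed combination, but a minimal R sketch of one standard way of combining the two statistics, the modified signed likelihood root r* = r + log(q/r)/r of Barndorff-Nielsen, illustrates the idea for the rate of an exponential model with only five observations (the example model and statistic choices are my own, not necessarily the speaker's):

    # Wald, Wilks, and combined (r*) one-sided p-values for the rate of an
    # exponential model, with a deliberately tiny sample of n = 5.
    set.seed(1)
    x <- rexp(5, rate = 2)
    n <- length(x); mle <- 1 / mean(x)

    loglik <- function(lam) n * log(lam) - lam * sum(x)

    pvals <- function(lam0) {
      r <- sign(mle - lam0) * sqrt(2 * (loglik(mle) - loglik(lam0)))  # Wilks signed root
      q <- (mle - lam0) * sqrt(n) / mle                               # Wald statistic
      rstar <- r + log(q / r) / r                                     # combined statistic
      c(wald = 1 - pnorm(q), wilks = 1 - pnorm(r), rstar = 1 - pnorm(rstar))
    }
    pvals(1)   # one-sided p-values for H0: lambda = 1

With n = 5 the Wald and Wilks p-values can differ noticeably, while r* is known to track the exact tail probability much more closely in scalar exponential-family problems; this is the kind of small-sample accuracy the abstract describes.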
Suggested Readings:
1. Barndorff-Nielsen, O.E., & Cox, D.R. (1994). Inference and Asymptotics. Chapman & Hall.
2. Bédard, M., Fraser, D.A.S., & Wong, A. (2007). Higher accuracy for Bayesian and frequentist inference: large sample theory for small sample likelihood. Statistical Science 22, 301-321.
3. Doganaksoy, N., & Schmee, J. (1993). Comparisons of approximate confidence intervals for distributions used in life-data analysis. Technometrics 35, 175-184.
4. Fraser, D.A.S. (1990). Tail probabilities from observed likelihoods. Biometrika 77, 65-76.
5. Fraser, D.A.S., Reid, N., & Wu, J. (1999). A simple general formula for tail probabilities for frequentist and Bayesian inference. Biometrika 86, 249-264.
6. Reid, N. (1988). Saddlepoint methods and statistical inference. Statistical Science 3, 213-238.
7. Reid, N. (1996). Higher order asymptotics and likelihood: a review and annotated bibliography. Canadian Journal of Statistics 24, 141-166.
8. Wong, A., & Wu, J. (2000). Practical use of small sample asymptotics for distributions used in life-data analysis. Technometrics 42, 149-155.
9. Wong, A., & Wu, J. (2001). Approximate inference for the factor loading of a simple factor analysis model. Scandinavian Journal of Statistics 28, 407-414.
(Note: 1, 4, 5, 6, and 7 are background material; 2 relates to Bayesian inference; and the rest are specific applications.)
Tuesday February 11, 2014 at 3:30pm
Sidney Smith Hall, Room 2118
*Refreshments will be served at 3:15pm
Pseudo-likelihood methods for community detection in large sparse networks
Dr. Arash Amini, University of Michigan
We consider the problem of community detection in a network, that is, partitioning the nodes into groups that, in some sense, reveal the structure of the network. Many algorithms have been proposed for fitting network models with communities, but most of them do not scale well to large networks, and often fail on sparse networks. We present a fast pseudo-likelihood method for fitting the stochastic block model, a well-known model for networks with communities, as well as a variant that allows for an arbitrary degree distribution by conditioning on degrees.
We provide empirical results showing that the algorithms perform well under a range of settings, including on very sparse networks, and illustrate them on the example of a network of political blogs. We also present spectral clustering with perturbations, a method of independent interest, which works well on sparse networks where regular spectral clustering fails, and use it to provide an initial value for pseudo-likelihood. We discuss theoretical results showing that pseudo-likelihood provides consistent estimates of the communities under mild conditions on the starting value, for the case of a block model with two communities. Time permitting, we give some insights as to why perturbations help with spectral clustering on sparse networks.
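As a rough illustration of the perturbation idea (this is my own sketch in R, not the authors' code, and the exact form of their regularization may differ), one can add a small constant to every entry of the adjacency matrix before the eigen-decomposition and then cluster the rows of the leading eigenvectors:

    # Toy two-block stochastic block model
    set.seed(1)
    n <- 200; z <- rep(1:2, each = n / 2)
    P <- matrix(c(0.08, 0.02, 0.02, 0.08), 2)             # within/between edge probabilities
    A <- matrix(rbinom(n * n, 1, P[z, z]), n)
    A[lower.tri(A)] <- t(A)[lower.tri(A)]; diag(A) <- 0   # symmetrize, no self-loops

    # Spectral clustering with a perturbation: inflate every entry of the
    # adjacency matrix slightly before taking the eigen-decomposition.
    spectral_perturbed <- function(A, K, tau = 0.25) {
      n <- nrow(A)
      Ap <- A + tau * mean(rowSums(A)) / n                # uniform perturbation
      d <- rowSums(Ap)
      L <- Ap / sqrt(outer(d, d))                         # normalized adjacency
      U <- eigen(L, symmetric = TRUE)$vectors[, 1:K]      # leading K eigenvectors
      kmeans(U, centers = K, nstart = 20)$cluster
    }
    table(spectral_perturbed(A, 2), z)                    # recovered vs. true labels

One intuition for why this helps: on very sparse graphs, near-isolated nodes distort the leading eigenvectors of the normalized adjacency matrix, and the uniform perturbation acts like a weak complete graph overlaid on the data, which stabilizes the spectrum.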
http://www.utstat.toronto.edu/wordpress/wp-content/uploads/2014/01/ArashAminiFeb112014.pdf
Thursday February 13, 2014
at 3:30pm
Sidney Smith Hall, Room 1074
**Refreshments will be served at 3:15pm
Computational Foundations of Bayesian Inference and Probabilistic Programming
Dr. Daniel Roy, University of Cambridge
The complexity, scale, and variety of data sets we now have access to have grown enormously, and present exciting opportunities for new applications. Just as high-level programming languages and compilers empowered experts to solve computational problems more quickly, and made it possible for non-experts to solve them at all, a number of high-level probabilistic programming languages with computationally universal inference engines have been developed with the potential to similarly transform the practice of Bayesian statistics. These systems provide formal languages for specifying probabilistic models compositionally, and general algorithms for turning these specifications into efficient algorithms for inference.
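To make the analogy concrete, here is a deliberately tiny R sketch of the idea (my own illustration, not any particular system): the model is an ordinary generative function, and a single generic inference routine, here crude rejection sampling, conditions any such model on data without model-specific derivations:

    # A probabilistic "program": just a generative function.
    model <- function() {
      theta <- rbeta(1, 1, 1)                  # prior on theta
      y <- rbinom(1, size = 10, prob = theta)  # likelihood
      list(theta = theta, y = y)
    }

    # A generic inference engine: run the program, keep runs matching the data.
    rejection_infer <- function(model, obs, n = 1e5) {
      draws <- replicate(n, { s <- model(); if (s$y == obs) s$theta else NA })
      draws[!is.na(draws)]                     # samples from p(theta | y = obs)
    }

    post <- rejection_infer(model, obs = 7)
    mean(post)   # posterior is Beta(8, 4) by conjugacy, so roughly 8/12

Real probabilistic programming systems replace the rejection sampler with far more efficient general-purpose inference, but the division of labor is the same: the user writes the generative program, and the engine supplies the inference.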
In this talk, I will address three key questions at the theoretical and algorithmic foundations of probabilistic programming—and probabilistic modeling more generally—that can be answered using tools from probability theory, computability and complexity theory, and nonparametric Bayesian statistics. Which Bayesian inference problems can be automated, and which cannot? Can probabilistic programming languages represent the stochastic processes at the core of state-of-the-art nonparametric Bayesian models? And if not, can we construct useful approximations? I’ll close by relating these questions to other challenges and opportunities ahead at the intersections of computer science, statistics, and probability.
http://www.utstat.toronto.edu/wordpress/?page_id=18
Speaker: Dr. Andrea Howard, Carleton University
Department of Psychology
Title: Integrating Ratings of Child Psychopathology across Multiple Informants
Abstract: One of the most significant challenges facing researchers and practitioners who assess child psychopathology is how to integrate information about a child’s symptoms from multiple sources when those sources provide discrepant ratings (De Los Reyes & Kazdin, 2004). It is common to obtain ratings for a single target child from informants such as parents, teachers, and peers, but it is less clear how to combine the information provided by multiple informants to derive an integrated measure of the psychopathology trait of interest that is not confounded with informants’ unique perspectives. A new approach to this problem stipulates a trifactor measurement model to analytically disaggregate informants’ unique perspectives of children’s symptoms from a cross-informant consensus rating of their true symptoms (Bauer, Howard et al., 2013). Preliminary results from a new study expand the trifactor model to a three-informant, multi-trait assessment of inattention and hyperactivity/impulsivity symptoms using data drawn from baseline assessments of children enrolled in a randomized controlled trial study of treatments for Attention-Deficit/Hyperactivity Disorder (ADHD).
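For readers curious about the structure of such a model, a minimal sketch in R with the lavaan package follows (the item names, the data frame `ratings`, and the restriction to two informants are hypothetical, and the item-specific factors of the full trifactor model are omitted for brevity):

    # Sketch of a trifactor-style measurement model: a consensus factor
    # loads on all items, each informant gets a perspective factor, and
    # the factors are kept orthogonal so the consensus rating is not
    # confounded with informants' unique perspectives.
    library(lavaan)
    trifactor <- '
      consensus =~ p1 + p2 + p3 + t1 + t2 + t3   # cross-informant trait factor
      parent    =~ p1 + p2 + p3                  # parent perspective factor
      teacher   =~ t1 + t2 + t3                  # teacher perspective factor
      consensus ~~ 0*parent + 0*teacher          # orthogonality constraints
      parent    ~~ 0*teacher
    '
    # fit <- cfa(trifactor, data = ratings, std.lv = TRUE)
    # summary(fit, standardized = TRUE)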
Suggested Readings:
Bauer, D. J., Howard, A. L., Baldasaro, R. E., Curran, P. J., Hussong, A. M., Chassin, L., & Zucker, R. A. (2013). A trifactor model for integrating ratings across multiple informants. Psychological Methods, 18(4), 475-493.
Title: Data mining with R: Let R ‘rattle’ you
Description:
This hands-on workshop will provide training in the rattle data mining package for R. The rattle package provides a graphical user interface for transforming, visualizing, and analyzing data.
For hands-on exercises, please bring a laptop installed with R and rattle.
**R is available at http://probability.ca/cran/. Instructions for the rattle installation are available at http://rattle.togaware.com/rattle-install-mswindows.html or http://rattle.togaware.com/rattle-install-mac.html
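Once R is installed, getting started takes only a few lines (using the CRAN mirror above; the GUI's system dependencies are covered by the installation instructions linked above):

    install.packages("rattle", repos = "http://probability.ca/cran/")
    library(rattle)
    rattle()   # launches the graphical interface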
Instructor:
Murtaza Haider
Associate Dean of Research & Graduate Programs
Ted Rogers School of Management
Ryerson University
Date and Time: Thursday, February 27, 2014 (2:00 pm — 4:00 pm)
Location:
Ted Rogers School of Management
55 Dundas Street West
Room TRS 2-166 (8th floor)
RSVP: Nik Ashton (nashton@ryerson.ca)
SEMINAR
Thursday March 6, 2014 at 3:30pm
Sidney Smith Hall, Room 1074
**Refreshments will be served at 3:15pm**
Dr. Alexander Kreinin, IBM
Head of Quantitative Research, RFE Risk Analytics, Business Analytics
Backward Simulation of Poisson Processes
Multivariate Poisson processes have many applications in financial modeling. In particular, in the area of Operational Risk they are used for description and simulation of the frequencies of operational events. Practitioners often model the operational events independently despite observed correlations between the components. In this talk, we discuss simulation of the multivariate Poisson model based on the "Poisson Bridge" idea, as well as extreme correlations and their dependence on the intensities of the processes.
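The "Poisson Bridge" construction itself is not given in the abstract, but the following R sketch shows the standard backward idea for a single process (my own illustration): draw the terminal count first, then fill in the path, since conditional on N(T) = n the n event times are i.i.d. uniform on [0, T]. In the multivariate setting one would draw the correlated terminal counts jointly and fill in each component the same way.

    # Backward simulation of a Poisson process: simulate N(T) first, then
    # place the event times, exploiting that given N(T) = n the arrival
    # times are distributed as n i.i.d. Uniform(0, T) order statistics.
    set.seed(1)
    backward_poisson <- function(lambda, T_end, grid) {
      n_T   <- rpois(1, lambda * T_end)             # terminal count first
      times <- sort(runif(n_T, 0, T_end))           # arrivals given the count
      sapply(grid, function(t) sum(times <= t))     # counts along the grid
    }
    backward_poisson(lambda = 3, T_end = 1, grid = seq(0.1, 1, by = 0.1))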
Quantitative Methods Forum @ Norm Endler Room (BSB 164)
Speaker: Carrie Smith, York University
Department of Psychology
Title: Evaluation of Intimate Partner Risk Assessment Inventories
Abstract: In the interest of improving transparency, replicability, and validity, many jurisdictions and agencies are now favouring the use of empirically validated measures in violence risk assessment. To date, dozens of risk assessment inventories have been proposed for use in the domestic violence context, but none have been properly validated and their predictive efficacy remains limited. In this talk, I will discuss the methodological limitations of the existing literature, the challenges in producing defensible research in this domain, and the ethical implications of actuarial style risk assessment in the domestic violence context. I hope to inspire discussion about ways to improve the quality of future research in this important domain.