Calendar

Feb
27
Thu
Data mining with R: Let R ‘rattle’ you @ Ted Rogers School of Management, TRS 2-166 (8th floor)
Feb 27 @ 14:00 – 16:00

Title:  Data mining with R: Let R ‘rattle’ you

 

Description: 

 

This hands-on workshop will provide training in the rattle data mining package for R. rattle is a graphical user interface to transform, visualize and analyze data.

 

For hands-on exercises, please bring a laptop installed with R and rattle.

 

R is available at http://probability.ca/cran/. Instructions for installing rattle are available at http://rattle.togaware.com/rattle-install-mswindows.html (Windows) or http://rattle.togaware.com/rattle-install-mac.html (Mac).

 

Instructor:

 

Murtaza Haider

Associate Dean of Research & Graduate Programs

Ted Rogers School of Management

Ryerson University

 

Date and Time:  Thursday, February 27, 2014 (2:00 pm — 4:00 pm)

 

Location: 

 

Ted Rogers School of Management

55 Dundas Street West

Room TRS 2-166 (8th floor)

 

RSVP: Nik Ashton (nashton@ryerson.ca)

 

Feb
28
Fri
On a multiple-shock dependence structure @ North Ross 638
Feb 28 @ 10:30 – 11:30
Su Jianxi, supervised by Prof. Edward Furman, will be giving a talk in the York University Statistics seminar series.
The title of the talk is: On a multiple-shock dependence structure. The seminar will be given on Friday, Feb 28, 10:30-11:30 in North Ross 638.
Mar
6
Thu
**CANCELLED** Backward Simulation of Poisson Processes @ Sidney Smith Room 1074
Mar 6 @ 15:30 – 16:30

SEMINAR

Thursday March 6, 2014 at 3:30pm
Sidney Smith Hall, Room 1074

**Refreshments will be served at 3:15pm**

Dr. Alexander Kreinin, IBM
Head of Quantitative Research, RFE Risk Analytics, Business Analytics

Backward Simulation of Poisson Processes

Multivariate Poisson processes have many applications in financial modeling. In particular, in the area of Operational Risk they are used for description and simulation of the frequencies of operational events. Practitioners often model the operational events independently despite observed correlations between the components. In this talk we discuss simulation of the multivariate Poisson model based on the “Poisson Bridge” idea, extreme correlations and their dependence on the intensities of the processes.
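The "Poisson Bridge" idea mentioned in the abstract can be illustrated for a single component. This is a minimal sketch, not the speaker's implementation: it relies on the standard fact that, given the count N(T) = n, the arrival times of a homogeneous Poisson process are distributed as n sorted uniform draws on [0, T]. The function names and parameter values are hypothetical. In a multivariate setting, one would presumably draw the terminal counts jointly with the desired correlation and then fill in each component this way; that step is an assumption and is not shown.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw one Poisson(mean) variate with Knuth's multiplication method."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def bridge_arrival_times(count, horizon, rng):
    """Given N(horizon) = count, arrival times are sorted uniforms on [0, horizon]."""
    return sorted(rng.uniform(0.0, horizon) for _ in range(count))


def backward_simulate(intensity, horizon, rng):
    """Sample the terminal count first, then fill in the path backward."""
    terminal_count = sample_poisson(intensity * horizon, rng)
    return bridge_arrival_times(terminal_count, horizon, rng)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
arrivals = backward_simulate(intensity=2.0, horizon=5.0, rng=rng)
```

Sampling the terminal count first is what makes the scheme "backward": the dependence between components only has to be specified at the horizon, not along the whole path.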

Mar
10
Mon
Evaluation of Intimate Partner Risk Assessment Inventories @ Norm Endler Room (BSB 164)
Mar 10 @ 10:15 – 11:30

Quantitative Methods Forum @ Norm Endler Room (BSB 164)

Mar 10 @ 10:15 AM – 11:15 AM

Speaker: Carrie Smith, York University
Department of Psychology

Title: Evaluation of Intimate Partner Risk Assessment Inventories

Abstract: In the interest of improving transparency, replicability, and validity, many jurisdictions and agencies are now favouring the use of empirically validated measures in violence risk assessment. To date, dozens of risk assessment inventories have been proposed for use in the domestic violence context, but none have been properly validated and their predictive efficacy remains limited. In this talk, I will discuss the methodological limitations of the existing literature, the challenges in producing defensible research in this domain, and the ethical implications of actuarial-style risk assessment in the domestic violence context. I hope to inspire discussion about ways to improve the quality of future research in this important domain.

Mar
13
Thu
EMVS: The EM Approach to Bayesian Variable Selection @ Sidney Smith Hall, Room 1074
Mar 13 @ 15:30 – 16:30

SEMINAR

Thursday March 13, 2014 at 3:30pm

Sidney Smith Hall, Room 1074   *Refreshments will be served at 3:15pm*

EMVS: The EM Approach to Bayesian Variable Selection

Veronika Rockova, University of Pennsylvania

 

Despite rapid developments in stochastic search algorithms, the practicality of Bayesian variable selection methods has continued to pose challenges. High-dimensional data are now routinely analyzed, typically with many more covariates than observations. To broaden the applicability of Bayesian variable selection for such high-dimensional linear regression contexts, we propose EMVS, a deterministic alternative to stochastic search based on an EM algorithm which exploits a conjugate mixture prior formulation to quickly find posterior modes.

Combining a spike-and-slab regularization diagram for the discovery of active predictor sets with subsequent rigorous evaluation of posterior model probabilities, EMVS rapidly identifies promising sparse high posterior probability submodels. External structural information such as likely covariate groupings or network topologies is easily incorporated into the EMVS framework. Deterministic annealing variants are seen to improve the effectiveness of our algorithms by mitigating the posterior multi-modality associated with variable selection priors.

The usefulness of the EMVS approach is demonstrated on real high-dimensional data, where computational complexity renders stochastic search less practical. (Joint work with Edward George)
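To make the spike-and-slab mixture prior concrete, here is a minimal sketch of an E-step-style calculation: the conditional probability that a coefficient belongs to the slab rather than the spike, given its current value. It assumes a two-component normal mixture prior with variances sigma^2 * v0 (spike) and sigma^2 * v1 (slab) and prior inclusion weight theta. These names and the exact parameterization are assumptions made for illustration, not details taken from the abstract.

```python
import math


def normal_density(x, variance):
    """Density of a zero-mean normal with the given variance, evaluated at x."""
    return math.exp(-x * x / (2.0 * variance)) / math.sqrt(2.0 * math.pi * variance)


def inclusion_probability(beta, theta, v0, v1, sigma=1.0):
    """Probability that beta came from the slab (variance sigma^2 * v1)
    rather than the spike (variance sigma^2 * v0), with prior weight theta."""
    slab = theta * normal_density(beta, sigma ** 2 * v1)
    spike = (1.0 - theta) * normal_density(beta, sigma ** 2 * v0)
    return slab / (slab + spike)


# Hypothetical settings: a narrow spike and a wide slab.
near_zero = inclusion_probability(0.0, theta=0.5, v0=0.01, v1=10.0)
far_from_zero = inclusion_probability(3.0, theta=0.5, v0=0.01, v1=10.0)
```

Coefficients near zero are attributed mostly to the spike and large coefficients to the slab. A more careful version would work on the log scale to avoid underflow for extreme values.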

 

Seminars 2013-14 http://www.utstat.toronto.edu/wordpress/?page_id=18

 

 

Mar
17
Mon
Two-part Models for Semicontinuous Variables in Psychological Research @ Norm Endler Room (BSB 164)
Mar 17 @ 10:15 – 11:15

Quantitative Methods Forum @ Norm Endler Room (BSB 164)

Mar 17 @ 10:15 AM – 11:15 AM

Speaker: Dr. Dave Flora, York University
Department of Psychology

Title: Two-part Models for Semicontinuous Variables in Psychological Research

Abstract: “Semicontinuous” outcome variables arise regularly in psychological research, such as substance use research and developmental psychopathology, and applications in experimental psychology are also possible (e.g., response-time data). Such variables are characterized by a strictly non-negative continuous distribution coupled with a high frequency of observations equal to zero. Although models for zero-inflated count data are relatively well understood, models for continuous outcomes with a preponderance of zeros are less commonly used. I will describe both cross-sectional and longitudinal two-part models for such variables. Part 1 is a model for a binary outcome (whether a zero is observed) while Part 2 is a model for a continuous outcome, given that a non-zero value was observed in the first part. In the longitudinal case, model choice for Part 1 has implications for the interpretation of parameters in Part 2. These models will be illustrated with an application to adolescent alcohol use.
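The two-part structure described in the abstract can be sketched in its simplest, covariate-free form. This is an illustrative assumption rather than the speaker's model: Part 1 is reduced to the observed proportion of non-zero values and Part 2 to the mean of the non-zero values. The function names and the toy data are hypothetical.

```python
from statistics import mean


def fit_two_part(outcomes):
    """Intercept-only two-part fit.

    Part 1: probability that the outcome is non-zero.
    Part 2: mean of the outcome among the non-zero observations.
    """
    positives = [value for value in outcomes if value > 0]
    p_nonzero = len(positives) / len(outcomes)
    mean_positive = mean(positives) if positives else 0.0
    return p_nonzero, mean_positive


def overall_mean(p_nonzero, mean_positive):
    """Combine both parts: E[Y] = P(Y > 0) * E[Y | Y > 0]."""
    return p_nonzero * mean_positive


# Toy data: three zeros and two positive values.
drinks = [0.0, 0.0, 0.0, 2.0, 4.0]
p_nonzero, mean_positive = fit_two_part(drinks)
combined = overall_mean(p_nonzero, mean_positive)
```

In practice each part would carry covariates, for example a logistic regression for Part 1 and a regression on the positive values for Part 2.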

Mar
21
Fri
A Pipeline for High-Dimensional Time Course Gene Expression Data to Study Dynamic Network Responses to Viral Infections @ North Ross 638
Mar 21 @ 10:30 – 11:30
A Pipeline for High-Dimensional Time Course Gene Expression Data to Study Dynamic Network Responses to Viral Infections
Hulin Wu, Ph.D., Dean’s Professor
Department of Biostatistics and Computational Biology
Director, Center for Integrative Bioinformatics and Experimental Mathematics
University of Rochester School of Medicine and Dentistry
Email: Hulin_Wu@urmc.rochester.edu
A new pipeline for high-dimensional time course gene expression data is developed based on the concept of functional data analysis (FDA), with the aim of studying dynamic network responses at the gene level. The pipeline includes significance testing for dynamic response genes (DRGs), clustering gene response curves, constructing dynamic gene response networks using differential equation models, network feature analysis, dynamic system analysis, and biological annotations. Novel statistical methods and modeling approaches are developed for the pipeline, which include high-dimensional ODE model selection, parameter estimation, and dynamic system characteristic analysis. We illustrate the pipeline and the proposed methods using genome-wide time course gene expression data from mice and human subjects challenged by influenza viruses. Some interesting biological findings will be discussed.
Mar
24
Mon
Is Item-Level Non-Invariance Always Important? @ Norm Endler Room (BSB 164)
Mar 24 @ 10:15 – 11:15

Quantitative Methods Forum @ Norm Endler Room (BSB 164)

Mar 24 @ 10:15 AM – 11:15 AM

Speaker: Alyssa Counsell, York University
Department of Psychology

Title: Is Item-Level Non-Invariance Always Important?

Abstract: Differential Item Functioning (DIF) refers to measurement non-invariance across groups. In other words, DIF is present when individuals from two distinct groups with equivalent levels of the latent trait or ability demonstrate different response patterns. The implication is that group membership (instead of the latent trait) accounts for the difference in responding. There are several methods that test for DIF but I will use item response theory (IRT) in the current presentation. Specifically I will discuss DIF results that compare Canadian and German participants’ response patterns on each of the items of the General Self Efficacy Scale (Schwarzer & Jerusalem, 1995). The results demonstrate a practical concern for researchers. When DIF is present in some items, the implications for research are not always clear. In some instances the pattern of DIF may not consistently favour one group, and instead, item-level group differences may appear to cancel each other out if the total test information curve is examined. In psychology where groups are typically compared on test information rather than on an item-to-item basis, DIF may not represent a meaningful or important effect.
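The cancellation effect described in the abstract can be illustrated with a toy example. The sketch below assumes a two-parameter logistic (2PL) IRT model and compares groups on the expected total score rather than on the test information curve. The item parameters are hypothetical and chosen so that the two items show DIF in opposite directions.

```python
import math


def item_probability(theta, discrimination, difficulty):
    """2PL probability of endorsing an item at latent trait level theta."""
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


def expected_total_score(theta, items):
    """Sum of item probabilities over (discrimination, difficulty) pairs."""
    return sum(item_probability(theta, a, b) for a, b in items)


# Hypothetical parameters: item 1 is harder for the focal group, item 2 is easier.
reference_items = [(1.0, 0.0), (1.0, 0.0)]
focal_items = [(1.0, 0.5), (1.0, -0.5)]

theta = 0.0
item_gap = item_probability(theta, *reference_items[0]) - item_probability(theta, *focal_items[0])
total_gap = expected_total_score(theta, reference_items) - expected_total_score(theta, focal_items)
```

At theta = 0 the first item differs noticeably between the groups while the expected total scores coincide. With these symmetric parameters the cancellation is exact only at that trait level.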

Mar
31
Mon
The Yuen-Welch and Generalized Linear Model Approaches for Analyzing Skewed and Heteroscedastic Data in Psychology and A brief survey of current statistical methods for meta-analyzing data produced by single-case experimental designs @ Norm Endler Room (BSB 164)
Mar 31 @ 10:15 – 11:15

Speakers: Victoria Ng and Joo Ann Lee, York University
Department of Psychology

Speaker: Victoria Ng

Title: The Yuen-Welch and Generalized Linear Model Approaches for Analyzing Skewed and Heteroscedastic Data in Psychology

Abstract: Many psychological studies are designed for testing whether there are group mean differences for some continuous outcome variable. However, the assumptions of normality and homoscedasticity underlying traditional methods (i.e., ANOVA/OLS regression) are often violated. Two alternative methods are discussed: the Yuen-Welch test with trimmed means, and the Generalized Linear Model (GLM). Given the many specifications that are possible in the GLM, selected studies on competing estimators from the health outcomes literature are touched upon. With the premise that one would ideally choose the method that yields both adequate power and estimates that represent all relevant data (i.e., including distribution tails), I address the motivation for comparing the Yuen-Welch and the GLM by simulation and discuss potential implementations of such a study.
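For readers unfamiliar with the Yuen-Welch procedure, the following is a minimal sketch of the test statistic as it is usually presented: trimmed means compared using winsorized variances, with Welch-style degrees of freedom. The 20% trimming default and the function names are assumptions for illustration, and the p-value computation is omitted.

```python
import math
from statistics import mean, variance


def trimmed_summary(values, trim):
    """Return the trimmed mean, squared standard error term, and trimmed size."""
    ordered = sorted(values)
    n = len(ordered)
    g = int(math.floor(trim * n))  # observations trimmed from each tail
    h = n - 2 * g                  # observations left after trimming
    trimmed_mean = mean(ordered[g:n - g])
    # Winsorize: replace each trimmed tail with the nearest retained value.
    winsorized = [ordered[g]] * g + ordered[g:n - g] + [ordered[n - g - 1]] * g
    d = (n - 1) * variance(winsorized) / (h * (h - 1))
    return trimmed_mean, d, h


def yuen_welch(x, y, trim=0.2):
    """Yuen-Welch statistic and approximate degrees of freedom."""
    mean_x, d_x, h_x = trimmed_summary(x, trim)
    mean_y, d_y, h_y = trimmed_summary(y, trim)
    t_stat = (mean_x - mean_y) / math.sqrt(d_x + d_y)
    df = (d_x + d_y) ** 2 / (d_x ** 2 / (h_x - 1) + d_y ** 2 / (h_y - 1))
    return t_stat, df


t_stat, df = yuen_welch([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 4.0, 6.0, 8.0, 10.0])
```

With `trim=0.0` the statistic reduces to the ordinary Welch t statistic, which is a convenient sanity check.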

Speaker: Joo Ann Lee

Title: A brief survey of current statistical methods for meta-analyzing data produced by single-case experimental designs

Abstract: Single-case experimental designs (SCEDs; also known as n-of-1 trials, small-n designs, single-subject designs, and interrupted time-series experimental designs, among others) are a set of experimental designs that employ repeated data collection over time on a single unit of interest such as an individual, a family, or an institution. SCEDs are especially beneficial when the research areas studied have high variability, or a low prevalence rate, because the unit serves as its own control. More specifically, SCEDs explicitly focus on within-individual variability. Unfortunately, a single SCED provides very little, if any, information about between-individual variability. This disadvantage, however, can be remedied by meta-analyzing results from separate SCEDs. Nonetheless, the meta-analytic methods for SCEDs are just beginning to be developed. The presentation will begin with a review of the type of data common to SCEDs, followed by illustrations of current popular methods to analyze and meta-analyze SCED data, and conclude with future research in the area.

Apr
30
Wed
SORA / BN / TABA Workshop
Apr 30 all day

The SORA-TABA workshop will be held on Wednesday, April 30th, 2014

 University of Toronto
Health Sciences Building (the auditorium – HS610),

155 College Street, Toronto ON.

 

Recent Advances in Deep Learning:

Learning Structured, Robust, and Multimodal Models

Building intelligent systems that are capable of extracting meaningful representations from high-dimensional data lies at the core of solving many Artificial Intelligence tasks, including visual object recognition, information retrieval, speech perception, and language understanding.

In this talk I will first introduce a broad class of hierarchical probabilistic models called Deep Boltzmann Machines (DBMs) and show that DBMs can learn useful hierarchical representations from large volumes of high-dimensional data with applications in information retrieval, object recognition, and speech perception. I will then describe a new class of more complex models that combine Deep Boltzmann Machines with structured hierarchical Bayesian models and show how these models can learn a deep hierarchical structure for sharing knowledge across hundreds of visual categories, which allows accurate learning of novel visual concepts from few examples. Finally, I will introduce deep models that are capable of extracting a unified representation that fuses together multiple data modalities. I will show that on several tasks, including modelling images and text, video and sound, these models significantly improve upon many of the existing techniques.

Ruslan Salakhutdinov

Assistant Professor,
Department of Computer Science and
Department of Statistical Sciences
University of Toronto