Conversational-Side-Specific Inter-Session Variability Compensation

Inter-session variability compensation is typically crucial for achieving satisfactory performance in speaker recognition systems. General compensation techniques, however, may not capture session and channel information specific to a given conversational side. This paper investigates three methods for estimating a conversational-side-specific projection or affine transform to compensate for session and channel effects. In the first, we estimate the projection from a within-class covariance matrix computed from the statistics of a conversational-side-specific subset of the development data. In the second, we use a discriminative objective function to estimate the projection parameters, and we present an iterative algorithm, similar to the expectation-maximization (EM) algorithm, that estimates the parameters maximizing this objective function. In the third, we estimate an affine transform of the observation vectors of each conversational side using maximum likelihood estimation, with the maximum likelihood objective function estimated on a selected subset of the training data. We present several experiments that compare these three techniques with our baseline system on the interview tasks of the NIST 2008 and NIST 2010 speaker recognition evaluations. The best of these techniques gives a relative performance improvement of up to 20% over the baseline system.
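As a rough illustration of the first method only, the sketch below estimates a within-class covariance matrix from a subset of development vectors chosen for one conversational side and derives a whitening projection from it. The abstract does not state how the subset is selected or what form the projection takes, so everything specific here is an assumption: the nearest-neighbor selection rule, the Cholesky-based form of the projection (borrowed from standard within-class covariance normalization), the `ridge` regularizer, the synthetic data, and all function names are hypothetical and are not taken from the report.

```python
import numpy as np


def select_side_specific_subset(dev_vectors, side_vector, k):
    """Hypothetical selection rule: indices of the k development vectors
    closest (Euclidean distance) to the conversational-side vector."""
    dists = np.linalg.norm(dev_vectors - side_vector, axis=1)
    return np.argsort(dists)[:k]


def within_class_covariance(vectors, speaker_ids):
    """Pooled scatter of each speaker's vectors around that speaker's mean,
    divided by the total number of vectors."""
    vectors = np.asarray(vectors, dtype=float)
    speaker_ids = np.asarray(speaker_ids)
    dim = vectors.shape[1]
    scatter = np.zeros((dim, dim))
    for spk in np.unique(speaker_ids):
        sessions = vectors[speaker_ids == spk]
        centered = sessions - sessions.mean(axis=0)
        scatter += centered.T @ centered
    return scatter / len(vectors)


def wccn_projection(W, ridge=1e-6):
    """Return B with B @ B.T = inv(W + ridge * I); mapping x -> B.T @ x
    whitens the within-class variability described by W.
    The ridge term is an assumption added for numerical stability."""
    dim = W.shape[0]
    return np.linalg.cholesky(np.linalg.inv(W + ridge * np.eye(dim)))


# Synthetic stand-in for development data: 20 speakers, 5 sessions each, 4 dims.
rng = np.random.default_rng(0)
speaker_means = rng.normal(size=(20, 4))
speaker_ids = np.repeat(np.arange(20), 5)
dev_vectors = speaker_means[speaker_ids] + 0.5 * rng.normal(size=(100, 4))

# Synthetic vector standing in for the conversational side being compensated.
side_vector = rng.normal(size=4)

idx = select_side_specific_subset(dev_vectors, side_vector, k=60)
W = within_class_covariance(dev_vectors[idx], speaker_ids[idx])
B = wccn_projection(W)

# Apply the side-specific projection to the conversational-side vector.
compensated = B.T @ side_vector
```

With this construction, `B.T @ W @ B` is close to the identity matrix, which is the sense in which the projection normalizes the within-class variability of the selected subset. A different selection rule or projection form would change only the first and third functions.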

By: Mohamed Kamal Omar, Jason Pelecanos

Published in: RC25139 in 2011


This Research Report is available. This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties).

