About me

Hey! This is Wei. I am a Ph.D. student in Electrical Engineering at the University of Texas at Dallas. I work with Professor John H.L. Hansen at the Center for Robust Speech Systems. My current research mainly involves machine learning for speech, audio and language processing, speaker recognition and diarization, speech enhancement and separation, acoustic event detection and localization, affective computing, and multi-modal analysis of audio-visual-lexical content.

I received my master's degree from the Signal and Image Processing Institute at the University of Southern California, where I was advised by Professor Panayiotis Georgiou at the SCUBA Lab and the BSP group. My work mainly concerned computational models for emotion recognition using speech, lexical, and psychological signal processing techniques, especially dynamic modeling of couples' interactions.

Previously, I completed my bachelor's degree at the University of Electronic Science and Technology of China. Before that, I grew up in Hefei, Anhui Province, in Eastern China. I spent my whole childhood on the campus of the University of Science and Technology of China, which remains one of my best memories.

Some news

  • Summer research intern & Fall student researcher at Google with Quan Wang and Han Lu.
  • Summer research intern at Tencent AI with Chunlei Zhang and Dong Yu.
  • Spring research intern at Microsoft Applied Science Lab with Kazuhito Koishida.
  • Summer research intern at JD AI Research with Jing Huang.
  • Moved to the Center for Robust Speech Systems at the University of Texas at Dallas.
  • Visiting the National Engineering Lab for Speech and Language Information Processing at USTC with Prof. Jun Du.
  • Our paper "Still Together?: The Role of Acoustic Features in Predicting Marital Outcome" was featured in TechCrunch, Scientific American, the Wall Street Journal, and more.
  • Won the Interspeech Degree of Nativeness Sub-challenge as a team member.
  • Passed the USC-SIPI Ph.D. screening exam.
  • Initiated the Behavioral Signal Processing and Machine Learning Reading Group.
  • Glad to join the SCUBA Lab.

Publications

[Google Scholar]     Highlighted

Turn-to-Diarize: Online Speaker Diarization Constrained by Transformer Transducer Speaker Turn Detection

Wei Xia*, Han Lu*, Quan Wang*, Anshuman Tripathi, Yiling Huang, Ignacio Lopez Moreno, Hasim Sak.

(* equal contribution), arXiv:2109.11641
[Preprint] [BibTex]

Scenario Aware Speech Recognition: Advancements for Apollo Fearless Steps & CHiME-4 Corpora

Szu-Jui Chen, Wei Xia, John H.L. Hansen.

ASRU 2021, Cartagena, Colombia
[PDF] [BibTex]

Self-supervised Text-independent Speaker Verification using Prototypical Momentum Contrastive Learning

Wei Xia, Chunlei Zhang, Chao Weng, Meng Yu, Dong Yu.

ICASSP 2021, Toronto, Canada
[PDF] [BibTex]

DEAAN: Disentangled Embedding and Adversarial Adaptation Network for Robust Speaker Representation Learning

Mufan Sang, Wei Xia, John H.L. Hansen.

ICASSP 2021, Toronto, Canada
[PDF] [BibTex]

Speaker Representation Learning using Global Context Guided Channel and Time-Frequency Transformations

Wei Xia, John H.L. Hansen.

Interspeech 2020, Shanghai, China
[PDF] [BibTex]

Open-set Short Utterance Forensic Speaker Verification using Teacher-Student Network with Explicit Inductive Bias

Mufan Sang, Wei Xia, John H.L. Hansen.

Interspeech 2020, Shanghai, China
[PDF] [BibTex]

SkipConvNet: Skip Convolutional Neural Network for Speech Dereverberation using Optimally Smoothed Spectral Mapping

Vinay Kothapally, Wei Xia, Shahram Ghorbani, John H.L. Hansen, Wei Xue, Jing Huang.

Interspeech 2020, Shanghai, China
[PDF] [BibTex] [Audio Samples]

Cross-domain Adaptation with Discrepancy Minimization for Forensic Speaker Verification

Zhenyu Wang, Wei Xia, John H.L. Hansen.

Interspeech 2020, Shanghai, China
[PDF] [BibTex]

Sound Event Detection in Multichannel Audio using Convolutional Time-Frequency-Channel Squeeze and Excitation

Wei Xia, Kazuhito Koishida.

Interspeech 2019, Graz, Austria
[PDF] [Poster] [BibTex]

Cross-lingual Text-independent Speaker Verification using Unsupervised Adversarial Discriminative Domain Adaptation

Wei Xia, Jing Huang and John H.L. Hansen.

ICASSP 2019, Brighton, UK
[PDF] [BibTex]

UTD-CRSS systems for 2018 NIST Speaker Recognition Evaluation

Chunlei Zhang, Fahimeh Bahmaninezhad, Shivesh Ranjan, Harishchandra Dubey, Wei Xia and John H.L. Hansen.

ICASSP 2019, Brighton, UK
[PDF] [BibTex]

Speaker Recognition with Nonlinear Distortion: Clipping Analysis and Impact

Wei Xia and John H.L. Hansen.

Interspeech 2018, Hyderabad, India.
[PDF] [BibTex]

A Dynamic Model for Behavioral Analysis of Couple Interactions using Acoustic Features.

Wei Xia, James Gibson, Bo Xiao, Brian Baucom and Panayiotis Georgiou.

Interspeech 2015, Dresden, Germany.
[PDF] [BibTex]

Still Together?: The Role of Acoustic Features in Predicting Marital Outcome.

Md Nasir, Wei Xia, Bo Xiao, Brian Baucom, Shrikanth Narayanan and Panayiotis Georgiou.

Interspeech 2015, Dresden, Germany.
[PDF] [BibTex] [Media Coverage]

Automated Evaluation of Non-Native English Pronunciation Quality: Combining Knowledge- and Data-Driven Features at Multiple Time Scales

Matthew Black, Daniel Bone, Zisis Iason Skordilis, Rahul Gupta, Wei Xia, Pavlos Papadopoulos, Sandeep Nallan Chakravarthula, Bo Xiao, Maarten Van Segbroeck, Jangwon Kim, Panayiotis Georgiou and Shrikanth Narayanan.

Interspeech 2015, Dresden, Germany. Winner of the Degree of Nativeness Sub-challenge
[PDF] [BibTex]


Error concealment for low bit rate video communication

Wei Xia and Di Yang.

International Conference on Information Science and Technology (ICIST 13), Yangzhou, China.
[PDF] [BibTex]

Services

  • Reviewer for IEEE/ACM Transactions on Audio, Speech and Language Processing.
  • Reviewer for IEEE Signal Processing Letters.
  • Reviewer for Computer Speech and Language.
  • Reviewer for APSIPA Transactions on Signal and Information Processing.
  • Reviewer for Interspeech.
  • Secondary reviewer for ICASSP, SLT, ASRU, etc.
  • Maintainer of CRSS CPU & GPU clusters.

Personal

  • Some Photos
  • Hobbies: most ball games; jazz, indie pop, rhythm & blues