🎯 What is this tutorial for?

This tutorial provides a complete, hands-on workflow for aligning speech recordings at the phone level using WhisperX and the Montreal Forced Aligner (MFA). It is designed for linguists, speech researchers, and language students, especially those without a technical background who want to benefit from large AI models without writing complex code.

The instructions are based on real deployment experience and include platform-specific notes, common pitfalls, and practical scripts for batch processing.

📄 Download the Full Tutorial

From Audio to Phone-Level Alignment Using WhisperX and MFA_20June.pdf

🐍 Download the Python Script

vtt2tgt.py

✅ Optional: How to Cite

If this tutorial or script saves you time or helps in your project, a mention or citation would be greatly appreciated!

Jingyi Sun (2025). From Audio to Phone-Level Alignment Using WhisperX and MFA. Available at: https://www.notion.so/From-Audio-to-Phone-Level-Alignment-Using-WhisperX-and-MFA-218d6663b67a807d9d07f6549172141f

Written by Jingyi Sun,

Laboratoire de Phonétique et Phonologie (CNRS & Sorbonne Nouvelle), Paris, June 2025.