DARLA: Automated Vowel Extraction

Semi-Automated Alignment and Vowel Extraction

This system runs FAVE-style semi-automated alignment and vowel extraction on your audio together with a transcription that you supply. It accepts three types of input, described below.

The pronunciations of words in your transcriptions are taken from the CMU Dictionary, which contains Standard American English pronunciations. We use a grapheme-to-phoneme model (Sequitur trained on the same dictionary) to predict pronunciations for words that are not in the dictionary, such as proper names or slang terms. This model is quite accurate, but may introduce a few errors.
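The lookup-then-predict behavior described above can be sketched in a few lines of Python. This is only an illustration of the idea, not DARLA's actual code: the function names are hypothetical, the two sample entries follow the CMU Dictionary's plain-text layout (a word followed by ARPAbet phones), and `toy_g2p` is a placeholder standing in for a trained grapheme-to-phoneme model such as Sequitur.

```python
from typing import Callable, Dict, List, Tuple


def parse_cmudict(lines: List[str]) -> Dict[str, List[str]]:
    """Parse CMU-dictionary-style lines such as 'CAT  K AE1 T'."""
    entries: Dict[str, List[str]] = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith(";;;"):
            continue  # skip blank lines and comment lines
        word, *phones = line.split()
        if "(" in word:
            continue  # this sketch ignores alternate entries like 'READ(1)'
        entries[word.upper()] = phones
    return entries


def pronounce(
    word: str,
    entries: Dict[str, List[str]],
    g2p_fallback: Callable[[str], List[str]],
) -> Tuple[List[str], str]:
    """Return (phones, source): dictionary entry if present, else a prediction."""
    key = word.upper()
    if key in entries:
        return entries[key], "dictionary"
    return g2p_fallback(word), "predicted"


def toy_g2p(word: str) -> List[str]:
    # Hypothetical placeholder: a real system would call a trained G2P model here.
    return ["UNK"]


sample_lines = ["CAT  K AE1 T", "DOG  D AO1 G"]
entries = parse_cmudict(sample_lines)
print(pronounce("cat", entries, toy_g2p))
print(pronounce("zorp", entries, toy_g2p))
```

Tracking the source of each pronunciation ("dictionary" or "predicted") makes it easy to list the out-of-dictionary words afterwards and check them by hand.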

The returned results are in formats convenient for analysis: a basic vowel plot, spreadsheets with raw and Lobanov-normalized data, a spreadsheet formatted for the online NORM system (which has other plotting and normalization options), the transcription, and the aligned TextGrid file.
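For readers unfamiliar with Lobanov normalization, the sketch below shows the usual idea: each formant is converted to a z-score using that speaker's own mean and standard deviation. It is an illustration under stated assumptions, not necessarily the exact computation DARLA performs. The function name is hypothetical, the formant values are invented for the example, and the sample standard deviation is an assumption (some implementations use the population form).

```python
from statistics import mean, stdev
from typing import List, Tuple


def lobanov_normalize(tokens: List[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Z-score F1 and F2 for ONE speaker's vowel tokens.

    tokens: list of (f1_hz, f2_hz) measurements from a single speaker.
    """
    f1_values = [f1 for f1, _ in tokens]
    f2_values = [f2 for _, f2 in tokens]
    f1_mean, f1_sd = mean(f1_values), stdev(f1_values)  # sample SD (assumption)
    f2_mean, f2_sd = mean(f2_values), stdev(f2_values)
    return [
        ((f1 - f1_mean) / f1_sd, (f2 - f2_mean) / f2_sd)
        for f1, f2 in tokens
    ]


# Invented values for illustration only.
speaker_tokens = [(300.0, 2300.0), (700.0, 1200.0), (450.0, 1800.0), (550.0, 1000.0)]
normalized = lobanov_normalize(speaker_tokens)
for raw, norm in zip(speaker_tokens, normalized):
    print(raw, "->", tuple(round(v, 3) for v in norm))
```

Because the mean and standard deviation are computed per speaker, normalization should be run separately for each speaker before their vowel spaces are compared.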

Audio with transcriptions showing sentence or breath group boundaries (as TextGrids)

Recommended

This option lets you upload a TextGrid with a simple pair of boundaries around each sentence and receive alignment and formant extraction results. To produce the boundary annotations:
1. Open the audio file in Praat. In the Praat Objects window, click "Annotate", then "To TextGrid".
2. For "All tier names", type "sentence".
3. For "Which of these are point tiers?", leave the field blank. Click OK.
4. Now highlight both the audio file and the TextGrid file (hold down Cmd or Ctrl on the keyboard). Click "View & Edit".
5. Use the mouse to select a sentence or clause, then press Return. A pair of boundaries (blue) should appear on either side of your sentence.
6. Transcribe the sentence inside that pair of boundaries.
7. Move on to the next sentence and create a pair of boundaries in the same way. When finished with the whole audio file, click "File", then "Save TextGrid as..."
Audio with transcriptions in a plaintext file

This option lets you upload a plaintext (.txt) file with a transcription along with your audio file. The transcription could be your own manual transcription or a transcription from any ASR method (DARLA, Nuance Dragon, etc.).

If you want to make corrections on the aligned TextGrid file, upload it to the TextGrid input interface and re-run formant extraction.

For best results, we recommend that you first delete noises, loud breaths, other voices, etc., from your audio file (in Praat, select the noise and use Cmd-X or Ctrl-X to delete).

We recommend creating the plaintext file in Notepad, TextEdit (with "smart replace" turned off), emacs, or another plaintext editor. Transcripts created by "rich text" editors like Word may contain markup that will interfere with your results or cause DARLA to fail on your job. Please remove punctuation and other extraneous symbols as far as possible.
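If your transcript has already passed through a rich-text editor, a small script can do most of the cleanup. The sketch below replaces "smart" quotes and dashes, strips punctuation, and keeps apostrophes inside words. It is a rough illustration, not a DARLA tool: the function name is hypothetical, and the set of characters kept (ASCII letters, digits, and word-internal apostrophes) is an assumption suited to English transcripts.

```python
import re

# Typographic characters that rich-text editors commonly substitute.
SMART_CHARS = {
    "\u2018": "'",  # left single quote
    "\u2019": "'",  # right single quote / apostrophe
    "\u201c": '"',  # left double quote
    "\u201d": '"',  # right double quote
    "\u2013": " ",  # en dash
    "\u2014": " ",  # em dash
    "\u2026": " ",  # ellipsis
}


def clean_transcript(text: str) -> str:
    """Return the transcript with smart characters and punctuation removed."""
    for smart, plain in SMART_CHARS.items():
        text = text.replace(smart, plain)
    # Keep only ASCII letters, digits, apostrophes, and whitespace.
    text = re.sub(r"[^A-Za-z0-9'\s]", " ", text)
    # Drop apostrophes that are not between two letters (stray quote marks).
    text = re.sub(r"(?<![A-Za-z])'|'(?![A-Za-z])", " ", text)
    # Collapse runs of whitespace into single spaces.
    return re.sub(r"\s+", " ", text).strip()


raw = "\u201cDon\u2019t go,\u201d she said\u2014twice!"
print(clean_transcript(raw))
```

Read through the cleaned text before uploading, since automatic stripping can also remove characters you meant to keep, such as hyphens in compound words.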

Audio with aligned TextGrids, for formant extraction only

With the other options on this website, the system produces an automatically aligned TextGrid. This option is designed to give you the opportunity to upload a hand-corrected alignment and re-run it through the formant extraction step.

