What is this?

Learners of a spoken language acquire intonation mainly through the auditory speech signal, not so much through their nostrils, fingertips, or pupils. For example, questions often end with a high tone and statements with a low tone, but they don't have to. How does this work? We distinguish the intonation patterns of a language by their acoustic form and (para)linguistic meaning. Can we simulate this distinction task in the way our ears, brains and mouths do it? Here's a tool that offers a tiny bit of help for you to explore this big question.
Go ahead and contribute to the research so that we can help each other understand. It's not too late, we've only just begun :)


Contour Clustering gives you:
- a workflow for cluster analysis on (time-series) measures of speech
- a graphical user interface (no coding required)
- time-series f0 measures
- interpolated, extrapolated, smoothed, and error-free f0 contours
- time-series intensity measures
- duration measures
- several speaker-correction methods
- several f0 contour representations (scales, derivatives, principal components)
- several clustering settings (method, distance metric, linkage criterion)
- evaluations of the optimal number of clusters
- well-documented examples and instructions
- data (.csv) for further (manual) analysis, including plots and tables
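
To give a feel for what a workflow like the one above does, here is a minimal Python sketch. It is an illustration under assumptions, not the app's code (the app is written in R and offers far more options): the toy f0 values, the function names, the use of semitones relative to each speaker's mean as the speaker correction, and Euclidean distance with complete linkage are all choices made for this example.

```python
import math
from itertools import combinations

def hz_to_semitones(f0_hz, ref_hz):
    """Convert an f0 value in Hz to semitones relative to ref_hz."""
    return 12.0 * math.log2(f0_hz / ref_hz)

def speaker_correct(contours_hz, speakers):
    """Express each contour in semitones relative to its speaker's mean f0."""
    values = {}
    for contour, spk in zip(contours_hz, speakers):
        values.setdefault(spk, []).extend(contour)
    means = {spk: sum(v) / len(v) for spk, v in values.items()}
    return [[hz_to_semitones(x, means[spk]) for x in contour]
            for contour, spk in zip(contours_hz, speakers)]

def euclidean(a, b):
    """Euclidean distance between two contours of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def complete_linkage_clusters(contours, k):
    """Agglomerative clustering (complete linkage) until k clusters remain.

    Returns one cluster label per contour.
    """
    clusters = [[i] for i in range(len(contours))]
    while len(clusters) > k:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # Complete linkage: distance between the two farthest members.
            d = max(euclidean(contours[i], contours[j])
                    for i in clusters[a] for j in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    labels = [0] * len(contours)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels

# Toy data (made up): four time-normalised f0 contours in Hz, two speakers.
contours_hz = [
    [200, 210, 225, 240],  # speaker A, rise
    [240, 225, 210, 200],  # speaker A, fall
    [100, 105, 112, 120],  # speaker B, rise
    [120, 112, 105, 100],  # speaker B, fall
]
speakers = ["A", "A", "B", "B"]

raw_labels = complete_linkage_clusters(contours_hz, k=2)
corrected = speaker_correct(contours_hz, speakers)
corrected_labels = complete_linkage_clusters(corrected, k=2)
print("raw Hz:           ", raw_labels)
print("speaker-corrected:", corrected_labels)
```

On the raw Hz values the two clusters simply separate the speakers; after speaker correction they separate the rises from the falls, which is why a correction step matters before clustering.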

What Contour Clustering needs from you:
- a research question
- segmented speech (.wav/.mp3 and .textgrid) or an existing datafile with one row per measurement point (long format)
- R and RStudio installed
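
If "long format" is new to you, the Python sketch below shows the idea: every row holds one measurement point, and rows that belong to the same recording are grouped into one contour. The column names (`filename`, `speaker`, `step`, `f0`) and the values are hypothetical, chosen for this example only; check the manual for the columns the app actually expects.

```python
import csv
import io
from collections import defaultdict

# Hypothetical long-format data: one row per measurement point.
LONG_CSV = """filename,speaker,step,f0
utt1.wav,A,1,200
utt1.wav,A,2,210
utt1.wav,A,3,225
utt2.wav,B,1,120
utt2.wav,B,2,112
utt2.wav,B,3,105
"""

def read_contours(text):
    """Group long-format rows into one f0 contour per file, ordered by step."""
    points = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        points[row["filename"]].append((int(row["step"]), float(row["f0"])))
    return {name: [f0 for _, f0 in sorted(pts)] for name, pts in points.items()}

contours = read_contours(LONG_CSV)
print(contours)
# {'utt1.wav': [200.0, 210.0, 225.0], 'utt2.wav': [120.0, 112.0, 105.0]}
```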

Download

Complete file package as zip:
2025-03 - current (changelog, old versions)

App (requires R and RStudio):
- contour_clustering.R - 2025-03

Praat script:
- time-series_f0.praat - 2025-03 (uses Praat's new filtered autocorrelation f0 tracking method)

Cite as:

Kaland, Constantijn. 2023. Contour clustering: A field-data-driven approach for documenting and analysing prototypical f0 contours. Journal of the International Phonetic Association 53(1), 159-188. doi:10.1017/S0025100321000049 [pdf]

Old versions

Complete file package as zip:
- 2024-08
- 2024-02
- 2023-06
- 2023-05
- 2022-08
- 2022-04
- 2021-08
- 2021-03

Documentation

Manual:
- manual for Contour Clustering (recommended)

Learning materials:
- video tutorials: read folder, load file - 2025-03
- two-day workshop Methods @ LingLab [Day 1] [Day 2] [OSF repository] - 2023-11

Training datasets:
- Papuan Malay scripted noun phrases (article section 2)
- Papuan Malay spontaneous phrases (article section 3)
- Zhagawa scripted tone syllables (article section 4)

R packages:
cluster, data.table, dplyr, dtwclust, fda, ggdendro, ggplot2, Hmisc, Metrics, pracma, proxy, purrr,
readr, readtextgrid, scales, shiny, sound, stats, stringr, TSdist, usedist, utils, wrassp, zip, zoo

Languages

Contour Clustering has been applied to the following languages:

| Language | Unit of analysis | Paper(s) |
|---|---|---|
| American English | phrases/words | Kaland et al., 2024; Tatár et al., 2024; Albert et al., 2024; Steffman et al., 2024; Cole et al., 2023 |
| Bardi | phrases | Babinski, 2022; Babinski & Bowern, 2022 |
| Burarra | phrases | Babinski, 2022 |
| Cantonese | syllable rhymes | Li et al., 2023 |
| Dalabon | phrases | Babinski, 2022 |
| Ewe | syllable rhymes | Lam & Chen, 2024 |
| German | phrases | Kaland & Ellison, 2023; Seeliger & Kaland, 2022 |
| Gija | phrases | Babinski, 2022 |
| Gunnartpa | phrases | Babinski, 2022 |
| Gunwinggu | phrases | Babinski, 2022 |
| Kayardild | phrases | Babinski, 2022 |
| Kera'a | words | Kaland et al., 2021 |
| Korean | syllables | Jeon et al., 2024 |
| Kunbarlang | phrases | Babinski, 2022 |
| Malak Malak | phrases | Babinski, 2022 |
| Mandarin Chinese | words | Laméris et al., 2023 |
| Murrinhpatha | phrases | Babinski, 2022 |
| Muyu | phrases | Zahrer, 2024 |
| Nasal | phrases | Hakim, 2024 |
| Ngan’gi | phrases | Babinski, 2022 |
| Papuan Malay | (noun) phrases/words | Kaland & Grice, 2024; Kaland & Ellison, 2023; Kaland, 2023 |
| Patwin | words | Björklund, 2024 |
| Punjabi | hesitations | Jabeen & Wagner, 2023 |
| Wanyjirra | phrases | Babinski, 2022 |
| Warlpiri | phrases | Babinski, 2022 |
| Warnman | phrases | Babinski, 2022 |
| Yan-nhangu | phrases | Babinski, 2022 |
| Yidiny | phrases | Babinski, 2022 |
| Zhagawa | syllables | Kaland & Ellison, 2023; Kaland, 2023 |

References

Cite as:
Kaland, Constantijn. 2023. Contour clustering: A field-data-driven approach for documenting and analysing prototypical f0 contours. Journal of the International Phonetic Association 53(1), 159-188. doi:10.1017/S0025100321000049 [pdf]

Also applied in:
Lam, Man Yan Priscilla & Chen, Yiya. 2024. Tonal contour clustering in Tongugbe Ewe: a preliminary investigation. Dag van de Fonetiek 2024. Utrecht, The Netherlands. Nederlandse Vereniging voor Fonetische Wetenschappen. [pdf]

Kaland, Constantijn; Steffman, Jeremy & Cole, Jennifer. 2024. K-means and hierarchical clustering of f0 contours. Proc. Interspeech 2024 (1520-1524). Kos, Greece. doi:10.21437/Interspeech.2024-181 [pdf]

Hakim, Jacob. 2024. Using role-playing tasks to document intonational tune prototypes in Nasal, an endangered language of Sumatra. Proc. Speech Prosody 2024 (1175-1179). Leiden, The Netherlands. doi:10.21437/SpeechProsody.2024-237 [pdf]

Zahrer, Alexander. 2024. Exploring natural speech intonation of an under-researched Papuan language. Proc. Speech Prosody 2024 (1095-1099). Leiden, The Netherlands. doi:10.21437/SpeechProsody.2024-221 [pdf]

Tatár, Csilla; Brennan, Jonathan R.; Krivokapić, Jelena & Keshet, Ezra. 2024. Examining melodiousness in sarcasm: wiggliness, spaciousness, and contour clustering. Proc. Speech Prosody 2024 (677-681). Leiden, The Netherlands. doi:10.21437/SpeechProsody.2024-137 [pdf]

Jeon, Hae-Sung; Kaland, Constantijn & Grice, Martine. 2024. Cluster analysis of Korean IP-final intonation. Proc. Speech Prosody 2024 (1025-1029). Leiden, The Netherlands. doi:10.21437/SpeechProsody.2024-207 [pdf]

Albert, Aviad; Kaland, Constantijn; Ellison, T. Mark; Cangemi, Francesco; Winter, Bodo & Grice, Martine. 2024. Harvesting spontaneous speech data from digital reservoirs to study prosody. In Proceedings of the 19th Conference on Laboratory Phonology (LabPhon 19). [pdf]

Björklund, Anna. 2024. Automatic intonational contour clustering in Patwin. Proceedings of the Linguistic Society of America, 9(1), 5713. doi:10.3765/plsa.v9i1.5713 [pdf]

Kaland, Constantijn & Grice, Martine. 2024. Exploring and explaining variation in phrase-final f0 movements in spontaneous Papuan Malay. Phonetica, 81(3). doi:10.1515/phon-2023-0031 [pdf]

Steffman, Jeremy; Cole, Jennifer & Shattuck-Hufnagel, Stefanie. 2024. Intonational categories and continua in American English rising nuclear tunes. Journal of Phonetics, 104(101310). doi:10.1016/j.wocn.2024.101310 [pdf]

Li, Katrina Kechun; Nolan, Francis & Post, Brechtje. 2023. Clustering lexical tones with intonation variation. Proceedings of the Second International Conference on Tone and Intonation (TAI) (pp. 87-88). Chinese and Oriental Languages Information Processing Society. [pdf]

Seeliger, Heiko; Lützeler, Anne & Kaland, Constantijn. 2023. The perception of German wh-phrase-final intonation: a contour clustering evaluation. In Proceedings of the 2nd International Conference on Tone and Intonation (TAI 2023) (pp. 10-14). Singapore. doi:10.21437/TAI.2023-3 [pdf]

Jabeen, Farhat & Wagner, Petra. 2023. Variability in hesitations in Punjabi semi-spontaneous narrative speech: An automatic clustering based analysis. Proc. Disfluency in Spontaneous Speech (DiSS) Workshop 2023, 71-75. doi:10.21437/DiSS.2023-15 [pdf]

Kaland, Constantijn. 2023. Intonation contour similarity: F0 representations and distance measures compared to human perception in two languages. The Journal of the Acoustical Society of America, 154(1), 95–107. doi:10.1121/10.0019850 [pdf]

Kaland, Constantijn & Ellison, T. Mark. 2023. Evaluating cluster analysis on f0 contours: An information theoretic approach on three languages. In R. Skarnitzl & J. Volín (Eds.), Proceedings of the 20th International Congress of Phonetic Sciences (pp. 3448–3452). Guarant International. [pdf]

Cole, Jennifer; Steffman, Jeremy; Shattuck-Hufnagel, Stefanie & Tilsen, Sam. 2023. Hierarchical distinctions in the production and perception of nuclear tunes in American English. Laboratory Phonology, 14(1), 1–51. doi:10.16995/labphon.9437 [pdf]

Laméris, Tim Joris; Li, Katrina Kechun & Post, Brechtje. 2023. Phonetic and Phono-Lexical Accuracy of Non-Native Tone Production by English-L1 and Mandarin-L1 Speakers. Language and Speech, 66(4). doi:10.1177/00238309221143719 [pdf]

Babinski, Sarah. 2022. Archival Phonetics & Prosodic Typology in Sixteen Australian Languages. PhD Thesis, Yale University. [pdf]

Seeliger, Heiko & Kaland, Constantijn. 2022. Boundary tones in German wh-questions and wh-exclamatives - a cluster-based approach. In Proceedings of the 11th International Conference on Speech Prosody 2022 (pp. 27–31). Lisbon, Portugal. doi:10.21437/SpeechProsody.2022-6 [pdf]

Babinski, Sarah & Bowern, Claire. 2022. Automatic categorization of prosodic contours in Bardi. Proceedings of the Linguistic Society of America 7(1) (pp. 5218). doi:10.3765/plsa.v7i1.5218 [pdf]

Kaland, Constantijn; Peck, Naomi; Ellison, T. Mark & Reinöhl, Uta. 2021. An initial exploration of the interaction of tone and intonation in Kera'a. Proceedings of the 1st International Conference on Tone and Intonation (TAI) (pp. 132-136). Sønderborg, Denmark. doi:10.21437/TAI.2021-27 [pdf]

Kaland, Constantijn. 2020. Contour clustering: a tool for exploring prototypical f0 patterns. Middag van de Fonetiek, Nederlandse Vereniging voor Fonetiek. Online oral presentation. abstract [pdf]

Credits

Marlene Böttcher (testing)
T. Mark Ellison (coding MDL evaluation)
Max Hörl (statistical advice)
Hae-Sung Jeon (testing)
Priscilla Lam (testing)
Gilly Marchini (testing)
Heiko Seeliger (testing)
Gustavo Silveira (testing)
Jeremy Steffman (testing)
SFB-1252 (funding)

Organizers and participants of:
Phonology Colloquium, Institut für Linguistik, Goethe-Universität Frankfurt (December 2024)
Phonetics Lunch Meeting, University of Zürich (November 2024)
The 31st Annual Meeting of the Austronesian Formal Linguistics Association (AFLA 31), University of Massachusetts (June 2024)
Methods @ LingLab, University of Konstanz (November 2023)
Lunch & Linguistics, Institut für Linguistik, Universität zu Köln (November 2023)
Methoden und Ansätze moderner phonetischer Forschung, Inst. Phonetics and Speech Processing, Ludwig-Maximilians-Universität München (June 2023)
Workshop Speech Units, University of Zürich (April 2023)
Klausurtagung Kloster Steinfeld, SFB-1252, Universität zu Köln (October 2022)
Forschungskolloquium, IfL Phonetik, Universität zu Köln (November 2021)
Phonetics & Phonology Seminar, Dept. Theoretical & Applied Linguistics, University of Cambridge (January 2021)
Middag van de Fonetiek 2020, Nederlandse Vereniging voor Fonetische Wetenschappen (December 2020)

Contact

Contact me for feedback, suggestions and requests for new versions: