SASTA: Semi-Automatische Spontane Taal Analyse

SASTA is a tool for analyzing transcripts of spontaneous language. It is meant to support clinical linguists and research into language development and language disorders. SASTA parses a transcript with Alpino, an automatic utterance parser for Dutch. It recognizes a significant number of forms of deviant language use and analyzes them correctly, following the assessment methods available for Dutch (TARSP, STAP and ASTA).

Overview

  • SASTA can analyze transcripts following multiple assessment methods available for Dutch:
    • TARSP (Schlichting 2005, 2017) for young children (1–4 years), inspired by LARSP for English (Crystal et al. 1989);
    • STAP (Verbeek et al. 2007, van Ierland et al. 2008) for older children (4–8 years);
    • ASTA (Boxum et al. 2013) for adults suffering from aphasia.
  • SASTA outputs a method-specific form and an annotated transcript. If needed, a linguist can correct the generated transcript and re-upload it, after which SASTA generates an updated method-specific form. On training data for TARSP and STAP, SASTA achieves an accuracy between 88% and 95%.
  • SASTA accepts as input transcripts in MS Word or plain text (given some SASTA-specific requirements), as well as CHAT (MacWhinney 2000), and uses AuCHAnn to generate valid CHAT files for transcripts accompanied by an interpretation, which significantly improves results.
  • SASTA analyzes a transcript grammatically using Alpino. It then runs specially constructed XPath queries, one for each measure defined by the assessment method, to count the frequencies of linguistic phenomena in the spontaneous language sample. As such, SASTA may be considered a spin-off of GrETEL, a tool for investigating syntactic phenomena using query-by-example.
  • Further development of SASTA is ongoing, in close collaboration with researchers in language development and with linguists in clinics.
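
The parse-then-query step in the overview can be illustrated with a minimal sketch. This is not SASTA's or sastadev's actual code. The sample tree is a hand-written, simplified stand-in for an Alpino parse, and the measure names and XPath expressions are hypothetical, chosen only to show how counting matches per query might work. It uses the limited XPath subset of Python's standard library.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hand-written, simplified stand-in for an Alpino parse of "ik zie de hond".
# Real Alpino output carries many more attributes per node.
SAMPLE_PARSE = """
<alpino_ds>
  <node cat="top" rel="top">
    <node cat="smain" rel="--">
      <node rel="su" pt="vnw" word="ik"/>
      <node rel="hd" pt="ww" word="zie"/>
      <node cat="np" rel="obj1">
        <node rel="det" pt="lid" word="de"/>
        <node rel="hd" pt="n" word="hond"/>
      </node>
    </node>
  </node>
</alpino_ds>
"""

# Hypothetical measures: these names and queries are illustrative only
# and are not taken from TARSP, STAP, or ASTA.
MEASURES = {
    "subject": ".//node[@rel='su']",
    "direct_object": ".//node[@rel='obj1']",
    "noun": ".//node[@pt='n']",
    "determiner_in_np": ".//node[@cat='np']/node[@rel='det']",
}


def count_measures(parses, measures):
    """Count how many nodes match each query across all parsed utterances."""
    counts = Counter()
    for xml_text in parses:
        root = ET.fromstring(xml_text.strip())
        for name, query in measures.items():
            counts[name] += len(root.findall(query))
    return counts


# Treat the same utterance as a two-utterance sample.
counts = count_measures([SAMPLE_PARSE, SAMPLE_PARSE], MEASURES)
for name, frequency in sorted(counts.items()):
    print(f"{name}: {frequency}")
```

Keeping the queries in a plain mapping from measure name to XPath expression means a new measure can be added without touching the counting logic. A full XPath engine (for example, the one in lxml) would be needed for queries beyond the subset supported by `xml.etree.ElementTree`.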

Contents

This repository contains the source code for the SASTA web application, which consists of a Django backend and Angular frontend.

This repository does not include input data, as these can be privacy-sensitive. Refer to the documentation for instructions on constructing your own input data.

Sastadev

SASTA relies on a Python package called sastadev in the backend. This package is freely available on GitHub, with documentation available on Read the Docs.

Usage

If you are interested in using SASTA, the most straightforward way to get started is to make an account at sasta.hum.uu.nl. This server is maintained by the Research Software Lab and runs the most current release.

Consult the user documentation for all information on using the application, input formats, and output formats.

Self-hosting is an option, but the Research Software Lab does not provide support for it.

Development

The documentation directory contains documentation for developers. This includes running the application through Docker.

License

SASTA is shared under a BSD 3-Clause licence. See LICENSE for more information.

Citation

If you wish to cite this repository, please use the metadata provided in our CITATION.cff file.

Contact

For questions, small feature suggestions, and bug reports, feel free to create an issue. You can also contact the Centre for Digital Humanities.

Publications on SASTA

Other relevant publications

  • Boxum, E., van der Scheer, F. and Zwaga, M. (2013). ASTA: Analyse voor Spontane Taal bij Afasie (4th ed.). Vereniging voor Klinische Linguïstiek.
  • Crystal, D., Fletcher, P. and Garman, M. (1989). Grammatical Analysis of Language Disability (2nd ed.). London: Cole and Whurr. https://hdl.handle.net/10092/17651
  • van Ierland, M., Verbeek, J. and van den Dungen, L. (2008). Spontane Taal Analyse Procedure: Handleiding van het STAP-instrument. Universiteit van Amsterdam.
  • MacWhinney, B. (2000). The CHILDES project: Tools for analyzing talk: Transcription format and programs (3rd ed.). Lawrence Erlbaum Associates Publishers.
  • Odijk, J. (2023, 30 Jan.). Taaltechnologie voor taalkundig onderzoek. Valedictory speech, Utrecht University. https://surfdrive.surf.nl/files/index.php/s/pzNHSgd6t8L0Wnk
  • Schlichting, L. (2005). TARSP: Taal Analyse Remediëring en Screening Procedure: Taalontwikkelingsschaal van Nederlandse kinderen van 1–4 jaar (7th ed.). Amsterdam: Pearson. ISBN 978 90 265 1355 8.
  • Schlichting, L. (2017). TARSP: Taal Analyse Remediëring en Screening Procedure: Taalontwikkelingsschaal van Nederlandse kinderen van 1–4 jaar met aanvullende structuren tot 6 jaar (8th ed.). Amsterdam: Pearson. ISBN 978 90 430 3561 3.
  • Verbeek, J., van Ierland, M. and van den Dungen, L. (2007). Spontane Taal Analyse Procedure: Verantwoording van het STAP-instrument. Universiteit van Amsterdam.