Wednesday, September 2, 2020

STIP Student Sydney Allworth Begins to Build a Non-Native Articulatory (NNA) Corpus Online

The broad goal for this summer was to begin building a Non-Native Articulatory (NNA) Corpus online, so that other researchers could use the articulatory and acoustic data compiled in the corpus to explore their own research questions about second language (L2) acquisition. Although the long-term goal for the corpus was to compile data from L2 speakers of any language, we planned to start this summer with speakers whose L2 is French. Little did we know that we would need to narrow our scope much further in order to adapt to the changes caused by Covid-19.

Because the goal was to build an articulatory corpus, we originally planned to collect both auditory and articulatory data to measure how each speaker was producing speech sounds in French. We set out to use ultrasound technology to capture images of speech in action, but we soon discovered that even collecting audio in person was unlikely to be possible. That is when we changed course for the summer. Instead of launching the corpus itself, we would focus on building its core structures, such as the language background questionnaire and the list of stimuli. In addition, we would conduct a pilot study focusing only on the analysis of auditory data that was collected and sent to us remotely.

While this was a much more modest plan than we had before, we still had our work cut out for us. After a full month of remote training, we put all our effort into developing a questionnaire that would provide pertinent information on the many variables that go into L2 speech production, as well as creating a stimulus list that contained as many as possible of the sounds and sound combinations known to appear in the target language, French. However, the hardest part was yet to come: data collection and analysis. We were only able to recruit five speakers (three non-native French speakers and two native French speakers who served as controls), but annotating the 825 sound files they produced certainly took up plenty of our time, and we were able to find many interesting patterns.

Although we were greatly limited by the reality of the global pandemic, I gained extremely valuable experience while participating in this Summer Team Impact Project. Not only did I learn more about linguistics and the research community, I also learned some computer programming techniques that I doubt I would have encountered on my own. As a freshman at GMU last year, I didn’t expect to come across an opportunity like this. I’m incredibly proud to have been a part of such an interesting and important program.