In conversation with Marcelo Sevilla

Marcelo Sevilla is a PhD student in the Linguistics department at the University of Hong Kong, currently working on the Xiang subgroup of Sinitic. He received his BA in Linguistics from the University of Illinois at Urbana-Champaign and his MA from HKU, where he wrote a dissertation on the synchronic and diachronic phonology of the Qidong Xiang dialect. His research interests include the typology of Sinitic languages, phonetics and phonology, and the syntax-semantics of classifier constructions. In this interview, conducted in February 2021, we discuss his experiences with remote data collection during the pandemic.

Can you tell me about your research and the kinds of data you have collected and are collecting now?

The most recent research I’ve been doing all has to do with the Sinitic languages: either the Xiang subgroup of Sinitic, which is a dialect group spoken in Hunan province, or Cantonese. Originally my research was in syntax-semantics, and it involved some light fieldwork, where I got more descriptive data on classifier usage in the Xiang dialects. That was very theoretical, or theory-minded, as syntax and semantics tend to be. But recently I’ve switched to doing more research on phonetics and phonology, still in the Sinitic languages. In particular, I’m focusing on hesitation markers in Sinitic languages, which are a class of discourse markers used to indicate a pause in speech; their equivalents in English are ‘uh’ and ‘um’. I’m looking at how the Sinitic languages incorporate these items into their tonal and vowel systems, which is interesting because not a lot of research has been done on the Xiang dialects. Hesitation markers in particular are not so well known because they’re not often considered part of language. The traditional view is that they are somewhat language-independent, because they’re not lexical items in the traditional sense; for a long time, they were thought of as non-linguistic sounds, like a sneeze or a yawn. More recently, there’s been a lot of very interesting research, coming primarily from Mark Dingemanse in the Netherlands, about how these items have cross-linguistic similarities in form. It shows that these items are actually language-specific and are internalized into the phonological system of the languages they come from, but that they also converge in form across languages because of their function. I find that very interesting, because they’re sort of language-independent, while at the same time remaining language-specific.

What kind of data are you collecting, in terms of the participants you work with and the kinds of information you’re interested in?

The research I do on the Xiang dialects used to be done in the HKU sound booth. I would ask people to come to the Linguistics department and have them watch the Pear Stories video, which has been used very successfully for Sinitic languages to gather narrative data. Participants then recount the video, I record that, and I go through the recording to find the items I’m looking for. From the recordings, I extract three bits of information: F0, F1 and F2. I look at the distribution of those values to define the vowel space and the tone and pitch.

For Cantonese, which is some other research I’m doing, I’m using a recorded corpus to analyze the hesitation markers. The corpus was recorded by someone else, and using it was part of how I had to adjust my data collection post-pandemic. The adjustments to my research came in three directions. First, for my research on classifiers, I was no longer able to go to Hunan to gather narratives. Second, for the Xiang hesitation markers, I was no longer able to bring people into the sound booth, so I had to record people on Zoom, which is problematic for various reasons, and I’ve had to adjust that in certain ways because Zoom isn’t very reliable for formant transitions. Third, for the research on Cantonese, I decided to use a corpus instead of recording people in the sound booth, because it was more pragmatic given the concerns people might have about travelling. The corpus, called the Corpus of Spoken Chinese, was collected by the Polytechnic University of Hong Kong in 2015. It includes interviews conducted in Cantonese, with the audio recordings and their transcriptions presented side by side. This is very helpful to me, because my Cantonese is really lacking, so it’s good to be able to listen and have the transcription on hand as well.
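For readers who want to see what that extraction step can look like in practice, below is a minimal sketch using parselmouth, a Python interface to Praat. The interview does not say which software is actually used, and the file name, time point, and settings here are illustrative assumptions only.

```python
# Minimal sketch: pulling F0 (pitch) and F1/F2 (formants) out of a recording
# with parselmouth. The file name and measurement time are placeholders.
import parselmouth

snd = parselmouth.Sound("pear_story_retelling.wav")

# Pitch track: F0 in Hz per analysis frame; unvoiced frames come back as 0.
pitch = snd.to_pitch()
f0 = pitch.selected_array['frequency']
voiced = f0[f0 > 0]

# Formant tracks (Burg's method, Praat defaults); query F1/F2 at a time point,
# e.g. the midpoint of a hesitation-marker vowel.
formants = snd.to_formant_burg()
t = 0.50  # seconds; placeholder
f1 = formants.get_value_at_time(1, t)
f2 = formants.get_value_at_time(2, t)

print(f"F0 {voiced.min():.0f}-{voiced.max():.0f} Hz; "
      f"F1 = {f1:.0f} Hz, F2 = {f2:.0f} Hz at t = {t} s")
```

Measurements like these, taken over many tokens, are what define the vowel space (F1/F2) and the pitch patterns (F0) described above.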

Can you talk a bit about how you’ve had to adjust to the new situation, and what you’ve learned from it so far?

One of the more frustrating things about the acoustic data collection at the moment is that I’ve collected a few items in a sound booth, a few items on Zoom, and a few items by sending participants a link to an online microphone, so that they record themselves and send the recording back to me. Having these three sets of data is really problematic for comparison. In addition, collecting data for these dialects is hard because they’re not very present in Hong Kong, and it requires me to network with people and invite them to work with me and take time out of their day for no compensation. So it’s hard to find speakers in this way. I don’t want to do away with the data that I’ve collected, but it’s hard for me to say that I can realistically compare these three sets, because in some cases there is something wrong with the formant transitions. My progression has been to go from the sound booth to Zoom recordings, to realizing that Zoom recordings aren’t the best, and then to sending people a link to a recorder on a website that they can access themselves. This is the best I can do right now, but even then, I can’t verify how good a microphone the informant has. As long as the file isn’t compressed, I think it’s okay. That is the best method to collect data remotely that I’ve identified so far: having the participants record themselves.
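One simple precaution, given the worry about compression mentioned here, is to check each returned file before pooling it with the booth or Zoom data. Below is a small sketch using the soundfile Python library; the file path and the minimum sample rate are assumptions for illustration.

```python
# Sketch: verify that a participant's self-recorded file is uncompressed PCM
# and note its sample rate before comparing it with other recordings.
import soundfile as sf

MIN_RATE = 16000  # Hz; illustrative lower bound for formant measurement

info = sf.info("participant_recording.wav")
print(f"format={info.format}, subtype={info.subtype}, "
      f"rate={info.samplerate} Hz, channels={info.channels}, "
      f"duration={info.duration:.1f} s")

uncompressed = info.format == "WAV" and info.subtype.startswith("PCM")
if not uncompressed or info.samplerate < MIN_RATE:
    print("Warning: compressed or low-rate audio; "
          "treat with caution when comparing across recording setups.")
```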

Can you say a bit more about the technical side of this whole procedure?

The way it works now is that I’ll talk to participants on Zoom, or on WeChat if they’re in mainland China, because Zoom doesn’t work in mainland China. Then I’ll send them a link, via Zoom, that opens in a browser to a website with a built-in recorder, which works in a very simple way: all you have to do is click on it and it will record, then hit stop and it will save a file on the computer. While they’re doing this, I’m on the Zoom call with them at the same time, also recording on Zoom, so that I have a backup recording in case something goes wrong, and also to make sure that they perform the task without repeating it, because you get more hesitation markers with less practice. Then they send the file they’ve recorded locally back to me. What was complicated with this kind of remote data collection procedure was working in China, because many things are blocked there and sending files through WeChat can get complicated: it only allows files up to a certain size. So I have to use a Chinese cloud-drive service to send videos, like the Pear Stories video, and have participants then send the recording back to me the same way. The website in question is the Simple Recorder.js demo. It’s really easy to use, and it’s easy for the informant to access and work with. Sometimes it doesn’t work in China for some reason, but last time it worked after a few tries.

How easy is it to find participants this way? Is it easier than before, or more difficult?

Getting participants to give me their time is easier. Since it’s all online, it’s less effort for them, and they find it very simple to just do a 15-minute task. If you’re in Hong Kong, there’s really no risk in going to a sound booth to record, especially at the university. Some of the people I talked to who were from my university, for instance, could have come to the sound booth, but I couldn’t convince them to do that, because they were accustomed to everything being online. So even in Hong Kong, people want to do it online if there is the possibility, and they are very insistent on it now. So collecting data online is not too difficult; people are usually happy to. It’s also very schedule-friendly: we just have to find a time that works for both of us on the computer, and we can both be at home, so it’s not difficult to be flexible with time. I had one informant who wanted to do it at 7 or 8 pm, which was fine. So that’s helpful to both parties.

Could you single out the top three positive lessons that have come out of this whole situation for you, on the basis of your own research and experience?

It’s really made me aware of how important it is to be flexible in your approach to data gathering. I used to be very strict, but the current situation has really made me realize that you need to be very open-minded about the technology you use, and able to respond to eventualities or issues that might show up. It’s also made me aware of the different options at my disposal that my university provides, which are very helpful for managing research, even in a situation like this, which no one could have predicted. The other benefit is that it’s much easier to do research: it’s much easier to get people to participate for no compensation when it’s online, because it’s less effort for them.

How do you think things are going to progress in the next couple of years in terms of how this might influence the way that we do linguistic data collection?

My impression from the subjects I work with is that people are increasingly going to expect that things can be done online, unless we absolutely have to do them in person, and I think that expectation will only grow in the future. Probably, with or without a pandemic, we have to plan for participants being more comfortable doing things online and, in some cases, refusing to meet in person simply because they realize how much easier it would be to do it online. So you might have to insist more if you really need people to be there in person. I think that’s not going away. This has really made it apparent how useful these tools can be and how not everything has to be done in person, which is both a good thing and a bad thing. But I think people are going to get more used to that in the future. That’s my prediction.

What would be some ways to make this kind of data collection more attractive to participants, or more convincing, so that they take part?

When speakers are difficult to convince, I sometimes try to stress how important it is for me to do this kind of data collection in the appropriate way, since I work with acoustic data and it’s very important that I get good quality. But sometimes participants insist that they don’t want to come to the sound booth. So I end up talking to them a lot about the dialect situation in China and why this research is important on a larger scale, as well as about language documentation in general. Though they usually respond ambivalently to that, I think it’s an avenue I should pursue more: making my research relatable and real to people, and talking about its practical aspects. It’s good to make things practical, I think, or to find ways to connect linguistic research to people’s lives, so that the importance you place on your own work is apparent to other people, and they can see the value that I see in it.
