Thursday, November 8, 2018

Chat History as a Corpus?

Well, I just did something pretty weird and nerdy.

I figured out a way to download both my Skype chat history (that was easy) and my Hangouts history (still easy, but only because of this guy's parser) to make an objective pile of chat history. A...chat corpus.

Except it is not a real corpus, because I don't know enough about how to format the files, and as of now there are two kinds of files: one is an .html file and the other is a .txt file. I guess it will all have to be in .txt files in order to run it through a concordancer (that is the word I did not know until now) and look at the language quickly and much more objectively.
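For the curious, here is roughly what a concordancer does, as a minimal sketch in Python. It assumes the chat history is already plain text; the function name `kwic` (for "keyword in context"), the `width` parameter, and the sample sentence are all made up for illustration, and a real concordancer does a lot more than this.

```python
import re

def kwic(text, keyword, width=30):
    """Return one keyword-in-context line per whole-word match of keyword."""
    # Collapse line breaks and repeated spaces so the context reads as one line.
    text = " ".join(text.split())
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for m in pattern.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{m.group(0)}] {right}")
    return lines

# Made-up sample text, not real chat data.
sample = "I love texting. My brother says texting is lazy, but Texting works for me."
for line in kwic(sample, "texting", width=20):
    print(line)
```

To try it on an actual file, you could read the .txt into a string first (something like `open("chat.txt", encoding="utf-8").read()`, where the file name is hypothetical) and pass that in as `text`.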

Actually, I decided to make several...uh...pre-corpora piles...divided by who I was texting with. By the way, the person I chat with most on Hangouts is, surprisingly, my brother.

As interesting as this is for me on a personal level, it's also very interesting to me on a much broader level. If I can do this on a small scale, it can certainly be done on a much bigger scale. One of my oldest online friends once told me that he thinks that online friends are exactly the same as in-person friends.

I've wondered a lot about whether that is true. But since it was impossible to measure, it remained an intuitive, gut feeling. Not something really answerable. Sure, good fodder for pontification (this blog was almost named pontifiKate, for the record) and rantings on blogs that nobody reads. But that's about it - nothing you can really do about it.

But you can analyze and measure properties of written language by sticking 'em in a corpus.

I suppose that it would not be impossible to measure spoken language, but perhaps we are limited because there is no way to do so that is socially um...fluid? Acceptable? Legal? For one thing, there are not very many spoken language corpora (building one is possible, but you would have to convert the speech into a written form, and you'd lose data). For another, it's not like you walk around with a tape recorder on your shoulder collecting spoken text - and even if you did, the speech would be altered by the mere presence of the recorder (people would not say the same things because they can see it). Whereas with texting, you have a written record. It is there. It is downloadable. It is analyzable. It is...corporeal. Corporeable. Hahaha.

So you probably could not compare IRL communication with online communication after all. Perhaps, then, there is no way to answer the question of whether friends are friends are friends, or whether the ones in real life are more or less valuable than the online ones and, if so, how.

But still, analyzing the written texting communication, you could find interesting answers to questions about how people interact - you can even measure hesitation and even like, emotions (!) because of time stamps and emojis and punctuation and...
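Here is a minimal sketch of the time stamp idea in Python. It assumes each message has already been parsed into a (time stamp, sender, text) tuple, which is my own made-up format and not what Skype or Hangouts actually exports, and it treats the gap before a reply as a very rough stand-in for hesitation. The function name `reply_gaps` and the sample messages are invented for illustration.

```python
from datetime import datetime

def reply_gaps(messages):
    """Return (replier, seconds) for each message that answers a different sender."""
    gaps = []
    # Walk through consecutive pairs of messages in chronological order.
    for prev, cur in zip(messages, messages[1:]):
        if prev[1] != cur[1]:  # only count a gap when the speaker changes
            seconds = (cur[0] - prev[0]).total_seconds()
            gaps.append((cur[1], seconds))
    return gaps

# Made-up sample conversation, not real chat data.
messages = [
    (datetime(2018, 11, 8, 9, 0, 0), "me", "Ahoj!"),
    (datetime(2018, 11, 8, 9, 0, 45), "brother", "hey"),
    (datetime(2018, 11, 8, 9, 1, 0), "brother", "what's up"),
    (datetime(2018, 11, 8, 9, 3, 0), "me", "not much"),
]

for sender, seconds in reply_gaps(messages):
    print(f"{sender} replied after {seconds:.0f} seconds")
```

Of course a long gap might just mean somebody went to make coffee, so a number like this would only mean something across a lot of messages.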

You could also analyze this kind of communicative language and look for patterns in language learning. You could even potentially measure progress in L2 learning (perhaps based on hesitancy/reaction time?). This could then inform best practices in teaching/learning an L2 (or L3 or L4 or whatever). I mean, we live in a digital world. I obviously firmly believe that there is no reason I should be prevented from learning Czech just because I'm an Iowa housewife with no foreseeable access to language immersion. I have long considered texting in my L4 to be a language learning tool, although I am a firm believer in communicative language teaching and learning - authenticity is king. Meaning: I probably over-invest in the building-the-relationship-with-my-collaborators side of the equation. Meaning: I definitely do not view texting as a means to an end, but as the basis of real, important relationships in my actual life. Meaning: I don't look at my collaborators as chatbotičky. Meaning: there is a lot of banter in English that I don't consider wasted time. But maybe I should. :::shrug:::

Most of all, you could use such corpora to better understand how people and relationships work. I care about that.

In fact, perhaps that is one of the things I care about most of all.

Wow, I know I've been a fan of instant messaging since I was a little girl of about 9 years old, when, against my parents' wishes, I downloaded AIM onto our dial-up computer so I could text with my friends down the street or across town. But I didn't realize my love of and interaction with this kind of communication could lead me to explore ways in which texting could be analyzed in order to inform language learning, psychology, and even forming friendships as an adult.

This is exactly the kind of stuff I can see myself studying for an advanced degree. Cool! Exciting!
