Thursday, November 15, 2018

Super Obvious DUH application of corpus linguistics to Czech as a Second Language

This always seems to happen - I have an idea and then think, "Oh man. That was so obvious. Why didn't I think of it before?"

I just, well, didn't.

Rather than picking words from my Harry Potter readings at random, I could do it in a much more focused, precise way, because corpus linguistics (CL) can tell me which words are used most frequently.

And if they're frequently used, obviously I can find them in Harry Potter.

Duh.

So now I just downloaded some guy's list of the top 2k most common words in Czech. I'm going to go through it and mark the ones which I definitely do know. The ones I do not know I am going to search for in a PDF of Harry Potter (post chapter 11, where I am), and highlight them for when I come across them later.
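For anyone who would rather script this than do it by hand, here is a minimal sketch of the same idea in Python. Everything in it is an assumption for illustration: the function name `unknown_words_in_text` is made up, the frequency list and the set of known words are tiny stand-ins for the real ones, and `text` is assumed to be the unread part of the book already extracted as plain text. It also only matches exact word forms, which is a real limitation for a heavily inflected language like Czech, so a list of dictionary forms would miss many inflected occurrences.

```python
import re
from collections import Counter


def unknown_words_in_text(frequency_list, known_words, text):
    """Return (word, count) pairs for frequent words I don't know yet
    that actually occur in the text, in frequency-list order.

    Matching is on exact lowercase word forms only (no lemmatization).
    """
    # \w is Unicode-aware for str patterns, so Czech diacritics are kept.
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter(tokens)
    known = {w.lower() for w in known_words}

    found = []
    for word in frequency_list:
        w = word.lower()
        if w in known:
            continue  # already marked as known, skip it
        if counts[w] > 0:
            found.append((w, counts[w]))
    return found


# Hypothetical stand-in data, not the real list or the real book text.
frequency_list = ["a", "být", "zrovna", "protože"]
known_words = {"a", "být"}
text = "Harry zrovna spal. Zrovna!"

for word, count in unknown_words_in_text(frequency_list, known_words, text):
    print(word, count)
```

The words it prints are the ones worth highlighting, since they are both on the frequency list and coming up in the chapters ahead.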

If I were more hardcore, I could probably investigate the list somehow and verify its authenticity. But I am going to assume that any top-2k list made by almost anybody using almost any method is probably going to be pretty good, and my picks can't get any more random than they already are. (Though, to be honest, I am not a complete moron! I pick words whose meaning I truly can't work out from the context, and sometimes after I pick 'em, I figure it out. So yeah, what I pick is not entirely random. I also try to get a variety of parts of speech.)

I seriously can't believe I didn't think of doing this earlier. It would have been much more logical.

Argh!

2 comments:

  1. Great idea! I don't think you should feel bad though. Picking the most common words is logical, but discovering them naturally is also really helpful in internalizing the words and their meaning so I think both approaches are very good.

    1. It is hard to discover language naturally when you don't have an immersive context, though. Since we live in Iowa, it has to be done a little bit more, shall we say, thoughtfully? Purposefully? Remember when we went to the CR last year and everybody kept saying "zrovna"? It is kind of like that. No CSL textbooks pay attention to that word, even though it was super high on the frequency list. If the textbooks had been informed by CL, they would have included it in their examples, and their students would have been able to discover it "naturally."

      I do not have regrets about what I have done so far. But it is always like that, where hindsight is 20/20, right?
