Category: Natural Language Processing

  • Really large NLP corpora

    Jeez, people. You’re all noisy. I’m sure it was all done for posterity’s sake.

        23M irclogs/MagNET/#perl.log
        29M irclogs/freenode/#mysql.log
        36M irclogs/freenode/#debian.log
        37M irclogs/foonetic/#xkcd.log
        39M irclogs/OFTC/#debian.log
        43M irclogs/freenode/#jquery.log
        44M irclogs/freenode/#perl.log

        $ for file in irclogs/MagNET/#perl.log irclogs/freenode/#mysql.log irclogs/freenode/#debian.log irclogs/foonetic/#xkcd.log irclogs/OFTC/#debian.log irclogs/freenode/#jquery.log irclogs/freenode/#perl.log; do echo -n "$file: " ; head -1 "$file" ; done
        irclogs/MagNET/#perl.log: --- Log opened Thu…

  • PocketSphinx on Android via the NDK

    While working on my project for the Spring ’10 “NLP on Mobile Devices” course, I put together a PocketSphinx NDK build. You can pull it down from my git repo:

        $ git clone git://colliertech.org/colliertech/PocketSphinx.git

    I haven’t written any of the JNI marshaling functions yet, though.