http://blog.ynada.com/66

# NLTK code for building a corpus of Twitter messages (or of any set of text files in a directory)
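
As a rough illustration of the idea, here is a minimal sketch that assumes each message has already been saved as its own `.txt` file in one directory. It is not the code from the linked post. The names `write_messages` and `build_corpus` and the `tweet_NNN.txt` naming scheme are hypothetical. `PlaintextCorpusReader` is NLTK's reader for directories of plain-text files. If NLTK is not installed, the sketch falls back to the same regular expression that NLTK's `WordPunctTokenizer` uses, and looks only at the top level of the directory.

```python
import re
import tempfile
from pathlib import Path

try:
    from nltk.corpus import PlaintextCorpusReader
except ImportError:  # NLTK not installed: use the regex fallback below
    PlaintextCorpusReader = None

# Same pattern as NLTK's WordPunctTokenizer, the default word
# tokenizer of PlaintextCorpusReader.
WORD_PUNCT = re.compile(r"\w+|[^\w\s]+")


def write_messages(directory, messages):
    """Hypothetical helper: save each message as its own .txt file."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    for i, text in enumerate(messages):
        (directory / f"tweet_{i:03d}.txt").write_text(text, encoding="utf-8")


def build_corpus(directory):
    """Return {file name: list of tokens} for every .txt file in `directory`."""
    if PlaintextCorpusReader is not None:
        # The second argument is a regex selecting which files belong to the corpus.
        reader = PlaintextCorpusReader(str(directory), r".*\.txt")
        return {fid: list(reader.words(fid)) for fid in reader.fileids()}
    # Fallback without NLTK: top-level .txt files only, sorted by name.
    return {
        path.name: WORD_PUNCT.findall(path.read_text(encoding="utf-8"))
        for path in sorted(Path(directory).glob("*.txt"))
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        write_messages(tmp, ["Learning #NLTK today!", "Corpora are fun"])
        for name, tokens in build_corpus(tmp).items():
            print(name, tokens)
```

Returning a plain dictionary keeps the sketch easy to inspect. In practice one would probably keep the `PlaintextCorpusReader` object itself, since it also exposes `raw()` and lazy corpus views.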