Module l555 :: Class Corpus
A class representing a corpus (text). It includes methods for performing several operations on the corpus, such as retrieving words and sentences.
Now I try out some reStructuredText.
Method Summary | |
---|---|
letters(self) | Return a list of letters from a corpus object. |
readFile(self, fileName) | Slurp a whole corpus into a string. |
removePunctuation(self) | Return the text in self.text with punctuation removed. |
sents(self) | Return a list of sentences from a corpus object. |
toFreqs(self, type='word', proportional=False) | Return a dictionary of word (default) or letter frequencies. When the proportional argument is true, the frequencies are relative to the total number of words in the list. |
typeTokenRatio(self, type='word') | Return the type-token ratio for the corpus, based on either words (default) or letters. |
words(self) | Return a list of words from a corpus object. |
Method Details |
---|
letters(self): Return a list of letters from a corpus object. |
readFile(self, fileName): Slurp a whole corpus into a string. |
removePunctuation(self): Return the text in self.text with punctuation removed. |
sents(self): Return a list of sentences from a corpus object. |
toFreqs(self, type='word', proportional=False): Return a dictionary of word (default) or letter frequencies. When the proportional argument is true, the frequencies are relative to the total number of words in the list. |
typeTokenRatio(self, type='word'): Return the type-token ratio for the corpus, based on either words (default) or letters. |
words(self): Return a list of words from a corpus object. |
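
The method descriptions above can be illustrated with a minimal sketch. This is not the l555 source: the class name CorpusSketch, the constructor, the lowercasing of words and letters, the regular expressions, and the choice to make readFile both store and return the text are all assumptions made for illustration, and the code targets Python 3 rather than the Python version the module was written for. For type='letter', the proportional frequencies here are taken relative to the total number of letters, which is also an assumption.

```python
import re
from collections import Counter


class CorpusSketch:
    """Hypothetical stand-in for Corpus; not the actual l555 implementation."""

    def __init__(self, text=""):
        # Assumption: the corpus is held as a single string in self.text.
        self.text = text

    def readFile(self, fileName):
        # Read the whole file into one string (assumed to be stored in self.text).
        with open(fileName, encoding="utf-8") as handle:
            self.text = handle.read()
        return self.text

    def removePunctuation(self):
        # Drop every character that is neither a word character nor whitespace.
        return re.sub(r"[^\w\s]", "", self.text)

    def words(self):
        # Assumption: words are lowercased, whitespace-separated tokens.
        return self.removePunctuation().lower().split()

    def letters(self):
        # Assumption: letters are lowercased alphabetic characters.
        return [ch for ch in self.text.lower() if ch.isalpha()]

    def sents(self):
        # Naive split: a sentence ends at '.', '!' or '?' followed by whitespace.
        parts = re.split(r"(?<=[.!?])\s+", self.text.strip())
        return [part for part in parts if part]

    def _items(self, type):
        # Select the unit of analysis: words (default) or letters.
        return self.words() if type == "word" else self.letters()

    def toFreqs(self, type="word", proportional=False):
        items = self._items(type)
        counts = Counter(items)
        if not proportional:
            return dict(counts)
        # Relative frequency: count divided by the total number of items.
        total = len(items)
        return {item: n / total for item, n in counts.items()}

    def typeTokenRatio(self, type="word"):
        # Number of distinct items (types) over number of items (tokens).
        items = self._items(type)
        return len(set(items)) / len(items) if items else 0.0


corpus = CorpusSketch("The cat sat. The dog ran!")
print(corpus.sents())            # two sentences
print(corpus.toFreqs()["the"])   # 2
print(corpus.typeTokenRatio())   # 5 types over 6 tokens
```

With proportional=True the values in the returned dictionary sum to 1, which makes frequency tables from corpora of different sizes comparable.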
Generated by Epydoc 2.1 on Mon Nov 10 14:13:13 2008 | http://epydoc.sf.net |