Module l555 :: Class Corpus

Class Corpus


A class representing a corpus (a body of text). It provides methods for common operations on the corpus, such as retrieving its words, sentences, and letters, and computing frequencies.

Now I try out some restructured text


Method Summary
  letters(self)
return a list of letters from a corpus object
  readFile(self, fileName)
read a whole corpus file into a string
  removePunctuation(self)
return text without punctuation from self.text
  sents(self)
return a list of sentences from a corpus object
  toFreqs(self, type, proportional)
return a dictionary of word (default) or letter frequencies. When the proportional argument is true, the returned frequencies are relative to the total number of words in the list.
  typeTokenRatio(self, type)
return the type-token ratio for the corpus, based on either words (default) or letters
  words(self)
return a list of words from a corpus object

Method Details

letters(self)

return a list of letters from a corpus object

readFile(self, fileName)

read a whole corpus file into a string

removePunctuation(self)

return text without punctuation from self.text
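
The source of removePunctuation is not shown on this page, so the following is only a minimal sketch of one way such a method might work, written in Python 3. The standalone function name remove_punctuation is hypothetical, and treating "punctuation" as the ASCII set in string.punctuation is an assumption.

```python
import string


def remove_punctuation(text):
    """Return `text` with ASCII punctuation removed (hypothetical helper)."""
    # The third argument to str.maketrans lists characters to delete.
    table = str.maketrans("", "", string.punctuation)
    return text.translate(table)


print(remove_punctuation("Hello, world! It's here."))
```

Note that this approach also strips apostrophes and hyphens, which may or may not match what the actual method does.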

sents(self)

return a list of sentences from a corpus object
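
How sents decides where a sentence ends is not documented here. As a rough illustration only, a sentence splitter could break the text on whitespace that follows a full stop, exclamation mark, or question mark. The function name split_sentences and the splitting rule are both assumptions, not the module's actual behaviour.

```python
import re


def split_sentences(text):
    """Split `text` into sentences on whitespace after '.', '!' or '?'.

    Hypothetical helper: a naive rule that will mis-split abbreviations
    such as "Dr. Smith".
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    # Drop the empty string produced when the input is empty.
    return [p for p in parts if p]


print(split_sentences("Hi there. How are you? Fine!"))
```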

toFreqs(self, type='word', proportional=False)

return a dictionary of word (default) or letter frequencies. When the proportional argument is true, the returned frequencies are relative to the total number of words in the list.
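
The difference between raw and proportional frequencies can be sketched as below. This is not the module's implementation: to_freqs is a hypothetical standalone function in Python 3, it assumes words are whitespace-separated and letters are alphabetic characters, and it assumes that in letter mode the proportions are taken relative to the number of letters.

```python
from collections import Counter


def to_freqs(text, type="word", proportional=False):
    """Return a dict of word or letter frequencies (hypothetical helper)."""
    if type == "word":
        items = text.split()  # assumption: whitespace tokenisation
    else:
        items = [c for c in text if c.isalpha()]  # assumption: letters only
    counts = Counter(items)
    if not proportional:
        return dict(counts)
    total = len(items)
    if total == 0:
        return {}
    # Divide each raw count by the total number of items counted.
    return {item: n / total for item, n in counts.items()}


print(to_freqs("a b a b"))
print(to_freqs("a b a b", proportional=True))
```

With proportional=True the values sum to 1 (up to floating-point rounding), which makes corpora of different sizes comparable.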

typeTokenRatio(self, type='word')

return the type-token ratio for the corpus, based on either words (default) or letters
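
The type-token ratio is the number of distinct items (types) divided by the total number of items (tokens). A minimal sketch in Python 3 follows. The function name type_token_ratio, the whitespace tokenisation, and the choice to return 0.0 for empty input are assumptions rather than documented behaviour of typeTokenRatio.

```python
def type_token_ratio(text, type="word"):
    """Return distinct items / total items for words or letters.

    Hypothetical helper, not the Corpus method itself.
    """
    if type == "word":
        tokens = text.split()  # assumption: whitespace tokenisation
    else:
        tokens = [c for c in text if c.isalpha()]  # assumption: letters only
    if not tokens:
        return 0.0  # assumption: avoid division by zero on empty input
    return len(set(tokens)) / len(tokens)


# "the" occurs twice, so there are 4 types among 5 tokens.
print(type_token_ratio("the cat saw the dog"))
```

This sketch is case-sensitive, so "The" and "the" would count as different types unless the text is lower-cased first.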

words(self)

return a list of words from a corpus object


Generated by Epydoc 2.1 on Mon Nov 10 14:13:13 2008 http://epydoc.sf.net