

NAME

Lingua::EN::Tagger - Part-of-speech tagger for English natural language processing.

SYNOPSIS

    # Get a list of all nouns and noun phrases with occurrence counts

    # Get a readable version of the tagged text
    my $readable_text = $p->get_readable($text);

DESCRIPTION

The module is a probability-based, corpus-trained tagger that assigns POS tags to English text based on a lookup dictionary and a set of probability values. The tagger assigns appropriate tags based on conditional probabilities: it examines the preceding tag to determine the appropriate tag for the current word. Unknown words are classified according to word morphology or can be set to be treated as nouns or other parts of speech.

The tagger also extracts as many nouns and noun phrases as it can, using a set of regular expressions.

CONSTRUCTOR

new %PARAMS
    Class constructor. Takes a hash with the following parameters (shown with default values):

    unknown_word_tag => ''

    weight_noun_phrases => 0
        When returning occurrence counts for a noun phrase, multiply the value by the number of words in the NP.

    Further options control the following:

    * Stem single words using Lingua::Stem::EN.
    * Ignore noun phrases longer than a given threshold. This affects only the get_words() and get_nouns() methods.
    * Relax the Hidden Markov Model: this may improve accuracy for uncommon words, particularly words used polysemously.

METHODS

add_tags TEXT
    Examine the string provided and return it fully tagged (XML style).

add_tags_incrementally TEXT
    Examine the string provided and return it fully tagged (XML style), but do not reset the internal part-of-speech state between invocations.

get_words TEXT
    Given a text string, return as many nouns and noun phrases as possible. Applies add_tags and involves three stages:

    * Tag the text
    * Extract all the maximal noun phrases (MNPs)
    * Recursively extract all noun phrases from the MNPs

get_readable TEXT
    Return an easy-on-the-eyes tagged version of a text string. Applies add_tags and reformats to be easier to read.

A further method returns an anonymous array of sentences (without POS tags) from a text.
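The conditional-probability idea described above (look the word up in a dictionary, then let the preceding tag decide between candidate tags) can be illustrated with a toy sketch. This is not the module's implementation: the lexicon, the transition table, every probability value, the tag names, and the function name tag_words are assumptions invented for illustration.

```perl
use strict;
use warnings;

# Toy lexicon: P(tag | word). Values are invented for illustration only.
my %lexicon = (
    the   => { det => 1.0 },
    dog   => { nn  => 0.9, vb  => 0.1 },
    barks => { vbz => 0.6, nns => 0.4 },
);

# Toy transitions: P(tag | preceding tag). Also invented.
my %transition = (
    start => { det => 0.6, nn  => 0.3, vbz => 0.1 },
    det   => { nn  => 0.9, nns => 0.1 },
    nn    => { vbz => 0.6, nns => 0.2, nn => 0.2 },
);

# Greedy tagger: for each word, pick the tag that maximises
# P(tag | word) * P(tag | preceding tag).
sub tag_words {
    my (@words) = @_;
    my $prev = 'start';
    my @tagged;
    for my $word (@words) {
        # Assumption: unknown words are treated as nouns in this sketch.
        my $candidates = $lexicon{ lc $word } || { nn => 1.0 };
        my ( $best, $best_score ) = ( 'nn', -1 );
        for my $tag ( sort keys %$candidates ) {
            # Unseen transitions get a small floor value instead of zero.
            my $score = $candidates->{$tag} * ( $transition{$prev}{$tag} // 0.01 );
            ( $best, $best_score ) = ( $tag, $score ) if $score > $best_score;
        }
        push @tagged, "<$best>$word</$best>";    # XML-style output
        $prev = $best;
    }
    return join ' ', @tagged;
}

print tag_words(qw(The dog barks)), "\n";
```

With these invented tables, "barks" is tagged as a verb rather than a plural noun because the preceding tag is a noun, which is the kind of decision the description attributes to the preceding-tag lookup.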
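The noun-phrase extraction stages can be sketched in the same spirit. The code below assumes the text has already been tagged in XML style, then finds maximal noun phrases and recursively derives shorter phrases from each one. The tag names (jj, nn, nns), the function names, and the rule "drop the first word and recurse" are simplifying assumptions, not the module's own regular expressions.

```perl
use strict;
use warnings;

# Maximal noun phrases: longest runs of adjective/noun tokens that end in a noun.
sub max_noun_phrases {
    my ($tagged) = @_;
    my ( @mnps, @run );
    my $flush = sub {
        # Trim trailing adjectives so every phrase ends in a noun.
        pop @run while @run && $run[-1][0] eq 'jj';
        push @mnps, join( ' ', map { $_->[1] } @run ) if @run;
        @run = ();
    };
    while ( $tagged =~ m{<(\w+)>([^<]+)</\1>}g ) {
        my ( $tag, $word ) = ( $1, $2 );
        if   ( $tag =~ /^(?:jj|nns?)$/ ) { push @run, [ $tag, $word ] }
        else                             { $flush->() }
    }
    $flush->();
    return @mnps;
}

# Recursive step: keep the phrase, then recurse on it minus its first word.
sub sub_phrases {
    my (@words) = @_;
    return () unless @words;
    return ( join( ' ', @words ), sub_phrases( @words[ 1 .. $#words ] ) );
}

# Count every phrase derived from every maximal noun phrase.
sub count_noun_phrases {
    my ($tagged) = @_;
    my %count;
    for my $mnp ( max_noun_phrases($tagged) ) {
        $count{$_}++ for sub_phrases( split ' ', $mnp );
    }
    return %count;
}

my $tagged = '<det>The</det> <jj>quick</jj> <nn>fox</nn> <vbz>likes</vbz> '
           . '<nn>ice</nn> <nn>cream</nn>';
my %np = count_noun_phrases($tagged);
print "$_: $np{$_}\n" for sort keys %np;
```

In this simplified version the maximal phrases are "quick fox" and "ice cream", and the recursion adds "fox" and "cream". Weighting counts by phrase length, as weight_noun_phrases describes, would be a small change to the counting loop.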
