CS 1705 Library
java.lang.Object
  java.util.AbstractCollection<E>
    java.util.AbstractList<E>
      java.util.ArrayList<StringNormalizer.NormalizerRule>
        net.sf.webcat.StringNormalizer

public class StringNormalizer
extends ArrayList<StringNormalizer.NormalizerRule>
This class represents a programmable string "normalizing" engine that
can be used to convert strings into a canonical form, for example
before comparing them for equality. A normalizer is a list of zero or
more rules (transformations). The normalize(String) method applies the
entire list of transformations to a given string.
For example, you can build a string normalizer that replaces all
sequences of one or more whitespace characters by a single space
character, trims any leading or trailing space, and converts a
string to lower case. This class provides a number of predefined
transformations in the StringNormalizer.StandardRule enumeration.
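As a rough illustration of the canonical form described above, the following sketch uses only the JDK rather than this library. The class name CanonicalFormSketch and its methods are hypothetical, and the library's own rules may differ in detail.

```java
import java.util.Locale;

// Hypothetical stand-in for illustration; not part of net.sf.webcat.
class CanonicalFormSketch {
    // Collapse whitespace runs to one space, trim the ends, then lower-case.
    static String canonical(String s) {
        return s.replaceAll("\\s+", " ").trim().toLowerCase(Locale.ROOT);
    }

    // Compare two strings after converting both to the canonical form.
    static boolean equalAfterNormalizing(String a, String b) {
        return canonical(a).equals(canonical(b));
    }

    public static void main(String[] args) {
        // Differences in spacing and capitalization are ignored.
        System.out.println(equalAfterNormalizing("  Hello\t\tWorld ", "hello world")); // prints true
    }
}
```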
Some examples:
// An "identity" transformation that does nothing:
StringNormalizer norm1 = new StringNormalizer();
// norm1.normalize(...) returns its argument unchanged

// A "lower case" normalizer:
StringNormalizer norm2 = new StringNormalizer(
    StringNormalizer.StandardRule.IGNORE_CAPITALIZATION);
// norm2.normalize(...) returns a lower case version of its argument

// A normalizer that ignores both capitalization and punctuation:
StringNormalizer norm3 = new StringNormalizer(
    StringNormalizer.StandardRule.IGNORE_CAPITALIZATION,
    StringNormalizer.StandardRule.IGNORE_PUNCTUATION);

// A "standard" normalizer:
StringNormalizer norm4 = new StringNormalizer(true);
// norm4.normalize(...) returns its argument with all punctuation
// characters removed, all letters converted to lower case, all
// whitespace sequences replaced by single spaces, all MS-DOS or
// Mac line terminators replaced by "\n", and all leading and
// trailing whitespace removed.
Note that a string normalizer containing multiple rules applies them
in order (i.e., in the order added, which is the List order of this
class). Because each rule operates on the output of the previous one,
adding the same rules in a different order may produce different
results.
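To make the ordering concern concrete, here is a minimal JDK-only sketch. It assumes each rule can be modeled as a UnaryOperator<String>. The class RuleOrderSketch and its two stand-in rules are hypothetical and are not the library's implementations.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for illustration; not the net.sf.webcat implementation.
class RuleOrderSketch {
    // Stand-in rules: collapse whitespace runs, and strip punctuation.
    static final UnaryOperator<String> COLLAPSE_WHITESPACE =
        s -> s.replaceAll("\\s+", " ");
    static final UnaryOperator<String> STRIP_PUNCTUATION =
        s -> s.replaceAll("\\p{Punct}", "");

    // Apply each rule in list order, feeding each result to the next rule.
    static String applyInOrder(List<UnaryOperator<String>> rules, String content) {
        String result = content;
        for (UnaryOperator<String> rule : rules) {
            result = rule.apply(result);
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "a , b";
        // Whitespace first: removing the comma afterwards leaves two adjacent spaces.
        System.out.println("[" + applyInOrder(
            Arrays.asList(COLLAPSE_WHITESPACE, STRIP_PUNCTUATION), input) + "]"); // prints [a  b]
        // Punctuation first: the resulting double space is then collapsed.
        System.out.println("[" + applyInOrder(
            Arrays.asList(STRIP_PUNCTUATION, COLLAPSE_WHITESPACE), input) + "]"); // prints [a b]
    }
}
```

In this sketch the same two rules give different results depending only on their position in the list, so a rule that cleans up whitespace is usually best added after any rule that deletes characters.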
| Nested Class Summary | |
|---|---|
| static interface | StringNormalizer.NormalizerRule - This interface defines what it means to be a normalizer rule: an object having an appropriate StringNormalizer.NormalizerRule.normalize(String) method. |
| static class | StringNormalizer.RegexNormalizerRule - A highly reusable concrete implementation of StringNormalizer.NormalizerRule that applies a series of regular expression substitutions. |
| static class | StringNormalizer.StandardRule - This enumeration defines the set of predefined transformation rules. |
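The idea of a rule that applies a series of regular expression substitutions can be sketched with java.util.regex alone. The class RegexRuleSketch and its addSubstitution method are hypothetical names chosen for illustration. The actual constructor and methods of StringNormalizer.RegexNormalizerRule are not shown on this page and may look quite different.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch for illustration; not the library's RegexNormalizerRule.
class RegexRuleSketch {
    private final List<Pattern> patterns = new ArrayList<>();
    private final List<String> replacements = new ArrayList<>();

    // Register one substitution; substitutions run in registration order.
    RegexRuleSketch addSubstitution(String regex, String replacement) {
        patterns.add(Pattern.compile(regex));
        replacements.add(replacement);
        return this;
    }

    // Apply every substitution in turn to the running result.
    String normalize(String content) {
        String result = content;
        for (int i = 0; i < patterns.size(); i++) {
            result = patterns.get(i).matcher(result).replaceAll(replacements.get(i));
        }
        return result;
    }

    public static void main(String[] args) {
        RegexRuleSketch lineEndings = new RegexRuleSketch()
            .addSubstitution("\\r\\n", "\n")  // MS-DOS line terminators
            .addSubstitution("\\r", "\n");    // remaining lone carriage returns
        System.out.println(lineEndings.normalize("a\r\nb\rc").equals("a\nb\nc")); // prints true
    }
}
```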
| Constructor Summary | |
|---|---|
| StringNormalizer() | Creates a new StringNormalizer object containing no rules (the "identity" normalizer). |
| StringNormalizer(boolean useStandardRules) | Creates a new StringNormalizer object, optionally containing the standard set of rules. |
| StringNormalizer(Collection<? extends StringNormalizer.NormalizerRule> rules) | Creates a new StringNormalizer object containing the given set of rules. |
| StringNormalizer(StringNormalizer.NormalizerRule... rules) | Creates a new StringNormalizer object containing the given set of rules. |
| StringNormalizer(StringNormalizer.StandardRule... rules) | Creates a new StringNormalizer object containing the given set of rules. |
| Method Summary | |
|---|---|
| boolean | add(StringNormalizer.NormalizerRule rule) - Add the specified rule. |
| void | add(StringNormalizer.StandardRule rule) - Add the specified standard rule, as defined in StringNormalizer.StandardRule. |
| void | addStandardRules() - Add the standard set of rules. |
| String | normalize(String content) - Normalize a string by applying a set of normalization rules (transformations). |
| void | remove(StringNormalizer.StandardRule rule) - Remove the specified standard rule, as defined in StringNormalizer.StandardRule. |
| static StringNormalizer.NormalizerRule | standardRule(StringNormalizer.StandardRule rule) - Retrieve a standard rule by name. |
| Methods inherited from class java.util.ArrayList |
|---|
| add, addAll, addAll, clear, clone, contains, ensureCapacity, get, indexOf, isEmpty, lastIndexOf, remove, remove, set, size, toArray, toArray, trimToSize |

| Methods inherited from class java.util.AbstractList |
|---|
| equals, hashCode, iterator, listIterator, listIterator, subList |

| Methods inherited from class java.util.AbstractCollection |
|---|
| containsAll, removeAll, retainAll, toString |

| Methods inherited from class java.lang.Object |
|---|
| getClass, notify, notifyAll, wait, wait, wait |

| Methods inherited from interface java.util.List |
|---|
| containsAll, equals, hashCode, iterator, listIterator, listIterator, removeAll, retainAll, subList |
| Constructor Detail |
|---|
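public StringNormalizer()
Creates a new StringNormalizer object containing no rules (the "identity" normalizer).

public StringNormalizer(boolean useStandardRules)
Creates a new StringNormalizer object, optionally containing the standard set of rules: all rules in StringNormalizer.StandardRule except the OPT_* rules.
Parameters:
useStandardRules - If true, the set of standard (non-OPT_*) rules will be used. If false, an "identity" normalizer will be produced instead.

public StringNormalizer(StringNormalizer.StandardRule... rules)
Creates a new StringNormalizer object containing the given set of rules.
Parameters:
rules - a (variable-length) comma-separated sequence of rules to add

public StringNormalizer(StringNormalizer.NormalizerRule... rules)
Creates a new StringNormalizer object containing the given set of rules.
Parameters:
rules - a (variable-length) comma-separated sequence of rules to add

public StringNormalizer(Collection<? extends StringNormalizer.NormalizerRule> rules)
Creates a new StringNormalizer object containing the given set of rules.
Parameters:
rules - a collection of rules to add (could be another StringNormalizer, or any other kind of collection)

| Method Detail |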
|---|
public String normalize(String content)
Normalize a string by applying a set of normalization rules (transformations).
Parameters:
content - The string to transform

public void addStandardRules()
Add the standard set of rules: all rules in StringNormalizer.StandardRule except the OPT_* rules.

public void add(StringNormalizer.StandardRule rule)
Add the specified standard rule, as defined in StringNormalizer.StandardRule. Note that you can also use the inherited List.add(Object) method to add custom NormalizerRule objects.
Parameters:
rule - The rule to add

public boolean add(StringNormalizer.NormalizerRule rule)
Add the specified rule.
Specified by:
add in interface Collection<StringNormalizer.NormalizerRule>
add in interface List<StringNormalizer.NormalizerRule>
Overrides:
add in class ArrayList<StringNormalizer.NormalizerRule>
Parameters:
rule - The rule to add

public void remove(StringNormalizer.StandardRule rule)
Remove the specified standard rule, as defined in StringNormalizer.StandardRule. Note that you can also use the inherited List.remove(Object) method to remove other kinds of NormalizerRule objects.
Parameters:
rule - The rule to remove

public static StringNormalizer.NormalizerRule standardRule(StringNormalizer.StandardRule rule)
Retrieve a standard rule by name.
Parameters:
rule - the rule to retrieve
Returns:
the corresponding StringNormalizer.NormalizerRule
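
One plausible way the enum-based add, remove, and standardRule methods could fit together is sketched below using only the JDK. Everything here is an assumption made for illustration: the class StandardRuleLookupSketch, its nested Rule enum, and the lambda bodies are hypothetical and do not reproduce the library's source.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical sketch: a rule list whose enum constants map to shared rule objects.
class StandardRuleLookupSketch extends ArrayList<UnaryOperator<String>> {
    enum Rule { IGNORE_CAPITALIZATION, IGNORE_PUNCTUATION }

    // One shared rule object per enum constant (assumed behavior, for illustration).
    private static final Map<Rule, UnaryOperator<String>> RULES = new EnumMap<>(Rule.class);
    static {
        RULES.put(Rule.IGNORE_CAPITALIZATION, s -> s.toLowerCase(Locale.ROOT));
        RULES.put(Rule.IGNORE_PUNCTUATION, s -> s.replaceAll("\\p{Punct}", ""));
    }

    // Look up the rule object behind an enum constant.
    static UnaryOperator<String> standardRule(Rule rule) {
        return RULES.get(rule);
    }

    // Enum overloads delegate to the inherited list operations.
    void add(Rule rule) {
        add(standardRule(rule));
    }

    void remove(Rule rule) {
        remove(standardRule(rule)); // resolves to the inherited remove(Object)
    }

    // Apply the rules in list order.
    String normalize(String content) {
        String result = content;
        for (UnaryOperator<String> rule : this) {
            result = rule.apply(result);
        }
        return result;
    }

    public static void main(String[] args) {
        StandardRuleLookupSketch norm = new StandardRuleLookupSketch();
        norm.add(Rule.IGNORE_CAPITALIZATION);
        norm.add(Rule.IGNORE_PUNCTUATION);
        System.out.println(norm.normalize("Hello, World!")); // prints hello world
        norm.remove(Rule.IGNORE_PUNCTUATION);
        System.out.println(norm.normalize("Hello, World!")); // prints hello, world!
    }
}
```

Because each enum constant maps to a single shared rule object in this sketch, removing by enum constant finds the same object that was added earlier.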

Last updated: Wed, Apr 1, 2009 12:29 AM EDT