Class IntersectionSimilarity<T>
- java.lang.Object
  - org.apache.commons.text.similarity.IntersectionSimilarity<T>
- Type Parameters:
T - the type of the elements extracted from the character sequence
- All Implemented Interfaces:
SimilarityScore<IntersectionResult>
public class IntersectionSimilarity<T> extends java.lang.Object implements SimilarityScore<IntersectionResult>
Measures the intersection of two sets created from a pair of character sequences.
It is assumed that the type T correctly conforms to the requirements for storage within a Set or HashMap. Ideally the type is immutable and implements Object.equals(Object) and Object.hashCode().
- Since:
1.7
- See Also:
Set, HashMap
-
-
Nested Class Summary
Nested Classes (Modifier and Type, Class, Description)
- private static class IntersectionSimilarity.BagCount - Mutable counter class for storing the count of elements.
- private class IntersectionSimilarity.TinyBag - A minimal implementation of a Bag that can store elements and a count.
-
Constructor Summary
Constructors (Constructor, Description)
- IntersectionSimilarity(java.util.function.Function<java.lang.CharSequence,java.util.Collection<T>> converter) - Create a new intersection similarity using the provided converter.
-
Method Summary
Methods (Modifier and Type, Method, Description)
- IntersectionResult apply(java.lang.CharSequence left, java.lang.CharSequence right) - Calculates the intersection of two character sequences passed as input.
- private static <T> int getIntersection(java.util.Set<T> setA, java.util.Set<T> setB) - Compute the intersection between two sets.
- private int getIntersection(IntersectionSimilarity.TinyBag bagA, IntersectionSimilarity.TinyBag bagB) - Compute the intersection between two bags.
- private IntersectionSimilarity.TinyBag toBag(java.util.Collection<T> objects) - Convert the collection to a bag.
-
-
-
Field Detail
-
converter
private final java.util.function.Function<java.lang.CharSequence,java.util.Collection<T>> converter
The converter used to create the elements from the characters.
-
-
Constructor Detail
-
IntersectionSimilarity
public IntersectionSimilarity(java.util.function.Function<java.lang.CharSequence,java.util.Collection<T>> converter)
Create a new intersection similarity using the provided converter.
If the converter returns a Set then the intersection result will not include duplicates. Any other Collection is used to produce a result that will include duplicates in the intersect and union.
- Parameters:
converter - the converter used to create the elements from the characters
- Throws:
java.lang.IllegalArgumentException - if the converter is null
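The Set versus Collection distinction above depends entirely on what the converter returns. The following is a minimal sketch of two hypothetical converters, using only the JDK; the names ConverterSketch, TO_CHAR_SET and TO_BIGRAM_LIST are illustrative assumptions and are not part of the library. Either function has the shape the constructor expects.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical converters, not part of Commons Text.
class ConverterSketch {
    /** Returns a Set of the unique characters, so duplicates are discarded. */
    static final Function<CharSequence, Collection<Character>> TO_CHAR_SET = cs -> {
        Set<Character> set = new HashSet<>();
        for (int i = 0; i < cs.length(); i++) {
            set.add(cs.charAt(i));
        }
        return set;
    };

    /** Returns a List of overlapping bigrams, so duplicates are kept. */
    static final Function<CharSequence, Collection<String>> TO_BIGRAM_LIST = cs -> {
        List<String> list = new ArrayList<>();
        for (int i = 0; i + 2 <= cs.length(); i++) {
            list.add(cs.subSequence(i, i + 2).toString());
        }
        return list;
    };

    public static void main(String[] args) {
        System.out.println(TO_CHAR_SET.apply("aabb").size());    // 2 unique characters
        System.out.println(TO_BIGRAM_LIST.apply("aaaa").size()); // 3 bigrams, all "aa"
    }
}
```

With the first converter the element counts are sizes of sets; with the second, repeated bigrams contribute to the counts.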
-
-
Method Detail
-
apply
public IntersectionResult apply(java.lang.CharSequence left, java.lang.CharSequence right)
Calculates the intersection of two character sequences passed as input.
- Specified by:
apply in interface SimilarityScore<IntersectionResult>
- Parameters:
left - first character sequence
right - second character sequence
- Returns:
The intersection result
- Throws:
java.lang.IllegalArgumentException - if either input sequence is null
-
toBag
private IntersectionSimilarity.TinyBag toBag(java.util.Collection<T> objects)
Convert the collection to a bag. The bag will contain the count of each element in the collection.
- Parameters:
objects - the objects
- Returns:
The bag
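As an illustration of the counting described for toBag, here is a minimal sketch that uses a plain HashMap as a stand-in for the private TinyBag class. The class name BagSketch and the Map representation are assumptions made for this example; the library's internal implementation may differ.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

// Illustrative stand-in for the private TinyBag; not the library's code.
class BagSketch {
    /** Maps each distinct element to the number of times it occurs. */
    static <T> Map<T, Integer> toBag(Collection<T> objects) {
        Map<T, Integer> bag = new HashMap<>();
        for (T t : objects) {
            bag.merge(t, 1, Integer::sum); // insert 1, or add 1 to the existing count
        }
        return bag;
    }

    public static void main(String[] args) {
        Map<String, Integer> bag = toBag(Arrays.asList("aa", "ab", "aa"));
        System.out.println(bag.get("aa")); // 2
        System.out.println(bag.get("ab")); // 1
    }
}
```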
-
getIntersection
private static <T> int getIntersection(java.util.Set<T> setA, java.util.Set<T> setB)
Compute the intersection between two sets. This is the count of all the elements that are within both sets.
- Type Parameters:
T - the type of the elements in the set
- Parameters:
setA - the set A
setB - the set B
- Returns:
The intersection
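The set intersection count described above could be sketched as follows. This is an illustration under assumptions, not the library's private method: the class name SetIntersectionSketch is hypothetical, and iterating over the smaller set is a design choice of this sketch that keeps the number of lookups low.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch; not the library's private getIntersection.
class SetIntersectionSketch {
    /** Counts the elements that are present in both sets. */
    static <T> int intersectionSize(Set<T> setA, Set<T> setB) {
        // Iterate over the smaller set and probe the larger one.
        Set<T> smaller = setA.size() <= setB.size() ? setA : setB;
        Set<T> larger = smaller == setA ? setB : setA;
        int count = 0;
        for (T t : smaller) {
            if (larger.contains(t)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Set<String> a = new HashSet<>(Arrays.asList("a", "b", "c"));
        Set<String> b = new HashSet<>(Arrays.asList("b", "c", "d", "e"));
        System.out.println(intersectionSize(a, b)); // 2 ("b" and "c")
    }
}
```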
-
getIntersection
private int getIntersection(IntersectionSimilarity.TinyBag bagA, IntersectionSimilarity.TinyBag bagB)
Compute the intersection between two bags. This is the sum of the minimum count of each element that is within both bags.
- Parameters:
bagA - the bag A
bagB - the bag B
- Returns:
The intersection
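The "sum of the minimum count" rule can be illustrated with a short sketch. It assumes each bag is represented as a Map from element to count, as a stand-in for the private TinyBag; the class name BagIntersectionSketch is hypothetical and the library's internals may differ.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch; bags are modelled as element-to-count maps.
class BagIntersectionSketch {
    /** Sums min(countA, countB) over every element present in both bags. */
    static <T> int intersectionSize(Map<T, Integer> bagA, Map<T, Integer> bagB) {
        int total = 0;
        for (Map.Entry<T, Integer> entry : bagA.entrySet()) {
            Integer countB = bagB.get(entry.getKey());
            if (countB != null) { // element occurs in both bags
                total += Math.min(entry.getValue(), countB);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Integer> bagA = new HashMap<>();
        bagA.put("aa", 3);
        bagA.put("ab", 1);
        Map<String, Integer> bagB = new HashMap<>();
        bagB.put("aa", 2);
        bagB.put("bc", 4);
        // Only "aa" is shared: min(3, 2) = 2
        System.out.println(intersectionSize(bagA, bagB)); // 2
    }
}
```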
-
-