Jaccard Class Reference

Class for storing and computing the Jaccard index (Tanimoto coefficient).

Inheritance diagram for Jaccard:
Jaccard -> BinaryDataCorrelationMatrix -> CorrelationMatrix


Public Member Functions

void AddEntity (int entity_id)
 Add an entity to the CorrelationMatrix by growing it to the requested size.
override void ComputeCorrelations (IBooleanMatrix entity_data)
 Compute the correlations from an implicit feedback, positive-only dataset.
int[] GetNearestNeighbors (int entity_id, uint k)
 Get the k nearest neighbors of a given entity.
IList< int > GetPositivelyCorrelatedEntities (int entity_id)
 Get all entities that are positively correlated to an entity, sorted by correlation.
 Jaccard (int num_entities)
 Creates an object of type Jaccard.
double SumUp (int entity_id, ICollection< int > entities)
 Sum up the correlations between a given entity and the entities in a collection.
void Write (StreamWriter writer)
 Write out the correlations to a StreamWriter.

Static Public Member Functions

static float ComputeCorrelation (HashSet< int > vector_i, HashSet< int > vector_j)
 Computes the Jaccard index of two binary vectors.
static CorrelationMatrix Create (int num_entities)
 Creates a correlation matrix.
static CorrelationMatrix Create (IBooleanMatrix vectors)
 Creates a Jaccard index matrix from given data.
static CorrelationMatrix ReadCorrelationMatrix (StreamReader reader)
 Creates a CorrelationMatrix from the lines of a StreamReader.

Protected Attributes

int num_entities
 Number of entities, e.g. users or items.

Properties

override bool IsSymmetric [get]
 Returns true if the matrix is symmetric, which is generally the case for similarity matrices.

Detailed Description

Class for storing and computing the Jaccard index (Tanimoto coefficient).

The Jaccard index is also often called the Tanimoto coefficient.

http://en.wikipedia.org/wiki/Jaccard_index
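
For reference, the standard definition of the Jaccard index can be written as a formula. Here A and B stand for the sets of indices at which two binary vectors are set; the symbols A, B and J are notation introduced for this note, not names from the library.

```latex
J(A, B) = \frac{|A \cap B|}{|A \cup B|} = \frac{|A \cap B|}{|A| + |B| - |A \cap B|}
```

For example, A = {1, 2, 3} and B = {2, 3, 4} share two of four distinct indices, so J(A, B) = 2/4 = 0.5. The formula leaves the case of two empty sets undefined.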


Constructor & Destructor Documentation

Jaccard ( int  num_entities  )  [inline]

Creates an object of type Jaccard.

Parameters:
num_entities the number of entities

Member Function Documentation

void AddEntity ( int  entity_id  )  [inline, inherited]

Add an entity to the CorrelationMatrix by growing it to the requested size.

Note that you still have to correctly compute and set the entity's correlation values.

Parameters:
entity_id the numerical ID of the entity
static float ComputeCorrelation ( HashSet< int >  vector_i,
HashSet< int >  vector_j 
) [inline, static]

Computes the Jaccard index of two binary vectors.

Parameters:
vector_i the first vector
vector_j the second vector
Returns:
the Jaccard index of the two vectors
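
The computation described above can be sketched as follows. This is a minimal illustration, not the library's source: the class name JaccardSketch is hypothetical, each binary vector is assumed to be given as the set of indices that are true, and returning 0 for two empty sets is an assumption of this sketch.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class for illustration; not part of MyMediaLite.
public static class JaccardSketch
{
    // Jaccard index of two binary vectors, each given as the set of indices that are true.
    public static float ComputeCorrelation(HashSet<int> vector_i, HashSet<int> vector_j)
    {
        // Count the indices set in both vectors.
        int intersection = 0;
        foreach (int index in vector_i)
            if (vector_j.Contains(index))
                intersection++;

        // |A union B| = |A| + |B| - |A intersect B|
        int union = vector_i.Count + vector_j.Count - intersection;

        // Assumption of this sketch: two empty vectors get correlation 0.
        return union == 0 ? 0f : (float) intersection / union;
    }
}
```

With the index sets {1, 2, 3} and {2, 3, 4}, the intersection has two elements and the union four, so the sketch returns 0.5.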
override void ComputeCorrelations ( IBooleanMatrix  entity_data  )  [inline, virtual]

Compute the correlations from an implicit feedback, positive-only dataset.

Parameters:
entity_data the implicit feedback set, rows contain the entities to correlate

Implements BinaryDataCorrelationMatrix.
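
As a rough sketch of what computing all correlations from positive-only data involves, the code below fills a dense matrix with the Jaccard index of every pair of rows. It does not use IBooleanMatrix: the class name PairwiseJaccardSketch, the representation of rows as a list of index sets, the dense float[,] result, and the diagonal value of 1 are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class for illustration; not part of MyMediaLite.
public static class PairwiseJaccardSketch
{
    // rows[i] holds the indices that are true for entity i.
    public static float[,] ComputeCorrelations(IList<HashSet<int>> rows)
    {
        int n = rows.Count;
        var matrix = new float[n, n];

        for (int i = 0; i < n; i++)
        {
            matrix[i, i] = 1f; // assumption: an entity is fully correlated with itself

            for (int j = i + 1; j < n; j++)
            {
                int intersection = rows[i].Count(index => rows[j].Contains(index));
                int union = rows[i].Count + rows[j].Count - intersection;
                float jaccard = union == 0 ? 0f : (float) intersection / union;

                // The Jaccard index is symmetric, so each pair is computed once.
                matrix[i, j] = jaccard;
                matrix[j, i] = jaccard;
            }
        }
        return matrix;
    }
}
```

Computing each pair once and mirroring the value relies on the symmetry of the Jaccard index, which matches the IsSymmetric property documented above.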

static CorrelationMatrix Create ( int  num_entities  )  [inline, static, inherited]

Creates a correlation matrix.

Prints a useful warning if there is not enough memory.

Parameters:
num_entities the number of entities
Returns:
the correlation matrix
static CorrelationMatrix Create ( IBooleanMatrix  vectors  )  [inline, static]

Creates a Jaccard index matrix from given data.

Parameters:
vectors the boolean data
Returns:
the similarity matrix based on the data
int[] GetNearestNeighbors ( int  entity_id,
uint  k 
) [inline, inherited]

Get the k nearest neighbors of a given entity.

Parameters:
entity_id the numerical ID of the entity
k the neighborhood size
Returns:
an array containing the numerical IDs of the k nearest neighbors
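
One way to picture a k-nearest-neighbor lookup on a correlation matrix is sketched below. The class name NeighborSketch and the dense float[,] input are hypothetical, and two behaviors are assumptions rather than documented facts: the entity itself is excluded, and neighbors are returned in order of decreasing correlation.

```csharp
using System;
using System.Linq;

// Hypothetical helper class for illustration; not part of MyMediaLite.
public static class NeighborSketch
{
    // correlation[i, j] holds the correlation between entities i and j.
    public static int[] GetNearestNeighbors(float[,] correlation, int entity_id, uint k)
    {
        int num_entities = correlation.GetLength(0);

        return Enumerable.Range(0, num_entities)
            .Where(other => other != entity_id)                         // assumption: skip the entity itself
            .OrderByDescending(other => correlation[entity_id, other]) // most similar first
            .Take((int) k)                                              // fewer than k if not enough entities
            .ToArray();
    }
}
```

If fewer than k other entities exist, the sketch returns all of them rather than failing.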
IList<int> GetPositivelyCorrelatedEntities ( int  entity_id  )  [inline, inherited]

Get all entities that are positively correlated to an entity, sorted by correlation.

Parameters:
entity_id the entity ID
Returns:
a sorted list of all entities that are positively correlated to entity_id
static CorrelationMatrix ReadCorrelationMatrix ( StreamReader  reader  )  [inline, static, inherited]

Creates a CorrelationMatrix from the lines of a StreamReader.

The first line is expected to contain the number of entities. All other lines have the format

		      EntityID1 EntityID2 Correlation
		    

where EntityID1 and EntityID2 are non-negative integers and Correlation is a floating point number.

Parameters:
reader the StreamReader to read from
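
The file format described above could be parsed roughly as follows. This is a sketch under stated assumptions, not the library's reader: the class name CorrelationFileSketch and the dense float[,] result are hypothetical, fields are assumed to be separated by spaces or tabs, numbers are assumed to use the invariant culture, and mirroring each value to both [i, j] and [j, i] is an assumption. It accepts any TextReader, which includes StreamReader.

```csharp
using System;
using System.Globalization;
using System.IO;

// Hypothetical helper class for illustration; not part of MyMediaLite.
public static class CorrelationFileSketch
{
    public static float[,] Read(TextReader reader)
    {
        // First line: the number of entities.
        int num_entities = int.Parse(reader.ReadLine(), CultureInfo.InvariantCulture);
        var matrix = new float[num_entities, num_entities];

        string line;
        while ((line = reader.ReadLine()) != null)
        {
            string[] fields = line.Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries);
            if (fields.Length == 0)
                continue; // skip blank lines
            if (fields.Length != 3)
                throw new IOException("Expected 'EntityID1 EntityID2 Correlation', got: " + line);

            int i = int.Parse(fields[0], CultureInfo.InvariantCulture);
            int j = int.Parse(fields[1], CultureInfo.InvariantCulture);
            float correlation = float.Parse(fields[2], CultureInfo.InvariantCulture);

            matrix[i, j] = correlation;
            matrix[j, i] = correlation; // assumption: each pair is listed once for a symmetric matrix
        }
        return matrix;
    }
}
```

Parsing with the invariant culture keeps a value such as 0.5 readable on systems whose locale uses a decimal comma.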
double SumUp ( int  entity_id,
ICollection< int >  entities 
) [inline, inherited]

Sum up the correlations between a given entity and the entities in a collection.

Parameters:
entity_id the numerical ID of the entity
entities a collection containing the numerical IDs of the entities to compare to
Returns:
the correlation sum
void Write ( StreamWriter  writer  )  [inline, inherited]

Write out the correlations to a StreamWriter.

Parameters:
writer the StreamWriter to write to

Member Data Documentation

int num_entities [protected, inherited]

Number of entities, e.g. users or items.


Generated on Thu Apr 5 01:11:31 2012 for MyMediaLite by  doxygen 1.6.3