Thursday, May 21, 2020

Local Outlier Factor - A simple GO implementation



In this post we will review the LOF: Local Outlier Factor algorithm.
For full source of the example, see my GO sources at https://github.com/alonana/lof
An example of using the LOF is available in this post.

The LOF is based on several terms:
  • reachability distance
  • LRD
  • LOF
To implement LOF, we'll first implement K-Nearest-Neighbors detection.
Once we have the list of k nearest neighbors for each point, we can run the LOF algorithm.

The LOF provides score for each point.
A LOF score for an outlier would be higher 1.
How high?
It depends on you data, but the more clear outlier, the higher the score would be.


The LOF calculation is based on the following pseudo code:

func LOF(p) {
  sum=0
  foreach neighbor in neighbors(point) {
     sum += LRD(neighbor)
  }
  return sum / (k * LRD(p))
}

The LRD calculation is based on the following pseudo code:

func LRD(p) {
  sum=0
  foreach neighbor in neighbors(point) {
     sum += reachabilityDistance(p,neighbor)
  }
  return k / sum
}


The reachability distance calculation is based on the following pseudo code:

func reachabilityDistance(a, b) {
  return max(Distance(a,b), kDistance(b))
}

The k-distance is based on the following pseudo code:

func kDistance(p) {
  highest=0
  foreach neighbor in neighbors(point) {
     highest = max(highest, Distance(p,neighbor))
  }
  return highest
}

Again, notice that the "neighbors" used in the pseudo code, returns only the k-nearest points, and not all the neighbors.



No comments:

Post a Comment