
How far away?

A few kinds of distance

· Data mining

How close are we? It sounds like a romantic question to answer. Believe it or not, there are dozens of ways to measure the distance between you and me.


First, Euclidean distance is the simplest way to measure the distance between two points. It is the length of the straight line connecting them, so it is never negative, and it is given by the following equation:

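For two points $p = (p_1, \dots, p_n)$ and $q = (q_1, \dots, q_n)$:

$$d(p, q) = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2}$$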
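To make the formula concrete, here is a minimal Python sketch. The function name `euclidean_distance` and the sample points are illustrative choices of mine, not something from the original post.

```python
import math


def euclidean_distance(p, q):
    """Straight-line distance between two points with the same number of coordinates."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    # Square each coordinate difference, add them up, then take the square root.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


print(euclidean_distance((0, 0), (3, 4)))  # 5.0
```

The points (0, 0) and (3, 4) form the classic 3-4-5 right triangle, so the result is easy to check by hand.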

Manhattan distance, also called the taxicab metric, is the sum of the absolute differences between the two points' coordinates. If the green line represents the Euclidean distance, then the red, blue, and yellow lines are all Manhattan paths, and all three have the same length between the two points.

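[Figure: Euclidean distance (green line) compared with Manhattan paths (red, blue, and yellow lines) between the same two points]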
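The same idea can be sketched in a few lines of Python. Again, `manhattan_distance` and the sample points are just illustrative names and values I picked.

```python
def manhattan_distance(p, q):
    """Sum of the absolute coordinate differences between two points."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    # Walk along each axis separately and add up the steps.
    return sum(abs(a - b) for a, b in zip(p, q))


print(manhattan_distance((0, 0), (3, 4)))  # 7
```

For the same pair of points, the straight-line (Euclidean) distance is 5, so the taxicab route is longer, as you would expect when you can only move along the grid.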

There are also two measures that describe similarity rather than distance. The first is cosine similarity, whose value ranges from -1 to 1: 1 means the two vectors point in the same direction, 0 means they are orthogonal (often read as unrelated), and -1 means they point in opposite directions.

For two vectors $A$ and $B$ separated by an angle $\theta$:

$$\cos\theta = \frac{A \cdot B}{\|A\| \, \|B\|}$$
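Here is a small Python sketch of cosine similarity, written as the dot product divided by the product of the vector lengths. The function name `cosine_similarity` is illustrative, and raising an error for a zero vector is my own assumption, since the measure is undefined there.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of components")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Assumption: treat a zero vector as an error, because it has no direction.
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


print(cosine_similarity((1, 0), (2, 0)))   # 1.0  (same direction)
print(cosine_similarity((1, 0), (0, 3)))   # 0.0  (orthogonal)
print(cosine_similarity((1, 0), (-4, 0)))  # -1.0 (opposite direction)
```

Notice that the lengths of the vectors do not matter, only their directions: (1, 0) and (2, 0) still score 1.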

Second, I would like to introduce Jaccard similarity. It is used to describe the similarity between two sets, such as two samples. The Jaccard coefficient ranges from 0 to 1: 0 means the two sets do not overlap at all, while 1 means they are exactly the same.

For two sets $A$ and $B$:

$$J(A, B) = \frac{|A \cap B|}{|A \cup B|}$$
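As a quick illustration, the Jaccard coefficient is the size of the intersection divided by the size of the union. In this sketch the name `jaccard_similarity` is my own, and raising an error when both sets are empty is an assumption (some definitions return 1 in that case instead).

```python
def jaccard_similarity(a, b):
    """Size of the intersection divided by the size of the union."""
    a, b = set(a), set(b)
    union = a | b
    # Assumption: with two empty sets the ratio is 0/0, so treat it as an error.
    if not union:
        raise ValueError("Jaccard similarity is undefined for two empty sets")
    return len(a & b) / len(union)


print(jaccard_similarity({"a", "b", "c"}, {"b", "c", "d"}))  # 0.5
```

The two sets share two items ("b" and "c") out of four distinct items overall, which gives 2 / 4 = 0.5.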