In this post we will review the relative variability (also known as the coefficient of variation), and check how we can use it to compare the variance of 2 different sets.
First, let's create a function that receives a set of numbers. The snippets in this post use the fmt, math and math/rand packages:
func stats(numbers []float64) {
    // ...
}
Calculate the mean:
sum := float64(0)
for _, element := range numbers {
    sum += element
}
mean := sum / float64(len(numbers))
fmt.Printf("mean: %v\n", mean)
The variance and the standard deviation:
variance := float64(0)
for _, element := range numbers {
    distanceSqr := math.Pow(element-mean, 2)
    variance += distanceSqr
}
variance = variance / float64(len(numbers))
std := math.Sqrt(variance)
fmt.Printf("variance: %v\nstd: %v\n", variance, std)
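And lastly, the relative variability, which is the standard deviation divided by the absolute value of the mean:
// Note: the result is Inf or NaN when the mean is zero.
relativeVariability := std / math.Abs(mean)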
fmt.Printf("relative variability: %v\n", relativeVariability)
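The fragments above can be assembled into a single function. The following is a minimal sketch of that assembly, not the post's exact code: the `Summary` type and the `summarize` name are assumptions made for illustration, and the function returns the values instead of printing them so they are easy to check.

```go
package main

import (
	"fmt"
	"math"
)

// Summary holds the four values that the post's stats function prints.
// The type and the name summarize are illustrative, not from the post.
type Summary struct {
	Mean, Variance, Std, RelativeVariability float64
}

// summarize uses the population variance (dividing by n), as the post does.
// An empty slice or a mean of zero yields NaN or Inf values.
func summarize(numbers []float64) Summary {
	n := float64(len(numbers))

	sum := 0.0
	for _, element := range numbers {
		sum += element
	}
	mean := sum / n

	variance := 0.0
	for _, element := range numbers {
		d := element - mean
		variance += d * d
	}
	variance /= n

	std := math.Sqrt(variance)
	return Summary{mean, variance, std, std / math.Abs(mean)}
}

func main() {
	s := summarize([]float64{2, 4, 4, 4, 5, 5, 7, 9})
	fmt.Printf("mean: %v\nvariance: %v\nstd: %v\nrelative variability: %v\n",
		s.Mean, s.Variance, s.Std, s.RelativeVariability)
}
```

For this small input the mean is 5, the variance is 4, the standard deviation is 2, and the relative variability is 0.4.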
The relative variability can be used to compare different sets. Let's examine some examples.
The basic example is a constant set, in which all the numbers are identical.
fmt.Println("=== constant ===")
numbers := make([]float64, 1000)
for i := 0; i < len(numbers); i++ {
    numbers[i] = 100
}
stats(numbers)
And the result is:
=== constant ===
mean: 100
variance: 0
std: 0
relative variability: 0
This is as expected. But let us check two other sets: one uses random numbers in the range 0-100, and the other uses random numbers in the range 0-1000.
fmt.Println("=== random 0-100 ===")
numbers = make([]float64, 1000)
for i := 0; i < len(numbers); i++ {
    numbers[i] = rand.Float64() * 100
}
stats(numbers)
fmt.Println("=== random 0-1000 ===")
numbers = make([]float64, 1000)
for i := 0; i < len(numbers); i++ {
    numbers[i] = rand.Float64() * 1000
}
stats(numbers)
And the result is:
=== random 0-100 ===
mean: 50.761294848805164
variance: 855.2429996004388
std: 29.24453794472463
relative variability: 0.5761188328987829
=== random 0-1000 ===
mean: 504.7749855147492
variance: 84752.21156230739
std: 291.122330923458
relative variability: 0.5767368417168754
We can see that the relative variability of the 2 sets is almost identical, even though their variances differ by a factor of about 100. This lets us deduce that, relative to its mean, each set varies in a similar way.
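The value of roughly 0.576 is no accident. For a uniform distribution on [0, a] the mean is a/2 and the standard deviation is a/sqrt(12), so their ratio is 1/sqrt(3), about 0.5774, whatever a is. The sketch below checks this for the two ranges used here. It assumes the random numbers are uniformly distributed (which is what rand.Float64 produces), and the function name `uniformRelativeVariability` is made up for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// uniformRelativeVariability returns std/mean for a uniform distribution on
// [0, upper]: the mean is upper/2 and the standard deviation is upper/sqrt(12).
// The name is illustrative and does not appear in the post.
func uniformRelativeVariability(upper float64) float64 {
	mean := upper / 2
	std := upper / math.Sqrt(12)
	return std / mean
}

func main() {
	for _, upper := range []float64{100, 1000} {
		fmt.Printf("upper: %v relative variability: %.4f\n",
			upper, uniformRelativeVariability(upper))
	}
}
```

Both ranges print 0.5774, because the scale factor cancels out in the ratio. The measured values of 0.5761 and 0.5767 are sample estimates of this number.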
What about samples of different sizes?
In the following, we will use different sections of the same set.
numbers = make([]float64, 100000)
for i := 0; i < len(numbers); i++ {
    numbers[i] = rand.Float64() * 100
}
fmt.Println("=== random using sections - size 10 ===")
stats(numbers[:10])
fmt.Println("=== random using sections - size 100 ===")
stats(numbers[:100])
fmt.Println("=== random using sections - size 1000 ===")
stats(numbers[:1000])
fmt.Println("=== random using sections - size 10000 ===")
stats(numbers[:10000])
fmt.Println("=== random using sections - size 100000 ===")
stats(numbers)
And the result is:
=== random using sections - size 10 ===
mean: 39.58300235324653
variance: 768.6101314621283
std: 27.723818847015437
relative variability: 0.7003970694189037
=== random using sections - size 100 ===
mean: 46.59611065676921
variance: 968.7449099481023
std: 31.124667226303036
relative variability: 0.6679670639373723
=== random using sections - size 1000 ===
mean: 49.66072506547081
variance: 814.3409641880382
std: 28.536660004072626
relative variability: 0.5746323672570423
=== random using sections - size 10000 ===
mean: 49.99659036064225
variance: 833.8674856543009
std: 28.876763766985746
relative variability: 0.5775746617656908
=== random using sections - size 100000 ===
mean: 50.05194561771965
variance: 831.7444664704859
std: 28.839980347955958
relative variability: 0.5762009846375657
So we can see that above a certain set size, the relative variability becomes stable: from 1000 elements and up it stays between 0.57 and 0.58, while the sets of size 10 and 100 give about 0.70 and 0.67. It looks like small sets suffer from an unstable variance. In some runs the relative variability of the set of size 10 was very high, while in others it was very low. So to use relative variability, or actually any statistical method, make sure the set is big enough.
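To see how much the small sets vary from run to run, one option is to draw many independent samples of each size and record the smallest and largest relative variability. The following is a rough sketch of that idea. The helper names `relativeVariability` and `spread`, the fixed seed, and the choice of 200 trials are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
)

// relativeVariability returns std/|mean| using the population variance.
func relativeVariability(numbers []float64) float64 {
	n := float64(len(numbers))
	sum := 0.0
	for _, element := range numbers {
		sum += element
	}
	mean := sum / n
	variance := 0.0
	for _, element := range numbers {
		d := element - mean
		variance += d * d
	}
	return math.Sqrt(variance/n) / math.Abs(mean)
}

// spread draws `trials` independent samples of `size` random numbers in the
// range 0-100 and returns the smallest and largest relative variability seen.
// The seed is fixed so that repeated runs give the same numbers.
func spread(seed int64, size, trials int) (lo, hi float64) {
	rng := rand.New(rand.NewSource(seed))
	lo, hi = math.Inf(1), math.Inf(-1)
	for t := 0; t < trials; t++ {
		sample := make([]float64, size)
		for i := range sample {
			sample[i] = rng.Float64() * 100
		}
		rv := relativeVariability(sample)
		lo = math.Min(lo, rv)
		hi = math.Max(hi, rv)
	}
	return lo, hi
}

func main() {
	for _, size := range []int{10, 100, 1000, 10000} {
		lo, hi := spread(1, size, 200)
		fmt.Printf("size %d: min %.3f max %.3f range %.3f\n", size, lo, hi, hi-lo)
	}
}
```

The range between the minimum and the maximum should shrink as the size grows, which gives a rough feel for how large a set needs to be before a single measurement can be trusted.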