Wednesday, August 11, 2021

Relative Variability

 

In this post we will review relative variability, also known as the coefficient of variation, and check how we can use it to compare the variance of two different sets.

First let's create a function that gets a set of numbers:

func stats(numbers []float64) {
	// ...
}


Calculate the mean:


sum := float64(0)
for _, element := range numbers {
	sum += element
}
mean := sum / float64(len(numbers))
fmt.Printf("mean: %v\n", mean)



Then the population variance (dividing by the number of elements) and the standard deviation:


variance := float64(0)
for _, element := range numbers {
	distanceSqr := math.Pow(element-mean, 2)
	variance += distanceSqr
}
variance = variance / float64(len(numbers))
std := math.Sqrt(variance)
fmt.Printf("variance: %v\nstd: %v\n", variance, std)



And lastly, the relative variability, which is the standard deviation divided by the absolute value of the mean:


relativeVariability := std / math.Abs(mean)
fmt.Printf("relative variability: %v\n", relativeVariability)


The relative variability can be used to compare different sets. Let's examine some examples.


The basic example is a set in which all the numbers are identical.



fmt.Println("=== constant ===")
numbers := make([]float64, 1000)
for i := 0; i < len(numbers); i++ {
	numbers[i] = 100
}
stats(numbers)


And the result is:



=== constant ===
mean: 100
variance: 0
std: 0
relative variability: 0



This is as expected. Now let us check two other sets: one uses random numbers in the range 0-100, and the other uses random numbers in the range 0-1000.


fmt.Println("=== random 0-100 ===")
numbers = make([]float64, 1000)
for i := 0; i < len(numbers); i++ {
	numbers[i] = rand.Float64() * 100
}
stats(numbers)

fmt.Println("=== random 0-1000 ===")
numbers = make([]float64, 1000)
for i := 0; i < len(numbers); i++ {
	numbers[i] = rand.Float64() * 1000
}
stats(numbers)



And the result is:

=== random 0-100 ===
mean: 50.761294848805164
variance: 855.2429996004388
std: 29.24453794472463
relative variability: 0.5761188328987829
=== random 0-1000 ===
mean: 504.7749855147492
variance: 84752.21156230739
std: 291.122330923458
relative variability: 0.5767368417168754



We can see that the relative variability of the two sets is almost identical, even though the variance differs by a factor of about 100. This lets us deduce that the two sets vary in a similar way relative to their means.


What about slicing different sizes of samples?
In the following, we will use different sections of the same set.



numbers = make([]float64, 100000)
for i := 0; i < len(numbers); i++ {
	numbers[i] = rand.Float64() * 100
}
fmt.Println("=== random using sections - size 10 ===")
stats(numbers[:10])
fmt.Println("=== random using sections - size 100 ===")
stats(numbers[:100])
fmt.Println("=== random using sections - size 1000 ===")
stats(numbers[:1000])
fmt.Println("=== random using sections - size 10000 ===")
stats(numbers[:10000])
fmt.Println("=== random using sections - size 100000 ===")
stats(numbers)



And the result is:

=== random using sections - size 10 ===
mean: 39.58300235324653
variance: 768.6101314621283
std: 27.723818847015437
relative variability: 0.7003970694189037
=== random using sections - size 100 ===
mean: 46.59611065676921
variance: 968.7449099481023
std: 31.124667226303036
relative variability: 0.6679670639373723
=== random using sections - size 1000 ===
mean: 49.66072506547081
variance: 814.3409641880382
std: 28.536660004072626
relative variability: 0.5746323672570423
=== random using sections - size 10000 ===
mean: 49.99659036064225
variance: 833.8674856543009
std: 28.876763766985746
relative variability: 0.5775746617656908
=== random using sections - size 100000 ===
mean: 50.05194561771965
variance: 831.7444664704859
std: 28.839980347955958
relative variability: 0.5762009846375657


So we can see that above a certain set size, the relative variability becomes stable: from size 1000 and up it stays at about 0.57 to 0.58. Small sets suffer from unstable variance. In some runs the relative variability of the set of size 10 was very high, while in others it was very low. So to use relative variability, or actually any statistical method, make sure the set is big enough.





