Algorithm: Coefficient of Variation
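1 Statistical interpretation
Coefficient of variation (CV), also called the dispersion coefficient, is a dimensionless statistic that measures how spread out a dataset is. Its value is the ratio of the standard deviation to the mean.
The coefficient of variation is computed as:
CV = standard deviation / mean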
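Because the standard deviation and the mean carry the same unit, the unit cancels in the ratio. The sketch below illustrates this with made-up height data expressed in two units; the helper name `cv` and the sample values are assumptions for illustration only, and the population standard deviation (`statistics.pstdev`) is used.

```python
import math
import statistics


def cv(data):
    """Population coefficient of variation: pstdev / mean (illustrative helper)."""
    return statistics.pstdev(data) / statistics.mean(data)


heights_m = [1.60, 1.70, 1.80]              # hypothetical heights in metres
heights_cm = [h * 100 for h in heights_m]   # the same heights in centimetres

cv_m = cv(heights_m)
cv_cm = cv(heights_cm)

# The standard deviation changes with the unit, the CV does not
# (up to floating-point rounding).
print(statistics.pstdev(heights_m), statistics.pstdev(heights_cm))
print(cv_m, cv_cm, math.isclose(cv_m, cv_cm))
```

Rescaling the data changes the standard deviation by the same factor as the mean, so the ratio stays the same.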
2 Implementation
2.1 Implementation using numpy
import numpy


def coefficient_of_variation(data):
    mean = numpy.mean(data)          # arithmetic mean
    std = numpy.std(data, ddof=0)    # population standard deviation (ddof=0)
    cv = std / mean
    return cv


data_test_1 = [1, 2, 3, 4, 5, 6, 7]
data_test_2 = [1, 1, 1, 4, 7, 7, 7]
print('CV_1', coefficient_of_variation(data_test_1))
print('CV_2', coefficient_of_variation(data_test_2))
Output
CV_1 0.5
CV_2 0.6943650748294136
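The results above use `ddof=0`, i.e. the population standard deviation. If the data is treated as a sample, `ddof=1` may be more appropriate, and the CV comes out slightly larger. A minimal sketch of the difference, assuming a hypothetical helper `coefficient_of_variation_ddof` that is not part of the original code:

```python
import numpy


def coefficient_of_variation_ddof(data, ddof=0):
    """CV with a selectable degrees-of-freedom correction (hypothetical helper)."""
    return numpy.std(data, ddof=ddof) / numpy.mean(data)


data = [1, 2, 3, 4, 5, 6, 7]
cv_population = coefficient_of_variation_ddof(data, ddof=0)  # divides by n
cv_sample = coefficient_of_variation_ddof(data, ddof=1)      # divides by n - 1

print('population CV', cv_population)
print('sample CV', cv_sample)
```

Which variant to use depends on whether the data is the full population or a sample drawn from it; the rest of this article uses `ddof=0`.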
2.2 Simple implementation (no dependencies)
# Calculate the arithmetic mean.
def mean(values):
    """
    Return the arithmetic mean of values (0 for an empty list).
    Args:
        values: list, e.g. [1, 2, 3, 4]
    """
    if not len(values):
        return 0
    arithmetic_mean = sum(values) / float(len(values))
    return arithmetic_mean
# Calculate the population standard deviation.
def standard_deviation(values):
    """
    Return the population standard deviation of values (0 for an empty list).
    Args:
        values: list, e.g. [1, 2, 3, 4]
    """
    if not len(values):
        return 0
    value_mean = mean(values)
    value_sum = 0
    for value in values:
        value_sum += (value - value_mean) ** 2   # accumulate squared deviations
    x = (value_sum / float(len(values))) ** 0.5
    return x
# Calculate the coefficient of variation.
def coefficient_of_variation(values):
    """
    Return the coefficient of variation of values (0 for an empty list).
    Raises ZeroDivisionError when the mean is 0, where the CV is undefined.
    Args:
        values: list, e.g. [1, 2, 3, 4]
    """
    if not len(values):
        return 0
    value_mean = mean(values)
    cv = standard_deviation(values) / float(value_mean)
    return cv
if __name__ == "__main__":
    assert coefficient_of_variation([1, 1, 1]) == 0
    values = [1, 2, 3, 4, 5, 6, 7]
    assert mean(values) == 4
    assert standard_deviation(values) == 2
    assert coefficient_of_variation(values) == 0.5
    values = [1, 1, 1, 4, 7, 7, 7]
    print(coefficient_of_variation(values))
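3 Practical applications
The coefficient of variation is typically used to compare the dispersion of two datasets whose units or scales differ markedly, for example the dispersion of like counts on posts from two social media accounts with very different follower counts.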
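The social media comparison can be sketched as follows. The like counts are invented purely for illustration, and the helper name `cv` is an assumption; the population standard deviation is used, as in the implementations above.

```python
import statistics


def cv(data):
    """Population coefficient of variation: pstdev / mean (illustrative helper)."""
    return statistics.pstdev(data) / statistics.mean(data)


# Hypothetical like counts per post for two accounts of very different size.
small_account_likes = [5, 20, 8, 40, 12]
large_account_likes = [9500, 10200, 9800, 10500, 10000]

std_small = statistics.pstdev(small_account_likes)
std_large = statistics.pstdev(large_account_likes)
cv_small = cv(small_account_likes)
cv_large = cv(large_account_likes)

# In absolute terms the large account varies more (larger standard deviation),
# but relative to its own mean it is far more consistent (smaller CV).
print('std:', std_small, std_large)
print('CV: ', cv_small, cv_large)
```

With these made-up numbers, comparing standard deviations alone would suggest the large account is more volatile, while the CV shows the opposite once each account's typical like count is taken into account.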