In Python, the following methods can be used to remove outliers from a data set:
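The simplest approach is a fixed threshold, keeping only the values that do not exceed it:

data = [1, 2, 3, 4, 5, 100, 6, 7, 8, 200]
threshold = 10  # largest value still treated as normal
cleaned_data = [x for x in data if x <= threshold]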
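
The fixed threshold filter above can be wrapped in a small reusable function. This is only a minimal sketch: the name `filter_by_threshold` and its `upper` parameter are hypothetical and not part of any library.

```python
def filter_by_threshold(values, upper):
    """Return the values that do not exceed `upper` (hypothetical helper)."""
    return [x for x in values if x <= upper]


data = [1, 2, 3, 4, 5, 100, 6, 7, 8, 200]
cleaned_data = filter_by_threshold(data, upper=10)
print(cleaned_data)  # [1, 2, 3, 4, 5, 6, 7, 8]
```

A fixed cutoff like this is only suitable when the valid range of the data is known in advance.
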
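Another approach uses the mean and standard deviation, keeping only the values that lie within a chosen number of standard deviations of the mean:

import numpy as np

data = [1, 2, 3, 4, 5, 100, 6, 7, 8, 200]
mean = np.mean(data)
std = np.std(data)  # population standard deviation (ddof=0)
threshold = 2.0  # allowed distance from the mean, in standard deviations
cleaned_data = [x for x in data if abs(x - mean) <= threshold * std]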
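
To see what the mean and standard deviation rule does on the sample data, the same filter can be sketched with only the standard library. The sketch assumes the population standard deviation (`statistics.pstdev`), which matches the default behaviour of `np.std`.

```python
import statistics

data = [1, 2, 3, 4, 5, 100, 6, 7, 8, 200]
threshold = 2.0

mean = statistics.fmean(data)   # 33.6
std = statistics.pstdev(data)   # population std, about 62.4

# Keep values within `threshold` standard deviations of the mean
cleaned_data = [x for x in data if abs(x - mean) <= threshold * std]
print(cleaned_data)  # [1, 2, 3, 4, 5, 100, 6, 7, 8]
```

Note that 100 is kept: the two extreme values inflate both the mean and the standard deviation, so 100 lies only about 1.06 standard deviations from the mean. For this sample, a threshold of 1.0 would remove both 100 and 200.
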
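Alternatively, use the scipy.stats.zscore function to standardize the data, then compare the standardized values against a given threshold and treat any value that exceeds it as an outlier. Example code:

from scipy import stats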
data = [1, 2, 3, 4, 5, 100, 6, 7, 8, 200]
threshold = 2.0
z_scores = stats.zscore(data)
cleaned_data = [x for x, z in zip(data, z_scores) if abs(z) <= threshold]
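
As a quick check, the sketch below compares the output of `stats.zscore` with a manual standardization using the mean and standard deviation. It relies on `stats.zscore` defaulting to `ddof=0`, the same convention as `np.std`, so on this sample both approaches keep the same values. The variable names `manual`, `same`, and `kept` are illustrative only.

```python
import numpy as np
from scipy import stats

data = np.array([1, 2, 3, 4, 5, 100, 6, 7, 8, 200], dtype=float)

# stats.zscore defaults to ddof=0, the same convention as np.std
z_scores = stats.zscore(data)
manual = (data - data.mean()) / data.std()

same = np.allclose(z_scores, manual)
kept = data[np.abs(z_scores) <= 2.0].tolist()
print(same)  # True
print(kept)  # [1.0, 2.0, 3.0, 4.0, 5.0, 100.0, 6.0, 7.0, 8.0]
```
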
Choose the method that best fits your requirements and the characteristics of your data.