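Problem
Given a normal distribution with mean 100 and standard deviation 25, generate samples of size 5, 25, 100, and 500, and compute the mean and standard deviation of each sample. Observe how the sample size affects these statistics.
Python answer template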
import numpy as np
POPULATION_MU = 100
POPULATION_SIGMA = 25
sample_sizes = [5, 25, 100, 500]
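# Your code here
for size in sample_sizes:
    # Generate a sample and compute its statistics
    # Print the results
    pass  # placeholder so the template runs; replace with your code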
Full data
Seed the random number generator as follows before drawing the normal samples:
np.random.seed(321)  # ensures repeatability
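One possible way to fill in the template is sketched below. It assumes `np.random.normal` as the sampling call and uses `np.std` with its default `ddof=0` (the population formula); the `results` dictionary is a name introduced here for illustration and is not part of the exercise.

```python
import numpy as np

POPULATION_MU = 100
POPULATION_SIGMA = 25
sample_sizes = [5, 25, 100, 500]

np.random.seed(321)  # same seed as in the exercise, for repeatability

results = {}  # hypothetical helper: sample size -> (sample mean, sample std)
for size in sample_sizes:
    # Draw `size` observations from N(POPULATION_MU, POPULATION_SIGMA^2)
    sample = np.random.normal(POPULATION_MU, POPULATION_SIGMA, size)
    results[size] = (np.mean(sample), np.std(sample))
    print(f"n={size:>3}: mean={results[size][0]:.2f}, std={results[size][1]:.2f}")
```

The standard error of the sample mean is sigma / sqrt(n), so the estimates from the larger samples should generally sit closer to 100 and 25 than those from the sample of 5.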
a. Computing the mean
Given three datasets X, Y, and Z, compute their means.
Python answer template
X = [31.,6.,21.,...,47.]  # full data below
Y = [15.,41.,33.,...,17.]
Z = [38.,23.,16.,...,6.]
# Your code here
print("Mean of X:", np.mean(X))
print("Mean of Y:", np.mean(Y))
print("Mean of Z:", np.mean(Z))
Full data
X = [31.,6.,21.,32.,41.,4.,48.,38.,43.,36.,50.,20.,46.,33.,8.,27.,17.,44.,16.,39.,3.,37.,35.,13.,49.,2.,18.,42.,22.,25.,15.,24.,11.,19.,5.,40.,12.,10.,1.,45.,26.,29.,7.,30.,14.,23.,28.,0.,34.,9.,47.]
Y = [15.,41.,33.,29.,3.,28.,28.,8.,15.,22.,39.,38.,22.,10.,39.,40.,24.,15.,21.,25.,17.,33.,40.,32.,42.,5.,39.,8.,15.,25.,37.,33.,14.,25.,1.,31.,45.,5.,6.,19.,13.,39.,18.,49.,13.,38.,8.,25.,32.,40.,17.]
Z = [38.,23.,16.,35.,48.,18.,48.,38.,24.,27.,24.,35.,37.,28.,11.,12.,31.,-1.,9.,19.,20.,0.,23.,33.,34.,24.,14.,28.,12.,25.,53.,19.,42.,21.,15.,36.,47.,20.,26.,41.,33.,50.,26.,22.,-1.,35.,10.,25.,23.,24.,6.]
b. Normality test
Use the Jarque-Bera test to assess whether each dataset is normally distributed.
Python answer template
from statsmodels.stats.stattools import jarque_bera
# Your code here
jb_test_X = jarque_bera(X)  # returns (JB statistic, p-value, skewness, kurtosis)
print("JB test p-value for X:", jb_test_X[1])
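To make clear what the test measures, here is a minimal sketch of the textbook Jarque-Bera statistic, JB = n/6 * (S^2 + (K - 3)^2 / 4), where S is the sample skewness and K the sample kurtosis. The function name `jarque_bera_manual` and the evenly spaced test data are assumptions made for illustration; the statsmodels function remains the reference implementation for the exercise.

```python
import math
import numpy as np

def jarque_bera_manual(data):
    """Return (JB statistic, p-value) from sample skewness and kurtosis."""
    x = np.asarray(data, dtype=float)
    n = x.size
    centered = x - x.mean()
    # Biased (divide-by-n) central moments
    m2 = np.mean(centered ** 2)
    m3 = np.mean(centered ** 3)
    m4 = np.mean(centered ** 4)
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2  # equals 3 for a normal distribution
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    # Under normality JB is asymptotically chi-square with 2 degrees of
    # freedom, whose survival function is exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value

# Illustrative data only: 51 evenly spaced values, symmetric but flat-topped
uniform_like = np.arange(51.0)
jb, p = jarque_bera_manual(uniform_like)
print(f"JB = {jb:.3f}, p-value = {p:.3f}")
```

A small p-value is evidence against normality. A large p-value only means the test failed to reject it, which is weak evidence for a sample of about 50 points.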
c. Instability of estimates
Plot histograms of the three datasets and mark each sample mean.
Python answer template
import matplotlib.pyplot as plt
# Your code here
plt.hist([X, Y, Z], bins=20, alpha=0.7)
plt.axvline(np.mean(X), color='blue')
plt.axvline(np.mean(Y), color='orange')
plt.axvline(np.mean(Z), color='green')
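a. Effect of window length
Use yfinance to fetch historical data for THO and BIL, then compute the mean and standard deviation of the rolling Sharpe ratio for window lengths of 50, 150, and 300.
Python answer template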
import yfinance as yf
import pandas as pd
start = '2010-01-01'
end = '2015-01-01'
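# Download the data
data = yf.download(['THO', 'BIL'], start=start, end=end)
tho_prices = data['Close']['THO']
bil_prices = data['Close']['BIL']
# Compute returns
returns = tho_prices.pct_change().dropna()
treasury_ret = bil_prices.pct_change().dropna()
def sharpe_ratio(asset, riskfree):
    # Mean excess return divided by the standard deviation of excess returns
    return np.mean(asset - riskfree)/np.std(asset - riskfree)
# Your code here
for window in [50, 150, 300]:
    # Compute the rolling Sharpe ratio
    # Compute its mean and standard deviation
    pass  # placeholder so the template runs; replace with your code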
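The loop above could be filled in along the following lines. This is a sketch under stated assumptions: the return series are synthetic random numbers standing in for the THO and BIL data (so it runs without a download), and `rolling_sharpe` and `summary` are names introduced here, not part of the exercise.

```python
import numpy as np

def sharpe_ratio(asset, riskfree):
    # Mean excess return divided by the standard deviation of excess returns
    excess = np.asarray(asset) - np.asarray(riskfree)
    return np.mean(excess) / np.std(excess)

def rolling_sharpe(asset, riskfree, window):
    """Sharpe ratio over each trailing block of `window` observations."""
    return np.array([
        sharpe_ratio(asset[i - window:i], riskfree[i - window:i])
        for i in range(window, len(asset) + 1)
    ])

# Synthetic daily returns: placeholders, NOT real THO or BIL data
rng = np.random.default_rng(0)
asset_returns = rng.normal(0.0005, 0.02, 1000)
riskfree_returns = np.full(1000, 0.00001)

summary = {}  # window length -> (mean, std) of the rolling Sharpe ratio
for window in [50, 150, 300]:
    rs = rolling_sharpe(asset_returns, riskfree_returns, window)
    summary[window] = (rs.mean(), rs.std())
    print(f"window={window:>3}: mean={summary[window][0]:.4f}, "
          f"std={summary[window][1]:.4f}")
```

With real data, replace the two synthetic arrays with `returns` and `treasury_ret` after aligning them on a common date index.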
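b. Out-of-sample instability
Plot the rolling Sharpe ratio over time for each window length, marking the mean line and a band of one standard deviation around it.
Python answer template
# Example: plot the 50-day window
window = 50
running_sharpe = [...]  # your rolling Sharpe ratio values go here
mean = np.mean(running_sharpe)
std = np.std(running_sharpe)
x = range(len(running_sharpe))
plt.plot(x, running_sharpe)
plt.axhline(mean, color='red')
plt.fill_between(x, mean - std, mean + std, alpha=0.2)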
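a. Boston temperatures
Compute the mean and standard deviation of Boston's weekly average temperatures in 2015.
Python answer template
b15_df = pd.DataFrame([29.,22.,...,28.],
                      index=pd.date_range('2015-01-01', periods=52, freq='W'),
                      columns=['Weekly Avg Temp'])
print("Mean:", b15_df['Weekly Avg Temp'].mean())
print("Standard deviation:", b15_df['Weekly Avg Temp'].std())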
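b. Palo Alto temperatures
Compute the mean and standard deviation of Palo Alto's weekly average temperatures in 2015.
Python answer template
Same as 4a, substituting the Palo Alto data.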
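c. Predicting 2016 temperatures
Use the 2015 mean as the forecast for 2016 temperatures and compute the mean absolute error (MAE) of that forecast.
Python answer template
# Plot a histogram of the 2016 data and mark the 2015 mean
plt.hist(b16_data)
plt.axvline(b15_mean)
# Compute the MAE
error = np.abs(b16_data - b15_mean)
print("MAE:", np.mean(error))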
Notes
pip install yfinance