为什么R中用于内核密度估计的集成平方误差的ECDF不从0开始

问题描述

我正在研究内核密度的R。我需要使用400次蒙特卡洛模拟来模拟积分平方误差:ISE =∫{f ^(x)-f(x)} 2进行内核密度估计。我需要绘制ISE的ECDF。我的统计数据为Zn =(n h I-mean)/(sd * sqrt(h)),我需要绘制此统计数据的ECDF(400个数字)。我在循环中使用了两种不同的方法:1)我使用列表2)我仅使用向量,但是没有一个能给我正确的答案。 非常感谢您的帮助。

set.seed(111)

# 400 Times Monte carlos simulations( via list) for Integrated Squared Error(ISE) for kernel density estimation
set.seed(16)
list1 = list() # Make an empty list to save output in FOR FirsT ISE 
list2 =list() # Make an empty list to save output in FOR FirsT ISE 
B=400
for (i in 1:B) { # Indicate number of iterations with "i"
    x1 = rnorm(1000000,1,1) # 10^6 random numbers with mean 1,sd 1
    x2 = rnorm(1000000,-1,1)# 10^6 random numbers with mean -1,sd 1
    ind1 = rbinom(1000000,(1/3))
    z = ind1 * x1 + (1-ind1) * x2 # my final random numbers with  probabilty 1/3 from first dist and 2/3 from second dist
    z
    result1=density(z,kernel="gaussian",bw=0.01,from=-5,to=5)# kernel density for z
    myresult1=result1$y # y value of kernel
    result1$y 
    plot(result1)
    x=result1$# x value of kernel
        
        
        MF=function(x){    # my function which is mix of two normal dis
            final= (1/(3*sqrt(2*pi)))*exp(-(x-1)^2/(2)) + (2/(3*sqrt(2*pi)))*exp(-(x+1)^2/(2))
            return (final)
        }
    MF(x)
    list1[[i]] = sfsmisc::integrate.xy(x = result1$x,(result1$y - MF(x))^2) #compute Integrated Squared Error(ISE) for kernel density estimation
    list2[[i]]= 1/500000 * (sum (abs(result1$y - MF(x))^2)) #approximate ISE for 10^5 points (5-(-5))/100000=1/10000
} # Save output in list for each iteration
list1

list2

I1=as.vector(unlist(list1))# CONVERT LIST TO VECTOR FOR first ISE


I2=as.vector(unlist(list2))# CONVERT LIST TO VECTOR FOR SECOND ISE

# Zn=(n*h*I-mean)/(sd * sqrt(h)) this is my statistic,n=10^6 as sample size,mean,sd 
Zn_I1=(10^6*.01*I1 -0.057)/(2.23 * sqrt(0.01))# Statistic for first ISE:MEAN 0.057,SD=2.23 and I use 0.01 for h,n*h^4 goes to zero when n goes to infinty
Zn_I2=(10^6*.01*I2 -0.057)/(2.23 * sqrt(0.01))# Statistic for first ISE:MEAN 0.057,n*h^4 goes to zero when n goes to infinty

W1=pnorm(Zn_I1) # pnorm of first statistic
plot(ecdf(W1),xlim=c(0,1),ylim=c(0,lw=2,xlab= "Large sample limit (h = 0.35)N=100 B=400",main="Empirical cumulative distribution function")# we add this plot to prevIoUs plot
abline(0,1)

W2=pnorm(Zn_I2)
plot(ecdf(W2),1)    








# another method  400 Times Monte carlos simulations for Integrated Squared Error(ISE) for kernel density estimation


set.seed(1110)
myresult1 = rep(result1$y,1000)
myMF = rep(MF(x),1000)
myI = rep(0,400)
iii = rep(0,400)
myI_2 = rep(0,400)
myZ=rep(0,400)
myW=rep(pnorm(myZ),400)
B=400
x1 = rnorm(1000000,1)
x2 = rnorm(1000000,1)
ind1 = rbinom(1000000,(1/3))
z = ind1 * x1 + (1-ind1) * x2

for (i in 1:B){
    x1[i] = x1
    x2[i] = x2
    ind1[i]=ind1
    z[i]=z
    result1=density(z,n=1001,to=5)
    myresult1[i]=result1$y
    x[i]=result1$x
    
    MF=function(x){
        final= (1/(3*sqrt(2*pi)))*exp(-(x-1)^2/(2)) + (2/(3*sqrt(2*pi)))*exp(-(x+1)^2/(2))
        return (final)
    }
    MF(x[i])
    
    I= 1/500000 * (sum (abs(result1$y[1:1000] - MF(x))^2))
    myI[i]=I
    
    
}
myI
iii
h=0.011
nh=10^6*h
sigma_h=5*(h)
Z=(nh*myI)-0.056
Z_=Z/(sigma_h)#SIGMA 2.1 h=.35 n=100 
Z_
pnorm(Z_)
plot(ecdf(pnorm(Z_)),xlab= "Large sample limit (h = 0.011)N=10^6 B=400",1)

解决方法

暂无找到可以解决该程序问题的有效方法,小编努力寻找整理中!

如果你已经找到好的解决方法,欢迎将解决方案带上本链接一起发送给小编。

小编邮箱:dio#foxmail.com (将#修改为@)