Problem description
dataset1 = data.frame(
  caliber = c("5000","2500","1250","625","312.5","156","80","40","20","0"),
  var1 = c(NA,NA,30458,13740,11261,9729,5039,3343,367),
  var2 = c(463000,271903,154611,87204,47228,28082,14842,8474,5121,1308),
  var3 = c(308385,184863,89719,48986,27968,18557,9191,5248,3210,703),
  var4 = c(290159,149061,64045,36864,19092,12515,6805,3933,2339,574),
  var5 = c(270801,163657,51642,48197,23582,14544,7877,4389,2663,482),
  var6 = c(NA,37316,21305,11823,5692,3070,1781,363))
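The best way to describe the relationship between caliber and the other variables is a second-degree polynomial: var = poly(caliber,2,raw=T)
My question is how to use a new set of variable values to identify the value of the caliber variable. As you can see below, I already have the results for each variable, but I need to determine the corresponding caliber values.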
dataset2 = data.frame(
  caliber = c(NA,NA),
  var1 = c(1120,1296,1132,1280,1096,1124,1004,8384,1072,1104,1568,1044,1108,1012),
  var2 = c(5044,4924,5088,4804,4824,4844,4964,4788,4944),
  var3 = c(2836,2744,2668,2688,2940,2756,2720,2892,2636,2700,2836,2668),
  var4 = c(8872,61580,3036,4468,12132,3000,7920,6868,6896,9392,4728,21076,3228),
  var5 = c(2312,4236,1928,4448,2388,2108,3644,3060,2168,1912,1812,3528,4100,2176),
  var6 = c(1156,1228,1224,1364,1128,1176,1184,1640,1188,1300,1332,1152))
I know there are earlier posts on this topic, such as
- predict x values from simple fitting and annoting it in the plot
- Predict X value from Y value with a fitted model
- get x-value given y-value: general root finding for linear / non-linear interpolation function
- aproxfun function from binsmooth package,find x from y value
but none of them helped. The main problems are:
formula <- lm(var2~poly(caliber,raw=T),dataset1)
approx(x = formula$fitted,y = formula$caliber,xout = 0)$y
NA values for formula$caliber
mod<-lm(var2~poly(caliber),data=dataset1); summary(mod)
newdata=data.frame("var2"=dataset2[1:24,c("var2")])
pred<-predict(mod,newdata,type = 'response')
Error in poly(caliber, coefs = list(alpha = c(998.35, 3691.21383929929: object 'caliber' not found
The prediction cannot be passed to another dataset
The datasets have different numbers of rows
Solution
Based on the discussion and on what I understood, here is a solution:
dataset1 = data.frame(
  caliber = c(5000,2500,1250,625,312.5,156,80,40,20,0),
  var1 = c(NA,NA,30458,13740,11261,9729,5039,3343,367),
  var2 = c(463000,271903,154611,87204,47228,28082,14842,8474,5121,1308),
  var3 = c(308385,184863,89719,48986,27968,18557,9191,5248,3210,703),
  var4 = c(290159,149061,64045,36864,19092,12515,6805,3933,2339,574),
  var5 = c(270801,163657,51642,48197,23582,14544,7877,4389,2663,482),
  var6 = c(NA,37316,21305,11823,5692,3070,1781,363))
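# Regress caliber on var2 (the reverse of the original model) so that predict() returns caliber directly
formula <- lm(caliber ~ poly(var2, degree = 2, raw = TRUE), data = dataset1)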
dataset2 = data.frame(
  caliber = c(NA,NA),
  var1 = c(1120,1296,1132,1280,1096,1124,1004,8384,1072,1104,1568,1044,1108,1012),
  var2 = c(5044,4924,5088,4804,4824,4844,4964,4788,4944),
  var3 = c(2836,2744,2668,2688,2940,2756,2720,2892,2636,2700,2836,2668),
  var4 = c(8872,61580,3036,4468,12132,3000,7920,6868,6896,9392,4728,21076,3228),
  var5 = c(2312,4236,1928,4448,2388,2108,3644,3060,2168,1912,1812,3528,4100,2176),
  var6 = c(1156,1228,1224,1364,1128,1176,1184,1640,1188,1300,1332,1152))
predict(formula,dataset2,type = 'response')
The output of the predict function gives you the caliber values for dataset2.
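A minimal, self-contained sketch of the same idea, assuming only the caliber and var2 columns from above (each has ten values) and the first four var2 readings from dataset2. The names calib, fit, and new_obs are illustrative and do not appear in the original post.

```r
# Calibration data: caliber and var2 values copied from dataset1 above
calib <- data.frame(
  caliber = c(5000, 2500, 1250, 625, 312.5, 156, 80, 40, 20, 0),
  var2    = c(463000, 271903, 154611, 87204, 47228, 28082, 14842, 8474, 5121, 1308)
)

# Inverse regression: model caliber as a quadratic function of var2
fit <- lm(caliber ~ poly(var2, degree = 2, raw = TRUE), data = calib)

# New readings: the column name must match the predictor used in the model
new_obs <- data.frame(var2 = c(5044, 4924, 5088, 4804))

# One predicted caliber per row of new_obs
new_obs$caliber <- predict(fit, newdata = new_obs)
print(new_obs)
```

Because the columns of dataset2 as printed here have different lengths, building one small newdata frame per variable, as in this sketch, may be a simpler way to avoid the differing-rows error than forcing everything into a single data frame.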
I have corrected your dataset1. Values placed inside double quotes become character strings, so I removed the double quotes from the caliber variable.
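To see why the quotes matter, here is a small sketch comparing the quoted and unquoted forms, using the first few caliber values from above. If the data already arrive as text, as.numeric() converts them. The variable names are illustrative only.

```r
# Quoted values are stored as character strings, not numbers
caliber_chr <- c("5000", "2500", "1250", "625", "312.5")
class(caliber_chr)   # "character"

# Convert text to numbers before fitting a model
caliber_num <- as.numeric(caliber_chr)
class(caliber_num)   # "numeric"

# poly() needs numeric input; calling it on character data fails,
# so the error is captured here instead of stopping the script
res <- try(poly(caliber_chr, 2, raw = TRUE), silent = TRUE)
inherits(res, "try-error")
```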