Predicting X values from Y values using a fitted degree-2 polynomial model

Problem description

I have a dataset in the following format:

dataset1 = data.frame(
  caliber = c("5000","2500","1250","625","312.5","156","80","40","20","0"),
  var1 = c(NA,NA,30458,13740,11261,9729,5039,3343,367),
  var2 = c(463000,271903,154611,87204,47228,28082,14842,8474,5121,1308),
  var3 = c(308385,184863,89719,48986,27968,18557,9191,5248,3210,703),
  var4 = c(290159,149061,64045,36864,19092,12515,6805,3933,2339,574),
  var5 = c(270801,163657,51642,48197,23582,14544,7877,4389,2663,482),
  var6 = c(NA,37316,21305,11823,5692,3070,1781,363))

The relationship between caliber and each of the other variables is best described by a degree-2 polynomial: var = poly(caliber,2,raw=T)


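My question is how to use a new set of variable values to recover the corresponding values of caliber. As you can see below, I already have the measurements for each variable, but I need to determine the caliber values.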

dataset2 = data.frame(
  caliber = c(NA,NA),
  var1 = c(1120,1296,1132,1280,1096,1124,1004,8384,1072,1104,1568,1044,1108,1012),
  var2 = c(5044,4924,5088,4804,4824,4844,4964,4788,4944),
  var3 = c(2836,2744,2668,2688,2940,2756,2720,2892,2636,2700,2836,2668),
  var4 = c(8872,61580,3036,4468,12132,3000,7920,6868,6896,9392,4728,21076,3228),
  var5 = c(2312,4236,1928,4448,2388,2108,3644,3060,2168,1912,1812,3528,4100,2176),
  var6 = c(1156,1228,1224,1364,1128,1176,1184,1640,1188,1300,1332,1152))

I know there are earlier posts on this topic, but none of them helped. The main problems are:

formula <- lm(var2~poly(caliber,raw=T),dataset1)
approx(x = formula$fitted,y = formula$caliber,xout = 0)$y

formula$caliber gives NA values

mod<-lm(var2~poly(caliber,2),data=dataset1); summary(mod)
newdata=data.frame("var2"=dataset2[1:24,c("var2")])
pred<-predict(mod,newdata,type = 'response')

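Error in poly(caliber, coefs = list(alpha = c(998.35, 3691.21383929929 : object 'caliber' not found

The predictions cannot be passed to another dataset

The datasets have different numbers of rows

Interpolating between X and Y gives wrong values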

Solution

Based on the discussion and my understanding of the problem, I suggest the following solution:

dataset1 = data.frame(
  caliber = c(5000,2500,1250,625,312.5,156,80,40,20,0),
  var1 = c(NA,NA,30458,13740,11261,9729,5039,3343,367),
  var2 = c(463000,271903,154611,87204,47228,28082,14842,8474,5121,1308),
  var3 = c(308385,184863,89719,48986,27968,18557,9191,5248,3210,703),
  var4 = c(290159,149061,64045,36864,19092,12515,6805,3933,2339,574),
  var5 = c(270801,163657,51642,48197,23582,14544,7877,4389,2663,482),
  var6 = c(NA,37316,21305,11823,5692,3070,1781,363))

formula <- lm(caliber ~ poly(var2,degree = 2,raw=T),dataset1)

dataset2 = data.frame(
  caliber = c(NA,NA),
  var1 = c(1120,1296,1132,1280,1096,1124,1004,8384,1072,1104,1568,1044,1108,1012),
  var2 = c(5044,4924,5088,4804,4824,4844,4964,4788,4944),
  var3 = c(2836,2744,2668,2688,2940,2756,2720,2892,2636,2700,2836,2668),
  var4 = c(8872,61580,3036,4468,12132,3000,7920,6868,6896,9392,4728,21076,3228),
  var5 = c(2312,4236,1928,4448,2388,2108,3644,3060,2168,1912,1812,3528,4100,2176),
  var6 = c(1156,1228,1224,1364,1128,1176,1184,1640,1188,1300,1332,1152))

predict(formula,dataset2,type = 'response')

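The output of the predict function gives you the caliber values for dataset2. Because the model is fitted with caliber as the response and var2 as the predictor, predict only needs a var2 column in the new data.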
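As a minimal, self-contained sketch of the same idea, the snippet below uses only caliber and var2, the two columns that have ten values each, so the data frame can be built without length conflicts. The names calib, fit and new_obs are illustrative and not part of the original post; the new var2 readings are the ones listed in dataset2.

```r
# Calibration data: only caliber and var2 from dataset1 (10 values each).
calib <- data.frame(
  caliber = c(5000, 2500, 1250, 625, 312.5, 156, 80, 40, 20, 0),
  var2    = c(463000, 271903, 154611, 87204, 47228, 28082, 14842, 8474, 5121, 1308)
)

# Inverse regression: model caliber as a degree-2 polynomial of var2.
fit <- lm(caliber ~ poly(var2, degree = 2, raw = TRUE), data = calib)

# New var2 readings (from dataset2) whose caliber is unknown.
new_obs <- data.frame(var2 = c(5044, 4924, 5088, 4804, 4824, 4844, 4964, 4788, 4944))

# predict() looks up var2 by name in newdata, so the column name must match.
new_obs$caliber <- predict(fit, newdata = new_obs)
new_obs
```

Each variable would need its own model fitted this way (caliber ~ var1, caliber ~ var3, and so on), which also sidesteps the different row counts per variable. Note that this fits a new polynomial in the reverse direction; it is not the algebraic inverse of var2 = poly(caliber, 2).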

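I have also corrected your dataset1. Values placed inside double quotes become character strings, so I removed the double quotes from the caliber variable to make it numeric.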
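The difference is easy to check with class(). The short sketch below uses a few of the caliber values; the names caliber_chr and caliber_num are illustrative only. If retyping the data is inconvenient, as.numeric() is one way to convert an existing character column.

```r
# Quoted values are stored as character strings.
caliber_chr <- c("5000", "2500", "312.5", "0")
class(caliber_chr)   # "character"

# Convert to numeric without retyping the values.
caliber_num <- as.numeric(caliber_chr)
class(caliber_num)   # "numeric"
```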