How to implement Q and T statistic tests with plots using PCA and LDA in Python?

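Problem description

I have been studying the article "Structural Health Monitoring based on principal component analysis: damage detection, localization and classification", which implements a PCA model to identify damage in a steel frame. The article discusses applying the Q statistic, the T statistic, a combination of the two (Q and T), and the I statistic to the PCA model in order to detect damage and localize it in the structure. I know that Hotelling's T statistic can be computed from a PCA model as described in the following link: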

Hotelling's T^2 scores in python

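My questions are as follows:

Is there a better way to compute Hotelling's T statistic?
How can I compute the Q statistic in Python?
Can these statistics be applied to LDA? If so, what do I need to extract from the LDA model to compute the Q, T, combined (Q and T), and I statistic tests?
(Bonus question) How can I obtain the contribution method from the article above using these statistical tests?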
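
For reference, the two statistics asked about are usually written as below for a PCA model with k retained components. This is a sketch of the standard textbook definitions, not necessarily the exact notation of the cited article; the symbols x_i, P, \Lambda and e_i are assumptions introduced here.

```latex
% Assumed notation (not from the post):
%   x_i      mean-centred sample with m variables (column vector)
%   P        m x k loading matrix of the k retained components
%   \Lambda  diag(\lambda_1, ..., \lambda_k), variances of the retained scores
\begin{align}
t_i   &= P^{\top} x_i \\
T_i^2 &= t_i^{\top} \Lambda^{-1} t_i
       = \sum_{j=1}^{k} \frac{t_{ij}^{2}}{\lambda_j} \\
e_i   &= \left(I_m - P P^{\top}\right) x_i \\
Q_i   &= x_i^{\top} \left(I_m - P P^{\top}\right) x_i
       = \lVert e_i \rVert^{2}
       = \sum_{j=1}^{m} e_{ij}^{2}
\end{align}
```

Under these definitions, e_{ij}^2 is one natural per-variable contribution to Q_i, since the contributions sum to Q_i. The article may define its contributions differently.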

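Edit 1:

I searched online and found the following link, "T-Squared Q residuals and Contributions", and tried to implement its equations for the T-squared and Q-residual calculations. However, I keep getting an error in the Q-residual calculation. What might be wrong in my code? (My input matrix has shape [50, 7].) Also, is it possible to plot Q against T for my input?

Code: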
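
For a [50, 7] input, the shapes involved can be checked with a small NumPy-only sketch. It uses a plain SVD instead of sklearn's PCA, runs on synthetic data standing in for the real input, and the function name `pca_t2_q` is hypothetical. The key point it illustrates is that the identity matrix in the Q formula has one row per feature (7 x 7), not one per sample (50 x 50).

```python
import numpy as np

def pca_t2_q(X, n_components):
    """Return per-sample Hotelling T^2, Q residuals and per-variable Q contributions."""
    X = np.asarray(X, dtype=float)
    n_samples, n_features = X.shape
    Xc = X - X.mean(axis=0)                        # centre each column

    # SVD of the centred data: rows of Vt are the principal directions.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_components].T                        # loadings, (n_features, k)
    eigvals = s[:n_components] ** 2 / (n_samples - 1)  # variances of the scores

    scores = Xc @ P                                # (n_samples, k)
    t2 = np.sum(scores ** 2 / eigvals, axis=1)     # (n_samples,)

    # The residual projector is (n_features, n_features),
    # not (n_samples, n_samples).
    residual_projector = np.eye(n_features) - P @ P.T
    E = Xc @ residual_projector                    # residuals, (n_samples, n_features)
    q_contrib = E ** 2                             # per-variable share of Q
    q = q_contrib.sum(axis=1)                      # (n_samples,)
    return t2, q, q_contrib

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 7))                       # synthetic stand-in for the [50, 7] input
t2, q, q_contrib = pca_t2_q(X, n_components=2)
print(t2.shape, q.shape, q_contrib.shape)          # (50,) (50,) (50, 7)
```

Each sample gets one T^2 value and one Q value, so the two arrays can be plotted against each other directly.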

import time

import numpy as np
from hyperopt import Trials, fmin, tpe
from sklearn.decomposition import PCA

# obj_fnc and search_space are used below but are not shown in this post.

def hotelling_tsquared_PCA(input_features):
    n_samples = input_features.shape[0]
    
    ##### Hyperparameter optimisation:
    # Running Bayesian Optimisation to get the best parameters:
    start = time.time()
    
    # Create the algorithms
    tpe_algo = tpe.suggest
    # rand_algo = rand.suggest
    # atpe_algo = atpe.suggest
    
    # Assigning model:
    model = 'pca'
  
    # Creating the trial objects:
    hypopt_trials = Trials()
    
    # Getting the best parameters:
    best_params = fmin(obj_fnc,search_space(model),algo=tpe_algo,max_evals=500,trials=hypopt_trials)
    print("Best params: ",best_params)
    print('Best accuracy: ',hypopt_trials.best_trial['result']['loss'])
    print("[INFO] Baye. Opt. search took {:.2f} seconds".format(time.time() - start))
    
    # Calling parameters:
    ## PCA:
    svd_solver = ["auto","full","arpack","randomized"]
    
    # Creating the PCA models:
    #### Implementing hyperopt Search:
    pca = PCA(n_components=2,svd_solver=svd_solver[best_params['svd_solver']])
    
    pca = pca.fit(input_features)
    PCA_scores = pca.transform(input_features)
    print('PCA score matrix shape:',np.array(PCA_scores).shape)
    PCA_loading = (pca.components_).T
    print('PCA Loading matrix shape:',np.array(PCA_loading).shape)
    eigenvalues = pca.explained_variance_

    # Note: this product has shape (n_samples, n_samples); the per-sample T^2
    # values lie on its diagonal, and the formula presumes mean-centred input.
    t2 = np.linalg.multi_dot([input_features,PCA_loading,np.linalg.inv(np.diag(eigenvalues)),PCA_loading.T,input_features.T])
    # print(t2)
    # Print the shape of t2 (Q_res is not defined in this function).
    print('PCA hotellings T^2 matrix shape:',np.array(t2).shape)
    return t2

def Q_Residual_PCA(input_features):
    n_samples = input_features.shape[0]
    
    ##### Hyperparameter optimisation:
    # Running Bayesian Optimisation to get the best parameters:
    start = time.time()
    
    # Create the algorithms
    tpe_algo = tpe.suggest
    # rand_algo = rand.suggest
    # atpe_algo = atpe.suggest
    
    # Assigning model:
    model = 'pca'
  
    # Creating the trial objects:
    hypopt_trials = Trials()
    
    # Getting the best parameters:
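    best_params = fmin(obj_fnc,search_space(model),algo=tpe_algo,max_evals=500,trials=hypopt_trials)
    
    svd_solver = ["auto","full","arpack","randomized"]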
    
    # Creating the PCA models:
    #### Implementing hyperopt Search:
    pca = PCA(svd_solver=svd_solver[best_params['svd_solver']])
    
    print('Model: ',pca)
    pca = pca.fit(input_features)
    PCA_scores = pca.transform(input_features)
    print('PCA score matrix shape:',np.array(PCA_scores).shape)
    PCA_loading = (pca.components_).T
    print('PCA Loading matrix shape:',np.array(PCA_loading).shape)
    
    Q_eq_1 = np.dot(PCA_loading,PCA_loading.T)
    # print(pd.DataFrame(Q_eq_1))
    print('PCA Loading matrix product shape:',np.array(Q_eq_1).shape)
    Q_eq_2 = np.identity(n_samples) - Q_eq_1
    # Q_eq_2 = np.eye(50,7) - Q_eq_1
    # print(pd.DataFrame(Q_eq_2))
    print('PCA Sub matrix shape:',np.array(Q_eq_2).shape)
    Q_res = np.linalg.multi_dot([input_features.T,Q_eq_2,input_features])
    # print(pd.DataFrame(Q_res))
    print('PCA Q Residual matrix shape:',np.array(Q_res).shape)
    return Q_res
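
One possible way to get both statistics from a fitted sklearn `PCA` object is sketched below, leaving the hyperopt search out and fixing `svd_solver="full"` for simplicity. The helper names `t2_and_q_from_pca` and `plot_q_vs_t2` are hypothetical, and the data is synthetic. Two points seem relevant to the code above. First, if `PCA()` is created without `n_components`, all 7 components are kept, so the loading product is the identity and every Q residual is zero. Second, the Q residual is the squared reconstruction error of each sample, which `inverse_transform` gives directly without building an identity matrix.

```python
import numpy as np
from sklearn.decomposition import PCA

def t2_and_q_from_pca(pca, X):
    """Per-sample Hotelling T^2 and Q residuals for a fitted sklearn PCA."""
    X = np.asarray(X, dtype=float)
    scores = pca.transform(X)                      # centres X with pca.mean_
    t2 = np.sum(scores ** 2 / pca.explained_variance_, axis=1)

    X_hat = pca.inverse_transform(scores)          # reconstruction from the kept components
    q = np.sum((X - X_hat) ** 2, axis=1)           # squared reconstruction error per sample
    return t2, q

def plot_q_vs_t2(t2, q):
    """Scatter plot of Q against T^2 (matplotlib is imported only when plotting)."""
    import matplotlib.pyplot as plt
    fig, ax = plt.subplots()
    ax.scatter(t2, q)
    ax.set_xlabel("Hotelling's T^2")
    ax.set_ylabel("Q residual")
    return fig, ax

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 7))                       # synthetic stand-in for the [50, 7] input
pca = PCA(n_components=2, svd_solver="full").fit(X)
t2, q = t2_and_q_from_pca(pca, X)
print(t2.shape, q.shape)                           # (50,) (50,)
```

Calling `plot_q_vs_t2(t2, q)` then draws one point per sample. Control limits for either axis are not included here, since they depend on the distributional assumptions used in the article.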

linear-discriminant pca python scikit-learn statistics