问题描述
我正在寻找一种方法来为 2 个给定矩阵找到相同的特征向量,这样我就可以进行联合对角化。为此,我发现并尝试使用以下函数中的 qndiag(来自 https://github.com/pierreablin/qndiag.git ):
function [D,B,infos] = qndiag(C,varargin)
% Joint diagonalization of matrices using the quasi-Newton method
%
% The algorithm is detailed in:
%
% P. Ablin,J.F. Cardoso and A. Gramfort. Beyond Pham’s algorithm
% for joint diagonalization. Proc. ESANN 2019.
% https://www.elen.ucl.ac.be/Proceedings/esann/esannpdf/es2019-119.pdf
% https://hal.archives-ouvertes.fr/hal-01936887v1
% https://arxiv.org/abs/1811.11433
%
% The function takes as input a set of matrices of size `(p,p)`,stored as
% a `(n,p,p)` array,`C`. It outputs a `(p,`B`,such that the
% matrices `B * C(i,:,:) * B'` are as diagonal as possible.
%
% There are several optional parameters which can be provided in the
% varargin variable.
%
% Optional parameters:
% --------------------
% 'B0' Initial point for the algorithm.
% If absent,a whitener is used.
% 'weights' Weights for each matrix in the loss:
% L = sum(weights * KL(C,C')).
% No weighting (weights = 1) by default.
% 'maxiter' (int) Maximum number of iterations to perform.
% Default : 1000
%
% 'tol' (float) A positive scalar giving the tolerance at
% which the algorithm is considered to have converged.
% The algorithm stops when |gradient| < tol.
% Default : 1e-6
%
% lambda_min (float) A positive regularization scalar. Each
% eigenvalue of the Hessian approximation below
% lambda_min is set to lambda_min.
%
% max_ls_tries (int),Maximum number of line-search tries to
% perform.
%
% return_B_list (bool) Chooses whether or not to return the list
% of iterates.
%
% verbose (bool) Prints informations about the state of the
% algorithm if True.
%
% Returns
% -------
% D : Set of matrices jointly diagonalized
% B : Estimated joint diagonalizer matrix.
% infos : structure containing monitoring informations,containing the times,% gradient norms and objective values.
%
% Example:
% --------
%
% [D,B] = qndiag(C,'maxiter',100,'tol',1e-5)
%
% Authors: Pierre Ablin <pierre.ablin@inria.fr>
% Alexandre Gramfort <alexandre.gramfort@inria.fr>
%
% License: MIT
% First tests
if nargin == 0,error('No signal provided');
end
if length(size(C)) ~= 3,error('Input C should be 3 dimensional');
end
if ~isa (C,'double'),fprintf ('Converting input data to double...');
X = double(X);
end
% Default parameters
C_mean = squeeze(mean(C,1));
[p,d] = eigs(C_mean);
p = fliplr(p);
d = flip(diag(d));
B = p' ./ repmat(sqrt(d),1,size(p,1));
max_iter = 1000;
tol = 1e-6;
lambda_min = 1e-4;
max_ls_tries = 10;
return_B_list = false;
verbose = false;
weights = [];
% Read varargin
if mod(length(varargin),2) == 1,error('There should be an even number of optional parameters');
end
for i = 1:2:length(varargin)
param = lower(varargin{i});
value = varargin{i + 1};
switch param
case 'B0'
B0 = value;
case 'max_iter'
max_iter = value;
case 'tol'
tol = value;
case 'weights'
weights = value / mean(value(:));
case 'lambda_min'
lambda_min = value;
case 'max_ls_tries'
max_ls_tries = value;
case 'return_B_list'
return_B_list = value;
case 'verbose'
verbose = value;
otherwise
error(['Parameter ''' param ''' unkNown'])
end
end
[n_samples,n_features,~] = size(C);
D = transform_set(B,C,false);
current_loss = NaN;
% Monitoring
if return_B_list
B_list = []
end
t_list = [];
gradient_list = [];
loss_list = [];
if verbose
print('Running quasi-Newton for joint diagonalization');
print('iter | obj | gradient');
end
for t=1:max_iter
if return_B_list
B_list(k) = B;
end
diagonals = zeros(n_samples,n_features);
for k=1:n_samples
diagonals(k,:) = diag(squeeze(D(k,:)));
end
% Gradient
if isempty(weights)
G = squeeze(mean(bsxfun(@rdivide,D,...
reshape(diagonals,n_samples,1)),...
1)) - eye(n_features);
else
G = squeeze(mean(...
bsxfun(@times,...
reshape(weights,1),...
bsxfun(@rdivide,...
reshape(diagonals,1))),...
1)) - eye(n_features);
end
g_norm = norm(G);
if g_norm < tol
break
end
% Hessian coefficients
if isempty(weights)
h = mean(bsxfun(@rdivide,...
reshape(diagonals,n_features),1);
else
h = mean(bsxfun(@times,...
1);
end
h = squeeze(h);
% Quasi-Newton's direction
dt = h .* h' - 1.;
dt(dt < lambda_min) = lambda_min; % Regularize
direction = -(G .* h' - G') ./ dt;
% Line search
[success,new_D,new_B,new_loss,direction] = ...
linesearch(D,direction,current_loss,max_ls_tries,weights);
D = new_D;
B = new_B;
current_loss = new_loss;
% Monitoring
gradient_list(t) = g_norm;
loss_list(t) = current_loss;
if verbose
print(sprintf('%d - %.2e - %.2e',t,g_norm))
end
end
infos = struct();
infos.t_list = t_list;
infos.gradient_list = gradient_list;
infos.loss_list = loss_list;
if return_B_list
infos.B_list = B_list
end
end
function [op] = transform_set(M,diag_only)
[n,~] = size(D);
if ~diag_only
op = zeros(n,p);
for k=1:length(D)
op(k,:) = M * squeeze(D(k,:)) * M';
end
else
op = zeros(n,:) = sum(M .* (squeeze(D(k,:)) * M'),1);
end
end
end
function [v] = slogdet(A)
v = log(abs(det(A)));
end
function [out] = loss(B,is_diag,weights)
[n,~] = size(D);
if ~is_diag
diagonals = zeros(n,p);
for k=1:n
diagonals(k,:)));
end
else
diagonals = D;
end
logdet = -slogdet(B);
if ~isempty(weights)
diagonals = bsxfun(@times,diagonals,reshape(weights,n,1));
end
out = logdet + 0.5 * sum(log(diagonals(:))) / n;
end
function [success,delta] = linesearch(D,n_ls_tries,~] = size(D);
step = 1.;
if current_loss == NaN
current_loss = loss(B,false);
end
success = false;
for n=1:n_ls_tries
M = eye(p) + step * direction;
new_D = transform_set(M,true);
new_B = M * B;
new_loss = loss(new_B,true,weights);
if new_loss < current_loss
success = true;
break
end
step = step / 2;
end
new_D = transform_set(M,false);
delta = step * direction;
end
我将它与以下脚本一起使用,您可以通过下载本文底部的 2 个输入矩阵进行测试:
clc; clear
m=7 % dimension
n=2 % number of matrices
% Load spectro and WL+GCph+XC
FISH_GCsp = load('Fisher_GCsp_flat.txt');
FISH_XC = load('Fisher_XC_GCph_WL_flat.txt');
% Marginalizing over uncommon parameters between the two matrices
COV_GCsp_first = inv(FISH_GCsp);
COV_XC_first = inv(FISH_XC);
COV_GCsp = COV_GCsp_first(1:m,1:m);
COV_XC = COV_XC_first(1:m,1:m);
% Invert to get Fisher matrix
FISH_sp = inv(COV_GCsp);
FISH_xc = inv(COV_XC);
% Drawing a random set of commuting matrices
C=zeros(n,m,m);
B0=zeros(n,m);
C(1,:) = FISH_sp
C(2,:) = FISH_xc
%[D,'max_iter',1e6,1e-6);
[D,B] = qndiag(C);
% Print diagonal matrices
B*C(1,:)*B'
B*C(2,:)*B'
但不幸的是,我收到以下错误:
Unable to perform assignment because the size of the left side is 1-by-7-by-7 and the size of the
right side is 6-by-6.
Error in qndiag>transform_set (line 224)
op(k,:)) * M';
Error in qndiag (line 128)
D = transform_set(B,false);
Error in compute_joint_diagonalization (line 32)
[D,B] = qndiag(C);
我不明白函数 squeeze
最重要的效用:为什么函数 eigs
返回只返回 6 个值而不是 7 就像我的数据(输入矩阵的大小为 7x7).
这个数组维度问题可能有什么问题,我该如何解决?
Matrix Fisher_XC_GCph_WL_flat.txt
更新 1
您只需在 Matlab 下解压并执行脚本 compute_joint_diagonalization.m
,您通常会看到上述错误(关于 eigs
和 squeeze
函数的使用)。
它应该可以帮助您了解此问题的根源。
更新 2
如果我用 [p,d] = eigs(C_mean)
替换 [p,d] = eigs(C_mean,7)
,我会收到另一个错误:
Index in position 1 exceeds array bounds (must not exceed 2).
Error in qndiag>transform_set (line 224)
op(k,:)) * M';
Error in qndiag (line 128)
D = transform_set(B,false);
Error in compute_joint_diagonalization (line 27)
[D,B] = qndiag(C);
但是,使用的 2 个矩阵的维度是 7x7,应该使用 eigs(C_mean,7)
正确处理。
更新 3
size(D) =
2 7 7
length(D) =
7
size(M) =
7 7
size(op) =
2 7 7
Index in position 1 exceeds array bounds (must not exceed 2).
Error in qndiag>transform_set (line 231)
op(k,B] = qndiag(C);
注意 k
从 1 到 length(D)=7
变化。
这些尺寸是否会出现问题?
解决方法
来自 eigs
的 documentation:
d = eigs(A)
返回矩阵 A 的六个最大幅值特征值的向量。
如果您想要所有七个,则需要调用 d = eigs(A,7)
或 d = eig(A)
。对于小矩阵(例如 eig 获取所有特征值通常更容易,而不是使用 eigs
获取子集。
编辑:响应您的“更新 3”
for k=1:length(D)
应替换为 for k=1:n
。这需要在两行上进行更改。从您的错误消息来看,它们是第 231 和 236 行。
L = length(X)
返回 X
中最大数组维度的长度,在您的情况下为 7,即对于第一个维度来说太高了。