How can I extract substrings of different lengths?

Problem description

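I have an n-by-2 matrix containing the start and end indices of substrings of a given string. How can I extract an n-by-1 cell array of those substrings without a for loop?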

string = 'Hello World!';
ranges = [1 1;
    2 3;
    4 5;
    3 7];
substrings = cell(size(ranges,1),1);
for i=1:size(ranges,1)
    substrings{i} = string(ranges(i,1):ranges(i,2));
end

Expected result:

substrings = 
'H'
'el'
'lo'
'llo W'

Solution

You can use cellfun to turn this into a one-line operation:

str = 'Hello World!';
ranges = [  1 1;
            2 3;
            4 5;
            3 7];
% first convert "ranges" to a cell object
Cranges = mat2cell(ranges,ones(size(ranges,1),1),2);
% call "cellfun" on every row/entry of "Cranges"
cellfun(@(x)str(x(1):x(2)),Cranges,'UniformOutput',false)

ans =

  4×1 cell array

{'H'    }
{'el'   }
{'lo'   }
{'llo W'}

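I renamed the variable string to str because string is a built-in MATLAB function (it converts its input to the string type), and a variable with the same name would shadow it.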
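
As a variation on the cellfun call above, arrayfun can iterate over the row indices of ranges directly, which skips the mat2cell conversion. This is only a minimal sketch and not part of the original answer; the name rowIdx is introduced here for illustration, and the approach has not been timed, so it should not be assumed to be faster than the plain loop.

```matlab
str = 'Hello World!';
ranges = [1 1; 2 3; 4 5; 3 7];
% column vector of row indices, so the result is an n-by-1 cell array
rowIdx = (1:size(ranges,1)).';
% extract one substring per row of "ranges"
substrings = arrayfun(@(k) str(ranges(k,1):ranges(k,2)), rowIdx, 'UniformOutput', false);
```

Like cellfun, arrayfun still calls an anonymous function once per row, so it hides the loop rather than removing it.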

Although the cellfun version is a one-liner, that does not mean it is more efficient:

Num = 1000000;
        
substrings = cell(size(ranges,1),1);
% time for-loop
tic
for j = 1:Num
    for i = 1:size(ranges,1)
        substrings{i} = str(ranges(i,1):ranges(i,2));
    end
end
toc;

Cranges = mat2cell(ranges,ones(size(ranges,1),1),2);
% time function-call
tic
for j = 1:Num
    substrings = cellfun(@(x)str(x(1):x(2)),Cranges,'UniformOutput',false);
end
toc;
Elapsed time is 3.929622 seconds. 
Elapsed time is 50.319609 seconds.