Renaming files with conditions

Question

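I need some advice. I'm trying to do something with regex that may not be possible, and if it is possible, it's a pain. I can't get anything to work. I'm trying to create a tagging system for my PDF files. So if I have this filename: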

"csharp 8 in a nutshell[studying programming csharp ebooks].pdf"

I want every word inside the '[]' to have an '@' in front of it. So the filename above would look like this:

"csharp 8 in a nutshell[@studying @programming @csharp @ebooks].pdf"

The problem is keeping the '@' inside the '[]'. For example, I'd rather the "csharp" at the very front of the filename not get an "@".

Also, I'm using a bulk renamer called "Bulk Rename Utility" to help me.

Can it be done?
If so, any tips?

Thanks.

Answers

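Bulk Rename Utility does not support replacing multiple matches; you can only match the whole filename and perform the replacement using capturing groups/backreferences.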

Since you're on Windows, I suggest using PowerShell:

cd 'C:\YOUR_FOLDER\HERE'
Get-ChildItem -File | Rename-Item -NewName { $_.Name -replace '(?<=\[[^][]*?)\w+(?=[^][]*])','@$&' }

See this regex demo and proof that it works with the .NET regex flavor.

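(?<=\[[^][]*?) - immediately before the current position there must be a [ followed by any number of characters other than [ and ], as few as possible
\w+ - one or more word characters
(?=[^][]*]) - immediately after the current position there must be any number of characters other than [ and ], as many as possible, followed by a ] character.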

The replacement is @ followed by the whole match value ($&).
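For anyone who wants to try the idea outside PowerShell, here is a minimal Ruby sketch. As far as I know, Ruby's regex engine rejects a lookbehind of unbounded length, so rather than porting the pattern this swaps in a different technique: find each bracketed segment first, then prefix the words inside it. The method name tag_bracketed_words is made up for illustration.

```ruby
# Hypothetical helper (not part of the answer above): prefixes every word
# inside [...] with '@' and leaves the rest of the name untouched.
def tag_bracketed_words(name)
  # Outer gsub: each "[...]" segment that contains no nested brackets.
  name.gsub(/\[[^\[\]]*\]/) do |segment|
    # Inner gsub: every run of word characters inside that segment.
    segment.gsub(/\w+/) { |word| "@#{word}" }
  end
end

puts tag_bracketed_words("csharp 8 in a nutshell[studying programming csharp ebooks].pdf")
# prints: csharp 8 in a nutshell[@studying @programming @csharp @ebooks].pdf
```

Like the PowerShell one-liner, this adds another '@' to words that are already tagged, so it is meant to be run once per file.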

You can also use

Get-ChildItem -File | Rename-Item -NewName { $_.Name -replace '(\G(?!\A)[^][\w]+|\[)(\w+)','$1@$2' }

See this regex demo and the .NET regex test.

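(\G(?!\A)[^][\w]+|\[) - Group 1 ($1): either the end of the previous match followed by one or more characters other than ], [ and word chars, or a [ character
(\w+) - Group 2 ($2): one or more word characters.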
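The same pattern can be tried in Ruby with one adjustment: inside a Ruby character class the brackets need escaping, so [^][\w] becomes [^\]\[\w]. This is only a sketch. The method name tag_with_g is hypothetical, and it relies on Ruby's \G matching where the previous gsub match ended.

```ruby
# Hypothetical helper: a Ruby port of the \G-based pattern described above.
def tag_with_g(name)
  # Group 1: either the separator between two tagged words (anchored by \G to
  #          the end of the previous match, never the start of the string)
  #          or the opening '['.
  # Group 2: the word that receives the '@' prefix.
  name.gsub(/(\G(?!\A)[^\]\[\w]+|\[)(\w+)/, '\1@\2')
end

puts tag_with_g("csharp 8 in a nutshell[studying programming csharp ebooks].pdf")
```

Because \G chains each match to the previous one, the chain stops at the closing ] and words after it are left alone.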

If you only want to rename *.pdf files, replace Get-ChildItem -File with Get-ChildItem *.pdf.

I assume there is at most one bracket-delimited substring.

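With Perl you can replace zero-length matches of the following regular expression with '@' (click "Perl", then check the global and case-insensitive options). The same works with Ruby, Python's alternative regex engine, R with perl=true, or any language that uses the PCRE regex engine (which includes PHP). Except for Ruby, the case-insensitive (/i) and global (/g) flags need to be set. Ruby needs only the case-insensitive flag.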

r = /(?:^.*\[ *|\G(?<!^)|[a-z]+ +)\K(?<=\[| )(?=[a-z][^\[\]]*\])/i

For example, in Ruby one would execute

str = "csharp 8 in a nutshell[studying programming csharp ebooks].pdf"
str.gsub(r,'@')
  #=> "csharp 8 in a nutshell[@studying @programming @csharp @ebooks].pdf"

I believe all of the languages mentioned above allow short scripts to be run from the command line. (I give a Ruby script below.)

The regex engine performs the following operations.

(?:                : begin non-capture group
  ^.*\[ *          : match beginning of string then 0+ characters then '['
                     then 0+ spaces
  |                : or
  \G               : asserts the position at the end of the previous match
                     or at the start of the string for the first match
  (?<!^)           : use a negative lookbehind to assert that the current
                     location is not the start of the string
  |                : or
  [a-z]+ +         : match 1+ letters then 1+ spaces
)                  : end non-capture group
\K                 : reset beginning of reported match to current location
                     and discard all previously-matched characters from match
                     to be returned
(?<=               : begin positive lookbehind
  \[|[ ]           : match '[' or a space
)                  : end positive lookbehind
(?=                : begin positive lookahead
  [a-z][^\[\]]*\]  : match a letter then 0+ characters other than '[' and ']'
                     then ']'
)                  : end positive lookahead

Another possibility (using Ruby as an example) is to split the string into three parts, modify the middle part, and then rejoin the parts:

first,mid,last = str.split /(?<=\[)|(?=\])/
  #=> ["csharp 8 in a nutshell[",
  #    "studying programming csharp ebooks",
  #    "].pdf"]
first + mid.gsub(/(?<=\A| )(?! )/,'@') + last
  #=> "csharp 8 in a nutshell[@studying @programming @csharp @ebooks].pdf"

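The regex used by split reads, "match a (zero-width) string that is preceded by '[' ((?<=\[) is a positive lookbehind) or followed by ']' ((?=\]) is a positive lookahead)". Because it matches only zero-width strings, split removes no characters.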

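gsub's regex reads, "match a zero-width string that is at the beginning of the string or preceded by a space, and is not followed by a space" ((?! ) is a negative lookahead). It could also be written /(?<![^ ])(?! )/ ((?<![^ ]) is a negative lookbehind).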

A variant:

first + mid.split.map { |s| '@' + s }.join(' ') + last
  #=> "csharp 8 in a nutshell[@studying @programming @csharp @ebooks].pdf"
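Both versions assume the name really contains one [...] pair. Without it, split returns fewer than three parts, mid is nil, and the call fails. A small wrapper can guard against that. This is a sketch only, and tag_middle is a hypothetical name.

```ruby
# Hypothetical wrapper around the split-based approach shown above.
def tag_middle(name)
  parts = name.split(/(?<=\[)|(?=\])/)
  # Proceed only when the split produced "...[", "middle", "]..." in that
  # order; otherwise return the name unchanged.
  return name unless parts.length == 3 &&
                     parts[0].end_with?("[") &&
                     parts[2].start_with?("]")

  first, mid, last = parts
  # Prefix each whitespace-separated word of the middle part with '@'.
  first + mid.split.map { |word| "@#{word}" }.join(" ") + last
end

puts tag_middle("csharp 8 in a nutshell[studying programming csharp ebooks].pdf")
puts tag_middle("no tags here.pdf")
```

Note that split followed by join(" ") collapses runs of spaces inside the brackets into single spaces, which is probably fine for tags.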

I created a file named 'in' containing the following two lines:

Little [Miss Muffet sat on her] tuffet
eating her [curds and] whey

Here is an example of a (Ruby) script that can be run from the command line to perform the necessary replacements.

ruby -e "File.open('out','w') do |fout|
          File.foreach('in') do |str|
            first,mid,last = str.split(/(?<=\[)|(?=\])/)
            fout.puts(first + mid.gsub(/(?<=\A| )(?! )/,'@') + last)
          end
        end"

This produces a file named 'out' containing the following two lines:

Little [@Miss @Muffet @sat @on @her] tuffet
eating her [@curds @and] whey
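The script above rewrites lines of a text file, whereas the original question is about filenames. Under the same assumption of at most one bracketed substring per name, a rough Ruby sketch for renaming the PDFs in a folder might look like the following. The names tagged_name and rename_tagged and the dry_run option are invented for illustration, and it is probably wise to try it on a copy of the folder first.

```ruby
# Hypothetical helper: returns the name with '@' before each word inside the
# first [...] pair. Words that already start with '@' are left as they are.
def tagged_name(name)
  name.sub(/\[([^\[\]]*)\]/) do
    words = Regexp.last_match(1).split
    "[" + words.map { |w| w.start_with?("@") ? w : "@#{w}" }.join(" ") + "]"
  end
end

# Hypothetical driver: renames every *.pdf file directly inside dir.
# With dry_run: true (the default) nothing is renamed.
# Returns the list of [old_name, new_name] pairs.
def rename_tagged(dir, dry_run: true)
  changes = []
  Dir.children(dir).sort.each do |old_name|
    old_path = File.join(dir, old_name)
    next unless File.file?(old_path)
    next unless File.extname(old_name).casecmp?(".pdf")

    new_name = tagged_name(old_name)
    next if new_name == old_name

    File.rename(old_path, File.join(dir, new_name)) unless dry_run
    changes << [old_name, new_name]
  end
  changes
end

# Example use (dry run first; the folder path is a placeholder):
#   rename_tagged("C:/YOUR_FOLDER/HERE").each { |old, new| puts "#{old} -> #{new}" }
#   rename_tagged("C:/YOUR_FOLDER/HERE", dry_run: false)
```

Skipping words that already start with '@' is a design choice so that running the script twice does not produce '@@' tags.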
