Problem description
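I have a script that extracts metadata from every file in a directory. When the file paths contain no diacritics, the script produces a CSV file like this:
When a file path contains diacritics (e.g. "TéstMé.txt"), the filehash field in the CSV file is blank:
My question: how can I get this script to work regardless of diacritics in the file path?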
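- I have established that the problem is not in the Get-FileHash part of the script (running the one-liner Get-FileHash "C:\Temp\New\TéstMé.txt" on its own does produce a hash).
- I have also established that replacing filehash = Get-filehash -Path with filehash = Get-filehash -LiteralPath is not the solution, since it also produces blanks.
- I tried changing the regular expression in the line ($_.Trim() -match "^(?<Children>\d+)\s+(?<FullName>.*)") { in case it was choking on the diacritics, but every change produced WARNING: parsing [unique parsing error here].
- I also tried changing ValueFromPipeline=$True,ValueFromPipelineByPropertyName=$True from $true to $false (in case the pipeline was altering the file path value), but it had no effect.
- I thought Robocopy (which the script uses) might be unable to handle files with diacritics, but Robocopy C:\Temp\New C:\Temp\star copies the file just fine.
- I do have a regular expression for identifying illegal characters (obtained from here), but I don't know how to incorporate it into the script.
- FYI: I cannot change the actual file names. I would love to do a find-and-replace on every letter with a diacritic, but that option is not available to me.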
Function Get-FolderItem {
    [cmdletbinding(DefaultParameterSetName='Filter')]
    Param (
        [parameter(Position=0,ValueFromPipeline=$True,ValueFromPipelineByPropertyName=$True)]
        [Alias('FullName')]
        [string[]]$Path = $PWD,

        [parameter(ParameterSetName='Filter')]
        [string[]]$Filter = '*.*',

        [parameter(ParameterSetName='Exclude')]
        [string[]]$ExcludeFile,

        [parameter()]
        [int]$MaxAge,

        [parameter()]
        [int]$MinAge
    )
    Begin {
        $params = New-Object System.Collections.Arraylist
        $params.AddRange(@("/L","/E","/NJH","/BYTES","/FP","/NC","/XJ","/R:0","/W:0","T:W"))
        If ($PSBoundParameters['MaxAge']) {
            $params.Add("/MaxAge:$MaxAge") | Out-Null
        }
        If ($PSBoundParameters['MinAge']) {
            $params.Add("/MinAge:$MinAge") | Out-Null
        }
    }
    Process {
        ForEach ($item in $Path) {
            Try {
                $item = (Resolve-Path -LiteralPath $item -ErrorAction Stop).ProviderPath
                If (-Not (Test-Path -LiteralPath $item -Type Container -ErrorAction Stop)) {
                    Write-Warning ("{0} is not a directory and will be skipped" -f $item)
                    Return
                }
                If ($PSBoundParameters['ExcludeFile']) {
                    $Script = "robocopy `"$item`" NULL $Filter $params /XF $($ExcludeFile -join ',')"
                } Else {
                    $Script = "robocopy `"$item`" NULL $Filter $params"
                }
                Write-Verbose ("Scanning {0}" -f $item)
                Invoke-Expression $Script | ForEach {
                    Try {
                        If ($_.Trim() -match "^(?<Children>\d+)\s(?<FullName>.*)") {
                            $object = New-Object PSObject -Property @{
                                FullName = $matches.FullName
                                Extension = $matches.fullname -replace '.*\.(.*)','$1'
                                FullPathLength = [int] $matches.FullName.Length
                                filehash = Get-filehash -LiteralPath "\\?\$($matches.FullName)" |Select -Expand Hash
                                Created = ([System.IO.FileInfo] $matches.FullName).creationtime
                                LastWriteTime = ([System.IO.FileInfo] $matches.FullName).LastWriteTime
                            }
                            $object.pstypenames.insert(0,'System.IO.RobocopyDirectoryInfo')
                            Write-Output $object
                        } Else {
                            Write-Verbose ("Not matched: {0}" -f $_)
                        }
                    } Catch {
                        Write-Warning ("{0}" -f $_.Exception.Message)
                        Return
                    }
                }
            } Catch {
                Write-Warning ("{0}" -f $_.Exception.Message)
                Return
            }
        }
    }
}
Get-FolderItem "C:\Temp\New" | Export-Csv -Path C:\Temp\testesting.csv
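One way to narrow this down further is to check whether each path parsed from Robocopy's console output still exists on disk. The sketch below is only a diagnostic under stated assumptions: Test-ParsedPath is a hypothetical helper name, not part of the script, and it reuses the regular expression quoted in the list above. If Exists comes back False for the accented file, that would suggest the path text was already altered before it reached Get-FileHash.

```powershell
# Hypothetical diagnostic helper (not part of the original script).
# It applies the script's regex to each Robocopy output line and
# reports whether the parsed path actually exists on disk.
function Test-ParsedPath {
    param(
        [Parameter(ValueFromPipeline = $true)]
        [string]$Line
    )
    process {
        if ($Line.Trim() -match '^(?<Children>\d+)\s+(?<FullName>.*)') {
            [pscustomobject]@{
                FullName = $Matches.FullName
                Exists   = Test-Path -LiteralPath $Matches.FullName
            }
        }
    }
}

# Example call on Windows, using the same Robocopy switches as the script:
# robocopy "C:\Temp\New" NULL /L /E /NJH /BYTES /FP /NC /XJ /R:0 /W:0 | Test-ParsedPath
```

Lines that do not match the pattern (headers, blank lines, the job summary) are simply skipped.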
Solution
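Here is a solution: I use the /UNILOG:c:\temp\test.txt parameter to send the RoboCopy output to a Unicode log file, then read that log back and parse it with the same code as before: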
Function Get-FolderItem {
    [cmdletbinding(DefaultParameterSetName='Filter')]
    Param (
        [parameter(Position=0,ValueFromPipeline=$True,ValueFromPipelineByPropertyName=$True)]
        [Alias('FullName')]
        [string[]]$Path = $PWD,

        [parameter(ParameterSetName='Filter')]
        [string[]]$Filter = '*.*',

        [parameter(ParameterSetName='Exclude')]
        [string[]]$ExcludeFile,

        [parameter()]
        [int]$MaxAge,

        [parameter()]
        [int]$MinAge
    )
    Begin {
        $params = New-Object System.Collections.Arraylist
        $params.AddRange(@("/L","/E","/NJH","/BYTES","/FP","/NC","/XJ","/R:0","/W:0","T:W","/UNILOG:c:\temp\test.txt"))
        If ($PSBoundParameters['MaxAge']) {
            $params.Add("/MaxAge:$MaxAge") | Out-Null
        }
        If ($PSBoundParameters['MinAge']) {
            $params.Add("/MinAge:$MinAge") | Out-Null
        }
    }
    Process {
        ForEach ($item in $Path) {
            Try {
                $item = (Resolve-Path -LiteralPath $item -ErrorAction Stop).ProviderPath
                If (-Not (Test-Path -LiteralPath $item -Type Container -ErrorAction Stop)) {
                    Write-Warning ("{0} is not a directory and will be skipped" -f $item)
                    Return
                }
                If ($PSBoundParameters['ExcludeFile']) {
                    $Script = "robocopy `"$item`" NULL $Filter $params /XF $($ExcludeFile -join ',')"
                } Else {
                    $Script = "robocopy `"$item`" NULL $Filter $params"
                }
                Write-Verbose ("Scanning {0}" -f $item)
                Invoke-Expression $Script | Out-Null
                Get-Content "c:\temp\test.txt" | ForEach {
                    Try {
                        If ($_.Trim() -match "^(?<Children>\d+)\s(?<FullName>.*)") {
                            $object = New-Object PSObject -Property @{
                                FullName = $matches.FullName
                                Extension = $matches.fullname -replace '.*\.(.*)','$1'
                                FullPathLength = [int] $matches.FullName.Length
                                FileHash = Get-FileHash -LiteralPath "\\?\$($matches.FullName)" |Select -Expand Hash
                                Created = ([System.IO.FileInfo] $matches.FullName).creationtime
                                LastWriteTime = ([System.IO.FileInfo] $matches.FullName).LastWriteTime
                            }
                            $object.pstypenames.insert(0,'System.IO.RobocopyDirectoryInfo')
                            Write-Output $object
                        } Else {
                            Write-Verbose ("Not matched: {0}" -f $_)
                        }
                    } Catch {
                        Write-Warning ("{0}" -f $_.Exception.Message)
                        Return
                    }
                }
            } Catch {
                Write-Warning ("{0}" -f $_.Exception.Message)
                Return
            }
        }
    }
}
Get-FolderItem "C:\Temp\New" | Export-Csv -Path C:\Temp\testtete.csv -Encoding Unicode
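The function above writes to a fixed log path, c:\temp\test.txt, so two runs at the same time would overwrite each other's log. As a minimal sketch of a variation, assuming a unique temporary file is acceptable, the log can be created per call and deleted afterwards. Read-RobocopyLog and Get-RobocopyListing are hypothetical names introduced only for this illustration, and the /NDL and /NJS switches (suppress directory lines and the job summary) are additions that are not in the original script.

```powershell
# Hypothetical helper: parse a Robocopy /UNILOG log (UTF-16) into objects.
function Read-RobocopyLog {
    param(
        [Parameter(Mandatory = $true)]
        [string]$LogPath
    )
    Get-Content -LiteralPath $LogPath -Encoding Unicode | ForEach-Object {
        # With /BYTES /NC /FP, each file line is "<size in bytes> <full path>".
        if ($_.Trim() -match '^(?<Size>\d+)\s+(?<FullName>.*)') {
            [pscustomobject]@{
                FullName = $Matches.FullName
                Size     = [long]$Matches.Size
            }
        }
    }
}

# Hypothetical wrapper: list a folder through Robocopy using a unique temp log.
function Get-RobocopyListing {
    param(
        [Parameter(Mandatory = $true)]
        [string]$Path
    )
    $log = [System.IO.Path]::GetTempFileName()
    try {
        # /L lists only; /NDL and /NJS are assumptions added for cleaner output.
        $roboArgs = @($Path, 'NULL', '/L', '/E', '/NJH', '/NJS', '/NDL', '/BYTES',
                      '/FP', '/NC', '/XJ', '/R:0', '/W:0', "/UNILOG:$log")
        & robocopy @roboArgs | Out-Null
        Read-RobocopyLog -LogPath $log
    }
    finally {
        # Remove the temporary log even if parsing fails.
        Remove-Item -LiteralPath $log -ErrorAction SilentlyContinue
    }
}

# Example call on Windows:
# Get-RobocopyListing -Path 'C:\Temp\New' | Export-Csv -Path C:\Temp\listing.csv -Encoding Unicode
```

Keeping the parsing in its own function also makes it possible to test the regular expression against a saved log without running Robocopy at all.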