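Problem description
I have multiple .txt files in a folder. They all share a specific data format in which every line starts with a 3-digit number, for example: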
101,3333,35899,BufferC1,99,02,333
102,3344,30079,BufferD2,89,03,444
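... and so on.
Now, if a record starts with "101", I have to replace the elements at index 3 and index 5 of that line with some text. Once all files have been processed, I have to copy all the modified files to another directory/folder on the same machine.
I wrote some code that doesn't work. Please help, as I am new to PowerShell. Am I taking the right approach?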
$FileNamesList = @(Get-ChildItem C:\TestFiles\*.txt | Select-Object -ExpandProperty Name)
for($i=0; $i -lt $FileNamesList.count; $i++)
{
# original folder path
$FilePath1 = 'C:\TestFiles\' + $FileNamesList[$i]
#modified files after text is replaced will be copied in another folder
$FilePath2 = 'C:\TestFilesModified\' + $FileNamesList[$i]
$OriginalFileData = Get-Content $FilePath1
# Is this correct,can i assign foreach cmd result to a variable?
$ModifiedFileData= ForEach($Row in $OriginalFileData )
{
If($Row.Split(",")[0] -eq "101")
{
$Row -replace($Row.Split(",")[3]),"Test File"
$Row -replace($Row.Split(",")[5]),"Test Data"
}
else{
$Row
}
Out-File -FilePath $FilePath2 -InputObject $ModifiedFileData
}
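Solution
I like @Theo's approach. I started down the same line of thinking and wrote similar code. However, I got frustrated with the CSV idea and ended up finding some alternative approaches to share.
The CSV approach has a few problems:
- The files have no headers
- Re-exporting is awkward, because Export-Csv has no -NoHeader parameter (true of Windows PowerShell 5.1; PowerShell 7.4 and later do add one)
- The variable number of columns mentioned in the comments has to be handled
My first approach was to use the ConvertFrom-String cmdlet to get proper objects from the raw data.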
$SourceFolder = "C:\Temp\SourceFolder"
$DestinationFolder = "C:\Temp\destination"
ForEach( $File in Get-ChildItem $SourceFolder -Filter *.txt )
{
    $DestinationFile = Join-Path -Path $DestinationFolder -ChildPath $File.Name
    $File |
    Get-Content |
    ConvertFrom-String -Delimiter "," |
    ForEach-Object{
        If( $_.P1 -eq 101 ) {
            $_.P3 = "P3 Replacement Value"
            $_.P5 = "P5 Replacement Value"
        }
        $_.PSObject.Properties.Value -join "," # Output from the loop
    } |
    Set-Content -Path $DestinationFile
}
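The nice thing about this is that you don't need to know how many fields there are in any one file, or even in any one line of a given file. ConvertFrom-String simply adds as many properties as a given line needs. Expanding the values with $_.PSObject.Properties.Value also handles any number of properties with very little code.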
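To make the property-expansion step concrete, here is a minimal sketch. It builds the object by hand as a stand-in for what ConvertFrom-String would emit, partly because ConvertFrom-String ships with Windows PowerShell 5.1 but not with PowerShell 6 and later. The variable names and values are made up for illustration.

```powershell
# Hypothetical stand-in for one parsed row; ConvertFrom-String names its
# properties P1, P2, ... in the same way.
$Row = [pscustomobject]@{ P1 = 101; P2 = 3333; P3 = 35899; P4 = 'BufferC1' }

# Change one field by property name.
$Row.P3 = 'P3 Replacement Value'

# Expand every property value, however many there are, and rejoin them.
$Line = $Row.PSObject.Properties.Value -join ','
$Line
```

Because the join walks whatever properties exist on the object, the same line of code works for rows with 4 fields or 40.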
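The second approach requires knowing the maximum number of columns. For the sake of the example, assume the number of columns varies but never exceeds 8. We can then use the -Header parameter with the Import-Csv command.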
$Header = "P1","P2","P3","P4","P5","P6","P7","P8"
$SourceFolder = "C:\Temp\SourceFolder"
$DestinationFolder = "C:\Temp\destination"
ForEach( $File in Get-ChildItem $SourceFolder -Filter *.txt )
{
    $DestinationFile = Join-Path -Path $DestinationFolder -ChildPath $File.Name
    Import-Csv -Path $File.FullName -Header $Header |
    ForEach-Object{
        If( $_.P1 -eq 101 ) {
            $_.P3 = "P3 Replacement Value"
            $_.P5 = "P5 Replacement Value"
        }
        ($_.PSObject.Properties.Value -join ",").TrimEnd(",") # Output from the loop
    } |
    Set-Content -Path $DestinationFile
}
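Note: I think that even if you don't know the maximum number of fields, you can simply assign a large array to the $Header variable.
Note the .TrimEnd(",") method. Because a given line may have fewer fields than there are elements in the $Header array, Import-Csv still adds the remaining properties and assigns them null values. In turn, this can produce extra trailing commas in the string built by -join.
Warning: if a line has legitimately empty trailing fields, trimming will remove those too, which may be a problem.
Finally, I did find a way to lean further on the *-Csv cmdlets, but it only works with the ConvertFrom-String example: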
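The padding behaviour and the trimming caveat can be seen with a small in-memory sketch. It uses ConvertFrom-Csv, which accepts the same -Header parameter as Import-Csv, so no files are needed. The two sample lines and the five-element header are made up for illustration.

```powershell
# Hypothetical header that is wider than the first sample line.
$Header = 'P1','P2','P3','P4','P5'

# The first line is short; the second really does end in two empty fields.
$Lines = '101,a,b', '102,c,d,,'

# Missing fields come back as null properties, which -join renders as
# empty strings, so both rows end in trailing commas.
$Joined = $Lines |
    ConvertFrom-Csv -Header $Header |
    ForEach-Object { $_.PSObject.Properties.Value -join ',' }

# TrimEnd cannot tell padding from real empty fields.
$Trimmed = $Joined | ForEach-Object { $_.TrimEnd(',') }

$Joined
$Trimmed
```

After trimming, the first line is restored to its original three fields, but the second line has also lost its two genuinely empty trailing fields, which is the situation the warning describes.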
$SourceFolder = "C:\Temp\SourceFolder"
$DestinationFolder = "C:\Temp\destination"
ForEach( $File in Get-ChildItem $SourceFolder -Filter *.txt )
{
    $DestinationFile = Join-Path -Path $DestinationFolder -ChildPath $File.Name
    $File |
    Get-Content |
    ConvertFrom-String -Delimiter "," |
    ForEach-Object{
        If( $_.P1 -eq 101 ) {
            $_.P3 = "P3 Replacement Value"
            $_.P5 = "P5 Replacement Value"
        }
        $_
    } |
    ConvertTo-Csv -NoTypeInformation |
    Select-Object -Skip 1 |
    Set-Content -Path $DestinationFile
}
This does not need the -join expression, but it comes at the cost of an extra Select-Object stage in the pipeline.
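One thing worth checking before relying on this variant is what ConvertTo-Csv does to the values. The sketch below runs the last two pipeline stages on a single hand-built object (a made-up stand-in for a parsed row) so the output format is easy to inspect.

```powershell
# Hypothetical stand-in for one parsed row.
$Row = [pscustomobject]@{ P1 = 101; P2 = 'BufferC1'; P3 = 99 }

# ConvertTo-Csv emits a header line first; Select-Object -Skip 1 drops it.
$Out = $Row | ConvertTo-Csv -NoTypeInformation | Select-Object -Skip 1
$Out
```

With default settings every value is wrapped in double quotes, so the written lines will not be byte-for-byte identical to unquoted input. If that matters, the -join variants above may be the safer choice.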
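It's a pity these files have no headers; headers would make them proper Csv files and far more reliable to work with.
What you can do instead is go through the files line by line and split each line on the delimiter ",".
Edit
The code now uses a regular expression to split each string on commas, except where a comma sits inside a quoted field.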
# create a regex string to split on the delimiter character ',' unless it is inside quotes
$commaUnlessQuoted = ',(?=(?:[^"]*"[^"]*")*[^"]*$)'
Get-ChildItem -Path 'D:\Test' -Filter '*.txt' -File | ForEach-Object {
    $file = $_.FullName
    $data = switch -Regex -File $file {
        '^101,' {
            # first field is '101', replace element index 3 with "Test File" and element index 5 with "Test Data"
            $fields = $_ -split $commaUnlessQuoted
            $fields[3] = "Test File"
            $fields[5] = "Test Data"
            # rejoin the fields with a comma
            $fields -join ','
        }
        '^002,' {
            # first field is '002', replace element index 3 with the current date
            $fields = $_ -split $commaUnlessQuoted
            $fields[3] = (Get-Date).ToLongDateString()
            $fields -join ','
        }
        # you can add as many regex conditions here as you like.
        # default means no conditions above matched, so return the line as-is
        default {$_}
    }
    $data | Set-Content -Path $file -Force
    # copy the modified file to somewhere else
    Copy-Item -Path $file -Destination 'C:\TestFilesModified'
}
Regex details for $commaUnlessQuoted:
,                 Match the character “,” literally
(?=               Assert that the regex below can be matched, starting at this position (positive lookahead)
   (?:            Match the regular expression below
      [^"]        Match any character that is NOT a “"”
         *        Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
      "           Match the character “"” literally
      [^"]        Match any character that is NOT a “"”
         *        Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
      "           Match the character “"” literally
   )*             Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
   [^"]           Match any character that is NOT a “"”
      *           Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
   $              Assert position at the end of the string (or before the line break at the end of the string, if any)
)
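A quick way to convince yourself the pattern behaves as described is to run it on a single line containing a quoted comma. The sample line below is made up for illustration; only the regex comes from the code above.

```powershell
$commaUnlessQuoted = ',(?=(?:[^"]*"[^"]*")*[^"]*$)'

# Hypothetical record whose third field contains a comma inside quotes.
$line = '101,3333,"Buffer,C1",99'

# A comma is a split point only when an even number of quotes follows it,
# which is what the lookahead checks.
$fields = $line -split $commaUnlessQuoted

$fields.Count
$fields[2]
```

The line splits into four fields, and the third one keeps both its embedded comma and its surrounding quotes. A plain -split ',' would have produced five fields instead.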
As per your comment on this question, if you also want non-terminating errors to be thrown as exceptions, simply change these lines:
$data | Set-Content -Path $file -Force
# copy the modified file to somewhere else
Copy-Item -Path $file -Destination 'C:\TestFilesModified'
into
try {
    $data | Set-Content -Path $file -Force -ErrorAction Stop
    # copy the modified file to somewhere else
    Copy-Item -Path $file -Destination 'C:\TestFilesModified' -ErrorAction Stop
}
catch { throw } # just rethrow the exception so it 'bubbles up' to the calling script
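The reason -ErrorAction Stop is needed is that a non-terminating error does not normally reach a catch block. The sketch below shows the difference using Get-Content on a file that is assumed not to exist; the file name and the two flag variables are made up for illustration.

```powershell
# Hypothetical path that is assumed not to exist.
$missing = Join-Path ([System.IO.Path]::GetTempPath()) 'no-such-file-for-this-demo.txt'

# Non-terminating: the error is suppressed and the catch block never runs.
$caughtWithoutStop = $false
try { Get-Content -Path $missing -ErrorAction SilentlyContinue }
catch { $caughtWithoutStop = $true }

# With -ErrorAction Stop the same failure becomes terminating and is caught.
$caughtWithStop = $false
try { Get-Content -Path $missing -ErrorAction Stop }
catch { $caughtWithStop = $true }

$caughtWithoutStop
$caughtWithStop
```

Only the second call reaches its catch block, which is why both Set-Content and Copy-Item in the snippet above carry -ErrorAction Stop.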