Is there a bug in the DateTime.Parse method when UTC is used with a time zone offset?

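Problem description

The Microsoft documentation states that, among other string representations of the DateTime structure, DateTime.Parse(string) conforms to ISO 8601 and accepts Coordinated Universal Time (UTC) formats with time zone offsets.

When I run the following unit test cases, DateTime.Parse(string) and DateTime.TryParse(string, out DateTime) accept one additional character at the end of the source string. This looks like a bug. When more than one additional character is appended, the methods correctly fail to parse the string.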

[Theory]
[InlineData("2020-5-7T09:37:00.0000000-07:00")]
[InlineData("2020-5-7T09:37:00.0000000-07:00x")]
[InlineData("2020-5-7T09:37:00.0000000-07:00xx")]
[InlineData("2020-5-7T09:37:00.0000000Z")]
[InlineData("2020-5-7T09:37:00.0000000Zx")]
public void TestParse(string source)
{
    DateTime dt = DateTime.Parse(source);
    Assert.True(dt != null);
    bool b = DateTime.TryParse(source,out dt);
    Assert.True(b);
}

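I wrote this unit test case as a simplification of my code, to illustrate the behavior I am seeing (I recognize that expected failures should be handled differently).

Tests 1 and 2 pass, and the third test (with the "xx" suffix) fails. It seems to me that the second test (with the "x" suffix) should fail as well.

When no numeric time zone offset is supplied (the strings ending in Z), test 4 passes and test 5 fails. This appears to be the correct behavior.

I would like to know whether anyone else has run into this and, if so, whether it is generally agreed to be a bug.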

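Solution

Personally, this looks like a bug to me...

2020-5-7T09:37:00.0000000-07:00x is a valid ISO 8601 date followed by one invalid arbitrary character, yet it parses without an error, whereas two arbitrary characters correctly cause a failure.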

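What appears to be happening is the following:

1. While parsing in ParseISO8601, the parser runs through ParseTimeZone.
2. ParseTimeZone consumes the time zone, -07:00.
3. That leaves str with its index on the last index of the string.
4. It then checks str.Match('#'), which pre-increments the index (I believe this is where the bug lies).
5. At that point the subsequent checks assume parsing is complete, because the index is at the end of the string.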

Excerpt from Match

(comments are mine)

internal bool Match(char ch)
{
    if (++Index >= Length) // reaches the end of the string here
    {
        return false;
    }
    ...
}

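The method above returns false, but it has still incremented the index! That seems very odd in the scheme of things.

It looks as though the code is designed to check the current character for a hash, then for a null terminator, and to fail if anything else remains.

Excerpt from ParseISO8601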

str.SkipWhiteSpaces();
if (str.Match('#'))  // mine... check hash ... no
{
   if (!VerifyValidPunctuation(ref str))
   {
      result.SetFailure(ParseFailureKind.Format,"Format_BadDateTime",null);
      return false;
   }
   str.SkipWhiteSpaces();
}
if (str.Match('\0')) // mine... check null termination ... no
{
   if (!VerifyValidPunctuation(ref str))
   {
      result.SetFailure(ParseFailureKind.Format,null);
      return false;
   }
}
if (str.GetNext()) // mine... get anything else,if found fail
{
   // If this is true,there were non-white space characters remaining in the DateTime
   result.SetFailure(ParseFailureKind.Format,null);
   return false;
}

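Note: this appears to happen in every framework I have tested.

In short, it completely ignores a single arbitrary character at the end of the date. If two characters remain, however, it behaves as expected, because the Match checks then decrement the index again.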

internal bool Match(char ch) {
    if (++Index >= len) {
        return (false);
    }
    if (Value[Index] == ch) {
        m_current = ch;
        return (true);
    }
    Index--;
    return (false);
}
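
To make the index arithmetic easier to follow, here is a minimal sketch that models the cursor outside the runtime. It is not the runtime's code: CursorModel, TrailingCheckModel, Accepts and AcceptsWithSuffix are hypothetical names, the punctuation handling is left out, GetNext is assumed to simply advance and report whether a character is available, and the cursor is assumed to sit one position past the final offset digit once the time zone has been parsed.

```csharp
using System;

// Hypothetical stand-in for the runtime's internal string cursor; only the
// members needed to show the index behavior are modeled here.
public sealed class CursorModel
{
    private readonly string _value;
    public int Index;

    public CursorModel(string value, int index)
    {
        _value = value;
        Index = index;
    }

    // Same shape as the Match excerpt: pre-increment, and only undo the
    // increment when the next character exists but does not match.
    public bool Match(char ch)
    {
        if (++Index >= _value.Length)
        {
            return false; // Index is left advanced on this path
        }
        if (_value[Index] == ch)
        {
            return true;
        }
        Index--;
        return false;
    }

    // Assumed behavior: advance, then report whether a character is available.
    public bool GetNext()
    {
        return ++Index < _value.Length;
    }
}

public static class TrailingCheckModel
{
    // Mirrors the three trailing checks from the ParseISO8601 excerpt, with
    // the punctuation handling omitted. Returns true when input is accepted.
    public static bool Accepts(string value, int indexAfterTimeZone)
    {
        var str = new CursorModel(value, indexAfterTimeZone);
        str.Match('#');
        str.Match('\0');
        return !str.GetNext();
    }

    // Assumption: after the offset is parsed, the cursor sits one position
    // past the final offset digit, i.e. on the first trailing character.
    public static bool AcceptsWithSuffix(string suffix)
    {
        const string valid = "2020-5-7T09:37:00.0000000-07:00";
        return Accepts(valid + suffix, valid.Length);
    }
}
```

Under these assumptions, AcceptsWithSuffix("") and AcceptsWithSuffix("x") both return true, while AcceptsWithSuffix("xx") returns false, which lines up with tests 1 to 3 in the question. With one trailing character the first Match runs off the end and never restores the index. With two, each Match restores the index and GetNext then finds the leftover character.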

Update 1

I have reported this as a runtime bug on GitHub, where it can be tracked:

DateTime.Parse ISO 8601 allowing extra arbitrary character at the end of the date #46477

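Update 2

This has been officially marked as a runtime bug, and a fix has been made for .NET 6.

I have not had time to build the runtime to check what it is actually doing, but from the comments it appears that, when the time zone is pulled out, the current index is incremented one position too far, and Match then incorrectly checks the next character.

"There is a PR out to fix this. The fix should be in a future .NET release (6.0). Thanks for reporting the issue."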
