c# – It should be so obvious, but why does it fail?

I have been coding .NET for years, yet I feel like a n00b. Why does the following code fail?

byte[] a = Guid.NewGuid().ToByteArray(); // 16 bytes in array
string b = new UTF8Encoding().GetString(a);
byte[] c = new UTF8Encoding().GetBytes(b);
Guid d = new Guid(c);    // Throws exception (32 bytes received from c)

Update

I accepted CodeInChaos's answer. It explains why 16 bytes turn into 32 bytes. The answer also states:

the default constructor of UTF8Encoding has error checking disabled

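IMHO, the UTF-8 encoder should throw an exception when asked to decode a byte array that contains invalid bytes into a string. To make the .NET Framework behave properly, the code should be written as follows: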

byte[] a = Guid.NewGuid().ToByteArray();
string b = new UTF8Encoding(false, true).GetString(a);  // Throws exception as expected
byte[] c = new UTF8Encoding(false, true).GetBytes(b);
Guid d = new Guid(c);

Answer

Not every sequence of bytes is a valid UTF-8 encoded string.

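A GUID can contain almost any sequence of bytes. UTF-8, however, has specific rules about which byte sequences are allowed once a byte value is greater than 127, and a Guid will usually not follow those rules.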

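Decoding therefore produces a corrupted string: with error detection off, each invalid sequence is replaced by the replacement character U+FFFD. When you encode that string back into a byte array, every U+FFFD takes three bytes, so you get an array that is longer than 16 bytes, which the Guid constructor does not accept.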
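The growth is easier to see with a fixed input than with a random Guid. The following is a minimal sketch, not code from the question: the class name Utf8RoundTripDemo, the helper RoundTrip, and the two sample arrays are made up for illustration.

```csharp
using System;
using System.Text;

public static class Utf8RoundTripDemo
{
    // Decodes the bytes with the default (non-throwing) UTF8Encoding,
    // then encodes the resulting string back into bytes.
    public static byte[] RoundTrip(byte[] input)
    {
        var lenient = new UTF8Encoding();
        string text = lenient.GetString(input);
        return lenient.GetBytes(text);
    }

    public static void Main()
    {
        byte[] valid = { 0x41, 0x42 };   // "AB", valid UTF-8
        byte[] invalid = { 0x41, 0xFF }; // 0xFF never occurs in valid UTF-8

        Console.WriteLine(RoundTrip(valid).Length);   // 2
        Console.WriteLine(RoundTrip(invalid).Length); // 4: 'A' plus U+FFFD (EF BF BD)
    }
}
```

The single invalid byte comes back as three, so a Guid with several such bytes easily ends up well past 16 bytes after the round trip.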

The documentation for UTF8Encoding.GetString states:

With error detection, an invalid sequence causes this method to throw a ArgumentException. Without error detection, invalid sequences are ignored, and no exception is thrown.

And the default constructor of UTF8Encoding has error checking disabled (don't ask me why):

This constructor creates an instance that does not provide a Unicode byte order mark and does not throw an exception when an invalid encoding is detected.
Note
For security reasons, your applications are recommended to enable error detection by using the constructor that accepts a throwOnInvalidBytes parameter and setting that parameter to true.

You probably want to use Base64 encoding instead of UTF-8. That way you can map any byte sequence to a string and back again.
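A Base64 round trip for a Guid might look like the sketch below. It is one possible way to apply the suggestion, not code from the answer: the class name GuidBase64Demo and the helpers ToText and FromText are hypothetical, while Convert.ToBase64String and Convert.FromBase64String are the standard framework methods.

```csharp
using System;

public static class GuidBase64Demo
{
    // Encodes the 16 raw bytes of the Guid as Base64 text.
    public static string ToText(Guid value)
    {
        return Convert.ToBase64String(value.ToByteArray());
    }

    // Decodes the Base64 text back into the original 16 bytes.
    public static Guid FromText(string text)
    {
        return new Guid(Convert.FromBase64String(text));
    }

    public static void Main()
    {
        Guid original = Guid.NewGuid();
        string text = ToText(original);
        Guid restored = FromText(text);

        Console.WriteLine(text.Length);          // 24 characters for 16 bytes
        Console.WriteLine(original == restored); // True
    }
}
```

Base64 assigns a character sequence to every possible byte value, so no byte is ever invalid and the decoded array always has its original length.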
