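Problem description
Given the following string:
str = "\\u20ac €"
how can it be decoded to € €?
Using str.encode("utf-8").decode("unicode-escape") returns € â\x82¬
(To clarify, I am looking for a general solution that decodes any combination of Unicode characters and escape sequences.)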
Solution
If the string always has this format, use .split:
string = "\\u20ac €"
escaped_unicode,non_escaped_unicode = string.split()
output = '{} {}'.format(escaped_unicode.encode("utf-8").decode("unicode-escape"),non_escaped_unicode)
print(output)
# € €
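Otherwise, we need to get more creative. I think the most general solution is still to use split, and then use a regex to decide whether a word contains an escaped Unicode sequence that needs decoding (assuming the input is sane enough not to mix Unicode characters and escaped Unicode in the same "word"):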
import re
string = "ac ab \\u20ac cdef €"
regex = re.compile(r'([\u0000-\u007F]+)')
output = []
for word in string.split():
    match = regex.search(word)
    if match:
        try:
            output.append(match[0].encode("utf-8").decode("unicode-escape"))
        except UnicodeDecodeError:
            # assuming the string contained a literal \\u or anything else
            # that decode("unicode-escape") could not handle, so adding to output as is
            output.append(word)
    else:
        output.append(word)
print(' '.join(output))
# ac ab € cdef €
A simple and fast solution is to use re.sub to match \u followed by exactly four hexadecimal digits, and convert those digits into a Unicode code point.
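The answer's original snippet did not survive on this page, so the following is only a minimal sketch of the substitution approach, assuming the goal is to replace each literal \u plus four hex digits and leave everything else untouched. The names ESCAPE_RE and decode_u_escapes are hypothetical and chosen for illustration.

```python
import re

# A literal backslash, the letter u, and exactly four hexadecimal digits.
ESCAPE_RE = re.compile(r'\\u([0-9a-fA-F]{4})')


def decode_u_escapes(text):
    # Turn the four captured hex digits into an integer code point,
    # then into the corresponding character. All other text is kept as is.
    return ESCAPE_RE.sub(lambda m: chr(int(m.group(1), 16)), text)


string = "\\u20ac €"
print(decode_u_escapes(string))
# € €
```

Because only the matched escape sequences are rewritten, characters that are already decoded (such as the second €) never pass through encode/decode and so cannot be mangled. This sketch does not handle other escapes such as \n or \x41, and a surrogate pair written as two \u escapes would come out as two separate surrogate code points rather than one character.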