Problem description
How can I read a NUL-terminated UTF-8 string from a Java ByteBuffer, starting at ByteBuffer#position()?
ByteBuffer b = /* 61 62 63 64 00 31 32 34 00 (hex) */;
String s0 = /* read first string */;
String s1 = /* read second string */;
// `s0` will now contain "abcd" and `s1` will contain "124".
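I have already tried Charsets.UTF_8.decode(b), but this function seems to ignore the ByteBuffer's current position and keeps reading until the end of the buffer.
Is there a more idiomatic way to read such a string from a byte buffer than searching for the byte containing 0 and limiting the buffer to it (or copying the part that holds the string into a separate buffer)?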
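Solution
I am not aware of an idiomatic "one-liner" (which is not surprising, since NUL-terminated strings are not part of the Java specification).
The first thing that comes to mind is to use b.slice().limit(x) to create a lightweight view over just the bytes you need. This is better than copying them anywhere, since you may be able to work with the buffer directly.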
ByteBuffer b = ByteBuffer.wrap(new byte[] {0x61, 0x62, 0x63, 0x64, 0x00, 0x31, 0x32, 0x34, 0x00});
int i;
while (b.hasRemaining()) {
    ByteBuffer nextString = b.slice(); // View on b with the same start position
    for (i = 0; b.hasRemaining() && b.get() != 0x00; i++) {
        // Count bytes up to the next NUL (the NUL itself is consumed from b)
    }
    nextString.limit(i); // The view now stops just before the NUL
    CharBuffer s = StandardCharsets.UTF_8.decode(nextString);
    System.out.println(s);
}
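The loop above can also be packaged as a small reusable helper. The following is only a sketch under the same assumptions as the snippet above; the class name `CStrings` and the method name `readCString` are made up for illustration and are not part of any Java API. It returns a `String` and leaves the buffer's position just past the terminating NUL, so consecutive calls read consecutive strings.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class CStrings {
    /**
     * Hypothetical helper: decodes the bytes from b.position() up to (not
     * including) the next NUL as UTF-8 and leaves the position just past
     * that NUL. If no NUL is found, everything up to the limit is decoded.
     */
    static String readCString(ByteBuffer b) {
        ByteBuffer view = b.slice(); // shares the bytes, has its own position and limit
        int len = 0;
        while (b.hasRemaining() && b.get() != 0) {
            len++; // count the bytes that precede the NUL
        }
        view.limit(len); // the view now covers exactly the string bytes
        return StandardCharsets.UTF_8.decode(view).toString();
    }

    public static void main(String[] args) {
        ByteBuffer b = ByteBuffer.wrap(
                new byte[] {0x61, 0x62, 0x63, 0x64, 0x00, 0x31, 0x32, 0x34, 0x00});
        String s0 = readCString(b);
        String s1 = readCString(b);
        System.out.println(s0); // abcd
        System.out.println(s1); // 124
    }
}
```

No bytes are copied until the decode step, which matches the idea of working on a view rather than on a separate buffer.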
In Java, the char \u0000 (the UTF-8 byte 0, the Unicode code point U+0000) is an ordinary character. So read everything (perhaps into an oversized byte array), decode it, and then split the resulting string on \u0000.
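If you do not have fixed positions and must read every byte sequentially, the code gets ugly. One of the creators of C did indeed call the NUL-terminated string a historic mistake.
Conversely, to avoid producing a UTF-8 byte 0 for a Java String, typically because it will be processed further as a C/C++ NUL-terminated string, there is modified UTF-8, which encodes U+0000 as a two-byte sequence instead of a single 0 byte.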
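To illustrate the difference, here is a minimal sketch that encodes the same string with standard UTF-8 and with modified UTF-8. It relies on `DataOutputStream.writeUTF`, which writes a two-byte length prefix followed by the modified UTF-8 bytes. The class name `ModifiedUtf8Demo` and the helper `modifiedUtf8` are hypothetical names chosen for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ModifiedUtf8Demo {
    /** Hypothetical helper: encodes s with writeUTF and strips the 2-byte length prefix. */
    static byte[] modifiedUtf8(String s) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            new DataOutputStream(out).writeUTF(s);
            byte[] all = out.toByteArray();
            return Arrays.copyOfRange(all, 2, all.length); // drop the length prefix
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String s = "a\u0000b";
        // Standard UTF-8 keeps the NUL character as a single 0 byte.
        System.out.println(Arrays.toString(s.getBytes(StandardCharsets.UTF_8))); // [97, 0, 98]
        // Modified UTF-8 writes it as 0xC0 0x80, so no 0 byte appears in the output.
        System.out.println(Arrays.toString(modifiedUtf8(s))); // [97, -64, -128, 98]
    }
}
```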
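You can do this with replace and split. Decode your bytes into a String, replace each 0 character with a custom character, and then split the string on that custom character.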
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Jtest {
    public static void main(String[] args) {
        // ByteBuffer b = /* 61 62 63 64 00 31 32 34 00 (hex) */;
        ByteBuffer b = ByteBuffer.allocate(10);
        b.put((byte) 0x61);
        b.put((byte) 0x62);
        b.put((byte) 0x63);
        b.put((byte) 0x64);
        b.put((byte) 0x00);
        b.put((byte) 0x31);
        b.put((byte) 0x32);
        b.put((byte) 0x34);
        b.put((byte) 0x00);
        // flip() sets the limit to the number of bytes written and the position to 0,
        // so the unused tenth byte is not decoded (rewind() would leave the limit at 10)
        b.flip();

        // Print the backing array (this includes the unused trailing byte)
        System.out.println("Original ByteBuffer: " + Arrays.toString(b.array()));

        // Decode everything, turn each NUL into ';' and split on it.
        // Note: this breaks if the data itself contains ';'.
        String s = StandardCharsets.UTF_8.decode(b).toString();
        String ss = s.replace((char) 0, ';');
        String[] words = ss.split(";");

        // words[0] is "abcd" and words[1] is "124"
        for (int i = 0; i < words.length; i++) {
            System.out.println(" Word " + i + " = " + words[i]);
        }
    }
}
I believe you can do this more efficiently by dropping the replace step.
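One way to drop the replace step, sketched here as an assumption rather than as the answerer's own code, is to split directly on the NUL character. The class name `SplitOnNul` and the method `splitOnNul` are hypothetical names used only for this example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class SplitOnNul {
    /** Hypothetical helper: decodes the remaining bytes of b as UTF-8 and splits at each NUL. */
    static String[] splitOnNul(ByteBuffer b) {
        String all = StandardCharsets.UTF_8.decode(b).toString();
        // The separator is a one-character string containing NUL, so no placeholder is needed.
        return all.split("\u0000");
    }

    public static void main(String[] args) {
        ByteBuffer b = ByteBuffer.wrap(
                new byte[] {0x61, 0x62, 0x63, 0x64, 0x00, 0x31, 0x32, 0x34, 0x00});
        System.out.println(Arrays.toString(splitOnNul(b))); // [abcd, 124]
    }
}
```

This avoids the problem of the placeholder character appearing in the data. Note that `String.split` drops trailing empty strings, so empty strings at the end of the buffer are not reported, and the whole buffer is still decoded in one go rather than string by string.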