Reading a NUL-terminated string from a ByteBuffer

Question

How can I read a NUL-terminated UTF-8 string from a Java ByteBuffer, starting at ByteBuffer#position()?

ByteBuffer b = /* 61 62 63 64 00 31 32 34 00 (hex) */;
String s0 = /* read first string */;
String s1 = /* read second string */;

// `s0` will now contain "abcd" and `s1` will contain "124".

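I have already tried Charsets.UTF_8.decode(b), but that function seems to ignore the current ByteBuffer position and reads all the way to the end of the buffer.

Is there a more idiomatic way to read such a string from a byte buffer than searching for the 0 byte and limiting the buffer to it (or copying the part that holds the string into a separate buffer)?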

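Solution

I don't know of an idiomatic "one-liner" (not surprising, since NUL-terminated strings are not part of the Java specification).

The first thing that comes to mind is to use b.slice().limit(x) to create a lightweight view over just the bytes you need. That beats copying them anywhere, since you may be able to work on the buffer directly.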

ByteBuffer b = ByteBuffer.wrap(new byte[] {0x61,0x62,0x63,0x64,0x00,0x31,0x32,0x34,0x00 });
int i;
while (b.hasRemaining()) {
  ByteBuffer nextString = b.slice(); // View on b with same start position
  for (i = 0; b.hasRemaining() && b.get() != 0x00; i++) {
    // Count to next NUL
  }
  nextString.limit(i); // view now stops before NUL
  CharBuffer s = StandardCharsets.UTF_8.decode(nextString);
  System.out.println(s);
}
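To get the `s0` and `s1` values the question asks for, the same slice-and-limit idea can be wrapped in a small helper. This is only a sketch: the class name `CStringReader` and the method `readCString` are hypothetical names chosen for illustration, and the helper assumes that a buffer with no terminator should simply be decoded up to its limit.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class CStringReader {
    /**
     * Hypothetical helper: decodes the bytes from b.position() up to (but not
     * including) the next NUL byte as UTF-8, and leaves b positioned just
     * after that NUL. If no NUL is found, everything up to b.limit() is decoded.
     */
    static String readCString(ByteBuffer b) {
        ByteBuffer view = b.slice();   // shares the bytes; index 0 is b.position()
        int length = 0;
        while (b.hasRemaining() && b.get() != 0) {
            length++;                  // count the bytes before the terminator
        }
        view.limit(length);            // the view now ends just before the NUL
        return StandardCharsets.UTF_8.decode(view).toString();
    }

    public static void main(String[] args) {
        ByteBuffer b = ByteBuffer.wrap(
                new byte[] {0x61, 0x62, 0x63, 0x64, 0x00, 0x31, 0x32, 0x34, 0x00});
        String s0 = readCString(b);
        String s1 = readCString(b);
        System.out.println(s0); // abcd
        System.out.println(s1); // 124
    }
}
```

Because the scan advances `b` itself, each call picks up where the previous one stopped, so the caller never has to track offsets.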
,

In Java, the char \u0000 (the UTF-8 byte 0, Unicode code point U+0000) is an ordinary character. So read everything (perhaps into an oversized byte array), decode it, and split the resulting String on:

\u0000
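A minimal sketch of this decode-everything-then-split approach follows. The class name `SplitOnNul` and the helper `splitOnNul` are hypothetical, and the sketch assumes the whole remaining buffer holds nothing but NUL-terminated UTF-8 strings.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class SplitOnNul {
    /**
     * Hypothetical helper: decodes all remaining bytes of b as UTF-8 and
     * splits the text on the NUL character (U+0000).
     */
    static String[] splitOnNul(ByteBuffer b) {
        String all = StandardCharsets.UTF_8.decode(b).toString();
        return all.split("\u0000");
    }

    public static void main(String[] args) {
        ByteBuffer b = ByteBuffer.wrap(
                new byte[] {0x61, 0x62, 0x63, 0x64, 0x00, 0x31, 0x32, 0x34, 0x00});
        String[] parts = splitOnNul(b);
        System.out.println(parts[0]); // abcd
        System.out.println(parts[1]); // 124
    }
}
```

Two caveats: `decode` consumes the buffer up to its limit, so nothing is left to read afterwards, and `String.split` drops trailing empty strings, so empty strings at the end of the data are lost unless you call `split("\u0000", -1)`.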

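If you have no fixed positions and must read every byte sequentially, the code gets ugly. One of the creators of C did in fact call the NUL-terminated string a historic mistake.

For the reverse direction there is modified UTF-8, which also encodes the character U+0000 as a multi-byte sequence. It avoids producing a 0 byte for a Java String, typically so that the result can be processed further as a C/C++ NUL-terminated string.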
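The difference is easy to see with `DataOutputStream.writeUTF`, which writes modified UTF-8 preceded by a two-byte length. The sketch below compares it with standard UTF-8 for a string containing U+0000; the class name `ModifiedUtf8Demo` and the helpers `modifiedUtf8` and `hex` are hypothetical names for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ModifiedUtf8Demo {
    /** Encodes s with writeUTF and strips the two-byte length prefix. */
    static byte[] modifiedUtf8(String s) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DataOutputStream data = new DataOutputStream(out)) {
            data.writeUTF(s);
        }
        byte[] withLength = out.toByteArray();
        return Arrays.copyOfRange(withLength, 2, withLength.length);
    }

    /** Formats bytes as space-separated upper-case hex. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte x : bytes) {
            sb.append(String.format("%02X ", x));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) throws IOException {
        String s = "a\u0000b";
        System.out.println(hex(s.getBytes(StandardCharsets.UTF_8))); // 61 00 62
        System.out.println(hex(modifiedUtf8(s)));                    // 61 C0 80 62
    }
}
```

In the modified form U+0000 becomes the two bytes C0 80, so the encoded string contains no 0 byte and a single 0 can safely be appended as a terminator.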

,

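You can do this with replace and split. Decode your bytes to a String, replace each 0 character with a custom delimiter character, then split the string on that delimiter.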

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Jtest {
    public static void main(String[] args) {
        //ByteBuffer b = /* 61 62 63 64 00 31 32 34 00 (hex) */;
        ByteBuffer b = ByteBuffer.allocate(10);

        b.put((byte)0x61);
        b.put((byte)0x62);
        b.put((byte)0x63);
        b.put((byte)0x64);
        b.put((byte)0x00);
        b.put((byte)0x31);
        b.put((byte)0x32);
        b.put((byte)0x34);
        b.put((byte)0x00);
        b.rewind();

        // print the ByteBuffer's backing array
        System.out.println("Original ByteBuffer:  "
                + Arrays.toString(b.array()));

        // Decode everything, map NUL to ';', then split on ';'.
        // Caveat: this breaks if the data itself contains ';'.
        String s = StandardCharsets.UTF_8.decode(b).toString();
        String ss = s.replace((char) 0, ';');
        String[] words = ss.split(";");
        // words[0] is "abcd" and words[1] is "124"
        for (int i = 0; i < words.length; i++) {
            System.out.println(" Word " + i + " = " + words[i]);
        }

    }
}

I'm sure you can do this more efficiently by dropping the replace step and splitting directly on the NUL character.