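Converting HTML tags in a string to plain text

Problem description

The HTML tags come from an API response string. I need to display the formatted string rather than the raw tags. Here is the code I tried.

HTML string: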

"<span class="st"><em>Bread<\/em> is a staple food,usually by baking. Throughout ... <em>Sourdough<\/em> is a type of <em>bread<\/em> produced by dough using naturally occurring yeasts and lactobacilli. ... List of <em>toast<\/em> dishes<\/span>",

Code I tried

let data = Data(vm.description!.utf8)
if let attributedString = try? NSAttributedString(data: data, options: [.documentType: NSAttributedString.DocumentType.html], documentAttributes: nil) {
    infoDescription.attributedText = attributedString
}

I also tried another approach

extension String {
    var htmlToAttributedString: NSAttributedString? {
        guard let data = data(using: .utf8) else { return nil }
        do {
            return try NSAttributedString(data: data, options: [.documentType: NSAttributedString.DocumentType.html, .characterEncoding: String.Encoding.utf8.rawValue], documentAttributes: nil)
        } catch {
            return nil
        }
    }
    var htmlToString: String {
        return htmlToAttributedString?.string ?? ""
    }
}

Please point out what I am doing wrong or missing. Thanks.

Solution

Use the following String extension to strip the HTML tags from a string.

extension String {
    public var withoutHtml: String {
        guard let data = self.data(using: .utf8) else {
            return self
        }

        let options: [NSAttributedString.DocumentReadingOptionKey: Any] = [
            .documentType: NSAttributedString.DocumentType.html,
            .characterEncoding: String.Encoding.utf8.rawValue
        ]

        guard let attributedString = try? NSAttributedString(data: data, options: options, documentAttributes: nil) else {
            return self
        }

        return attributedString.string
    }
}

Usage

let formattedStr = yourString?.withoutHtml
,

You need to decode the HTML entities first. Only then can your current implementation produce the styled string.

For HTML entity decoding, you can refer to: https://stackoverflow.com/a/30141700/3867033
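
To make the "decode the entities first" step concrete, here is a minimal sketch that uses only Foundation. It is not the implementation from the linked answer: the function name decodeBasicHTMLEntities is hypothetical, and it handles only five named entities, so numeric references such as &#39; are assumed not to occur in the input.

```swift
import Foundation

// Hypothetical helper for illustration only: decodes a handful of named entities.
// A complete decoder would also handle numeric references and the full entity table.
func decodeBasicHTMLEntities(_ input: String) -> String {
    // "&amp;" is replaced last so that "&amp;lt;" becomes "&lt;" rather than "<".
    let replacements: [(entity: String, character: String)] = [
        ("&lt;", "<"),
        ("&gt;", ">"),
        ("&quot;", "\""),
        ("&apos;", "'"),
        ("&amp;", "&"),
    ]
    var result = input
    for (entity, character) in replacements {
        result = result.replacingOccurrences(of: entity, with: character)
    }
    return result
}

let encoded = "&lt;em&gt;Bread&lt;/em&gt; &amp; butter"
let decoded = decodeBasicHTMLEntities(encoded)
// decoded == "<em>Bread</em> & butter"
```

The decoded string contains real tags, so it can then be passed to an HTML-to-attributed-string conversion like the ones shown in the question.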

However, I found that you can achieve the same result with NSAttributedString alone.

let html1 = """
&lt;span class=&quot;st&quot;&gt;&lt;em&gt;Bread&lt;/em&gt; is a staple food,usually by baking. Throughout ... &lt;em&gt;Sourdough&lt;/em&gt; is a type of &lt;em&gt;bread&lt;/em&gt; produced by dough using naturally occurring yeasts and lactobacilli. ... List of &lt;em&gt;toast&lt;/em&gt; dishes&lt;/span&gt;
"""

extension String {
  var toAttributedString: NSAttributedString? {
    return try? NSAttributedString(
      data: Data(utf8),
      options: [.documentType: NSAttributedString.DocumentType.html],
      documentAttributes: nil)
  }
}

let output1 = html1.toAttributedString!.string
let output2 = output1.toAttributedString

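It feels a bit magical to me too, but it does work: the first pass decodes the entities into real tags, and the second pass renders those tags.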

,

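I used a Scanner to convert the HTML text to plain text, and it works well.

This function strips the text between "<" and ">", that is, the tags themselves along with their delimiters.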

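import Foundation

// Note: scanUpToString(_:) returning String? requires iOS 13 / macOS 10.15 or later.
func stripHTML(fromString rawString: String) -> String {
    let scanner = Scanner(string: rawString)
    var convertedString = rawString
    while !scanner.isAtEnd {
        // Skip ahead to the next "<"; the text before it is kept.
        _ = scanner.scanUpToString("<")
        // Scan from "<" up to (but not including) ">", e.g. "<em".
        if let text = scanner.scanUpToString(">") {
            // Remove every occurrence of that tag, closing ">" included.
            convertedString = convertedString.replacingOccurrences(of: "\(text)>", with: "")
        }
    }
    return convertedString
}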

Check here for a detailed explanation of how Scanner works.

Use it as shown in the code below. Enjoy :)

let normalText = stripHTML(fromString: yourHtmlText)
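
For comparison, the same "remove everything between < and >" idea can be sketched with a regular expression instead of a Scanner. This is a swapped-in alternative, not the approach from the answer above. The name stripTagsWithRegex is hypothetical, and the pattern assumes simple markup: it does not decode entities, and it would misfire on a ">" inside an attribute value or on plain text such as "a < b and c > d".

```swift
import Foundation

// Hypothetical alternative to the Scanner loop: delete anything that looks like a tag.
func stripTagsWithRegex(_ html: String) -> String {
    // "<[^>]+>" matches "<", then one or more characters other than ">", then ">".
    return html.replacingOccurrences(of: "<[^>]+>", with: "", options: .regularExpression)
}

let sample = "<span class=\"st\"><em>Bread</em> is a staple food</span>"
let plain = stripTagsWithRegex(sample)
// plain == "Bread is a staple food"
```

Like the Scanner version, this runs without UIKit, so it can be used off the main thread or in a command-line tool.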