Golang Series: Regular Expressions

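Last time we wrote a concurrent program that fetched pages from several sites. It used a regular expression to extract each site's domain name, which then became the name of the local file the page was saved to. Today we'll go over the most commonly used regular expression functions and methods.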

First of all, everything related to regular expressions in Go lives in the regexp package, so you need to import it before use.

Let's start with the simplest possible example:

// regexp.go

package main

import (
    "fmt"
    "regexp"
)

func main() {
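    // Report whether the string matches the pattern
    isMatch, err := regexp.MatchString("^hel+o", "hello world")
    if err != nil {
        fmt.Println("invalid pattern:", err)
        return
    }

    fmt.Println(isMatch) // true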
}

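The code above calls the package-level function regexp.MatchString(pattern, s) directly. It reports whether the string contains a match for the pattern, which here means: starts with hel+o, where l appears one or more times. The second return value is an error that is non-nil when the pattern cannot be parsed.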
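To make the role of the ^ anchor concrete, here is a small sketch that runs the same pattern with and without it. The helper name matches is made up for this illustration, and treating an invalid pattern as "no match" is a simplification, not something the article prescribes.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches is a hypothetical helper for this sketch: it reports whether s
// contains a match for pattern, and treats an invalid pattern as "no match".
func matches(pattern, s string) bool {
	ok, err := regexp.MatchString(pattern, s)
	if err != nil {
		return false
	}
	return ok
}

func main() {
	// Anchored: the match must begin at the start of the string.
	fmt.Println(matches("^hel+o", "hello world"))     // true
	fmt.Println(matches("^hel+o", "say hello world")) // false

	// Unanchored: a match anywhere in the string is enough.
	fmt.Println(matches("hel+o", "say hello world")) // true

	// l+ requires at least one l.
	fmt.Println(matches("^hel+o", "heo world")) // false
}
```

Without the anchor, MatchString succeeds as soon as the pattern matches any substring, so the anchor is what turns "contains" into "starts with".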

We can also compile the regular expression up front; the compiled value can then be reused many times later in the program:

// regexp.go

package main

import (
    "fmt"
    "regexp"
)

func main() {
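    // Compile the regular expression once
    re, err := regexp.Compile("^hel+o")
    if err != nil {
        fmt.Println("invalid pattern:", err)
        return
    }

    // Report whether the string matches
    isMatch := re.MatchString("hello world")

    fmt.Println(isMatch) // true

    // Get the text of the match
    result := re.FindString("hello world")

    fmt.Println(result) // hello

    // Get the start index and the end index (exclusive) of the match
    indexResult := re.FindStringIndex("hello world")

    fmt.Println(indexResult) // [0 5]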
}

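Note that in this program the MatchString method only needs the target string. We also use FindString to get the leftmost substring matched by the pattern, and FindStringIndex to get the index where that match starts and the index just past where it ends.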
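Because the index pair is a half-open range, it can be used directly to slice the original string, and a nil result signals that nothing matched. The sketch below illustrates both points; the names helloRE and firstMatch are invented for this example, and it uses regexp.MustCompile (which behaves like Compile but panics on an invalid pattern) only to keep the setup short.

```go
package main

import (
	"fmt"
	"regexp"
)

// helloRE is a package-level compiled pattern used only by this sketch.
var helloRE = regexp.MustCompile("hel+o")

// firstMatch is a hypothetical helper: it returns the leftmost match in s by
// slicing s with the [start, end) pair from FindStringIndex.
func firstMatch(s string) (string, bool) {
	loc := helloRE.FindStringIndex(s)
	if loc == nil {
		// No match: FindStringIndex returns nil.
		return "", false
	}
	return s[loc[0]:loc[1]], true
}

func main() {
	text, ok := firstMatch("say hello world")
	fmt.Printf("%q %v\n", text, ok) // "hello" true

	text, ok = firstMatch("goodbye")
	fmt.Printf("%q %v\n", text, ok) // "" false

	// FindAllString returns every non-overlapping match; -1 means no limit.
	fmt.Println(helloRE.FindAllString("hello, helllo", -1)) // [hello helllo]
}
```

Checking the index result for nil is one way to tell "no match" apart from a match that is the empty string, which FindString alone cannot do.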

Of course, we can also match with capture groups:

// regexp.go

package main

import (
    "fmt"
    "regexp"
)

func main() {
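    // A pattern with one capture group
    re, _ := regexp.Compile("h(\\w+)o")

    // Get the whole match followed by the capture group
    subResult := re.FindStringSubmatch("hello world")

    fmt.Println(subResult) // [hello ell]

    // Get the index pairs of the whole match and of the capture group
    indexResult := re.FindStringSubmatchIndex("hello world")

    fmt.Println(indexResult) // [0 5 1 4]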
}

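The program returns the text of the whole match and of the capture group, along with an index pair for each: [0 5] for the whole match and [1 4] for the group.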
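Capture groups are what make a task like extracting a site's domain name straightforward. The earlier article's actual pattern is not shown here, so the pattern and the names hostRE and hostOf below are assumptions made for illustration, not the original code. The sketch uses regexp.MustCompile, which behaves like Compile but panics on an invalid pattern.

```go
package main

import (
	"fmt"
	"regexp"
)

// hostRE is an assumed pattern for this sketch: an http or https scheme,
// followed by a capture group holding everything up to the first "/" or ":".
var hostRE = regexp.MustCompile(`^https?://([^/:]+)`)

// hostOf is a hypothetical helper that returns the captured host part.
func hostOf(rawURL string) (string, bool) {
	m := hostRE.FindStringSubmatch(rawURL)
	if m == nil {
		// No match at all: FindStringSubmatch returns nil.
		return "", false
	}
	// m[0] is the whole match, m[1] is the first capture group.
	return m[1], true
}

func main() {
	host, ok := hostOf("https://golang.org/pkg/regexp/")
	fmt.Println(host, ok) // golang.org true

	host, ok = hostOf("http://example.com:8080/index.html")
	fmt.Println(host, ok) // example.com true

	_, ok = hostOf("not a url")
	fmt.Println(ok) // false
}
```

For real URL handling the standard net/url package is usually the more robust choice; the regular expression here only serves to show how a group is read back from the result slice.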

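All of the snippets above call regexp.Compile(pattern) to compile the pattern. We can also call regexp.MustCompile(pattern) to do the same thing. The difference is that the latter returns a single value and panics if the pattern is invalid, which makes it suitable for initializing a package-level variable. Go has no regexp constants, but such a variable can be used much like one: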

// regexp.go

package main

import (
    "fmt"
    "regexp"
)

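// MustCompile returns a single value (and panics on an invalid pattern),
// so it can initialize a package-level variable
var RE = regexp.MustCompile("Jack")

func main() {
    // Target string
    target := "Jack and Jack's friend"

    // Replace every match
    result := RE.ReplaceAllString(target, "John")

    fmt.Println(result) // John and John's friend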
}

In the code above we defined a package-level compiled regexp and then called ReplaceAllString to replace every match in the target string.
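The replacement string passed to ReplaceAllString is not limited to fixed text: it may also refer back to capture groups with ${1}, ${2}, and so on. The following sketch rearranges dates that way; the pattern, the date format, and the names dateRE and reorderDates are all invented for this illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// dateRE is an assumed pattern for this sketch: year, month, and day in
// three capture groups, separated by hyphens.
var dateRE = regexp.MustCompile(`(\d{4})-(\d{2})-(\d{2})`)

// reorderDates is a hypothetical helper that rewrites every YYYY-MM-DD date
// in s as DD/MM/YYYY by referring to the capture groups in the replacement.
func reorderDates(s string) string {
	return dateRE.ReplaceAllString(s, "${3}/${2}/${1}")
}

func main() {
	fmt.Println(reorderDates("released 2024-01-15 and 2023-12-31"))
	// released 15/01/2024 and 31/12/2023

	// ReplaceAllLiteralString inserts the replacement verbatim,
	// without expanding $ references.
	fmt.Println(dateRE.ReplaceAllLiteralString("on 2024-01-15", "${1}"))
	// on ${1}
}
```

The braces are worth keeping: in the unbraced form the name is taken to be as long as possible, so $1x is read as ${1x} rather than ${1}x, and a reference to a group that does not exist expands to an empty string.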

That's all on regular expressions for now. There is a lot more to cover, which I'll summarize in later posts.
