Regex: strip selected query string attribute/value pairs so Varnish does not vary the cache by them

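My goal is to whitelist certain query string attributes and their values (that is, have Varnish ignore them) so that Varnish does not vary the cache between URLs that differ only in those attributes.

Example: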

Url 1: http://foo.com/someproduct.html?utm_code=google&type=hello  
Url 2: http://foo.com/someproduct.html?utm_code=yahoo&type=hello  
Url 3: http://foo.com/someproduct.html?utm_code=yahoo&type=goodbye

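In the example above, I want to whitelist "utm_code" but not "type". So after the first URL has been hit, I want Varnish to serve the cached content for the second URL.

In the case of the third URL, however, the value of the "type" attribute is different, so it should be a Varnish cache miss.

I have tried the following two approaches (found in a Drupal help article that I can no longer find), but they do not seem to work, possibly because my regex is wrong.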

# 1. strip out certain querystring values so varnish does not vary cache.
set req.url = regsuball(req.url,"([\?|&])utm_(campaign|content|medium|source|term)=[^&\s]*&?","\1");
# get rid of trailing & or ?
set req.url = regsuball(req.url,"[\?|&]+$","");

# 2. strip out certain querystring values so varnish does not vary cache.
set req.url = regsuball(req.url,"([\?|&])utm_campaign=[^&\s]*&?","\1");
set req.url = regsuball(req.url,"([\?|&])foo_bar=[^&\s]*&?","([\?|&])bar_baz=[^&\s]*&?","");

I figured it out and wanted to share. I found this code, which defines a subroutine that does what I need.

sub vcl_recv {

    # strip out certain querystring params that varnish should not vary cache by
    call normalize_req_url;

    # snip a bunch of other code
}

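sub normalize_req_url {

    # Strip Google Analytics campaign variables and similar tracking
    # parameters. They are only needed by the JavaScript running on the page:
    # utm_source, utm_medium, utm_campaign, gclid, ...
    if (req.url ~ "(\?|&)(gclid|cx|ie|cof|siteurl|zanpid|origin|utm_[a-z]+|mr:[A-Za-z]+)=") {
        # Remove matches that follow another parameter ("&name=value").
        # Anchoring on "&" keeps "ie" from matching the tail of, say, "movie".
        set req.url = regsuball(req.url, "&(gclid|cx|ie|cof|siteurl|zanpid|origin|utm_[a-z]+|mr:[A-Za-z]+)=[^&]*", "");
        # Remove a match that is the first parameter ("?name=value&").
        set req.url = regsub(req.url, "\?(gclid|cx|ie|cof|siteurl|zanpid|origin|utm_[a-z]+|mr:[A-Za-z]+)=[^&]*&?", "?");
    }
    # Drop a "?" (or "?&") left dangling at the end of the URL.
    set req.url = regsub(req.url, "\?&?$", "");
}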
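
To check that the normalization behaves as intended for the three example URLs, it can help to expose the lookup URL and the hit/miss status in the response. The following is only a sketch: the header names `X-Normalized-Url` and `X-Cache` are made up for illustration, and it assumes the default `vcl_hash`, which hashes on `req.url`.

```vcl
sub vcl_deliver {
    # Hypothetical debug header: req.url as it looks after vcl_recv has
    # normalized it, which is the value the default vcl_hash uses.
    set resp.http.X-Normalized-Url = req.url;

    # obj.hits is 0 when the object was just fetched from the backend.
    # X-Cache is a hypothetical header name.
    if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT";
    } else {
        set resp.http.X-Cache = "MISS";
    }
}
```

With this in place, requesting Url 1 and then Url 2 should report a miss followed by a hit, both with `/someproduct.html?type=hello` as the normalized URL, while Url 3 should report a miss the first time it is requested.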
