iPhone - Reg Exp for URL validity

Question

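I have a chat view in which users can send each other URLs. When a message contains a URL, I want the user to be able to tap the link and open a web view. I'm using IFTweetLabel, which uses RegexKitLite. Currently the only supported case is a URL that starts with http/https. I'd like to support links without the http prefix, for example www.nytimes.com, or even without the "www", such as nytimes.com (and other extensions). This is the regular expression for the http/https prefix:
@"([hH][tT][tT][pP][sS]?:\\/\\/[^ ,'\">\\]\\)]*[^\\. ,'\">\\]\\)])"
Can someone tell me which other regular expression I need to cover these additional cases? I tried using this, but adding it to my Objective-C code caused a lot of problems. Thanks.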

Answers

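This will match both http://example.org and www.example.org.
@"(([hH][tT][tT][pP][sS]?:\\/\\/|www\\.)[^ ,'\">\\]\\)]*\\.[^\\. ,'\">\\]\\)]{2,6})"
Note that I added an extra match group, so check the match/search results your regex library returns to make sure the right group ends up in the right place. It would be easier if you could post the whole code snippet. Explanation of the regular expression: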
(
    (
        [hH][tT][tT][pP][sS]?:\/\/    # Match http:// or https:// in any letter case (even hTtP)
        |                             # OR
        www\.                         # "www" followed by a literal dot
    )
    [^ ,'">\]\)]*                     # Match zero or more characters that are none of: space, comma, apostrophe, quotation mark, greater-than sign, right square bracket, right parenthesis
    \.                                # Match a literal dot
    [^\. ,'">\]\)]{2,6}               # Match 2 to 6 characters that are none of: dot, space, comma, apostrophe, quotation mark, greater-than sign, right square bracket, right parenthesis
)
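
To make the explanation above concrete, here is a minimal sketch that runs the same pattern through Foundation's NSRegularExpression instead of RegexKitLite. It assumes iOS 4.0 or later (or OS X 10.7 or later); the helper name FindLinks and the sample message are hypothetical and not part of the original answer.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns every substring of `text` matched by the
// http/https/www pattern from the answer above.
static NSArray *FindLinks(NSString *text) {
    // Same pattern as above, escaped for an Objective-C string literal.
    NSString *pattern =
        @"(([hH][tT][tT][pP][sS]?:\\/\\/|www\\.)[^ ,'\">\\]\\)]*\\.[^\\. ,'\">\\]\\)]{2,6})";

    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Could not compile pattern: %@", error);
        return [NSArray array];
    }

    NSMutableArray *links = [NSMutableArray array];
    NSArray *matches = [regex matchesInString:text
                                      options:0
                                        range:NSMakeRange(0, [text length])];
    for (NSTextCheckingResult *match in matches) {
        // [match range] covers the whole match; the inner group (the
        // "http://" or "www." prefix) is at [match rangeAtIndex:2].
        [links addObject:[text substringWithRange:[match range]]];
    }
    return links;
}

int main(void) {
    @autoreleasepool {
        NSLog(@"%@", FindLinks(@"Read www.nytimes.com or http://example.org today"));
    }
    return 0;
}
```

Two limitations are worth keeping in mind. The pattern still requires an http, https, or www prefix, so a bare nytimes.com is not matched. And because only 2 to 6 characters are allowed after the last dot, a URL with a longer path after that dot may be cut short.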
Here is John Gruber's URL-matching regular expression:
(?i)\b(?:[a-z][\w-]+:(?:/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’])
And here is a regular expression I put together by mixing a few other regexes with a lot of Gruber's:
(?i)\b(?:(?:[a-z][\w\-]+://(?:\S+?(?::\S+?)?\@)?)|(?:(?:[a-z0-9\-]+\.)+[a-z]{2,4}))(?:[^\s()<>]+|\((?:[^\s()<>]+|(?:\([^\s()<>]*\)))*\))*(?<![\s`!()\[\]{};:'".,<>?«»“”‘’])
The following sample program uses RegexKitLite to show what each regex matches in this sample text:

    Did you see http://www.stackoverflow.com?  Or http://www.stackoverflow.com/?

    And then there is www.stackoverflow.com/, along with www.stackoverflow.com/index.

    Maybe something like stackoverflow.com with extra stackoverflow.com?  Or "stackoverflow.com"?

    Perhaps jobs.stackoverflow.com, or 'http://twitter.com/#!/CHOCKENBERRY', the CHOCKLOCK!!

    File @file:///Users/johne/rkl/rkl.html#RegexKitLiteCookbook?

    Maybe http://www.yahoo.com/index///i.html!  http://www.yahoo.com/////xyz.html?!

The code:
#import <Foundation/Foundation.h>
#import "RegexKitLite.h"

int main(int argc, char *argv[]) {
  NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];

  // Combined regex from this answer, escaped as an Objective-C string literal.
  NSString *urlRegex = @"(?i)\\b(?:(?:[a-z][\\w\\-]+://(?:\\S+?(?::\\S+?)?\\@)?)|(?:(?:[a-z0-9\\-]+\\.)+[a-z]{2,4}))(?:[^\\s()<>]+|\\((?:[^\\s()<>]+|(?:\\([^\\s()<>]*\\)))*\\))*(?<![\\s`!()\\[\\]{};:'\".,<>?«»“”‘’])";

  // John Gruber's URL matching regex from http://daringfireball.net/2010/07/improved_regex_for_matching_urls
  NSString *gruberURLRegex = @"(?i)\\b(?:[a-z][\\w-]+:(?:/{1,3}|[a-z0-9%])|www\\d{0,3}[.]|[a-z0-9.\\-]+[.][a-z]{2,4}/)(?:[^\\s()<>]+|\\(([^\\s()<>]+|(\\([^\\s()<>]+\\)))*\\))+(?:\\(([^\\s()<>]+|(\\([^\\s()<>]+\\)))*\\)|[^\\s`!()\\[\\]{};:'\".,<>?«»“”‘’])";

  // Sample text: URLs with and without a scheme or "www." prefix, followed by punctuation.
  NSString *urlString = @"Did you see http://www.stackoverflow.com?  Or http://www.stackoverflow.com/?\n\nAnd then there is www.stackoverflow.com/, along with www.stackoverflow.com/index.\n\nMaybe something like stackoverflow.com with extra stackoverflow.com?  Or \"stackoverflow.com\"?\n\nPerhaps jobs.stackoverflow.com, or 'http://twitter.com/#!/CHOCKENBERRY', the CHOCKLOCK!!\n\nFile @file:///Users/johne/rkl/rkl.html#RegexKitLiteCookbook?\n\nMaybe http://www.yahoo.com/index///i.html!  http://www.yahoo.com/////xyz.html?!";

  NSLog(@"String :\n\n%@\n\n", urlString);

  // componentsMatchedByRegex: is the RegexKitLite method that returns an array of all matches.
  NSLog(@"Matches: %@\n", [urlString componentsMatchedByRegex:urlRegex]);

  NSLog(@"Gruber URL Regex Matches: %@\n", [urlString componentsMatchedByRegex:gruberURLRegex]);

  [pool release]; pool = NULL;
  return(0);
}
Compile with:
shell% gcc -o url url.m RegexKitLite.m -framework Foundation -licucore
When run:
shell% ./url
2011-05-27 20:32:58.204 url[25520:903] String :

Did you see http://www.stackoverflow.com?  Or http://www.stackoverflow.com/?

And then there is www.stackoverflow.com/, along with www.stackoverflow.com/index.

Maybe something like stackoverflow.com with extra stackoverflow.com?  Or "stackoverflow.com"?

Perhaps jobs.stackoverflow.com, or 'http://twitter.com/#!/CHOCKENBERRY', the CHOCKLOCK!!

File @file:///Users/johne/rkl/rkl.html#RegexKitLiteCookbook?

Maybe http://www.yahoo.com/index///i.html!  http://www.yahoo.com/////xyz.html?!

2011-05-27 20:32:58.211 url[25520:903] Matches: (
    "http://www.stackoverflow.com",
    "http://www.stackoverflow.com/",
    "www.stackoverflow.com/",
    "www.stackoverflow.com/index",
    "stackoverflow.com",
    "jobs.stackoverflow.com",
    "http://twitter.com/#!/CHOCKENBERRY",
    "file:///Users/johne/rkl/rkl.html#RegexKitLiteCookbook",
    "http://www.yahoo.com/index///i.html",
    "http://www.yahoo.com/////xyz.html"
)
2011-05-27 20:32:58.213 url[25520:903] Gruber URL Regex Matches: (
    "http://www.stackoverflow.com",
    "http://www.yahoo.com/////xyz.html"
)
Edit 2011/05/27: Made a minor change to the regex to fix a problem where ( ) parentheses were not matched correctly.
Edit 2011/05/27: Found some additional corner cases that the regex above doesn't handle well. Updated regex:
(?i)\b(?:[a-z][\w\-]+://(?:\S+?(?::\S+?)?\@)?)?(?:(?:(?<!:/|\.)(?:(?:[a-z0-9\-]+\.)+[a-z]{2,4}(?![a-z]))|(?<=://)/))(?:(?:[^\s()<>]+|\((?:[^\s()<>]+|(?:\([^\s()<>]*\)))*\))*)(?<![\s`!()\[\]{};:'".,<>?«»“”‘’])
...and as an Obj-C string:
@"(?i)\\b(?:[a-z][\\w\\-]+://(?:\\S+?(?::\\S+?)?\\@)?)?(?:(?:(?<!:/|\\.)(?:(?:[a-z0-9\\-]+\\.)+[a-z]{2,4}(?![a-z]))|(?<=://)/))(?:(?:[^\\s()<>]+|\\((?:[^\\s()<>]+|(?:\\([^\\s()<>]*\\)))*\\))*)(?<![\\s`!()\\[\\]{};:'\".,<>?«»“”‘’])";
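
Several of the strings this regex matches, such as www.stackoverflow.com/index or stackoverflow.com, carry no scheme, so they need one before they can be opened in a web view. The following is a minimal sketch of that step, not part of the original answer: the helper name URLFromMatch is hypothetical, and defaulting to "http://" is an assumption.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: builds an NSURL from a matched string, prepending
// "http://" (an assumed default) when the match does not start with a scheme.
static NSURL *URLFromMatch(NSString *match) {
    // A scheme is a letter followed by letters, digits, "+", "." or "-",
    // then "://". The "^" anchors the search to the start of the string.
    NSRange scheme = [match rangeOfString:@"^[A-Za-z][A-Za-z0-9+.\\-]*://"
                                  options:NSRegularExpressionSearch];
    NSString *absolute = match;
    if (scheme.location == NSNotFound) {
        absolute = [@"http://" stringByAppendingString:match];
    }
    // URLWithString: returns nil when the string is not a well-formed URL,
    // so callers should check the result before using it.
    return [NSURL URLWithString:absolute];
}

int main(void) {
    @autoreleasepool {
        NSLog(@"%@", URLFromMatch(@"www.stackoverflow.com/index"));
        NSLog(@"%@", URLFromMatch(@"file:///Users/johne/rkl/rkl.html#RegexKitLiteCookbook"));
    }
    return 0;
}
```

Anchoring the scheme check at the start of the string keeps a match such as example.com/?next=http://other.org from being mistaken for one that already has a scheme.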
The OP also asked how to make sure the trailing TLD is "valid". Here is the regex as an Obj-C string, with every TLD that was valid as of 2011/05/27:
@"(?i)\\b(?:[a-z][\\w\\-]+://(?:\\S+?(?::\\S+?)?\\@)?)?(?:(?:(?<!:/|\\.)(?:(?:[a-z0-9\\-]+\\.)+(?:(ac|ad|ae|aero|af|ag|ai|al|am|an|ao|aq|ar|arpa|as|asia|at|au|aw|ax|az|ba|bb|bd|be|bf|bg|bh|bi|biz|bj|bm|bn|bo|br|bs|bt|bv|bw|by|bz|ca|cat|cc|cd|cf|cg|ch|ci|ck|cl|cm|cn|co|com|coop|cr|cu|cv|cx|cy|cz|de|dj|dk|dm|do|dz|ec|edu|ee|eg|er|es|et|eu|fi|fj|fk|fm|fo|fr|ga|gb|gd|ge|gf|gg|gh|gi|gl|gm|gn|gov|gp|gq|gr|gs|gt|gu|gw|gy|hk|hm|hn|hr|ht|hu|id|ie|il|im|in|info|int|io|iq|ir|is|it|je|jm|jo|jobs|jp|ke|kg|kh|ki|km|kn|kp|kr|kw|ky|kz|la|lb|lc|li|lk|lr|ls|lt|lu|lv|ly|ma|mc|md|me|mg|mh|mil|mk|ml|mm|mn|mo|mobi|mp|mq|mr|ms|mt|mu|museum|mv|mw|mx|my|mz|na|name|nc|ne|net|nf|ng|ni|nl|no|np|nr|nu|nz|om|org|pa|pe|pf|pg|ph|pk|pl|pm|pn|pr|pro|ps|pt|pw|py|qa|re|ro|rs|ru|rw|sa|sb|sc|sd|se|sg|sh|si|sj|sk|sl|sm|sn|so|sr|st|su|sv|sy|sz|tc|td|tel|tf|tg|th|tj|tk|tl|tm|tn|to|tp|tr|travel|tt|tv|tw|tz|ua|ug|uk|us|uy|uz|va|vc|ve|vg|vi|vn|vu|wf|ws|xn--0zwm56d|xn--11b5bs3a9aj6g|xn--3e0b707e|xn--45brj9c|xn--80akhbyknj4f|xn--90a3ac|xn--9t4b11yi5a|xn--clchc0ea0b2g2a9gcd|xn--deba0ad|xn--fiqs8s|xn--fiqz9s|xn--fpcrj9c3d|xn--fzc2c9e2c|xn--g6w251d|xn--gecrj9c|xn--h2brj9c|xn--hgbk6aj7f53bba|xn--hlcj6aya9esc7a|xn--j6w193g|xn--jxalpdlp|xn--kgbechtv|xn--kprw13d|xn--kpry57d|xn--lgbbat1ad8j|xn--mgbaam7a8h|xn--mgbayh7gpa|xn--mgbbh1a71e|xn--mgbc0a9azcg|xn--mgberp4a5d4ar|xn--o3cw4h|xn--ogbpf8fl|xn--p1ai|xn--pgbs0dh|xn--s9brj9c|xn--wgbh1c|xn--wgbl6a|xn--xkc2al3hye2a|xn--xkc2dl3a5ee0h|xn--yfro4i67o|xn--ygbi2ammx|xn--zckzah|xxx|ye|yt|za|zm|zw))(?![a-z]))|(?<=://)/))(?:(?:[^\\s()<>]+|\\((?:[^\\s()<>]+|(?:\\([^\\s()<>]*\\)))*\\))*)(?<![\\s`!()\\[\\]{};:'\".,<>?«»“”‘’])";
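
Because the set of valid TLDs changes over time, a list hard-coded into the pattern goes stale. One option is to keep the TLDs in an array and splice them into the regex when it is built. The sketch below assumes Foundation's NSRegularExpression (iOS 4.0 or later); the helper name URLPatternWithTLDs and the three-entry list are hypothetical and only illustrate the idea.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: splices a caller-supplied TLD list into the
// regular expression from the answer above.
static NSString *URLPatternWithTLDs(NSArray *tlds) {
    NSMutableArray *escaped = [NSMutableArray arrayWithCapacity:[tlds count]];
    for (NSString *tld in tlds) {
        // Escape each entry so it is always treated as literal text.
        [escaped addObject:[NSRegularExpression escapedPatternForString:tld]];
    }
    NSString *alternation = [escaped componentsJoinedByString:@"|"];

    // The parts of the answer's pattern before and after the TLD list.
    NSString *head = @"(?i)\\b(?:[a-z][\\w\\-]+://(?:\\S+?(?::\\S+?)?\\@)?)?(?:(?:(?<!:/|\\.)(?:(?:[a-z0-9\\-]+\\.)+(?:(";
    NSString *tail = @"))(?![a-z]))|(?<=://)/))(?:(?:[^\\s()<>]+|\\((?:[^\\s()<>]+|(?:\\([^\\s()<>]*\\)))*\\))*)(?<![\\s`!()\\[\\]{};:'\".,<>?«»“”‘’])";
    return [NSString stringWithFormat:@"%@%@%@", head, alternation, tail];
}

int main(void) {
    @autoreleasepool {
        // Illustrative list only; a real one would be loaded from a current source.
        NSArray *tlds = [NSArray arrayWithObjects:@"com", @"org", @"net", nil];
        NSError *error = nil;
        NSRegularExpression *regex =
            [NSRegularExpression regularExpressionWithPattern:URLPatternWithTLDs(tlds)
                                                      options:0
                                                        error:&error];
        if (regex == nil) {
            NSLog(@"Could not compile pattern: %@", error);
            return 1;
        }
        NSString *text = @"Try jobs.stackoverflow.com or example.invalidtld today";
        NSArray *matches = [regex matchesInString:text
                                          options:0
                                            range:NSMakeRange(0, [text length])];
        for (NSTextCheckingResult *match in matches) {
            NSLog(@"%@", [text substringWithRange:[match range]]);
        }
    }
    return 0;
}
```

The order of the entries does not need special care here: the (?![a-z]) lookahead after the list rejects a short entry such as "co" when more letters follow, so the regex backtracks to the longer "com".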
You don't want to use a regular expression for this. You want a ready-made detector that will find all of these for you.
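
On Apple platforms, Foundation's NSDataDetector is one ready-made way to find links without writing a regular expression. Assuming that is the kind of tool this answer has in mind, a minimal sketch could look like this (iOS 4.0 or later; the helper name DetectLinks is hypothetical):

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns an NSURL for every link that
// NSDataDetector finds in `text`.
static NSArray *DetectLinks(NSString *text) {
    NSError *error = nil;
    NSDataDetector *detector =
        [NSDataDetector dataDetectorWithTypes:NSTextCheckingTypeLink
                                        error:&error];
    if (detector == nil) {
        NSLog(@"Could not create detector: %@", error);
        return [NSArray array];
    }

    NSMutableArray *urls = [NSMutableArray array];
    [detector enumerateMatchesInString:text
                               options:0
                                 range:NSMakeRange(0, [text length])
                            usingBlock:^(NSTextCheckingResult *result,
                                         NSMatchingFlags flags,
                                         BOOL *stop) {
        // Link results carry the detected URL in their URL property.
        if ([result URL] != nil) {
            [urls addObject:[result URL]];
        }
    }];
    return urls;
}

int main(void) {
    @autoreleasepool {
        NSLog(@"%@", DetectLinks(@"Did you see http://www.stackoverflow.com? Or www.stackoverflow.com/index."));
    }
    return 0;
}
```

Each result also exposes its range in the original string, which is what a label needs in order to make just the link portion tappable.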
