Data model for blog posts with comments in Elasticsearch

Problem description

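What is the best way to build a post/comment system with Elasticsearch? I am using Elasticsearch as a secondary database.

Each post will have a threaded comment system, perhaps two levels deep. Each post can have up to 500-1000 comments. Every like and every comment will increment a counter on the comment and on the post, which means a lot of indexing. I would also like to fetch blog posts together with their corresponding comments, with filters applied.

Right now my structure looks like this. In this document the blog post and user details will rarely be edited, but tags and comments will be added frequently.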

{
  "_index": "brainstormer_ideas_with_comments",
  "_type": "_doc",
  "_id": "1",
  "_version": 1,
  "_seq_no": 0,
  "_primary_term": 1,
  "found": true,
  "_source": {
    "id": 1,
    "brainstormer_id": 1,
    "idea": "cCZhvy",
    "description": "2jJPo3hYbqeh2VBnDJeGtylVu7qfe_MRp77hTK6t7SN57GzeQG8c",
    "user": {"id": "user-1", "login": "pO2DSqIS--"},
    "created_at": "2020-08-13T20:35:17+00:00",
    "like_count": 41,
    "comment_count": 45,
    "tags": ["bU37X", "a_Rl5b", "vxD.ZMo", "AmvtHVuQ", "yx9oSx-_D"],
    "comments": [
      {
        "id": "comment-1",
        "comment": "7ewh-Cqf4gQqmIK53jXbR7",
        "tags": ["mJN", "jFm-", "hV0pi", "ONGNow", "HtzmDfO", "dawVLk09"],
        "created_at": "2020-08-08T20:35:17+00:00",
        "user": {"id": "user-1", "login": "Tl6CDNawUh"}
      },
      {
        "id": "comment-1",
        "comment": "BKj8sAcbJJXWxAPk3HQFTZWtvQm",
        "tags": ["sYj", "XRLw", "xtAeH", "Oq6dBR", "lj4_hOI", "n3lhc2ig"],
        "created_at": "2020-09-21T20:35:17+00:00",
        "user": {"id": "user-2", "login": "AF3KT415uf"}
      },
      {
        "comment": "vzt7XEe2WIP3OszpLmcF8J",
        "tags": ["YCH", "kodm", "RGv2B", "Qk5R1D", "ICrDjmz", "4mmfLK16"],
        "created_at": "2020-07-08T20:35:17+00:00",
        "user": {"id": "user-3", "login": "7xTLOuCeWD"}
      },
      {
        "comment": "Jm6E3PrlOI",
        "tags": ["IrZ", "TJlf", "__HQy", "5VH2Vs", "btvxG51", "5iRoVR_k"],
        "created_at": "2020-07-19T20:35:17+00:00",
        "user": {"id": "user-4", "login": "zr32RlxNak"}
      },
      {
        "comment": "jKGzoZhcpuv4DrvoebamXLnmvyX_CK0",
        "tags": ["Osa", "OKlQ", "cBcjt", "2BcQD7", "K7lLhS7", "ZK1t_GXl"],
        "created_at": "2020-07-14T20:35:17+00:00",
        "user": {"id": "user-5", "login": "B8LGMpPWwv"}
      },
      {
        "comment": "L-PryTXsa1FbEnIJdH_5vlsdpfnckB1kmMJI4EVwszhc45qlW6e",
        "tags": ["kRJ", "Mkka", "ari.I", "pgWcUk", "w78vFir", "eOx.zRx9"],
        "created_at": "2020-08-07T20:35:17+00:00",
        "user": {"id": "user-6", "login": "IG1Oo_fOcr"}
      }
    ]
  }
}

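Is it better to use nested objects, parent/child, or something else? Any suggestions on the structure and on how frequently to update Elasticsearch would be greatly appreciated.

Thanks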

Solution

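Nested objects and parent/child relationships are both expensive: adding or changing one nested comment reindexes the whole post document, and parent/child queries add overhead at search time. See the "trouble with nested objects" blog post for more detail.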

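One approach is to create a separate Elasticsearch document for every comment or reply on the main post. Instead of setting up a strict parent/child relationship, each comment document simply carries a field that says which post it belongs to, so the relationship between your documents is only loosely coupled.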
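A minimal sketch of this loosely coupled layout, assuming a hypothetical `post_id` field on each comment document plus an optional `parent_comment_id` for threaded replies. Neither field is part of the original model, and the helper functions are illustrative only. The code builds the document and search bodies as plain dictionaries and does not talk to a cluster; the `term` filter on `tags` assumes that field is mapped as `keyword`.

```python
def make_comment_doc(comment_id, post_id, text, tags, user, created_at,
                     parent_comment_id=None):
    """Build a standalone comment document that points at its post by ID."""
    return {
        "id": comment_id,
        "post_id": post_id,                      # hypothetical field: loose link to the post
        "parent_comment_id": parent_comment_id,  # hypothetical field: None for top-level comments
        "comment": text,
        "tags": tags,
        "created_at": created_at,
        "user": user,
    }


def comments_for_post_query(post_id, tag=None, size=50):
    """Search body: comments of one post, optionally filtered by tag, newest first."""
    filters = [{"term": {"post_id": post_id}}]
    if tag is not None:
        # Assumes "tags" is a keyword field so that an exact-match term filter works.
        filters.append({"term": {"tags": tag}})
    return {
        "size": size,
        "query": {"bool": {"filter": filters}},
        "sort": [{"created_at": {"order": "desc"}}],
    }


doc = make_comment_doc(
    "comment-1", 1, "7ewh-Cqf4gQqmIK53jXbR7", ["mJN", "jFm-"],
    {"id": "user-1", "login": "Tl6CDNawUh"}, "2020-08-08T20:35:17+00:00",
)
query = comments_for_post_query(post_id=1, tag="mJN")
```

With this layout, adding a comment indexes one small document instead of rewriting the post and all of its existing comments. Fetching a post with its comments takes two requests: one for the post, one search filtered on `post_id`.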

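The default refresh interval in Elasticsearch is 1 second, which is what provides near-real-time (NRT) search. You can keep this default if you need it, or tune it to suit your use case and performance requirements.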
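As a sketch of what tuning looks like, the refresh interval is changed through the index settings update API (`PUT /<index>/_settings`). The helper below only builds the request body. The function name and the `"30s"` value are illustrative assumptions, not recommendations.

```python
import json


def refresh_interval_settings(interval):
    """Build the body for PUT /<index>/_settings that sets the refresh interval.

    `interval` is a time value such as "1s" or "30s"; "-1" disables
    periodic refresh altogether.
    """
    return {"index": {"refresh_interval": interval}}


# Example: relax refresh to every 30 seconds for a write-heavy comments index.
body = refresh_interval_settings("30s")
print(json.dumps(body))
```

A longer interval reduces indexing overhead when likes and comments arrive frequently, at the cost of new comments taking longer to show up in search results.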