Commit 0cfb1f0

feature: full_text_search. issue #2224
1 parent 015fecf commit 0cfb1f0

635 files changed

+106751
-25
lines changed

docs/features/full_text_search.md

+88
@@ -0,0 +1,88 @@
1+
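# Full-text search
2+
3+
## Approach
4+
Search is built on Elasticsearch: mongo-connector syncs the CMDB data stored in MongoDB to
5+
Elasticsearch, and CMDB wraps the Elasticsearch full-text search API and exposes it.
6+
mongo-connector reads the MongoDB replica set oplog and replays the operations recorded there
7+
on Elasticsearch, which gives a one-way sync: whenever data changes in MongoDB, mongo-connector
8+
applies the corresponding change to Elasticsearch.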
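
The one-way replay described above can be pictured with a toy sketch. This is not mongo-connector's code: the `replay` function and the plain dict standing in for Elasticsearch are hypothetical, and the `bk_host_name` values are made up. Only the entry shape (`op`, `ns`, `o`, `o2`) follows the MongoDB oplog, and updates are simplified to a merge.

```python
# Toy illustration of one-way oplog replay; not mongo-connector's actual code.
# "op" is "i" (insert), "u" (update) or "d" (delete); "ns" is "<db>.<collection>".

def replay(oplog, index, namespaces):
    """Apply oplog entries to `index`, a dict standing in for Elasticsearch."""
    for entry in oplog:
        if not namespaces.get(entry["ns"], False):
            continue  # this namespace is not configured for sync
        op = entry["op"]
        if op == "i":
            index[entry["o"]["_id"]] = dict(entry["o"])
        elif op == "u":
            doc_id = entry["o2"]["_id"]
            # Simplification: merge "$set" fields (or the whole payload).
            changes = entry["o"].get("$set", entry["o"])
            index.setdefault(doc_id, {"_id": doc_id}).update(changes)
        elif op == "d":
            index.pop(entry["o"]["_id"], None)
    return index


oplog = [
    {"op": "i", "ns": "cmdb.cc_HostBase", "o": {"_id": 1, "bk_host_name": "web-1"}},
    {"op": "u", "ns": "cmdb.cc_HostBase", "o2": {"_id": 1},
     "o": {"$set": {"bk_host_name": "web-01"}}},
    {"op": "i", "ns": "cmdb.cc_OperationLog", "o": {"_id": 2, "desc": "skipped"}},
    {"op": "i", "ns": "cmdb.cc_HostBase", "o": {"_id": 3, "bk_host_name": "db-1"}},
    {"op": "d", "ns": "cmdb.cc_HostBase", "o": {"_id": 3}},
]
index = replay(oplog, {}, {"cmdb.cc_HostBase": True, "cmdb.cc_OperationLog": False})
print(index)
```

Changes only ever flow from the oplog into the index, so nothing written to the index is pushed back to MongoDB.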
9+
10+
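[Architecture diagram](../resource/img/mongo-connector.png)
11+
12+
## Using Elasticsearch
13+
Full-text search uses the Elasticsearch query_string query, combined with bool (must, must_not, should)
14+
clauses to customize the search, aggs to group and count the results by category, and highlight to
15+
highlight the matched data.
16+
A complete Elasticsearch query request looks roughly like the following,
17+
where \*e\* is the search condition: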
18+
```
19+
{
20+
"query": {
21+
"bool": {
22+
"must": [
23+
{
24+
"term": { "bk_obj_id": "test_search"}
25+
},
26+
{
27+
"query_string": {"query": "*e*"}
28+
}
29+
],
30+
"must_not": [
31+
{
32+
"match": {"bk_supplier_account": "*e*"}
33+
}
34+
],
35+
"should": [
36+
{
37+
"bool": {
38+
"must_not": [
39+
{
40+
"regexp": { "metadata.label.bk_biz_id": "[0-9]*" }
41+
}
42+
]
43+
}
44+
},
45+
{
46+
"term": { "metadata.label.bk_biz_id": "2" }
47+
}
48+
],
49+
"minimum_should_match" : 1
50+
}
51+
},
52+
"aggs": {
53+
"bk_obj_id_agg": {
54+
"terms": {
55+
"field": "bk_obj_id.keyword"
56+
}
57+
},
58+
"type_agg": {
59+
"terms": {
60+
"field": "_type"
61+
}
62+
}
63+
},
64+
"highlight": {
65+
"fields": {
66+
"*" : {}
67+
},
68+
"require_field_match": false
69+
}
70+
}
71+
```
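
As a rough sketch of how such a request body could be assembled from a keyword, the helper below builds the same structure as a Python dict. The function name `build_search_body` and its parameters are assumptions for illustration; CMDB itself builds the query in Go with olivere/elastic. The keyword is not escaped here, so query_string special characters would need handling in real use.

```python
import json


def build_search_body(keyword, obj_id=None, biz_id=None):
    """Build a query body shaped like the example above (hypothetical helper)."""
    wildcard = "*%s*" % keyword
    must = [{"query_string": {"query": wildcard}}]
    if obj_id is not None:
        # Restrict the search to a single object model.
        must.insert(0, {"term": {"bk_obj_id": obj_id}})
    bool_query = {
        "must": must,
        "must_not": [{"match": {"bk_supplier_account": wildcard}}],
    }
    if biz_id is not None:
        # Accept documents with no numeric business label, or the given business.
        bool_query["should"] = [
            {"bool": {"must_not": [
                {"regexp": {"metadata.label.bk_biz_id": "[0-9]*"}}]}},
            {"term": {"metadata.label.bk_biz_id": str(biz_id)}},
        ]
        bool_query["minimum_should_match"] = 1
    return {
        "query": {"bool": bool_query},
        "aggs": {
            "bk_obj_id_agg": {"terms": {"field": "bk_obj_id.keyword"}},
            "type_agg": {"terms": {"field": "_type"}},
        },
        "highlight": {"fields": {"*": {}}, "require_field_match": False},
    }


print(json.dumps(build_search_body("e", obj_id="test_search", biz_id=2), indent=2))
```

Calling it with only a keyword drops the `term` filter and the `should` clauses, which leaves a plain keyword search across all synced documents.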
72+
The returned result is wrapped and converted to the CMDB API response format:
73+
[Full-text search API](../apidoc/v3.5/full_text_find.md)
74+
75+
## Deploying mongo-connector and Elasticsearch
76+
[Deployment](../overview/installation.md)
77+
See steps 6 and 7, and the full_text_search configuration switch (off or on) described after them.
78+
79+
## GitHub references
80+
[olivere elastic](https://github.com/olivere/elastic)
81+
82+
[mongo-connector](https://github.com/yougov/mongo-connector)
83+
84+
85+
## Wiki references
86+
[mongo-connector wiki](https://github.com/yougov/mongo-connector/wiki/Usage-with-ElasticSearch)
87+
88+
[olivere elastic wiki](https://github.com/olivere/elastic/wiki)

docs/overview/installation.md

+105-5
@@ -5,6 +5,8 @@
55
* ZooKeeper >= 3.4.11
66
* Redis >= 3.2.11
77
* MongoDB >= 2.8.0
8+
* Elasticsearch >= 5.0.0 & < 7 (used for full-text search; a 5.x version is recommended)
9+
* Mongo-connector >= 2.5.0 (used for full-text search; 3.1.1 is recommended)
810

911
## CMDB microservice process list
1012

@@ -51,7 +53,6 @@
5153

5254
Recommended version download: [MongoDB 2.8.0](http://downloads.mongodb.org/linux/mongodb-linux-x86_64-rhel70-2.8.0-rc5.tgz?_ga=2.109966917.1194957577.1522583108-162706957.1522583108)
5355

54-
5556
### 4. Download the release package
5657

5758
The officially published **Linux Release** packages can be downloaded [here](https://github.com/Tencent/bk-cmdb/releases). To build from source instead, see the build instructions [here](source_compile.md).
@@ -75,7 +76,101 @@
7576

7677
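For the detailed manual, see the official [MongoDB](https://docs.mongodb.com/manual/reference/method/db.createUser/) documentation.
7778

78-
### 6. Deploy CMDB
79+
### 6. Deploy Elasticsearch (optional, used for full-text search; controlled by the full_text_search switch in step 9)
80+
81+
Official download: [ElasticSearch](https://www.elastic.co/cn/downloads/past-releases)
82+
Look for a 5.x release; 5.0.2 or 5.6.16 is recommended.
83+
After downloading, unpack the archive. In the config file config/elasticsearch.yml you can set network.host to
84+
the specific host address.
85+
Then run the following from the bin directory (note: Elasticsearch must be run as a regular user, not as root):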
86+
```
87+
./elasticsearch
88+
```
89+
90+
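To deploy a highly available, scalable Elasticsearch cluster, see the official documentation: [ES-guide](https://www.elastic.co/guide/index.html)
91+
92+
### 7. Deploy mongo-connector (optional, used for full-text search; controlled by the full_text_search switch in step 9)
93+
94+
Official repository: [Mongo-connector](https://github.com/yougov/mongo-connector)
95+
Installing with pip is recommended: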
96+
97+
```
98+
pip install elastic2-doc-manager elasticsearch
99+
pip install 'mongo-connector[elastic5]'
100+
```
101+
102+
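After installing, check the Python package versions; in particular, the major version of the Python elasticsearch package must match the Elasticsearch version you downloaded.
103+
104+
Edit the configuration file config.json (see [config](https://github.com/yougov/mongo-connector/wiki/Configuration%20Options) for the option reference):
105+
106+
Main settings:
107+
A key prefixed with __ is ignored.
108+
mainAddress points to MongoDB; for a MongoDB cluster it can point to a secondary (slave) node.
109+
Leave authentication unconfigured for now; authentication currently has problems.
110+
namespaces lists the MongoDB collections (tables) to sync: false means do not sync, true means sync.
111+
You can choose which collections to sync for full-text search.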
112+
113+
```
114+
{
115+
"__comment__": "Configuration options starting with '__' are disabled",
116+
"__comment__": "To enable them, remove the preceding '__'",
117+
118+
"mainAddress": "127.0.0.1:27017",
119+
"oplogFile": "/var/log/mongo-connector/oplog.timestamp",
120+
"noDump": false,
121+
"batchSize": -1,
122+
"verbosity": 3,
123+
"continueOnError": true,
124+
125+
"logging": {
126+
"type": "file",
127+
"filename": "/var/log/mongo-connector/mongo-connector.log",
128+
"format": "%(asctime)s [%(levelname)s] %(name)s:%(lineno)d - %(message)s",
129+
"rotationWhen": "D",
130+
"rotationInterval": 1,
131+
"rotationBackups": 10,
132+
133+
"__type": "syslog",
134+
"__host": "localhost:514"
135+
},
136+
137+
"__authentication": {
138+
"adminUsername": "cc",
139+
"password": "cc",
140+
"__passwordFile": "mongo-connector.pwd"
141+
},
142+
143+
"__fields": ["field1", "field2", "field3"],
144+
145+
"exclude_fields": ["create_time", "last_time"],
146+
147+
"namespaces": {
148+
"cmdb.cc_HostBase": true,
149+
"cmdb.cc_ObjectBase": true,
150+
"cmdb.cc_ObjDes": true,
151+
"cmdb.cc_OperationLog": false
152+
},
153+
154+
"docManagers": [
155+
{
156+
"docManager": "elastic2_doc_manager",
157+
"targetURL": "127.0.0.1:9200",
158+
"__bulkSize": 1000,
159+
"uniqueKey": "_id",
160+
"autoCommitInterval": 0
161+
}
162+
]
163+
}
164+
```
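
To double-check a config before starting the connector, a small script can list which options are active and which namespaces will be synced. The helpers `active_options` and `synced_namespaces` are hypothetical and not part of mongo-connector; they only apply the two conventions stated above (keys prefixed with `__` are ignored, and `true` means sync). The sample is a trimmed copy of the configuration.

```python
import json

# Hypothetical helpers for inspecting a mongo-connector style config.json.


def active_options(node):
    """Return a copy of the config without keys that start with '__'."""
    if isinstance(node, dict):
        return {key: active_options(value) for key, value in node.items()
                if not key.startswith("__")}
    if isinstance(node, list):
        return [active_options(item) for item in node]
    return node


def synced_namespaces(config):
    """List namespaces whose value is exactly true, sorted by name."""
    namespaces = config.get("namespaces", {})
    return sorted(ns for ns, enabled in namespaces.items() if enabled is True)


# Trimmed copy of the configuration shown above.
config = json.loads("""
{
  "mainAddress": "127.0.0.1:27017",
  "__authentication": {"adminUsername": "cc", "password": "cc"},
  "logging": {"type": "file", "__type": "syslog"},
  "namespaces": {
    "cmdb.cc_HostBase": true,
    "cmdb.cc_ObjectBase": true,
    "cmdb.cc_ObjDes": true,
    "cmdb.cc_OperationLog": false
  }
}
""")

print(sorted(active_options(config)))
print(synced_namespaces(config))
```

Namespace values other than plain `true` or `false` are not handled by this sketch and are treated as not synced.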
165+
166+
Then start it with:
167+
```
168+
mongo-connector -c config.json
169+
```
170+
171+
You can also wrap it in a system service (for example a systemd unit) of your own.
172+
173+
### 8. Deploy CMDB
79174

80175
After building, download **cmdb.tar.gz**
81176

@@ -129,7 +224,7 @@ drwxrwxr-x 3 1004 1004 4.0K Mar 29 14:45 cmdb_hostcontroller
129224
|cmdb_auditcontroller|controller|Audit data maintenance service|
130225
|cmdb_hostcontroller|controller|Host data maintenance service|
131226

132-
### 7. Initialization
227+
### 9. Initialization
133228

134229
Assume the installation directory is **/data/cmdb/**
135230

@@ -151,6 +246,8 @@ drwxrwxr-x 3 1004 1004 4.0K Mar 29 14:45 cmdb_hostcontroller
151246
--blueking_cmdb_url <blueking_cmdb_url> the cmdb site url, eg: http://127.0.0.1:8088 or http://bk.tencent.com
152247
--blueking_paas_url <blueking_paas_url> the blueking paas url, eg: http://127.0.0.1:8088 or http://bk.tencent.com
153248
--listen_port <listen_port> the cmdb_webserver listen port, should be the port as same as -c <cc_url> specified, default:8083
249+
--full_text_search <full_text_search> full text search function, off or on, default off
250+
--es_url <es_url> the elasticsearch listen url
154251

155252
```
156253

@@ -170,17 +267,20 @@ drwxrwxr-x 3 1004 1004 4.0K Mar 29 14:45 cmdb_hostcontroller
170267
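|--blueking_cmdb_url|The CMDB URL entered in the browser once deployment is complete, in the form http://xx.xxx.com:80, chosen by the user; if DNS resolution is not configured, use the server's IP:PORT. The port is the one cmdb_webserver currently listens on.|||
171268
|--blueking_paas_url|Address of the BlueKing PaaS platform; can be left unset for a standalone CC deployment|||
172269
|--listen_port|Port the cmdb_webserver service listens on; default 8083||8083|
270+
|--full_text_search|Full-text search switch (off/on); default off, set to on to enable||off|
271+
|--es_url|URL the Elasticsearch service listens on; default http://127.0.0.1:9200||http://127.0.0.1:9200|
173272

174273
**Note: after init.py runs successfully, it automatically generates the configuration needed by every CMDB service process.**
175274

176275
**Example (replace the parameters in the example with real values):**
177276

277+
If you completed steps 6 and 7 for full-text search and want to enable the feature, set full_text_search to on.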
178278
``` shell
179-
python init.py --discovery 127.0.0.1:2181 --database cmdb --redis_ip 127.0.0.1 --redis_port 6379 --redis_pass cc --mongo_ip 127.0.0.1 --mongo_port 27017 --mongo_user cc --mongo_pass cc --blueking_cmdb_url http://127.0.0.1:8083 --listen_port 8083
279+
python init.py --discovery 127.0.0.1:2181 --database cmdb --redis_ip 127.0.0.1 --redis_port 6379 --redis_pass cc --mongo_ip 127.0.0.1 --mongo_port 27017 --mongo_user cc --mongo_pass cc --blueking_cmdb_url http://127.0.0.1:8083 --listen_port 8083 --full_text_search on --es_url http://127.0.0.1:9200
180280
```
181281

182282

183-
### 8. Configuration generated by init.py
283+
### 10. Configuration generated by init.py
184284

185285
Configuration files are stored under: {installation directory}/cmdb_adminserver/configures/
186286

docs/resource/img/mongo-connector.png

104 KB

resources/configures/topo.conf

+5
@@ -10,3 +10,8 @@ maxIDleConns=1000
1010
res=conf/errors
1111
[level]
1212
businessTopoMax=6
13+
14+
# Full-text search switch (off, on) and the Elasticsearch URL; controls whether topo enables the full-text search API and connects to Elasticsearch
15+
[es]
16+
full_text_search=off
17+
url=http://127.0.0.1:9200
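
For a quick check of the generated `[es]` section, the file can be read with Python's standard `configparser`. This is only a sketch for inspecting the file by hand: `read_es_settings` is a hypothetical helper, and the CMDB services parse this configuration themselves.

```python
import configparser

# Sample text mirroring the [es] section added above.
SAMPLE = """
[es]
full_text_search=off
url=http://127.0.0.1:9200
"""


def read_es_settings(text):
    """Return (enabled, url) from INI text; hypothetical helper."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    # The fallback values mirror the defaults used by init.py.
    enabled = parser.get("es", "full_text_search", fallback="off") == "on"
    url = parser.get("es", "url", fallback="http://127.0.0.1:9200")
    return enabled, url


print(read_es_settings(SAMPLE))
```

A file without an `[es]` section yields the same defaults, which matches a deployment where full-text search was never switched on.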

resources/configures/webserver.conf

+2
@@ -19,6 +19,8 @@ bk_account_url=http://bk.tencent.com/login/accounts/get_all_user/?bk_token=%s
1919
resources_path=/tmp/
2020
html_root=/data/cmdb/cmdb_webserver/static
2121
authscheme=internal
22+
# Full-text search switch (off, on); tells the front end whether to show the full-text search UI
23+
full_text_search=off
2224
#authscheme=iam
2325

2426
[errors]

scripts/init.py

+32-7
@@ -15,7 +15,7 @@ class FileTemplate(Template):
1515
def generate_config_file(
1616
rd_server_v, db_name_v, redis_ip_v, redis_port_v, redis_user_v,
1717
redis_pass_v, mongo_ip_v, mongo_port_v, mongo_user_v, mongo_pass_v,
18-
cc_url_v, paas_url_v, auth_address, auth_app_code,
18+
cc_url_v, paas_url_v, full_text_search, es_url_v, auth_address, auth_app_code,
1919
auth_app_secret, auth_enabled, auth_scheme
2020
):
2121
output = os.getcwd() + "/cmdb_adminserver/configures/"
@@ -31,6 +31,7 @@ def generate_config_file(
3131
redis_port=redis_port_v,
3232
cc_url=cc_url_v,
3333
paas_url=paas_url_v,
34+
es_url=es_url_v,
3435
ui_root="../web",
3536
agent_url=paas_url_v,
3637
configures_dir=output,
@@ -39,7 +40,8 @@ def generate_config_file(
3940
auth_app_code=auth_app_code,
4041
auth_app_secret=auth_app_secret,
4142
auth_enabled=auth_enabled,
42-
auth_scheme=auth_scheme
43+
auth_scheme=auth_scheme,
44+
full_text_search=full_text_search
4345
)
4446
if not os.path.exists(output):
4547
os.mkdir(output)
@@ -400,6 +402,10 @@ def generate_config_file(
400402
appCode = $auth_app_code
401403
appSecret = $auth_app_secret
402404
enable = $auth_enabled
405+
406+
[es]
407+
full_text_search = $full_text_search
408+
url=$es_url
403409
'''
404410

405411
template = FileTemplate(topo_file_template_str)
@@ -430,6 +436,7 @@ def generate_config_file(
430436
resources_path = /tmp/
431437
html_root = $ui_root
432438
authscheme = $auth_scheme
439+
full_text_search = $full_text_search
433440
434441
[app]
435442
agent_app_url = ${agent_url}/console/?app=bk_agent_setup
@@ -493,6 +500,8 @@ def main(argv):
493500
"auth_app_code": "bk_cmdb",
494501
"auth_app_secret": "",
495502
}
503+
full_text_search = 'off'
504+
es_url='http://127.0.0.1:9200'
496505

497506
server_ports = {
498507
"cmdb_adminserver": 60004,
@@ -515,9 +524,9 @@ def main(argv):
515524
"help", "discovery=", "database=", "redis_ip=", "redis_port=",
516525
"redis_user=", "redis_pass=", "mongo_ip=", "mongo_port=",
517526
"mongo_user=", "mongo_pass=", "blueking_cmdb_url=",
518-
"blueking_paas_url=", "listen_port=", "auth_address=",
527+
"blueking_paas_url=", "listen_port=", "es_url=", "auth_address=",
519528
"auth_app_code=", "auth_app_secret=", "auth_enabled=",
520-
"auth_scheme="
529+
"auth_scheme=", "full_text_search="
521530
]
522531
usage = '''
523532
usage:
@@ -538,9 +547,11 @@ def main(argv):
538547
--auth_address <auth_address> iam address
539548
--auth_app_code <auth_app_code> app code for iam, default bk_cmdb
540549
--auth_app_secret <auth_app_secret> app code for iam
550+
--full_text_search <full_text_search> full text search on or off
551+
--es_url <es_url> the es listen url, see in es dir config/elasticsearch.yml, (network.host, http.port), default: http://127.0.0.1:9200
541552
'''
542553
try:
543-
opts, _ = getopt.getopt(argv, "hd:D:r:p:x:s:m:P:X:S:u:U:a:l:", arr)
554+
opts, _ = getopt.getopt(argv, "hd:D:r:p:x:s:m:P:X:S:u:U:a:l:es", arr)
544555

545556
except getopt.GetoptError as e:
546557
print("\n \t", e.msg)
@@ -553,7 +564,7 @@ def main(argv):
553564

554565
for opt, arg in opts:
555566
if opt in ('-h', '--help'):
556-
print('init.py --discovery <discovery> --database <database> --redis_ip <redis_ip> --redis_port <redis_port> --redis_pass <redis_pass> --mongo_ip <mongo_ip> --mongo_port <mongo_port> --mongo_user <mongo_user> --mongo_pass <mongo_pass> --blueking_cmdb_url <blueking_cmdb_url> --blueking_paas_url <blueking_paas_url> --listen_port <listen_port>')
567+
print('init.py --discovery <discovery> --database <database> --redis_ip <redis_ip> --redis_port <redis_port> --redis_pass <redis_pass> --mongo_ip <mongo_ip> --mongo_port <mongo_port> --mongo_user <mongo_user> --mongo_pass <mongo_pass> --blueking_cmdb_url <blueking_cmdb_url> --blueking_paas_url <blueking_paas_url> --listen_port <listen_port> --full_text_search <full_text_search> --es_url <es_url>')
557568
sys.exit()
558569
elif opt in ("-d", "--discovery"):
559570
rd_server = arg
@@ -606,6 +617,12 @@ def main(argv):
606617
elif opt in ("--auth_app_secret",):
607618
auth["auth_app_secret"] = arg
608619
print("auth_app_secret:", auth["auth_app_secret"])
620+
elif opt in ("--full_text_search",):
621+
full_text_search = arg
622+
print('full_text_search:', full_text_search)
623+
elif opt in ("--es_url",):
624+
es_url = arg
625+
print('es_url:', es_url)
609626

610627
if 0 == len(rd_server):
611628
print('please input the ZooKeeper address, eg:127.0.0.1:2181')
@@ -644,6 +661,14 @@ def main(argv):
644661
print('blueking cmdb url not start with http://')
645662
sys.exit()
646663

664+
if full_text_search not in ["off", "on"]:
665+
print('full_text_search can only be off or on')
666+
sys.exit()
667+
if full_text_search == "on":
668+
if not es_url.startswith("http://"):
669+
print('es url not start with http://')
670+
sys.exit()
671+
647672
if auth["auth_scheme"] not in ["internal", "iam"]:
648673
print('auth_scheme can only be internal or iam')
649674
sys.exit()
@@ -667,7 +692,7 @@ def main(argv):
667692

668693
generate_config_file(rd_server, db_name, redis_ip, redis_port, redis_user,
669694
redis_pass, mongo_ip, mongo_port, mongo_user,
670-
mongo_pass, cc_url, paas_url, **auth)
695+
mongo_pass, cc_url, paas_url, full_text_search, es_url, **auth)
671696
update_start_script(rd_server, server_ports)
672697
print('initial configurations success, configs could be found at cmdb_adminserver/configures')
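
The two new checks in `main` can be read in isolation as the sketch below. `validate_search_options` is a hypothetical refactoring for illustration, not a function in init.py; it returns the error message instead of printing it and exiting, which makes the rules easy to exercise.

```python
def validate_search_options(full_text_search="off", es_url="http://127.0.0.1:9200"):
    """Return an error message, or None when the options are acceptable."""
    if full_text_search not in ("off", "on"):
        return "full_text_search can only be off or on"
    # The URL is only checked when the feature is switched on.
    if full_text_search == "on" and not es_url.startswith("http://"):
        return "es url not start with http://"
    return None


print(validate_search_options("on", "http://127.0.0.1:9200"))
print(validate_search_options("on", "127.0.0.1:9200"))
print(validate_search_options("yes"))
```

As in the diff, the URL is not validated while the switch is off, and an `https://` URL is rejected when it is on.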
673698
