FAQ

1 Common errors

1.1 error updating geoip database

[2022-08-18T00:01:10,147][ERROR][o.e.i.g.GeoIpDownloader  ] [node-1] exception during geoip databases update

Add the following setting to config/elasticsearch.yml:

ingest.geoip.downloader.enabled: false

This disables automatic updates of the geoip databases.

1.2 circuit_breaking_exception: Data too large

2024-10-08 17:44:11	27163	/home/meetbill/butterfly/xlib/httpgateway.py:99	reqid=6b3275c8-b3a8-4f37-8d31-b65fc158504d func_name=/vm_report/host_report_import err_msg=[Server exception
Traceback (most recent call last):
  File "/home/meetbill/butterfly/xlib/protocol_json.py", line 194, in do_process
    ret = self._func(**params)
  File "/home/meetbill/butterfly/handlers/vm_report/api_vm_report.py", line 155, in host_report_import
    es_res = es.index(index=index, id=es_id, document=host_doc)
  File "/home/meetbill/butterfly/third/elasticsearch/client/utils.py", line 347, in _wrapped
    return func(*args, params=params, headers=headers, **kwargs)
  File "/home/meetbill/butterfly/third/elasticsearch/client/__init__.py", line 418, in index
    body=body,
  File "/home/meetbill/butterfly/third/elasticsearch/transport.py", line 466, in perform_request
    raise e
TransportError: TransportError(429, u'circuit_breaking_exception', u'[parent] Data too large, data for [<http_request>] would be [1006944446/960.2mb], which is larger than the limit of [986061209/940.3mb], real usage: [1006943096/960.2mb], new bytes reserved: [1350/1.3kb], usages [request=0/0b, fielddata=108215/105.6kb, in_flight_requests=1350/1.3kb, model_inference=0/0b, eql_sequence=0/0b, accounting=35637464/33.9mb]')
]

Solution (the annotated message below is the same error, taken from a node with a larger heap)

Data too large, data for [<http_request>] would be [8333415566/7.7gb],//A
which is larger than the limit of [8160437862/7.5gb], //B
real usage: [8333414416/7.7gb], //C
new bytes reserved: [1150/1.1kb] //D
 
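There are four values here.
B is the limit; exceeding it triggers the error. (By default it is 95% of the Elasticsearch JVM heap, so the -Xmx in this example is presumably 8g.)
C is the heap the Elasticsearch process is currently using.
D (1150) is the memory the current request needs to reserve.
C + D = A > B, hence the error.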
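
The relation C + D = A > B can be checked against the message from the traceback above. The following is a minimal sketch in Python; the regular expression and the helper name parse_breaker_message are illustrative assumptions, not part of Elasticsearch or its client. The last lines apply the 95% default mentioned above to the annotated 7.5gb example.

```python
import re

# Message text copied from the TransportError in the traceback above.
MESSAGE = (
    "[parent] Data too large, data for [<http_request>] would be "
    "[1006944446/960.2mb], which is larger than the limit of "
    "[986061209/940.3mb], real usage: [1006943096/960.2mb], "
    "new bytes reserved: [1350/1.3kb]"
)

# Hypothetical pattern: captures the byte counts A, B, C and D.
PATTERN = re.compile(
    r"would be \[(?P<would_be>\d+)/[^\]]+\].*?"
    r"limit of \[(?P<limit>\d+)/[^\]]+\].*?"
    r"real usage: \[(?P<real_usage>\d+)/[^\]]+\].*?"
    r"new bytes reserved: \[(?P<reserved>\d+)/[^\]]+\]"
)


def parse_breaker_message(message):
    """Return the four byte counts of a parent breaker message as ints."""
    match = PATTERN.search(message)
    if match is None:
        raise ValueError("not a parent circuit breaker message")
    return {name: int(value) for name, value in match.groupdict().items()}


values = parse_breaker_message(MESSAGE)

# C + D = A
print(values["real_usage"] + values["reserved"] == values["would_be"])
# A > B, which is why the request was rejected
print(values["would_be"] > values["limit"])

# Assuming the limit is 95% of the heap, estimate the heap size (in GiB)
# behind the annotated example, whose limit B is 8160437862 bytes.
estimated_heap_gib = 8160437862 / 0.95 / 1024 ** 3
print(round(estimated_heap_gib, 2))
```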
 
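To fix this you can, among other options, increase -Xmx (if there is enough physical memory).
The least-effort approach is to disable the circuit breaker check altogether, at the cost of losing its protection against OutOfMemoryError: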
indices.breaker.type: none

2 Time zones

Elasticsearch uses UTC by default, i.e. a zero offset.

Regardless of the display format, every date value (date string, timestamp, etc.) is converted to UTC when Elasticsearch stores it, with any time zone offset taken into account, and is stored as milliseconds-since-the-epoch.
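
As an illustration of that conversion, the sketch below reproduces it in plain Python rather than calling Elasticsearch. The helper name to_epoch_millis is hypothetical, and treating a value without an offset as UTC is an assumption based on the UTC default described above.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(iso_string):
    """Convert an ISO 8601 string to milliseconds since the epoch (UTC)."""
    parsed = datetime.fromisoformat(iso_string)
    if parsed.tzinfo is None:
        # Assumption: a value without an offset is interpreted as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    # Integer division of two timedeltas yields an exact int.
    return (parsed - EPOCH) // timedelta(milliseconds=1)


# The same instant written with different offsets, and without one.
utc_value = to_epoch_millis("2022-08-18T00:01:10+00:00")
beijing_value = to_epoch_millis("2022-08-18T08:01:10+08:00")
naive_value = to_epoch_millis("2022-08-18T00:01:10")

print(utc_value == beijing_value == naive_value)
```

All three strings map to the same stored value; the offset only affects how the instant is written, not what is stored.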

3 Data types

By default, a string is mapped as both text and keyword.

For example, after indexing the following document:

{
    "foo": "bar"
}

Elasticsearch creates the following dynamic mapping:

{
    "foo": {
        "type": "text",
        "fields": {
            "keyword": {
                "type": "keyword",
                "ignore_above": 256
            }
        }
    }
}

With this mapping you can run full-text search on the foo field, and use the foo.keyword field for exact-term search and aggregations.

Disabling this behavior is easy: explicitly declare the type of each string field when defining the mapping.
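
A minimal sketch of such an explicit mapping, built as a Python dict and printed as the JSON body that would be sent when creating the index. The helper name keyword_only_mapping is hypothetical, and choosing keyword (rather than text) for foo is only an example.

```python
import json


def keyword_only_mapping(*field_names):
    """Build a mapping that declares each given field as keyword only."""
    return {
        "properties": {name: {"type": "keyword"} for name in field_names}
    }


# Declaring foo explicitly means no text/keyword multi-field is generated.
mappings = keyword_only_mapping("foo")

# Body for an index-creation request.
print(json.dumps({"mappings": mappings}, indent=4))
```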

4 Setting passwords

4.1 Modify the configuration

config/elasticsearch.yml (add the following settings)

xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true

4.2 Set the passwords

./bin/elasticsearch-setup-passwords interactive
... passwords are set for the following users
Changed password for user [apm_system]
Changed password for user [kibana_system]
Changed password for user [kibana]
Changed password for user [logstash_system]
Changed password for user [beats_system]
Changed password for user [remote_monitoring_user]
Changed password for user [elastic]

4.3 Verify

$ curl -u elastic:${password} "http://127.0.0.1:9200" | json_pp
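
The same check can be sketched in Python using only the standard library. The helper names basic_auth_header and fetch_cluster_info are hypothetical, and the password in the commented example is a placeholder; with a wrong password, urlopen is expected to raise an HTTPError (401).

```python
import base64
import json
import urllib.request


def basic_auth_header(user, password):
    """Build the HTTP Basic auth header that `curl -u user:password` sends."""
    token = base64.b64encode("{}:{}".format(user, password).encode("utf-8"))
    return {"Authorization": "Basic " + token.decode("ascii")}


def fetch_cluster_info(url, user, password):
    """GET the root endpoint with credentials and return the parsed JSON."""
    request = urllib.request.Request(
        url, headers=basic_auth_header(user, password)
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))


# Example (needs a running node; the password is a placeholder):
# info = fetch_cluster_info("http://127.0.0.1:9200", "elastic", "your-password")
# print(info["version"]["number"])
```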
