基于cat api监控
es集群监控,最好是自己干吧,因为官方出了那种非常棒的x-pack做权限认证,监控,等等,做的都非常好,但是。。。是收费的。。。
自己做es集群监控,就是根据es的一些api,自己写一个java web的应用,自己做前端界面,程序里不断的每隔几秒钟,调用一次后端的接口,获取到各种监控信息,然后用前端页面显示出来,开发开发一个可视化的es集群的监控的工作台
在es老版本,有一个很好用的插件,叫做head,但是5.x之后都收口了,不让做这种插件了,主推自己的x-pack,要收费,要盈利,赚钱
1、GET /_cat/aliases?v
看到集群中有哪些索引别名
alias index filter routing.index routing.search
alias1 test1 - - -
alias2 test1 * - -
alias3 test1 - 1 1
alias4 test1 - 2 1,2
2、GET /_cat/allocation?v
看到每个节点分配了几个shard,对磁盘的占用空间大小,使用率,等等
shards disk.indices disk.used disk.avail disk.total disk.percent host ip node
5 260b 47.3gb 43.4gb 100.7gb 46 127.0.0.1 127.0.0.1 CSUXak2
3、GET /_cat/count?v
看每个索引的document数量
epoch timestamp count
1475868259 15:24:20 120
4、GET /_cat/fielddata?v
看每个节点的jvm heap内存中的fielddata内存占用情况(对分词的field进行聚合/排序要用jvm heap中的正排索引,fielddata)
id host ip node field size
Nqk-6inXQq-OxUfOUI8jNQ 127.0.0.1 127.0.0.1 Nqk-6in body 544b
Nqk-6inXQq-OxUfOUI8jNQ 127.0.0.1 127.0.0.1 Nqk-6in soul 480b
5、GET /_cat/health?v
比较全面的看一个es集群的整体健康状况,主要是看是green,yellow,red
epoch timestamp cluster status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1475871424 16:17:04 elasticsearch green 1 1 5 5 0 0 0 0 - 100.0%
6、GET /_cat/indices?v
每个索引的具体的情况,比如有几个shard,多少个document,被删除的document有多少,占用了多少磁盘空间
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
yellow open twitter u8FNjxh8Rfy_awN11oDKYQ 1 1 1200 0 88.1kb 88.1kb
7、GET /_cat/master?v
看master node当前的具体的情况,哪个node是当前的master node
id host ip node
YzWoH_2BT-6UjVGDyPdqYg 127.0.0.1 127.0.0.1 YzWoH_2
8、GET /_cat/nodes?v
看每个node的具体的情况,就比如jvm heap内存使用率,内存使用率,cpu load,是什么角色
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
127.0.0.1 65 99 42 3.07 mdi * mJw06l1
9、GET /_cat/pending_tasks?v
看当前pending没执行完的task的具体情况,执行的是什么操作
insertOrder timeInQueue priority source
1685 855ms HIGH update-mapping [foo][t]
1686 843ms HIGH update-mapping [foo][t]
1693 753ms HIGH refresh-mapping [foo][[t]]
1688 816ms HIGH update-mapping [foo][t]
1689 802ms HIGH update-mapping [foo][t]
1690 787ms HIGH update-mapping [foo][t]
1691 773ms HIGH update-mapping [foo][t]
10、GET /_cat/plugins?v&s=component&h=name,component,version,description
看当前集群安装了哪些插件
name component version description
U7321H6 analysis-icu 5.5.1 The ICU Analysis plugin integrates Lucene ICU module into elasticsearch, adding ICU relates analysis components.
U7321H6 analysis-kuromoji 5.5.1 The Japanese (kuromoji) Analysis plugin integrates Lucene kuromoji analysis module into elasticsearch.
U7321H6 analysis-phonetic 5.5.1 The Phonetic Analysis plugin integrates phonetic token filter analysis with elasticsearch.
U7321H6 analysis-smartcn 5.5.1 Smart Chinese Analysis plugin integrates Lucene Smart Chinese analysis module into elasticsearch.
U7321H6 analysis-stempel 5.5.1 The Stempel (Polish) Analysis plugin integrates Lucene stempel (polish) analysis module into elasticsearch.
U7321H6 analysis-ukrainian 5.5.1 The Ukrainian Analysis plugin integrates the Lucene UkrainianMorfologikAnalyzer into elasticsearch.
U7321H6 discovery-azure-classic 5.5.1 The Azure Classic Discovery plugin allows to use Azure Classic API for the unicast discovery mechanism
U7321H6 discovery-ec2 5.5.1 The EC2 discovery plugin allows to use AWS API for the unicast discovery mechanism.
U7321H6 discovery-file 5.5.1 Discovery file plugin enables unicast discovery from hosts stored in a file.
U7321H6 discovery-gce 5.5.1 The Google Compute Engine (GCE) Discovery plugin allows to use GCE API for the unicast discovery mechanism.
U7321H6 ingest-attachment 5.5.1 Ingest processor that uses Apache Tika to extract contents
U7321H6 ingest-geoip 5.5.1 Ingest processor that uses looksup geo data based on ip adresses using the Maxmind geo database
U7321H6 ingest-user-agent 5.5.1 Ingest processor that extracts information from a user agent
U7321H6 jvm-example 5.5.1 Demonstrates all the pluggable Java entry points in Elasticsearch
U7321H6 lang-javascript 5.5.1 The JavaScript language plugin allows to have javascript as the language of scripts to execute.
U7321H6 lang-python 5.5.1 The Python language plugin allows to have python as the language of scripts to execute.
U7321H6 mapper-attachments 5.5.1 The mapper attachments plugin adds the attachment type to Elasticsearch using Apache Tika.
U7321H6 mapper-murmur3 5.5.1 The Mapper Murmur3 plugin allows to compute hashes of a field's values at index-time and to store them in the index.
U7321H6 mapper-size 5.5.1 The Mapper Size plugin allows document to record their uncompressed size at index time.
U7321H6 store-smb 5.5.1 The Store SMB plugin adds support for SMB stores.
11、GET _cat/recovery?v
看shard recovery恢复的一个过程的具体情况
index shard time type stage source_host source_node target_host target_node repository snapshot files files_recovered files_percent files_total bytes bytes_recovered bytes_percent bytes_total translog_ops translog_ops_recovered translog_ops_percent
twitter 0 13ms store done n/a n/a node0 node-0 n/a n/a 0 0 100% 13 0 0 100% 9928 0 0 100.0%
12、GET /_cat/repositories?v
查看用于snapshotting的repository有哪些
id type
repo1 fs
repo2 s3
13、GET /_cat/thread_pool?v
看每个线程池的具体的情况
Z6MkIvC bulk 0 0 0
Z6MkIvC fetch_shard_started 0 0 0
Z6MkIvC fetch_shard_store 0 0 0
Z6MkIvC flush 0 0 0
Z6MkIvC force_merge 0 0 0
Z6MkIvC generic 0 0 0
Z6MkIvC get 0 0 0
Z6MkIvC index 0 0 0
Z6MkIvC listener 0 0 0
Z6MkIvC management 1 0 0
Z6MkIvC refresh 0 0 0
Z6MkIvC search 0 0 0
Z6MkIvC snapshot 0 0 0
Z6MkIvC warmer 0 0 0
14、GET _cat/shards?v
看每个shard的具体的情况
twitter 0 p STARTED 3014 31.1mb 192.168.56.10 H5dfFeA
twitter 0 r UNASSIGNED
15、GET /_cat/segments?v
看每个segement,索引segment文件的情况,在哪个node上,有多少个document,占用了多少磁盘空间,有多少数据在内存中,是否可以搜索
index shard prirep ip segment generation docs.count docs.deleted size size.memory committed searchable version compound
test 3 p 127.0.0.1 _0 0 1 0 3kb 2042 false true 6.5.1 true
test1 3 p 127.0.0.1 _0 0 1 0 3kb 2042 false true 6.5.1 true
16、GET /_cat/snapshots?v&s=id
看当前执行的snapshot的操作
id status start_epoch start_time end_epoch end_time duration indices successful_shards failed_shards total_shards
snap1 FAILED 1445616705 18:11:45 1445616978 18:16:18 4.6m 1 4 1 5
snap2 SUCCESS 1445634298 23:04:58 1445634672 23:11:12 6.2m 2 10 0 10
17、GET /_cat/templates?v&s=name
看当前有的那些tempalte,具体的情况是什么
name template order version
template0 te* 0
template1 tea* 1
template2 teak* 2 7
基于cluster进行监控
1、GET _cluster/health
{
"cluster_name" : "testcluster",
"status" : "yellow",
"timed_out" : false,
"number_of_nodes" : 1,
"number_of_data_nodes" : 1,
"active_primary_shards" : 5,
"active_shards" : 5,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 5,
"delayed_unassigned_shards": 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch": 0,
"task_max_waiting_in_queue_millis": 0,
"active_shards_percent_as_number": 50.0
}
2、GET _cluster/stats?human&pretty
{
"timestamp": 1459427693515,
"cluster_name": "elasticsearch",
"status": "green",
"indices": {
"count": 2,
"shards": {
"total": 10,
"primaries": 10,
"replication": 0,
"index": {
"shards": {
"min": 5,
"max": 5,
"avg": 5
},
"primaries": {
"min": 5,
"max": 5,
"avg": 5
},
"replication": {
"min": 0,
"max": 0,
"avg": 0
}
}
},
"docs": {
"count": 10,
"deleted": 0
},
"store": {
"size": "16.2kb",
"size_in_bytes": 16684,
"throttle_time": "0s",
"throttle_time_in_millis": 0
},
"fielddata": {
"memory_size": "0b",
"memory_size_in_bytes": 0,
"evictions": 0
},
"query_cache": {
"memory_size": "0b",
"memory_size_in_bytes": 0,
"total_count": 0,
"hit_count": 0,
"miss_count": 0,
"cache_size": 0,
"cache_count": 0,
"evictions": 0
},
"completion": {
"size": "0b",
"size_in_bytes": 0
},
"segments": {
"count": 4,
"memory": "8.6kb",
"memory_in_bytes": 8898,
"terms_memory": "6.3kb",
"terms_memory_in_bytes": 6522,
"stored_fields_memory": "1.2kb",
"stored_fields_memory_in_bytes": 1248,
"term_vectors_memory": "0b",
"term_vectors_memory_in_bytes": 0,
"norms_memory": "384b",
"norms_memory_in_bytes": 384,
"doc_values_memory": "744b",
"doc_values_memory_in_bytes": 744,
"index_writer_memory": "0b",
"index_writer_memory_in_bytes": 0,
"version_map_memory": "0b",
"version_map_memory_in_bytes": 0,
"fixed_bit_set": "0b",
"fixed_bit_set_memory_in_bytes": 0,
"file_sizes": {}
},
"percolator": {
"num_queries": 0
}
},
"nodes": {
"count": {
"total": 1,
"data": 1,
"coordinating_only": 0,
"master": 1,
"ingest": 1
},
"versions": [
"5.5.1"
],
"os": {
"available_processors": 8,
"allocated_processors": 8,
"names": [
{
"name": "Mac OS X",
"count": 1
}
],
"mem" : {
"total" : "16gb",
"total_in_bytes" : 17179869184,
"free" : "78.1mb",
"free_in_bytes" : 81960960,
"used" : "15.9gb",
"used_in_bytes" : 17097908224,
"free_percent" : 0,
"used_percent" : 100
}
},
"process": {
"cpu": {
"percent": 9
},
"open_file_descriptors": {
"min": 268,
"max": 268,
"avg": 268
}
},
"jvm": {
"max_uptime": "13.7s",
"max_uptime_in_millis": 13737,
"versions": [
{
"version": "1.8.0_74",
"vm_name": "Java HotSpot(TM) 64-Bit Server VM",
"vm_version": "25.74-b02",
"vm_vendor": "Oracle Corporation",
"count": 1
}
],
"mem": {
"heap_used": "57.5mb",
"heap_used_in_bytes": 60312664,
"heap_max": "989.8mb",
"heap_max_in_bytes": 1037959168
},
"threads": 90
},
"fs": {
"total": "200.6gb",
"total_in_bytes": 215429193728,
"free": "32.6gb",
"free_in_bytes": 35064553472,
"available": "32.4gb",
"available_in_bytes": 34802409472
},
"plugins": [
{
"name": "analysis-icu",
"version": "5.5.1",
"description": "The ICU Analysis plugin integrates Lucene ICU module into elasticsearch, adding ICU relates analysis components.",
"classname": "org.elasticsearch.plugin.analysis.icu.AnalysisICUPlugin",
"has_native_controller": false
},
{
"name": "ingest-geoip",
"version": "5.5.1",
"description": "Ingest processor that uses looksup geo data based on ip adresses using the Maxmind geo database",
"classname": "org.elasticsearch.ingest.geoip.IngestGeoIpPlugin",
"has_native_controller": false
},
{
"name": "ingest-user-agent",
"version": "5.5.1",
"description": "Ingest processor that extracts information from a user agent",
"classname": "org.elasticsearch.ingest.useragent.IngestUserAgentPlugin",
"has_native_controller": false
}
]
}
}
3、GET _cluster/pending_tasks
{
"tasks": [
{
"insert_order": 101,
"priority": "URGENT",
"source": "create-index [foo_9], cause [api]",
"time_in_queue_millis": 86,
"time_in_queue": "86ms"
},
{
"insert_order": 46,
"priority": "HIGH",
"source": "shard-started ([foo_2][1], node[tMTocMvQQgGCkj7QDHl3OA], [P], s[INITIALIZING]), reason [after recovery from shard_store]",
"time_in_queue_millis": 842,
"time_in_queue": "842ms"
},
{
"insert_order": 45,
"priority": "HIGH",
"source": "shard-started ([foo_2][0], node[tMTocMvQQgGCkj7QDHl3OA], [P], s[INITIALIZING]), reason [after recovery from shard_store]",
"time_in_queue_millis": 858,
"time_in_queue": "858ms"
}
]
}