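#318 kubesphere/kubeeye: KubeEye is an inspection tool designed for Kubernetes. It checks whether Kubernetes resources (using OPA), cluster components, cluster nodes (using Node-Problem-Detector), and other configuration follow best practices, and suggests fixes for anything that does not.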
KubeEye is a cloud-native cluster inspection tool specifically designed for Kubernetes, capable of identifying issues and risks within the Kubernetes cluster based on custom rules.
QuickStart
Installation
Download the installation package from Releases, which includes Helm chart, demo rules, and images for offline installation.
VERSION=v1.0.3
wget https://github.com/kubesphere/kubeeye/releases/download/${VERSION}/kubeeye-offline-${VERSION}.tar.gz
tar -zxvf kubeeye-offline-${VERSION}.tar.gz
cd kubeeye-offline-${VERSION}
# For offline installation, first import the images in the 'images' folder into your local container registry, then update the image repository in `chart/kubeeye/values.yaml`.
helm upgrade --install kubeeye chart/kubeeye -n kubeeye-system --create-namespace
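Before running the Helm command above, it can help to confirm that the archive unpacked as expected. The following is a minimal sketch, assuming the extracted directory contains the paths this guide refers to (`chart/kubeeye/values.yaml`, `rules`, and `images`); the function name `check_kubeeye_package` is hypothetical and not part of KubeEye.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with KubeEye): verifies that an extracted
# package directory contains the paths referenced in this guide.
check_kubeeye_package() {
  local dir="$1" missing=0 p
  for p in chart/kubeeye/values.yaml rules images; do
    if [ ! -e "$dir/$p" ]; then
      echo "missing: $p" >&2
      missing=1
    fi
  done
  # Returns 0 when every path exists, 1 otherwise.
  return "$missing"
}

# Example (run from the parent of the extracted directory):
# check_kubeeye_package "kubeeye-offline-${VERSION}" && echo "package layout looks complete"
```

The check only looks at the layout; it does not validate the chart or the image files themselves.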
Usage
Import Inspect Rules
The `rules` directory in the installation package provides demo rules, which can be customized according to specific needs.
Note: PromQL rules require the Prometheus endpoint to be configured in advance.
kubectl apply -f rules
Create Inspect Plan
Configure inspection plans on demand.
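For a single, one-off inspection, the comments in the full plan below say to leave out the `schedule` parameter. A minimal sketch of such a plan follows; the plan name `inspectplan-once` and the file name `plan-once.yaml` are placeholders chosen for illustration, and the rule name is one of the demo rules listed in this guide.

```shell
#!/usr/bin/env bash
# Sketch of a one-off plan: no `schedule` field, so only a single inspection
# is requested. "inspectplan-once" and "plan-once.yaml" are placeholder names.
# The heredoc delimiter is quoted so the shell leaves the content untouched.
cat > plan-once.yaml << 'EOF'
apiVersion: kubeeye.kubesphere.io/v1alpha2
kind: InspectPlan
metadata:
  name: inspectplan-once
spec:
  # Inspection timeout (10 minutes is also the documented default).
  timeout: 10m
  ruleNames:
  - name: deployment-inspect-rules
EOF

# Apply it the same way as the periodic plan:
# kubectl apply -f plan-once.yaml
```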
cat > plan.yaml << 'EOF'
apiVersion: kubeeye.kubesphere.io/v1alpha2
kind: InspectPlan
metadata:
  name: inspectplan
spec:
  # Inspection schedule; only cron expressions are supported. For example, '*/30 * * * ?' runs the inspection every 30 minutes.
  # For a single, one-off inspection, remove this parameter.
  schedule: "* */12 * * ?"
  # Maximum number of inspection results to retain; if omitted, all results are kept.
  maxTasks: 10
  # Whether to pause the inspection plan; applies only to periodic inspections. true or false (default: false).
  suspend: false
  # Inspection timeout (default: 10 minutes).
  timeout: 10m
  # List of inspection rules to run; fill in the InspectRule names.
  # Run `kubectl get inspectrule` to list the inspection rules in the cluster.
  ruleNames:
  - name: configmap-inspect-rules
  - name: cronjob-inspect-rules
  - name: daemonset-inspect-rules
  - name: deployment-inspect-rules
  - name: event-inspect-rules
  - name: job-inspect-rules
  - name: node-inspect-rules
  - name: pod-inspect-rules
  - name: pod-state-inspect-rules
  #   nodeName: master
  #   nodeSelector:
  #     node-role.kubernetes.io/master: ""
  # Multi-cluster inspection (currently supported only in KubeSphere)
  # clusterName:
  # - name: host
EOF
kubectl apply -f plan.yaml
Obtaining Inspection Reports
Check Inspection Results
# View the name of the inspection result for inspection report download.
kubectl get inspectresult
Command
## Get the address and port of kubeeye-apiserver service.
kubectl get svc -n kubeeye-system kubeeye-apiserver -o custom-columns=CLUSTER-IP:.spec.clusterIP,PORT:.spec.ports[*].port
## Download the inspection report, and please replace <> with the actual information obtained from the environment.
curl http://<svc-ip>:9090/kapis/kubeeye.kubesphere.io/v1alpha2/inspectresults/<result name>\?type\=html -o inspectReport.html
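The URL in the curl command above has a fixed shape, so it can be assembled by a small helper instead of being edited by hand. This is only a sketch: the function name `kubeeye_report_url` is hypothetical, and the address, port, and result name in the example are placeholders to be replaced with values from your environment.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the HTML report URL for a given service
# address, port, and inspection result name, following the path used above.
kubeeye_report_url() {
  local host="$1" port="$2" result="$3"
  printf 'http://%s:%s/kapis/kubeeye.kubesphere.io/v1alpha2/inspectresults/%s?type=html\n' \
    "$host" "$port" "$result"
}

# Example with placeholder values; quoting the URL removes the need to
# escape '?' and '=' for the shell:
# curl -fsS "$(kubeeye_report_url 10.96.0.10 9090 my-result)" -o inspectReport.html
```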
## After downloading, you can use a browser to open the HTML file for viewing.
Web Console
## Create a nodePort type svc for kubeeye-apiserver.
kubectl -n kubeeye-system expose deploy kubeeye-apiserver --port=9090 --type=NodePort --name=ke-apiserver-node-port
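The node port assigned to the new service is chosen by Kubernetes, so it has to be looked up before the report URL can be built. One way to read it, sketched here under the assumption that the service was created with the name `ke-apiserver-node-port` as in the command above, is a `jsonpath` query; the function name `kubeeye_node_port` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the NodePort that Kubernetes assigned to the
# service created by the `kubectl expose` command above.
kubeeye_node_port() {
  kubectl -n kubeeye-system get svc ke-apiserver-node-port \
    -o jsonpath='{.spec.ports[0].nodePort}'
}

# Example:
# echo "node port: $(kubeeye_node_port)"
# Node addresses can be listed with: kubectl get nodes -o wide
```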
## Enter the inspection report URL in the browser to view, and remember to replace <> with the actual information obtained from the environment.
http://<node address>:<node port>/kapis/kubeeye.kubesphere.io/v1alpha2/inspectresults/<result name>?type=html
Supported Rules List
- OPA
- PromQL
- File Change
- Kernel Parameter Configuration
- Systemd Service Status
- Node Basic Info
- File Content Inspection
- Service Connectivity
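AliyunContainerService/kube-eventer: A Kubernetes event synchronization tool that supports syncing events to DingDing, ES, Kafka, MySQL, Webhook, and more.
kubewharf/kubegateway: kube-gateway is ByteDance's internal best practice for managing a massive number of Kubernetes clusters. It is a layer-7 load-balancing proxy designed and customized specifically for the HTTP2 traffic of kube-apiserver. Its goal is to provide a flexible, stable traffic governance solution for large-scale Kubernetes clusters (more than a thousand nodes).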
