Cluster Management
Last updated
ⓒ2023. Acornsoft Corp. All rights reserved.
The primary components of a cluster are nodes, storage, and applications. To effectively manage a configured cluster and ensure it operates according to plan, monitoring, alerts, and security settings are additionally required.
Let's explore the tools and content needed for cluster management one by one.
Navigate to [Infrastructure] - [Clusters] to access functionalities related to cluster management.
Cluster Provider (Cloud Service Provider) type, Physical Location (Region)
Cluster Operation Status (Running/Stop)
Cluster Resource Allocation Type (Cluster/Service Map)
Number of Nodes Allocated to the Cluster
Allocated Resources of the Cluster
Number of GPU Nodes Allocated to the Cluster
Cluster Incident Alerts
[Function] Cluster Registration
[Function] Connect to Cluster Web Terminal
[Function] Download External Connection Certificate for the Cluster
Cocktail Cloud can be deployed in on-premises environments (physical servers) and on cloud services, and integration with additional providers is under continuous development.
Amazon Web Services
Microsoft Azure
Google Cloud Platform
Naver Cloud Platform
VMware
Alibaba Cloud
Tencent Cloud
Rovius Cloud
On-Premise (physical servers)
Datacenter
For the detailed process of Cluster Registration (Creation), refer to the link provided.
To check the resources and status of the registered cluster, navigate to the Cluster List screen.
Click on [Infrastructure] - [Clusters], and a list of accessible clusters will be displayed.
Information provided on the Cluster List screen includes:
Cluster Name (User-defined)
Kubernetes Version
Status (Running, Stop)
Number of Nodes
Cluster Resource (CPU, Memory, Storage) Status
GPU Nodes (Number of GPU nodes configured in the cluster)
Alarms (Number of incidents that have occurred)
To modify the configured resources or registration information of a registered cluster, select [Infrastructure] - [Clusters], and move to the Registration Information tab.
Cloud Service Provider
Cloud Service Type
Region (Provider and server's regional/physical location)
Cluster Name (Name represented in Cocktail Cloud)
Kubernetes Version (Information about the Kubernetes version used in the cluster)
ID (Shared ID for the cluster, required for redirecting alarm messages)
Description (User description of the cluster)
Master Address (Kubernetes API address in the format "https://host:port")
Ingress Host (Host IP address used for Ingress access; the master IP or load balancer IP)
Node Port Host Address (IP address that precedes the port when services are exposed through node ports; the master IP or load balancer IP)
Node Port Range (Range of ports that follow the IP address when services are exposed through node ports; 30000~32767 recommended)
Cluster CA Certification (Enter the contents of the ca.crt file found in the /etc/kubernetes/pki path on the master server)
Client Certificate Data (Enter the contents of the admin.crt file found in the /etc/kubernetes/pki path on the master server)
Client Key Data (Enter the contents of the admin.key file found in the /etc/kubernetes/pki path on the master server)
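Before entering the registration values, it can help to check the two fields that have a fixed format. The following is a minimal Python sketch, not part of Cocktail Cloud: the function names are hypothetical, and the only rules it applies are the "https://host:port" format and the recommended 30000~32767 range quoted in the field descriptions.

```python
from urllib.parse import urlparse

# Recommended NodePort range quoted in the field description above.
RECOMMENDED_NODE_PORT_RANGE = (30000, 32767)


def is_valid_master_address(address: str) -> bool:
    """Return True if the address looks like "https://host:port"."""
    parsed = urlparse(address)
    try:
        port = parsed.port  # raises ValueError for a non-numeric or out-of-range port
    except ValueError:
        return False
    return parsed.scheme == "https" and bool(parsed.hostname) and port is not None


def parse_node_port_range(text: str) -> tuple[int, int]:
    """Parse "30000~32767" or "30000-32767" into a (low, high) pair."""
    low, high = (int(part) for part in text.replace("~", "-").split("-"))
    if not 1 <= low <= high <= 65535:
        raise ValueError(f"invalid port range: {text!r}")
    return low, high


def within_recommended(port_range: tuple[int, int]) -> bool:
    """Return True if the range stays inside the recommended NodePort range."""
    low, high = port_range
    rec_low, rec_high = RECOMMENDED_NODE_PORT_RANGE
    return rec_low <= low and high <= rec_high
```

For example, `is_valid_master_address("https://10.0.0.1:6443")` passes the check, while an address without a port or with an `http` scheme does not.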
Navigate to [Infrastructure] - [Clusters] and move to the Node tab. Select the specific node and move to the Monitoring tab.
The information provided includes resource usage status (CPU, Memory, Disk, Network), resource summary (Capacity, Availability, Request), and status (Event type, State, Recent occurrence time, Time elapsed since the last occurrence, Cause of occurrence, Message). Node monitoring information, with additional details, is also available from the Unified Monitoring menu.
To allocate storage to the cluster, navigate to [Infrastructure] - [Clusters] - [Storage Volume] and click the "+ Create" button to access the storage creation screen.
Choose the storage type to create. The NFS and NFS Named types are generally available, and Azure services additionally provide the Azure Disk and Azure File types.
The detailed configuration depends on the selected type. The configurable information (specifications) includes:
Name: Storage name
Description: Description of storage usage
Default Storage: Option to use as the default storage
Storage Plugin: Plugin for storage
Policy: Policy for storage deletion (Retain or Delete)
Total Capacity: Total storage capacity in GB
Parameters: Storage parameter settings
Mount Options: Storage mount option settings
Label: Storage label settings
Annotation: Storage annotation settings
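Several of the fields above resemble the fields of a Kubernetes StorageClass (provisioner, reclaim policy, parameters, mount options, labels, annotations). The Python sketch below shows that correspondence under the assumption that the storage is represented as a StorageClass. The source does not state how Cocktail Cloud stores these settings, and the function name, the provisioner name, and the parameter keys are hypothetical. Total Capacity is left out because a StorageClass has no capacity field.

```python
def build_storage_class(name, provisioner, reclaim_policy="Retain",
                        is_default=False, parameters=None,
                        mount_options=None, labels=None, annotations=None):
    """Build a StorageClass manifest as a plain dict (assumed mapping)."""
    if reclaim_policy not in ("Retain", "Delete"):
        raise ValueError("reclaim_policy must be 'Retain' or 'Delete'")

    merged_annotations = dict(annotations or {})
    # Standard Kubernetes annotation that marks the default StorageClass.
    merged_annotations["storageclass.kubernetes.io/is-default-class"] = (
        "true" if is_default else "false"
    )

    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {
            "name": name,                       # Name
            "labels": dict(labels or {}),       # Label
            "annotations": merged_annotations,  # Annotation, Default Storage
        },
        "provisioner": provisioner,             # Storage Plugin
        "reclaimPolicy": reclaim_policy,        # Policy
        "parameters": dict(parameters or {}),   # Parameters
        "mountOptions": list(mount_options or []),  # Mount Options
    }


# Hypothetical NFS example; the provisioner and parameter keys are placeholders.
nfs_storage = build_storage_class(
    name="nfs-default",
    provisioner="example.com/nfs",
    reclaim_policy="Retain",
    is_default=True,
    parameters={"server": "10.0.0.10", "path": "/exports"},
    mount_options=["nfsvers=4.1"],
)
```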
Applications deployed in Cocktail Cloud are deployed at the workload level, and their status can be checked by selecting the corresponding workload in [Workloads].
Details about the deployed application, including workload name, workload status, deployment type (Deployment, Daemon Set, Stateful Set, Job, Cron Job), number of instances, current resource usage (CPU, Memory), and service uptime (Age) after deployment, can be reviewed.
When an alert occurs in a running workload (or instance), its status is delivered in real time through messaging services (Slack, etc.), email, and the dashboard.
Navigate to [Infrastructure] - [Clusters] - [Alerts], where unresolved alerts are displayed in the alert list. Each alert includes the alert name (status summary), severity (Critical, Warning), and occurrence timestamp.
To view detailed information about an alert, select the alert name, and additional information will be provided through a popup.
Cocktail Cloud provides add-ons, including Prometheus, as cluster management components that simplify cluster operations. The add-on manager supports registering, deleting, rolling back, and redeploying these components. Users can add or modify the metric targets that an add-on collects and stores to suit their requirements.
Modifying the monitoring add-on
Customize metric targets for status and resources like CPU/MEM.
Set custom thresholds for metrics (min/max values).
Trigger events and alerts based on specified metric values.
Specify individual monitoring metrics based on add-on versions.
Deploy modified metrics
Store modified metric information (Rule, Config) in etcd.
Provide add-on registration/deletion/rollback/redeployment based on modified user information.
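As an illustration of a custom threshold that triggers an alert, the sketch below builds a rule in the shape of a Prometheus alerting rule (`alert`, `expr`, `for`, `labels`). It is only a sketch: the function name and the metric name `node_cpu_usage_percent` are hypothetical, and the source does not describe the exact rule format that Cocktail Cloud stores.

```python
def build_alert_rule(alert_name, metric_expr, severity,
                     min_value=None, max_value=None, duration="5m"):
    """Build a Prometheus-style alerting rule from min/max thresholds."""
    conditions = []
    if min_value is not None:
        conditions.append(f"({metric_expr}) < {min_value}")
    if max_value is not None:
        conditions.append(f"({metric_expr}) > {max_value}")
    if not conditions:
        raise ValueError("at least one of min_value or max_value is required")

    return {
        "alert": alert_name,
        # The alert fires when the metric leaves the allowed range.
        "expr": " or ".join(conditions),
        # The condition must hold for this long before the alert fires.
        "for": duration,
        "labels": {"severity": severity},
    }


# Hypothetical metric name; warn when CPU usage stays above 90.
cpu_rule = build_alert_rule(
    "NodeCpuOutOfRange", "node_cpu_usage_percent", "warning", max_value=90
)
```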
When storage in a configured cluster is expanded, whether because space ran short or as planned work, existing pods may not immediately reflect the increased capacity. To use the expanded storage properly, pods that are already deployed need to be restarted.
Navigate to [Applications] - [Service Map] - [Workloads], select the workload list, and click the "+ Create" button to restart pods.
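For reference, outside the console a rolling restart of a Kubernetes workload is commonly triggered by changing an annotation on the pod template, which is what `kubectl rollout restart` does. The Python sketch below only builds that patch body. Whether Cocktail Cloud restarts pods this way is not stated in the source, and the function name is hypothetical.

```python
from datetime import datetime, timezone


def build_restart_patch(now=None):
    """Build a patch that changes the pod template and so restarts the pods."""
    timestamp = (now or datetime.now(timezone.utc)).isoformat()
    return {
        "spec": {
            "template": {
                "metadata": {
                    "annotations": {
                        # Annotation set by `kubectl rollout restart`.
                        "kubectl.kubernetes.io/restartedAt": timestamp,
                    }
                }
            }
        }
    }
```

Applying such a patch to a Deployment, StatefulSet, or DaemonSet changes the pod template, so the controller replaces the existing pods with new ones.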