ECFS Health Check - GCP Labels and Tags

The Elastifile Health Check tool validates different aspects of the Elastifile cluster, such as system configurations, features, and settings.

One of these aspects is 'Labels and Tags', which are associated with the Elastifile GCP instances and are required for cluster operations.

This KB article describes the 'Labels and Tags' tests so that customers can resolve the Errors and Warnings raised when running the Health Check tool.

At this point, this KB applies only to the Elastifile Load Balancer based on GCP routes.

The checks run by the tool, with their descriptions and resolution guides, are listed below:

Labels

Cluster Hash

The 'cluster-hash' label must be present on every Elastifile instance, including the EMS, storage nodes, and replication agents. The label value must be identical on all instances in a single cluster. Otherwise, an Error is shown.

Resolution:

  1. Check all GCP instances that are part of this cluster, including the EMS, storage nodes, and replication agents, and determine whether any instance is missing the 'cluster-hash' label or whether the value differs between instances in the same cluster.
  2. Check whether the 'cluster-hash' value is the same on all storage nodes. If it is not, contact elastifile-support@google.com
  3. If it is, add or correct the 'cluster-hash' label on all other instances, using the value from the storage nodes.
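
The resolution steps above can be sketched in Python. This is an illustrative helper, not part of the Health Check tool: it assumes the instance list is shaped like the output of `gcloud compute instances list --format=json` (a "name" key and an optional "labels" mapping), and that the operator supplies the names of the storage nodes. The function name `check_cluster_hash` is hypothetical.

```python
def check_cluster_hash(instances, storage_node_names):
    """Return (reference_hash, names_to_fix).

    reference_hash is None when the storage nodes do not agree on a single
    'cluster-hash' value (or one of them has no value); in that case the
    article says to contact support rather than fix the labels manually.
    """
    # Map every instance name to its 'cluster-hash' label (None if missing).
    hashes = {
        inst["name"]: (inst.get("labels") or {}).get("cluster-hash")
        for inst in instances
    }

    # Step 2: the storage nodes must all carry the same value.
    storage_hashes = {hashes[name] for name in storage_node_names}
    if len(storage_hashes) != 1 or None in storage_hashes:
        return None, []

    # Step 3: every instance whose label is missing or different needs a fix.
    reference = storage_hashes.pop()
    to_fix = sorted(name for name, value in hashes.items() if value != reference)
    return reference, to_fix
```

Each instance returned in the second element would then be relabeled with the reference value, for example in the GCP console or with `gcloud compute instances add-labels`.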

 

Instance Type

The 'ecfs-instance-type' label must be present on every Elastifile instance, including the EMS, storage nodes, and replication agents. Depending on the instance type, the label value must be one of the following (otherwise, an Error is shown):

  • management
  • storage-node
  • replication-agent

Resolution:

  1. Check all GCP instances that are part of this cluster, including the EMS, storage nodes, and replication agents, and determine whether any instance is missing the 'ecfs-instance-type' label or has a value that does not match its instance type.
  2. Add or correct the 'ecfs-instance-type' label, using the matching value from the list above.
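
As a rough sketch of this check, the snippet below compares each instance's 'ecfs-instance-type' label with the value the operator expects for it. The `expected_types` mapping (instance name to expected value) and the function name are assumptions made for illustration; the instance dictionaries are assumed to follow the `gcloud compute instances list --format=json` shape.

```python
# The three label values listed in this article.
ALLOWED_TYPES = {"management", "storage-node", "replication-agent"}


def find_bad_instance_types(instances, expected_types):
    """Return {instance_name: (actual_value, expected_value)} for mismatches.

    expected_types is a hypothetical, operator-supplied mapping from
    instance name to the label value that instance should carry.
    """
    problems = {}
    for inst in instances:
        name = inst["name"]
        expected = expected_types[name]
        if expected not in ALLOWED_TYPES:
            raise ValueError(f"unsupported expected type for {name}: {expected}")
        actual = (inst.get("labels") or {}).get("ecfs-instance-type")
        if actual != expected:
            # actual is None when the label is missing altogether.
            problems[name] = (actual, expected)
    return problems
```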

 

Load Balancer Name

The 'elfs-lb-name' label must be present on the storage nodes only. Its value must match the 'load_balancer_name' value reported by 'cloud_provider' and must be identical on all storage nodes. Otherwise, an Error is shown.

Resolution:

  1. Check the value returned by the following command:
    elfs-cli cloud_provider show --id 1 | grep load_balancer_name
  2. Check all GCP storage node instances and verify that the 'elfs-lb-name' label exists and has the same value as the one returned by the command in step 1.
  3. Add or correct the 'elfs-lb-name' label, using the value returned by the command in step 1.

 

ECFS Load Balanced

The 'elfs-loadbalanced' label must be present on the storage nodes only. The value should be either 'yes' or 'no'. The value itself is not checked, but an Error is shown if the label is missing from any storage node.

Resolution:

  1. Check all GCP storage node instances and identify the instances that are missing the 'elfs-loadbalanced' label, along with their IPs.
  2. Using the instance IP, look up the node state in the output of the following command:
    ecs-cli nodes
  3. For each instance that is missing the label, add the 'elfs-loadbalanced' label with the value 'yes' if its node state in the output from step 2 is 'ENODE_RUNNING'. Use the value 'no' for any other state.
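
The decision in step 3 can be expressed as a small Python sketch. It assumes the storage node dictionaries follow the `gcloud compute instances list --format=json` shape (internal IP under `networkInterfaces[0].networkIP`) and that the node states were collected by hand from the command in step 2 into an IP-to-state mapping; the output format of that command is not parsed here, and the function name is hypothetical.

```python
def loadbalanced_labels_to_add(storage_nodes, node_states):
    """Return {instance_name: 'yes' | 'no'} for nodes missing the label.

    node_states maps an internal IP to the node state string; it is an
    operator-supplied input, not something this sketch reads from the CLI.
    """
    updates = {}
    for node in storage_nodes:
        labels = node.get("labels") or {}
        if "elfs-loadbalanced" in labels:
            continue  # Label already present; its value is not checked.
        ip = node["networkInterfaces"][0]["networkIP"]
        state = node_states.get(ip)
        # Only a running node gets 'yes'; any other (or unknown) state gets 'no'.
        updates[node["name"]] = "yes" if state == "ENODE_RUNNING" else "no"
    return updates
```

Treating an IP with no known state as 'no' is a conservative choice made for this sketch; the article only specifies the behavior for states that appear in the command output.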

 

Network Tags

The network tag checks are skipped if the 'Cluster Hash' check returns an Error.
Resolve the 'Cluster Hash' check first, and then run the tool again.

ECFS Network Tags

The Elastifile network tags must be present on every Elastifile instance, including the EMS, storage nodes, and replication agents. Otherwise, an Error is shown.

Each Elastifile instance has a single ECFS network tag created by the system, in the following format:

EMS: elastifile-management-node-<cluster_hash>

Storage Node: elastifile-storage-node-<cluster_hash>

Replication Agent: elastifile-replication-node-<cluster_hash>

Resolution:

  1. Find the correct 'cluster-hash', either from the value printed by the script or from the labels of one of the GCP instances.
  2. Check all GCP instances and identify the instances that do not have the expected ECFS network tag, based on the format above.
  3. Add or correct the network tag, using the format above and the 'cluster-hash' from step 1.
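
A minimal sketch of these steps follows. It builds the expected tag from the formats listed above and reports instances that do not carry it. Two assumptions are made for illustration: the instance role is read from the 'ecfs-instance-type' label (so that label must already be correct), and the instance dictionaries follow the `gcloud compute instances list --format=json` shape, where network tags sit under `tags.items`. The function names are hypothetical.

```python
# Tag prefixes taken from the formats listed in this article, keyed by the
# 'ecfs-instance-type' label value (this keying is an assumption).
TAG_PREFIXES = {
    "management": "elastifile-management-node-",
    "storage-node": "elastifile-storage-node-",
    "replication-agent": "elastifile-replication-node-",
}


def expected_network_tag(instance_type, cluster_hash):
    """Build the ECFS network tag for one instance type."""
    return TAG_PREFIXES[instance_type] + cluster_hash


def missing_network_tags(instances, cluster_hash):
    """Return {instance_name: expected_tag} for instances lacking their tag."""
    missing = {}
    for inst in instances:
        instance_type = inst["labels"]["ecfs-instance-type"]
        tag = expected_network_tag(instance_type, cluster_hash)
        current = (inst.get("tags") or {}).get("items") or []
        if tag not in current:
            missing[inst["name"]] = tag
    return missing
```

The reported tags could then be added in the GCP console or with `gcloud compute instances add-tags`.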

 

Extra Network Tags

This check verifies whether an instance has additional network tags that are neither ECFS tags nor 'https-server'. If any are found, all of them are printed with a Warning. An ECFS network tag that uses a different cluster-hash is reported as well.

The Warning is raised because extra network tags are not attached automatically to new storage nodes (e.g. after a cluster expansion or NDU), so they must be added manually if they are needed, for example to allow HTTPS traffic from the storage nodes to the EMS. You can refer to this KB for a suggested workaround.
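
The logic of this check might look roughly like the sketch below; it is not the tool's actual implementation. It assumes the `tags.items` layout of `gcloud compute instances list --format=json` and, as a simplification, accepts any of the three ECFS tag formats on any instance as long as the cluster-hash matches. The function name is hypothetical.

```python
# ECFS tag prefixes from the formats listed earlier in this article.
ECFS_PREFIXES = (
    "elastifile-management-node-",
    "elastifile-storage-node-",
    "elastifile-replication-node-",
)


def extra_network_tags(instance, cluster_hash):
    """Return the sorted tags that would trigger the Warning."""
    allowed = {"https-server"}
    allowed.update(prefix + cluster_hash for prefix in ECFS_PREFIXES)
    tags = (instance.get("tags") or {}).get("items") or []
    # An ECFS tag built from a different cluster-hash is not in `allowed`,
    # so it is reported together with any unrelated tags.
    return sorted(tag for tag in tags if tag not in allowed)
```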
