How to Performance Test ECFS

 

Basic guidelines for maximizing performance from an Elastifile cluster.

To run the erun performance tool, install the following RPM on a CentOS 7 client:

 

elfs-tools-2.7.1.2-53085.fc219ee4f9c3.el7.centos.x86_64.rpm
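
For reference, a typical install sequence on the client looks like the following (a sketch; it assumes the RPM file has already been copied to the client's working directory):

  • sudo yum localinstall -y elfs-tools-2.7.1.2-53085.fc219ee4f9c3.el7.centos.x86_64.rpm
  • which erun    (verifies that the erun binary is on the PATH after the install)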

For erun tests:

  • Use the same number of client/loader machines as the number of ECFS nodes; for example, a 3-node cluster should be tested with 3 erun clients/loaders.
  • In the erun command:
    • --clients (the option inside the erun command, not the number of client/loader machines) = number of cores / 2
    • --nr-files = number of cores
    • --queue-size should be tuned to the latency requirements:
      • Latency too high --> decrease the queue size
      • Latency too low --> increase the queue size
    • When starting a new erun test, use the --initial-write-phase flag to create new data. This first builds the working set (performing writes only) and only then starts the requested workload.
    • Once the data is available and you need to rerun a test on the same data with different options (such as a different queue size or read/write ratio), use --reuse-existing-files instead (a rerun example follows the two examples below).
    • erun example, OLTP workload for 4 cores per node, 4K block size, 70/30 read/write:
      • erun --profile io  --data-payload --max-file-size 100M --clients 2 --nr-files 4 --queue-size 8 --readwrites 70 --min-io-size 4K --max-io-size 4K --duration 12000 --erun-dir `hostname` 10.99.0.2:dc/root --initial-write-phase 
    • erun example, bandwidth (BW) test for 4 cores per node, 64K block size, 70/30 read/write:
      • erun --profile io  --data-payload --max-file-size 100M --clients 2 --nr-files 4 --queue-size 4 --readwrites 70 --min-io-size 64K --max-io-size 64K --duration 12000 --erun-dir `hostname` 10.99.0.2:dc/root --io-alignment 32768 --initial-write-phase 
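    • erun example, rerunning the OLTP test above on the data it already created, this time with a different queue size (a sketch: the command is the first example with --reuse-existing-files in place of --initial-write-phase; the queue size of 16 is only an illustration):
      • erun --profile io  --data-payload --max-file-size 100M --clients 2 --nr-files 4 --queue-size 16 --readwrites 70 --min-io-size 4K --max-io-size 4K --duration 12000 --erun-dir `hostname` 10.99.0.2:dc/root --reuse-existing-files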

For any other testing tool on a Linux machine, the rule of thumb is (see the sketch after this list):

  • Clients = half the number of cluster cores, i.e. 3 nodes with 4 cores each should be tested with 6 clients.
  • Total number of files = the number of cluster cores, i.e. 3 nodes with 4 cores each should be tested with 12 files.
  • To reach the maximum number of IOPS at low latency (~2 ms), use 4K or 8K block sizes.
  • To reach maximum bandwidth, where latency is less crucial (~10-20 ms is acceptable), use 32K, 64K or 256K block sizes.
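
As an illustration of this rule of thumb only, the sketch below mounts the export over NFS and drives it with fio against a 3-node cluster with 4 cores per node (12 cores total, so 6 jobs and 12 files). The choice of fio, the mount point /mnt/elastifile, the mount options, and the exact export path are assumptions for the example, not part of the guideline:

  sudo mkdir -p /mnt/elastifile
  sudo mount -t nfs -o vers=3 10.99.0.2:/dc/root /mnt/elastifile
  # 4K random I/O at a 70/30 read/write ratio; 6 jobs x 2 files of 100MB each = 12 files total
  fio --name=ecfs-oltp --directory=/mnt/elastifile --ioengine=libaio --direct=1 \
      --rw=randrw --rwmixread=70 --bs=4k --numjobs=6 --nrfiles=2 --size=200M \
      --iodepth=4 --runtime=600 --time_based --group_reporting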

 

The current Elastifile configurations in GCP:

| ECFS Configuration Type | Cluster Info |
| --- | --- |
| SSD Persistent Disks - SMALL | 4 Cores, 32GB RAM, 4 x 175GB PD SSD |
| SSD Persistent Disks - MEDIUM | 4 Cores, 42GB RAM, 4 x 1TB PD SSD |
| SSD Persistent Disks - LARGE | 16 Cores, 96GB RAM, 4 x 5TB PD SSD |
| Local SSD | 16 Cores, 96GB RAM, 8 x 375GB Local SSD |
| Standard Persistent Disks (latency not under 2 ms, due to standard drives) | 4 Cores, 64GB RAM, 4 x 1TB Standard PD |
| MAX Configuration - Local SSD | |

 

Some examples of the expected performance results from different GCP configurations:

Maximum sustained IOPS (under 2ms)

 

| ECFS Configuration Type | Read IOPS (Per System) | Read IOPS (Per Node) | Write IOPS (Per System) | Write IOPS (Per Node) | 70/30 Mixed IOPS (Per System) | 70/30 Mixed IOPS (Per Node) |
| --- | --- | --- | --- | --- | --- | --- |
| SSD Persistent Disks - SMALL - 3 nodes | 40,000 | 13,000 | 10,000 | 3,300 | 20,000 | 6,600 |
| SSD Persistent Disks - MEDIUM - 3 nodes | 40,000 | 13,000 | 10,000 | 3,300 | 20,000 | 6,600 |
| SSD Persistent Disks - MEDIUM - 3 nodes - Single Replication | 42,000 | 14,000 | 24,300 | 8,100 | 30,000 | 10,000 |
| SSD Persistent Disks - LARGE - 3 nodes | 74,000 | 24,000 | 19,000 | 6,300 | 45,000 | 15,000 |
| SSD Persistent Disks - LARGE - 3 nodes - Single Replication | 74,000 | 24,000 | 52,000 | 17,300 | 64,000 | 21,300 |
| Local SSD - 3 nodes | 178,000 | 59,000 | 51,000 | 17,000 | 105,000 | 35,000 |
| Standard Persistent Disks (not under 2 ms, due to standard drives) - 6 nodes | 18,000 | 3,000 | 11,500 | 1,900 | 14,000 | 2,300 |
| MAX Configuration - Local SSD | | | | | | |

 

Maximum sustained throughput (MB/s)

| ECFS Configuration Type | Cluster Info | Read Throughput (MB/s, Per System) | Read Throughput (MB/s, Per Node) | Write Throughput (MB/s, Per System) | Write Throughput (MB/s, Per Node) |
| --- | --- | --- | --- | --- | --- |
| SSD Persistent Disks - SMALL - 3 nodes | 4 Cores, 32GB RAM, 4 x 175GB PD SSD | 700 | 233 | 200 | 66 |
| SSD Persistent Disks - MEDIUM - 3 nodes | 4 Cores, 56GB RAM, 4 x 1TB PD SSD | 700 | 233 | 200 | 66 |
| SSD Persistent Disks - MEDIUM - 3 nodes - Single Replication | 4 Cores, 56GB RAM, 4 x 1TB PD SSD | 1,100 | 366 | 395 | 131 |
| SSD Persistent Disks - LARGE - 3 nodes | 16 Cores, 96GB RAM, 4 x 5TB PD SSD | 1,700 | 566 | 330 | 110 |
| SSD Persistent Disks - LARGE - 3 nodes - Single Replication | 16 Cores, 96GB RAM, 4 x 5TB PD SSD | 2,000 | 666 | 910 | 303 |
| Local SSD - 3 nodes | 16 Cores, 96GB RAM, 8 x 375GB Local SSD | 3,500 | 1,167 | 1,100 | 367 |
| Standard Persistent Disks - 6 nodes - Default | 4 Cores, 64GB RAM, 4 x 1TB Standard PD | 470 | 80 | 218 | 36 |
| Standard Persistent Disks - 3 nodes | 4 Cores, 64GB RAM, 4 x 1TB Standard PD | 240 | 80 | 112 | 37 |
| Standard Persistent Disks - 6 nodes | 4 Cores, 64GB RAM, 4 x 3TB Standard PD | 500 | 83 | 280 | 45 |
| Standard Persistent Disks - 3 nodes | 4 Cores, 64GB RAM, 4 x 3TB Standard PD | 242 | 80 | 150 | 50 |
| Standard Persistent Disks - 3 nodes | 4 Cores, 32GB RAM, 4 x 175GB Standard PD | 75 | 25 | 50 | 17 |
| MAX Configuration - Local SSD | | | | | |

 

 

Single Client comparison tests:

CentOS 7 client with NFS (erun); all tests use 100MB files:

| Test | ECFS Configuration Type | Cluster Info | Read IOPS (Cluster / Latency ms / Per Node / Per Client) | Write IOPS (Cluster / Latency ms / Per Node / Per Client) | 70/30 IOPS (Cluster / Latency ms / Per Node / Per Client) | Read Throughput MB/s (Cluster / Per Node / Per Client) | Write Throughput MB/s (Cluster / Per Node / Per Client) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 Client, 1 Connection, 4 files | SSD Persistent Disks - MEDIUM - 3 nodes | 4 Cores, 56GB RAM, 4 x 1TB PD SSD | | | | | |
| 1 Client, 6 Connections, 4 files | SSD Persistent Disks - MEDIUM - 3 nodes | 4 Cores, 56GB RAM, 4 x 1TB PD SSD | | | | | |
| 1 Client, 1 Connection, 1 file | SSD Persistent Disks - MEDIUM - 3 nodes | 4 Cores, 56GB RAM, 4 x 1TB PD SSD | | | | | |
| 1 Client, 6 Connections, 1 file | SSD Persistent Disks - MEDIUM - 3 nodes | 4 Cores, 56GB RAM, 4 x 1TB PD SSD | | | | | |
| 1 Client, 1 Connection, 20 files | Local SSD - 3 nodes | 20 Cores, 128GB RAM, 8 x 375GB Local SSD | 23500 / 1.7 / 7833 / n/a | 9300 / 1.9 / 3100 / n/a | 12300 / 1.7 / 4100 / n/a | 1000 / 333 / n/a | 665 / 222 / n/a |
| 1 Client, 1 Connection, 1 file | Local SSD - 3 nodes | 20 Cores, 128GB RAM, 8 x 375GB Local SSD | 14600 / 1.7 / 4867 / n/a | 5700 / 1.9 / 1900 / n/a | 13500 / 1.9 / 4500 / n/a | 1300 / 433 / n/a | 240 / 80 / n/a |
| 1 Client, 30 Connections, 20 files | Local SSD - 3 nodes | 20 Cores, 128GB RAM, 8 x 375GB Local SSD | 180000 / 2.6 / 60000 / n/a | 60000 / 2.7 / 20000 / n/a | 113000 / 2.7 / 37667 / n/a | 1600 / 533 / n/a | 770 / 257 / n/a |
| 1 Client, 30 Connections, 1 file | Local SSD - 3 nodes | 20 Cores, 128GB RAM, 8 x 375GB Local SSD | 194000 / 2.3 / 64667 / n/a | 62000 / 2.6 / 20667 / n/a | 117000 / 2.6 / 39000 / n/a | 1600 / 533 / n/a | 810 / 270 / n/a |

Windows 2016R2 with NFS services (latency taken from the GUI); all tests use 100MB files:

| Test | ECFS Configuration Type | Cluster Info | Read IOPS (Cluster / Latency ms / Per Node / Per Client) | Write IOPS (Cluster / Latency ms / Per Node / Per Client) | 70/30 IOPS (Cluster / Latency ms / Per Node / Per Client) | Read Throughput MB/s (Cluster / Per Node / Per Client) | Write Throughput MB/s (Cluster / Per Node / Per Client) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 Client, 1 Node, 20 files | Local SSD - 4 nodes | 20 Cores, 128GB RAM, 8 x 375GB Local SSD | 24000 / 2.1 / 6000 / 24000 | 16500 / 2.8 / 4125 / 16500 | 20000 / 2.2 / 5000 / 20000 | 875 / 218.75 / 875 | 320 / 80 / 320 |
| 1 Client, 1 Node, 1 file | Local SSD - 4 nodes | 20 Cores, 128GB RAM, 8 x 375GB Local SSD | 17000 / 1.8 / 4250 / 17000 | 4500 / 1.9 / 1125 / 4500 | 12000 / 1.9 / 3000 / 12000 | 950 / 237.5 / 950 | 165 / 41.25 / 165 |
| 10 Clients, 1 Node, 20 files each | Local SSD - 4 nodes | 20 Cores, 128GB RAM, 8 x 375GB Local SSD | 95000 / 3 / 23750 / 9500 | 47000 / 2.9 / 11750 / 4700 | 70000 / 2.8 / 17500 / 7000 | 1800 / 450 / 180 | 680 / 170 / 68 |
| 10 Clients, 1 Node, 1 file each | Local SSD - 4 nodes | 20 Cores, 128GB RAM, 8 x 375GB Local SSD | 91000 / 1.9 / 22750 / 9100 | 35000 / 2.3 / 8750 / 3500 | 60000 / 1.9 / 15000 / 6000 | 1800 / 450 / 180 | 670 / 167.5 / 67 |
| 10 Clients, 4 Nodes (40 clients total), 1 file each | Local SSD - 4 nodes | 20 Cores, 128GB RAM, 8 x 375GB Local SSD | | | | | |
| 10 Clients, 4 Nodes (40 clients total), 20 files each | Local SSD - 4 nodes | 20 Cores, 128GB RAM, 8 x 375GB Local SSD | | | | | |

 
