November 6, 2024

Why the etcd Database Size Should Not Exceed 8GB?

Tania Duggal
Technical Writer

etcd is a distributed key-value store that stores data reliably across a cluster of machines. It is a core component of many distributed systems, including Kubernetes, where it serves as the primary data store for cluster state and configuration. The etcd community strongly recommends keeping the database size at or below 8GB. This article explores the reasons behind this recommendation and the impact of an excessively large etcd database on cluster performance and stability.

The Importance of Quota Restrictions

1. Startup Time

When etcd starts, it needs to open the underlying BoltDB database file, read all key-value data, and rebuild the in-memory index. As the database grows, this process takes longer. A large database file lengthens startup, which is a problem in scenarios where quick recovery is essential.

Example: A Kubernetes administrator might notice that after a node reboot, the etcd service takes significantly longer to become available, delaying the recovery of the entire cluster.

2. Memory Usage

etcd uses memory-mapped files (mmap) to map the database file into memory. If the database size exceeds the memory available on the node, pages must be repeatedly read back from disk, and the resulting page faults stall requests and degrade performance. Keeping the database size within the node's memory limits helps maintain stable and efficient operation.

Example: An administrator might observe high memory usage and frequent OOM (Out of Memory) errors on the etcd nodes, leading to instability in the cluster.

3. Index Performance

etcd maintains an in-memory index to facilitate fast read and write operations. As the number of keys increases, the in-memory index grows, leading to higher memory consumption and potential performance degradation. Large indexes can increase the latency of queries and updates, impacting the overall responsiveness of the system.

Example: Increased latency in API server responses due to slower etcd read/write operations.

4. BoltDB Performance

BoltDB, the underlying storage engine used by etcd, can experience performance issues with large database files. Transaction commit times can increase, leading to jitter and inconsistent performance. The time complexity of certain operations, such as allocating contiguous pages from the free list, can become a bottleneck as the database size grows.

Example: An administrator might notice inconsistent performance and increased latency during periods of heavy writes.

5. Cluster Stability

Large database files can pose risks to cluster stability. Expensive read requests, such as those involving large key ranges or high query rates, can lead to out-of-memory (OOM) errors and network bandwidth saturation. This can result in packet loss and degraded performance, affecting the overall reliability of the cluster.

Example: Frequent OOM errors and network issues during peak usage times.

6. Snapshotting

etcd uses snapshots to back up data and facilitate recovery of follower nodes. Generating and transmitting snapshots of large database files can consume significant CPU and network resources. Follower nodes may experience slower restoration times, and the leader node may struggle to keep up with the demands of snapshot generation and transmission.

Example: Slow recovery of follower nodes and increased load on the leader node during snapshot operations.

Why the etcd Database Size Should Not Exceed 8GB?

The etcd 8GB recommendation is based on practical experience and performance considerations. Even if you have a server with 32GB of memory, the 8GB limit helps ensure that etcd can operate efficiently and reliably. The etcd documentation provides more details on this limitation: etcd Performance

You can check the current size of your etcd database with the etcdctl command-line tool, where $ENDPOINTS stands for your cluster's comma-separated endpoint list:

etcdctl --endpoints="$ENDPOINTS" endpoint status --write-out=table
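
For scripting, the same status command can emit JSON instead of a table. The following is a minimal sketch, assuming the JSON output carries a dbSize field (in bytes) for each endpoint, as recent etcdctl v3 releases do. The helper name db_size_mib, the $ENDPOINTS placeholder, and the sample input are hypothetical and only illustrate the idea.

```shell
# Hypothetical helper: read `etcdctl endpoint status --write-out=json`
# output on stdin and print each endpoint's database size in whole MiB.
# Assumes each status object carries a "dbSize" field in bytes.
db_size_mib() {
  grep -Eo '"dbSize":[0-9]+' |
    awk -F: '{ printf "%d MiB\n", $2 / (1024 * 1024) }'
}

# Illustrative input only. In practice, pipe the real command output:
#   etcdctl --endpoints="$ENDPOINTS" endpoint status --write-out=json | db_size_mib
echo '[{"Endpoint":"127.0.0.1:2379","Status":{"dbSize":3221225472,"dbSizeInUse":1073741824}}]' | db_size_mib
# prints 3072 MiB
```

The pattern matches only the dbSize key, so the separate in-use figure is ignored. Comparing the two can hint at how much space defragmentation might reclaim.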

Note: For clusters managed by cloud providers, users typically do not need to deal with this issue directly, as the provider takes care of maintenance and ensures that the etcd database size remains within recommended limits.

How Many Nodes, Pods, and Other Resources Can 8GB Support?

The exact number of nodes, pods, and other resources that 8GB can support depends on the specific use case and the size of the objects stored in etcd. As a rough estimate, an 8GB etcd database can typically support a Kubernetes cluster with thousands of nodes and tens of thousands of pods.

Best Practices for Managing etcd Database Size

1. Configure Appropriate Quota Values

To prevent the database size from growing uncontrollably, use the --quota-backend-bytes flag to set a maximum size for the etcd database. The default is 2GB, and it can be raised to 8GB based on the requirements of your deployment. When the quota is exceeded, etcd raises a cluster-wide alarm and accepts only key reads and deletes until space is freed and the alarm is disarmed. Ensure that the node's memory is sufficient to handle the configured quota.
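
The flag takes its value in bytes, so a small calculation avoids typing the number by hand. This is a minimal sketch that assumes the 8GB figure is meant as 8 GiB (8 x 1024^3 bytes). The helper name quota_bytes is hypothetical.

```shell
# Hypothetical helper: convert a quota given in GiB into the byte
# count that --quota-backend-bytes expects.
quota_bytes() {
  echo $(( $1 * 1024 * 1024 * 1024 ))
}

# Print the flag for an 8 GiB quota; pass it to etcd at startup.
echo "--quota-backend-bytes=$(quota_bytes 8)"
# prints --quota-backend-bytes=8589934592
```

Computing the value keeps the intent readable in startup scripts and makes it easy to adjust the quota later.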

2. Optimize Key-Value Sizes and Update Frequency

Minimize the size of keys and values stored in etcd to reduce the overall database size. Avoid frequent updates to large key-value pairs, as this can lead to rapid growth of the database file. Use efficient data structures and serialization formats to optimize storage usage.

Note: As a Kubernetes user or admin, you can control this by carefully designing your custom resources and avoiding storing large amounts of data in etcd. For example, avoid storing large configuration files or logs directly in etcd.

3. Perform Regular Compaction and Defragmentation

etcd uses Multi-Version Concurrency Control (MVCC), which keeps a history of key-value changes. Regularly compact this history to purge old versions. Compaction marks the space as reusable inside the database file but does not shrink the file, so follow up with defragmentation to release the free space back to the file system and improve performance.

Note: Perform compaction and defragmentation during maintenance windows to minimize the impact on cluster performance. Refer to the etcd documentation for detailed instructions: etcd Maintenance.
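
The compaction and defragmentation steps can be sketched with etcdctl as follows. This is an illustration rather than a complete procedure: it assumes the v3 etcdctl commands endpoint status, compact, and defrag, and the helper names current_revision and compact_and_defrag are hypothetical. Nothing touches a cluster until compact_and_defrag is called explicitly.

```shell
# Hypothetical helper: pull the first "revision" value out of
# `etcdctl endpoint status --write-out=json` output read from stdin.
current_revision() {
  grep -Eo '"revision":[0-9]+' | grep -Eo '[0-9]+' | head -n 1
}

# Compact history up to the current revision, then defragment.
# $1 is the endpoint list; nothing runs until this function is called.
compact_and_defrag() {
  rev=$(etcdctl --endpoints="$1" endpoint status --write-out=json | current_revision)
  etcdctl --endpoints="$1" compact "$rev"   # discard revisions older than $rev
  etcdctl --endpoints="$1" defrag           # release freed pages to the file system
}
```

A call such as compact_and_defrag "$ENDPOINTS" would run both steps against the listed members. If the space quota alarm was raised beforehand, it still has to be cleared afterwards with etcdctl alarm disarm.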

4. Monitor and Analyze Database Usage

Use etcd's monitoring metrics to keep track of database size and usage patterns. Identify and address any irregularities that may indicate inefficient use of storage. Regularly review the types of objects stored in etcd and ensure that unnecessary data is removed. Monitor metrics such as etcd_debugging_mvcc_db_total_size_in_bytes (exposed as etcd_mvcc_db_total_size_in_bytes in newer releases), etcd_debugging_mvcc_keys_total, and etcd_debugging_store_expires_total. Use tools like Prometheus and Grafana to visualize and alert on these metrics.
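
Where Prometheus is not yet in place, a quick check can be scripted against a member's metrics output. The sketch below is an assumption-laden illustration: it relies on the metric names etcd_mvcc_db_total_size_in_bytes and etcd_server_quota_backend_bytes, which recent etcd releases expose, and which may differ on older versions. The helper name db_quota_percent and the sample values are hypothetical.

```shell
# Hypothetical helper: read Prometheus text-format metrics on stdin and
# print the database size as a whole-number percentage of the quota.
# Metric names are assumptions based on recent etcd releases.
db_quota_percent() {
  awk '
    $1 == "etcd_mvcc_db_total_size_in_bytes" { size = $2 }
    $1 == "etcd_server_quota_backend_bytes"  { quota = $2 }
    END { if (quota > 0) printf "%d\n", size * 100 / quota }
  '
}

# Illustrative sample; in practice feed it a member's /metrics output.
printf '%s\n' \
  '# TYPE etcd_mvcc_db_total_size_in_bytes gauge' \
  'etcd_mvcc_db_total_size_in_bytes 6.442450944e+09' \
  'etcd_server_quota_backend_bytes 8.589934592e+09' | db_quota_percent
# prints 75
```

The same ratio is a natural basis for an alert threshold, so that operators hear about a filling database well before the quota alarm fires.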

5. Consider Sharding for Large Deployments

For deployments with very large data sets, consider sharding the data across multiple etcd clusters. This distributes the load and prevents any single cluster from becoming a bottleneck. In Kubernetes, the API server's --etcd-servers-overrides flag configures per-resource etcd servers, for example to keep events in a cluster of their own.
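
As an illustration of the override syntax, the sketch below assembles API server flags that send events to a separate etcd cluster. The endpoint URLs are hypothetical placeholders, and the snippet only prints the resulting command line rather than starting anything.

```shell
# Hypothetical endpoints; replace with your own clusters.
MAIN_ETCD="https://etcd-main.example:2379"
EVENTS_ETCD="https://etcd-events.example:2379"

# Override format: group/resource#servers. Events live in the core
# API group, so the group part before the slash is empty.
APISERVER_FLAGS="--etcd-servers=${MAIN_ETCD} --etcd-servers-overrides=/events#${EVENTS_ETCD}"
echo "kube-apiserver ${APISERVER_FLAGS}"
```

Events are a common first candidate for this split because they are written frequently and are short-lived, so moving them relieves the main cluster without affecting core cluster state.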

Maintaining Optimal etcd Performance

Maintaining the etcd database size within recommended limits is important for the performance, stability, and reliability of your cluster. By following best practices for database management, configuring appropriate quota values, and regularly monitoring usage, you can prevent the issues associated with excessively large etcd databases. This approach ensures smooth cluster operations and helps avoid the common pitfalls of database size management that could impact your production workloads.
