Rebalancing

Data rebalancing is the process of redistributing data across the nodes or partitions of a distributed system so that resources are used efficiently and load stays evenly spread. As data is added, removed, or updated, or as nodes join or leave the system, imbalances can emerge. These can lead to hotspots, where some nodes are heavily used while others sit under-utilized, or to inefficient data access patterns.
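To make the idea concrete, here is a minimal sketch of one possible rebalancing strategy: a greedy pass that moves partitions from the most loaded node to the least loaded one. The function name `rebalance`, the `assignment` and `sizes` dictionaries, and the node names are all hypothetical, chosen for illustration. Real systems typically also weigh replication, data movement cost, and request rate, none of which are modeled here.

```python
def rebalance(assignment, sizes, nodes):
    """Greedy rebalancing sketch (illustrative, not a production algorithm).

    assignment: dict mapping partition -> node it currently lives on
    sizes:      dict mapping partition -> size (any consistent unit)
    nodes:      list of nodes currently in the cluster
    Returns (placement, moves), where moves is a list of
    (partition, from_node, to_node) tuples.
    """
    load = {n: 0 for n in nodes}
    placement = {}
    moves = []

    # Keep partitions whose node is still present; collect the rest.
    orphans = []
    for part, node in assignment.items():
        if node in load:
            placement[part] = node
            load[node] += sizes[part]
        else:
            orphans.append(part)

    # Partitions from removed nodes go to the least loaded node, largest first.
    for part in sorted(orphans, key=lambda p: sizes[p], reverse=True):
        target = min(nodes, key=lambda n: load[n])
        placement[part] = target
        load[target] += sizes[part]
        moves.append((part, assignment[part], target))

    # Shift partitions from the heaviest to the lightest node while a move
    # still narrows the gap between those two nodes.
    while True:
        heavy = max(nodes, key=lambda n: load[n])
        light = min(nodes, key=lambda n: load[n])
        gap = load[heavy] - load[light]
        candidates = [p for p, n in placement.items()
                      if n == heavy and 0 < sizes[p] < gap]
        if not candidates:
            break
        part = max(candidates, key=lambda p: sizes[p])
        placement[part] = light
        load[heavy] -= sizes[part]
        load[light] += sizes[part]
        moves.append((part, heavy, light))

    return placement, moves


# Example: an empty node "c" joins a cluster of two equally loaded nodes.
sizes = {f"p{i}": 10 for i in range(1, 7)}
assignment = {"p1": "a", "p2": "a", "p3": "a",
              "p4": "b", "p5": "b", "p6": "b"}
placement, moves = rebalance(assignment, sizes, ["a", "b", "c"])
print(placement)
print(moves)
```

Only partitions strictly smaller than the current gap are moved, so every move reduces the imbalance and the loop terminates. The same function covers node removal: partitions whose node is missing from `nodes` are reassigned before the balancing pass.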

References

Data Rebalancing | Dagster Glossary (dagster.io): Redistributing data across nodes or partitions for optimal performance.