Monday, June 21, 2010

automaticly balancing data when growing a cluster

I've been thinking about load balancing lately and how to dynamically grow a cluster and its data set. I'd like to define "balance factor" as follows:

If the number of nodes in the cluster grows by a factor of x, at some future point when the data set has also grown by a factor of x the balance factor is the ratio of the smallest node to the largest node.

In my use case it is impractical for any node to know the size of all the nodes. Load balancing decisions must be made probabilistically on limited data. Today I happened across this blog post, which is very applicable to my scenario. It presents a very good solution for load balancing between a fixed number of bins, but when adding nodes without taking the system offline it is useful to look at more than 2 random points to maximize balance factor.

I did some experiments on different values for x and n (where n is the number of random points examined), and experimentally determined an approximate equation for balance factor b:

b = 1-(1/x)^(n-1)

(by approximate I mean that it is close enough for the range of values I care about, which is x between 1.1 and 10 and n between 2 and 25. It might be exactly right but I don't have time to do a proof before band practice.)

If you actually want to use this though, you need to determine n given x and the desired b. With some manipulation we get:

n = 1 + log(1-b)/log(1/x)

Let me know if you found this useful.


  1. Hey!

    DigiPeek is the best SEO & Link Building Service Provider In The World. I have 7+ Years Experience To Build SEO, Backlinks & Improve Website Ranking.

    If you need Profile Backlinks, Forum Backlinks, Dofollow Backlinks, Manual Backlinks, Trusted SEO Backlinks, Increase Domain Rating Then You Will Contact Me.

    I am glad to help You!

    Let's TRY!

  2. I was unable to oppose remarking. Totally composed!
    tech news

  3. The Crack at this point in the software turns a page that shows user potential way all time. It provides very powerful results and is one of the most popular software of it’s kind in the world. Cash Register Pro Crack

  4. Adobe XD CC Flow has a same clean handler border, which supports employers to flinch wily in the strategy area. Adobe Experience Design CC Crack

  5. Quotes to assist You Cope once Losing a husband those we tend to care regarding ne’er absolutely abandon U.S.A.. Husband Death Quotes

  6. Adolescent gamblers can't lose their house, or spouse or family, or file for bankruptcy, however they'll exhibit adolescent-specific opposed penalties. Gambling at any age is taken into account to be an issue when it interferes with the individual's relationships with friends and family or their college 카지노 사이트 and work obligations. Not only can early playing result in later playing problems, playing during the early life can result in current playing problems. Adults could spend their paycheck on playing when the money is supposed to pay for food and housing, whereas Adolescents could wager their pocket cash or their iPod or online game participant. Gambling at any group is taken into account to be an issue when it interferes with the individual's relationships with friends and family or their college and work obligations.