By Kanak Biscuitwala
We’re happy to announce that we’re open sourcing one component of our Kafka operations toolkit: an algorithm that computes a new partition assignment to brokers while minimizing data movement, balancing load across brokers, and respecting rack placement.
The code and binaries can be found at: https://github.com/SiftScience/kafka-assigner, and it has been released under the Apache License, Version 2.0.
One of the most important properties of many distributed systems is elasticity. They can be resilient to failure, and can be expanded to handle additional load. Apache Kafka certainly checks both boxes. However, the out-of-the-box ability for Kafka to change its broker cluster topology is somewhat limited. When a new node is added, for example, an operator has two choices:
- Move entire topics onto the new broker.
- Hand-craft a custom partition reassignment.
The first choice is not ideal because one possible motivation for expanding the cluster is to allow it to share the burden of serving a topic that may require additional computing resources. Moving an entire topic is limited in that sense.
The second choice requires either a lot of manual work (many production Kafka clusters have thousands of partitions to assign), or custom scripting to automate the process. It turns out, though, that automation is not necessarily trivial. In particular, the assignment algorithm needs to take current assignments into account, or else the cluster will take up significant bandwidth moving data that it didn’t need to move. The primary goal when adding a node is to have the new node take on some amount of work, so we should only move enough data to give the new node an equal share of work, and not any more.
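To see why ignoring current assignments is costly, consider a toy Python sketch (hypothetical code, not the tool's actual implementation): recomputing an assignment from scratch reshuffles most partitions, while a movement-aware pass relocates only the new broker's fair share.

```python
def from_scratch(partitions, brokers):
    # Round-robin recomputation that ignores where replicas already live.
    return {p: brokers[i % len(brokers)] for i, p in enumerate(partitions)}

def movement_aware(current, brokers):
    # Keep existing placements; move only enough partitions to give
    # every broker an equal share of the load.
    quota = -(-len(current) // len(brokers))  # ceil(partitions / brokers)
    load = {b: 0 for b in brokers}
    new, to_move = {}, []
    for p, b in current.items():
        if b in load and load[b] < quota:
            new[p] = b          # leave this replica where it is
            load[b] += 1
        else:
            to_move.append(p)   # broker gone or over quota: must move
    for p in to_move:
        target = min(load, key=load.get)  # least-loaded broker
        new[p] = target
        load[target] += 1
    return new
```

With 12 single-replica partitions spread over brokers 1–3 and a new broker 4 added, the movement-aware pass relocates only 3 partitions (the newcomer's fair share), while round-robin recomputation relocates 9 of the 12.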
The situation is worse when taking decommissioning or replacement of nodes into consideration. Currently, the only officially supported way to do this is to come up with a custom reassignment. Given this pain point, and the lack of flexibility when expanding the broker cluster, we wrote our own algorithm.
The high-level algorithm is as follows (given all topic partitions, the replication factor, and the set of nodes):
- Compute each node's fair share of partition replicas.
- Preserve existing replica placements wherever the hosting node can still accept them.
- Assign the remaining replicas, one at a time, to nodes that can accept them.
It’s worth noting that a node can accept a partition if it has capacity, and no other replica of the partition is currently being served on the node’s rack (or by the node itself if rack awareness is disabled).
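That acceptance rule can be sketched in a few lines of Python (hypothetical names and signature; the real implementation lives in the kafka-assigner repo):

```python
def can_accept(node, partition, assignment, capacity, rack_of, rack_aware=True):
    """Return True if `node` may host a new replica of `partition`.

    assignment: dict partition -> list of nodes currently holding replicas
    capacity:   max number of replicas a node may hold
    rack_of:    dict node -> rack id
    """
    # A full node can never accept another replica.
    load = sum(node in nodes for nodes in assignment.values())
    if load >= capacity:
        return False
    replicas = assignment.get(partition, [])
    if not rack_aware:
        # Without rack awareness, only forbid a second replica on the node itself.
        return node not in replicas
    # With rack awareness, forbid a second replica anywhere on the node's rack.
    return all(rack_of[r] != rack_of[node] for r in replicas)
```

Note that with rack awareness enabled, the node-level check is subsumed: a node is always on its own rack.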
Overall, the algorithm is similar in spirit to Apache Helix’s AutoRebalanceStrategy, except that it has been simplified to focus on Kafka, with rack awareness added on top.
When generating a reassignment, the tool prints JSON that can be passed to Kafka’s built-in partition reassignment tool. See http://kafka.apache.org/0100/ops.html#basic_ops_partitionassignment for instructions on how to use that tool.
./kafka-assignment-generator.sh --zk_string my-zk-host:2181 --mode PRINT_REASSIGNMENT
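The printed JSON follows the format that kafka-reassign-partitions.sh expects. A small example (hypothetical topic and broker IDs) looks like:

```json
{"version": 1, "partitions": [
  {"topic": "events", "partition": 0, "replicas": [1, 4]},
  {"topic": "events", "partition": 1, "replicas": [2, 3]}
]}
```

Save the output to a file and supply it via the `--reassignment-json-file` flag of `kafka-reassign-partitions.sh`, per the Kafka documentation linked above.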
This mode is useful for decommissioning or replacing a node. The partitions will be assigned to all live hosts, excluding the hosts that are specified.
./kafka-assignment-generator.sh --zk_string my-zk-host:2181 --mode PRINT_REASSIGNMENT --broker_hosts_to_remove misbehaving-host1,misbehaving-host2
This is less common, but it lets the operator specify exactly which hosts should participate in the broker cluster.
./kafka-assignment-generator.sh --zk_string my-zk-host:2181 --mode PRINT_REASSIGNMENT --broker_hosts host1,host2,host3
The tool can also print the current set of live brokers:
./kafka-assignment-generator.sh --zk_string my-zk-host:2181 --mode PRINT_CURRENT_BROKERS
Or the current partition assignment:
./kafka-assignment-generator.sh --zk_string my-zk-host:2181 --mode PRINT_CURRENT_ASSIGNMENT
This is not the only Kafka-related library that we feel addresses a common problem faced by companies deploying the system. We are also looking into releasing other Kafka-related code that we’ve written, as we mentioned in a previous blog post.
Kanak works at the intersection of data infrastructure and machine learning at Sift Science. Previously, he worked on scaling distributed systems at LinkedIn, and using mobile to improve journalism and finance at Stanford.