Hi. In the previous part, we harnessed randomness to generate boids and added a rule for obstacle avoidance. We'll be completing our experiment in this part by covering the following:
- Adding rules for flock behaviour
- Using parallelism to boost performance
- Bench-marking performance
You can download the project at this point from this tagged version.
Life in a flock consists of three simple rules as mentioned in the original paper.
- Collision Avoidance: avoid collisions with nearby flockmates
- Velocity Matching: attempt to match velocity with nearby flockmates
- Flock Centering: attempt to stay close to nearby flockmates
These rules together with obstacle avoidance governs how each boid moves relative to it's neighbours. In this section, we will integrate these rules into our simulation. At the end of it you will have something like this.
Refactor for discipline
frame_update directly changes the velocity of the boid if it encounters an obstacle. Here's the code snippet.
This won't work when there are multiple rules. So we'll introduce the concept the of acceleration.
- Each rule has a constant weightage.
- Each rule's contribution to the new velocity is directly proportional to its weight.
- The difference between current and new velocity is the acceleration.
- The magnitude of acceleration has an upper limit.
- The clamped acceleration changes the current velocity.
To integrate this concept into our boid model, we'll have to do some refactoring. Let's start by creating the
calc_acc method and few constant values.
This second parameter of this method
vel is the ideal velocity of the boid calculated by applying a rule. The acceleration is the difference between current velocity and maximum velocity along the ideal direction. A clamped value of acceleration is returned.
Next we move the obstacle avoidance logic to
apply_rules method. The only difference is in
acc += self.calc_acc(&dir) * OBSTACLE_W;. Instead of directly changing velocity, we add a proportional amount to the total acceleration.
Then we modify
frame_update to use the acceleration value from
apply_rules to calculate the new velocity. This was only refactoring and it should not have any effect on the behaviour.
For this animation, I changed
SPAWN_NUMBER to 50 in
main.rs and ran
cargo run --release. A release build is highly optimized and performs much better.
Rules make a flock
To use neighbour information, we'll have to pass some extra parameters to
frame_update. Most importantly we need the list of boids.
We pass a clone of all the boids and the index number of the boid calling the
frame_update method. Here,
clone performs a deep copy since we have derived the
Clone attribute for struct
Boid. This is essential, because Rust will not allow passing
frame_update. Its rules do not allow creating an immutable reference, when a mutable reference has already been borrowed by
iter_mut. Logically, it makes sense. If we change the contents of the list while using it for calculations, we will get unpredictable results.
To accommodate this change, method type signatures must be changed in
The functions are much more readable because each parameter has been formatted to a single line by
rustfmt. It is the default Rust code formatter and applies formatting rules uniformly across most open source projects.
Let's aggregate neighbour information. Define the following constants at the top of
There are three types of constants:
- Scaling factors most of which we have defined earlier.
- Rule weightage defines the contribution of each rule to the final acceleration.
- In limits,
VIEW_ANGis 120 degrees, indicating the range of vision for a boid.
VIEW_Ris the radius of the sphere inside which neighbours can be detected.
COLLISION_Ris the radius inside which a boid tries to avoid collision with other boids.
To implement rules in
apply_rules method, we need to create accumulators for neighbour information.
There are 3 accumulators for each flock rule and 2 counters for how many neighbours have matched certain conditions.
The logic is to filter for boids that match the following criteria:
- It's not the one calling this method. We do this by comparing the index.
- It is within the viewing angle. We've defined viewing angle to 120 degrees in any direction from the velocity.
- It is within the viewing radius of
With these we aggregate the following data:
neighbourssimply contains the number of boids that satisfied the above conditions.
alignmentsums up velocities.
cohesioncombined the position of neighbours.
collisionadds vectors in the opposite direction to neighbours that are too close. The closer the neighbou the larger the repulsion.
too_nearcounts how many neighbours matched this condition.
We use the aggregated data to create an acceleration that affects the boid.
A rule that has no valid neighbours should not contribute to acceleration. That is why we'll perform a few checks before applying the rules.
cohesionare only applied if there is at least one valid neighbour.
alignmentis applied in the direction of average velocity of all valid neighbours.
cohesionis calculated as the direction toward the average position (center of mass) of all valid neighbours.
collisionis applied only when there is at least one boid that is too close. It applies an acceleration towards the direction of repulsion.
cargo run --release and be awed by the magic.
If you don't get something like this, you'll have to tweak the constant values and weights. These are some general observations.
MIN_VELis a helpful step. It keeps the flock from collapsing on itself due to cohesive forces.
ALIGN_Wis good. It keeps the flock from becoming too uniform.
MAX_ACCcan make the boids jittery because a large acceleration causes sudden changes in velocity.
Mark the bench
Our experiment has achieved its goal but it's not yet complete. Rust is known for performance and hence, it is our duty to benchmark the performance of
We'll use the criterion crate for performance benchmarking. I won't be detailing every change needed to make this run, mostly because it involves a non-trivial amount of refactoring. You can directly download the tagged version. The major changes are given below.
- Till now, we treated rboids as a binary/executable project but
rboidsto be a library type project. To make this change, I added a
lib.rsfile that exports
pub mod boid;. There is no need to modify
main.rs. A library type project can still have a
mainfunction for executing code.
boid.rsfor convenience. Since
criteriontreats the project like a library, we cannot use any function from
- Add benching related dependencies and configuration to
- Create a benchmark for sampling
frame_updatewith different number of boids.
The benchmark runs for 10, 50, 100, 200, 500, 1000 number of boids. It samples 10 times for each value and each sample runs 100 frames. The benchmark ignores the cost of rendering the scene and assumes a constant
delta_time of 0.02 seconds.
cargo bench runs the benchmark and generates the data for the graphs shown below.
gnuplot is an external dependency necessary for generating these graphs.
The line graph shows that time complexity is roughly quadratic to the number of boids. Logically too, this makes sense, since each boid iterates over all other boids to compute aggregate neighbour information.
Other graphs show the variation in run time for different samples.
frame_update/1000 is the graph for the run with 1000 boids. At an average of 3.05 seconds for 100 frames, it comes out to about 33 frames fps. This does not match up to the 60 fps milestone we want to achieve.
Interestingly, this is the smallest section of this entire series. Rust has forced us to design our program a way that makes parallelism trivial. We'll use the
rayon crate. Add
rayon = "1.3.0" to dependencies.
We only need to parallelize the operation of calling
frame_update. This is computationally the most heavy method and the only one whose implementation we can control. Since the method itself takes no mutable references, there is no possibility of data races or deadlock situations of any kind. To do this, we convert the mutable iterator to a parallel mutable iterator by using
However, compiling it will throw an error. The key message is this.
std::marker::Sync is not implemented for dyn ncollide3d::query::ray::ray::RayCast<f32>
The compiler is telling us that something cannot be shared across threads because it does not implement
Sync. We are using
dyn RayCast<f32> to defined the type obstacle shapes. Let's check documentation for one of the shapes.
All the shapes we are using implement
Sync as part of Auto Trait. We need to explicitly tell the compiler that these shapes can be shared across threads. Change the type to
Box<dyn RayCast<f32> + Sync in all the lines that refer to obstacles.
When we use
win.input.delta_time(), the compiler checks if
win can be sent across threads which is not possible. The solution is to store
delta_time in a variable before passing it in the method call.
Congratulations! You have achieved parallelism in Rust. There should be no apparent change because our simulation only has 50 boids. The difference will become visible when simulating a large number of boids.
I added a new benchmark to compare sequential and parallel versions. Again, I will not be describing the benchmark here, mostly because I am not fully aware how
criterion works. The results are shown below and you can download the tagged version also given below.
Parallel operation drops the average time for 100 frames for 1000 boids from 3.02 seconds to 1.20 seconds. Yes! We broke the 60 fps mark.
We've reached the land of boids. Take a moment to enjoy your creation.
There's a lot more tinkering we can do with this experiment.
- Create a scene with different obstacles.
- Change the lighting and the material for the boids to make a more colourful animation.
- Add roll to the boid model. Right now the boid is a cone. Since it has rotational symmetry about its axis, we cannot make it roll as it changes the velocity. Change the boid model to something like a paper airplane and add 'make it roll' while it changes directions.
A combination of Rust and procedurally generated graphics has made this a fascinating experiment. I hope you have enjoyed working through it as much as I did creating it.
All criticism and questions are welcome. File an issue at the repo or comment below with your GitHub account.