Start with MaxSessions equal to the number of available AMPs or to the number of sessions you want to allocate for the job, and use one instance of each operator (producers and consumer).
- Run the job and record how long it took for the job to complete.
- Then, increment the number of instances for the consumer operator. Do not change the number of sessions, because changing more than one variable at a time makes the runs difficult to compare.
- Rerun the job and examine the job output:
- Did the job run faster?
- Was the second consumer instance used?
- How many rows were processed by the first vs. the second instance? (For example, were they about equal, or was one doing 80% of the work while the other did only 20%?) If the work was balanced, another instance might improve performance. If the second instance did not do much work, a third one is unlikely to be engaged and would waste resources without much performance gain.
- Repeat the process of increasing the number of instances for the consumer operator until you are using as many instances as it needs.
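The balance check described above can be sketched in a few lines of Python. This is only an illustration: the row counts are assumed to be copied by hand from the job output, the function names are hypothetical, and the 0.5 threshold is an arbitrary value chosen for the example, not a product recommendation.

```python
def instance_shares(rows_per_instance):
    """Return each instance's fraction of the total rows processed."""
    if not rows_per_instance:
        raise ValueError("at least one instance is required")
    total = sum(rows_per_instance)
    if total == 0:
        return [0.0 for _ in rows_per_instance]
    return [rows / total for rows in rows_per_instance]


def worth_adding_instance(rows_per_instance, min_share_ratio=0.5):
    """Heuristic only: suggest trying another instance when the least-busy
    instance handled at least `min_share_ratio` of an even share of the rows.
    The default threshold is an assumption made for this sketch."""
    shares = instance_shares(rows_per_instance)
    even_share = 1.0 / len(shares)
    return min(shares) >= min_share_ratio * even_share


# Hypothetical row counts for two consumer instances.
print(instance_shares([800, 200]))        # one instance did most of the work
print(worth_adding_instance([800, 200]))  # unbalanced: a third is unlikely to help
print(worth_adding_instance([510, 490]))  # balanced: a third may be worth a test run
```

Elapsed time remains the deciding measurement; the heuristic only indicates whether another test run is likely to be worthwhile.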
Now it is time to look at the producers. You can try increasing the number of instances of each producer separately to see whether it feeds data into the data stream at a higher rate.
- Increase the number of instances for each producer operator separately. Again, do not change the number of sessions.
- Rerun the job and compare the job output:
- Did the job run faster? Remember this is the ultimate goal!
- Was there a change in the number of consumer operator instances used? Because the work is always balanced across the producer instances, look at the consumer instances to see whether the change affected the job.
- Was there a change in the balance of the consumer operator instances? You want to balance the number of rows being loaded across the instances, using as many instances as necessary. Be careful not to trade overall job performance for instance balance. Just because rows are spread evenly across all instances, it does not necessarily mean that the balanced load makes the whole job run faster.
- Depending on your results, you may want to increase the number of instances for the producer or consumer operators.
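One way to keep the comparison honest is to log each run with its settings and results. The sketch below is a minimal example with invented figures; the `Run` record and both helper functions are hypothetical and not part of any product API. It checks that only one setting changed between two runs and then picks the faster one, even when its consumer instances are less evenly balanced.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Run:
    """Settings and results of one test run (values entered by hand)."""
    producer_instances: int
    consumer_instances: int
    max_sessions: int
    elapsed_seconds: float
    rows_per_consumer_instance: List[int]


def changed_settings(a, b):
    """Names of the tuning settings that differ between two runs."""
    names = ("producer_instances", "consumer_instances", "max_sessions")
    return [n for n in names if getattr(a, n) != getattr(b, n)]


def fastest(runs):
    """The run with the shortest elapsed time, which is the ultimate goal."""
    return min(runs, key=lambda r: r.elapsed_seconds)


# Invented numbers for illustration only.
baseline = Run(1, 2, 8, 120.0, [500, 500])
trial = Run(2, 2, 8, 95.0, [700, 300])

# Exactly one setting should differ, otherwise the runs are hard to compare.
print(changed_settings(baseline, trial))
# The trial is less balanced but faster, so it is the better configuration.
print(fastest([baseline, trial]))
```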
Now that you know an acceptable number of instances, you can modify the value of MaxSessions to see if there is an impact.
- Decrease the value of MaxSessions. It is best to make MaxSessions a multiple of the number of instances for the operator so that the sessions are evenly distributed across the instances.
- Rerun the job and compare the output:
- Did the job run faster?
- Was there a change in the number of consumer operator instances used?
- Was there a change in the balance of data in the consumer operator instances?
- Depending on the results, you may want to use the original MaxSessions, or continue experimenting. You may even want to revisit the number of instances you are using.
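To choose MaxSessions values to test, it can help to list the multiples of the instance count below the starting value. The following sketch assumes that sessions are spread as evenly as possible across the instances; both function names are hypothetical, and the upper limit stands for whatever starting value you used (for example, the number of available AMPs).

```python
def max_sessions_candidates(instances, upper_limit):
    """Multiples of `instances` that do not exceed `upper_limit`,
    largest first, as candidate MaxSessions values to test."""
    if instances < 1 or upper_limit < instances:
        return []
    largest = (upper_limit // instances) * instances
    return list(range(largest, 0, -instances))


def sessions_per_instance(max_sessions, instances):
    """Assumed even split of sessions across instances; any remainder
    is given to the first instances, one extra session each."""
    base, extra = divmod(max_sessions, instances)
    return [base + 1 if i < extra else base for i in range(instances)]


# Example: 3 consumer instances, starting value of 16 sessions.
print(max_sessions_candidates(3, 16))  # [15, 12, 9, 6, 3]
print(sessions_per_instance(16, 3))    # [6, 5, 5], uneven
print(sessions_per_instance(15, 3))    # [5, 5, 5], even
```

Each candidate still has to be verified with a test run, because a smaller MaxSessions can change both the elapsed time and the balance across consumer instances.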