Strategies for Balancing Sessions and Instances
Without concrete performance data, no fixed recommendations or guidelines exist for determining the optimum number of sessions or instances, but the following strategies can help you find a good balance.
Balancing sessions and instances helps to achieve the best overall job performance without wasting resources.
Strategy 1
Start with MaxSessions equal to the number of available AMPs or number of sessions you want to allocate for the job, using one instance of each operator (producers and consumer).
1 Run the job and record how long it took for the job to complete.
2 Then, increment the number of instances for the consumer operator. Do not change the number of sessions, because changing more than one variable at a time makes the results difficult to compare.
3 Rerun the job and examine the job output:
If the work was unbalanced, another instance might help. If the second instance did not do much work, a third instance is unlikely to be engaged.
4 Repeat the process of increasing the number of instances for the consumer operator until you are using as many instances as it needs.
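The stopping rule in the steps above can be sketched in a few lines of Python. This is only an illustration: the per-instance row counts are assumed to be copied by hand from the job output, and the function names and the min_share threshold are hypothetical, not documented values.

```python
def least_busy_share(rows_per_instance):
    """Return the fraction of all rows handled by the least busy instance."""
    total = sum(rows_per_instance)
    if total == 0:
        return 0.0
    return min(rows_per_instance) / total


def worth_adding_instance(rows_per_instance, min_share=0.10):
    """Heuristic: trying another consumer instance is worthwhile only if
    the least busy instance still did a meaningful share of the work.

    min_share is a hypothetical threshold chosen for illustration.
    """
    return least_busy_share(rows_per_instance) >= min_share


# Hypothetical row counts for two runs with two consumer instances each.
print(worth_adding_instance([500, 480]))  # both instances were busy
print(worth_adding_instance([950, 50]))   # second instance was mostly idle
```

With these sample numbers, the first run suggests trying a third instance, while the second run suggests that an additional instance would probably sit idle.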
Now it is time to look at the producers. You can try increasing the number of instances of each producer separately to see whether it feeds data into the data stream at a higher rate.
1 Increase the number of instances for each producer operator separately. Again, do not change the number of sessions.
2 Rerun the job and compare the job output.
Note: Be careful not to sacrifice overall job performance for the sake of instance balance. Even if rows are read evenly across all instances, the balanced load does not necessarily make the whole job run faster.
3 Depending on your results, you may want to increase the number of instances for the producer or consumer operators.
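The caution about balance versus job performance can be made concrete with a small comparison sketch. The Run record, its field names, and the sample figures below are assumptions made for illustration; the point is only that runs are ranked by elapsed time, with balance reported as secondary information.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Run:
    """One recorded job run (hypothetical record layout)."""
    producer_instances: int
    elapsed_seconds: float
    rows_per_producer: List[int]


def imbalance(run):
    """Ratio of the busiest to the least busy producer (1.0 = perfectly even)."""
    lowest = min(run.rows_per_producer)
    if lowest == 0:
        return float("inf")
    return max(run.rows_per_producer) / lowest


def fastest(runs):
    """Pick the run with the shortest elapsed time, ignoring balance."""
    return min(runs, key=lambda r: r.elapsed_seconds)


# Hypothetical measurements: the evenly balanced run is the slower one.
runs = [
    Run(producer_instances=2, elapsed_seconds=120.0, rows_per_producer=[500, 500]),
    Run(producer_instances=2, elapsed_seconds=95.0, rows_per_producer=[700, 300]),
]
best = fastest(runs)
print(best.elapsed_seconds, round(imbalance(best), 2))
```

In this made-up data set the less balanced run is preferred because it finishes sooner, which is the outcome the note warns you not to overlook.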
Now that you know an acceptable number of instances, you can modify the value of MaxSessions to see if there is an impact.
1 Decrease the value of MaxSessions. It is best to make MaxSessions a multiple of the number of instances for the operator so that the sessions are evenly balanced across the instances.
2 Rerun the job and compare the output.
3 Depending on the results, you may want to use the original MaxSessions, or continue experimenting. You may even want to revisit the number of instances you are using.
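The arithmetic behind choosing MaxSessions as a multiple of the instance count can be sketched as follows. The even split with the remainder going to the first instances is an assumption made for illustration, not a statement of how sessions are actually assigned, and both function names are hypothetical.

```python
def split_sessions(max_sessions, instances):
    """Divide sessions across instances as evenly as possible.

    Assumed model: any remainder is handed out one session at a time
    to the first instances.
    """
    base, extra = divmod(max_sessions, instances)
    return [base + 1 if i < extra else base for i in range(instances)]


def candidate_max_sessions(start, instances):
    """List MaxSessions values to try, in descending order, that are
    multiples of the instance count and no larger than start."""
    top = start - start % instances
    return list(range(top, 0, -instances))


print(split_sessions(16, 4))          # even split
print(split_sessions(16, 3))          # uneven split, not a multiple
print(candidate_max_sessions(16, 4))  # values to try when decreasing
```

When MaxSessions is not a multiple of the instance count, at least one instance ends up with more sessions than the others under this model, which is why the candidate list contains only multiples.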
Strategy 2
Start with MaxSessions equal to the number of available AMPs or number of sessions allocated for the job, using four instances of each operator (producers and consumer).
1 Run the job and examine the output.
2 Make adjustments based on your results.
3 Rerun the job and compare the output.
4 Repeat the process to optimize the number of producer and consumer instances.
Now that you know the best number of instances, you can modify the value of MaxSessions to see if there is an impact.
1 Decrease the value of MaxSessions. It is best to make MaxSessions a multiple of the number of instances for the operator so that the sessions are evenly balanced across the instances.
2 Rerun the job and compare the output.
3 Depending on the results, you may want to use the original MaxSessions, or continue experimenting. You may even want to revisit the number of instances you are using.