In TRIRIGA, the point of running multiple workflow servers is to make workflow processing fair to all users, not necessarily to increase workflow throughput. Adding more workflow agents to an environment can slow down processing and cause undesirable results if workflows are not written with multi-threading in mind.
It is best practice to assign secondary workflow agents to specific power users who tend to run more workflows than a normal user. If the secondary workflow agents are left wide open, a set of workflow instances is picked up in parallel, and some can be processed out of order. It is important to know that increasing the number of threads on a single process server results in higher throughput than splitting the threads across two servers. Typically the performance bottleneck in an environment is the database server, not the process servers.
If you already have a system that is deployed with multiple workflow agents, consider either:
- Stopping the secondary agents, and increasing the threads on the primary workflow agent server to be the sum of the threads across the other servers, or
- Restricting the secondary agents so that they are exclusive to the set of power users.
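As a rough illustration of the two options above, the sketch below sums the thread counts across servers for the consolidation case, and routes workflows by user for the restriction case. The server names, thread counts, user names, and both helper functions are hypothetical and do not correspond to actual TRIRIGA settings or APIs.

```python
def consolidated_thread_count(threads_per_server):
    """Option 1: thread count for the primary agent once the secondary
    agents are stopped (the sum of the threads across all servers)."""
    return sum(threads_per_server.values())


def pick_agent(user, power_users):
    """Option 2: send power users to the secondary agent and everyone
    else to the primary agent."""
    return "secondary" if user in power_users else "primary"


# Illustrative numbers only, not recommended values.
threads = {"primary": 10, "secondary": 5}
print(consolidated_thread_count(threads))

# Hypothetical user names.
power_users = {"integration_user"}
print(pick_agent("integration_user", power_users))
print(pick_agent("jsmith", power_users))
```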
Consider a set of parent/child records being loaded, each with workflows to process. If Workflow Server 1 picks up the workflow for a parent record and Workflow Server 2 picks up the workflow for its child, the child workflow could complete before the parent workflow does. The child’s path could then be invalid, because the parent didn’t exist at the time the child was processed. A single workflow agent ensures that the records are processed in order.
If you load the records one level at a time, then it may be possible to run multiple agents without issue. But recall that a single workflow server instance running many threads can still easily overwhelm a database server. So multiple agents shouldn’t be used for performance, but rather for isolating power users so that general-purpose users get a chance to have their workflows run.
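The ordering problem can be reproduced with a small, deterministic simulation. This is a simplified model, not TRIRIGA code: the `Workflow` class, the round-robin assignment, the durations, and the rule that a record exists only once its workflow has completed are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Workflow:
    record: str             # record being created, e.g. "Building/Floor"
    parent: Optional[str]   # parent record, or None for a top-level record
    duration: int           # processing time in arbitrary units


def run(workflows, agent_count):
    """Assign queued workflows round-robin to agents, each of which
    processes its share in FIFO order. Returns a dict mapping each record
    to whether its parent existed when its workflow completed."""
    agent_free_at = [0] * agent_count
    schedule = []
    for i, wf in enumerate(workflows):
        agent = i % agent_count
        finish = agent_free_at[agent] + wf.duration
        agent_free_at[agent] = finish
        schedule.append((finish, i, wf))

    completed = set()
    path_valid = {}
    # Replay completions in time order (queue position breaks ties).
    for _, _, wf in sorted(schedule, key=lambda item: (item[0], item[1])):
        path_valid[wf.record] = wf.parent is None or wf.parent in completed
        completed.add(wf.record)
    return path_valid


queue = [
    Workflow("Building", None, duration=3),
    Workflow("Building/Floor", "Building", duration=1),
]

print(run(queue, agent_count=1))  # child waits behind the parent
print(run(queue, agent_count=2))  # child finishes first on the second agent
```

With one agent, the child workflow cannot start until the parent workflow finishes, so its path resolves. With two agents, the shorter child workflow completes while the parent is still in progress, and the child's path is flagged as invalid.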
[Admin: This post is related to the 05.06.15 post about corrupted object paths.]