Slow Groovy script compilation

Running times for the same workflow can vary from one instance of that workflow to the next. This is most noticeable when there are script tasks between user tasks. Usually, script tasks take longer to execute the first time you run the workflow after deploying it or after restarting the application.

Cause

Each script task must be compiled before it runs. Compilation transforms the code into executable instructions, which takes time. To avoid recompiling the same script every time you run a workflow, the Groovy engine stores compiled scripts in a cache for reuse. As a result, a cached script has a shorter execution time than one that is not in the cache. Scripts have longer execution times when:

  • You run a workflow for the first time after deploying it.
  • You run a workflow for the first time after restarting the application.
  • The Java garbage collector clears the Groovy engine cache. Generally, this happens when memory is low.
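The caching behavior described above can be sketched in plain Java. This is only an illustration, not the platform's actual implementation: the class name CompiledScriptCache, the stand-in "compiler" function, and the use of soft references (one common way to let the garbage collector reclaim cache entries when memory is low) are all assumptions made for the example.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Illustrative compile cache (hypothetical; not the platform's real code). */
final class CompiledScriptCache<C> {
    // Soft references allow the garbage collector to reclaim entries under memory pressure.
    private final Map<String, SoftReference<C>> cache = new HashMap<>();
    // Hypothetical compile step: turns script source into some compiled form.
    private final Function<String, C> compiler;
    private int compileCount = 0;

    CompiledScriptCache(Function<String, C> compiler) {
        this.compiler = compiler;
    }

    /** Returns the cached compiled form, compiling only on a cache miss. */
    synchronized C get(String source) {
        SoftReference<C> ref = cache.get(source);
        C compiled = (ref == null) ? null : ref.get();
        if (compiled == null) {
            // Miss: first run after deployment or restart, or the entry was reclaimed.
            compiled = compiler.apply(source);
            compileCount++;
            cache.put(source, new SoftReference<>(compiled));
        }
        return compiled;
    }

    /** Empties the cache, as an application restart would. */
    synchronized void clear() {
        cache.clear();
    }

    /** Number of times the (slow) compile step actually ran. */
    synchronized int compileCount() {
        return compileCount;
    }

    public static void main(String[] args) {
        // Stand-in compiler: a real engine would produce executable code here.
        CompiledScriptCache<String> scripts =
                new CompiledScriptCache<>(src -> "compiled:" + src);
        String first = scripts.get("println 'hello'");   // miss: compiles
        String second = scripts.get("println 'hello'");  // hit: reuses the cached result
        scripts.clear();                                 // simulate a restart
        String third = scripts.get("println 'hello'");   // miss again: recompiles
        System.out.println("same object on cache hit: " + (first == second));
        System.out.println("compilations: " + scripts.compileCount());
        System.out.println("result after restart: " + third);
    }
}
```

Only the calls that miss the cache pay the compilation cost, which is why the first run after a deployment, a restart, or a cache clear is slower than later runs.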

Factors that increase the execution time

In general, the more code there is, the longer it takes to compile. This is noticeable when the workflow contains a large Groovy script or several consecutive script tasks that are not separated by a user task.

Another common situation is keeping a large amount of script code in the application's groovy-lib directory so that its methods are available in any Groovy script task. In this case, the entire content of the groovy-lib directory is added to each script task before compilation, even if the task does not use those methods, which increases the compilation time.
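To make the effect concrete, here is a minimal Java sketch of how the compiler input grows when shared library code is added to every script task. The class name CompilationInput, the method sourceToCompile, and the simple newline-joined concatenation are assumptions for illustration; the platform may combine the sources differently.

```java
import java.util.List;

/** Illustrative only: models the source text handed to the compiler for one script task. */
final class CompilationInput {

    /** Returns all shared library sources followed by the task's own script. */
    static String sourceToCompile(List<String> sharedLibrarySources, String taskScript) {
        StringBuilder sb = new StringBuilder();
        for (String lib : sharedLibrarySources) {
            sb.append(lib).append('\n');
        }
        return sb.append(taskScript).toString();
    }

    public static void main(String[] args) {
        // Hypothetical groovy-lib content; the task below uses none of it.
        List<String> groovyLib = List.of(
                "def formatName(name) { name.trim() }",
                "def unusedHelper(x) { x * 2 }");
        String task = "println 'done'";

        System.out.println("task script alone: " + task.length() + " characters");
        System.out.println("compiled input:    "
                + sourceToCompile(groovyLib, task).length() + " characters");
    }
}
```

Every script task pays for the full library text, so the overhead is multiplied by the number of script tasks that have to be compiled.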

How to reduce execution time

  • Avoid using the common groovy-lib directory for scripts that are used by only a few tasks. Place those scripts directly in the Groovy script tasks of the workflows that need them.
  • Avoid creating a separate Groovy script task for each API call when a single script task can make all the calls.
  • When possible, have an administrator run the workflows once, before other users, especially in the following cases:
    • After deploying a new version of a workflow. Run the specific workflow.
    • After restarting the platform. Run all workflows.

    Note   Running workflows this way may have an impact on the data.
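The warm-up run suggested above could in principle be scripted. The following Java sketch assumes a hypothetical runner callback that starts one workflow by its identifier; no such API is described here, so both the WorkflowWarmUp class and the way workflows are identified are placeholders. The same caution about data impact applies to an automated run.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Illustrative warm-up helper (hypothetical; not a platform API). */
final class WorkflowWarmUp {

    /**
     * Runs each workflow once so that its scripts get compiled and cached.
     * Returns the identifiers of workflows whose warm-up run threw an exception.
     */
    static List<String> warmUp(List<String> workflowIds, Consumer<String> runner) {
        List<String> failed = new ArrayList<>();
        for (String id : workflowIds) {
            try {
                runner.accept(id);   // hypothetical call that starts the workflow
            } catch (RuntimeException e) {
                failed.add(id);      // keep going; report failures at the end
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // After a restart, warm up every workflow; after a deployment, only the new one.
        List<String> allWorkflows = List.of("approval-process", "issue-management");
        List<String> failed = warmUp(allWorkflows,
                id -> System.out.println("warm-up run of " + id));
        System.out.println("failed warm-ups: " + failed);
    }
}
```

Collecting failures instead of stopping at the first one lets the remaining workflows still be warmed up before other users start them.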

Long-term solution

In our next-generation platform, we plan to change this behavior and compile the code during workflow deployment, which eliminates the noticeable inconsistencies between running times for instances of the same workflow.

©2020 Collibra. All Rights Reserved.
