Uploaded image for project: 'camunda BPM'
  1. camunda BPM
  2. CAM-3303

Job/incident exception message is not populated when exception is from a transaction synchronization

      Problem:

      When a job is executed within a JTA transaction and an exception is thrown upon commit (i.e. invocation of JtaTransactionInterceptor#doCommit), the job's retries are decremented properly but the job's exception message (and incident message) is not populated.

      Reason:

      Job retries are implemented via a transaction listener. This listener is handed the exception when it is thrown. However, this is only done by the CommandContextInterceptor, meaning that only exceptions thrown from within a command or on its flush are treated that way. Exceptions thrown from a transacrtion synchronization (transaction listener) cannot be caught this way. A common example is JPA which often implements manged flush of the transaction scoped entity manager as TX-Sync on commit. Exceptions thrown by JPA at this point are not visible in the process engine database. Those kinds of exceptions are then only logged in ExecuteJobsRunnable#run.

      Possible solution:

      Implement job retry decrements in ExecuteJobsRunnable#run instead of the transaction listener. Then, the catched exception can not only be logged but also set on the job. However, we have to make sure that we do not change the existing behavior other than that. Especially an exception thrown within the job command should not be wrapped in any other exception by a command interceptor. Thus, there has to be some way of handing these "out", for example by setting them on the current command.

        This is the controller panel for Smart Panels app

            [CAM-3303] Job/incident exception message is not populated when exception is from a transaction synchronization

            Sebastian Menski added a comment - Hi philipp.ossler , the integrations fail, see for example: https://hq2.camunda.com/jenkins/ci/view/Broken/job/7.6-engine-IT-wildfly-82-h2/218/testReport/ https://hq2.camunda.com/jenkins/ci/view/Broken/job/7.6-EE-engine-IT-jbosseap-70-h2/221/testReport/ https://hq2.camunda.com/jenkins/ci/view/Broken/job/7.6-EE-engine-IT-weblogic-12R1-h2/279/testReport/

            It seems that the integration tests fail on all jboss, wildfly, websphere and weblogic builds.

            Sebastian Menski added a comment - It seems that the integration tests fail on all jboss, wildfly, websphere and weblogic builds.

            Sebastian Menski added a comment - I revert all changes for this ticket: https://github.com/camunda/camunda-bpm-platform/commit/da63f241c7b57fe980249ac5ae3be369648d3782

            It seems that the ConcurrentJobExecutorTest does change something related to the job execution and does not clean up afterwards. So in odd hours the test is executed before the RuntimeServiceAsyncOperationsTest which than hangs.

            "main" prio=10 tid=0x00007f1270006000 nid=0x1c4b in Object.wait() [0x00007f127670f000]
               java.lang.Thread.State: WAITING (on object monitor)
            	at java.lang.Object.wait(Native Method)
            	- waiting on <0x00000000eae3c030> (a org.camunda.bpm.engine.test.concurrency.ConcurrentJobExecutorTest$ControllableExecuteJobThread)
            	at java.lang.Object.wait(Object.java:485)
            	at org.camunda.bpm.engine.test.concurrency.ControllableThread.returnControlToTestThreadAndWait(ControllableThread.java:57)
            	- locked <0x00000000eae3c030> (a org.camunda.bpm.engine.test.concurrency.ConcurrentJobExecutorTest$ControllableExecuteJobThread)
            	at org.camunda.bpm.engine.test.concurrency.ConcurrentJobExecutorTest$ControllableExecuteJobThread$1$1.execute(ConcurrentJobExecutorTest.java:434)
            	at org.camunda.bpm.engine.test.concurrency.ConcurrentJobExecutorTest$ControllableExecuteJobThread$1$1.execute(ConcurrentJobExecutorTest.java:430)
            	at org.camunda.bpm.engine.test.concurrency.ConcurrentJobExecutorTest$ControllableExecuteJobThread$1$1.execute(ConcurrentJobExecutorTest.java:430)
            	at org.camunda.bpm.engine.test.concurrency.ConcurrentJobExecutorTest$ControllableExecuteJobThread$1$1.execute(ConcurrentJobExecutorTest.java:430)
            	at org.camunda.bpm.engine.test.concurrency.ConcurrentJobExecutorTest$ControllableExecuteJobThread$1$1.execute(ConcurrentJobExecutorTest.java:430)
            	at org.camunda.bpm.engine.test.concurrency.ConcurrentJobExecutorTest$ControllableExecuteJobThread$1$1.execute(ConcurrentJobExecutorTest.java:430)
            	at org.camunda.bpm.engine.test.concurrency.ConcurrentJobExecutorTest$ControllableExecuteJobThread$1$1.execute(ConcurrentJobExecutorTest.java:430)
            	at org.camunda.bpm.engine.test.concurrency.ConcurrentJobExecutorTest$ControllableExecuteJobThread$1$1.execute(ConcurrentJobExecutorTest.java:430)
            	at org.camunda.bpm.engine.impl.interceptor.CommandExecutorImpl.execute(CommandExecutorImpl.java:24)
            	at org.camunda.bpm.engine.impl.interceptor.CommandContextInterceptor.execute(CommandContextInterceptor.java:104)
            	at org.camunda.bpm.engine.impl.interceptor.ProcessApplicationContextInterceptor.execute(ProcessApplicationContextInterceptor.java:66)
            	at org.camunda.bpm.engine.impl.interceptor.LogInterceptor.execute(LogInterceptor.java:30)
            	at org.camunda.bpm.engine.test.concurrency.ConcurrentJobExecutorTest$ControllableExecuteJobThread$1.execute(ConcurrentJobExecutorTest.java:427)
            	at org.camunda.bpm.engine.test.concurrency.ConcurrentJobExecutorTest$ControllableExecuteJobThread$1.execute(ConcurrentJobExecutorTest.java:427)
            	at org.camunda.bpm.engine.test.concurrency.ConcurrentJobExecutorTest$ControllableExecuteJobThread$1.execute(ConcurrentJobExecutorTest.java:427)
            	at org.camunda.bpm.engine.test.concurrency.ConcurrentJobExecutorTest$ControllableExecuteJobThread$1.execute(ConcurrentJobExecutorTest.java:427)
            	at org.camunda.bpm.engine.test.concurrency.ConcurrentJobExecutorTest$ControllableExecuteJobThread$1.execute(ConcurrentJobExecutorTest.java:427)
            	at org.camunda.bpm.engine.test.concurrency.ConcurrentJobExecutorTest$ControllableExecuteJobThread$1.execute(ConcurrentJobExecutorTest.java:427)
            	at org.camunda.bpm.engine.test.concurrency.ConcurrentJobExecutorTest$ControllableExecuteJobThread$1.execute(ConcurrentJobExecutorTest.java:427)
            	at org.camunda.bpm.engine.test.concurrency.ConcurrentJobExecutorTest$ControllableExecuteJobThread$1.execute(ConcurrentJobExecutorTest.java:427)
            	at org.camunda.bpm.engine.impl.cmd.ExecuteJobsCmd.executeJob(ExecuteJobsCmd.java:120)
            	at org.camunda.bpm.engine.impl.cmd.ExecuteJobsCmd.execute(ExecuteJobsCmd.java:93)
            	at org.camunda.bpm.engine.impl.interceptor.CommandExecutorImpl.execute(CommandExecutorImpl.java:24)
            	at org.camunda.bpm.engine.impl.interceptor.CommandContextInterceptor.execute(CommandContextInterceptor.java:104)
            	at org.camunda.bpm.engine.impl.interceptor.ProcessApplicationContextInterceptor.execute(ProcessApplicationContextInterceptor.java:66)
            	at org.camunda.bpm.engine.impl.interceptor.LogInterceptor.execute(LogInterceptor.java:30)
            	at org.camunda.bpm.engine.impl.ManagementServiceImpl.executeJob(ManagementServiceImpl.java:109)
            	at org.camunda.bpm.engine.test.api.runtime.RuntimeServiceAsyncOperationsTest.testDeleteProcessInstancesAsyncWithQuery(RuntimeServiceAsyncOperationsTest.java:193)
            	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
            	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
            	at java.lang.reflect.Method.invoke(Method.java:597)
            	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
            	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
            	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
            	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
            	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
            	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
            	at org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:168)
            	at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
            	at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
            	at org.camunda.bpm.engine.test.ProcessEngineRule$1.evaluate(ProcessEngineRule.java:165)
            	at org.junit.rules.RunRules.evaluate(RunRules.java:20)
            	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
            	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
            	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
            	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
            	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
            	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
            	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
            	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
            	at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
            	at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
            	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
            	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
            	at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
            	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
            	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
            
               Locked ownable synchronizers:
            	- None
            

            Sebastian Menski added a comment - It seems that the ConcurrentJobExecutorTest does change something related to the job execution and does not clean up afterwards. So in odd hours the test is executed before the RuntimeServiceAsyncOperationsTest which than hangs. "main" prio=10 tid=0x00007f1270006000 nid=0x1c4b in Object.wait() [0x00007f127670f000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x00000000eae3c030> (a org.camunda.bpm.engine.test.concurrency.ConcurrentJobExecutorTest$ControllableExecuteJobThread) at java.lang.Object.wait(Object.java:485) at org.camunda.bpm.engine.test.concurrency.ControllableThread.returnControlToTestThreadAndWait(ControllableThread.java:57) - locked <0x00000000eae3c030> (a org.camunda.bpm.engine.test.concurrency.ConcurrentJobExecutorTest$ControllableExecuteJobThread) at org.camunda.bpm.engine.test.concurrency.ConcurrentJobExecutorTest$ControllableExecuteJobThread$1$1.execute(ConcurrentJobExecutorTest.java:434) at org.camunda.bpm.engine.test.concurrency.ConcurrentJobExecutorTest$ControllableExecuteJobThread$1$1.execute(ConcurrentJobExecutorTest.java:430) at org.camunda.bpm.engine.test.concurrency.ConcurrentJobExecutorTest$ControllableExecuteJobThread$1$1.execute(ConcurrentJobExecutorTest.java:430) at org.camunda.bpm.engine.test.concurrency.ConcurrentJobExecutorTest$ControllableExecuteJobThread$1$1.execute(ConcurrentJobExecutorTest.java:430) at org.camunda.bpm.engine.test.concurrency.ConcurrentJobExecutorTest$ControllableExecuteJobThread$1$1.execute(ConcurrentJobExecutorTest.java:430) at org.camunda.bpm.engine.test.concurrency.ConcurrentJobExecutorTest$ControllableExecuteJobThread$1$1.execute(ConcurrentJobExecutorTest.java:430) at org.camunda.bpm.engine.test.concurrency.ConcurrentJobExecutorTest$ControllableExecuteJobThread$1$1.execute(ConcurrentJobExecutorTest.java:430) at org.camunda.bpm.engine.test.concurrency.ConcurrentJobExecutorTest$ControllableExecuteJobThread$1$1.execute(ConcurrentJobExecutorTest.java:430) at org.camunda.bpm.engine.impl.interceptor.CommandExecutorImpl.execute(CommandExecutorImpl.java:24) at org.camunda.bpm.engine.impl.interceptor.CommandContextInterceptor.execute(CommandContextInterceptor.java:104) at org.camunda.bpm.engine.impl.interceptor.ProcessApplicationContextInterceptor.execute(ProcessApplicationContextInterceptor.java:66) at org.camunda.bpm.engine.impl.interceptor.LogInterceptor.execute(LogInterceptor.java:30) at org.camunda.bpm.engine.test.concurrency.ConcurrentJobExecutorTest$ControllableExecuteJobThread$1.execute(ConcurrentJobExecutorTest.java:427) at org.camunda.bpm.engine.test.concurrency.ConcurrentJobExecutorTest$ControllableExecuteJobThread$1.execute(ConcurrentJobExecutorTest.java:427) at org.camunda.bpm.engine.test.concurrency.ConcurrentJobExecutorTest$ControllableExecuteJobThread$1.execute(ConcurrentJobExecutorTest.java:427) at org.camunda.bpm.engine.test.concurrency.ConcurrentJobExecutorTest$ControllableExecuteJobThread$1.execute(ConcurrentJobExecutorTest.java:427) at org.camunda.bpm.engine.test.concurrency.ConcurrentJobExecutorTest$ControllableExecuteJobThread$1.execute(ConcurrentJobExecutorTest.java:427) at org.camunda.bpm.engine.test.concurrency.ConcurrentJobExecutorTest$ControllableExecuteJobThread$1.execute(ConcurrentJobExecutorTest.java:427) at org.camunda.bpm.engine.test.concurrency.ConcurrentJobExecutorTest$ControllableExecuteJobThread$1.execute(ConcurrentJobExecutorTest.java:427) at org.camunda.bpm.engine.test.concurrency.ConcurrentJobExecutorTest$ControllableExecuteJobThread$1.execute(ConcurrentJobExecutorTest.java:427) at org.camunda.bpm.engine.impl.cmd.ExecuteJobsCmd.executeJob(ExecuteJobsCmd.java:120) at org.camunda.bpm.engine.impl.cmd.ExecuteJobsCmd.execute(ExecuteJobsCmd.java:93) at org.camunda.bpm.engine.impl.interceptor.CommandExecutorImpl.execute(CommandExecutorImpl.java:24) at org.camunda.bpm.engine.impl.interceptor.CommandContextInterceptor.execute(CommandContextInterceptor.java:104) at org.camunda.bpm.engine.impl.interceptor.ProcessApplicationContextInterceptor.execute(ProcessApplicationContextInterceptor.java:66) at org.camunda.bpm.engine.impl.interceptor.LogInterceptor.execute(LogInterceptor.java:30) at org.camunda.bpm.engine.impl.ManagementServiceImpl.executeJob(ManagementServiceImpl.java:109) at org.camunda.bpm.engine.test.api.runtime.RuntimeServiceAsyncOperationsTest.testDeleteProcessInstancesAsyncWithQuery(RuntimeServiceAsyncOperationsTest.java:193) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:168) at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) at org.camunda.bpm.engine.test.ProcessEngineRule$1.evaluate(ProcessEngineRule.java:165) at org.junit.rules.RunRules.evaluate(RunRules.java:20) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) at org.junit.runners.ParentRunner.run(ParentRunner.java:309) at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264) at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153) at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124) at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200) at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153) at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103) Locked ownable synchronizers: - None

            Sebastian Menski added a comment - Hi philipp.ossler , does this fix requires more connections to the database? As we see issues in the engine-it suite on oracle and sqlserver where the engine fails to acquire a new connection: https://hq2.camunda.com/jenkins/ci/job/7.6-engine-IT-XA-wildfly-100-sqlserver-2014/44/testReport/org.camunda.bpm.integrationtest.functional.classloading.variables/SerializableVariableTest/testResolveClass/ https://hq2.camunda.com/jenkins/ci/job/7.6-EE-engine-IT-jbosseap-62-oracle-10/lastCompletedBuild/testReport/junit/org.camunda.bpm.integrationtest.functional.ejb.remote/RemoteSingletonBeanInvocationTest/testMultipleInvocations/ https://hq2.camunda.com/jenkins/ci/job/7.6-EE-engine-IT-jbosseap-63-oracle-10/44/testReport/junit/org.camunda.bpm.integrationtest.functional.ejb.remote/RemoteSFSBInvocationTest/testInvokeBean/ https://hq2.camunda.com/jenkins/ci/job/7.6-engine-IT-XA-jboss-72-oracle-10/43/testReport/junit/org.camunda.bpm.integrationtest.functional.ejb.remote/RemoteSFSBInvocationTest/testMultipleInvocations/ /cc lipphardt

            Also we got an out of memory issue. Could this be related to this issue?

            Sebastian Menski added a comment - Also we got an out of memory issue. Could this be related to this issue? https://hq2.camunda.com/jenkins/ci/view/Broken/job/7.6-platform-jdks/95/container=h2,jdk=ibm-jdk-6- latest/testReport/junit/org.camunda.bpm.engine.spring.test.transaction/SpringTransactionIntegrationTest/testBasicActivitiSpringIntegration/

            I created a job which will hopefully generate a heap dump on out of memory if I configured everything correctly. Maybe we can see an issue there:
            https://hq2.camunda.com/jenkins/ci/job/TEST-ibm-jdk-heapdump/

            Sebastian Menski added a comment - I created a job which will hopefully generate a heap dump on out of memory if I configured everything correctly. Maybe we can see an issue there: https://hq2.camunda.com/jenkins/ci/job/TEST-ibm-jdk-heapdump/

            Hi philipp.ossler, meyer, lipphardt,

            I debugged at least the memory issue and found out that the engine-spring test suites creates a lot of engines without killing them. Philipps commit added more test methods to a test class which for whatever reason also leads to a lot more engines. I found 9 engines at the end of the test. I think this is a problem of the engine-spring test suite and the way the engines are created and not probably deleted.

            I create CAM-6727 for that issue. Until this the jdk ibm test suite will fail on odd hours.

            @meyer could you decided on the importance of CAM-6727?

            Cheers,
            Sebastian

            PS: We think the connection issue is also somehow related to this but I didn't had time to debug it. Also we maybe wait for fixes for this issue before we backport these changes.

            Sebastian Menski added a comment - Hi philipp.ossler , meyer , lipphardt , I debugged at least the memory issue and found out that the engine-spring test suites creates a lot of engines without killing them. Philipps commit added more test methods to a test class which for whatever reason also leads to a lot more engines. I found 9 engines at the end of the test. I think this is a problem of the engine-spring test suite and the way the engines are created and not probably deleted. I create CAM-6727 for that issue. Until this the jdk ibm test suite will fail on odd hours. @ meyer could you decided on the importance of CAM-6727 ? Cheers, Sebastian PS: We think the connection issue is also somehow related to this but I didn't had time to debug it. Also we maybe wait for fixes for this issue before we backport these changes.

            Just found a problem here: on the following servers, the MDB* is not invoked anymore: WLS, WAS, GF

            • org.camunda.bpm.container.impl.ejb.JobExecutionHandlerMDB

            Daniel Meyer added a comment - Just found a problem here: on the following servers, the MDB* is not invoked anymore: WLS, WAS, GF org.camunda.bpm.container.impl.ejb.JobExecutionHandlerMDB

              meyer Daniel Meyer
              thorben.lindhauer Thorben Lindhauer
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

                Created:
                Updated:
                Resolved: