Kuali Rice Development
  1. Kuali Rice Development
  2. KULRICE-6950

Severe performance degradation on saving complex routing paths

    Details

    • Type: Bug Fix Bug Fix
    • Status: Closed Closed
    • Priority: Critical Critical
    • Resolution: Fixed
    • Affects Version/s: 1.0.3, 2.0
    • Fix Version/s: 2.1.0-m3, 2.1
    • Component/s: Performance
    • Security Level: Public (Public: Anyone can view)
    • Labels:
      None
    • Similar issues:
      KULRICE-14269JPA predicates cause unnecessary SQL joins and performance degradation
      KULRICE-11057Complex Routing in document never gets ingested with no response from server
      KULRICE-8336Poor performance saving routing data
      KULRICE-6705autoGrowNestedPaths breaking complex field validations
      KULRICE-5767edoclite disapprove does not send emails as expected to all persons in route path
      KULRICE-8559When routing a doc with no route path defined, document goes FINAL but PostProcessor is never called
      KULRICE-5766A re-approve of an edoclite form should continue to send emails to next in route path.
      KULRICE-6888DocumentType export does not include route path or route nodes in certain cases
      KULRICE-8962CloneUtils deepCloneObject Performance Improvement
      KULRICE-5768edoclite return to Previous does not send emails the the appropriate person in the route path
    • Rice Module:
      KEW
    • KAI Review Status:
      Not Required
    • KTI Review Status:
      Not Required

      Description

      This is an issue which has been around since at least Rice 1.0.3 and perhaps longer. There is a grade change application at IU which has a particularly complex routing path which contains eight branches. When the routing data for this document type is saved it ends up running several thousand queries to save the route node and branch data which ends up taking several minutes to complete. That is a lot more overhead than expected since there are only eight branches and 36 route nodes in the routing path. To help me investigate this issue I created the two document types which I've attached to this issue and I wrote a simple servlet which executed the following code:

      try {
          String docType = req.getParameter("docType");
          //Testing.LongRoutePath
          //Testing.LongRoutePathWithoutBranches
          WorkflowDocument document = WorkflowDocumentFactory.createDocument("admin", docType);
          // There are no rules associated with this so this doc goes to final as soon as it is routed
          document.route("");
          // Setting the title so that routing data actually gets saved
          document.setTitle("Testing");
          // Sleep to give it a chance to route, may not be needed
          Thread.sleep(10000);
          PerformanceLogger logger = new PerformanceLogger();
          logger.log("starting save routing data");
          document.saveDocumentData();
          logger.log("finished save routing data");
      } catch (InterruptedException e) {
          e.printStackTrace();
      }
      

      The first document type, Testing.LongRoutePath, is a copy of the grade change document type but I replaced all of the nodes with more generic names and changed the split nodes to use SimpleSplitNodes to simplify the document type as much as possible. When I ran the code above for that document type it took 26 minutes to save the routing data for the document. This is significantly worse than the grade change application but I believe this is related to using the SimpleSplitNodes so it hits more branches. I believe that the OJB relationships between RouteNodeInstances and Branches are the source of this problem. Here is what I found from the OJB mappings:

      1. Route node instances auto-update all of their next nodes when saved
      2. Branches auto-update the join route node instance when they are saved

      I believe that point #2 is the problem here. It seems that when OJB saves the branch it is initiating a second save on the join node which is then cascaded to the rest of the nodes which follow it in the routing path. This ends up saving each join node and every node after it twice every time there is a branch. The Testing.LongRoutePath document type ends up having 16 branches so it ends up saving the last two nodes in the routing path 2^16 times.

      I tested my theory above by creating a second document type, Testing.LongRoutePathWithoutBranches, which has the same number of routing nodes (44) and it saves the routing data in a little over 1 second. I think this helps confirm that the problem here is at least not the total number of route nodes in the routing path. After running this test against Rice 1.0.3 I tested the same document types with Rice 2.0 and got similar results.

      I think all that it may take to fix this issue is to change the auto-update on the join node in the Branch OJB mappings to false, but some testing will need to be done to ensure that the routing data is still saved properly after changing that setting.

        Issue Links

          Activity

          Hide
          Eric Westfall added a comment -

          Aaron, if I'm hearing this correctly, it sounds like your patch tracks objects which have already been saved (within the current transaction) and doesn't re-save them, is that correct? If so, that sounds like a good thing to me but I worry about one particular case which I suspect we have in multiple places:

          1. Load object using OJB
          2. Update object state
          3. Save object using OJB
          4. Update object state again
          5. Save object using OJB

          In this case, would the second save essentially become a no-op?

          Show
          Eric Westfall added a comment - Aaron, if I'm hearing this correctly, it sounds like your patch tracks objects which have already been saved (within the current transaction) and doesn't re-save them, is that correct? If so, that sounds like a good thing to me but I worry about one particular case which I suspect we have in multiple places: Load object using OJB Update object state Save object using OJB Update object state again Save object using OJB In this case, would the second save essentially become a no-op?
          Hide
          Eric Westfall added a comment -

          Generally, I also like the idea of manually handling save of node instances, though we would need to track down everywhere that needs to happen. I suspect is's primarily from the workflow engine however so probably a fairly small surface area to cover in order to handle that manually. I'm pretty sure, as you have pointed out, that we are being overly wasteful here.

          Show
          Eric Westfall added a comment - Generally, I also like the idea of manually handling save of node instances, though we would need to track down everywhere that needs to happen. I suspect is's primarily from the workflow engine however so probably a fairly small surface area to cover in order to handle that manually. I'm pretty sure, as you have pointed out, that we are being overly wasteful here.
          Hide
          Aaron Hamid (Inactive) added a comment - - edited

          Hi Eric, the patch only tracks objects saved in a single call, not the entire transaction. I think it's only a slight variant of what is already in place: OJB keeps a single member field which is an array of Java objects which it has already stored in a given call (PersistenceBroker.store(someobject)), which it will recursively populate as it descends relationships. The problem is that the impl removes each item immediately after a nested store, so in effect it only keeps track of a single path in the graph. In some, maybe most, circumstances this is ok, but once you introduce cycles in the graph, you have an explosion of redundant updates. What my change does is simply expand the scope of this tracking to the entire call chain via a thread-local. Once the call chain returns to the entry point, the cache is cleared and removed. So for steps #3 and #5, these calls would be completely independent, they would be perceived as new stores of the object coming from the application, so I think there would be no change in behavior. The only difference the change makes is that the same identical object will not be updated redundantly within a single store call. I can't think of a reason it would be useful to allow this.

          (I'm not necessarily advocating for this change since I realize touching any of these legacy libraries is a potential liability. I would be curious to see how this change performs though perhaps on some other branch.)

          Show
          Aaron Hamid (Inactive) added a comment - - edited Hi Eric, the patch only tracks objects saved in a single call, not the entire transaction. I think it's only a slight variant of what is already in place: OJB keeps a single member field which is an array of Java objects which it has already stored in a given call (PersistenceBroker.store(someobject)), which it will recursively populate as it descends relationships. The problem is that the impl removes each item immediately after a nested store, so in effect it only keeps track of a single path in the graph. In some, maybe most, circumstances this is ok, but once you introduce cycles in the graph, you have an explosion of redundant updates. What my change does is simply expand the scope of this tracking to the entire call chain via a thread-local. Once the call chain returns to the entry point, the cache is cleared and removed. So for steps #3 and #5, these calls would be completely independent, they would be perceived as new stores of the object coming from the application, so I think there would be no change in behavior. The only difference the change makes is that the same identical object will not be updated redundantly within a single store call. I can't think of a reason it would be useful to allow this. (I'm not necessarily advocating for this change since I realize touching any of these legacy libraries is a potential liability. I would be curious to see how this change performs though perhaps on some other branch.)
          Hide
          Aaron Hamid (Inactive) added a comment -

          auto-update of joinNode is off now in trunk. I found an instance where code was relying on the save of the branch to implicitly save a new join node instance, and I fixed it with an explicit pre-emptive save of the node instance. I suspect this was probably a bug to begin with since the implicit save does not take simulation routing into account (simulation routing skips the db save and generates fake primary keys), so presumably simulating a workflow with joins was leaving random node instances around in the db or something.

          Show
          Aaron Hamid (Inactive) added a comment - auto-update of joinNode is off now in trunk. I found an instance where code was relying on the save of the branch to implicitly save a new join node instance, and I fixed it with an explicit pre-emptive save of the node instance. I suspect this was probably a bug to begin with since the implicit save does not take simulation routing into account (simulation routing skips the db save and generates fake primary keys), so presumably simulating a workflow with joins was leaving random node instances around in the db or something.
          Hide
          Aaron Hamid (Inactive) added a comment -

          Conservative OJB-mapping fix is in place.

          Show
          Aaron Hamid (Inactive) added a comment - Conservative OJB-mapping fix is in place.

            People

            • Assignee:
              Aaron Hamid (Inactive)
              Reporter:
              James Bennett
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Time Tracking

                Estimated:
                Original Estimate - 1 week
                1w
                Remaining:
                Remaining Estimate - 1 week
                1w
                Logged:
                Time Spent - Not Specified
                Not Specified

                  Structure Helper Panel