Details

    • Type: Bug Fix Bug Fix
    • Status: Closed Closed
    • Priority: Critical Critical
    • Resolution: Fixed
    • Affects Version/s: 2.4.1
    • Fix Version/s: 2.5
    • Component/s: Database
    • Security Level: Public (Public: Anyone can view)
    • Labels:
      None
    • Similar issues:
      KULRICE-13560Install XA Connection Pool fix on env17 and re-run concurrent user tests
      KULRICE-12941Install XA Connection Pool fix on env17 and re-run concurrent user tests
      KULRICE-12962Make Bitronix the default connection pooling & JTA provider
      KULRICE-4895Migrate from XAPool & JOTM to Bitronix connection pool & Transaction Manager
      KULRICE-6612Raise the default connection pool size in common-config-defaults.xml
      KULRICE-4948Enhance OJB's PlatformOracle9iImpl to support connection pools that use the JDBC 4 unwrap() mechanism
      KULRICE-4672create piece to connect rules into actions
      KULRICE-5959Create CI integration test job configured to use Bitronix for JTA transaction management and for connection pooling
      KULRICE-3077XADataSource Pool Issues With MySQL via NATed Networks
      KULRICE-9045ViewServiceImpl - buildView - DB Connections
    • Application Requirement:
      KS
    • Sprint:
      Core 2.5.0-m5 Sprint 1
    • KAI Review Status:
      Not Required
    • KTI Review Status:
      Pending Review
    • Code Review Status:
      Pending Review
    • Contributing Institution:
      Univ of Maryland
    • Include in Release Notes?:
      Yes

      Description

      So, the main problem i was trying to solve:
      Regardless of what we set the "wait for connection" property, the pool would NEVER actually wait for a connection causing connection errors. This has affected all institutions for many years.

      So I finally looked at the source code for XA.
      https://svn.kuali.org/repos/foundation/legacy/xapool-1.5.0/tags/xapool-1.5.0-patch4/src/

      and noticed an issue in the GenericPool.java

      int currentWait = 0;
      
                      Object obj = getFromPool(user, password);
      		while ((obj == null) && (currentWait < getDeadLockMaxWait())) {
       			log.info("GenericPool:checkOut waiting for an object :"+this.poolHelper.toString());
       			try {
       				synchronized (this) {
      					wait(getDeadLockRetryWait());
      				}
      			} catch (InterruptedException excp) {
      				log.error(
      					"GenericPool:checkOut ERROR Failed while waiting for an object: "
      						+ excp);
      			}
      			currentWait += getDeadLockRetryWait();
      			obj = getFromPool(user, password);
      		}
      

      the short of it is, the "wait" and "notifyAll" are not being used correctly together.
      the coder thought that by calling wait(getDeadLockRetryWait()); it would actually wait for that period of time.
      then it would increment the currentWait time:
      currentWait += getDeadLockRetryWait();

      and try to get a connection again.
      while ((obj == null) && (currentWait < getDeadLockMaxWait())) {

      What was REALLY happening: at almost Every millisecond or so notifyAll() gets called (when system is under load). This breaks out of the wait(getDeadLockRetryWait()), and increments the count. So the code actually functions like there is no wait() at all. ie:

                      int currentWait = 0;
                      Object obj = getFromPool(user, password);
      		while ((obj == null) && (currentWait < getDeadLockMaxWait())) { 		
      			currentWait += getDeadLockRetryWait();
      			obj = getFromPool(user, password);
      		}
      

      so instead of waiting for... 30 seconds, it's waiting for 1-5 milliseconds per loop through the while statement. This would quickly cause currentWait < getDeadLockMaxWait() to trigger causing the system to throw an exception.

      So, the fix for the wait is simple

      long currentWait = 0;
      
              Object obj = getFromPool(user, password);
              long startCheckTime = System.currentTimeMillis();
              while ((obj == null) && (currentWait < getDeadLockMaxWait())) {
                  try {
                      synchronized (this) {
                          wait(getDeadLockRetryWait());
                      }
                  } catch (InterruptedException excp) {
                      LOGGER.error(
                              "GenericPool:checkOut ERROR Failed while waiting for an object: " + excp);
                  }
                  currentWait = System.currentTimeMillis() - startCheckTime;
                  obj = getFromPool(user, password);
              }
      

      instead of assuming that we have waited for the full time, do a real calculation
      ie.
      startTime = System.currentTimeMillis();
      while ( same )

      { currentWait = System.currentTimeMillis() - startCheckTime; }

      putting that SMALL change in allows for the pool to wait (on average 10 ms). Just that small amount has DRASTICALLY changed my performance numbers.

      My tests used to crap out at 100 users over 10 sec, now i can reach 300 users over 10 sec. (with only 10 connections). After increasing the number of oracle processes available (from 50 -> 500) I have not been able to run out of connections.

      With this fix in place you might start running out of oracle processes. Something more to investigate.

      1. KSENROLL-13351__fix_code_that_waits_for_connection_.patch
        1 kB
        Garey Taylor
      2. KULRICE-12918_2.patch
        26 kB
        Martin Taylor

        Issue Links

          Activity

          Hide
          Garey Taylor added a comment -

          Code will need to be committed to trunk:
          https://svn.kuali.org/repos/foundation/legacy/xapool-1.5.0/trunk/

          When running build, make sure you generate sources and add them to the maven repo. This would have been solved YEARS ago if source had been attached.

          Show
          Garey Taylor added a comment - Code will need to be committed to trunk: https://svn.kuali.org/repos/foundation/legacy/xapool-1.5.0/trunk/ When running build, make sure you generate sources and add them to the maven repo. This would have been solved YEARS ago if source had been attached.
          Hide
          Martin Taylor (Inactive) added a comment -

          Had issues running on versions > 1.5. Added the following to the javac task in build.xml to get it to build properly:

                source="5"
                target="5"
                executable="/path-to/java/1.5.0/sys/bin/javac"
                fork="true"
                taskname="javac1.5"
          
          ant -v clean compile packages
          
          Show
          Martin Taylor (Inactive) added a comment - Had issues running on versions > 1.5. Added the following to the javac task in build.xml to get it to build properly: source= "5" target= "5" executable= "/path-to/java/1.5.0/sys/bin/javac" fork= " true " taskname= "javac1.5" ant -v clean compile packages
          Hide
          Martin Taylor (Inactive) added a comment -

          compiling this code may be tricky. Can do so on my local unix environment. Spoke to Jeff C. and we may have issues with running 1.5 on our ci environments. Solution would be to add all unimplemented methods causing the compile to break and leaving them returning MethodNotFoundException. That would give the same functionality as a 1.5 compiled jar but we could build on our ci environment.

          Show
          Martin Taylor (Inactive) added a comment - compiling this code may be tricky. Can do so on my local unix environment. Spoke to Jeff C. and we may have issues with running 1.5 on our ci environments. Solution would be to add all unimplemented methods causing the compile to break and leaving them returning MethodNotFoundException. That would give the same functionality as a 1.5 compiled jar but we could build on our ci environment.
          Hide
          Martin Taylor (Inactive) added a comment - - edited

          Included a patch to make the project buildable from 1.6+. Adds all missing methods related to newer 1.6 sql interfaces but sets to UnsupportedOperationException since it would have been MethodNotFoundException otherwise. Building on testci (https://testci.kuali.org/job/rice-2.5-sandbox-xapool-build/). Reviewing failed tests on jotmxapool directory.

          Show
          Martin Taylor (Inactive) added a comment - - edited Included a patch to make the project buildable from 1.6+. Adds all missing methods related to newer 1.6 sql interfaces but sets to UnsupportedOperationException since it would have been MethodNotFoundException otherwise. Building on testci ( https://testci.kuali.org/job/rice-2.5-sandbox-xapool-build/ ). Reviewing failed tests on jotmxapool directory.
          Hide
          Martin Taylor (Inactive) added a comment -

          I'm committed the changes to the GenericPool along with the other changes need to run in 1.6 libraries. I'm splitting off additional changes related to the cleanup/upgrade into another ticket.

          Show
          Martin Taylor (Inactive) added a comment - I'm committed the changes to the GenericPool along with the other changes need to run in 1.6 libraries. I'm splitting off additional changes related to the cleanup/upgrade into another ticket.
          Hide
          Martin Taylor (Inactive) added a comment -

          tagged https://svn.kuali.org/repos/foundation/legacy/xapool-1.5.0/tags/xapool-1.5.0-patch5/ and new jars are available on external repository along with the sources jar.

          Show
          Martin Taylor (Inactive) added a comment - tagged https://svn.kuali.org/repos/foundation/legacy/xapool-1.5.0/tags/xapool-1.5.0-patch5/ and new jars are available on external repository along with the sources jar.
          Hide
          Garey Taylor added a comment -
          Show
          Garey Taylor added a comment - there wan an error in patch 5, please use patch 6 https://svn.kuali.org/repos/foundation/legacy/xapool-1.5.0/tags/xapool-1.5.0-patch6/

            People

            • Assignee:
              Martin Taylor (Inactive)
              Reporter:
              Garey Taylor
            • Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Time Tracking

                Estimated:
                Original Estimate - 2 days
                2d
                Remaining:
                Remaining Estimate - 0 minutes
                0m
                Logged:
                Time Spent - 4 hours Time Not Required
                4h

                  Agile

                    Structure Helper Panel