  1. Kuali Rice Development
  2. KULRICE-9105

Determine a more efficient and less wasteful approach to XML object serialization for the maintenance framework

    Details

    • Type: Task
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 2.6
    • Component/s: Analysis, JPA, Roadmap
    • Security Level: Public (Public: Anyone can view)
    • Labels:
    • Rice Module:
      KRAD
    • KRAD Feature Area:
      Persistence Framework, Utility
    • Sprint:
      2.4.0-rc1 Sprint 7, 2.4.0-rc1 Sprint 8
    • KAI Review Status:
      Not Required
    • KTI Review Status:
      Not Required
    • Include in Release Notes?:
      Yes

      Description

      Currently, we have PersistenceService.resolveProxy, which is used only by the XML serialization process in these classes:

      • org.kuali.rice.krad.service.impl:
        • SerializerServiceBase
        • XmlObjectSerializerServiceImpl

      The current implementation uses some OJB-specific code to unwrap proxies and depends heavily on the underlying implementation of the proxy.

      Beyond the specific proxy issue, we need to come up with a better strategy for handling such serialization for XML both for the maintenance framework and for workflow routing. It currently uses the XStream library automatically.

      It's probably worth noting that the current implementation of these is very problematic for client applications for a few reasons:

      1. Generally speaking, way too much XML ends up getting serialized and stored in the database. This is particularly true for workflow XML.
      2. For XML which ends up in the maintenance document table, it becomes a problem during upgrades if any code structure ever changes, because the database then holds XML which can't be reconstituted into an object. XStream is particularly unforgiving in this respect, and the framework it provides for customizing XML marshalling/unmarshalling is not awesome.
      3. It's harder than it needs to be to plug custom serialization into these different processes.

      As apps begin to transition from KNS over to KRAD, now is probably the best time to fix these problems before they propagate to a whole new generation!

      Some ideas here in relation to the new KRAD data layer include:

      1. Delegate serialization down to the DataObjectService so that it can be handled in provider-specific ways (which would allow for proxies to be stripped appropriately)
      2. Delegate proxy resolution down to the DataObjectService so that it can be handled in a provider-specific way
      3. Provide some kind of utility class where custom proxy resolver can be registered statically. Then each provider implementation can register any proxy resolvers that they need with this higher level component and it would be used by the framework when needed to strip proxies from data objects.
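To make idea 3 concrete, here is a minimal sketch of what a statically registered proxy resolver could look like. Everything in it is an assumption for discussion: the names `ProxyResolvers`, `ProxyResolver`, `isProxy`, `resolve`, `register` and `unwrap` are hypothetical and do not exist in Rice.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical sketch: a static registry of provider-specific proxy resolvers. */
final class ProxyResolvers {

    /** Hypothetical contract that each persistence provider would implement. */
    public interface ProxyResolver {
        /** Returns true if this resolver recognizes the object as one of its proxies. */
        boolean isProxy(Object candidate);

        /** Returns the real object behind the given proxy. */
        Object resolve(Object proxy);
    }

    // Thread-safe list, since providers may register during concurrent startup.
    private static final List<ProxyResolver> RESOLVERS = new CopyOnWriteArrayList<>();

    private ProxyResolvers() {
    }

    /** Called by a provider implementation to contribute its resolver. */
    public static void register(ProxyResolver resolver) {
        RESOLVERS.add(resolver);
    }

    /** Called by the framework: the first resolver that claims the object unwraps it. */
    public static Object unwrap(Object candidate) {
        if (candidate == null) {
            return null;
        }
        for (ProxyResolver resolver : RESOLVERS) {
            if (resolver.isProxy(candidate)) {
                return resolver.resolve(candidate);
            }
        }
        // No resolver claimed it, so treat it as a plain object.
        return candidate;
    }
}
```

With something like this, the serialization code would call `ProxyResolvers.unwrap(value)` before handing a value to the serializer and would not need to know which provider produced it. The cost is global static state, which may be part of why this option feels unattractive.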

      None of these solutions seems particularly attractive to me, though I can't put my finger on why. I'm going to continue to brainstorm ideas for how we can/should address this.

        Attachments

          Issue Links

            Activity

            ewestfal Eric Westfall added a comment -

            Jonathan, I'm wondering if you might have already done this one?

            jkeller Jonathan Keller added a comment -

            No - this is still using the same old XStream library.

            In fact, there's another issue which I never got to which was supposed to help here. It was mainly for the maintenance documents, so that it could use the metadata to not serialize non-updatable child objects. (I think that JIRA is also assigned to me.)

            jkeller Jonathan Keller added a comment -

            Attaching the test files used to show a method for not blowing up when deserializing an object when the target field (in the XML) does not exist on the current version of the class.

            jkeller Jonathan Keller added a comment -

            One piece I planned on doing was to use the Data object metadata on maintenance documents. There is an "isSavedWithParent" property. And, if the object is not saved with the parent, there is no reason to include it in the maintenance document XML.

            So...I'm starting to think we want to completely separate serialization for the maintenance documents from other types. (Probably still have a common parent class to hold the common XStream configuration, but give them independent interfaces so that they could be implemented separately if needs require.)
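A rough sketch of the "isSavedWithParent" filtering idea, using plain maps in place of the real data object metadata. The class name `MaintenanceXmlFilter`, the method `childrenToSerialize` and the map-based representation are all hypothetical; only the "saved with parent" flag comes from the comment above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: keep only child objects that are saved with the parent. */
final class MaintenanceXmlFilter {

    private MaintenanceXmlFilter() {
    }

    /**
     * @param children        child property name mapped to the child object
     * @param savedWithParent child property name mapped to its "saved with parent" flag
     *                        (a stand-in for the real metadata lookup)
     * @return the children that belong in the maintenance document XML, in original order
     */
    static Map<String, Object> childrenToSerialize(Map<String, Object> children,
                                                   Map<String, Boolean> savedWithParent) {
        Map<String, Object> kept = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : children.entrySet()) {
            // A missing flag is treated as "not saved with parent" in this sketch.
            if (Boolean.TRUE.equals(savedWithParent.get(entry.getKey()))) {
                kept.put(entry.getKey(), entry.getValue());
            }
        }
        return kept;
    }
}
```

Whether a missing flag should mean "skip" or "include" is a design choice the real implementation would need to make; skipping keeps the XML small but risks dropping data for unmapped relationships.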

            jkeller Jonathan Keller added a comment -

            Notes I took for serialization:

            Reference objects don't need to be checked for proxy classes: JPA will not respect lazy loading if there is no weaving, and if the classes are woven, pulling the references instantiates them without any proxy fields.

            Looks like the org.eclipse.persistence.indirection.IndirectList class is also left behind as the list implementation class after instantiation, so it needs to be converted to an ArrayList upon serialization. (There is a parent, IndirectCollection, which could be checked to also catch Sets and Maps.)
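A minimal sketch of the collection conversion described above, written without an EclipseLink dependency so it stands alone. As a simplifying assumption it treats any List, Set or Map that is not already the plain JDK class as needing a copy, where a real version would more likely test for IndirectCollection. The name `PlainCollections` is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: swap provider-specific collections for plain JDK ones. */
final class PlainCollections {

    private PlainCollections() {
    }

    /**
     * Returns a plain ArrayList, HashSet or HashMap copy when the value is some
     * other List, Set or Map implementation; any other value is returned unchanged.
     * Assumption: a real implementation would check for IndirectCollection instead
     * of comparing against the JDK classes, and would preserve ordering for sorted
     * or linked sets and maps.
     */
    static Object toPlain(Object value) {
        if (value instanceof List && value.getClass() != ArrayList.class) {
            return new ArrayList<>((List<?>) value);
        }
        if (value instanceof Set && value.getClass() != HashSet.class) {
            return new HashSet<>((Set<?>) value);
        }
        if (value instanceof Map && value.getClass() != HashMap.class) {
            return new HashMap<>((Map<?, ?>) value);
        }
        return value;
    }
}
```

Note that copying iterates the source collection, which would force an unloaded lazy relationship to load, so this would presumably be applied only to relationships that are already loaded.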

            The statement below can be used within the EclipseLink JPA persistence provider to check if a proxy is loaded without actually causing it to load.

            em.getEntityManagerFactory().getPersistenceUnitUtil().isLoaded( results.get(0), "names" )

            jkeller Jonathan Keller added a comment - - edited

            This is a link to a small (Eclipse) project which I was using to play with EclipseLink and proxies to test how serialization would work.

            https://drive.google.com/a/kuali.org/file/d/0B1dW7BM9X5-rMFVrQ0R6b3A2OG8/edit?usp=sharing

            There are a couple parts to it.

            XStreamTest.groovy was an attempt to make the system not blow up when de-serializing a class where a field no longer existed. There is what I consider a bug: if the field does not exist, it eventually attempts to resolve the property name as a class name and blows up. That code was too deep in EL to fix, so I wrote a custom Mapper which we would need to inject as the top-level mapper (see the buildMapper() function, stolen from EL, and my "NotSoExplosiveDefaultMapper").

            That test takes a PersonImpl - converts it to XML - and then alters the XML so it looks like it was a PersonImplMissingField and then deserializes it.

            jkeller Jonathan Keller added a comment -

            The other part is the JpaTest, which you can't run from Eclipse and get valid results.

            I could not get load-time weaving to work with the Groovy setup - so I had to create the jpa_test.sh shell script which would use the EL static compiler and save the altered class files to a different directory. (Had to be different from the one Eclipse was using to prevent overwrites.)

            Probably a lot of that script is not needed, but at one point I had to get in and inject certain classes into the classpath running Groovy itself due to classloader issues. It's more-or-less extracted from the main groovy launcher script.

            Anyway - it's mainly there to be able to test the various types of relationships we could define in JPA and see how they presented themselves.

            Ultimately, I was going to add XStream to the mix and test changes to the ReflectionProvider and ReflectionConverter classes to make the resulting XML come out as if it were a POJO and not a JPA managed object.


              People

              • Assignee:
                jkeller Jonathan Keller
                Reporter:
                ewestfal Eric Westfall
              • Votes:
                0
                Watchers:
                5

                Dates

                • Created:
                  Updated:

                  Time Tracking

                  Estimated:
                  1w 2d
                  Remaining:
                  1w
                  Logged:
                  Not Specified