
Tuesday, October 31, 2017

Eclipselink 2.5.2 (JPA 2.1.0): Entity Cloning

This discussion uses JPA 2.1.0 and Eclipselink 2.5.2. Specs and implementations may change in future releases. It might be helpful to test with different versions by (locally) modifying the pom.xml file in the github repository, Eclipselink Entity Copying, which contains in-depth code and discussion of the concepts relevant to this post.

Another great reference to keep handy while reading through the code: Eclipselink Attribute Groups

Among other cases, the need to clone JPA entities arises under the following conditions:
  • The absence of DTOs (Data Transfer Objects) - which may arguably represent the lack of a clean separation between the model layer and the controller layer
  • The need to protect entities from being polluted by changes, especially in multipart transactions
  • The need to simply persist or manage duplicate information

Although it can be argued that using a JPA provider to this extent is yet another antipattern, we won't talk about that kind of stuff here.

Anyway, say that the need arises for us to clone our entities. Of course, the easiest (but not necessarily the most succinct) thing to do is to implement the Cloneable interface and hand-code how each attribute is to be handled.

Suddenly, we are hit with some realizations:
  • What if our entities traverse deeply?
  • What if our entities are only partially fetched?
  • How do we want to handle unfetched attributes?
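
To make the first of these concerns concrete, here is a minimal sketch in plain Java (no JPA involved). The Order and Customer classes are hypothetical stand-ins for entities, not code from the repository. It only illustrates that the default Object.clone() is shallow, so a nested association ends up shared between the original and the copy unless we hand-code the traversal ourselves.

```java
public class ShallowCloneDemo {

    // Hypothetical stand-in for an associated entity.
    static class Customer {
        String name;

        Customer(String name) {
            this.name = name;
        }
    }

    // Hypothetical stand-in for a root entity that relies on the default clone().
    static class Order implements Cloneable {
        long id;
        Customer customer;

        Order(long id, Customer customer) {
            this.id = id;
            this.customer = customer;
        }

        @Override
        public Order clone() {
            try {
                // Object.clone() copies field values only; object references are shared.
                return (Order) super.clone();
            } catch (CloneNotSupportedException e) {
                throw new AssertionError("Order implements Cloneable", e);
            }
        }
    }

    public static void main(String[] args) {
        Order original = new Order(1L, new Customer("Alice"));
        Order copy = original.clone();

        // Changing the copy's customer also changes the original's customer,
        // because both orders point at the same Customer instance.
        copy.customer.name = "Bob";

        System.out.println(original.customer == copy.customer); // prints "true"
        System.out.println(original.customer.name);             // prints "Bob"
    }
}
```

A deep copy would need clone() to recurse into every association, which is exactly where traversal depth and unfetched attributes start to hurt.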

This is when we find out how DTOs fall short. If JPA is still to be used in querying for the required information for, say, a view, then DTOs have to be smart to some degree. It might be decided that different views that use information from the same entity warrant per-view DTOs. DTOs might also have to be made aware of which attributes they should copy from the entity, especially when certain fetch optimizations (attribute narrowing via Eclipselink FetchGroups or JPA FetchGraphs) were used, since violating such optimizations usually leads to the loss of the advantage of using them in the first place.

Luckily, we're using Eclipselink, which has a nifty tool that we can use when in such a predicament: CopyGroups (and the JpaEntityManager.copy method). Honestly though, it won't be the most sturdy of Swiss Army knives, but the tool we'd be using a lot is luckily also the sharpest one in the set.

Eclipselink CopyGroups and JpaEntityManager.copy()


The main entry point to this feature is the copy(Object entityOrEntities, AttributeGroup group) method of the org.eclipse.persistence.jpa.JpaEntityManager interface.
It returns Object from a method without generics, so we still have to cast the result. The entityOrEntities parameter can also accept a Collection of the same type of entity. Notice too that the method accepts an AttributeGroup; it internally transforms this into a CopyGroup if it is not already one. We'll usually pass CopyGroups when we use this method anyway.

We can obtain an instance of a JpaEntityManager in two ways:
  • Directly casting an EntityManager instance (make sure it runs on Eclipselink)
  • Calling unwrap(JpaEntityManager.class) on an EntityManager (again, it should run on Eclipselink)
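
As a rough sketch of the second option combined with the copy call, something like the helper below could be used. The EntityCopier class and its copy method are hypothetical names made up for illustration. The snippet assumes Eclipselink 2.5.x and the JPA 2.1 API are on the classpath, so it is not runnable on its own.

```java
import javax.persistence.EntityManager;

import org.eclipse.persistence.jpa.JpaEntityManager;
import org.eclipse.persistence.sessions.CopyGroup;

// Hypothetical helper, not part of Eclipselink or of the repository.
public final class EntityCopier {

    private EntityCopier() {
    }

    /**
     * Copies a single entity according to the given CopyGroup.
     * Assumes the EntityManager is backed by Eclipselink; otherwise unwrap() fails.
     */
    @SuppressWarnings("unchecked")
    public static <T> T copy(EntityManager em, T entity, CopyGroup group) {
        JpaEntityManager jpaEm = em.unwrap(JpaEntityManager.class);
        // copy() is declared to return Object, hence the cast.
        return (T) jpaEm.copy(entity, group);
    }
}
```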

With that, doing the actual copying is pretty much covered.

What we have to actually be familiar (and careful) with are the configuration options for the CopyGroup we pass to the copy method.

Experiment-discussion


The meat of this discussion actually appears in the test class found in this github repository:

Eclipselink Entity Copying

Simply run the "mvn test" Maven command from the directory that contains the pom.xml file to see if all the tests pass (they should). After this, read the code found in the only test class under src/test/main/....

For this post, I'll just leave a summary of the discussion in the code for our reference.

Summary: CopyGroup Configuration


A CopyGroup has two main points of configuration:
  • cascade level, which defines which types (and not numerical depth) of associations the copying should include
  • declared attributes (as it is an AttributeGroup) which it should consider when cloning (only considered when the cascade level is Cascade Tree)

General Considerations

  • Whenever an attribute is added to a CopyGroup, its cascade level is set to CascadeTree. As CascadeTree is the only cascade level that considers attributes, be mindful when adding attributes to a CopyGroup
  • When a FetchGroup, a type of AttributeGroup used for query optimization, is turned into a CopyGroup (via the toCopyGroup() method), it is automatically set to CascadeTree
  • Primary key and version columns can optionally be reset (left unset) in the copies via the CopyGroup configuration methods setShouldResetVersion(boolean) and setShouldResetPrimaryKey(boolean). These options behave differently depending on the configured cascade level
  • Copies can back-reference; that is, when copying with circular references, entities with the same key resolve to the same copied instance
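
The considerations above can be sketched as a small factory for Cascade Tree groups. The CopyGroups class and the treeOf method are hypothetical names used only for illustration. The snippet assumes Eclipselink 2.5.x on the classpath and is not runnable without it.

```java
import org.eclipse.persistence.sessions.CopyGroup;

// Hypothetical factory, not part of Eclipselink or of the repository.
public final class CopyGroups {

    private CopyGroups() {
    }

    /** Builds a Cascade Tree group limited to the named attributes, with keys and versions reset. */
    public static CopyGroup treeOf(String... attributeNames) {
        CopyGroup group = new CopyGroup();
        // Adding an attribute already switches the group to Cascade Tree;
        // the explicit call only makes the intent visible.
        group.cascadeTree();
        for (String name : attributeNames) {
            group.addAttribute(name);
        }
        // Leave the primary key and version unset in the copies.
        group.setShouldResetPrimaryKey(true);
        group.setShouldResetVersion(true);
        return group;
    }
}
```

A call such as CopyGroups.treeOf("name", "address") would then describe a copy limited to those two attributes (both attribute names are made up here).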

Cascade Level Options

  • Cascade All Parts
    • set via the CopyGroup method cascadeAllParts()
    • does not consider attributes it contains
    • copies ALL associations; initializes them if need be
      • For entities and associations that have been or are to be partially fetched, their respective copies would only have copied the attributes for partial fetching (as probably declared via FetchGroup when querying for the original)
      • For associations that were not declared in such a partial fetching scheme, they would be initialized as default
      • (i.e.) ALL associations would still be initialized; only, those that were declared to have a FetchGroup might lack some BASIC attributes
    • when an unfetched BASIC attribute is encountered, its corresponding value in the copy will be null
    • initialization triggered by copying affects the original; i.e. if an association was initialized via a query triggered by copying, then it also becomes initialized in the original
    • because ALL associations are initialized, it might not be worth using this cascade level for heavily associated entities
    • if the group is configured with setShouldResetPrimaryKey(true), the keys will only be reset if none of them are associations (all or nothing)
  • Cascade Tree
    • set via the CopyGroup method cascadeTree()
    • the cascade level is automatically set to Cascade Tree when an attribute is added to the CopyGroup
    • when a CopyGroup is obtained via a toCopyGroup() on a FetchGroup, the resulting CopyGroup uses the Cascade Tree level
    • when passing a CopyGroup without attributes, copying will involve "all attributes", though when it comes to associations, it is still unpredictable (needs more testing); it is unlikely that an empty CopyGroup will be used with the CascadeTree level anyway
    • when accessing an attribute/association from the copy that is not declared in the CopyGroup (which is not empty), an IllegalStateException is thrown
      • this can be useful for adjusting/optimizing FetchGroups (use them as CopyGroups)
      • be careful when turning FetchGroups into CopyGroups:
        • if a complete CopyGroup is desired, then take it from the FetchGroup that is manually configured and passed as a query hint
        • FetchGroups taken from resulting entities (after being cast to FetchGroupTracker) have been broken down so that they only describe the entity they were taken from
    • if copying triggers initialization queries, then the original entities are affected as well
  • Cascade Private Parts
    • set via the CopyGroup method cascadePrivateParts()
    • supposed to behave like Cascade All Parts, except it cascades only associations annotated with @org.eclipse.persistence.annotations.PrivateOwned
    • it worked the other way around in the tests - all associations except the PrivateOwned one were cascaded
    • still unpredictable; needs more testing
  • Cascade None
    • set via the CopyGroup method dontCascade()
    • still initialized associations, thus straying from its name and contract
    • still unpredictable; needs more testing
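
For the Cascade Tree advice on FetchGroups, a minimal sketch of reusing a manually configured FetchGroup as both query hint and CopyGroup could look like this. The FetchGroupCopyExample class and the fetchAndCopy method are hypothetical names, and the JPQL string is whatever the caller supplies. The snippet assumes Eclipselink 2.5.x and the JPA 2.1 API on the classpath, so it is not runnable on its own.

```java
import java.util.Collection;
import java.util.List;

import javax.persistence.EntityManager;
import javax.persistence.TypedQuery;

import org.eclipse.persistence.config.QueryHints;
import org.eclipse.persistence.jpa.JpaEntityManager;
import org.eclipse.persistence.queries.FetchGroup;

// Hypothetical helper, not part of Eclipselink or of the repository.
public final class FetchGroupCopyExample {

    private FetchGroupCopyExample() {
    }

    /** Queries with the given FetchGroup, then copies the results using the same group. */
    @SuppressWarnings("unchecked")
    public static <T> Collection<T> fetchAndCopy(
            EntityManager em, Class<T> type, String jpql, FetchGroup fetchGroup) {
        TypedQuery<T> query = em.createQuery(jpql, type);
        // Narrow the fetched attributes with the manually configured FetchGroup.
        query.setHint(QueryHints.FETCH_GROUP, fetchGroup);
        List<T> originals = query.getResultList();

        JpaEntityManager jpaEm = em.unwrap(JpaEntityManager.class);
        // toCopyGroup() yields a Cascade Tree CopyGroup with the same attributes.
        // Use the group that was passed as the hint, not one read back from a
        // resulting entity, since the latter only describes that one entity.
        return (Collection<T>) jpaEm.copy(originals, fetchGroup.toCopyGroup());
    }
}
```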

Unfortunately, only Cascade All Parts and Cascade Tree can be held to their intent to a usable degree - luckily, Cascade Tree is the level that would see the most use.

In the end, the only CopyGroups actually worth using (fortunately, these should also cover the common use case) are those derived from manually configured FetchGroups, or cascade-tree-level groups that were manually built with careful consideration.

Admittedly, this time, it probably seems like a disappointing turnout - one where we are only given a limited number of options.

Perhaps with this we can help each other dig deeper into this feature and learn more about it, or even have the guys over at Eclipselink help us with it.

In any case, once again, hope this helped. Thanks!