Persistence Design

University of Nantes — LS2N, France

Outline

  1. Objectives

  2. Principles of Persistence

  3. Object-Relational Mapping

  4. Persistence Design

  5. Conclusion

Objectives

  1. Make domain objects persistent; or

    • Create a database for an application or application family

  2. Use existing persistence mechanisms

    • Integrate an existing database, used by other applications

Persistence Mechanisms

Relational databases

PostgreSQL, MySQL, etc.

NoSQL databases

MongoDB, Neo4j, Redis, Cassandra, etc.

Textual files

JSON, XML, CSV, etc.

In-memory databases

H2, Apache Derby, HSQLDB

Also: Container and Pluggable Databases (CDB and PDB)

Persisted Objects

What should be persisted?
  • Essentially, domain objects:

    • Users, Orders, Products, etc.

  • Possibly, technical objects:

    • Window positions, network timeouts, cache size, etc.

Entities and Value Objects

  • Domain objects are either Entities or Value Objects

  • Entities are objects that have a distinct identity

  • Value Objects are objects that matter only as the combination of their properties.

    • Two value objects with the same property values are considered equal.
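
The distinction shows up in how equality is defined. The following is a minimal sketch, not a prescribed design: Money is a hypothetical value object, and the User entity is assumed to carry a numeric id.

```java
import java.util.Objects;

// Value Object (hypothetical): equality is the combination of its property values.
final class Money {
    private final long cents;
    private final String currency;

    Money(long cents, String currency) {
        this.cents = cents;
        this.currency = currency;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Money)) return false;
        Money other = (Money) o;
        return cents == other.cents && currency.equals(other.currency);
    }

    @Override
    public int hashCode() {
        return Objects.hash(cents, currency);
    }
}

// Entity: equality depends on identity only, whatever the other properties are.
final class User {
    private final long id;   // distinct identity (assumed to be a numeric id)
    private String name;     // may change without changing who the user is

    User(long id, String name) {
        this.id = id;
        this.name = name;
    }

    void rename(String newName) { this.name = newName; }

    String name() { return name; }

    @Override
    public boolean equals(Object o) {
        return o instanceof User && ((User) o).id == id;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(id);
    }
}
```

Two Money instances with the same amount and currency are interchangeable, whereas a renamed User is still the same User.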

Entities and Value Objects
Figure 1. Entities and Value Objects

Principles of Persistence

Principles of Persistence

  1. Object Materialization and Dematerialization

  2. Unique Identifier

  3. Persistent Object States

  4. Aggregates

  5. Lazy Materialization

  6. Polymorphic Associations and Queries

Principle 1: Object Materialization and Dematerialization

Dematerialization

The process of translating an in-memory object into a format that is suitable for a persistence mechanism

Materialization

The opposite process: translating an object from its persistent form into a live object that can be used by a program
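
Both directions can be sketched with a record modelled as a map from column names to values. This is only an illustration: the Product class, the ProductMapper helper and the column names are assumptions, and a real mechanism would write to a database or a file instead of a map.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical domain object
final class Product {
    final long id;
    final String name;
    final int priceCents;

    Product(long id, String name, int priceCents) {
        this.id = id;
        this.name = name;
        this.priceCents = priceCents;
    }
}

// Hypothetical mapper: a "record" is a column-name -> value map
final class ProductMapper {
    // Dematerialization: in-memory object -> persistable record
    static Map<String, Object> dematerialize(Product p) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("ID", p.id);
        row.put("NAME", p.name);
        row.put("PRICE_CENTS", p.priceCents);
        return row;
    }

    // Materialization: persisted record -> live object
    static Product materialize(Map<String, Object> row) {
        return new Product(
                (Long) row.get("ID"),
                (String) row.get("NAME"),
                (Integer) row.get("PRICE_CENTS"));
    }
}
```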

Materialization and Dematerialization

Figure 2. Object Materialization and Dematerialization

Principle 2: Unique Identifier

Unique Identifier

To relate records to objects, and ensure that there are no duplicates, both must have a unique identifier

  • In object-oriented languages, each object has a unique identifier, its OID

    • Usually its memory address

  • In databases, the OID cannot be used as a record identifier

    • It is usually not easily accessible

    • It changes every time the object is materialized

Unique Identifier

Design Alternatives
  • Let the DBMS handle the identifiers

    • e.g. using sequence generators

  • Universally Unique Identifier (UUID)

    • 16-byte (128-bit) number

    • easy generation at source

  • High-Low algorithm

    • Ask the database for a range of identifiers

High-Low Algorithm Example
Figure 3. High-Low Algorithm Example
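
A minimal sketch of a High-Low generator, under stated assumptions: HighValueSource is a hypothetical stand-in for the database (in practice a sequence or a counter row), and blockSize is the number of identifiers in each reserved range.

```java
import java.util.concurrent.atomic.AtomicLong;

// Stand-in for the database (assumption): hands out the next "high" value.
final class HighValueSource {
    private final AtomicLong counter = new AtomicLong(0);

    long nextHigh() { return counter.getAndIncrement(); }
}

final class HiLoGenerator {
    private final HighValueSource source;
    private final int blockSize;   // identifiers per reserved range
    private long high;
    private int low;

    HiLoGenerator(HighValueSource source, int blockSize) {
        this.source = source;
        this.blockSize = blockSize;
        this.low = blockSize;      // forces a reservation on first use
    }

    synchronized long nextId() {
        if (low >= blockSize) {    // range exhausted: one round trip to the source
            high = source.nextHigh();
            low = 0;
        }
        return high * blockSize + low++;
    }
}
```

Two generators sharing the same source never produce the same identifier, and each contacts the source only once per blockSize identifiers.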

Principle 3: Object States

Persistent Object States
Figure 4. Persistent Object States
Legend
  • New: new object, not persisted

  • Clean: unmodified

  • Dirty: modified

  • Old: persisted object, retrieved from DB
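
The states in the legend can be tracked with a small state machine. The sketch below is an assumption, since the figure is not reproduced here: it combines the legend into three states (NEW, OLD_CLEAN, OLD_DIRTY) and guesses the transitions; TrackedObject is a hypothetical name.

```java
// States and transitions are assumptions made for illustration.
enum PersistenceState { NEW, OLD_CLEAN, OLD_DIRTY }

final class TrackedObject {
    private PersistenceState state;

    private TrackedObject(PersistenceState state) { this.state = state; }

    // A freshly created object that has never been persisted
    static TrackedObject created() { return new TrackedObject(PersistenceState.NEW); }

    // An object just materialized from the database
    static TrackedObject retrieved() { return new TrackedObject(PersistenceState.OLD_CLEAN); }

    PersistenceState state() { return state; }

    // Called by setters: only an old, clean object becomes dirty
    void markModified() {
        if (state == PersistenceState.OLD_CLEAN) {
            state = PersistenceState.OLD_DIRTY;
        }
    }

    // True when a commit has something to write (insert for NEW, update for OLD_DIRTY)
    boolean needsWrite() { return state != PersistenceState.OLD_CLEAN; }

    // After a successful write the object is persisted and in sync
    void commit() { state = PersistenceState.OLD_CLEAN; }
}
```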

Principle 4: Aggregates

Aggregate

An aggregate is a cluster of domain objects that can be treated as a single unit

Diagram
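
As an illustration, an order and its lines can form one aggregate whose root is the only entry point. The Order and OrderLine classes below are hypothetical; the point is that outside code changes the cluster only through the root.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Part of the aggregate: never created or modified directly by outside code
final class OrderLine {
    final String product;
    final int quantity;

    OrderLine(String product, int quantity) {
        this.product = product;
        this.quantity = quantity;
    }
}

// Aggregate root (hypothetical): the cluster is loaded, changed and saved through it
final class Order {
    private final long id;
    private final List<OrderLine> lines = new ArrayList<>();

    Order(long id) { this.id = id; }

    long id() { return id; }

    // The root enforces the rules of the whole cluster
    void addLine(String product, int quantity) {
        if (quantity <= 0) {
            throw new IllegalArgumentException("quantity must be positive");
        }
        lines.add(new OrderLine(product, quantity));
    }

    // Read-only view: callers cannot bypass addLine
    List<OrderLine> lines() { return Collections.unmodifiableList(lines); }

    int totalQuantity() {
        int total = 0;
        for (OrderLine line : lines) {
            total += line.quantity;
        }
        return total;
    }
}
```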

Principle 5: Lazy Materialization

  • In Lazy Materialization, not all objects are materialized at once

  • Referenced objects are materialized on demand

A Board Containing two Countries
Figure 5. A Board Containing two Countries
Lazy Materialization of Board
Figure 6. Lazy Materialization of Board

Lazy Materialization Design

Applying the Proxy Design Pattern to Lazy Materialization
Figure 7. Applying the Proxy Design Pattern to Lazy Materialization
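
A minimal sketch of the proxy, assuming a Country interface with a single name() operation. The loader parameter is a hypothetical hook standing for the database read; it is only called the first time the country is actually used.

```java
import java.util.function.Supplier;

interface Country {
    String name();
}

// The real subject, expensive to materialize
final class RealCountry implements Country {
    private final String name;

    RealCountry(String name) { this.name = name; }

    public String name() { return name; }
}

// Proxy: stands in for the country and materializes it on demand
final class CountryProxy implements Country {
    private final Supplier<Country> loader;  // hypothetical hook, e.g. a database read
    private Country subject;                 // null until materialized

    CountryProxy(Supplier<Country> loader) { this.loader = loader; }

    boolean isMaterialized() { return subject != null; }

    public String name() {
        if (subject == null) {
            subject = loader.get();          // materialize once, on first use
        }
        return subject.name();
    }
}
```

A Board can then hold CountryProxy instances: materializing the board does not load any country until one of them is accessed.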

Principle 6: Polymorphic Associations and Queries

Polymorphic association

When a class A is associated with an abstract parent class B, but instances of A are linked to instances of subclasses of B

Polymorphic query

When the query requests information corresponding to the properties of a parent class (e.g. all mission names)

Diagram
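
On the object side, both notions can be sketched as follows. The Mission subclasses, the Player class and the MissionQueries helper are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Abstract parent class B
abstract class Mission {
    private final String name;

    Mission(String name) { this.name = name; }

    String name() { return name; }
}

final class ConquestMission extends Mission {
    final int countriesToConquer;

    ConquestMission(String name, int countriesToConquer) {
        super(name);
        this.countriesToConquer = countriesToConquer;
    }
}

final class EliminationMission extends Mission {
    final String targetPlayer;

    EliminationMission(String name, String targetPlayer) {
        super(name);
        this.targetPlayer = targetPlayer;
    }
}

// Class A: the association is declared on the parent, the linked object is a subclass instance
final class Player {
    final String name;
    final Mission mission;

    Player(String name, Mission mission) {
        this.name = name;
        this.mission = mission;
    }
}

final class MissionQueries {
    // Polymorphic query: uses only a property of the parent class
    static List<String> allNames(List<? extends Mission> missions) {
        List<String> names = new ArrayList<>();
        for (Mission m : missions) {
            names.add(m.name());
        }
        return names;
    }
}
```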

Object-Relational Mapping

Mapping Classes to Tables

Mapping patterns [Keller 1997]
  • One inheritance tree, one table.

  • One class, one table.

  • One inheritance path, one table.

  • Objects in BLOBs.

  • [Keller 1997] Wolfgang Keller. "Mapping Objects to Tables: A Pattern Language". In: Proc. of the European Conference on Pattern Languages of Programming and Computing, Kloster Irsee, Germany, 1997, p. 207.

An Inheritance Tree, a Table.

Diagram
Diagram

Consequences

Advantages
  • Often, the simplest solution

  • Actually, most frequently chosen

  • Enables polymorphic queries and associations

Drawbacks
  • Many columns will contain the NULL value

  • These columns cannot be declared NOT NULL, even if this constraint is valid for subclasses
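
The NULL columns are easy to see when the rows are written out. The sketch below models a row as a map and assumes a small Article hierarchy (BlogPost, Book) with a TYPE discriminator column; the class, mapper and column names are illustrative only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

abstract class Article {
    final long id;
    final String title;

    Article(long id, String title) {
        this.id = id;
        this.title = title;
    }
}

final class BlogPost extends Article {
    final String url;

    BlogPost(long id, String title, String url) {
        super(id, title);
        this.url = url;
    }
}

final class Book extends Article {
    final String isbn;

    Book(long id, String title, String isbn) {
        super(id, title);
        this.isbn = isbn;
    }
}

// Hypothetical mapper: one row layout for the whole tree (ID, TYPE, TITLE, URL, ISBN)
final class SingleTableMapper {
    static Map<String, Object> toRow(Article a) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("ID", a.id);
        row.put("TYPE", a.getClass().getSimpleName());   // discriminator
        row.put("TITLE", a.title);
        // Columns that belong to another subclass stay NULL
        row.put("URL", a instanceof BlogPost ? ((BlogPost) a).url : null);
        row.put("ISBN", a instanceof Book ? ((Book) a).isbn : null);
        return row;
    }
}
```

Every row has the same columns, so a query on TITLE covers all articles, but URL and ISBN cannot be declared NOT NULL.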

An Inheritance Path, a Table.

Diagram
Diagram

Consequences

Advantages
  • Most natural method: one table per entity type

  • No need for joins to retrieve information

Drawbacks
  • Can’t simply translate polymorphic associations

  • For example, a class that references any article (blog post or book).

  • In fact, no relational table corresponds to any article, so a referential integrity constraint (foreign key) cannot be imposed.

  • It’s hard to query the database to perform a polymorphic query: e.g. the title of all articles

A Class, a Table.

Diagram
Diagram

Consequences

Identity Preservation
  • The properties of an object are spread over several tables (those corresponding to its class and superclasses)

  • Its identity is then preserved by giving the same primary key to the rows corresponding to the object in the different tables

  • The primary keys of tables corresponding to child classes are foreign keys to the primary key of the parent class

  • To retrieve information on an instance of a child class, simply perform a join on these primary keys
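
The shared primary key can be illustrated with two in-memory "tables" keyed by the same identifier. This is a toy model, not a database: ClassTableStore and its ARTICLE and BOOK tables are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of two tables that share the same primary key
final class ClassTableStore {
    final Map<Long, String> articleTitle = new HashMap<>();  // ARTICLE(ID, TITLE)
    final Map<Long, String> bookIsbn = new HashMap<>();      // BOOK(ID references ARTICLE.ID, ISBN)

    // One object, two rows, one key
    void insertBook(long id, String title, String isbn) {
        articleTitle.put(id, title);
        bookIsbn.put(id, isbn);
    }

    // Equivalent of joining BOOK and ARTICLE on their primary keys
    String describeBook(long id) {
        if (!bookIsbn.containsKey(id)) {
            return null;   // no such book
        }
        return articleTitle.get(id) + " (ISBN " + bookIsbn.get(id) + ")";
    }
}
```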

More Consequences

Advantages
  • Simple: bijection between classes and tables

  • Enables polymorphic queries and associations

Drawbacks
  • If the inheritance hierarchy is deep, numerous joins are required to reconstitute information scattered across numerous tables

  • This results in complex queries and, above all, poor performance

  • The "type" column helps to alleviate this problem; it is possible, for example, to retrieve the titles of articles without a join

Objects in BLOBs.

Diagram
Diagram

Consequences

Advantages
  • Allows reading and writing of any object with a single database operation

  • If the database allows variable length BLOBs, space consumption is optimal.

Drawbacks
  • Scanning classes for properties is difficult. As the internal structure is not accessible, you need to register functions with the database that give you access to the attributes

  • The database cannot be shared with other applications

  • Better suited to key-value databases than to relational ones

  • Schema evolution is comparable to schema evolution in an object oriented database

Mapping Associations to Tables

Patterns for Mapping Associations
  • Foreign Key Association

  • Association Table

Patterns for Mapping Aggregations
  • Single Table Aggregation

  • Foreign Key Aggregation

Foreign Key Association

One-to-one

Place an OID foreign key in one or both tables.

One-to-many

Place an OID foreign key in the table on the "many" side

Diagram
Diagram

Consequences

Advantages
  • Write performance: Writing the owned objects of a [*] association costs only as many writes as there are changed objects, as unchanged objects are not written.

  • Integration of legacy systems: As most relational legacy systems use this mapping, converting [*] associations to objects is no source of new problems.

  • Space consumption is near optimal, except for the space required for the foreign key column (USER_ID) in the dependent object table (COUNTRY).

Drawbacks
  • Read performance: Reading a User object costs a join operation or two read operations, one of them multiple. You then have the User object plus a set of references to all Countries.

Association Table

All cardinalities

Create an associative table with keys of both objects

Diagram
Diagram

Consequences

  • Analogous to Foreign Key Association, only adapted to the slightly different context.

Single Table Aggregation

Diagram
  • Solution: put the aggregated object’s attributes into the same table as the aggregating objects.

Diagram
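
A minimal sketch of the flattening, with a row modelled as a map. The Customer and Address classes, the CustomerMapper helper and the ADDRESS_ column prefix are assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Aggregated object: has no table of its own
final class Address {
    final String street;
    final String city;

    Address(String street, String city) {
        this.street = street;
        this.city = city;
    }
}

// Aggregating object
final class Customer {
    final long id;
    final String name;
    final Address address;

    Customer(long id, String name, Address address) {
        this.id = id;
        this.name = name;
        this.address = address;
    }
}

// Hypothetical mapper: the address attributes become columns of the customer row
final class CustomerMapper {
    static Map<String, Object> toRow(Customer c) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("ID", c.id);
        row.put("NAME", c.name);
        row.put("ADDRESS_STREET", c.address.street);
        row.put("ADDRESS_CITY", c.address.city);
        return row;
    }

    // A single row is enough to rebuild the customer and its address
    static Customer fromRow(Map<String, Object> row) {
        Address address = new Address(
                (String) row.get("ADDRESS_STREET"),
                (String) row.get("ADDRESS_CITY"));
        return new Customer((Long) row.get("ID"), (String) row.get("NAME"), address);
    }
}
```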

Consequences

Advantages
  • The solution is optimal in terms of performance as only one table needs to be accessed to retrieve an aggregating object with all its aggregated objects.

  • Aggregated objects are automatically deleted on deletion of the aggregating objects, increasing the consistency of the database.

  • No application kernel code or database triggers are needed.

More Consequences

Drawbacks
  • The columns for aggregated objects attributes are likely to increase the number of pages retrieved with each database access, resulting in a possible waste of I/O bandwidth.

  • If the aggregated object type is aggregated in more than one object type, the design results in poor maintainability, as each change to the aggregated type requires adapting the database tables of all the aggregating object types.

  • An ad hoc query that scans all AddressType objects in the database is very hard to formulate.

Foreign Key Aggregation

Diagram
Diagram

Consequences

Advantages
  • Factoring out aggregated objects into separate tables makes these tables easy to query with ad hoc queries.

  • Maintenance: Factoring out objects like the Address into tables of their own makes them easier to maintain and hence makes the mapping more flexible.

Drawbacks
  • Consistency: Aggregated objects are not automatically deleted on deletion of the aggregating objects.

  • Performance: Requires a join operation or at least two database accesses, where Single Table Aggregation needs a single database operation.

Persistence Design

Design Approaches

  1. Ad hoc implementation

  2. Transparent and automated persistence layer

  3. Repositories

Approach 1: Ad Hoc Implementation

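import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// JDBC 4 drivers register themselves, so Class.forName is no longer needed
public Teacher loadMartin() throws SQLException {
	String url   = "jdbc:mysql://serveur.info.univ-nantes.fr/ensdb";
	String query = "SELECT NAME, BIRTHDAY FROM Teacher WHERE id = 'Martin'";
	// try-with-resources closes the result set, statement and connection, even on error
	try (Connection con = DriverManager.getConnection(url, "login", "password");
	     Statement stmt = con.createStatement();
	     ResultSet rs = stmt.executeQuery(query)) {
		// The cursor starts before the first row: advance it before reading
		if (!rs.next()) {
			return null; // no teacher with this id
		}
		return new Teacher(rs.getString("NAME"), rs.getDate("BIRTHDAY"));
	}
}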

Consequences

Advantages
  • Fast implementation, suitable for small applications.

  • The developer has complete control over queries.

Drawbacks
  • Considerable dependency between domain classes and the database schema.

  • If the schema or DBMS changes, the code must change too.

Approach 2: ORM Manager

  • An ORM Manager provides a transparent and automated persistence layer

Diagram

Consequences

Advantages
  • Database independence.

  • Programmers can ignore the details of the data schema.

  • In the case of Java: the domain depends on abstractions (JPA) and not on the persistence framework

Drawbacks
  • Considerable impact on performance if developers do not understand how the ORM works.

  • The database is difficult to optimize.

Approach 3: Repositories

  • Create a repository for each aggregate

Diagram

One Repository for each Aggregate

The Risk Server Requires three Interfaces
Figure 8. The Risk Server Requires three Interfaces

Repository Interfaces

Design repositories as in-memory collections

Diagram
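
A minimal sketch of a repository that reads like a collection. Game is a hypothetical aggregate root and the operation names are assumptions; the in-memory implementation is only one possible component behind the interface, and a JDBC- or ORM-backed one would implement the same operations.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical aggregate root
final class Game {
    final long id;
    final String name;

    Game(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

// Repository interface: collection-like operations, no trace of SQL or tables
interface GameRepository {
    void add(Game game);
    Optional<Game> findById(long id);
    Collection<Game> findAll();
    void remove(Game game);
}

// One possible persistence component behind the interface
final class InMemoryGameRepository implements GameRepository {
    private final Map<Long, Game> games = new LinkedHashMap<>();

    public void add(Game game) { games.put(game.id, game); }

    public Optional<Game> findById(long id) { return Optional.ofNullable(games.get(id)); }

    // Returns a copy so callers cannot modify the repository's content directly
    public Collection<Game> findAll() { return new ArrayList<>(games.values()); }

    public void remove(Game game) { games.remove(game.id); }
}
```

Client code depends only on GameRepository, so the persistence mechanism can be replaced without touching the domain.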

Persistence Mechanisms as Components

Diagram

Interface Operations

Parameter Types Alternatives
  • Domain classes

    • Impact on coupling?

  • Interfaces

  • Data Transfer Objects (DTO)

    • Who creates the DTOs?

Diagram
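
The DTO alternative can be sketched as follows. TeacherDto and TeacherAssembler are hypothetical names, and the Teacher class here is a simplified stand-in with a name and a birthday. Placing the conversion in a dedicated assembler is one possible answer to "who creates the DTOs?", not the only one.

```java
import java.time.LocalDate;

// Simplified domain class (assumption for this example)
final class Teacher {
    private final String name;
    private final LocalDate birthday;

    Teacher(String name, LocalDate birthday) {
        this.name = name;
        this.birthday = birthday;
    }

    String name() { return name; }

    LocalDate birthday() { return birthday; }
}

// DTO: flat, immutable carrier with no behaviour, built from simple types only
final class TeacherDto {
    final String name;
    final String birthday;   // ISO-8601 text, e.g. "1970-01-31"

    TeacherDto(String name, String birthday) {
        this.name = name;
        this.birthday = birthday;
    }
}

// Hypothetical assembler: the only place that knows both the domain class and the DTO
final class TeacherAssembler {
    static TeacherDto toDto(Teacher teacher) {
        return new TeacherDto(teacher.name(), teacher.birthday().toString());
    }

    static Teacher toDomain(TeacherDto dto) {
        return new Teacher(dto.name, LocalDate.parse(dto.birthday));
    }
}
```

With DTOs as parameter types, the persistence component does not depend on domain classes, at the cost of one conversion in each direction.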

Conclusion

Conclusion

Recap of persistence principles and mappings:
  • Object Materialization and Dematerialization

  • Unique Identifier

  • Persistent Object States

  • Aggregates

  • Lazy Materialization

  • Mapping Classes to Tables

  • Mapping Associations to Tables