University of Nantes — LS2N, France
Gerson Sunyé gerson.sunye@univ-nantes.fr
Objectives
Principles of Persistence
Object-Relational Mapping
Persistence Design
Conclusion
Make domain objects persistent; or
Create a database for an application or application family
Use existing persistence mechanisms
Integrate an existing database, used by other applications
PostgreSQL, MySQL, etc.
MongoDB, Net4J, Redis, Cassandra, etc.
JSON, XML, CSV, etc.
H2, Apache Derby, HSQLDB
Also: Container and Pluggable Databases (CDB and PDB)
Essentially, domain objects:
Users, Orders, Products, etc.
Possibly, technical objects:
Window positions, network timeouts, cache size, etc.
Domain objects are either Entities or Value Objects
Entities are objects that have a distinct identity
Value Objects are objects that matter only as the combination of their properties.
Two value objects with the same property values are considered equal.
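The distinction can be sketched in Java as follows. The `Customer` and `Money` classes are hypothetical examples chosen for illustration: the entity compares by identifier only, the value object by the combination of its properties.

```java
import java.util.Objects;

// Entity (hypothetical example): equality is based on identity only.
final class Customer {
    private final long id;
    private String name;

    Customer(long id, String name) { this.id = id; this.name = name; }

    void rename(String newName) { this.name = newName; }

    String name() { return name; }

    @Override public boolean equals(Object o) {
        return o instanceof Customer && ((Customer) o).id == id;
    }

    @Override public int hashCode() { return Long.hashCode(id); }
}

// Value Object (hypothetical example): immutable, equality uses all properties.
final class Money {
    private final long cents;
    private final String currency;

    Money(long cents, String currency) { this.cents = cents; this.currency = currency; }

    @Override public boolean equals(Object o) {
        if (!(o instanceof Money)) return false;
        Money other = (Money) o;
        return cents == other.cents && currency.equals(other.currency);
    }

    @Override public int hashCode() { return Objects.hash(cents, currency); }
}
```

Renaming a `Customer` does not change which customer it is, whereas a `Money` with a different amount is simply a different value.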
Object Materialization and Dematerialization
Unique Identifier
Persistent Object States
Aggregates
Lazy Materialization
Polymorphic Associations and Queries
The process of translating an in-memory object into a format that is suitable for a persistence mechanism
The opposite: the process of translating an object from its persistent form into a live object that can be used by a program
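A minimal sketch of both directions, assuming a record is represented as a map from column names to values. The `Teacher` class, the `TeacherMapper` helper and the column names are assumptions made for this illustration.

```java
import java.time.LocalDate;
import java.util.HashMap;
import java.util.Map;

// Hypothetical domain class used only for this illustration.
final class Teacher {
    final String name;
    final LocalDate birthday;

    Teacher(String name, LocalDate birthday) {
        this.name = name;
        this.birthday = birthday;
    }
}

final class TeacherMapper {
    // Dematerialization: live object -> record (column name -> value).
    static Map<String, Object> dematerialize(Teacher teacher) {
        Map<String, Object> row = new HashMap<>();
        row.put("NAME", teacher.name);
        row.put("BIRTHDAY", teacher.birthday.toString()); // ISO-8601 text
        return row;
    }

    // Materialization: record -> live object.
    static Teacher materialize(Map<String, Object> row) {
        return new Teacher(
                (String) row.get("NAME"),
                LocalDate.parse((String) row.get("BIRTHDAY")));
    }
}
```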
To relate records to objects, and ensure that there are no duplicates, both must have a unique identifier
In object-oriented languages, each object has a unique identifier, its OID
Usually its memory address
In databases, the OID cannot be used as a record identifier
Usually not easily accessible
It changes every time the object is materialized
Let the DBMS handle the identifiers
e.g. using sequence generators
Universally Unique Identifier (UUID)
16-byte (128-bit) number
easy generation at source
High-Low algorithm
Ask the database for a range of identifiers
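The last two strategies can be sketched as follows. This is an illustration under stated assumptions: the database sequence is simulated by an in-memory `AtomicLong`, and `HiLoGenerator` and `UuidIds` are hypothetical names. Each generator asks the "database" for one block number, then hands out `blockSize` identifiers locally before asking again.

```java
import java.util.UUID;
import java.util.concurrent.atomic.AtomicLong;

// High-Low sketch: one database round trip per block of identifiers.
final class HiLoGenerator {
    private final AtomicLong databaseSequence; // stands in for a DB sequence
    private final int blockSize;
    private long hi; // block number obtained from the "database"
    private int lo;  // position inside the current block

    HiLoGenerator(AtomicLong databaseSequence, int blockSize) {
        this.databaseSequence = databaseSequence;
        this.blockSize = blockSize;
        this.lo = blockSize; // forces a fetch on first use
    }

    synchronized long nextId() {
        if (lo == blockSize) { // block exhausted: reserve a new range
            hi = databaseSequence.getAndIncrement();
            lo = 0;
        }
        return hi * blockSize + lo++;
    }
}

final class UuidIds {
    // Random UUID generated at the source, without any database access.
    static UUID newId() {
        return UUID.randomUUID();
    }
}
```

Two generators sharing the same sequence never produce the same identifier, because each block number is handed out only once.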
New: new object, not persisted
Clean: unmodified
Dirty: modified
Old: persisted object, retrieved from DB
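One possible reading of these states, sketched in Java. It assumes that an "old" object starts clean when it is loaded and becomes dirty when modified; the `TrackedObject` class and the operation names returned by `commit()` are assumptions made for this illustration.

```java
// Tracks the persistence state of a single object (hypothetical helper).
final class TrackedObject {
    enum State { NEW, CLEAN, DIRTY }

    private State state;
    private String value;

    private TrackedObject(State state, String value) {
        this.state = state;
        this.value = value;
    }

    // New object, not yet persisted.
    static TrackedObject created(String value) {
        return new TrackedObject(State.NEW, value);
    }

    // Old object, just retrieved from the database: clean until modified.
    static TrackedObject loaded(String value) {
        return new TrackedObject(State.CLEAN, value);
    }

    void setValue(String newValue) {
        value = newValue;
        if (state == State.CLEAN) state = State.DIRTY; // a NEW object stays NEW
    }

    // Returns the operation needed at commit time, then marks the object clean.
    String commit() {
        String operation;
        if (state == State.NEW) operation = "INSERT";
        else if (state == State.DIRTY) operation = "UPDATE";
        else operation = "NONE";
        state = State.CLEAN;
        return operation;
    }

    State state() { return state; }

    String value() { return value; }
}
```

Knowing the state lets the persistence layer skip unchanged objects and choose between an insert and an update.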
An aggregate is a cluster of domain objects that can be treated as a single unit
In Lazy Materialization, not all objects are materialized at once
Referenced objects are materialized on demand
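A common way to obtain this behaviour is a virtual proxy that stands in for the referenced object and loads it on first access. The sketch below is a generic illustration; the `Lazy` class is a hypothetical name, and the `Supplier` stands for whatever code would read the object from the persistence mechanism.

```java
import java.util.function.Supplier;

// Virtual proxy: holds a loader instead of the referenced object.
final class Lazy<T> {
    private Supplier<T> loader; // hypothetical hook that reads from storage
    private T value;
    private boolean loaded;

    Lazy(Supplier<T> loader) {
        this.loader = loader;
    }

    // Materializes the referenced object on first access only.
    synchronized T get() {
        if (!loaded) {
            value = loader.get();
            loaded = true;
            loader = null; // the loader is no longer needed
        }
        return value;
    }

    synchronized boolean isLoaded() {
        return loaded;
    }
}
```

An object that holds a `Lazy` reference can itself be materialized without touching the referenced record until `get()` is called.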
Polymorphic association: a class A is associated with an abstract parent class B, but instances of A are linked to instances of subclasses of B
Polymorphic query: the query requests information corresponding to the properties of a parent class (e.g. all mission names)
One inheritance tree, one table.
One class, one table.
One inheritance path, one table.
Objects in BLOBs.
[[[Keller 1997]]] Wolfgang Keller 1997. "Mapping Objects to Tables: A Pattern Language". In: Proc. of European Conference on Pattern Languages of Programming and Computing, Kloster Irsee, Germany, 1997. p. 207.
Often, the simplest solution
Actually, most frequently chosen
Enables polymorphic queries and associations
Many columns will contain the NULL value
These columns cannot be declared NOT NULL, even if this constraint is valid for subclasses
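A sketch of how rows of such a single table could be materialized. It assumes a hypothetical ARTICLE table with the columns TYPE, TITLE, URL and ISBN, where TYPE is a discriminator column telling which subclass a row belongs to. The `Article`, `BlogPost` and `Book` classes are illustrative.

```java
import java.util.Map;

abstract class Article {
    final String title;
    Article(String title) { this.title = title; }
}

final class BlogPost extends Article {
    final String url;
    BlogPost(String title, String url) { super(title); this.url = url; }
}

final class Book extends Article {
    final String isbn;
    Book(String title, String isbn) { super(title); this.isbn = isbn; }
}

final class ArticleTable {
    // Materializes one row; columns that belong to other subclasses are NULL.
    static Article materialize(Map<String, String> row) {
        String type = row.get("TYPE");
        if ("BLOG_POST".equals(type)) {
            return new BlogPost(row.get("TITLE"), row.get("URL"));
        }
        if ("BOOK".equals(type)) {
            return new Book(row.get("TITLE"), row.get("ISBN"));
        }
        throw new IllegalArgumentException("Unknown TYPE: " + type);
    }
}
```

A row describing a book leaves URL empty, which is why that column cannot be declared NOT NULL even though every blog post must have one.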
Most natural method: one table per entity type
No need for joins to retrieve information
Can’t simply translate polymorphic associations
For example, a class that references any article (blog post or book).
In fact, no relational table corresponds to any article, so a referential integrity constraint (foreign key) cannot be imposed.
Polymorphic queries are hard to express, e.g. retrieving the titles of all articles
The properties of an object are spread over several tables (those corresponding to its class and superclasses)
Its identity is then preserved by giving the same primary key to the rows corresponding to the object in the different tables
The primary keys of tables corresponding to child classes are foreign keys to the primary key of the parent class
To retrieve information on an instance of a child class, simply perform a join on these primary keys
Simple: bijection between classes and tables
Enables polymorphic queries and associations
If the inheritance hierarchy is deep, numerous joins are required to reconstitute information scattered across numerous tables
This results in complex instructions and, above all, poor performance
The "type" column helps to alleviate this problem; it is possible, for example, to retrieve the titles of articles without join
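The shared primary key and the join can be illustrated with in-memory stand-ins for the tables. This is a sketch under assumptions: the ARTICLE, BOOK and BLOG_POST tables, their columns and the `JoinedTables` class are hypothetical, and maps play the role of tables.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Maps keyed by primary key stand in for the parent and child tables.
final class JoinedTables {
    private final Map<Long, Map<String, String>> article = new TreeMap<>();
    private final Map<Long, Map<String, String>> book = new HashMap<>();
    private final Map<Long, Map<String, String>> blogPost = new HashMap<>();

    void insertBook(long id, String title, String isbn) {
        article.put(id, row("TYPE", "BOOK", "TITLE", title));
        book.put(id, row("ISBN", isbn)); // same primary key as the parent row
    }

    void insertBlogPost(long id, String title, String url) {
        article.put(id, row("TYPE", "BLOG_POST", "TITLE", title));
        blogPost.put(id, row("URL", url));
    }

    // Roughly: SELECT * FROM ARTICLE a JOIN BOOK b ON a.ID = b.ID WHERE a.ID = ?
    Map<String, String> loadBook(long id) {
        if (!book.containsKey(id)) return null;
        Map<String, String> joined = new HashMap<>(article.get(id));
        joined.putAll(book.get(id));
        return joined;
    }

    // Polymorphic query answered by the parent table alone: no join needed.
    List<String> allTitles() {
        List<String> titles = new ArrayList<>();
        for (Map<String, String> r : article.values()) titles.add(r.get("TITLE"));
        return titles;
    }

    private static Map<String, String> row(String... keysAndValues) {
        Map<String, String> r = new HashMap<>();
        for (int i = 0; i < keysAndValues.length; i += 2) {
            r.put(keysAndValues[i], keysAndValues[i + 1]);
        }
        return r;
    }
}
```

Each additional level in the hierarchy would add one more map lookup to `loadBook`, which mirrors the extra join in SQL.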
Allows reading and writing of any object with a single database operation
If the database allows variable length BLOBs, space consumption is optimal.
Scanning classes for properties is difficult. As the internal structure is not accessible, you need to register functions with the database that give you access to the attributes
The database cannot be shared with other applications
Better suited to key-value databases than to relational ones
Schema evolution is comparable to schema evolution in an object oriented database
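A minimal sketch of the BLOB approach using standard Java serialization. The `BlobCodec` class is a hypothetical helper; the resulting byte array is what would be stored in a BLOB column or as the value in a key-value store.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class BlobCodec {
    // Dematerialization: the whole object graph becomes one opaque byte array.
    static byte[] toBlob(Serializable object) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(object);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Materialization: the byte array is turned back into a live object.
    static Object fromBlob(byte[] blob) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(blob))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Could not read object from BLOB", e);
        }
    }
}
```

The database sees only bytes, which is why it cannot query individual attributes and why other applications cannot easily share the data.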
Foreign Key Association
Association Table
Single Table Aggregation
Foreign Key Aggregation
Place an OID foreign key in one or both tables.
For a one-to-many association, create an OID foreign key in the table on the "many" side
Write Performance: Writing all owned objects in an [*] association costs the number of changed associated objects, as unchanged objects are not written.
Integration of legacy systems: As most relational legacy systems use this mapping, converting [*] associations to objects is no source of new problems.
Space consumption is near optimal, except for the space required for the foreign key column (USER_ID) in the dependent object table (COUNTRY).
Read performance: Reading a User object costs a join operation or two read operations, one of them multiple.
You then have the User object plus a set of references to all Countries.
Create an associative table with keys of both objects
Analogous to Foreign Key Association, only adapted to the slightly different context.
Solution: put the aggregated object’s attributes into the same table as the aggregating objects.
The solution is optimal in terms of performance as only one table needs to be accessed to retrieve an aggregating object with all its aggregated objects.
Aggregated objects are automatically deleted on deletion of the aggregating objects, increasing the consistency of the database.
No application kernel code or database triggers are needed.
The columns for aggregated objects attributes are likely to increase the number of pages retrieved with each database access, resulting in a possible waste of I/O bandwidth.
If the aggregated object type is aggregated in more than one object type, the design results in poor maintainability, as each change to the aggregated type requires adapting the database tables of all the aggregating object types.
An ad hoc query that scans all AddressType objects in the database is very hard to formulate.
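Single Table Aggregation can be sketched as a mapper that flattens the aggregated object into the row of its owner. The `User` and `Address` classes, the `UserRowMapper` helper and the column names are assumptions made for this illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class Address {
    final String street;
    final String city;
    Address(String street, String city) { this.street = street; this.city = city; }
}

final class User {
    final String name;
    final Address address;
    User(String name, Address address) { this.name = name; this.address = address; }
}

final class UserRowMapper {
    // One row of a hypothetical USER table holds the user's and the address's attributes.
    static Map<String, String> toRow(User user) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("NAME", user.name);
        row.put("ADDRESS_STREET", user.address.street);
        row.put("ADDRESS_CITY", user.address.city);
        return row;
    }

    // A single read is enough to rebuild the user together with its address.
    static User fromRow(Map<String, String> row) {
        Address address = new Address(row.get("ADDRESS_STREET"), row.get("ADDRESS_CITY"));
        return new User(row.get("NAME"), address);
    }
}
```

If another class also aggregated `Address`, its mapper would repeat the same two columns, which is the maintainability cost noted above.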
Factoring out aggregated objects into separate tables allows easy querying of these tables with ad hoc queries.
Maintenance: Factoring out objects like the Address into tables of their own makes them easier to maintain and hence makes the mapping more flexible.
Consistency: Aggregated objects are not automatically deleted on deletion of the aggregating objects.
Performance: Requires a join operation or at least two database accesses, where Single Table Aggregation needs a single database operation.
Ad hoc implementation
Transparent and automated persistence layer
Repositories
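import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public Teacher loadMartin() {
    String url = "jdbc:mysql://serveur.info.univ-nantes.fr/ensdb";
    String query = "SELECT NAME, BIRTHDAY FROM Teacher WHERE id = 'Martin'";
    // JDBC 4+ drivers register themselves, so Class.forName is not needed.
    // try-with-resources closes the connection, statement and result set.
    try (Connection con = DriverManager.getConnection(url, "login", "password");
         Statement stmt = con.createStatement();
         ResultSet rs = stmt.executeQuery(query)) {
        // Move the cursor to the first row; return null when no row matches.
        if (!rs.next()) {
            return null;
        }
        return new Teacher(rs.getString("NAME"), rs.getDate("BIRTHDAY"));
    } catch (SQLException e) {
        // Do not swallow the error: report it to the caller.
        throw new IllegalStateException("Could not load teacher Martin", e);
    }
}
Fast implementation, suitable for small applications.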
The developer has complete control over queries.
Considerable dependency between domain classes and the database schema.
If the schema or DBMS changes, the code must change too.
An ORM Manager provides a transparent and automated persistence layer
Database independence.
Programmers ignore data schema details.
In the case of Java: the domain depends on abstractions (JPA) and not on the persistence framework
Considerable impact on performance if developers do not understand how the ORM works.
Difficult to optimize the database.
Create a repository for each aggregate
Design repositories as in-memory collections
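A minimal sketch of a repository designed as an in-memory collection. The `Order` aggregate root, the `OrderRepository` interface and its method names are hypothetical. A database-backed implementation would expose the same interface, so the domain code would not change.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical aggregate root: an order together with its lines.
final class Order {
    final long id;
    final List<String> lines = new ArrayList<>();
    Order(long id) { this.id = id; }
}

// The repository looks like a collection of aggregates to its clients.
interface OrderRepository {
    void add(Order order);
    Optional<Order> findById(long id);
    void remove(Order order);
    int size();
}

// In-memory implementation, useful for tests and as a reference for the contract.
final class InMemoryOrderRepository implements OrderRepository {
    private final Map<Long, Order> orders = new HashMap<>();

    @Override public void add(Order order) { orders.put(order.id, order); }

    @Override public Optional<Order> findById(long id) {
        return Optional.ofNullable(orders.get(id));
    }

    @Override public void remove(Order order) { orders.remove(order.id); }

    @Override public int size() { return orders.size(); }
}
```

The whole aggregate is added, found and removed as one unit; there is no separate repository for order lines.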
Domain classes
Impact on coupling?
Interfaces
Data Transfer Objects (DTO)
Who creates the DTOs?
Object Materialization and Dematerialization
Unique Identifier
Persistent Object States
Aggregates
Lazy Materialization
Mapping Classes to Tables
Mapping Associations to Tables