The rejoining algorithm might not use all of the given relations: it begins
with the relation with the largest number of records, then joins it with enough
relations to contain all of the present attributes. The relations joined are
not limited to those that the starting relation is linked to by foreign keys,
since in some cases this constraint would make it impossible to rejoin with
all of the present attributes.
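The selection described above can be illustrated with a minimal sketch. It is
not the actual implementation: the names natural_join and rejoin_sketch, the
example relations, and the rule for picking the next relation (the first unused
one that shares an attribute with the result so far and adds a new attribute)
are all assumptions made for illustration. Relations are assumed to be
non-empty lists of records.

```python
def natural_join(left, right):
    """Natural join of two lists of dict records on their shared attribute names."""
    if not left or not right:
        return []
    shared = set(left[0]) & set(right[0])
    return [
        {**l, **r}
        for l in left
        for r in right
        if all(l[a] == r[a] for a in shared)
    ]


def rejoin_sketch(relations):
    """Greedy rejoin: start from the largest relation, then join further
    relations only until every attribute present in the input is covered."""
    all_attrs = set()
    for records in relations.values():
        all_attrs |= set(records[0])

    # Begin with the relation holding the most records.
    start = max(relations, key=lambda name: len(relations[name]))
    result = relations[start]
    used = [start]
    covered = set(result[0])

    while covered != all_attrs:
        # Assumed selection rule: first unused relation that overlaps the
        # covered attributes and contributes at least one new attribute.
        candidates = [
            name
            for name, records in relations.items()
            if name not in used
            and set(records[0]) - covered
            and set(records[0]) & covered
        ]
        if not candidates:
            raise ValueError("cannot cover all attributes by joining")
        chosen = candidates[0]
        result = natural_join(result, relations[chosen])
        used.append(chosen)
        covered |= set(relations[chosen][0])
    return result, used


# Hypothetical example database.
relations = {
    "visit": [
        {"visit_id": 1, "patient": "a", "clinic": "x"},
        {"visit_id": 2, "patient": "b", "clinic": "x"},
        {"visit_id": 3, "patient": "a", "clinic": "y"},
        {"visit_id": 4, "patient": "a", "clinic": "x"},
    ],
    "patient": [{"patient": "a", "age": 30}, {"patient": "b", "age": 41}],
    "clinic": [{"clinic": "x", "city": "Leeds"}, {"clinic": "y", "city": "York"}],
    "patient_clinic": [
        {"patient": "a", "clinic": "x"},
        {"patient": "b", "clinic": "x"},
        {"patient": "a", "clinic": "y"},
    ],
}

rejoined, used = rejoin_sketch(relations)
print(used)
```

In this example the patient_clinic relation is never joined, because its
attributes are already covered by the starting relation.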
Since the algorithm may not use all of the given relations, it may
ignore some types of database inconsistency, where different relations hold
data inconsistent with each other. In this case, the rejoining will be lossy.
Rejoining the results of reduce can also be lossy.
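A small sketch of how an unused relation can make the rejoin lossy follows. The
data and the helper project are hypothetical, and the skipping of the second
relation is simulated by hand rather than produced by the real algorithm.

```python
def project(records, attrs):
    """Distinct projection of dict records onto attrs, as a set of tuples."""
    return {tuple(record[a] for a in attrs) for record in records}


# Hypothetical relation chosen as the starting point.
visits = [
    {"visit_id": 1, "patient": "a", "clinic": "x"},
    {"visit_id": 2, "patient": "b", "clinic": "x"},
    {"visit_id": 3, "patient": "a", "clinic": "y"},
    {"visit_id": 4, "patient": "a", "clinic": "x"},
]

# Hypothetical second relation that disagrees with visits: it records the
# pair (b, y), which no visit supports, and lacks (a, y).
patient_clinic = [
    {"patient": "a", "clinic": "x"},
    {"patient": "b", "clinic": "x"},
    {"patient": "b", "clinic": "y"},
]

# visits already contains every attribute, so a rejoin that stops once all
# attributes are covered never consults patient_clinic (simulated here).
rejoined = visits

stored = project(patient_clinic, ["patient", "clinic"])
recovered = project(rejoined, ["patient", "clinic"])

# Records held in the database that cannot be recovered from the result.
lost = stored - recovered
print(lost)
```

The inconsistency between the two relations goes unnoticed, and the pair held
only in the skipped relation is absent from the rejoined result.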
Due to the above issues, the algorithm will be changed to use all of the
relations in the future.
Not all databases can be represented as a single data frame. A simple example
is any database where the same attribute name is used for several different
sources of data, since rejoining results in inappropriate merges.
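As a sketch of such an inappropriate merge, assume two hypothetical relations,
people and cities, that both use the attribute name "name" for unrelated data.
The natural_join helper is an illustrative stand-in for the join step, not the
actual implementation.

```python
def natural_join(left, right):
    """Natural join of two lists of dict records on their shared attribute names."""
    if not left or not right:
        return []
    shared = set(left[0]) & set(right[0])
    return [
        {**l, **r}
        for l in left
        for r in right
        if all(l[a] == r[a] for a in shared)
    ]


# "name" means a person's name here...
people = [
    {"person_id": 1, "name": "Lincoln", "city_id": 10},
    {"person_id": 2, "name": "Ada", "city_id": 10},
]

# ...and a city's name here.
cities = [{"city_id": 10, "name": "Lincoln"}]

# The join matches on both city_id and name, so a person is kept only if
# their own name happens to equal the name of their city.
joined = natural_join(people, cities)
print(joined)
```

Only the person whose name coincides with the city name survives the join, and
the single "name" column in the result can no longer distinguish the two
meanings.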