- (Brandon, 2005) ⇒ Daniel Brandon. (2005). “Recursive Database Structures.” In: Journal of Computing Sciences in Colleges, 21(2).
Recursion has always been an interesting topic in computer science, and one of the most difficult data structures to implement in modern relational database systems is recursive relationships. In a recursive relationship one type of entity is related in some way to other entity instances of the same type. Since a relationship exists, referential integrity constraints must be enforced and processing must be developed to follow the bi-directional relationship in both mapping directions. For non-recursive relationships, referential integrity constraint enforcement and bi-directional information flows have been automated to a great extent in modern database products; but this automation is not available for the recursive case. This paper discusses this problem, presents proposed solutions, and illustrates a practical application of such a solution.
Recursive database structures, like binary relations, can appear in three ways as shown in the E-R diagrams of Figure 1. Here the common names of the three types (one-to-one, one-to-many, and many-to-many) are based upon the maximum cardinality in each mapping direction. An example of a one-to-one recursive relationship would be the relationship between members in a club where each member could sponsor one other member and each member was sponsored by one other member. A common example of a one-to-many recursive relationship is a set of general ledger accounts where each account has one master account (except at the top of the hierarchy) and an account could have many subsidiary accounts (except at the bottom of the hierarchy). An example of a many-to-many recursive relationship is a “parts explosion" where each part could be composed of many other parts, and each part could be a part of many other parts.
A many-to-many situation can be broken down into two one-to-many situations with the insertion of an “intersection table ", and the one-to-one is a simplification of the one-to-many. So the general structure that must be dealt with is that of a 1-to-many recursive relationship or a so called “forest " or " tree-structure ". This is shown in Figure 2, with typical mapping name for the two directions.
BASIC RELATIONAL DATABASE IMPLEMENTATION
Recursive database structures have always been one of the more problematic issues in database implementation for several reasons. One reason is that referential integrity cannot be set up directly in many products. The " textbook " approach to represent the tree structure via relational tables uses a foreign key mapped from the " many " side of the relationship to the "one" side of the relationship (Kroenke, 2006):
Here " entityCode " is the primary key (underlined) and “entityMasterCode" is the foreign key (shown in italics) of table (mathematical relation) XXX. In modern relational database products, for binary relationships, one can use a visual window and " drag " a primary key into a related table field to indicate a foreign key, and then check off some boxes to indicate the type of integrity constraints needed. In most products one can also describe primary and foreign key relationships and constraints via SQL (as shown later in this paper). However for recursive relationships, the foreign key is not always required such as in the forest of Figure 3. In addition there are additional constraints of the foreign key when it is required, such as the foreign key reference is at a lower level of the tree / forest. Thus one is forced to hand code referential integrity checking via the scripting language available with a database product or with a third generation language (C + +, Java, PHP) in the forms processing associated with the database adds, deletes, and modify operations.
The most difficult obstacle with recursive relations concerns the navigation in each mapping direction and with a "rollup" process. Consider the case of a general ledger hierarchical chart of accounts as illustrated in Figure 3. Here we need to be able to easily retrieve the master account for a given account and to retrieve the subsidiary accounts of a given master account. In addition we need to be able to efficiently "rollup" information from the bottom of the hierarchy to the top. Here debits and credits are posted to the “activity " accounts at the bottom of the hierarchy and these need to be successively added to balances up though the hierarchy.
There are a number of ways to represent the tree structure via relational tables. The textbook method where a foreign key is mapped from the "many" side of the relationship to the "one" side of the relationship was previously shown and this is also called the "adjacency list model" of tree structures. The adjacency model suffers from a number of possible " anomalies ", which can be minimized if primary key codes are not allowed to be changed. The SQL to create this table would look something like:
CREATE TABLE XXX (entityCode char(20) NOT NULL PRIMARY KEY, … [non-key attributres] entityMasterCode char(20) DEFAULT NULL REFERENCES XXX(entityCode));
Depending upon whether this was a “tree” (only one node without a master) or a “forest” (multiple nodes at the top without masters)]], applicable constraints would also need to be added. This method is the easiest to implement for inserts into the XXX table and for certain simple queries. If the entityCode being inserted is not at the top of the hierarchy (has a master), then referential integrity must be enforced, that is the foreign key (master) must exists before the subsidiary code was inserted; insertions also need to
|2005 RecursiveDatabaseStructures||Daniel Brandon||Recursive Database Structures||2005|