Repeating Data Groups

Repeating data groups can be defined as lists, repeating elements, or internal structures inside an attribute. This structure, although common in legacy data structures, violates first normal form and must be eliminated in an RDBMS model. An entity is in first normal form if each of its attributes has a single meaning and holds no more than one value for each instance. The relational model does not handle variable-length repeating fields, because it offers no way to subscript through arrays of this type. The entity below contains a repeating data group, "children's-names."

Repeating data groups, as shown below, present problems when defining a database to contain the actual data. For example, after designing the EMPLOYEE entity, you are faced with the questions, "How many children's names do you need to record?" "How much space should you leave in each row in the database for the names?" and "What will you do if you have more names than remaining space?"

The following sample instance table might clarify the problem:

EMPLOYEE

emp-id   emp-name   emp-address   children's-names
------   --------   -----------   ----------------
E1       Tom        Berkeley      Jane
E2       Don        Berkeley      Tom, Dick, Donna
E3       Bob        Princeton     -
E4       John       New York      Lisa
E5       Carol      Berkeley      -
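To make the problem concrete, here is a minimal sketch in Python using the standard sqlite3 module. It loads the sample rows above into a single unnormalized table. The snake_case column names, the use of NULL for the "-" entries, and the helper children_of are assumptions made for illustration and are not part of the original design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE employee (
           emp_id         TEXT PRIMARY KEY,
           emp_name       TEXT,
           emp_address    TEXT,
           children_names TEXT  -- repeating group packed into one column
       )"""
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [
        ("E1", "Tom", "Berkeley", "Jane"),
        ("E2", "Don", "Berkeley", "Tom, Dick, Donna"),
        ("E3", "Bob", "Princeton", None),  # assumption: "-" stored as NULL
        ("E4", "John", "New York", "Lisa"),
        ("E5", "Carol", "Berkeley", None),
    ],
)

# A substring search is the only way to look inside the packed list,
# and it is unreliable: "Don" also matches "Donna".
naive = conn.execute(
    "SELECT emp_id FROM employee WHERE children_names LIKE '%Don%' "
    "ORDER BY emp_id"
).fetchall()


def children_of(emp_id):
    """Hypothetical helper: split the packed list in application code."""
    row = conn.execute(
        "SELECT children_names FROM employee WHERE emp_id = ?", (emp_id,)
    ).fetchone()
    if row is None or row[0] is None:
        return []
    return [name.strip() for name in row[0].split(",")]
```

Here naive reports E2 even though no child is named Don, and every caller has to know the separator convention that children_of encodes. Both are symptoms of storing more than one value in a single attribute.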

To fix the design, the list of children's names must be removed from the EMPLOYEE entity. One way to do this is to add a CHILD table that holds the information about each employee's children.

Once that is done, you can represent the names of the children as single entries in the CHILD table. In terms of the physical record structure for EMPLOYEE, this resolves the questions about space allocation: no space is wasted in the record for employees who have no children, and you no longer have to decide how much space to reserve for employees with large families.
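One possible way to express the two-entity design is sketched below in Python with the standard sqlite3 module. The column types, the composite key (emp_id, child_id), the foreign key, and the function name create_schema are assumptions chosen for illustration; the text specifies only the entity and attribute names.

```python
import sqlite3


def create_schema(conn):
    """Create EMPLOYEE and CHILD tables (illustrative schema, not prescriptive)."""
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        """CREATE TABLE employee (
               emp_id      TEXT PRIMARY KEY,
               emp_name    TEXT NOT NULL,
               emp_address TEXT
           )"""
    )
    conn.execute(
        """CREATE TABLE child (
               emp_id     TEXT NOT NULL REFERENCES employee(emp_id),
               child_id   TEXT NOT NULL,
               child_name TEXT NOT NULL,
               PRIMARY KEY (emp_id, child_id)  -- assumption: ids restart per employee
           )"""
    )


conn = sqlite3.connect(":memory:")
create_schema(conn)
conn.execute("INSERT INTO employee VALUES ('E2', 'Don', 'Berkeley')")
# One row per child: no list, no reserved space, no upper limit.
conn.execute("INSERT INTO child VALUES ('E2', 'C1', 'Tom')")
conn.execute("INSERT INTO child VALUES ('E2', 'C2', 'Dick')")
```

An employee with no children simply has no rows in child, and adding another child is one more insert rather than a change to the EMPLOYEE record layout.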

The following tables are the sample instance tables for the EMPLOYEE-CHILD model:

EMPLOYEE

emp-id   emp-name   emp-address
------   --------   -----------
E1       Tom        Berkeley
E2       Don        Berkeley
E3       Bob        Princeton
E4       John       New York
E5       Carol      Berkeley

CHILD

emp-id   child-id   child-name
------   --------   ----------
E1       C1         Jane
E2       C1         Tom
E2       C2         Dick
E2       C3         Donna
E4       C1         Lisa

This change is the first step toward a normalized model: conversion to first normal form. Both entities now contain only fixed-length fields, which are easy to understand and program.
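As a rough illustration of why the single-valued form is easier to program against, the sketch below loads the sample data into two tables and queries them with Python's standard sqlite3 module. The rows are derived from the original unnormalized EMPLOYEE sample table; the snake_case names and the per-employee child_id numbering are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (
        emp_id      TEXT PRIMARY KEY,
        emp_name    TEXT,
        emp_address TEXT
    );
    CREATE TABLE child (
        emp_id     TEXT REFERENCES employee(emp_id),
        child_id   TEXT,
        child_name TEXT,
        PRIMARY KEY (emp_id, child_id)
    );
    """
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [
        ("E1", "Tom", "Berkeley"),
        ("E2", "Don", "Berkeley"),
        ("E3", "Bob", "Princeton"),
        ("E4", "John", "New York"),
        ("E5", "Carol", "Berkeley"),
    ],
)
conn.executemany(
    "INSERT INTO child VALUES (?, ?, ?)",
    [
        ("E1", "C1", "Jane"),
        ("E2", "C1", "Tom"),
        ("E2", "C2", "Dick"),
        ("E2", "C3", "Donna"),
        ("E4", "C1", "Lisa"),
    ],
)

# Exact-match lookup: which employees have a child named Tom?
parents_of_tom = conn.execute(
    """SELECT e.emp_id, e.emp_name
       FROM employee AS e
       JOIN child AS c ON c.emp_id = e.emp_id
       WHERE c.child_name = ?
       ORDER BY e.emp_id""",
    ("Tom",),
).fetchall()

# Children per employee; LEFT JOIN keeps employees with no children.
child_counts = conn.execute(
    """SELECT e.emp_id, COUNT(c.child_id)
       FROM employee AS e
       LEFT JOIN child AS c ON c.emp_id = e.emp_id
       GROUP BY e.emp_id
       ORDER BY e.emp_id"""
).fetchall()
```

Both queries compare whole values rather than searching inside a packed string, so the employee named Tom (E1) and the child named Tom (child of E2) are never confused.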