Previous Topic: Entity and Non-Key AreasNext Topic: Alternate Key Attributes


Primary Key Selection

Choosing the primary key of an entity is an important step that requires serious consideration. Before you actually select a primary key, consider several attributes, which are referred to as candidate key attributes. Typically, the business user who knows the business and business data can help identify candidate keys.

For example, to use the EMPLOYEE entity in a data model (and later in a database) correctly, you uniquely identify instances. In the customer table, you could choose from several potential key attributes including: the employee name, a unique employee number assigned to each instance of EMPLOYEE, or a group of attributes, such as name and birth date.

The rules that you use to select a primary key from the list of all candidate keys are stringent. The rules can be consistently applied across all types of databases and information. The rules state that the attribute or attribute group must:

Example:

Consider which attribute you would select as a primary key from the following list of candidate keys for an EMPLOYEE entity:

If you use the rules in the preceding list to find candidate keys for EMPLOYEE, you could compose the following analysis of each attribute:

After analysis, there are two candidate keys. One is “employee number” and the other is the group of attributes containing “employee name” and “employee birth date.” “employee number” is selected as the primary key because it is the shortest and helps ensure uniqueness of instances.

When choosing the primary key for an entity, data architects often assign a surrogate key. A surrogate key is an arbitrary number that is assigned to an instance to identify it within an entity uniquely. “employee number” is an example of a surrogate key. A surrogate key is often the best choice for a primary key. A surrogate key is short, can be accessed the fastest, and helps ensure unique identification of each instance. The system can also automatically generate surrogate keys so that numbering is sequential and does not include any gaps.

A primary key chosen for the logical model is not always the primary key used to access the table efficiently in a physical model. The primary key can be changed to suit the needs and requirements of the physical model and database at any point.