This activity is designed to help you understand the process of designing and constructing ERDs using Systems Architect. Entity Relationship Diagrams are a major data modeling tool and will help organize the data in your project into entities and define the relationships between the entities. This process has proved to enable the analyst to produce a good database structure so that the data can be stored and retrieved in a most efficient manner.
| 1. Identify Entities | Identify the roles, events, locations, tangible things or concepts about which the end-users want to store data. |
| 2. Find Relationships | Find the natural associations between pairs of entities using a relationship matrix. |
| 3. Draw Rough ERD | Put entities in rectangles and relationships on line segments connecting the entities. |
| 4. Fill in Cardinality | Determine the number of occurrences of one entity for a single occurrence of the related entity. |
| 5. Define Primary Keys | Identify the data attribute(s) that uniquely identify one and only one occurrence of each entity. |
| 6. Draw Key-Based ERD | Eliminate Many-to-Many relationships and include primary and foreign keys in each entity. |
| 7. Identify Attributes | Name the information details (fields) which are essential to the system under development. |
| 8. Map Attributes | For each attribute, match it with exactly one entity that it describes. |
| 9. Draw fully attributed ERD | Adjust the ERD from step 6 to account for entities or relationships discovered in step 8. |
| 10. Check Results | Does the final Entity Relationship Diagram accurately depict the system data? |
A company has several departments. Each department has a supervisor and at least one employee. Employees must be assigned to at least one, but possibly more departments. At least one employee is assigned to a project, but an employee may be on vacation and not assigned to any projects. The important data fields are the names of the departments, projects, supervisors and employees, as well as the supervisor and employee SSN and a unique project number.
The entities in this system are Department, Employee, Supervisor and Project. One is tempted to make Company an entity, but it is a false entity because it has only one instance in this problem. True entities must have more than one instance.
We construct the following Entity Relationship Matrix:
Dept. Employee Supervisor Project
Department is assigned run by
Employee belongs to Works on
Supervisor runs
Project Uses
We connect the entities whenever a relationship is shown in the entity Relationship Matrix.
______________ ______________
| Department |_________________Run by___________________| Supervisor |
| | | |
-------------- --------------
|
|
| ------------ -----------
|___Is_____| Employee |________Works on_____________| Project |
assigned | | | |
------------ -----------
From the description of the problem we see that:
______________ ______________ | Department | Run by | Supervisor | | |-++------------------------------------++-| | --|----------- -------------- W + | ------------ ----------- | Is | Employee | Works on | Project | --------+<-| |->+-----------------------0<-| | Assigned ------------ -----------
The primary keys are Department Name, Supervisor SSN, Employee SSN, Project Number.
There are two many-to-many relationships in the rough ERD above, between Department and Employee and between Employee and Project. Thus we need the associative entities Department-Employee and Employee-Project. The primary key for Department-Employee is the concatenated key Department Name and Employee SSN. The primary key for Employee-Project is the concatenated key Employee SSN and Project Number.
________________ _____________________
| Department | Run by | Supervisor |
|___Key Data___|-++----------------------------++-|___Key Data________|
|Dept Name[PK1]| |Supervisor SSN[PK1]|
--|------------- ---------------------
+
+
| -------------------- -------------------
| Is | Employee-Dept | involves | Employee |
--------+<-|____Key Data______|->+-----------------++-|_____Key Data____|
Assigned |Dept Name[PK1][FK]| |Employee SSN[PK1]|
|Emp. SSN[PK1][FK] | -------------------
-------------------- +
+
Works|on
|
0
M
-------------------- -------------------
| Project | Includes |Employee-Project |
|____Key Data______|++-------------------------+<-|____Key Data_____|
|Project Num.[PK1] | |Emp. SSN[PK1][FK]|
-------------------- |Proj Num[PK1][FK]|
-------------------
The only attributes indicated are the names of the departments, projects, supervisors and employees, as well as the supervisor and employee SSN and a unique project number.
Attribute Entity Attribute Entity Department Name Department Supervisor SSN Supervisor Employee SSN Employee Supervisor Name Supervisor Employee Name Employee Project Name Project Project ID Project
________________ _____________________
| Department | Run by | Supervisor |
|___Key Data___|-++----------------------------++-|___Key Data________|
|Dept Name[PK1]| |Supervisor SSN[PK1]|
--|------------- |____Non-Key Data___|
+ | Supervisor Name |
+ ---------------------
|
| -------------------- -------------------
| Is | Employee-Dept | involves | Employee |
--------+<-|____Key Data______|->+-----------------++-|_____Key Data____|
Assigned |Dept Name[PK1][FK]| |Employee SSN[PK1]|
|Emp. SSN[PK1][FK] | |___Non-Key DATA__|
-------------------- | Employee Name |
-------------------
+
+
Works|on
|
0
M
-------------------- -------------------
| Project | Includes |Employee-Project |
|____Key Data______|++-------------------------+<-|____Key Data_____|
|Project Num.[PK1] | |Emp. SSN[PK1][FK]|
|___Non-Key Data___| |Proj Num[PK1][FK]|
| Project Name | -------------------
--------------------
The final ERD seems to model the data in this system well.
A data entity is anything real or abstract about which we want to store data. Entity types fall into five classes: roles, events, locations, tangible things, or concepts. The best way to identify entities is to ask the system owners and users to identify things about which they would like to capture, store and produce information. Another source for identifying entities is to study the forms, files, and reports generated by the current system. E.g. a student registration form would refer to Student (a role), but also Course (an event), Instructor (a role), Advisor (a role), Room (a location), etc.
There are natural associations between pairs of entities. Listing the entities down the left column and across the top of a table, we can form a relationship matrix by filling in an active verb at the intersection of two entities which are related. Each row and column should have at least one relationship listed or else the entity associated with that row or column does not interact with the rest of the system. In this case, you should question whether it makes sense to include that entity in the system.
Using rectangles for entities and lines for relationships, we can draw an Entity Relationship Diagram (ERD). The Entity Relationship Diagram editor in Systems Analyst provides a tool for drawing this type of diagram. Since Systems Analyst is an integrated upper CASE tool, one can associate the entities in the diagram with their primary keys, foreign keys and other attributes identified in step 7 using the definition screen. See the Skill Exercises for instructions on how to do this.
At each end of each connector joining rectangles, we need to place a symbol indicating the minimum and maximum number of instances of the adjacent rectangle there are for one instance of the rectangle at the other end of the relationship line. The placement of these numbers is often confusing. The first symbol is either 0 to indicate that it is possible for no instances of the entity joining the connector to be related to a given instance of the entity on the other side of the relationship, 1 if at least one instance is necessary or it is omitted if more than one instance is required. For example, more than one student must be enrolled in a course for it to run, but it is possible for no students to have a particular instructor (if she is on sabbatical).
The second symbol gives the maximum number of instances of the entity joining the connector for each instance of the entity on the other side of the relationship. If there is only one such instance, this symbol is 1. If more than 1, the symbol is a crows foot opening towards the rectangle.
If you read it like a sentence, the first entity is the subject, the relationship is the verb, the cardinality after the relationship tells how many direct objects (second entity) there are.
I.e. A student is enrolled in one or more courses
subj verb objects
For each entity we must find a unique primary key so that instances of that entity can be distinguished from one another. Often a single field or property is a primary key (e.g. a Student ID). Other times the identifier is a set of fields or attributes (e.g. a COURSE needs a department identifier, a course number, and often a section number; a Room needs a Building Name and a Room Number). When the entity is written with all its attributes, the primary key is underlined.
Looking at the Rough Draft ERD, we may see some relationships which are non-specific or many-to-many. I.e., there are crows feet on both ends of the relationship line. Such relationships spell trouble later when we try to implement the related entities as data stores or data files, since each record will need an indefinite number of fields to maintain the many-to-many relationship.
Fortunately, by introducing an extra entity, called an associative entity for each many-to-many relationship, we can solve this problem. The new associative entity's name will be the hyphenation of the names of the two originating entities. It will have a concatenated key consisting of the keys of these two entities. It will have a 1-1 relationship with each of its parent entities and each parent will have the same relationship with the associative entity that they had with each other before we introduced the associative entity. The original relationship between the parents will be deleted from the diagram.
The key-based ERD has no many-to-many relationships and each entity has its primary and foreign keys listed below the entity name in its rectangle.
A data attribute is a characteristic common to all or most instances of a particular entity. In this step we try to identify and name all the attributes essential to the system we are studying without trying to match them to particular entities. The best way to do this is to study the forms, files and reports currently kept by the users of the system and circle each data item on the paper copy. Cross out those which will not be transferred to the new system, extraneous items such as signatures, and constant information which is the same for all instances of the form (e.g. your company name and address). The remaining circled items should represent the attributes you need. You should always verify these with your system users. (Sometimes forms or reports are out of date.)
For each attribute we need to match it with exactly one entity. Often it seems like an attribute should go with more than one entity (e.g. Name). In this case you need to add a modifier to the attribute name to make it unique (e.g. Customer Name, Employee Name, etc.) or determine which entity an attribute "best' describes. If you have attributes left over without corresponding entities, you may have missed an entity and its corresponding relationships. Identify these missed entities and add them to the relationship matrix now.
If you introduced new entities and attributes in step 8, you need to redraw the entity relationship diagram. When you do so, try to rearrange it so no lines cross by putting the entities with the most relationships in the middle. If you use a tool like Systems Architect, redrawing the diagram is relatively easy.
Even if you have no new entities to add to the Key-Based ERD, you still need to add the attributes to the Non-Key Data section of each rectangle. Adding these attributes automatically puts them in the repository, so when we use the entity to design the new system, all its attributes will be available.
Look at your diagram from the point of view of a system owner or user. Is everything clear? Talk through the Cardinality pairs with a team member. Also, look over the list of attributes associated with each entity to see if anything has been omitted.
Drivers (identified by DRIVER-NO, with each driver having a NAME, HOME-ADDRESS, and a DATE-OF-BIRTH) take out vehicles to make deliveries. A vehicle (identified by VEHICLE-NO, with each vehicle being of a particular MAKE and YEAR-OF-MANUFACTURE) may be taken out of a depot whenever available and kept out for any length of time (ranging from one or two hours to a number of days). It is possible for a vehicle to be taken out more than once on a given day. There is only one depot.
Each time a driver takes out a vehicle, he or she takes out a load made up of any QTY of any of a number of item types (identied by an ITEM-NO and having a COLOR, WEIGHT, and DESCRIPTION). Every time a vehicle is taken out, the driver can incur expenses of allowed types (e.g. fuel costs). Each expense type has a CODE-NO. The AMOUNT and CODE-NO are recorded each time an expense is incurred. One or more expenses of the same type may be incurred during the same trip. Any number of stops can be made during the trip. An ADDRESS of the stop is recorded for each stop, together with the QTY-LEFT of each item type left at the stop. A driver stops at an ADDRESS only once during every trip. However, stops can be made at the same ADDRESS on different trips.
The entities in this system are DRIVER (a role), VEHICLE (a tangible item), LOAD (an event), EXPENSE (an event), STOP (a location) and ITEM (a tangible item). Most of these entities are evident from the wording in the problem. What may not be clear is the need for both LOAD and ITEM. A LOAD is the contents of a truck on a single trip. A LOAD contains many ITEMS which are dropped off at the various STOPs. Note that DEPOT is not an entity since it has only one instance.
Driver Load Stop Vehicle Item Expense
Driver Drives
Load Carried in Consists of Incurs
Stop Gets
Vehicle Driven by Carries
Item Makes up Left at
Expense Incurred by
It may seem strange that we have related Expense with Load instead of with Driver. This is because the expenses are recorded and turned in for each load. Thus, they are associated with the trip more than with the driver or the vehicle.
______________ ______________
| Vehicle | Driven by | Driver |
| |->0------------------------------------0<-| |
--|----------- --------------
+
+
| ------------ -----------
| Carries | Load | Incurs | Expense |
--------0<-| |->+-----------------------0<-| |
------------ -----------
W
+
|
Made|up of
|
+
M
---------- ----------
| Item | Left At | Stop |
| |->+----------------------------0<-| |
---------- ----------
To save space, we filled in the Cardinality on the diagram above. The underlying assumption is that this system is working over time. I.e., drivers drive some vehicles one day and others another day. Similarly, a house may be a stop on more than one day, an item may be part of more than one load, etc.
The Primary Keys for the entities are as follows:
Entity Primary Key Entity Primary Key Driver DRIVER-NO Vehicle VEHICLE-NO Load DATE, TIME, VEHICLE-NO Item ITEM-NO Stop ADDRESS, DATE, TIME Expense CODE-NO, DATE, TIME
Note that there are three sets of Date/Time combinations. We have used hyphenated labels for date and time in the ERD below.
Most of the relationships in this ERD are many-to-many. We will need the associative entities VEHICLE-DRIVER, LOAD-EXPENSE, LOAD-ITEM and ITEM-STOP.
--------------------
Associated with | Driver-Vehicle | Requires
--------------------0<-|_____Key Data_____|->0--------------
| |Driver-No[PK1][FK]| |
+ |Vehicle-No[PK1][FK] +
_____+__________ -------------------- ________+______
| Vehicle | | Driver |
|___Key Data___| |__Key Data___|
|Vehicle-No[PK1] |Driver-No[PK1]
--|------------- ---------------
+
+
| ____________________ ____________________
| Carries | Load | Incurs | Load-Expense |
--------0<-|__Key Data________|-++-----------0<-|___Key Data_______|
|Load-Date [PK1] | |Expense-Date [PK1]|
|Load-Time [PK1] | |Expense-Time [PK1]|
|Vehicle-No[PK1][FK] |Code-No[PK1][FK] |
-------------------- |Load-Date [PK1] |
+ |Load-Time [PK1] |
+ |Vehicle-No[PK1][FK]
| --------------------
Made|up of W
| +
| involves
+ +
____M________________ ______+____________
| Load-Item | | Expense |
|_____Key Data______| |___Key Data______|
|Load-Date [PK1] | |Expense-Date [PK1]
|Load-Time [PK1] | |Expense-Time [PK1]
|Vehicle-No [PK1][FK] |Code-No [PK1] |
|Item-No [PK1][FK] |>+---- -------------------
--------------------- |has
+ +
+ ___+_________
| | Item |
Includes |__Key Data_|
| |Item-No[PK1]
+ -------------
____M_______________ ___________________
| Stop-Load-Item | Left At | Stop |
|_____Key Data_____|->+------------++-|____Key Data_____|
| Address [PK1] | | Address [PK1] |
|Load-Date [PK1] | | Stop-Date [PK1] |
|Load-Time [PK1] | | Stop-Time [PK1] |
|Vehicle-No[PK1][FK] -------------------
|Item-No [PK1][FK] |
|Stop-Date [PK1] |
|Stop-Time [PK1] |
--------------------
You may wonder why we used the Associative entity STOP-LOAD-ITEM instead of STOP-ITEM to connect STOP to the Items left at the stop. This is because all the items that will be left at a stop are contained in LOAD-ITEM as they must be contained in a single load. We will see another reason when we try to map the attributes in step 8.
The following data attributes were identified: DRIVER-NO, NAME, HOME-ADDRESS, DATE-OF-BIRTH, VEHICLE-NO, MAKE, YEAR-OF-MANUFACTURE, QTY, ITEM-NO, COLOR, WEIGHT, DESCRIPTION, CODE-NO, AMOUNT, ADDRESS, QTY-LEFT, ADDRESS, LOAD-DATE, LOAD-TIME, STOP-DATE, STOP-TIME, EXPENSE-DATE, EXPENSE-TIME.
Driver Vehicle Load Item
DRIVER-NO VEHICLE-NO VEHICLE-NO ITEM-NO
NAME MAKE LOAD-DATE COLOR
HOME-ADDRESS YEAR-OF-MANUFACTURE LOAD-TIME WEIGHT
DESCRIPTION
Stop Expense Vehicle-Driver Load-Item
ADDRESS CODE-NO VEHICLE-NO VEHICLE-NO
STOP-DATE EXPENSE-DATE DRIVER-NO LOAD-DATE
STOP-TIME EXPENSE-TIME LOAD-TIME
AMOUNT ITEM-NO
QTY
Stop-Load-Item Load-Expense
VEHICLE-NO VEHICLE-NO
LOAD-DATE LOAD-DATE
LOAD-TIME LOAD-TIME
ITEM-NO CODE-NO
ADDRESS EXPENSE-DATE
STOP-DATE EXPENSE-TIME
STOP-TIME
QTY-LEFT
Note that without the associative entities Load-Item and Stop-Load-Item, we would not have found a place for QTY and QTY-LEFT. That is because QTY depends on both LOAD and ITEM and QTY-LEFT depends on STOP and LOAD-ITEM. This is another reason for introducing Stop-Load-Item instead of Stop-Item as we first intended to do. It is theoretically possible for two trucks to pull up at the same time and each deliver the same item, so QTY-LEFT depends on the Load as well as the Stop and the Item.
This ERD looks just like the Key-Based ERD above except that each entity will have a Non-Key Data section in which all the attributes in the table above that were not listed in the Key Data section will be listed.
A swim meet director needs to assign swimmers to events and keep track of those of different age groups belonging to swim clubs. A swimmer can belong to only one club at a time and is always attached to a club. A swimmer must be entered in at least one event. Each swimmer is in exactly one age group. An event must have at least 2 but there is no upper limit to the number of swimmers who can enter the event. The attributes of the entities swimmer, event, club, and age group are as follows:
Entity: SWIMMER Entity: CLUB
*USS-NUMBER *CLUB-CODE
NAME CLUB-NAME
AGE HEAD-COACH
SEX CLUB-ADDRESS-GROUP
CLUB-CODE CLUB-STREET-ADDRESS
EVENT-GROUP (repeats 1 - n times) CLUB-CITY
EVENT-NUMBER CLUB-STATE
EVENT-DESCRIPTION CLUB-ZIP-CODE
SEED-TIME CLUB-SCORE (sum of the swimmer
ACTUAL-TIME points for swimmers in this club)
EVENT-POINTS
SWIMMER-POINTS (sum of event points)
Entity: EVENT Entity: AGE-GROUP
*EVENT-NUMBER *SEX + LOWER-AGE + UPPER-AGE
SEX DESCRIPTION
AGE-RANGE
DESCRIPTION
The Asterisks mark the primary key attributes. AGE-GROUP has a
concatenated primary key.
The nullity field is for you to specify if it is OK for the field to have nothing in it. For a primary key you want to specify NOT NULL. Other attributes may be null. Skip the NOT NULL WITH DEFAULT for now.
For now skip to the description box and type in a short description of the attribute.
For a group attribute, like CLUB-ADDRESS-GROUP, choose the STRUC
button instead of the ELEM one. This brings up another definition screen
with an attribute window where you can type the attributes that make up
the group and then highlight each of them as an ELEM type.
Return to the
List of Activities