Back Next

Database, Normalisation & SQL coding

Differences between data stored in files and in Databases

There are 4 things you need to comment in the exam

Data integrity

Databases maintains the integrity of the data stored by providing validation and preventing any corrupted or incomplete data

The file just stores the data which it is being supplied with and sometimes the data could be corrupted or overwritten

Data privacy

Databases allow access rights which limits the users access to the data stored in the databases

Files do not have any proper access rights. However, the only method to restrict access to data is to make copies of the file and distribute the edited files to the individuals which contain the data only for that particular user

Data redundancy

If files are distributed and copied then data is unnecessarily redundant and so this will take unneccessary storage

The database organises the data in the most efficient way possible to remove data redundancy. And also many different users of different access rights can access the same and only database

Data dependancy

When data is stored in a file the structure of the file data must be known but, the databases uses queries to store and retrienve data without knowing how data is stored in the database(this is known as the external level).

Also when storing new data the programs may have to be changed where as in databases this is handled by the DBMS software and the users just has to add new data

Relational databases

You may have heard what databases are, it contains tables which store data

However, in AS we need to know what relational databases are

A flatline table is table which is considered to be a stand alone table and has no relation with other tables

In the relational database model the tables have links between other tables

You will need to know some new terms in this model

Relation

A Special type of table used in relational databases

Tuple

Stores data of a particular instance or item in a relation

This is the same as a row or a record

Attribute

Stores data of a particular property of an item or instance in a relation

It is same as a column or a field

Candidate key

One or more key attributes which contain unique values for each tuple

In the table(relation) there might be attributes which contain unique values. So this could be used to uniquely identify each record or tuple

Primary key

This is a candidate key which is used to uniquely identify each tuple

Each table must have a primary key as it prevents data redundancy as each tuple in the relation is automatically unique when the primary key is unique

So as we can see the primary key is chosen from the candidate keys

Secondary key

A candidate key is not used as a primary key

This is mainly used for indexing which we will learn later

Foreign key

A key which is used to link to a primary key of another table

This is how the relationship between tables are made

Referential integrity

The use of a foreign key ensures that a value can only be entered in one table if the same value already exists in the referenced table

This means that foreign key of a table can not be different from the primary key of the referenced table

This ensure that the values in the foreign key must match the primary key values

Firstly, the attributes are made, then the foreign and primary key are defined

Also the primary key can not be updated or changed/deleted if the foreign can't refer to the primary key

Entity relationship modelling

These are represented using E-R diagrams

An entity is an object,thing or person which can store data in fields and rows

So an entity has to have many instances or items to be displayed as a table

In relational databases the tables must be linked so the E-R modelling helps us to find the link between the entities

The exam will us scenario where a database is required to store data

1. First we need to identify the entities

2. Identify the type of relationship. Remember one to one relationships can be ignored

Cardinals or relationship

When finding the relationship between two entities we need to use logic

There are 4 types of relationships between entities and for each we will give an example

relationships in databases. One to one and one to many and many to many

However we could also add some additional information

These are detailed E-R diagrams which has the maximum and minimum values for each entity also. We can see a zero meaning it could have a mimimum of zero records

This | symbol means that the minimum value is one

Detailed E-R diagrams are not usually tested so you can ignore them but you need to know how to draw a normal one

One to One

It means one instance is linked to one instance in the other table

One to many(vice versa)

An example is when one band contains many members

Many to many

We usually say it like this

One is to many and many is to one

For example like a auction center where many people can bid for a single product(instance) and also many products could be auctioned by a single person

Forming the relationships are required and need practice

For example what is the relationship between a BAND and the MEMBER entities?

It is one to many

Representing this relationships in a diagram is called the E-R diagram and they usually give alot of hints

The E-R diagram doesn't just contain a relationship between just two entities but many entities

There are few things you need to remember everytime you see these questions

The table/entity which contains the foreign key is the Many side and the table/entity which contains the primary key is the one side. ALways remember this

No matter what keep this in your mind

Many to Many relationships

It is very hard to represent a many to many relationship in practical databases using foreign keys and primary keys - as then both tables must have foreign keys which is not right

So we make an intermediate table called a link entity which joins two entities which has many to many relationships

So the Link entity table is a separate table which has two foreign keys which are used to link the primary keys of each entities

The two foreign keys are both used as primary keys

We will see that more in normalisation

Normalisation

Normalisation is very similar to the E-R diagrams however, in normalisation this is organising already recorded data in the most efficient way possible whereas E-R diagrams are made before data is recorded

Usually groups of data in a flat line table or a stand alone table may have repeating groups and other inefficient things. Normalisation helps to break the single table to several tables called entities and links them together

There are 3 steps in normalisation:

1 NF or 1st Normalisation

The relations can't have repeating groups and so repeating groups are removed

Repeating groups are when the attributes have more than one set of the same values

Like the below example

Surname	Department	Payrate	Country	City
Donaldson	Accounts	$25	USA	New york
Newton	HR	$28	UK	London
Xian	Accounts	$25	China	Shanghai

We can see Donaldson and Xian have common data

The attributes or fields which have the same value are grouped together in a non repeated table and it is linked to another table which contains an attribute which is repeating

So the relationship is one to many

Let's see the same example. This is in the 1NF

Jobdetails
Department	Payrate
Accounts	$25
HR	$28

So the repeating groups are collapsed together

This table is then linked to a separate table.

Employeedetails
Department	Surname	Country	City
Accounts	Donaldson	USA	New york
HR	Newton	UK	London
Accounts	Xian	China	Shanghai

As we can see the 2nd table has the repeating groups for the department field but not for the pay rate. So this ensures data redundancy is minimized as possible and it is very significant in large sets of data where many attributes have repeating groups

Also we can see the 1st table could be called the Jobdetails entity which the department field acts as the Primary key. The Employeedetails entity is referenced to the Jobdetails entity by using the foreign key department

So again by using the rule, the place which has the foreign key is the many sides and the one with the primary key is the one side

The relationship is one to many

Another point to remember is that the foreign key of the Employeedetails entity is department but what is the primary key. As the department field is not unique it is not enough to uniquely identify each record. So we need to define another key as a primary key.

The last name can be used as the primary key. However, we use both the departments and the last name as primary keys. This is because the foreign key of the new table must be also the primary key. So we will use the surname and departments as both primary keys

2 NF or 2nd Normalisation

A rule in the relational database model is that non-key attributes or the other attributes must depend fully on the primary key or keys. This means we must be able to identify each record uniquely depending on the primary keys only

However, sometimes the other attributes may not be dependant on both primary keys but only a single primary key. This is known as partial dependancy

I just realized the example above is very poor and just focus on the method. So I will state that the Country and city are dependants on the surname but not on the department key

The 2nd normal form removes any partial dependency and it must be already in the 1 NF to do this

Jobdetails - Surnameorigin
Department	Surname
Accounts	Donaldson
HR	Newton
Accounts	Xian

Surname origin
Surname	Country	City
Donaldson	USA	New york
Newton	UK	London
Xian	China	Shanghai

As we see we made a separate table so that the country and the city are dependant on the primary key only. In the Surname origin table the primary key is the surname. Now in the previous table, the name changes from Employeedetails to Jobdetails - surnameorigin

This example is completely fictional and i just created it.

Now in the 2nd table(jobdetails-surnameorigin) both the Surname and Department keys are foreign keys as they link between the two tables

They are also both Primary keys

The relationship is many to one

This is the same as the link entity method to display a many to many relationship logically/practically

3 NF or 3rd Normalisation

Again back to the point that the non key attributes must depend only on the primary key.

Sometimes the non key attributes could be dependant on another non key attribute

For example the city depends on the country and so the 3rd normal form removes any Transitive dependancy(non - key dependancy). So a 4th table is made

Surnameorigin
Surname	Country
Donaldson	USA
Newton	UK
Xian	China

Countrydetails
Country	City
USA	New york
UK	London
China	Shanghai

As we can see we made a separate table so the city field is only dependant on the country field

The primary key of this table is Country and the foreigh key is in the previous table

The relationship is many to one

Normalisation is a method and it is very tricky. But just remember these two principles

Normal forms can't have repeating groups or any partial or transitive dependancy

Database Management System (DBMS)

The database management system can be divided to 3 levels

External level

This is the level used by the external users of the data in the database. Where they store and retrieve data without any knowledge of the structure of the database. They have access rights to control which data they are able to access

The employees in a company use this level

Conceptual level

This is the level used by the DBA (Database adminstrator) who creates tables and databases. The DBA is also responsible in setting access rights for the users in the external level. The DBA uses the DBMS software to perform these functions

Internal level

This is the level handled by the DBMS software which controls the storage of data in the physical storage(internal schemes). It controls where data is stored in the storages and this is the level where data is directly accessed or stored

DBA

This is the person who is in charge of the conceptual level and uses the DBMS software to create the structure of the database

There are some tasks which the DBA is in charge of and you need to know these

1. Setting up Regular and incremental Back ups

2. Setting up the access rights for users in the external level

3. Creating databases and tables and their relationships

4. Data dictionary and indexing function could be used by the DBA

DBMS Software

This software controls the whole database and allows the DBA to create tables and the external usere to store or retrieve data

In other words the DBMS software handles the internal operations of the database

Usually the DBA doesn't require to know SQL to be able to create the databases or to access or store data

The Developer interface provides an interface for the DBA to create the structures easily

The Query processor is a special software which is used to retrieve/ manipulate data in the database

You will need to know the features provided by the DBMS software

1. Performs daily and incremental backups

2. Sets access rights for each user

3. Validates data when entered - maintains data integrity

4. Indexing function to access data faster

5. Data dictionary - which gives the detal of how the data in the database are stored in the physical memory to the DBA

Indexing

A small secondary table which contains an attribute of unique values so it can point to the tuple of the table much faster

This concept makes accessing data from a large table much faster

A separate table called an index table contains an attribute of the table which contains unique values(candidate key)

We usually use the secondary key for this , so the index table points it to the corresponding tuple for fast access

SQL

We need to learn some scripting of the SQL language

SQL is divided into two categories

Data Definition Language (DDL)

To define the structure of the database

Data Manipulation Language (DML)

In manipulating the data in the database such as storing and retrieving data from the database

In reality we don't really separate the language to these two parts

We will show an example for each

DDL

We will see an example

Creating Database structures:

CREATE DATABASE ClientDatabase;

CREATE TABLE Clientdetails(PRIMARY KEY(Client ID varchar(10))
,ClientName varchar(25),DOB Date,Subscription boolean);

ALTER TABLE Client-booking ADD PRIMARY KEY(Booking ID varchar(25))
ADD FOREIGN KEY(ClientID REFERENCES Clientdetails(Client ID);

As you can see we need to remember some points

SQL is Case Insensitive

The program statements can go for many lines but each statement must end with ;

We use the Keywords DATABASE and TABLE only in DDL and not in DML

So when we want to define an attribute as a primary key we need to put it in brackets

The ALTER keyword is used to alter an existing table such as adding a new field

For each attribute we need to define the data type, databases don't contain the datatype STRING so we use varchar(x) where x is number of characters

The Database follows all other datatypes except STRING

DML

Adding data to the table

INSERT INTO Clientdetails(Client ID, DOB)
VALUES (2345, 23/7/2002);

There is another way of entering but this is the correct one

The clientdetails is the table name and inside the brackets are the field names

The values must be in the same order as the fields in the brackets

Retrieving data from the database

SELECT DOB
FROM Clientdetails
WHERE Client ID = 2345;

So this will return 23/7/2002

We need to select which field name you need to be displayed and from which table and also the where statement gives us the condition or else the whole DOB attribute will be displayed

SELECT DOB
FROM Clientdetails;

Has no condition so it displays all the values in DOB attribute/field

We can also display all the fields using the *

SELECT *
FROM Clientdetails
WHERE ClientID = 2345;

So this displays all the fields in the table clientdetails

If we need to have some format of displaying the data we can either use:

SELECT DOB
FROM Clientdetails
WHERE ClientID = 2345
ORDER BY DOB ASC|DSC;

So this orders the DOB in order of the Date. The default is Ascending Order

or we could use:

SELECT DOB
FROM Clientdetails
WHERE ClientID = 2345
GROUP BY Department;

This means that the date of birth is displayed in groups of department

If the department has repeating groups then this could be used

The WHERE condition could be further used to compare field values of different tables and not only in the same table

SELECT DOB
FROM Clientdetails
INNER JOIN Client-booking
WHERE Clientdetails.ClientID = 2345 AND Client-booking.BookingID = 2345;

So this is used to display the field value when both conditions of two different table values are correct

We need to use INNER JOIN to join the other table. We also need to use a dot to show it is a field of a particular table

Clientdetails.ClientID means the field ClientID from the table Clientdetails

Updating/Changing existing data

If we need to change the existing data in the table

UPDATE CLientdetails
SET ClientID = 2678
WHERE ClientID = 2345;

So this changes the clientID which was 2345 to 2678

Deleting data from the table

DELETE FROM Clientdetails
WHERE ClientID = 2678;

If the client is no longer in the business then we need to delete it them from our table

This deletes the whole record which has the the ClientID 2678

Always remember the foreign key records must be first deleted before the primary key records are deleted in the referenced table (referential integrity)

Summary

I will give a small summary of each keywords

CREATE - DDL for creating databases or tables

DATABASE or TABLE - DDL used for defining the name

ALTER - DDL change the structure of an existing table such as adding a new field/attribute

ADD - DDL used in adding new fields

INSERT - DML inserting new data to a table

VALUES - DML order of the values to be inserted in to the corresponding fields

SELECT - DML retrieving data

FROM - DML Identifys from which table

WHERE - DML conditions

GROUP BY - DML groups displayed data

ORDER BY - DML orders displayed data

INNER JOIN - DML used to compare field values from two different tables

UPDATE - DML used to update a table

SET - DML sets a field to a value

DELETE - DML deletes a record from a table

Back Next

Revise Now

GCE Subjects

Coding Lessons

More

What do you want to do?

Computer Science AS

AS Practical

Computer Science A2

A2 Practicals