Wednesday, 9 April 2008

Lecture Review Week 11

This shall be the last weekly review for this semester since the examinations will only be testing on topics till this week, and some of the modules have already finished the lecture series.


Database --- The last topic of this module is Normalization. The motivation for this is to remove anomalies in the relations by decomposing the table to smaller tables. For this module, we only need to learn 2NF, 3NF, and BCNF.

Relations in BCNF are certainly in 3NF and 2NF as well. Here's the conditions for each of the three cases:

2NF: for functional dependency in the relation like this {A}->{B}
A is not a proper subset of the candidate key or
B is a part of any candidate key

3NF: for functional dependency in the relation like this {A}->{B}
A is a super key or
B is a part of some candidate key

BCNF(Boyce Codd Normal Form): for functional dependency (F.D.) in the relation like this {A}->{B}
A is a super key

Apart from identifying the different normal forms, obviously, u need to know how to decompose the original relations. The first algorithm breaks the relation R, down to BCNF.

First identify the F.D. that violates BCNF, and we calculate the closure for the LHS of this F.D.

{LHS}+ = {some attributes}

Remove the attributes from R to form R1, and another R2 which consists of the remaining attributes in R union with LHS. Then check for both R1 and R2 that F.D.1 and F.D.2 satisfy with BCNF. If any of them does not satisfy, break that relation further until it satisfy.

This algorithm is not dependency preserving.

Another algorithm is to break R into 3NF:

First, you have to calculate the minimal cover for F.D. the method is in Lecture Review 10. Then re-combine the minimal cover having the same LHS to get the extended minimal cover.

With this extended minimal cover, we create R1, R2, ...., with each of the dependencies. Then we check that none of the relations is contained in another relation. Remove the one that is contained. Then check that at least one of the R contains the candidate key, if not, create another relation with only the candidate key.


Artificial Intelligence --- The last topic is machine learning. It's probably one of the easier topic for this module. For this topic, there are a series of test cases. From these cases, we are to construct a decision tree to predict the outcome for other values.

The decision tree is constructed with some form of algorithm calculating the information gain from choosing the different attributes as the root node. However, it seems more intuitive to construct it by observation at times. The chosen attribute should reduce the entropy of the test cases the most.

Software Engineering --- We continued with testing of the software. This week is white box testing. To do this form of testing, we have to consider the internal functioning of the codes. Looking at the codes, we draw the Control Flow Graph (CFG), then use test cases to test each of the possible routes, varying each time by one single path.

No comments: