If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Database Server Software > DB2 > Denormalization of tables for data mininig

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 09-29-11, 08:23
digu digu is offline
Registered User
 
Join Date: Sep 2011
Posts: 4
Denormalization of tables for data mininig

Hi!

Im developing a data mining application with java (netbeans) and db2 express-c aiming at an association rule analysis of a basket.

Therefore I need to denormalize tables in order to have every transaction in one row/tuple. That is to say I have to have multiple rows in one row as follows:

my transaction consists of a transaction ID (T-ID) and the related products (P-IDs) as well as the related employees (E-ID). Thus I have two normalized tables T1, T2 that should be denormalized:

T1: T-ID, P-ID > for the products being sold
T2: T-ID, E-ID > for the employees participating in the transaction

Lets suppose following contes:
T1:
1,P1
1,P3
2,P2

T2:
1,E1
1,E2

To use a data mining algo i need to have these data in one row per transaction thus for the upper content there is just Yes/No for every attribute and every attribute is an attribute value:

Destination Table with these Attributes: T-ID, P1, P2, P3, E1, E2
1,Y,N,Y,Y,Y
2,N,Y,N,N,N

Do you know wether there are kind of denormazilation algorithms/procedures in DB2 or Java to do so?

Last edited by digu; 09-29-11 at 08:37.
Reply With Quote
  #2 (permalink)  
Old 09-29-11, 11:08
tonkuma tonkuma is offline
Registered User
 
Join Date: Feb 2008
Location: Japan
Posts: 2,193
I thought the way is not practical.

Because, if number of products and/or number of employees increased,
porcessing of the denormazed tables need to deal some issues, like
a) wheather processed by programs or human beings,
checking every columns of p1, p2, ..., e1, e2, ... may be repeated torublesome work.
b) the maximun number of columns in a table is restricted by DB2's limit(1012 on DB2 for LUW).


Anyway, if you want to denormalize like that way,
a procedure may be ...
1) join t1 and t2 by full outer join.
2) unpivot the joined table by using a technique like in
http://sirdug.org/downloads/SQLonFire_1_SirDUG.pdf

Last edited by tonkuma; 09-29-11 at 16:11. Reason: Add "in a table"
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On