A Research and Development Initiative Promoted by DI.S.T.A.
Funded by the EEC in the framework of the BEEP project and by MURST in the framework of the MITILUS project

Project Management: Attilio Giordana (Computer Science), Aldo Viarengo (Biology)
Team: M. Botta, Burlando, L. Portinale, A. Serra, M. Rapetti, G. Porcelli

Project Goals
- To provide a user-friendly apparatus, accessible worldwide, for sharing and analyzing biological data collected by the environment monitoring laboratories
- To develop new data mining algorithms oriented to biological data analysis

Hardware Support: Web server + DB server + Beowulf (20 Pentium III 800)
[Figure: network diagram showing Hermes (Oracle), the WEB server, two switches, and the Beowulf cluster (20 PCs, nodes now0 to now19)]

User Environment
[Figure: two tree diagrams over the attributes Color (White, Blue, Red, Green), Shape (Triangle, Circle, Square, Oval) and Size, partitioning the example sets {a-i} and {m-t}, shown together with databases (DB)]
Every user will benefit from a personal database together with an application-specific set of data-mining tools.

Software Architecture
- Servlets
- Java Interface
- WEB Server
- Oracle Database Manager
- Data Mining Algorithms
- Data Storage (192 Gbytes)

A User Friendly Graphic Interface
[Figure: interface mock-up showing a dataset name, a tool list (Tool1 to Tool5), two views (view 1, view 2), and the operations available in each view (op1 to op4)]

Data Intensive Algorithms Run in Parallel on the Beowulf
[Figure: diagram with the labels: Algorithms for interactive use, Servlets, D-Tree, Neural Net, Algorithm Server, G-net, Sequence Analysis, Cluster]

Workpackages
WP1: Database Design + Meta data
WP2: Graphic Interface design and implementation
  2.1: Approach selection
  2.2: User authentication procedure
  2.3: Oracle interface
  2.4: Data visualization
WP3: Tool configuration interface
  3.1: G-Net
  3.2: Mine-Rule
  3.3: Clustering Algorithms
  3.4: Characterization Algorithm
  3.5: Decision/Regression Trees
  3.6: Neural Networks

Workpackages (continued)
WP4: Servlet implementation
WP5: Algorithm server implementation
  5.1: Design
  5.2: Implementation
WP6: Algorithm implementation
  6.1: New KDD algorithm implementation
  6.2: Existing algorithm revision

Work-Flow
[Figure: schedule chart with the dates 15/1/2001, 15/2/2001 and 15/3/2001 and the entries Wp1, Wp1, Wp2.1 (revision), Wp2.2 - Wp2.4, Wp3.1 - Wp3.6, Wp4, Wp5.1, Wp5.2, Wp6.1 - Wp6.2]
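
The architecture slides describe servlets handing data-intensive jobs to an algorithm server, which runs them in parallel on the Beowulf nodes. The slides give no implementation details, so the following is only a minimal sketch of that dispatch pattern, written in Python for brevity rather than in the project's Java. Everything in it is an assumption made for illustration: the class name `AlgorithmServer`, the two placeholder tools, and the use of a local thread pool as a stand-in for the 20 cluster nodes.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_NODES = 20  # the slides list 20 Beowulf PCs; threads only simulate them here


def column_mean(values):
    """Hypothetical placeholder tool: arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def column_range(values):
    """Hypothetical placeholder tool: difference between max and min."""
    return max(values) - min(values)


class AlgorithmServer:
    """Toy stand-in for the algorithm server (assumed design, not the project's).

    It maps tool names to functions and runs submitted jobs on a
    fixed-size worker pool.
    """

    def __init__(self, workers=NUM_NODES):
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._tools = {}

    def register(self, name, func):
        """Make a tool available under the given name."""
        self._tools[name] = func

    def submit(self, name, dataset):
        """Queue a job and return a Future the caller can wait on."""
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._pool.submit(self._tools[name], dataset)

    def shutdown(self):
        """Wait for running jobs to finish and release the workers."""
        self._pool.shutdown(wait=True)


def run_demo():
    """Submit two jobs on the same dataset and collect their results."""
    server = AlgorithmServer()
    server.register("mean", column_mean)
    server.register("range", column_range)
    data = [2.0, 4.0, 9.0]
    futures = {name: server.submit(name, data) for name in ("mean", "range")}
    results = {name: future.result() for name, future in futures.items()}
    server.shutdown()
    return results


if __name__ == "__main__":
    print(run_demo())
```

In this sketch the caller plays the role of a servlet: it submits a named tool with a dataset and later collects the result, so a long-running analysis does not block other requests.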