A Research and Development Initiative Promoted by DI.S.T.A.
Funded by the EEC within the framework of the BEEP project
and by MURST within the framework of the MITILUS project
Project Management:
Attilio Giordana (Computer Science)
Aldo Viarengo (Biology)
Team:
M. Botta, Burlando, L. Portinale, A. Serra, M. Rapetti, G. Porcelli.
Project Goals
To provide a user-friendly system, accessible worldwide, for sharing and analyzing biological data collected by environmental monitoring laboratories.
To develop new data mining algorithms oriented to biological data analysis.
Hardware Support: Web server + DB server + Beowulf (20 Pentium III 800)
[Figure: hardware layout. Hermes (Oracle) database server and Web server, connected through switches to the Beowulf cluster (20 PCs, nodes now0 to now19).]
User Environment
[Figure: user environments, each backed by its own DB. Diagram labels: example sets {a-i} and {m-t}; attributes Color (White, Blue, Red, Green), Shape (Triangle, Circle, Square, Oval) and Size.]
Every user will have a personal database, together with an application-specific set of data-mining tools.
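The idea of a per-user environment can be illustrated with a minimal sketch. It is not the project's implementation: the class name `UserEnvironment`, the user name, and the database name are hypothetical, and the tool names are borrowed from the workpackage list only as examples.

```python
from dataclasses import dataclass, field


@dataclass
class UserEnvironment:
    """One user's workspace: a personal database plus the tools enabled for it."""
    user: str
    database: str                           # hypothetical personal database name
    tools: set = field(default_factory=set)  # names of the enabled data-mining tools

    def can_run(self, tool: str) -> bool:
        """Return True if this tool is part of the user's configured set."""
        return tool in self.tools


# Illustrative assignment: this user works with two of the available tools.
env = UserEnvironment(
    user="alice",
    database="alice_db",
    tools={"G-Net", "Decision/Regression Trees"},
)
```

Keeping the tool set on the environment object, rather than global, is one simple way to make the available tools application specific.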
Software Architecture
[Figure: architecture diagram. Components: Java interface, servlets, Web server, Oracle database manager, data mining algorithms, data storage (192 Gbytes).]
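One possible reading of this architecture is that a request from the interface names a tool, and a server-side layer dispatches it to the matching algorithm. The sketch below illustrates that dispatch pattern only. The registry `ALGORITHMS`, the functions `register` and `handle_request`, and the `mean` stand-in are all hypothetical and do not come from the project.

```python
from typing import Callable, Dict, List

# Registry playing the role of the table of available algorithms (assumption).
ALGORITHMS: Dict[str, Callable[[List[float]], float]] = {}


def register(name: str):
    """Decorator that publishes a function under a tool name."""
    def decorator(func):
        ALGORITHMS[name] = func
        return func
    return decorator


@register("mean")
def mean(values: List[float]) -> float:
    # Stand-in for a real data mining algorithm; it only averages the values.
    return sum(values) / len(values)


def handle_request(tool: str, data: List[float]) -> float:
    """Look up the requested tool and run it on the supplied data."""
    if tool not in ALGORITHMS:
        raise KeyError(f"unknown tool: {tool}")
    return ALGORITHMS[tool](data)


result = handle_request("mean", [2.0, 4.0, 6.0])
```

A registry of this kind lets new algorithms be added without changing the request handler.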
A User Friendly Graphic Interface
[Figure: interface mock-up. Labels: Dataset name; Tool1 to Tool5; op1 to op4; view 1, view 2.]
Data Intensive Algorithms
Run in Parallel on the Beowulf
Algorithms for interactive use
[Figure: servlets, algorithm server, and cluster. Algorithms named: D-Tree, Neural Net, G-net, Sequence Analysis.]
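Running a data-intensive algorithm in parallel usually means splitting the data across nodes, computing a partial result on each, and merging the partial results. The sketch below simulates that pattern locally for a simple value count. It is an illustration under stated assumptions: the node names follow the now0 to now19 labels of the hardware diagram, the records are made up, and the round-robin split is a choice made here, not the project's documented scheme. A real cluster run would need a message-passing or job-submission layer, which is not shown.

```python
from collections import Counter
from typing import Dict, Iterable, List

# Node names as labelled in the hardware diagram (20 Beowulf PCs).
NODES = [f"now{i}" for i in range(20)]


def partition(records: List[str], nodes: List[str]) -> Dict[str, List[str]]:
    """Assign records to nodes in round-robin order."""
    chunks: Dict[str, List[str]] = {node: [] for node in nodes}
    for i, record in enumerate(records):
        chunks[nodes[i % len(nodes)]].append(record)
    return chunks


def local_count(chunk: List[str]) -> Counter:
    """Work done on one node: count the values in its own chunk."""
    return Counter(chunk)


def merge(partials: Iterable[Counter]) -> Counter:
    """Combine the per-node counts into a global count."""
    total: Counter = Counter()
    for partial in partials:
        total.update(partial)
    return total


# Hypothetical data: 100 records, so each of the 20 nodes receives 5.
records = ["Red", "Blue", "Red", "Green"] * 25
chunks = partition(records, NODES)
total = merge(local_count(chunk) for chunk in chunks.values())
```

The merged count equals the count computed on the whole dataset, which is the property that makes this kind of computation safe to distribute.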
Workpackages
WP1: Database Design + Meta data
WP2: Graphic Interface design and implementation
2.1: Approach selection
2.2: User authentication procedure
2.3: Oracle interface
2.4: Data visualization
WP3: Tool configuration interface
3.1: G-Net
3.2: Mine-Rule
3.3: Clustering Algorithms
3.4: Characterization Algorithm
3.5: Decision/Regression Trees
3.6: Neural Networks
Workpackages (continued)
WP4: Servlet implementation
WP5: Algorithm server implementation
5.1: Design
5.2: Implementation
WP6: Algorithm implementation
6.1: New KDD algorithm implementation
6.2: Existing algorithm revision
Work-Flow
[Figure: schedule chart. Dates: 15/1/2001, 15/2/2001, 15/3/2001. Entries: Wp1, Wp1, Wp2.1, (revision), Wp2.2 - Wp2.4, Wp3.1 - Wp3.6, Wp4, Wp5.1, Wp5.2, Wp6.1 - Wp6.2.]
Data Mining Portal: Hardware