A new family of regular semivalues and applications
Roberto Lucchetti
Politecnico di Milano, Italy
Main goal: to rank genes from DNA data provided by microarray analysis.
Tools: cooperative game theory, in particular power indices.
Power indices rank players according to their “strength” in the game.
In the EU Council the strongest states (GE, FR, IT, UK) have about 10 times the power of the weakest state (MT).
In the UN Security Council the veto players have about 100 (10) times the power of the non-permanent members, according to Shapley (Banzhaf).
R.Lucchetti Politecnico di Milano
A (TU) game is a pair $(N, v)$ with $v: 2^N \to \mathbb{R}$, $v(\emptyset) = 0$, where
$N = \{1,\dots,n\}$ is the set of players,
$v$ is the characteristic function of the game.
$A \subseteq N$ is called a coalition.
$v(A)$ is the utility (or cost) for the coalition $A$.
$G^N$ represents the set of all games having $N$ as set of players.
Remark: $G^N \cong \mathbb{R}^{2^n - 1}$
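A base for $G^N$: the unanimity games $u_S$, $\emptyset \neq S \subseteq N$, with $u_S(T) = 1$ if $S \subseteq T$ and $u_S(T) = 0$ otherwise.
Subclass of games: simple games, where $v(A) \in \{0, 1\}$ for every coalition $A$.
Among them the weighted majority games $v = [q; w_1,\dots,w_n]$: $v(A) = 1$ if $\sum_{i \in A} w_i \ge q$, and $v(A) = 0$ otherwise.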
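The role of the unanimity games as a base can be illustrated with a small sketch. It assumes the standard definitions $u_S(T) = 1$ iff $S \subseteq T$ and $v = [q; w_1,\dots,w_n]$ winning iff the total weight reaches $q$; the function names and the example game $[3; 2, 1, 1]$ are illustrative choices, not taken from the slides. The coefficients are obtained by Möbius inversion.

```python
from itertools import combinations

def coalitions(players):
    """All non-empty coalitions of `players`, as frozensets."""
    players = list(players)
    return [frozenset(c) for r in range(1, len(players) + 1)
            for c in combinations(players, r)]

def unanimity(S):
    """Unanimity game u_S: a coalition T wins iff it contains S."""
    return lambda T: 1 if S <= frozenset(T) else 0

def weighted_majority(q, weights):
    """Game [q; w_1, ..., w_n] on players 1..n: A wins iff its weight is >= q."""
    return lambda A: 1 if sum(weights[i - 1] for i in A) >= q else 0

def unanimity_coefficients(v, players):
    """Coefficients lam[S] such that v = sum_S lam[S] * u_S (Moebius inversion)."""
    return {S: sum((-1) ** (len(S) - len(T)) * v(T) for T in coalitions(S))
            for S in coalitions(players)}

players = [1, 2, 3]
v = weighted_majority(3, [2, 1, 1])   # illustrative game, not from the slides
lam = unanimity_coefficients(v, players)

# Rebuild v from the base: here v = u_{1,2} + u_{1,3} - u_{1,2,3}.
for T in coalitions(players):
    assert sum(c * unanimity(S)(T) for S, c in lam.items()) == v(T)
```

With $n = 3$ there are $2^3 - 1 = 7$ coefficients, one per non-empty coalition, in agreement with the dimension of $G^N$.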
Introduction: how a microarray works
A chip can contain millions of DNA probes.
Hybridization
When a single DNA strand meets a single mRNA strand, they stick to each other if they are complementary.
Hybridization helps researchers to identify which RNA sequences are present in a sample; this tells them which genes are being expressed by the organism and how strongly they are being expressed.
GeneChip microarrays use the natural chemical attraction between the RNA
target (from the sample preparation) and the DNA on the array to determine the
expression level of a given gene.
DNA/RNA base pairing:
Adenine (A)               pairs with  T
Guanine (G)               pairs with  C
Thymine (T)/Uracil (U)    pairs with  A
Cytosine (C)              pairs with  G
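The pairing rule can be turned into a small check of whether an RNA target sticks to a DNA probe. This is a minimal sketch that assumes perfect complementarity and antiparallel strands, both written 5' to 3'; real hybridization tolerates some mismatches, and the function name is hypothetical.

```python
# Partner of each DNA base on the RNA strand (RNA carries U instead of T).
DNA_TO_RNA_PARTNER = {"A": "U", "T": "A", "G": "C", "C": "G"}

def hybridizes(dna_probe: str, rna_target: str) -> bool:
    """True if the RNA target is fully complementary to the DNA probe.

    Strands pair antiparallel, so the target is read in reverse order.
    """
    if len(dna_probe) != len(rna_target):
        return False
    return all(DNA_TO_RNA_PARTNER.get(d) == r
               for d, r in zip(dna_probe, reversed(rna_target)))

print(hybridizes("ATGC", "GCAU"))  # complementary target
print(hybridizes("ATGC", "GCAA"))  # one mismatching base
```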
The RNA extracted from a sample is copied into cRNA (through a process known as PCR). Copying the RNA allows it to be more easily detected on the array. While the RNA is being copied, a fluorescent chemical molecule called biotin is attached to the strand. This molecule will show where the sample RNA has stuck to the DNA probe on the array.
If the gene is highly expressed, many RNA molecules will stick to the probe and the probe location will shine brightly when the laser hits it.
If the sample RNA doesn't match, it will be rejected by the probe on the array, and when the laser hits the probe, nothing glows.
The whole point of microarray gene expression analysis is to compare
expression levels among different samples. Let’s simplify the situation with an
example in which we have four genes and two samples.
Gene1: 2RUDE Gene2: 2LOUD Gene3: GETOUT Gene4: FATMET
Gene4 is not glowing.
[Figure: Array1, Array2, Array3, …]

          array 1   array 2   array 3   array 4   …
gene 1    0.67      0.45      1.32      1.34      …
gene 2    1.01      1.13      1.54      2.13      …
gene 3    1.38      1.21      1.23      0.12      …
gene 4    0.65      0.98      0.54      …         …
gene 5    0.17      1.32      2.43      …         …
…         …         …         …         …         …

Example: 0.98 is the expression level of gene 4 in array 2.

The Microarray Game:
An mxn Boolean matrix M such that
Given the column $m_j$ of $M$, $\mathrm{supp}\, m_j = \{i : m_{ij} = 1\}$.
         Sample 1   Sample 2   Sample 3
gene1    0.5        0.2        1
gene2    0.4        1          0.3
gene3    0.8        0.4        0.2

         Sample 1   Sample 2   Sample 3   Sample 4
gene1    0.7        0.3        1.8        0.8
gene2    0.1        0.2        0.5        0.9
gene3    1          0.6        1.7        0.1

         Sample 1   Sample 2   Sample 3   Sample 4
gene1    0          0          1          0
gene2    1          1          0          0
gene3    1          0          1          1
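The three tables are consistent with the following reading, which is an assumption and not stated on the slide: the first table holds reference samples, the second holds the samples under study, and a cell of the Boolean matrix is 1 when the value falls outside the [min, max] range of the same gene in the reference table. The sketch below reproduces the Boolean matrix under that assumption. It then evaluates the microarray game in its usual form from the literature, where $v(T)$ is the fraction of columns whose support is contained in $T$, and the Shapley value of that game. All function names are hypothetical.

```python
from fractions import Fraction

# Values copied from the first two tables above.
reference = {"gene1": [0.5, 0.2, 1.0], "gene2": [0.4, 1.0, 0.3], "gene3": [0.8, 0.4, 0.2]}
samples = {"gene1": [0.7, 0.3, 1.8, 0.8], "gene2": [0.1, 0.2, 0.5, 0.9], "gene3": [1.0, 0.6, 1.7, 0.1]}

def discretize(reference, samples):
    """1 if a value lies outside the gene's reference [min, max] range, else 0 (assumed rule)."""
    boolean = {}
    for gene, values in samples.items():
        lo, hi = min(reference[gene]), max(reference[gene])
        boolean[gene] = [0 if lo <= x <= hi else 1 for x in values]
    return boolean

def column_supports(boolean):
    """Support of each column: the set of genes with entry 1 in that sample."""
    n_samples = len(next(iter(boolean.values())))
    return [frozenset(g for g in boolean if boolean[g][j] == 1) for j in range(n_samples)]

def microarray_game(boolean):
    """v(T) = fraction of columns whose (non-empty) support is contained in T."""
    supports = column_supports(boolean)
    def v(coalition):
        coalition = frozenset(coalition)
        return Fraction(sum(1 for s in supports if s and s <= coalition), len(supports))
    return v

def shapley(boolean):
    """Shapley value of the game: each column shares 1/n equally among its support."""
    supports = column_supports(boolean)
    n = len(supports)
    return {g: sum((Fraction(1, len(s)) for s in supports if g in s), Fraction(0)) / n
            for g in boolean}

boolean = discretize(reference, samples)
v = microarray_game(boolean)
ranking = shapley(boolean)   # gene1: 1/8, gene2: 3/8, gene3: 1/2
```

Under these assumptions gene3 ranks first, since it is abnormally expressed in three of the four samples.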
A power index for the game $(N, v)$ is a vector $(x_1,\dots,x_n)$ such that $x_i$ represents the power of player $i$ in the game $v$.
Weighted voting does not work…
The most famous: Shapley ($\sigma$) and Banzhaf ($\beta$).
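Shapley ($\sigma$) and Banzhaf ($\beta$):
$$\sigma_i(v) = \sum_{S \subseteq N \setminus \{i\}} \frac{s!\,(n-s-1)!}{n!}\,\bigl[v(S \cup \{i\}) - v(S)\bigr], \qquad \beta_i(v) = \sum_{S \subseteq N \setminus \{i\}} \frac{1}{2^{n-1}}\,\bigl[v(S \cup \{i\}) - v(S)\bigr],$$
where $s = |S|$ and $v(S \cup \{i\}) - v(S)$ is the marginal contribution of $i$ to $S \cup \{i\}$.
$\phi$ is a probabilistic value if, for each player $i$, there is a probability $p_i$ on $2^{N \setminus \{i\}}$ such that
$$\phi_i(v) = \sum_{S \subseteq N \setminus \{i\}} p_i(S)\,\bigl[v(S \cup \{i\}) - v(S)\bigr].$$
Shapley: $p_i(S) = s!\,(n-s-1)!/n!$
Banzhaf: $p_i(S) = 1/2^{n-1}$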
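A probabilistic value averages the marginal contributions $v(S \cup \{i\}) - v(S)$ of a player over the coalitions $S$ not containing that player. The sketch below computes it by direct enumeration, using the standard Shapley weight $s!(n-s-1)!/n!$ and Banzhaf weight $1/2^{n-1}$. The example game $[3; 2, 1, 1]$ and all function names are illustrative assumptions, and the enumeration is only practical for small $n$.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial

def shapley_weight(s, n):
    """Probability attached to a coalition of size s by the Shapley value."""
    return Fraction(factorial(s) * factorial(n - s - 1), factorial(n))

def banzhaf_weight(s, n):
    """Probability attached to any coalition by the Banzhaf value."""
    return Fraction(1, 2 ** (n - 1))

def probabilistic_value(v, players, weight):
    """phi_i(v) = sum over S not containing i of weight(|S|, n) * [v(S + i) - v(S)]."""
    players = list(players)
    n = len(players)
    result = {}
    for i in players:
        others = [j for j in players if j != i]
        total = Fraction(0)
        for s in range(n):                       # coalition sizes 0 .. n-1
            for S in combinations(others, s):
                S = frozenset(S)
                total += weight(s, n) * (v(S | {i}) - v(S))
        result[i] = total
    return result

def v(A):
    """Illustrative weighted majority game [3; 2, 1, 1]."""
    weights = {1: 2, 2: 1, 3: 1}
    return 1 if sum(weights[i] for i in A) >= 3 else 0

shapley = probabilistic_value(v, [1, 2, 3], shapley_weight)   # 2/3, 1/6, 1/6
banzhaf = probabilistic_value(v, [1, 2, 3], banzhaf_weight)   # 3/4, 1/4, 1/4
```

In this small game the two indices agree on the ranking but not on the ratios: player 1 has 4 times the Shapley power of player 2, but only 3 times the Banzhaf power.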
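If $p_i(S) = p(|S|) > 0$, the probabilistic value is called a regular semivalue.
Examples: Banzhaf, Shapley, p-binomial.
Regular semivalues are points in the simplex:
$$\Bigl\{(p_0,\dots,p_{n-1}) : p_s > 0,\ \sum_{s=0}^{n-1} \binom{n-1}{s}\, p_s = 1\Bigr\}$$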
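Since a regular semivalue weights every coalition of size $s$ by the same $p_s > 0$, and there are $\binom{n-1}{s}$ coalitions of size $s$ not containing a given player, the weights must satisfy $\sum_s \binom{n-1}{s} p_s = 1$. The sketch below checks this for the three examples. It assumes the usual p-binomial weights $p_s = p^s (1-p)^{n-1-s}$; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial

def shapley_weights(n):
    return [Fraction(factorial(s) * factorial(n - s - 1), factorial(n)) for s in range(n)]

def banzhaf_weights(n):
    return [Fraction(1, 2 ** (n - 1))] * n

def binomial_weights(n, p):
    """p-binomial weights p^s (1-p)^(n-1-s), for 0 < p < 1."""
    p = Fraction(p)
    return [p ** s * (1 - p) ** (n - 1 - s) for s in range(n)]

def is_regular_semivalue(weights):
    """All weights positive, and they form a probability over coalitions of N minus i."""
    n = len(weights)
    total = sum(comb(n - 1, s) * w for s, w in enumerate(weights))
    return all(w > 0 for w in weights) and total == 1

n = 5
checks = [is_regular_semivalue(shapley_weights(n)),
          is_regular_semivalue(banzhaf_weights(n)),
          is_regular_semivalue(binomial_weights(n, Fraction(1, 3)))]
```

With $p = 1/2$ the binomial weights coincide with the Banzhaf weights.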
Properties for power indices
Let $v \in G^N$. The solution $\phi$ has the dummy player (DP) property if, for each player $i$ such that $v(A \cup \{i\}) = v(A) + v(\{i\})$ for all coalitions $A$ not containing $i$, $\phi_i(v) = v(\{i\})$.
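Let $\theta: N \to N$ be a permutation.
Given the game $v$, denote by $\theta v$ the game $(\theta v)(A) = v(\theta(A))$, and by $\theta^* x$ the vector $(\theta^* x)_i = x_{\theta(i)}$.
The solution $\phi$ has the symmetry (S) property if, for each permutation as above, $\phi(\theta v) = \theta^* \phi(v)$.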
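The new family of power indices
Let $a_1 = 1$ and $a_s > 0$ for $s = 2,\dots,n$.
Define $\sigma$ on the unanimity game $u_S$ as $\sigma_i(u_S) = a_s$ if $i \in S$ and $\sigma_i(u_S) = 0$ otherwise, with $s = |S|$, and extend it by linearity to a generic $v \in G^N$.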
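The extension by linearity can be written out explicitly. The derivation below is a sketch under two assumptions: that each player of $S$ receives $a_{|S|}$ in $u_S$ and players outside $S$ receive 0, as in the statement of Theorem 1, and that $\lambda_S(v)$ (a symbol introduced here) denotes the coefficient of $v$ on the unanimity game $u_S$. It is not necessarily the formula shown on the slides.

```latex
v = \sum_{\emptyset \neq S \subseteq N} \lambda_S(v)\, u_S,
\qquad
\lambda_S(v) = \sum_{T \subseteq S} (-1)^{|S| - |T|}\, v(T),
\\[4pt]
\sigma_i(v) = \sum_{\emptyset \neq S \subseteq N} \lambda_S(v)\, \sigma_i(u_S)
            = \sum_{S \ni i} a_{|S|}\, \lambda_S(v).
```

Two familiar choices serve as a check: $a_s = 1/s$ gives the Shapley value, and $a_s = 1/2^{s-1}$ gives the Banzhaf value.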
Theorem 1
There exists one and only one value $\sigma$ fulfilling the symmetry, linearity and dummy player properties, and assigning $a_s$ to all non-null players in the unanimity game $u_S$ (with $s = |S|$), where $a_1 = 1$ and $a_s > 0$ for $s = 2,\dots,n$.
$\sigma$ fulfills the formula:
Theorem 2
$\sigma^a$ is a regular semivalue
$\sigma^2$ fulfills the formula:
for all $a > 0$.
Corollary
The family of the weighting coefficients of the values $\sigma^a$, $a > 0$, is an open curve in the simplex of the regular semivalues, containing the Shapley value.
The addition of the Banzhaf value to the curve provides a one-point compactification of the curve.
Theorem 3
study of the term:
Key tool
Let
, let
Then
Moreover, for all natural l, and positive real a,x:
Finally, for each natural m, the following formula holds:
Calculating the indices in weighted majority games
Consider the coefficients that count in how many ways the weights of $j$ players different from $i$ can sum to $k$. Then the following proposition holds.
Let $\sigma$ be the value defined in the theorem above. Let $q > 0$ be a positive integer, and let $w_1,\dots,w_n$ be non-negative integers.
Let $v = [q; w_1,\dots,w_n]$ be the associated weighted majority game.
Then the following formula holds:
An efficient algorithm based on generating functions and formal series allows for a fast calculation of these coefficients.
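The counting coefficients can be obtained with a small dynamic program, which amounts to expanding the generating function $\prod_{l \neq i} (1 + x^{w_l} y)$ and reading off the coefficient of $x^k y^j$. The sketch below is one possible implementation and not necessarily the authors' algorithm; the array name `c`, the function names and the example game are assumptions. Player $i$ is critical for a coalition of $j$ other players with total weight $k$ exactly when $q - w_i \le k \le q - 1$, so a semivalue with weights $p_j$ is the sum of $p_j\, c[j][k]$ over that range.

```python
from fractions import Fraction
from math import factorial

def count_subsets(weights, i, q):
    """c[j][k]: number of coalitions of j players, none of them i, with total weight k < q."""
    n = len(weights)
    c = [[0] * q for _ in range(n)]
    c[0][0] = 1                                  # the empty coalition
    for idx, w in enumerate(weights):
        if idx == i:
            continue
        for j in range(n - 2, -1, -1):           # descending j: each player is used once
            for k in range(q - 1 - w, -1, -1):   # keep only totals below the quota
                if c[j][k]:
                    c[j + 1][k + w] += c[j][k]
    return c

def semivalue(q, weights, p):
    """Semivalue with weights p(j, n) in the game [q; w_1, ..., w_n] (players indexed from 0)."""
    n = len(weights)
    values = []
    for i, w_i in enumerate(weights):
        c = count_subsets(weights, i, q)
        low = max(q - w_i, 0)                    # i is critical iff low <= k <= q - 1
        values.append(sum(p(j, n) * sum(c[j][low:q]) for j in range(n)))
    return values

def shapley_p(j, n):
    return Fraction(factorial(j) * factorial(n - j - 1), factorial(n))

def banzhaf_p(j, n):
    return Fraction(1, 2 ** (n - 1))

shapley = semivalue(3, [2, 1, 1], shapley_p)     # 2/3, 1/6, 1/6
banzhaf = semivalue(3, [2, 1, 1], banzhaf_p)     # 3/4, 1/4, 1/4
```

The table `c` has $n \times q$ entries per player, so the cost grows polynomially with $n$ and $q$ rather than with the $2^{n-1}$ coalitions of a direct enumeration.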
Applications
The EU
SY: Shapley; S2: $\sigma^2$; BF: Banzhaf; the last three columns give the ratio of the index of state I to that of MT.

State   SY         S2         BF         SY(I)/MT     S2(I)/MT      BF(I)/MT
GE      0.086738   0.02797    0.032688   10.6066383   9.815722703   8.260803639
FR      0.086738   0.02797    0.032688   10.6066383   9.815722703   8.260803639
IT      0.086738   0.02797    0.032688   10.6066383   9.815722703   8.260803639
UK      0.086738   0.02797    0.032688   10.6066383   9.815722703   8.260803639
SP      0.079975   0.025999   0.031164   9.77960769   9.123884457   7.875663381
PL      0.079975   0.025999   0.031164   9.77960769   9.123884457   7.875663381
RO      0.039937   0.013476   0.017889   4.88360405   4.729163962   4.520849128
NL      0.036825   0.012476   0.016691   4.5031054    4.378366807   4.218094516
BE      0.034068   0.011555   0.015475   4.16600531   4.055048061   3.910791003
CZ      0.034068   0.011555   0.015475   4.16600531   4.055048061   3.910791003
GR      0.034068   0.011555   0.015475   4.16600531   4.055048061   3.910791003
HU      0.034068   0.011555   0.015475   4.16600531   4.055048061   3.910791003
PT      0.034068   0.011555   0.015475   4.16600531   4.055048061   3.910791003
SE      0.028193   0.00961    0.012989   3.44756282   3.372390341   3.282537276
AU      0.028193   0.00961    0.012989   3.44756282   3.372390341   3.282537276
BG      0.028193   0.00961    0.012989   3.44756282   3.372390341   3.282537276
FI      0.019606   0.006721   0.00916    2.39749856   2.358602005   2.314885014
DK      0.019606   0.006721   0.00916    2.39749856   2.358602005   2.314885014
SK      0.019606   0.006721   0.00916    2.39749856   2.358602005   2.314885014
IR      0.019606   0.006721   0.00916    2.39749856   2.358602005   2.314885014
LT      0.019606   0.006721   0.00916    2.39749856   2.358602005   2.314885014
LV      0.011042   0.003813   0.005251   1.35024683   1.338033557   1.327015416
SLO     0.011042   0.003813   0.005251   1.35024683   1.338033557   1.327015416
CY      0.011042   0.003813   0.005251   1.35024683   1.338033557   1.327015416
ES      0.011042   0.003813   0.005251   1.35024683   1.338033557   1.327015416
LU      0.011042   0.003813   0.005251   1.35024683   1.338033557   1.327015416
MT      0.008178   0.00285    0.003957   1            1             1
The power indices for the 56 genes that are common to all the indices among the first 100 genes of each index.
Data from 40 tumor samples vs 22 normal samples, 2000 genes.
[Figure: normalized values (vertical axis, 0 to 0.7) against genes (horizontal axis, 0 to 60). Series: sigma2 (10^-4), sigma3 (10^-6), Banzhaf (10^-13), Shapley (10^-2).]
Data from colorectal cancer
10 healthy and 12 tumoral tissues
An extended microarray game also considers how much the genes are abnormally expressed with respect to a normality interval.
Given the normality interval $[m_i, M_i]$ of gene $i$ and its standard deviation $s_i$, let $N_i^k = [m_i - k s_i, M_i + k s_i]$; assign $k$ to cell $ij$ of the matrix if the value of gene $i$ in patient $j$ falls in $N_i^k \setminus N_i^{k-1}$.
A weighted Shapley value is used to rank genes; this differentiates the genes better. Taking the first 100 genes in the ranking, the game is formed as an average of weighted majority games.
Then we calculate the Shapley, Banzhaf and $\sigma^2$ indices.
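The rule that fills the matrix of the extended microarray game can be sketched as follows. The sketch assumes that the nested intervals are $N^0 = [m, M]$ and $N^k = [m - k s, M + k s]$, so that a value inside the normality interval gets 0 and a value outside gets the smallest $k$ whose enlarged interval contains it. The function name and the numbers in the example are hypothetical.

```python
import math

def abnormality_level(x, m, M, s):
    """Smallest k >= 0 such that x lies in [m - k*s, M + k*s].

    m, M: bounds of the gene's normality interval; s: its standard deviation (> 0).
    """
    if s <= 0:
        raise ValueError("the standard deviation must be positive")
    if m <= x <= M:
        return 0
    distance = m - x if x < m else x - M   # how far x is outside [m, M]
    return math.ceil(distance / s)

# Hypothetical gene with normality interval [1.0, 2.0] and standard deviation 0.5.
levels = [abnormality_level(x, 1.0, 2.0, 0.5) for x in (1.5, 2.5, 2.6, 0.75, -0.5)]
```

Values far from the normality interval therefore weigh more than values just outside it, which is the information a Boolean matrix discards.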
Gene expression analysis was performed using Human Genome U133A-Plus 2.0 GeneChip arrays (Affymetrix, Inc., Calif.).
The following 7 genes are quoted in the medical literature as having great importance in the onset of the disease: CYR61, UCHL1, FOS, FOSB, EGR1, VIP, KRT24.
One of them was ranked around the 100th position by the weighted Shapley value. All the others are among the first 50 and played the subsequent game.

Gene     S     B     $\sigma^2$
FOSB     2     1     1
CYR61    1     2     2
FOS      3     3     3
VIP      5     5     6
EGR1     10    9     9
KRT24    45    35    35
References
R. Lucchetti, P. Radrizzani, E. Munarini, A new family of regular semivalues and applications, Int. J. of Game Theory, DOI 10.1007/s00182-010-0263-5.
R. Lucchetti, S. Moretti, F. Patrone, P. Radrizzani, The Shapley and Banzhaf indices in microarray games, Computers and Operations Research 37 (2010), p. 1406-1412.
R. Lucchetti, P. Radrizzani, Microarray Data Analysis via Weighted Indices and Weighted Majority Games, Computational Intelligence Methods for Bioinformatics and Biostatistics II, Masulli, Peterson, Tagliaferri (Eds), Lecture Notes in Computer Science, Springer (2010), p. 179-190.
S. Moretti, F. Patrone, S. Bonassi, The class of microarray games and the relevance index for genes, TOP 15 (2007), p. 256-280.
D. Albino, P. Scaruffi, S. Moretti, S. Coco, C. Di Cristofano, A. Cavazzana, M. Truini, S. Stigliani, S. Bonassi, G.P. Tonini, Stroma poor and stroma rich gene signatures show a low intratumoral gene expression heterogeneity in Neuroblastic tumors, Cancer 113 (2008), p. 1412-1422.