Finding the Optimal Strategies in Robotic Patrolling with
Adversaries in Topologically-Represented Environments
Francesco Amigoni, Nicola Basilico, Nicola Gatti
{amigoni,basilico,ngatti}@elet.polimi.it
F. Amigoni, N. Basilico, N. Gatti
DEI, Politecnico di Milano
Robotic Patrolling
€
€€
€
A patrolling strategy determines the path followed by the robot,
usually the next cell to move to
F. Amigoni, N. Basilico, N. Gatti
DEI, Politecnico di Milano
Randomized Patrolling Strategies
The patroller should adopt an unpredictable patrolling strategy,
randomizing over cells and trying to reduce the intrusion risk
(Pita et al., AAMAS08)
€
Randomized strategy:
the robot determines the next cell according to a probability distribution
F. Amigoni, N. Basilico, N. Gatti
DEI, Politecnico di Milano
Patrolling Strategies with Adversaries
• Considering a model of the adversary (Agmon et al., AAMAS08,
Paruchuri et al., AAMAS08) can provide the patrolling robot a larger
expected utility than not considering it, i.e., it can lead to better
strategies (Amigoni et al., IAT2008)
• Model of the adversary can include: its preferences over the possible
targets, its knowledge about the patroller’s strategy, …
F. Amigoni, N. Basilico, N. Gatti
DEI, Politecnico di Milano
The Problem
The problem we addressed in this work:
finding the optimal randomized patrolling strategy in a arbitrary
environment while considering a model of the adversary
Our approach applies to environments with arbitrary topology
generalizing (Agmon et al., ICRA08)
€
Agmon et al., ICRA08
F. Amigoni, N. Basilico, N. Gatti
DEI, Politecnico di Milano
The Basic Patrolling Model
• Time is discrete
• Environment: represented by a directed graph, e.g., a grid of cells
or a topological map (Carpin et al., IROS08)
• Single patrolling robot


It can move between adjacent nodes
It can detect a possible intruder in its current node
• Single intruder


It knows the strategy of the patrolling robot, for example because it
can observe the patroller movements before attempting to intrude
It can directly enter any node
• Penetration time di is required to successfully complete an intrusion in
a node i

When attempting to penetrate in a node i at time t, the intruder can be
detected during {t,t+1,…,t+ di}
F. Amigoni, N. Basilico, N. Gatti
DEI, Politecnico di Milano
The Basic Patrolling Model
1
2
3
5
7
6€
9
4
Final States
•
8
12
10
The indruder enters node i at time t:
•
If the patroller does not visit cell i
in the interval {t,t+1,…,t+ di}
the intruder wins
•
Otherwise the intruder is
captured and the patroller wins
13
P
•
The intruder never enters
move(7)
Utilities
P
…
I
…
enter(1)
I
…
I
…
P
…
F. Amigoni, N. Basilico, N. Gatti
DEI, Politecnico di Milano
P
…
1 time unit
•
Xi ,Yi (i ∈ {1, 2, …, 13}) : patroller’s
and intruder’s utilities when the
intruder successfully attacks node i
•
X0 ,Y0 : patroller’s and intruder’s
utilities when the intruder is captured
Objective
The proposed method finds the probability distribution over the patroller
movements, i.e., given the current node, finding the probability of moving
in each adjacent node
€
F. Amigoni, N. Basilico, N. Gatti
DEI, Politecnico di Milano
Solving the Game
• Two competing actors: we study their behaviors in a game-theoretical
framework
• The patrolling problem can be modeled as a leader-follower game
• Two players
• The leader commits to a strategy
• The follower observes such commitment and acts as a best responder
• Patrolling strategy: A = {αi,j}, where αi,j is the probability of doing
move(j) when i is the current node
• The optimal A can be derived by computing the equilibrium of the
leader-follower game resorting to a bilevel optimization problem
(Conitzer and Sandholm, 2006)
F. Amigoni, N. Basilico, N. Gatti
DEI, Politecnico di Milano
Solving Algorithm
Step 1: is there any strategy A such that the game will never end?
•
Single bilinear feasibility problem
•
If a solution is found, it is the best patrolling strategy and the intruder will never
attempt to enter
If the above problem does not admit a solution, Step 2:
•
We safely assume that the game will end,
i.e., the intruder will enter
•
We compute A such that the patroller’s expected payoff is maximum
•
This amounts to solve a bilinear optimization problem for every possible action
of the intruder
Game Model
F. Amigoni, N. Basilico, N. Gatti
DEI, Politecnico di Milano
Solving algorithm
Optimal patrolling
strategy that
maximizes
patroller’s
expected utility
An Example
X5 = 0.5
Y5 = 0.5
d5 = 7
X1 = 0.8
Y1 = 0.2
d1 = 7
0.774
0.451
X0.226
1 = 0.8
Y1 = 0.2
d1 = 7
0.549
0.344
0.127
0.529
0.676
0.096
0.228
X0 = 1
Y0 = -1
X0.102
5 = 0.5
Y5 = 0.3
d5 = 7
0.898
With this strategy the game never ends, i.e., the intruder will never enter
F. Amigoni, N. Basilico, N. Gatti
DEI, Politecnico di Milano
Another Example
X5 = 0.5
Y5 = 0.3
d5 = 4
X1 = 0.8
Y1 = 0.2
d1 = 5
1
0.546
0.546
0.546
X5 = 0.5
Y5 = 0.3
d5 = 4
X1 = 0.8
Y1 = 0.2
d1 = 5
0.454
X0 = 1
Y0 = -1
0.454
0.454
1
With this strategy the intruder will try to enter in cell 1 when the patroller is in cell
5, the expected utility of the patroller is 0.819
F. Amigoni, N. Basilico, N. Gatti
DEI, Politecnico di Milano
Model Extensions
•
•
Augmented sensing capabilities:
we introduce the range parameter
X0 = 1
Y0 = -1
X4 = 0.8
Y4 = 0.4
X6 = 0.7
Y6 = 0.5
X12 = 0.8
Y12 = 0.4
expected utility
1
0.9
r=1
0.8
r=2
r=3
0.7
r=4
0.6
penetration time
1
3
5
7
F. Amigoni, N. Basilico, N. Gatti
DEI, Politecnico di Milano
9
11
Synchronized multirobot setting:
a single patroller able to sense an
arbitrary subset of cells
Conclusions and Future Works
• We presented an approach to find optimal randomized
patrolling strategies in arbitrary environments with
adversaries
• Future Works
• Accounting for intruder’s movements and limited
observation capabilities
• Extending our framework with multiple non-synchronized
patrollers
F. Amigoni, N. Basilico, N. Gatti
DEI, Politecnico di Milano
Scarica
1 - Politecnico di Milano

Slides

Slides - Politecnico di Milano

politecnico di milano - LeHo | Learning at Home and in the Hospital