Finding the Optimal Strategies in Robotic Patrolling with Adversaries in Topologically-Represented Environments Francesco Amigoni, Nicola Basilico, Nicola Gatti {amigoni,basilico,ngatti}@elet.polimi.it F. Amigoni, N. Basilico, N. Gatti DEI, Politecnico di Milano Robotic Patrolling € €€ € A patrolling strategy determines the path followed by the robot, usually the next cell to move to F. Amigoni, N. Basilico, N. Gatti DEI, Politecnico di Milano Randomized Patrolling Strategies The patroller should adopt an unpredictable patrolling strategy, randomizing over cells and trying to reduce the intrusion risk (Pita et al., AAMAS08) € Randomized strategy: the robot determines the next cell according to a probability distribution F. Amigoni, N. Basilico, N. Gatti DEI, Politecnico di Milano Patrolling Strategies with Adversaries • Considering a model of the adversary (Agmon et al., AAMAS08, Paruchuri et al., AAMAS08) can provide the patrolling robot a larger expected utility than not considering it, i.e., it can lead to better strategies (Amigoni et al., IAT2008) • Model of the adversary can include: its preferences over the possible targets, its knowledge about the patroller’s strategy, … F. Amigoni, N. Basilico, N. Gatti DEI, Politecnico di Milano The Problem The problem we addressed in this work: finding the optimal randomized patrolling strategy in a arbitrary environment while considering a model of the adversary Our approach applies to environments with arbitrary topology generalizing (Agmon et al., ICRA08) € Agmon et al., ICRA08 F. Amigoni, N. Basilico, N. Gatti DEI, Politecnico di Milano The Basic Patrolling Model • Time is discrete • Environment: represented by a directed graph, e.g., a grid of cells or a topological map (Carpin et al., IROS08) • Single patrolling robot It can move between adjacent nodes It can detect a possible intruder in its current node • Single intruder It knows the strategy of the patrolling robot, for example because it can observe the patroller movements before attempting to intrude It can directly enter any node • Penetration time di is required to successfully complete an intrusion in a node i When attempting to penetrate in a node i at time t, the intruder can be detected during {t,t+1,…,t+ di} F. Amigoni, N. Basilico, N. Gatti DEI, Politecnico di Milano The Basic Patrolling Model 1 2 3 5 7 6€ 9 4 Final States • 8 12 10 The indruder enters node i at time t: • If the patroller does not visit cell i in the interval {t,t+1,…,t+ di} the intruder wins • Otherwise the intruder is captured and the patroller wins 13 P • The intruder never enters move(7) Utilities P … I … enter(1) I … I … P … F. Amigoni, N. Basilico, N. Gatti DEI, Politecnico di Milano P … 1 time unit • Xi ,Yi (i ∈ {1, 2, …, 13}) : patroller’s and intruder’s utilities when the intruder successfully attacks node i • X0 ,Y0 : patroller’s and intruder’s utilities when the intruder is captured Objective The proposed method finds the probability distribution over the patroller movements, i.e., given the current node, finding the probability of moving in each adjacent node € F. Amigoni, N. Basilico, N. Gatti DEI, Politecnico di Milano Solving the Game • Two competing actors: we study their behaviors in a game-theoretical framework • The patrolling problem can be modeled as a leader-follower game • Two players • The leader commits to a strategy • The follower observes such commitment and acts as a best responder • Patrolling strategy: A = {αi,j}, where αi,j is the probability of doing move(j) when i is the current node • The optimal A can be derived by computing the equilibrium of the leader-follower game resorting to a bilevel optimization problem (Conitzer and Sandholm, 2006) F. Amigoni, N. Basilico, N. Gatti DEI, Politecnico di Milano Solving Algorithm Step 1: is there any strategy A such that the game will never end? • Single bilinear feasibility problem • If a solution is found, it is the best patrolling strategy and the intruder will never attempt to enter If the above problem does not admit a solution, Step 2: • We safely assume that the game will end, i.e., the intruder will enter • We compute A such that the patroller’s expected payoff is maximum • This amounts to solve a bilinear optimization problem for every possible action of the intruder Game Model F. Amigoni, N. Basilico, N. Gatti DEI, Politecnico di Milano Solving algorithm Optimal patrolling strategy that maximizes patroller’s expected utility An Example X5 = 0.5 Y5 = 0.5 d5 = 7 X1 = 0.8 Y1 = 0.2 d1 = 7 0.774 0.451 X0.226 1 = 0.8 Y1 = 0.2 d1 = 7 0.549 0.344 0.127 0.529 0.676 0.096 0.228 X0 = 1 Y0 = -1 X0.102 5 = 0.5 Y5 = 0.3 d5 = 7 0.898 With this strategy the game never ends, i.e., the intruder will never enter F. Amigoni, N. Basilico, N. Gatti DEI, Politecnico di Milano Another Example X5 = 0.5 Y5 = 0.3 d5 = 4 X1 = 0.8 Y1 = 0.2 d1 = 5 1 0.546 0.546 0.546 X5 = 0.5 Y5 = 0.3 d5 = 4 X1 = 0.8 Y1 = 0.2 d1 = 5 0.454 X0 = 1 Y0 = -1 0.454 0.454 1 With this strategy the intruder will try to enter in cell 1 when the patroller is in cell 5, the expected utility of the patroller is 0.819 F. Amigoni, N. Basilico, N. Gatti DEI, Politecnico di Milano Model Extensions • • Augmented sensing capabilities: we introduce the range parameter X0 = 1 Y0 = -1 X4 = 0.8 Y4 = 0.4 X6 = 0.7 Y6 = 0.5 X12 = 0.8 Y12 = 0.4 expected utility 1 0.9 r=1 0.8 r=2 r=3 0.7 r=4 0.6 penetration time 1 3 5 7 F. Amigoni, N. Basilico, N. Gatti DEI, Politecnico di Milano 9 11 Synchronized multirobot setting: a single patroller able to sense an arbitrary subset of cells Conclusions and Future Works • We presented an approach to find optimal randomized patrolling strategies in arbitrary environments with adversaries • Future Works • Accounting for intruder’s movements and limited observation capabilities • Extending our framework with multiple non-synchronized patrollers F. Amigoni, N. Basilico, N. Gatti DEI, Politecnico di Milano