RANSAC RANdom SAmple Consensus Prof. Mariolino De Cecco, Ing. Luca Baglivo, Ing. Nicolò Blasi, Ilya Afanasyev Department of Structural Mechanical Engineering, University of Trento Email: [email protected] http://www.mariolinodececco.altervista.org M. De Cecco - Robotics and Sensor Fusion • ANALISI DI REGRESSIONE • L’analisi di regressione consente di determinare i parametri di un modello in modo tale che “al meglio” interpreti i dati sperimentali mediante un legame algebrico ingresso-uscita. M. De Cecco - Robotics and Sensor Fusion Regressione di un modello • Scopo • Determinare i parametri ci, i=1,...m, in base alla ripetizione delle misure delle grandezze xi, i=1,...N, e delle corrispondenti uscite yi ed alla scelta del tipo di modello, minimizzando un certo indice di prestazione. M. De Cecco - Robotics and Sensor Fusion 3. regressione del modello • Nel metodo dei minimi quadrati l’indice di prestazione è costituito dalla somma dei quadrati degli scarti (anche detti residui): N c1 ,..., cm i2 i1 Essendo: i yi f c1,c2 , cm , x1 , x2 , xn i i L’insieme dei parametri si determina: c1 ,...,cm : N 2 min i ck i1 M. De Cecco - Robotics and Sensor Fusion 3. regressione del modello i Caso lineare - Calcolo retta ai minimi quadrati: i yi a xi b Si noti che i residui risultano essere lineari in funzione dei parametri da determinare Dunque anche il fitting con un polinomio qualsiasi risulta risolvibile in maniera analoga y i y ax b x N Dati sperimentali M. De Cecco - Robotics and Sensor Fusion 3. regressione del modello parametri (a,b) • Posizione del problema N 2 min a,b min yi a xi b a,b a,b i1 (La soluzione ha un solo minimo) a,b 0 a a,b 0 b M. De Cecco - Robotics and Sensor Fusion 3. regressione del modello • Soluzione a C xy C xx b y ax N N C xx xi x xi nx 2 2 Dove: N 1 y yi N i1 N 1 x xi N i1 2 yi axi b y i1 N 2 i1 N 2 i1 N C xy xi x yi y xi yi nx y i1 a N i1 y C xx N M. De Cecco - Robotics and Sensor Fusion 3. regressione del modello b y xi 2 i1 NC xx What happens if we have outliers ? outlier Fitted line outlier M. De Cecco - Robotics and Sensor Fusion Fitting with outliers Let’s try to formalize a more general way of fitting data with outliers. The problem can be stated in the following way: given a set of 2D data points, find the line which minimizes the sum of squared perpendicular distances (orthogonal regression), subject to the condition that none of the valid points deviates from this line by more than a threshold t This is actually two problems: 1. classification of the data into inliers (valid points) and outliers. The threshold t is set according to the measurement noise (for example t = 3), 2. line fit to the data M. De Cecco - Robotics and Sensor Fusion Fitting with outliers The key idea is very simple: repeat N times the following: • select randomly two points • compute the connecting line • the support for this line is measured by the number of points that lie within the distance threshold t the line with most support is deemed the robust fit The points within the threshold distance are the inliers (and constitute the consensus set). M. De Cecco - Robotics and Sensor Fusion Fitting with outliers The intuition is that if one of the points is an outlier then the line will not gain much support M. De Cecco - Robotics and Sensor Fusion Fitting with outliers Scoring a line by its support has the additional advantage of favouring better fits. The line (a, b) has a support of 10, whereas the line (a, d), where the sample points are neighbours, has a support of only 4. Consequently, even though both samples contain no outliers, the line (a. b) will be selected M. De Cecco - Robotics and Sensor Fusion Fitting with outliers As stated by Fischler and Bolles [Fischler, 1981]: "The RANSAC procedure is opposite to that of conventional smoothing techniques: Rather than using as much of the data as possible to obtain an initial solution and then attempting to eliminate the invalid data points (like for example using the Chauvenet criteria), RANSAC uses as small an initial data set as feasible and enlarges this set with consistent data when possible". M. De Cecco - Robotics and Sensor Fusion Fitting with outliers Objective Robust fit of a model to a data set S which contains outliers. Algorithm Randomly select a sample of s data points from S and instantiate the model from this subset Determine the set of data points Si which are within a distance threshold t of the model. The set Si, is the consensus set of the sample and defines the inliers of S If the size of Si (the number of inliers) is greater than some threshold T, reestimate the model using all the points in Si and terminate (iv) If the size of Si is less than T, select a new subset and repeat the above, (v) After N trials the largest consensus set Si is selected, and the model is re-estimated using all the points in the subset Si M. De Cecco - Robotics and Sensor Fusion Algorithm { This to verify if the sample is able to provide a reliable fitting Select a sample of s data Verify the sample data Use the s data to estimate the model Determine the consensus set Si and its cardinality ns no This to ensure a Maximum Likelikood solution { ns >T yes Use Si to estimate the model Determine the new consensus set Si and the cardinality nsnew no nnew = nold yes M. De Cecco - Robotics and Sensor Fusion Algorithm revised Stop It is often computationally infeasible and unnecessary to try every possible sample. Instead the number of samples N is chosen sufficiently high to ensure with a probability, p, that at least one of the random samples of s points is free from outliers. Usually p is chosen at 0.99 N Log1 p Log 1 1 S The number N of samples required to ensure, with a probability p = 0.99, that at least one sample has no outliers for a given size of sample S, and proportion of outliers M. De Cecco - Robotics and Sensor Fusion Number of samples For a given and p the number of samples increases with the size of the minimal subset It might be thought that it would be advantageous to use more than the minimal subset, three or more points in the case of a line, because then a better estimate of the line would be obtained, and the measured support would more accurately reflect the true support. However, this possible advantage in measuring support is generally outweighed by the severe increase in computational cost incurred by the increase in the number of samples. How large is an acceptable consensus set? A rule of thumb is to terminate if the size of the consensus set is similar to the number of inliers believed to be in the data set, given the assumed proportion of outliers, i.e. for n data points T < (1 - ) n M. De Cecco - Robotics and Sensor Fusion Number of samples - consensus set An advantage of RANSAC is its ability to achieve robust estimation of the model parameters, i.e., it can estimate the parameters with a high degree of accuracy even when significant amount of outliers are present in the data set. A disadvantage of RANSAC is that there is no upper bound on the time it takes to compute these parameters. When an upper time bound is used (a maximum number of iterations) the solution obtained may not be the optimal one, it may not even be one that fits the data in a good way. A reasonable model can be produced by RANSAC only with a certain probability, a probability that becomes larger the more iterations that are used Another disadvantage of RANSAC is that it requires the setting of problem-specific thresholds. M. De Cecco - Robotics and Sensor Fusion RANSAC - conclusions Suppose we acquired the 3D cloud of points of a cube: QuickTime™ and a decompressor are needed to see this picture. M. De Cecco - Robotics and Sensor Fusion Example - plane fitting close all clear all %% CARICHIAMO E PLOTTIAMO IL SET DI DATI load('Points.mat'); all_point = Points(: , 1:3); all_point(: , 1) = -all_point(: , 1); % convenzione Kinect ? % Plot all points figure('Name','All points') hold on axis equal plot3(Points(:,1), Points(:,2), Points(:,3), '.' ) %% FITTIAMO CON RANSAC IL PIANO % decimiamo i punti ss = length(all_point); ii = 1 : 10 : ss; all_pointDec = all_point(ii , :); % FIT RANSAC [B, P, inliers] = ransacfitplane(all_pointDec', 0.01, 1); % Plot risultato plotPlane3D( all_point' , B ) M. De Cecco - Robotics and Sensor Fusion Example - main function % RANSACFITPLANE - fits plane to 3D array of points using RANSAC % Usage [B, P, inliers] = ransacfitplane(XYZ, t, feedback) % This function uses the RANSAC algorithm to robustly fit a plane % to a set of 3D data points. % % Arguments: % XYZ - 3xNpts array of xyz coordinates to fit plane to. % t - The distance threshold between data point and the plane % used to decide whether a point is an inlier or not. % feedback - Optional flag 0 or 1 to turn on RANSAC feedback % information. % % Returns: % B - 4x1 array of plane coefficients in the form % b(1)*X + b(2)*Y +b(3)*Z + b(4) = 0 % The magnitude of B is 1. % This plane is obtained by a least squares fit to all the % points that were considered to be inliers, hence this % plane will be slightly different to that defined by P below. % P - The three points in the data set that were found to % define a plane having the most number of inliers. % The three columns of P defining the three points. % inliers - The indices of the points that were considered % inliers to the fitted plane. % Copyright (c) 2003-2008 Peter Kovesi M. De Cecco - Robotics and Sensor Fusion Example - RANSACFITPLANE s = 3; % Minimum No of points needed to fit a plane. fittingfn = @defineplane; distfn = @planeptdist; degenfn = @isdegenerate; [P, inliers] = ransac(XYZ, fittingfn, distfn, degenfn, s, t, feedback); % Perform least squares fit by means of the inlying points B = fitplane(XYZ(:,inliers)); %-----------------------------------------------------------------------% Function to define a plane given 3 data points as required by % RANSAC. In our case we use the 3 points directly to define the plane. function P = defineplane(X); P = X; M. De Cecco - Robotics and Sensor Fusion Example - RANSACFITPLANE % Function to calculate distances between a plane and a an array of points in the X matrix where % each column represents the x, y, z coordinates (X: 3 x Npts array of xyz coordinates) % The plane is defined by a 3x3 matrix, P. The three columns of P defining % three points that are within the plane. function [inliers, P] = planeptdist(P, X, t) n = cross(P(:,2)-P(:,1), P(:,3)-P(:,1)); % Plane normal. n = n/norm(n); % Make it a unit vector. npts = length(X); d = zeros(npts,1); % d will be an array of distance values. % The perpendicular distance from the plane to each point: % d = n’ * ( X - P(:,1) ) [1 x Npts] for i=1:3 d = d + ( X(i,:)’ - P(i,1) ) * n(i) ; end inliers = find(abs(d) < t); % Function to determine whether a set of 3 points are in a degenerate % configuration for fitting a plane as required by RANSAC. In this case % they are degenerate if they are colinear. function r = isdegenerate(X) % The three columns of X are the coords of the 3 points. r = iscolinear(X(:,1),X(:,2),X(:,3)); M. De Cecco - Robotics and Sensor Fusion Example - RANSACFITPLANE At the end we obtain the following fitted ‘floor’ plane QuickTime™ and a decompressor are needed to see this picture. M. De Cecco - Robotics and Sensor Fusion Example - OUTCOME Ideas for homework 1. To use the camera view to assert consensus. Colour or curvature information could be used 2. To use the 3D and colour information to assert consensus 3. To modify the consensus criteria simulating the laser view of the fitted object M. De Cecco - Robotics and Sensor Fusion Ideas for homework …. M. De Cecco - Robotics and Sensor Fusion ….