Lecture 17. Planning and cost estimation
• [S2001, Ch. 23]
• [GJM91, Sec. 8.2]
Cost components
Productivity measures (LOC, FP, OP)
Techniques for estimating software development costs
Algorithmic cost estimation
• the COCOMO'81 model
• the COCOMO 2 model
Slide 1

Project planning and cost estimates
The planning phase of a project describes:
• the activities into which the project is divided
• their logical and temporal interdependencies (sequential / parallel)
• the identification and allocation of manpower to the individual activities (staffing)
A detailed plan must also quantify the cost of each activity, broken down into:
• costs of acquiring/maintaining hardware/software
• travel and training costs
• costs of the personnel involved in the work (dominant)
Cost estimation is essential
• to allocate a project budget
• to set the price to propose to the Customer in the contract
Slide 2

The costs of the personnel developing the system include:
• salaries of software engineers and programmers
• overheads (a share proportional to the staff involved):
» heating, lighting, office space rental...
» secretarial services, personnel office, insurance, legal services, library, canteen, cleaning...
» network and telephone connections...
» pension fund contributions...
Overheads normally amount to one or two times the salaries
• an engineer who earns 100 million lire/year costs the organisation 200-300 million lire/year
Slide 3

Factors in software pricing
Factor: Description
• Market opportunity: A development organisation may quote a low price because it wishes to move into a new segment of the software market. Accepting a low profit on one project may give the opportunity of more profit later. The experience gained may allow new products to be developed.
• Cost estimate uncertainty: If an organisation is unsure of its cost estimate, it may increase its price by some contingency over and above its normal profit.
• Contractual terms: A customer may be willing to allow the developer to retain ownership of the source code and reuse it in other projects. The price charged may then be less than if the software source code is handed over to the customer.
• Requirements volatility: If the requirements are likely to change, an organisation may lower its price to win a contract. After the contract is awarded, high prices may be charged for changes to the requirements.
• Financial health: Developers in financial difficulty may lower their price to gain a contract. It is better to make a small profit or break even than to go out of business.
Slide 4

Programmer productivity and its impact on project costs
The software is measured according to some parameter, and the result is divided by the programmer time consumed
(-) Metrics based on quantity/time do not take the quality factor into account
Parameters considered:
• size measures
» lines of source code
» lines of object code
• functionality measures
» function points
A very crude model:
• Estimated lines of code of the system (LOC) / Productivity (LOC/person-month) = Required manpower (person-months)
• Required manpower (person-months) / Project duration (months) = Programming team size (# people)
• Required manpower (person-months) * Unit cost (salaries + overheads) (million lire/person-month) = Project cost (million lire)
• Project cost (million lire) + % = Price to the Customer (million lire)
Slide 5

Lines of code (LOC)
Classic measure: lines of source code per person-month, including analysis, design, validation, documentation. But...:
• From FORTRAN to C++ the notion of a line of code becomes more complicated
• Counting criteria differ (comments? data declarations? statements spanning several lines? macro expansion?)
• Are 'verbose' programmers more productive?
• Are programmers who use lower-level languages more productive?
• Is a programmer who reuses existing software less productive?
A 'less productive' programmer may produce code that is
• more reliable
• easier to understand
• easier to maintain and modify
Slide 6

Development times with high-level / low-level languages
[Figure: bars showing the Analysis, Design, Coding and Validation phases for a low-level language and for a high-level language]
Slide 7

Example
Time per phase:
• Assembly code: Analysis 3 weeks, Design 5 weeks, Coding 8 weeks, Testing 10 weeks, Documentation 2 weeks
• High-level language: Analysis 3 weeks, Design 5 weeks, Coding 4 weeks, Testing 6 weeks, Documentation 2 weeks
Size, effort and productivity:
• Assembly code: Size 5000 lines, Effort 28 weeks, Productivity 714 lines/month
• High-level language: Size 1500 lines, Effort 20 weeks, Productivity 300 lines/month
Slide 8

Typical productivity values in LOC
Real-time embedded systems:
• 40-160 LOC/P-month (programmer-month)
Systems programs:
• 150-400 LOC/P-month
Commercial applications:
• 200-800 LOC/P-month
In Extreme Programming (continuous evolution of the code) LOC is not very meaningful [Beck2000]
Slide 9

Function points (Albrecht '79)
A language-independent metric (but very subjective...), oriented towards data-processing systems.
(+) A first estimate is possible as soon as the interactions of the system with its environment have been defined, before detailed design
It is based on counting certain elements of the program
• external inputs and outputs
• user interactions
• external interfaces
• files used by the system
It is defined as a weighted sum of the number of occurrences of these elements
• the weights proposed by Albrecht range from 3 (for external inputs) to 15 (for internal files managed by the program)
• UFC (Unadjusted Function Count): $\mathrm{UFC} = \sum_i (\text{number of elements of type } i) \times \mathrm{weight}_i$
To obtain the final FC value, UFC is then adjusted according to project complexity parameters such as:
• the degree of distribution, reuse, performance, ...
Slide 10

Productivity: function points / P-month
For a given language, the average number of LOC per Function Point (AVC) can be defined from historical data.
Given an estimate of the Function Points of a new system S, the LOC size of S can be estimated:
• $\mathrm{LOC}(S) = \mathrm{FunctionPoints}(S) \times \mathrm{AVC}$
Typical AVC values:
• Assembly language: 200-300 LOC/FP
• Fourth-generation language: 20-40 LOC/FP
Slide 11

Object points
Object points are an alternative function-related measure to function points, used when 4GLs (defined on the next slide) are employed
Object points do not count object classes; they are obtained as a weighted estimate of
• The number of separate screens displayed (weights 1, 2, 3)
• The number of reports produced by the system (weights 2, 5, 8)
• The number of 3GL modules (high-level programming languages such as Java) needed to supplement the 4GL code (weight 10)
Easily estimated from the system specification (i.e. early)
Used in the COCOMO 2 estimation model (discussed later)
Slide 12

(4GL: fourth generation language)
An "application specific" language. The term was invented by Jim Martin to refer to non-procedural high level languages built around database systems.
The first three generations were developed fairly quickly, but it was still frustrating, slow, and error prone to program computers, leading to the first "programming crisis", in which the amount of work that might be assigned to programmers greatly exceeded the amount of programmer time available to do it.
Meanwhile, a lot of experience was gathered in certain areas, and it became clear that certain applications could be generalised by adding limited programming languages to them. Thus were born report-generator languages, which were fed a description of the data format and the report to generate and turned that into a COBOL (or other language) program which actually contained the commands to read and process the data and place the results on the page.
Some other successful 4th-generation languages are: database query languages, e.g. SQL; Focus, Metafont, PostScript, RPG-II, S, IDL-PV/WAVE, Gauss, Mathematica; and data-stream languages such as AVS, APE, Iris Explorer.
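
Returning to the function-point conversion of Slide 11 and the crude cost model of Slide 5, the two can be chained into a single estimate. The following is only a minimal sketch: the function names and every input value (400 function points, an AVC of 30 LOC/FP, 400 LOC/person-month, 6 months, 20 million lire per person-month, a 10% margin) are hypothetical and chosen for illustration, not taken from the lecture.

```python
def loc_from_function_points(function_points, avc):
    """LOC(S) = Function Points(S) * AVC (average LOC per function point)."""
    return function_points * avc


def crude_cost_model(loc, productivity, duration_months, unit_cost, margin):
    """Chain the divisions and multiplications of the crude model on Slide 5."""
    effort = loc / productivity           # person-months
    team_size = effort / duration_months  # people
    cost = effort * unit_cost             # same currency unit as unit_cost
    price = cost * (1 + margin)           # cost plus a percentage margin
    return effort, team_size, cost, price


# All inputs below are hypothetical, chosen only for illustration.
loc = loc_from_function_points(function_points=400, avc=30)
effort, team_size, cost, price = crude_cost_model(
    loc=loc,
    productivity=400,    # LOC per person-month
    duration_months=6,
    unit_cost=20,        # million lire per person-month, salary plus overheads
    margin=0.10,
)
print(f"size={loc:.0f} LOC, effort={effort:.1f} person-months, team={team_size:.1f}")
print(f"cost={cost:.0f} M lire, price={price:.0f} M lire")
```

Because every step is a plain ratio or product, an error in the size or productivity estimate propagates unchanged to the final price, which is one reason the model is called crude.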
Slide 13

Typical productivity values in Object Points
In object points, productivity has been measured between 4 and 50 object points/month depending on tool support and developer capability (Boehm '95)
Slide 14

Factors affecting productivity
Factor: Description
• Application domain experience: Knowledge of the application domain is essential for effective software development. Engineers who already understand a domain are likely to be the most productive.
• Process quality: The development process used can have a significant effect on productivity. This is covered in Chapter 31.
• Project size: The larger a project, the more time required for team communications. Less time is available for development so individual productivity is reduced.
• Technology support: Good support technology such as CASE tools, supportive configuration management systems, etc. can improve productivity.
• Working environment: As discussed in Chapter 28, a quiet working environment with private work areas contributes to improved productivity.
However, for the same code quality, one programmer can be up to 10 times more productive than another (Sackman '68)
Slide 15

Cost estimation techniques
Undermined by many sources of uncertainty:
• Requirements (evolving, also depending on the chosen development process)
• Personnel (variability of individual productivity)
• Technologies partly still to be defined, ...
The poor reliability of 'scientific' cost estimation techniques may lead to more 'political' estimates
• with Parkinson's law: the effort required is ... whatever is available:
» the customer wants the result in 12 months,
» (and the developer has 5 people available) ==>
» the estimated effort is ... 60 person-months...
• with the 'pricing to win' criterion:
» the estimated cost is ... equal to what the Customer can afford...
• with the fixed-cost criterion:
» instead of estimating the costs for a predefined objective
» the costs are fixed, and the objectives are scaled down (dynamically)
Slide 16

More 'serious' empirical criteria:
• Iterative assessment by a group of experts (in the domain and the technology), until they converge on a forecast
• Assessment by analogy with projects in the same sector
• Algorithmic cost modelling:
» costs are predicted by a formula/algorithm that makes them depend on attributes of the product (typically LOC), of the project and of the process
» the formula is derived from the observation of historical experimental data (attribute values, costs) for a corpus of previous projects.
For complex projects, Boehm suggests applying several cost prediction techniques until the results converge.
Slide 17

Algorithmic cost modelling
Empirical method, based on the analysis of historical data (completed projects) and the identification of a best-fit formula
Effort is estimated as a function of product, project and process attributes whose values are estimated by project managers
• $\mathrm{Effort} = A \times \mathrm{Size}^{B} \times M$
• A is an organisation-dependent constant (local practices, type of developed SW)
• B reflects the disproportionate effort for large projects (between 1 and 1.5)
• M is a multiplier reflecting product, process and people attributes
• Size is estimated in LOC, FP or OP
Most models are similar but use different values for A, B and M
Slide 18

The COCOMO model [Boehm '81]
COnstructive COst MOdel; developed at TRW, a US defense contractor. Versions in '81, '89, '95.
Provided with support tools, but 'independent' from software vendors...
Based on a cost database of more than 60 different projects
Exists in three stages
• 1. Basic - Gives an 'order of magnitude' estimate based on product attributes
• 2. Intermediate - Modifies the basic estimate using project and process attributes
• 3. Advanced - Estimates project phases and parts (subsystems) separately. Not discussed here
Slide 19

1. Basic COCOMO formula for project classes
simple
• small teams, familiar environment, well-understood applications, simple non-functional requirements (EASY)
• $\mathrm{PM} = 2.4 \times \mathrm{KDSI}^{1.05}$, $\mathrm{TDEV} = 2.5 \times \mathrm{PM}^{0.38}$
moderate
• Project team may have experience mixture, system may have more significant non-functional constraints, organization may have less familiarity with application (HARDER)
• $\mathrm{PM} = 3 \times \mathrm{KDSI}^{1.12}$, $\mathrm{TDEV} = 2.5 \times \mathrm{PM}^{0.35}$
embedded
• Hardware/software systems; tight constraints, including local regulations and operational procedures; unusual for team to have deep application experience (HARD)
• $\mathrm{PM} = 3.6 \times \mathrm{KDSI}^{1.2}$, $\mathrm{TDEV} = 2.5 \times \mathrm{PM}^{0.32}$
KDSI = thousands of Delivered Source Instructions (= source lines, excl. comments)
PM = Programmer Months ('Effort')
TDEV = Expected duration of project (Time)
Slide 20

Effort estimates
[Figure: effort in person-months (0 to 1000) versus size in KDSI (0 to 120), with one curve each for Embedded, Intermediate and Simple projects]
Slide 21

Examples of estimates in basic COCOMO
Simple project, 32 KDSI
• Effort: $\mathrm{PM} = 2.4 \times 32^{1.05} \approx 91$ person-months
• Duration: $\mathrm{TDEV} = 2.5 \times 91^{0.38} \approx 14$ months
• Number of people required: $N = 91 / 14 \approx 6.5$ persons
Embedded project, 128 KDSI
• Effort: $\mathrm{PM} = 3.6 \times 128^{1.2} \approx 1216$ person-months
• Duration: $\mathrm{TDEV} = 2.5 \times 1216^{0.32} \approx 24$ months
• Number of people required: $N = 1216 / 24 \approx 51$ persons
The number of people required (N) must then be compared with the number of people available
Slide 22

COCOMO assumptions
Implicit productivity estimate (but it still depends on size!)
• Simple mode = 16 LOC/day
• Embedded mode = 4 LOC/day
Time required is a function of total effort, NOT team size
Not clear how to adapt the model to personnel availability
Slide 23

2. Intermediate COCOMO
Takes basic COCOMO as the starting point
Identifies personnel, product, computer and project attributes which affect cost
Multiplies the basic COCOMO cost (required effort) by attribute multipliers which may increase or decrease costs
Multipliers are assigned values in the range [0.7, 1.66]
• multiplier < 1 implies reduced cost
Slide 24

Intermediate COCOMO attributes (--> multipliers)
Personnel attributes
• Analyst capability
• Programmer capability
• Programming language experience
• Application experience
Product attributes
• Reliability requirement
• Database size
• Product complexity
Computer attributes (i.e. constraints imposed on SW by the adopted HW)
• Execution time constraints
• Memory space constraints
Project attributes
• Modern programming practices
» structured programming, when COCOMO was defined;
» O-O programming today
• Software tools
• Required development schedule
» Mismatch between basic COCOMO and Client schedule gives attribute > 1
Model tuning - Each organization must identify its own attributes and associated multiplier values
A statistically significant database of detailed cost information is necessary
Slide 25

Example
Embedded software system on microcomputer hardware.
Basic COCOMO predicts a 45 person-month effort requirement
Attributes:
• RELY = 1.15, STOR = 1.21, TIME = 1.10, TOOL = 1.10
Intermediate COCOMO predicts
• $45 \times 1.15 \times 1.21 \times 1.10 \times 1.10 \approx 76$ person-months.
Total cost = 76 person-months at 7000 USD each = 532,000 USD
Slide 26

Management options for previous example
• A. Use existing hardware, development system and development team
• B. Processor and memory upgrade: hardware cost increase, experience decrease
• C. Memory upgrade only: hardware cost increase
• D. More experienced staff
• E. New development system: hardware cost increase, experience decrease
• F. Staff with hardware experience
Slide 27

Costs of alternatives based on varying COCOMO params.
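
The basic and intermediate COCOMO computations of Slides 20, 22 and 26 can be reproduced with a few lines of code, which also makes it easy to vary one multiplier at a time when comparing alternatives. This is a sketch rather than a reference implementation: the coefficients and multiplier values come from the slides above, while the function and table names are assumptions introduced here.

```python
import math

# (a, b, c) per project class: PM = a * KDSI**b, TDEV = 2.5 * PM**c
BASIC_COCOMO = {
    "simple": (2.4, 1.05, 0.38),
    "moderate": (3.0, 1.12, 0.35),
    "embedded": (3.6, 1.20, 0.32),
}


def basic_cocomo(kdsi, project_class):
    """Return (effort in person-months, duration in months) for basic COCOMO."""
    a, b, c = BASIC_COCOMO[project_class]
    pm = a * kdsi ** b
    tdev = 2.5 * pm ** c
    return pm, tdev


def intermediate_cocomo(basic_pm, multipliers):
    """Scale the basic effort by the product of the attribute multipliers."""
    return basic_pm * math.prod(multipliers.values())


# Slide 22 examples: simple project of 32 KDSI, embedded project of 128 KDSI.
for kdsi, project_class in [(32, "simple"), (128, "embedded")]:
    pm, tdev = basic_cocomo(kdsi, project_class)
    print(f"{project_class}: PM={pm:.0f}, TDEV={tdev:.0f}, N={pm / tdev:.1f}")

# Slide 26 example: 45 person-months adjusted by four multipliers.
adjusted = intermediate_cocomo(
    45, {"RELY": 1.15, "STOR": 1.21, "TIME": 1.10, "TOOL": 1.10}
)
# The slide rounds the effort to whole person-months before pricing it.
total_cost = round(adjusted) * 7000
print(f"adjusted effort={adjusted:.1f} person-months, cost={total_cost} USD")
```

Replacing a single entry of the multiplier dictionary and re-running the last step gives the effort for one alternative; the multiplier values for options B to F are not recoverable from the slide text, so none are assumed here.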
Slide 28

Calibration of the parameters of the COCOMO formulas
If the effort predicted with the standard COCOMO parameters deviates from the measured experimental values (the points on the diagram), the parameters are modified according to the least-squares criterion, yielding a new, optimal prediction curve
[Figure: effort versus size (0 to 100), showing the measured points, the predicted-effort curve and the curve fitted to measured effort]
Slide 29

Staffing requirements
Staff required can't be computed by dividing the effort (person-months) by the required schedule
The number of people working on a project varies depending on the phase of the project
The more people work on the project, the more total effort is usually required
Very rapid build-up of people often correlates with schedule slippage
Slide 30

COCOMO 2 levels
COCOMO 2 is a 3-level model that allows increasingly detailed estimates to be prepared as development progresses
L1. Early prototyping level
• Estimates are based on object points, and a simple formula is used for effort estimation
L2. Early design level
• Estimates are based on function points that are then translated to LOC
(L3. Post-architecture level
• Estimates are based on lines of source code)
Slide 31

L1. Early prototyping level
Supports prototyping projects and projects where there is extensive reuse
Based on standard estimates of developer productivity in object points/month
Takes CASE tool use into account
Formula is simplified:
• $\mathrm{PM} = (\mathrm{NOP} \times (1 - \%\mathrm{reuse}/100)) / \mathrm{PROD}$
• PM is the effort in person-months, NOP is the number of object points and PROD is the productivity (object points/month)
Slide 32

L2. Early design level
Estimates can be made after the requirements have been agreed
Based on the standard formula for algorithmic models
• $\mathrm{PM} = A \times \mathrm{Size}^{B} \times M + \mathrm{PM}_m$
• where:
» $M = \mathrm{PERS} \times \mathrm{RCPX} \times \mathrm{RUSE} \times \mathrm{PDIF} \times \mathrm{PREX} \times \mathrm{FCIL} \times \mathrm{SCED}$
» $\mathrm{PM}_m = (\mathrm{ASLOC} \times (\mathrm{AT}/100)) / \mathrm{ATPROD}$ reflects the amount of automatically generated code
• A = 2.5 in the initial calibration
• Size is in KSLOC, but derived from the estimated Function Points and an FP to KSLOC conversion table (programming-language-dependent)
• B varies from 1.1 to 1.24 depending on the novelty of the project, development flexibility, risk management approaches and the process maturity (overlap with M...)
Slide 33

Multipliers
Multipliers (values 1 to 6) reflect the capability of the developers, the non-functional requirements, the familiarity with the development platform, etc.
• PERS - personnel capability
• RCPX - product reliability and complexity
• RUSE - the reuse required
• PDIF - platform difficulty
• PREX - personnel experience
• FCIL - the team support facilities
• SCED - required schedule
Slide 34

The exponent term B
This depends on five scale factors, each rated from 5 (very low) to 0 (very high)
Their sum/100 is added to 1.01
Example
• Precedentedness - new project - 4
• Development flexibility - no client involvement - Very high - 1
• Architecture/risk resolution - No risk analysis - Very low - 5
• Team cohesion - new team - nominal - 3
• Process maturity - some control - nominal - 3
The exponent is therefore $B = 1.01 + 16/100 = 1.17$
Project duration: $\mathrm{TDEV} = 3 \times \mathrm{PM}^{(0.33 + 0.2 \times (B - 1.01))}$
• PM is the computed effort and B is the exponent computed as discussed above (B is 1 for the early prototyping model). This computation predicts the nominal schedule for the project
Slide 35
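
The COCOMO 2 formulas of Slides 32 and 35 can be sketched in code as follows. The scale-factor ratings are those of the example above; the function names, the object-point inputs (120 object points, 25% reuse, 10 object points/month) and the 100 person-month effort used for the schedule are hypothetical values introduced only for illustration.

```python
def early_prototyping_effort(nop, reuse_percent, prod):
    """PM = NOP * (1 - %reuse/100) / PROD, in person-months."""
    return nop * (1 - reuse_percent / 100) / prod


def exponent_b(scale_factors):
    """B = 1.01 + (sum of the five scale-factor ratings) / 100."""
    return 1.01 + sum(scale_factors.values()) / 100


def nominal_schedule(pm, b):
    """TDEV = 3 * PM ** (0.33 + 0.2 * (B - 1.01)), in months."""
    return 3 * pm ** (0.33 + 0.2 * (b - 1.01))


# Scale-factor ratings from the example on Slide 35.
ratings = {
    "precedentedness": 4,
    "development_flexibility": 1,
    "architecture_risk_resolution": 5,
    "team_cohesion": 3,
    "process_maturity": 3,
}
b = exponent_b(ratings)
print(f"B = {b:.2f}")

# Hypothetical inputs: 120 object points, 25% reuse, 10 object points/month.
print(f"prototyping effort = {early_prototyping_effort(120, 25, 10):.1f} person-months")

# Hypothetical effort of 100 person-months, used only to exercise the schedule formula.
print(f"nominal schedule = {nominal_schedule(100, b):.1f} months")
```

Note that a larger B lengthens the schedule only slightly: the exponent of the schedule formula grows by 0.2 for each unit of B, so the difference between the best and worst scale-factor ratings changes it by a few hundredths.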