MIRIAM VOGHERA

NOUNS AND VERBS IN SPEAKING AND WRITING

1. INTRODUCTION

This paper presents a quantitative investigation of the role of nouns and verbs in spoken and written Italian texts. The aim is to explore whether, and to what extent, differences in the frequency of nouns and verbs characterize spoken and written texts and/or differences among registers. The distinction between spoken and written texts is naturally intertwined with register variation. However, it would be desirable to separate modality constraints from situational constraints, and to evaluate more accurately the weight of the different variables (Voghera 1992; Voghera in press b). Therefore, data from ten different kinds of texts are analysed here, and the Italian data are compared with data from other languages to provide some tentative explanations of the role played by nouns and verbs in spoken and written texts.

In fact, there are many signs that a different distribution of nouns and verbs could correspond to a different salience of nouns and verbs in speaking and writing. Studies on register variation have shown that spoken and written texts diverge in the use of nouns and verbs, and that different patterns of noun and verb usage can be a sign of different discourse strategies (Halliday 1985). In his multifactorial analysis of oral and written English registers, Biber (1988) finds that a high frequency of nouns is associated with written information-focused texts and a low frequency of nouns is associated with oral discourse. These findings are basically confirmed by Biber (1995) in a cross-linguistic register analysis of spoken and written texts of English, Tuvaluan, Korean and Somali. Miller and Weinert (1998) have shown that NPs are much more complex in written than in spoken texts in German, Russian and English. Spoken texts, by contrast, exhibit fewer and simpler NPs.
From these studies, we draw the general suggestion that spoken and written texts show different frequency patterns of nouns and verbs, and that these patterns reflect different textual strategies.

Other sources of evidence on the different salience of nouns and verbs in speaking and writing come from psycholinguistic, neuropsychological and neuroimaging studies. It has been found that patients with acquired linguistic deficits can be selectively impaired in processing nouns or verbs, and that the impairment can be restricted to one modality of output (speaking or writing) or of input (listening or reading). There are patients who have difficulty in producing or comprehending nouns or verbs in written, but not in spoken contexts, and patients who behave in the opposite way (Hillis and Caramazza 1995; Rapp and Caramazza 1997). In other words, an impairment can affect only nouns or only verbs, and it can affect speaking but not writing, or vice versa. These results can be interpreted as indicating that grammatical category information is represented redundantly in the written and spoken modalities (Laudanna 2000).

2. METHODOLOGY AND DATA

I have collected data on the distribution of nouns and verbs in ten different kinds of spoken and written texts. The spoken data come from the Lessico di frequenza dell’italiano parlato (LIP) (De Mauro et al. 1993), a frequency dictionary of spoken Italian. The LIP is based on a corpus of about 500,000 word tokens representing five different types of texts: a) face-to-face conversations; b) telephone conversations; c) non-free dialogical interactions; d) monologues (lectures, sermons…); e) radio and TV programmes. The third register requires some explanation.
It has been decided to include in the same group texts in which the dialogical interaction does not proceed freely, but is guided by one of the speakers: typical texts are interviews, classroom interactions, and debates. Each register is represented by samples of texts amounting to approximately 100,000 word tokens; the texts were recorded in four different cities to represent the diatopic variety of Italian [1].

The written data come from the Lessico di frequenza della lingua italiana contemporanea (LIF) (Bortolini et al. 1971), a frequency dictionary of written Italian [2]. The LIF is based on a corpus of about 500,000 word tokens representing five different types of texts: a) newspapers and magazines; b) textbooks for primary school; c) novels; d) theatre plays; e) movie scripts.

Before presenting the data, some preliminary terminological clarification is needed. I will use the following terms: 1) word type to refer to the lexical item to which every word is assigned; 2) word form to refer to the different inflectional forms which a word type can assume; 3) word token to refer to each occurrence of a word type. An example is given below:

(I) I    bambini   vanno  a   scuola.
    The  children  go     to  school

In sentence (I) the word bambini is a token of the word type BAMBINO [3], which occurs in its plural form bambini. Both LIP and LIF provide frequency lists with counts of word types, word forms, and word tokens. A category tag is assigned to each word type. The frequency lists are ranked by frequency of usage, that is the frequency of tokens of a word type divided by the dispersion (Mancini 1993). From now on, I will refer to frequency of usage simply as frequency.

[1] Italian presents wide and deep regional differences, which however can be ignored for the purposes of this paper.
[2] It should be noted that the LIF was not specifically conceived to represent written Italian. Therefore, it includes texts, such as theatre plays and movie scripts, meant to represent the variety of contemporary Italian as a whole. However, it is the only corpus of written Italian comparable to the LIP, as far as size and internal variety are concerned.
[3] It is clear, but not relevant here, that lexical items can have different citation forms in different languages.

I extracted the frequency of nouns and verbs included in the first 2000 word types of both frequency dictionaries. This frequency range represents the so-called basic vocabulary (vocabolario fondamentale), which is supposed to be the core of lexical knowledge. Summing the two corpora, the word tokens corresponding to the first 2000 word types are 905,551 out of 1,000,000, and they cover 91.1% of the LIP corpus and 90.7% of the LIF corpus. I assume that noun and verb frequency in the basic spoken and written vocabulary is representative of the general trend in speaking and writing.

3. NOUNS AND VERBS IN SPOKEN AND WRITTEN CORPORA

Table 1 illustrates the proportion of noun and verb types among the first 2000 types in the LIP and LIF corpora.

WORD TYPES    LIP    LIF
Nouns         44%    45%
Verbs         21%    28%

Table 1. Number of noun and verb types over the first 2000 types of LIP and LIF corpora.

There is an unexpected similarity in the distribution of noun and verb types in the spoken and written corpora. The only noticeable difference at this level is the larger number of verbs in the LIF, which should probably be attributed to the general richness of the vocabulary of written registers (Plag et al. 1999). In both corpora nouns are much more numerous than verbs. The greater quantity of nouns compared to verbs is well known, and it is also known that the distance between the numbers of noun and verb types increases if we take into account not only the first 2000 word types, but the entire vocabulary (Voghera in press a).
In the Grande dizionario italiano dell’uso (De Mauro 1999), the largest Italian dictionary, nouns are about 58% of all the entries and verbs about 13% [4]. The same proportion of noun and verb types has been found in spoken and written French (Gougenheim et al. 1964; Greidanus 1990), and strikingly similar percentages are found in a frequency dictionary of spoken German, based on a corpus of 500,000 word tokens (Ruoff 1990), as we can see in table 2.

WORD TYPES    HWB
Nouns         59.95%
Verbs         28.16%

Table 2. Number of noun and verb types in Häufigkeitswörterbuch gesprochener Sprache (Ruoff 1990) [5].

[4] The count of nouns includes the adjectives which in the Gradit are registered both as adjectives and as nouns.
[5] The data in tables 2 and 4 refer to the entire corpus.

The figures reported reflect a basic similarity in the structures of spoken and written lexicons, as far as the quantity of nouns and verbs is concerned. However, this is not sufficient to understand whether nouns and verbs are used differently in spoken and written texts. We must make a distinction between the number of types and the frequency of word tokens: the former reflects the structure of the lexicon; the latter reflects the process of textual construction. They represent two different linguistic measures, which might not be directly proportional. This means that we can have a basic similarity in the number of noun and verb types, but a different distribution of noun and verb tokens, i.e. a different usage of nouns and verbs in structuring the message. The divergence between the count of word types and the count of word tokens becomes clear when we compare the figures in table 1 with those in table 3.

WORD TOKENS    LIP      LIF
Nouns          12.3%    14.2%
Verbs          18.2%    18.5%

Table 3. Frequency of noun and verb tokens over the first 2000 word types of LIP and LIF corpora.

The distribution of noun and verb tokens shows an opposite tendency compared with that of types.
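The divergence between types and tokens can be made concrete with a small calculation. The sketch below is only illustrative: it assumes that the values of tables 1 and 3 pair up as LIP (nouns, verbs) followed by LIF (nouns, verbs), and the function names are hypothetical. Dividing the token share of a category by its type share gives an index that, within one corpus, is proportional to how often an average type of that category recurs.

```python
# Percentage shares for the first 2000 word types (tables 1 and 3).
# Assumption: the flattened table values are read as
# LIP (nouns, verbs) followed by LIF (nouns, verbs).
TYPE_SHARE = {
    "LIP": {"nouns": 44.0, "verbs": 21.0},
    "LIF": {"nouns": 45.0, "verbs": 28.0},
}
TOKEN_SHARE = {
    "LIP": {"nouns": 12.3, "verbs": 18.2},
    "LIF": {"nouns": 14.2, "verbs": 18.5},
}


def tokens_per_type_index(corpus: str, category: str) -> float:
    """Token share divided by type share for one category in one corpus."""
    return TOKEN_SHARE[corpus][category] / TYPE_SHARE[corpus][category]


def verb_to_noun_load(corpus: str) -> float:
    """How many times more often an average verb type recurs than an average noun type."""
    return tokens_per_type_index(corpus, "verbs") / tokens_per_type_index(corpus, "nouns")


for corpus in ("LIP", "LIF"):
    print(
        corpus,
        f"nouns={tokens_per_type_index(corpus, 'nouns'):.2f}",
        f"verbs={tokens_per_type_index(corpus, 'verbs'):.2f}",
        f"verb/noun load={verb_to_noun_load(corpus):.2f}",
    )
```

Under this reading of the tables, an average verb type recurs roughly three times as often as an average noun type in the LIP, and roughly twice as often in the LIF.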
Verbs are used more frequently than nouns, that is each noun type occurs much less frequently than each verb type. Spoken and written corpora show almost the same percentage of verb tokens, while noun tokens are more frequent in written texts. The corpus of spoken German shows the same divergence between the distribution of noun and verb types and that of noun and verb tokens, as we can see in table 4.

WORD TOKENS    HWB
Nouns          10.81%
Verbs          21.19%

Table 4. Number of noun and verb tokens in Häufigkeitswörterbuch gesprochener Sprache (Ruoff 1990).

It is interesting to observe that while the sum of noun and verb types in Italian and German corresponds respectively to 65% and 88% of all the word types, the sum of noun and verb tokens corresponds to about 30% of all the tokens in both lexicons. This means that the lexicon presents a great variety of noun and verb types, but most of the words in our discourses are not nouns and verbs.

The distance between the counts of types and tokens becomes greater if we examine the counts in the ten different registers.

LIP CORPUS                    NOUN TOKENS    VERB TOKENS
Face-to-face conversations    10.4%          19.1%
Telephone conversations       8.8%           20.8%
Radio and TV                  12.1%          17.1%
Non-free dialogues            14%            16%
Monologues                    14.7%          15.1%

Table 5. Frequency of noun and verb tokens over the first 2000 word types in the five registers of LIP corpus.

As we can see in table 5, in all the spoken registers verbs are used more than nouns. The variation in the frequencies does not seem fortuitous, but rather depends on the different production/reception circumstances. The distribution of nouns and verbs in the five spoken registers is such that verbs tend to decrease as we pass from dialogical to more monological texts, and nouns tend to decrease as we pass from monological to dialogical texts. Thus, nouns and verbs vary their frequencies according to the amount of dialogue.
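The register pattern in table 5 can be summarized with a verb-to-noun ratio for each register. The following is a minimal sketch, not part of the original analysis: the dictionary and function names are hypothetical, and the values are the table 5 percentages read in the order in which the registers are listed.

```python
# (noun tokens %, verb tokens %) for the five LIP registers, from table 5.
# Assumption: the percentages follow the order of the register labels.
LIP_REGISTERS = {
    "Face-to-face conversations": (10.4, 19.1),
    "Telephone conversations": (8.8, 20.8),
    "Radio and TV": (12.1, 17.1),
    "Non-free dialogues": (14.0, 16.0),
    "Monologues": (14.7, 15.1),
}


def verb_noun_ratios(registers: dict[str, tuple[float, float]]) -> dict[str, float]:
    """Return the verb/noun token ratio for every register."""
    return {name: verbs / nouns for name, (nouns, verbs) in registers.items()}


ratios = verb_noun_ratios(LIP_REGISTERS)

# List the registers from the most verb-heavy to the most balanced.
for name, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:28s} {ratio:.2f}")
```

With these figures the two conversational registers have ratios of about 1.8 and 2.4, while the other three registers fall between about 1.0 and 1.4.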
However, the distribution of nouns and verbs does not change gradually, but rather marks a division between conversations and the other kinds of texts. In both face-to-face and telephone conversations verbs are about twice as many as nouns, while in the other registers the divergence between noun and verb frequencies is much reduced. The difference cannot be attributed only to the amount of dialogue, because both interviews and radio and TV programmes are dialogical. The factor which better explains the difference between conversations and the other registers is the degree of text planning. Conversations represent on-line production, i.e. a totally unplanned type of text, while interviews, radio and TV programmes, sermons or lectures imply a certain degree of linguistic planning. In table 6 the five types of texts are ordered on the basis of two criteria: a) amount of dialogue and b) amount of planning.

                              Dialogue    Planning
Face-to-face conversations    +           -
Telephone conversations       +           -
Non-free dialogues            +           +/-
Radio and TV programmes       +/-         +/-
Monologues                    -           +

Table 6. Spoken types of texts ordered on the basis of the criteria: a) amount of dialogue and b) amount of planning.

Conversations and monologues occupy the two extremes; the position of the other two registers is less definite. The non-free dialogical interactions can cover different degrees of planning: classroom interactions can be quite spontaneous, but interviews and debates are relatively planned texts. Radio and TV programmes can include both dialogical and monological parts, and carefully planned discourses together with totally spontaneous conversations, for instance in talk shows. Thus, from these data we can conclude that both factors affect the distribution of nouns and verbs, but it is not possible to establish which of the two is more relevant.

Similar differences among registers can be observed in the distribution of noun and verb tokens in written registers.
LIF CORPUS                  NOUN TOKENS    VERB TOKENS
Newspapers and magazines    14.5%          13.8%
Textbooks                   16.0%          13.6%
Novels                      14.0%          16.8%
Theatre plays               12.8%          21.5%
Movie scripts               11.8%          23.7%

Table 7. Frequency of noun and verb tokens over the first 2000 word types in the five registers of LIF corpus.

In table 7, we note a sort of division between newspapers, magazines and textbooks, on the one hand, and novels, theatre plays, and movie scripts on the other. In the former nouns are more frequent than verbs; in the latter verbs are more frequent than nouns, as in spoken texts. Since theatre plays and movie scripts mimic spontaneous dialogues, it is not surprising that their distribution of nouns and verbs is very similar to that of conversation: verbs are nearly twice as many as nouns. Less obvious is the distribution of nouns and verbs in the novels. Since the novels included in the corpus are by different authors, it is unlikely that the distribution of nouns and verbs depends on similar styles [6]. I rather propose to attribute the difference between novels, on the one hand, and periodicals and textbooks, on the other, to two other factors. First of all, novels, unlike informative prose, can have dialogical parts, which presumably present a distribution of nouns and verbs analogous to that of spoken registers. Secondly, novels have long narrative parts in which the progression of the text leads to chains of verbal clauses, and this is crucial in determining a higher frequency of verbs. By contrast, it has been found that informative prose tends to pack the information into a smaller number of clauses than spoken and narrative texts (Voghera in press a). This tendency leads to the use of heavier nominal phrases, i.e. a higher number of nouns and nominalizations per clause. The difference between the two types of texts can be seen in (1) and (2).
(1) Sembra_V / che l’invenzione_N degli scacchi_N sia legata_V ad un fatto_N di sangue_N. / Narra_V infatti una leggenda_N / che quando il gioco_N fu presentato_V per la prima volta_N a corte_N / il sultano_N volle premiare_V l’oscuro inventore_N / esaudendo_V ogni suo desiderio_N.
P. Maurensig, La variante di Lüneburg, Adelphi, 1993 [7].

(2) Oltre 150 pagine_N, divise_V in 20 schede_N tecniche, dal mercato_N del lavoro_N al fisco_N, dal Sud_N alle pensioni_N. / In ciascuna scheda_N un’analisi_N dettagliata delle situazioni_N e le azioni_N / che la Confindustria_N sollecita_V / per restituire_V competitività_N al paese_N. /
“D’Amato, offensiva sul lavoro”, La Stampa, 7/2/2001 [8].

[6] For the list of novels, as well as for those of all the other subcorpora, see LIF: XIX-XX.
[7] It seems that the invention of chess is drenched in blood. A legend tells that when the game was presented at court for the first time, the sultan wanted to reward the obscure inventor by granting him whatever he wished. (Translation M.V.)

In (1), the first lines of a novel, there are 10 nouns and 6 verbs distributed over 6 clauses, that is there is an average of 1.6 nouns per clause. In (2), the first lines of a newspaper article, there are 14 nouns and 3 verbs distributed over only 4 clauses, that is there is an average of 4.6 nouns per clause. The high number of nouns and nominalizations in (2) makes it possible to pack a great amount of information into a small number of clauses [9].

The relevance of the distinction between informative and non-informative prose in determining different patterns in the distribution of nouns and verbs is confirmed by data extracted from other Italian and English corpora. Table 8 shows the distribution of nouns and verbs in three corpora of Italian informative texts (Voghera in press a).

INFORMATIVE PROSE    NOUNS    VERBS
DR                   21.7%    10.4%
VELI                 20.2%    15.5%
DP                   27.8%    17.3%

Table 8. Frequency of noun and verb tokens in three corpora of Italian informative prose.
[8] More than 150 pages, grouped into 20 technical files, from the labour market to taxation, from the South to pensions. In each file a detailed analysis of the issues and of the actions the Industry Confederation is urging the government to take to revamp the country's competitiveness.
[9] For a general description of the usage of nominalization in contemporary Italian, see Policarpi, Rombi 1985.

The DR, Dizionario di riferimento, is based on a corpus of 3.8 million word tokens collected by IBM (Mancini 1993), and the VELI, Vocabolario elettronico della lingua italiana, is based on a corpus of 26.2 million word tokens. The VELI data have been extracted by Rizzi (1994). The DP data come from a frequency dictionary based on four years of Due Parole, a monthly periodical for people with comprehension deficits or a low educational level. DP is based on a corpus of 176,060 word tokens (Piemontese 1996). DR and VELI include newspapers and press agency texts, and DP includes informative/news articles. In all three corpora nouns are much more frequent than verbs, confirming that a higher number of nouns is a typical feature of informative prose. Moreover, it is interesting to note that even DP articles, which are written to be easily understood, display more nouns than verbs. These data support the hypothesis that the number of nouns is not necessarily a sign of complexity, but rather characterizes texts in which much information is compressed into few clauses.

The Italian data are consistent with English data. In the Lancaster-Oslo/Bergen Corpus (LOB) of written English, verbs are used more in imaginative prose than in informative prose, as we see in table 9. According to Johansson and Hofland (1989: 16), the amount of dialogue is responsible for the different behaviour of the two groups of texts [10].

LOB CORPUS           NOUNS    VERBS
Informative prose    26.9%    16.4%
Imaginative prose    20.0%    21.9%

Table 9. Frequency of noun and verb tokens in the English LOB corpus.
The correlation between noun usage and informative texts is clearly stated in Biber (1995). In all four languages analysed (English, Somali, Tuvaluan and Korean), he finds that frequent nouns, derived nominals and nominalizations are features used “to package extensive information into relatively few words” (Biber 1995: 259), and therefore are mainly used in monologues and information-focused production. By contrast, interactive, interpersonally focused on-line production shows a negative correlation with these ‘nominal’ features.

[10] Informative prose includes newspaper articles, review articles, biographies, essays, government documents, learned and scientific writings. Imaginative prose includes general fiction, mystery, detective and science fiction, novels and short stories; see Johansson and Hofland (1989: 3-6).

4. CONCLUSION

To summarize, it has been found that: a) spoken and written Italian texts diverge in noun and verb distribution; b) the major difference between the two modalities has been registered in the frequency of noun tokens; c) spontaneous spoken texts consistently show a lower frequency of nouns, while written texts may show greater internal variety. However, differences in noun and verb distribution do not produce a single partition between all spoken and all written texts. Using the terms proposed by Nencioni (1983) in a classic essay on the difference between spoken and written Italian, the clearest partition is between the parlato-parlato, i.e. conversations, and the scritto-scritto, i.e. formal informative written texts, as we can see in tables 10 and 11.

PARLATO-PARLATO    NOUNS    VERBS
LIP Corpus         11.3%    20.5%

Table 10. Frequency of noun and verb tokens over the first 2000 word types in conversations.

SCRITTO-SCRITTO    NOUNS    VERBS
LIF Corpus         19.7%    16.5%

Table 11. Frequency of noun and verb tokens over the first 2000 word types in monological formal texts.

Two important differences can be noted.
Parlato-parlato and scritto-scritto diverge in the internal distribution of nouns and verbs: in the former verbs are nearly twice as many as nouns, while in the latter nouns are used more frequently than verbs. Besides, the frequency of nouns in the two groups is markedly different: in the scritto-scritto nouns are nearly twice as many as in the parlato-parlato. Thus, the typical spoken texts and the typical written texts diverge both in the absolute frequency of nouns and in the relationship between noun and verb frequency.

These findings can be connected to different explanations. First of all, I think it is necessary to take into account the ideational and productive/receptive processes in the two modalities. Spoken language exhibits an intrinsic functional discontinuity. There is an apparent paradox here: the on-line production of speech is physically continuous, but it actually produces texts which are deeply discontinuous. This is true because dialogue is the primary model of speech, and dialogue is by definition fragmented: interruptions, project changes, overlapping of the speakers, and insertions by the receiver are normal features of spontaneous dialogues. In contrast, writing is a physically discontinuous process, which normally produces continuous texts [11]. This difference between speaking and writing determines different possibilities of planning and structuring the message.

A spoken text is the result of a multiparty activity to which both speaker and receiver contribute, so the speaker knows in advance that her/his utterance can be interrupted, and that the initial textual strategy can be dramatically altered (Anderson 1995). To reduce the potential disruption of communication that an interruption can cause, we tend to produce short portions of text at a time, which can easily be controlled and recalled.
The need to recall portions of text without the support of an external memory determines another typical feature of spoken utterances: they tend to reproduce the sequence of events, structuring information in serial patterns. Thus, the quantity of information develops through an additive process, which can easily be reconstructed, even in case of project changes or interruptions. Syntactically, this means short clauses that are not hierarchically structured, but rather adjoined to one another. A serial structure allows both speaker and hearer to proceed step by step without overloading memory, and reduces the potential loss of information. By contrast, a hierarchical structure needs complex planning and, above all, a long-term calculation that is not practical in speaking.

[11] We are here speaking of typical spoken and typical written texts. We can obviously have spoken texts, such as lectures or formal speeches, and written texts, such as notes and family messages, which do not display these features.

Short clauses often imply the absence of nominal constituents or very simple nominal constituents. It has been noted that in spoken sentences verb valences can be saturated by pronouns or simple NPs, because the semantic and syntactic relations can easily be reconstructed by appealing to contextual cues (Miller and Weinert 1998). This is one of the most important factors in determining a lower frequency of nouns. In speaking we need fewer nouns because we do not need to refer explicitly to elements which are present in the situational context or can be inferred from it. Therefore, speakers can replace nouns with deictic expressions, or even with null constituents.

In conclusion, the low frequency of nouns in spoken texts depends on textual strategies.
Noun distribution varies according to production/reception constraints: the more the process is linguistically discontinuous, the more we structure the message as a series of short portions with light nominal constituents. A low noun frequency is thus related to a basic feature of the spoken modality, which determines a typical way of planning and organizing the verbal sequence.

This relationship between noun and verb distribution and modality constraints explains why differences in noun and verb distribution are greatest between conversation and informative prose, and why differences in noun and verb distribution between spoken and written texts are registered in many different languages. Conversation and informative prose are respectively the typical spoken and written products, and therefore display more clearly than other kinds of texts what is basic in each modality: fewer nouns than verbs in speaking, and more nouns than verbs in writing.

Cultural and communicative practices make it possible to control and reduce modality constraints in some circumstances; in spoken language this is the case for formal monologues and pre-planned dialogical texts. However, this does not contradict the idea that in normal conditions we can predict that the parlato-parlato and the scritto-scritto will differ, as far as noun and verb usage is concerned. The data by Biber (1995) support and strengthen this hypothesis, since different languages, with different cultural and literacy traditions, display noun distribution as a sign of the basic spoken/written distinction. This means that noun and verb distribution can partially vary according to register variation, but is deeply connected to a fundamental distinction between speech and writing.

REFERENCES

Anderson, Anne H. (1995): “Negotiating coherence in dialogue”, in: Gernsbacher, Morton Ann/Givón, Talmy (eds.): Coherence in Spontaneous Text. Amsterdam/Philadelphia: J. Benjamins Publishing Company, 1-58.
Biber, Douglas (1988): Variation across Speech and Writing. Cambridge: Cambridge University Press.
Biber, Douglas (1995): Dimensions of Register Variation. A Cross-Linguistic Comparison. Cambridge: Cambridge University Press.
De Mauro, Tullio (1999): Grande dizionario italiano dell’uso. Torino: Utet.
Gougenheim, Georges/Michéa, René/Rivenc, Paul/Sauvageot, Aurélien (1964): L'élaboration du français fondamental. Paris: Didier.
Greidanus, Tine (1990): Les constructions verbales en français parlé. Tübingen: Niemeyer.
Halliday, Michael A.K. (1985): Spoken and Written Language. Oxford: Oxford University Press.
Hillis, Argye E./Caramazza, Alfonso (1995): “Representation of Grammatical Categories of Words in the Brain”, in: Journal of Cognitive Neuroscience, 7(3): 396-407.
Johansson, Stig/Hofland, Knut (1989): Frequency analysis of English vocabulary and grammar. Oxford: Clarendon Press.
Laudanna, Alessandro (2000): L’elaborazione dei nomi e dei verbi nel parlare e nello scrivere. Unpublished.
LIF: Bortolini, Umberta/Tagliavini, Carlo/Zampolli, Antonio (1972): Lessico di frequenza della lingua italiana contemporanea. Milano: Garzanti-IBM.
LIP: De Mauro, Tullio/Mancini, Federico/Vedovelli, Massimo/Voghera, Miriam (1993): Lessico di frequenza dell’italiano parlato. Milano: Etaslibri.
Mancini, Federico (1993): “L’elaborazione automatica del corpus”, in: De Mauro et al.: 54-85.
Miller, Jim/Weinert, Regina (1998): Spontaneous Spoken Language. Syntax and Discourse. Oxford: Clarendon Press.
Nencioni, Giovanni (1983): “Parlato-parlato, parlato-scritto, parlato-recitato”, in: Di scritto e di parlato. Bologna: Zanichelli: 133-141.
Piemontese, Emanuela (1996): Capire e farsi capire. Napoli: Tecnodid.
Plag, Ingo/Dalton-Puffer, Christiane/Baayen, Harald (1999): “Morphological productivity across speech and writing”, in: English Language and Linguistics, 3(2): 209-228.
Policarpi, Gianna/Rombi, Maggi (1985): “Usi dell’italiano. La nominalizzazione”, in: Franchi De Bellis, Alberto/Savoia, Leonardo M. (eds.): Sintassi e morfologia della lingua italiana d’uso. Teorie e applicazioni descrittive. Atti del XVII Congresso della SLI. Roma: Bulzoni, 396-406.
Rapp, Brenda/Caramazza, Alfonso (1997): “The Modality-Specific Organization of Grammatical Categories: Evidence from Impaired Spoken and Written Sentence Production”, in: Brain and Language, 56: 248-286.
Rizzi, Alfredo (1994): La distribuzione delle forme grammaticali nella lingua italiana. Publication by "Dipartimento di statistica, probabilità e statistiche applicate", Roma, Università La Sapienza, 2: 1-26.
Ruoff, Arno (1990): Häufigkeitswörterbuch gesprochener Sprache. Tübingen: Niemeyer.
Thornton, Anna Maria/Iacobini, Claudio/Burani, Cristina (1994): BDVDB. Una base di dati sul vocabolario di base della lingua italiana. Roma: Bulzoni.
VELI (1989): Vocabolario elettronico della lingua italiana. Il vocabolario del 2000. Milano: IBM Italia.
Voghera, Miriam (1992): Sintassi e intonazione nell’italiano parlato. Bologna: il Mulino.
Voghera, Miriam (in press a): La frequenza dei nomi e dei verbi nello scritto e nel parlato. Paper presented at the Conference «Langue écrite et langue parlée dans le passé et dans le présent». Università di Napoli Federico II (10-15 marzo 1997).
Voghera, Miriam (in press b): Teorie linguistiche e dati del parlato. Paper presented at the XXXIII Congresso Internazionale di Studi della Società di linguistica italiana, “Teorie e dati linguistici” (Napoli, ottobre 1999): Bulzoni.

Biography

The author teaches General Linguistics at the University of Salerno. She works on spoken language and, more generally, on the theoretical problems connected with linguistic variability. She is a member of national research projects on the varieties of Italian and on spoken language (AVIP, API, CLIPS). She currently coordinates a research group on the representation and use of nouns and verbs (Rappresentazione e uso di nomi e verbi). Among other works, she is the author of Sintassi e intonazione nell’italiano parlato (1992) and co-author of the Lessico di frequenza dell’italiano parlato (1993) and of Esercizi di Linguistica (2000).