If we want to build a rule-based translation ecosystem (with many language pairs), a disambiguation module for grammatical types is necessary. For French, such a module performs disambiguation with respect to about 100 categories. The number of ambiguous pairs (or 3-tuples, 4-tuples, etc.) to disambiguate, for French, is about 250, and we need a module that disambiguates each of these n-tuples. These n-tuples differ from one language to another and would therefore seem to necessitate a language-specific module. However, in the context of an ecosystem, it is too complex and time-consuming to implement a disambiguation module for each language pair and each ambiguous type. What is needed is a disambiguation module that is not language-specific and that can be easily adapted to a given language pair. A versatile grammatical disambiguator seems to be the crux of the matter here.
If we consider indefinite pronouns, and in particular: ‘ceci, cela, les autres, quelques-uns, tous, toutes’ (this, that, the others, some, all, all), it turns out that they are likely to appear in sequences such as: ‘tout ceci, tout cela, tous les autres, toutes les autres, au moins quelques-uns, quasiment tous, presque tous, quasiment toutes, presque toutes’ (all this, all that, all the others, all the others, at least some, almost all, almost all, almost all, almost all; tuttu quissu, tuttu quissa, tutti l’altri, tutti l’altri, alminu calchiadunu, guasgi tutti, guasgi tutti, guasgi tutti, guasgi tutti), followed by a verb. In the present context, ‘tout’ (all) in ‘tout ceci’ (all this; tuttu quissu) and ‘tout cela’ (all that; tuttu quissa), ‘tous’ (all) in ‘tous les autres’ (all others; tutti l’altri), ‘toutes’ (all) in ‘toutes les autres’ (all others; tutti l’altri), ‘au moins’ (at least) in ‘au moins quelques-uns’ (at least some), ‘quasiment’ (almost; guasgi) in ‘quasiment tous’ (almost all; guasgi tutti) and ‘quasiment toutes’ (almost all; guasgi tutti), and ‘presque’ (almost; guasgi) in ‘presque tous’ (almost all; guasgi tutti) and ‘presque toutes’ (almost all; guasgi tutti) each play the role of a modifier of an indefinite pronoun, as they change its meaning and scope.
According to our analysis, the word ‘très’ is likely to occur in the following grammatical types:
- Adjective modifier: here, ‘très’ modifies the meaning of an adjective: très beau (very beautiful, biddisimu), très content (very happy, cuntentissimu)
- Adverb modifier: ‘très’ here modifies the meaning of an adverb: ‘très rarement’ = very rarely, raramenti; ‘très souvent’ = very often, mori à spessu
- Adverb (i.e. in our terminology, a Verb modifier): ‘très’ here modifies the meaning of a verb: ‘j’ai très faim’ = I am very hungry, t’aghju mori fami; ‘il avait très soif’ = he was very thirsty, t’aia mori seti. The verb here is the verbal locution ‘avoir faim’ = to be hungry, avè a fami; ‘avoir soif’ = to be thirsty, avè a seti
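The three readings of ‘très’ listed above can be told apart by looking at the word that follows. The sketch below is only an illustration under stated assumptions: the tag names, the toy lexicon `LEXICON` and the function `disambiguate_tres` are hypothetical and do not come from an existing translator.

```python
# Hypothetical coarse tags for the word that follows 'très'.
ADJ, ADV, NOUN = "ADJ", "ADV", "NOUN"

# Toy lexicon limited to the examples in the text (an assumption of this sketch).
LEXICON = {
    "beau": ADJ, "content": ADJ,
    "rarement": ADV, "souvent": ADV,
    "faim": NOUN, "soif": NOUN,
}


def disambiguate_tres(next_word: str) -> str:
    """Return the grammatical type of 'très' given the word that follows it."""
    tag = LEXICON.get(next_word)
    if tag == ADJ:
        return "adjective modifier"
    if tag == ADV:
        return "adverb modifier"
    if tag == NOUN:
        # 'faim' and 'soif' complete a verbal locution ('avoir faim'),
        # so 'très' modifies the verb as a whole.
        return "verb modifier"
    return "unknown"


print(disambiguate_tres("beau"))      # adjective modifier
print(disambiguate_tres("rarement"))  # adverb modifier
print(disambiguate_tres("faim"))      # verb modifier
```

A real module would need a full lexicon and a wider context window; the point here is only that the decision can be driven by data rather than by language-specific code.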
Let’s look at the translation of the French word ‘dont’. Depending on the case, ‘dont’ can be a
- relative pronoun: ‘la difficulté dont je t’ai parlé’ (the difficulty I told you about), ‘voilà le professeur dont j’apprécie beaucoup les cours’ (this is the teacher whose classes I really enjoy.)
- or, more rarely, a preposition: ‘il y avait cinq couleurs, dont le rouge et le bleu’. (there were five colours, including red and blue.)
It is the latter case that we will be looking at. In this case, ‘dont’ is translated into English as ‘including’. In Corsican, the translation is: c’eranu cinque culori, frà i quali u rossu è u turchinu. But if we translate ‘il y avait cinq plantes, dont le ciste et la bruyère’ (‘there were five plants, including cistus and heather’), we get: c’eranu cinque piante, frà e quale u muchju è a scopa. Thus the translation of ‘dont’ (including) as a preposition is either frà i quali (masculine plural, culore being masculine in Corsican) or frà e quale (feminine plural), depending on which noun ‘dont’ refers to.
This agreement in gender casts doubt on the ‘prepositional’ nature of ‘dont’, and calls for further analysis to determine whether there might be a more suitable grammatical type.
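The choice between ‘frà i quali’ and ‘frà e quale’ can be sketched as a lookup on the gender of the Corsican noun that ‘dont’ refers to. The names `CORSICAN_GENDER`, `DONT_FORMS` and `translate_dont_including` are hypothetical, and the lexicon only covers the two examples discussed here.

```python
# Gender of the Corsican antecedent (toy lexicon, assumption of this sketch).
# Note that the Corsican gender decides: 'culore' is masculine in Corsican.
CORSICAN_GENDER = {
    "culori": "m",   # colours
    "piante": "f",   # plants
}

# Corsican renderings of 'dont' (including), by gender of the plural antecedent.
DONT_FORMS = {
    "m": "frà i quali",
    "f": "frà e quale",
}


def translate_dont_including(antecedent: str) -> str:
    """Return the Corsican form of 'dont' (including) for a plural antecedent."""
    gender = CORSICAN_GENDER[antecedent]
    return DONT_FORMS[gender]


print(translate_dont_including("culori"))  # frà i quali
print(translate_dont_including("piante"))  # frà e quale
```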
It is worth noting that ‘dont’ (including) can be replaced by ‘parmi lesquels’ (among which, frà i quali) or ‘parmi lesquelles’ (among which, frà e quale) depending on whether the noun to which ‘dont’ refers is in the masculine plural or the feminine plural. This suggests that ‘dont’ could be conceived of as a preposition followed by a pronoun. In the spirit of this analysis, the BDL site notes: ‘Dont’ is probably the relative pronoun whose use is the most delicate. To use it correctly, one must know that ‘dont’ always ‘hides’ the preposition ‘de’; ‘dont’ is equivalent to ‘de qui’, ‘de quoi’, ‘duquel’, etc. This link between ‘dont’ and ‘de’ goes back to the Latin origin of ‘dont’, which is from ‘unde’ “from where”.
More generally, this suggests that further analysis of some prepositions may be needed.
Italian has ‘prepositions followed by articles’ (preposizioni articolate). This is a specific grammatical type, which refers to a word (e.g. della) that replaces a preposition (di) followed by an article (la):
Articles: il, lo, l’, la, i, gli, le
- di: del, dello, dell’, della, dei, degli, delle
- a: al, allo, all’, alla, ai, agli, alle
- da: dal, dallo, dall’, dalla, dai, dagli, dalle
- in: nel, nello, nell’, nella, nei, negli, nelle
- su: sul, sullo, sull’, sulla, sui, sugli, sulle
This specific grammatical type also corresponds to:
- in French: du = de le, des = de les
- in Corsican and especially in the Sartenese variant: ‘llu = di lu, ‘lla = di la, etc.
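One way to treat this type as a compound rather than a new primitive is to store each contracted form together with the preposition and the article it stands for. The sketch below does this for the Italian forms listed above; the names `ARTICLES`, `ARTICULATED`, `DECOMPOSE` and `decompose` are hypothetical, and a plain ASCII apostrophe is used in the code for the elided forms.

```python
# Italian definite articles, in the same order as the contracted forms below.
ARTICLES = ["il", "lo", "l'", "la", "i", "gli", "le"]

# Contracted forms for each preposition, one per article.
ARTICULATED = {
    "di": ["del", "dello", "dell'", "della", "dei", "degli", "delle"],
    "a":  ["al", "allo", "all'", "alla", "ai", "agli", "alle"],
    "da": ["dal", "dallo", "dall'", "dalla", "dai", "dagli", "dalle"],
    "in": ["nel", "nello", "nell'", "nella", "nei", "negli", "nelle"],
    "su": ["sul", "sullo", "sull'", "sulla", "sui", "sugli", "sulle"],
}

# Reverse index: contracted form -> (preposition, article).
DECOMPOSE = {
    form: (prep, article)
    for prep, forms in ARTICULATED.items()
    for form, article in zip(forms, ARTICLES)
}


def decompose(word: str):
    """Return (preposition, article) for a contracted form, else None."""
    return DECOMPOSE.get(word)


print(decompose("della"))  # ('di', 'la')
print(decompose("sugli"))  # ('su', 'gli')
```

With such an index, a parser can expose the preposition to the words on the left and the article to the words on the right, without introducing a new primitive type.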
This raises the general problem of the number of grammatical types we should retain. Should we create new grammatical types beyond the classical ones, in order to optimise translators and NLP in general? What is the best grammatical type to retain for ‘prepositions followed by an article’: a new primitive one or a compound one (always keeping Occam’s razor in mind)? A preposition followed by an article behaves like a preposition for words on its left, and like an article for words on its right.
We will consider again a category of words such as ‘very’, when they precede an adjective. Traditionally, this category is termed ‘adverbs’ or ‘adverbs of degree’, but we prefer ‘adjective modifier’, because (i) analytically, they change the meaning of an adjective and (ii) synthetically, an adjective modifier followed by an adjective is still an adjective. A more complete list is: almost, absolutely, badly, barely, completely, decidedly, deeply, enormously, entirely, extremely, fairly, fully, greatly, hardly, highly, how, incredibly, intensely, less, most, much, nearly, perfectly, positively, practically, pretty, purely, quite, rather, really, scarcely, simply, somewhat, strongly, terribly, thoroughly, totally, utterly, very, virtually, well.
If we look at sentences such as: il est bien content (he is very happy, hè beddu cuntenti), ils étaient bien contents (they were very happy, erani beddi cuntenti), elle serait bien contente (she would be very happy, saria bedda cuntenti), elles sont bien contentes (they are very happy, sò beddi cuntenti), we can see that the adjective modifier ‘bien’ is rendered as ‘very’ in English, and in Corsican as:
- bellu/beddu: singular masculine
- belli/beddi: plural masculine
- bella/bedda: singular feminine
- belle/beddi: plural feminine
This shows that the adjective modifier is invariable in French and English, but varies in gender and number in Corsican. Thus, in Corsican grammar, it seems appropriate to distinguish between:
- singular masculine adjective modifier
- plural masculine adjective modifier
- singular feminine adjective modifier
- plural feminine adjective modifier
On the other hand, such a distinction does not seem useful in English and French, where the category of ‘adjective modifier’ is sufficient and there is no need for further detail.
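The contrast between invariable and agreeing adjective modifiers can be sketched as a per-language entry that is either a single form or a table indexed by gender and number. The language codes, the `BIEN` table and the function `render_bien` are hypothetical names for this illustration, and only the bellu/belli/bella/belle variant is shown.

```python
# Renderings of the French adjective modifier 'bien', per target language.
# A string means the form is invariable; a dict means it agrees with the adjective.
BIEN = {
    "fr": "bien",
    "en": "very",
    "co": {
        ("m", "sg"): "bellu",
        ("m", "pl"): "belli",
        ("f", "sg"): "bella",
        ("f", "pl"): "belle",
    },
}


def render_bien(lang: str, gender: str, number: str) -> str:
    """Return the form of the adjective modifier for the given language."""
    entry = BIEN[lang]
    if isinstance(entry, str):
        # Invariable modifier: gender and number are ignored.
        return entry
    return entry[(gender, number)]


print(render_bien("en", "f", "pl"))  # very
print(render_bien("co", "f", "sg"))  # bella
```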
Let’s continue to rethink the troublesome (or so it is argued here) category of adverbs (in the classical sense), and turn our attention to the category of ‘adverb modifiers’. Adverbs are understood here in a restricted sense: they are either verb modifiers or proposition modifiers. In this context, we are likely to encounter adverb modifiers. In general, the adverb modifier precedes the adverb. Thus, very (‘très’) is an adverb modifier in the sequence he was eating very rarely (‘il mangeait très rarement’, manghjava mori raramenti).
Likewise more (‘plus’, più) is in some cases an adverb modifier. This is the case in the sequence he was drinking more frequently (‘il buvait plus fréquemment’, biia più suventi).
Let’s consider again the case of adjective modifiers (in classical grammar, this category of words is considered to be degree adverbs). These include the following: peu, très, extrêmement, surtout, étonnamment, à peine, vraiment, assez, bien, trop, tellement, … = pocu, assai, estremamente, sopratuttu, in modu stunante, appena, propriu/propria/proprii/proprie, abbastanza, bellu/bella/belli/belle, troppu/troppa/troppi/troppe, tantu/tanta/tanti/tante, … = not very, very, extremely, especially, surprisingly, hardly, really, enough, all/very, too, so, … We have argued that words in this category are ‘adjective modifiers’ when they precede an adjective. But can such an assertion be proven, or is there at least some form of evidence available? Grammar, like other disciplines, requires that assertions be justified and, if possible, proven. The notion of proof in grammar, however, is uncommon. Let’s see whether we can provide such proof or justification.
Consider the case of ‘tellement’ (so much), which we consider to be an adjective modifier when it precedes an adjective. Now, let us consider the following translations, where ‘tellement’ is used:
- in French: il est tellement beau, ils sont tellement petits, elle est tellement belle, elles sont tellement intelligentes
- in English: it is so beautiful, they are so small, she is so beautiful, they are so smart
- in Corsican: hè tantu bellu, sò tanti chjuchi, hè tanta bella, sò tante intelligente (an alternative translation is: hè cusì bellu, sò cusì chjuchi, hè cusì bella, sò cusì intelligente)
- in Italian: è così bello, sono così piccoli, è così bella, sono così intelligenti
It is patent here that ‘tellement’ preceding an adjective is translated in Corsican by:
- tantu, when the adjective is singular masculine
- tanti, when the adjective is plural masculine
- tanta, when the adjective is singular feminine
- tante, when the adjective is plural feminine
Thus ‘tellement’ (so much, tantu/tanti/tanta/tante), when used in this way, i.e. preceding an adjective, agrees with the adjective to which it refers. This serves as a justification of its classification as an adjective modifier.
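The agreement test used in this argument can be stated as a small check: collect the form of the modifier for each gender and number of the adjective, and see whether the form varies. The function `agrees_with_adjective` and the tables below are hypothetical names; the forms themselves are those of the examples above.

```python
# Form of the modifier, keyed by (gender, number) of the adjective it precedes.
TANTU = {
    ("m", "sg"): "tantu",
    ("m", "pl"): "tanti",
    ("f", "sg"): "tanta",
    ("f", "pl"): "tante",
}
# The alternative Corsican translation and the French original do not vary.
CUSI = {key: "cusì" for key in TANTU}
TELLEMENT = {key: "tellement" for key in TANTU}


def agrees_with_adjective(forms: dict) -> bool:
    """True if the modifier takes more than one form across gender and number."""
    return len(set(forms.values())) > 1


print(agrees_with_adjective(TANTU))      # True
print(agrees_with_adjective(CUSI))       # False
print(agrees_with_adjective(TELLEMENT))  # False
```

The check only detects that the form varies with the adjective; it is offered as supporting evidence for the classification, not as a formal proof.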
What is the status of adjective modifiers (tant, tout juste, un rien, un tantinet, très, extrêmement, … = so much, just a little, a little, a little, very, extremely, …) in the present grammatical typology? Adjectives are defined as noun modifiers. So adjective modifiers would be modifiers of noun modifiers? This sounds intriguing. In reality, we do not have the concept of ‘modifiers of modifiers’. In fact, we have the following rules:
- a verb modifier followed by a verb is a verb
- a determiner modifier followed by a determiner is a determiner
- and generally speaking, a modifier of an X followed by an X is an X (where X is a given grammatical type)
So a noun modifier followed by a noun is a noun, i.e. an adjective followed by a noun is a noun. For example: ‘un très beau livre’ (a very nice book), where ‘very’ is an adjective modifier, ‘nice’ is an adjective, i.e. a noun modifier, and ‘book’ is a noun.
Hence, ‘an adjective modifier is a modifier of a noun modifier’ reads as follows: an adjective modifier is a modifier of [noun modifier], where [noun modifier] is taken as a single grammatical type, namely the adjective.
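The rule ‘a modifier of an X followed by an X is an X’ can be sketched as a reduction over a sequence of types. In the sketch below, types are encoded as nested tuples, so that an adjective is `mod(NOUN)` and an adjective modifier is `mod(ADJ)`; this encoding and the names `mod` and `reduce_types` are assumptions made for the illustration, and only modifiers that precede their head are handled.

```python
NOUN = "NOUN"


def mod(x):
    """Type of a modifier of X."""
    return ("MOD", x)


ADJ = mod(NOUN)     # an adjective is a noun modifier
ADJ_MOD = mod(ADJ)  # an adjective modifier is a modifier of [noun modifier]


def reduce_types(types):
    """Apply 'modifier of X followed by X is X' until no rule applies."""
    stack = []
    for t in types:
        stack.append(t)
        # Reduce while the element below the top is a modifier of the top.
        while len(stack) >= 2 and stack[-2] == mod(stack[-1]):
            head = stack.pop()
            stack.pop()
            stack.append(head)
    return stack


# 'très beau livre': adjective modifier, adjective, noun
print(reduce_types([ADJ_MOD, ADJ, NOUN]) == [NOUN])  # True
# 'très beau': adjective modifier, adjective
print(reduce_types([ADJ_MOD, ADJ]) == [ADJ])         # True
```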
Let’s focus on analyzing the following phrases:
- à force de courage (bravely)
- à force de courage et de persévérance (by dint of courage and perseverance)
- avec beaucoup d’abnégation (selflessly)
- d’une manière ou d’une autre (in any way)
- d’une façon vraiment admirable (in a very admirable way)
- au moment le plus opportun (when most appropriate)
What is their grammatical nature? From the point of view of two-sided grammar, what are they?
From a synthetic standpoint, first of all, they are adverbs. Let us turn now to their nature from an analytical point of view.
- à force de courage (bravely): analytically, it is a preposition, followed by a common noun, then another preposition, then another common noun: PS-NC-PS-NC.
- à force de courage et de persévérance (by dint of courage and perseverance): analytically, it is a preposition, followed by a common noun, then another preposition, then another common noun, then a conjunction, then another preposition and then another common noun: PS-NC-PS-NC-CONJ-PS-NC.
- and so on
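The two-sided description of these phrases can be sketched as a pair: a synthetic type for the phrase as a whole, and an analytic signature built from the types of its words. The tags PS, NC and CONJ are those used above; the toy lexicon `WORD_TAGS`, the set `ADVERBIAL_PHRASES` and the functions `analytic_signature` and `two_sided` are hypothetical and cover only the first two phrases.

```python
# Toy word-level lexicon: PS = preposition, NC = common noun, CONJ = conjunction.
WORD_TAGS = {
    "à": "PS", "de": "PS",
    "force": "NC", "courage": "NC", "persévérance": "NC",
    "et": "CONJ",
}

# Phrases whose synthetic type is 'adverb' according to the text.
ADVERBIAL_PHRASES = {
    "à force de courage",
    "à force de courage et de persévérance",
}


def analytic_signature(phrase: str) -> str:
    """Join the word-level tags of a phrase, e.g. 'PS-NC-PS-NC'."""
    return "-".join(WORD_TAGS[word] for word in phrase.split())


def two_sided(phrase: str):
    """Return (synthetic type, analytic signature) for a phrase."""
    synthetic = "adverb" if phrase in ADVERBIAL_PHRASES else "unknown"
    return synthetic, analytic_signature(phrase)


print(two_sided("à force de courage"))
# ('adverb', 'PS-NC-PS-NC')
print(two_sided("à force de courage et de persévérance"))
# ('adverb', 'PS-NC-PS-NC-CONJ-PS-NC')
```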