If we consider indefinite pronouns, and in particular ‘ceci, cela, les autres, quelques-uns, tous, toutes’ (this, that, the others, some, all, all), it turns out that they are likely to appear in sequences such as ‘tout ceci, tout cela, tous les autres, toutes les autres, au moins quelques-uns, quasiment tous, presque tous, quasiment toutes, presque toutes’ (all this, all that, all the others, all the others, at least some, almost all, almost all, almost all, almost all; tuttu quissu, tuttu quissa, tutti l’altri, tutti l’altri, alminu calchiadunu, guasgi tutti, guasgi tutti, guasgi tutti, guasgi tutti), followed by a verb. In the present context, ‘tout’ (all) in ‘tout ceci’ (all this; tuttu quissu) and ‘tout cela’ (all that; tuttu quissa), ‘tous’ (all) in ‘tous les autres’ (all the others; tutti l’altri), ‘toutes’ (all) in ‘toutes les autres’ (all the others; tutti l’altri), ‘au moins’ (at least; alminu) in ‘au moins quelques-uns’ (at least some; alminu calchiadunu), ‘quasiment’ (almost; guasgi) in ‘quasiment tous’ (almost all; guasgi tutti) and ‘quasiment toutes’ (almost all; guasgi tutti), and ‘presque’ (almost; guasgi) in ‘presque tous’ (almost all; guasgi tutti) and ‘presque toutes’ (almost all; guasgi tutti) each play the role of a modifier of an indefinite pronoun, as they change its meaning and scope.
To begin with, there is (i) the class of autonomous personal pronouns: ‘moi, toi, lui/elle, nous, vous, eux/elles’ (me, you, he/she, we, you, they).
There is also (ii) the class of personal pronouns as direct object complements: ‘me, te, le/la, nous, vous, les’ (me, you, him/her, us, you, them; mi, ti, u/a, ci, vi, i/e), as in ‘il me comprend, elle te comprend, je le comprends, nous nous comprenons, ils vous comprennent, je les comprends’ (he understands me, she understands you, I understand him, we understand each other, they understand you, I understand them; mi capisci, ti capisci, u capiscu, ci capimu, vi capiscini, i capiscu).
There are also (iii) the so-called ‘tonic personal pronouns’: ‘moi, toi, lui/elle, nous, vous, eux/elles’ (me, you, him/her, us, you, them), used after a preposition: ‘de moi, à toi, devant lui, après elle, par nous, chez vous, à eux, à elles’ (of me, to you, in front of him, after her, by us, at your place, to them, to them; di mè, à tè, davanti ad eddu, dopu ad edda, da no, ind’è vo, ad eddi, ad eddi).
Finally, there is (iv) the class of personal pronouns as indirect object complements: ‘me, te, lui, nous, vous, leur’ (not applicable to English; mi, ti, li, ci, vi, li). For example: ‘il me parle, elle te parle, je lui parle, il nous parle, elle vous parle, je leur parle’ (he talks to me, she talks to you, I talk to him/her, he talks to us, she talks to you, I talk to them; mi parla, ti parla, li parlu, ci parla, vi parla, li parlu). If we now analyse the personal pronouns as indirect object complements, it turns out that each of them is equivalent to the preposition followed by the tonic personal pronoun: ‘il me parle = il parle à moi, elle te parle = elle parle à toi, je lui parle = je parle à lui/elle, il nous parle = il parle à nous, elle vous parle = elle parle à vous, je leur parle = je parle à eux/elles’ (mi parla = parla à mè, ti parla = parla à tè, li parlu = parlu ad eddu/edda, ci parla = parla à no, vi parla = parla à vo, li parlu = parlu ad eddi). Therefore: ‘me = à moi, te = à toi, lui = à lui/elle, nous = à nous, vous = à vous, leur = à eux/elles’ (mi = à mè, ti = à tè, li = ad eddu/edda, ci = à no, vi = à vo, li = ad eddi). Thus, in the so-called ‘tonic personal pronoun’, the preposition placed before the personal pronoun is included. It is therefore a preposition + personal pronoun group. In the present context, ‘tonic personal pronouns’ cannot be considered as a category of personal pronouns: in the present model, what we have is a contraction, i.e. a group consisting of a preposition followed by a personal pronoun: PS+PRPERS.
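The equivalence between an indirect object pronoun and a preposition followed by a tonic pronoun can be sketched as a simple lookup. This is only an illustration: the table name `INDIRECT_TO_TONIC` and the function `expand_indirect_pronoun` are hypothetical, and only the French pronoun forms themselves come from the text above.

```python
# Hypothetical lookup table; the pronoun forms are those listed in the text.
INDIRECT_TO_TONIC = {
    "me": ["moi"],
    "te": ["toi"],
    "lui": ["lui", "elle"],
    "nous": ["nous"],
    "vous": ["vous"],
    "leur": ["eux", "elles"],
}


def expand_indirect_pronoun(pronoun, preposition="à"):
    """Return the 'preposition + tonic pronoun' groups equivalent to an
    indirect object pronoun (for example 'me' gives 'à moi')."""
    tonic_forms = INDIRECT_TO_TONIC[pronoun.lower()]
    return [f"{preposition} {tonic}" for tonic in tonic_forms]


print(expand_indirect_pronoun("me"))    # ['à moi']
print(expand_indirect_pronoun("leur"))  # ['à eux', 'à elles']
```

The third person forms return two candidates because ‘lui’ and ‘leur’ do not mark gender, so the choice between ‘à lui’ and ‘à elle’ has to come from the context.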
Let’s focus on the word class of determiner modifiers: they also include indefinite determiner modifiers. Indefinite determiners are thus: ‘tous les, aucun, aucune, quelques’ (all, none, none, some; tutti i, nisciunu, nisciuna, calchì).
Here are some examples of these indefinite determiner modifiers: ‘presque tous les, quasi tous les, quasiment tous les, quasiment aucun, quasiment aucune, presque aucun, presque aucune, au moins quelques, au plus quelques, tout au plus quelques’ (almost all, almost all, almost all, almost none, almost none, almost none, almost none, at least some, at most some, at most some). Thus, ‘presque tous les’ (almost all; guasgi tutti i) in the phrase ‘presque tous les soldats’ (almost all the soldiers; guasgi tutti i suldati) is an indefinite determiner modifier.
Let’s focus on the class of autonomous personal pronouns: ‘moi, toi, lui/elle, nous, vous, eux/elles’ (me, you, he/she, we, you, they). One sometimes encounters forms such as: ‘moi particulièrement, toi aussi, toi notamment, moi spécialement, moi surtout, vous également, toi de même, toi pareil, lui en particulier, elle notamment, moi aussi’ (me especially, you also, you especially, me especially, me especially, you also, you likewise, you the same, him in particular, she especially, me too), as in the sequence ‘moi aussi, j’aime cela’ (me too, I like that). Classically, ‘particulièrement, aussi, notamment, spécialement, surtout, également, de même, pareil, en particulier’ (particularly, also, especially, especially, especially, also, likewise, same, in particular) are considered adverbs. However, in the present context, they modify the meaning of autonomous personal pronouns. When they are used in this way, it is therefore logical to consider them as modifiers of autonomous personal pronouns.
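As a rough illustration of this reclassification, the sketch below tags a two-part sequence as ‘autonomous personal pronoun + modifier’ when both parts appear in the lists given above. The function name `tag_pronoun_modifier` and the tag labels are assumptions made for the example, not part of the model described here.

```python
# Word lists taken from the text; the function and tag names are hypothetical.
AUTONOMOUS_PRONOUNS = {"moi", "toi", "lui", "elle", "nous", "vous", "eux", "elles"}
PRONOUN_MODIFIERS = {
    "particulièrement", "aussi", "notamment", "spécialement", "surtout",
    "également", "de même", "pareil", "en particulier",
}


def tag_pronoun_modifier(phrase):
    """Tag 'pronoun + modifier' sequences such as 'moi aussi'.

    Returns a list of (word, tag) pairs, or None when the phrase does not
    match the pattern.
    """
    head, _, rest = phrase.strip().partition(" ")
    head, rest = head.lower(), rest.strip().lower()
    if head in AUTONOMOUS_PRONOUNS and rest in PRONOUN_MODIFIERS:
        return [
            (head, "autonomous personal pronoun"),
            (rest, "autonomous personal pronoun modifier"),
        ]
    return None


print(tag_pronoun_modifier("moi aussi"))
print(tag_pronoun_modifier("lui en particulier"))
```

A real tagger would also have to check that the sequence is not followed by an adjective or a verb that the adverb could modify instead; the sketch ignores that case.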
In the present construction, the question arises of whether or not a determiner is a modifier. More specifically, is a definite or indefinite article (i.e. a determiner) a modifier of a noun? In the present model, an adjective is indeed a noun modifier. Is this not also the case for a definite article (the definite article ‘le’ (the), for example)? The answer is no. In fact, a modifier only modifies the meaning of the word to which it is applied. The consequence is that if the modifier is removed from the sentence in question, the sentence still conveys meaning and remains correctly formed. For example, in the sentence ‘le cheval blanc courait’ (the white horse was running), if we remove the noun modifier ‘blanc’ (white), the sentence ‘le cheval courait’ (the horse was running) remains correct. On the other hand, if we remove the determiner ‘le’ (the), we get the sentence ‘cheval blanc courait’ (white horse was running), which is incomplete and whose structure is not valid.
The distinction between rule-based and statistics-based translation may well be artificial, and may obscure what is really the interesting distinction between machine translation modules. That distinction may well lie in the fact that some methods capture (at least partially) the semantics of a text, and are for example able to enumerate the lemmas in the text, change the person of verbs or the gender of nouns, etc. In contrast, other translation methods do not capture the semantics of the text and only perform the translation. At least this type of classification seems to be relevant to artificial intelligence.
According to our analysis, the word ‘très’ is likely to occur in the following grammatical types:
- Adjective modifier: here, ‘très’ modifies the meaning of an adjective: très beau (very beautiful, biddisimu), très content (very happy, cuntentissimu)
- Adverb modifier: ‘très’ here modifies the meaning of an adverb: ‘très rarement’ = very rarely, raramenti; ‘très souvent’ = very often, mori à spessu
- Adverb (i.e. in our terminology, a verb modifier): here ‘très’ modifies the meaning of a verb: ‘j’ai très faim’ = I am very hungry, t’aghju mori fami; ‘il avait très soif’ = he was very thirsty, t’aia mori seti; the verb here is the verbal locution ‘avoir faim’ = to be hungry, avè a fami; ‘avoir soif’ = to be thirsty, avè a seti
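The three cases above can be sketched as a lookup on the category of the word that follows ‘très’. This is a minimal sketch under simplifying assumptions: the toy lexicon, the category labels and the function `classify_tres` are invented for the illustration, and a real system would need a full lexicon and a way to recognise verbal locutions such as ‘avoir faim’.

```python
# Toy lexicon, assumed for illustration; only the example words come from the text.
LEXICON = {
    "beau": "adjective",
    "content": "adjective",
    "rarement": "adverb",
    "souvent": "adverb",
    "faim": "noun of verbal locution",
    "soif": "noun of verbal locution",
}

# Grammatical type of 'très' according to the category of the next word.
ROLE_BY_NEXT_CATEGORY = {
    "adjective": "adjective modifier",
    "adverb": "adverb modifier",
    "noun of verbal locution": "verb modifier (adverb)",
}


def classify_tres(tokens):
    """Return the grammatical type of each occurrence of 'très' in tokens."""
    roles = []
    for i, token in enumerate(tokens):
        if token.lower() != "très":
            continue
        next_token = tokens[i + 1].lower() if i + 1 < len(tokens) else None
        category = LEXICON.get(next_token)
        roles.append(ROLE_BY_NEXT_CATEGORY.get(category, "unknown"))
    return roles


print(classify_tres(["très", "beau"]))         # ['adjective modifier']
print(classify_tres(["très", "souvent"]))      # ['adverb modifier']
print(classify_tres(["j’ai", "très", "faim"]))  # ['verb modifier (adverb)']
```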
Disambiguation is an essential process in machine translation. Sometimes, however, it seems more rational and logical to leave an ambiguity in the translation. This is the case when (i) there is an ambiguous word in the sentence to be translated; and (ii) the context does not provide an objective reason to choose one of the two senses. It seems that in this case, the best translation is the one that leaves the ambiguity intact.
Let’s take an example. Consider the following French sentence: ‘Son palais était en feu.’. The French word ‘palais’ is ambiguous, because it corresponds in English and in Corsican to two different words (palace, palazzu and palate, palatu).
Thus, we have three possible translations:
- His palate was on fire
- His palace was on fire
- His palace/palate was on fire
The third translation, in my opinion, is better, because it points out that the context is insufficient to choose one of the two alternatives.
Consider now, on the one hand, the following sentences: ‘Il avait mangé du piment fort. Son palais était en feu.’ Now the context provides an objective motivation to choose one of the two senses. This yields the following translation: He had eaten some hot pepper. His palate was on fire.
On the other hand, consider the following sentences: ‘Les ennemis du prince avaient lancé des engins incendiaires. Son palais était en feu.’ Here too we have an objective reason, this time to choose the other alternative. The translation is then: The prince’s enemies had thrown incendiary devices. His palace was on fire.
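The policy illustrated by these three cases can be sketched as follows: choose a sense only when the context supports exactly one of them, and otherwise keep both. The cue words, the table `SENSES` and the function `translate_ambiguous` are assumptions made for this sketch; a real system would rely on much richer contextual evidence than a handful of keywords.

```python
# Hypothetical cue words per sense, chosen from the example sentences in the text.
SENSES = {
    "palais": {
        "palate": {"mangé", "piment"},
        "palace": {"prince", "ennemis", "incendiaires"},
    }
}


def translate_ambiguous(word, context):
    """Pick a sense only when the context supports exactly one of them.

    Otherwise return all senses joined by '/', leaving the ambiguity intact.
    """
    senses = SENSES[word]
    context_words = {w.strip(".,;:!?").lower() for w in context.split()}
    supported = [sense for sense, cues in senses.items() if cues & context_words]
    if len(supported) == 1:
        return supported[0]
    return "/".join(sorted(senses))


print(translate_ambiguous("palais", "Son palais était en feu."))
# palace/palate
print(translate_ambiguous(
    "palais", "Il avait mangé du piment fort. Son palais était en feu."))
# palate
print(translate_ambiguous(
    "palais",
    "Les ennemis du prince avaient lancé des engins incendiaires. "
    "Son palais était en feu."))
# palace
```

Note that the sketch also keeps the ambiguity when the context supports both senses at once, which seems consistent with the idea of not choosing without an objective reason.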
As far as machine translation is concerned, it seems that the best thing is to combine the best of the two approaches, rule-based and statistics-based. If it were possible to make the two approaches converge, it seems that the benefit could be great. Let us try to define what could allow such a convergence, based on the two-sided grammatical approach, and to illustrate this with a few examples.
To begin with, u soli sittimbrinu = ‘le soleil de septembre’ (the sun of September). In Corsican, sittimbrinu is a masculine singular adjective that means ‘de septembre’ (of September). In French, ‘de septembre’ is, from an analytic perspective, a preposition followed by a masculine singular common noun. But according to the two-sided analysis, ‘de septembre’ (of September) is also, from a synthetic perspective, a masculine singular adjective. This double nature of ‘de septembre’ under the two-sided analysis is in fact what allows the alignment of ‘de septembre’ (of September) with sittimbrinu.
More generally, if we define words or groups of words according to the two-sided grammatical analysis in the dictionary, we also have an alignment tool, which can be used for a translation system based on statistics, in the same way as a corpus. Thus, if it is sufficiently well stocked, the dictionary is also a corpus, and even more, an aligned corpus.
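A minimal sketch of such a dictionary entry is given below, assuming each entry stores both the analytic and the synthetic type of the French expression. The class `Entry`, its field names and the helper functions are hypothetical; only the ‘de septembre’ / sittimbrinu example comes from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    french: str
    corsican: str
    analytic_type: str   # type of the French expression read word by word
    synthetic_type: str  # type of the French expression read as a whole


# Hypothetical one-entry dictionary built from the example in the text.
DICTIONARY = [
    Entry(
        french="de septembre",
        corsican="sittimbrinu",
        analytic_type="preposition + masculine singular common noun",
        synthetic_type="masculine singular adjective",
    ),
]


def aligned_pairs(dictionary):
    """Read the dictionary as an aligned corpus of (French, Corsican) pairs."""
    return [(entry.french, entry.corsican) for entry in dictionary]


def align(french_phrase, dictionary):
    """Return the Corsican equivalent and the synthetic type that licenses
    the alignment, or None when the phrase is not in the dictionary."""
    for entry in dictionary:
        if entry.french == french_phrase:
            return entry.corsican, entry.synthetic_type
    return None


print(aligned_pairs(DICTIONARY))          # [('de septembre', 'sittimbrinu')]
print(align("de septembre", DICTIONARY))  # ('sittimbrinu', 'masculine singular adjective')
```

Storing both types in one entry is what lets the multi-word French group be matched against a single Corsican adjective: the alignment is made on the synthetic type, while the analytic type remains available for parsing the French side.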
Let’s look at the translation of the French word ‘dont’. Depending on the case, ‘dont’ can be a
- relative pronoun: ‘la difficulté dont je t’ai parlé’ (the difficulty I told you about), ‘voilà le professeur dont j’apprécie beaucoup les cours’ (this is the teacher whose classes I really enjoy)
- or, more rarely, a preposition: ‘il y avait cinq couleurs, dont le rouge et le bleu’ (there were five colours, including red and blue)
It is the latter case that we will be looking at. In this case, ‘dont’ is translated into English as ‘including’. In Corsican, the translation is: c’eranu cinque culori, frà i quali u rossu è u turchinu. But if we translate ‘il y avait cinq plantes, dont le ciste et la bruyère’ (‘there were five plants, including cistus and heather’), we get: c’eranu cinque piante, frà e quale u muchju è a scopa. Thus the translation of ‘dont’ (including) as a preposition is either frà i quali (masculine plural, culore being masculine in Corsican) or frà e quale (feminine plural), depending on which noun ‘dont’ refers to.
Thus ‘dont’ is translated into the masculine plural or the feminine plural, depending on the noun – either masculine or feminine – to which it refers. This casts doubt on the ‘prepositional’ nature of ‘dont’, and leads to further analysis to determine whether there might not be a more suitable grammatical type.
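The agreement rule just described can be sketched as a lookup on the gender of the Corsican noun that ‘dont’ refers to. The gender table and the function `translate_dont` are assumptions made for the illustration; the two nouns and the two Corsican forms are those of the examples above.

```python
# Gender of the Corsican antecedents used in the examples (hypothetical table).
CORSICAN_GENDER = {
    "culori": "masculine",
    "piante": "feminine",
}

# Corsican rendering of prepositional 'dont' (including), by gender of the
# plural noun it refers to.
DONT_FORMS = {
    "masculine": "frà i quali",
    "feminine": "frà e quale",
}


def translate_dont(corsican_antecedent):
    """Return the Corsican translation of prepositional 'dont', agreeing
    with the gender of its plural antecedent."""
    gender = CORSICAN_GENDER[corsican_antecedent]
    return DONT_FORMS[gender]


print(translate_dont("culori"))  # frà i quali
print(translate_dont("piante"))  # frà e quale
```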
It is worth noting that ‘dont’ (including) can be replaced by ‘parmi lesquels’ (among which, frà i quali) or ‘parmi lesquelles’ (among which, frà e quale) depending on whether the noun to which ‘dont’ refers is in the masculine plural or the feminine plural. This suggests that ‘dont’ could be conceived of as a preposition followed by a pronoun. In the spirit of this analysis, the BDL site notes: ‘Dont’ is probably the relative pronoun whose use is the most delicate. To use it correctly, one must know that ‘dont’ always ‘hides’ the preposition ‘de’; ‘dont’ is equivalent to ‘de qui’, ‘de quoi’, ‘duquel’, etc. This link between ‘dont’ and ‘de’ goes back to the Latin origin of ‘dont’, which is from ‘unde’ “from where”.
More generally, this suggests that further analysis of some prepositions may be needed.