Tag Archives: French to Corsican

Update to priority pairs for endangered languages

If we were to update the priorities for language pairs to be achieved, from the point of view of endangered languages, the result would be as follows:

  • Corsican language: French to Corsican (already done)
  • Sardinian Gallurese: Italian to Gallurese
  • Sardinian Sassarese: Italian to Sassarese
  • Sicilian: Italian to Sicilian (the Sicilian language is close to the Corsican sartinesu or taravesu variants)
  • Munegascu: French to Munegascu (the Munegascu language bears some similarities to the Corsican language)

Pairs such as French to Gallurese, French to Sassarese, English to Gallurese, English to Sassarese and English to Sicilian do not have priority, as they can be handled through an intermediate pair. French to Gallurese, for instance, is obtained by applying the French to Italian pair first (e.g. with DeepL) and then the Italian to Gallurese pair, and so on.
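The chaining of pairs described above can be sketched as follows. This is only a minimal illustration: the `DIRECT` registry, the language codes and the stub translators (which merely tag the text) are hypothetical and stand in for real engines such as a French to Italian service followed by an Italian to Gallurese one.

```python
from typing import Callable, Dict, Tuple

Translator = Callable[[str], str]


def make_pivot(first: Translator, second: Translator) -> Translator:
    """Chain two translators: source -> intermediate -> target."""
    def pivot(text: str) -> str:
        return second(first(text))
    return pivot


# Hypothetical registry of directly implemented pairs.
# The stubs only wrap the text so that the chaining is visible.
DIRECT: Dict[Tuple[str, str], Translator] = {
    ("fr", "it"): lambda s: f"it({s})",
    ("it", "gallurese"): lambda s: f"gallurese({s})",
}


def get_translator(src: str, dst: str, pivot_lang: str = "it") -> Translator:
    """Return the direct pair if it exists, else go through the pivot language."""
    if (src, dst) in DIRECT:
        return DIRECT[(src, dst)]
    return make_pivot(DIRECT[(src, pivot_lang)], DIRECT[(pivot_lang, dst)])


fr_to_gallurese = get_translator("fr", "gallurese")
print(fr_to_gallurese("bonjour"))  # gallurese(it(bonjour))
```

With such a scheme, only the pairs into the endangered language need a dedicated engine; every other source language reuses an existing pair into the intermediate language.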

A specific kind of superlative

Let us consider a specific kind of superlative. This form, specific to the Corsican language, is notably mentioned by the grammarian and author Santu Casta in his Punteghju, where he recommends the following translation of “C’était le village le plus riche du canton” (It was the richest village of the canton): Era u più paese riccu di stu cantone (pages 26 & 54-55). The structure is original in that the comparative (più) precedes the noun (paese, village), which in turn precedes the adjective (riccu, rich).

A Special Case of Anaphora Resolution

[Figure: after improper anaphora resolution]

Anaphora resolution usually concerns pronouns. Here, however, we face a special case that concerns an adjective. The sentence ‘un vase de Chine authentique’ (an authentic vase from China) is translated erroneously as un vasu di China autentica, due to erroneous anaphora resolution: the feminine form autentica agrees with China. In this sample, the adjective ‘authentique’ refers to ‘vase’ (English: vase) and not to ‘Chine’ (China).

The same goes for ‘une chanson du Portugal mythique’, where ‘mythique’ refers to ‘chanson’ and not to ‘Portugal’.

[Figure: after appropriate anaphora resolution]

Four consecutive ambiguous words


Translating the sentence ‘ce fait est unique’ is not as easy as it might seem at first glance. Indeed, it is made up of four consecutive ambiguous words:

  • ‘ce’: ‘ssu (demonstrative pronoun, this) or ciò (it, relative pronoun)
  • ‘fait’: fattu (masculine singular noun, fact), fattu (past participle, done) or faci (does, third person singular of the verb to do in the present tense)
  • ‘est’: estu (masculine singular noun, east) or hè (is, third person singular of the verb to be in the present tense)
  • ‘unique’: unicu (masculine singular adjective, unique in English) or unica (feminine singular adjective, unique in English)
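To give an idea of the size of the problem, the readings listed above can be combined mechanically. The sketch below only counts the candidate analyses; the `READINGS` table copies the glosses from the list, and the function name is an illustrative choice, not part of any actual engine.

```python
from itertools import product

# Readings of each French word, as glossed in the list above.
READINGS = {
    "ce": ["demonstrative (this)", "pronoun (it)"],
    "fait": ["noun (fact)", "past participle (done)", "verb (does)"],
    "est": ["noun (east)", "verb (is)"],
    "unique": ["masculine adjective", "feminine adjective"],
}


def candidate_analyses(words):
    """Return every combination of one reading per word."""
    return list(product(*(READINGS[w] for w in words)))


analyses = candidate_analyses(["ce", "fait", "est", "unique"])
print(len(analyses))  # 24 = 2 * 3 * 2 * 2
```

Only one of these 24 combinations (demonstrative, noun, verb, masculine adjective) is the intended analysis, which is why each word has to be disambiguated in the context of its neighbours.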

Efficiency open test: updating the protocol (May 2018)

The protocol of the efficiency open test, which measures the accuracy of French to Corsican translation, is now updated. The test is based on the first 100 words of the Wikipedia ‘article of the day’, which changes every day. The score will be derived from the last 10 tests performed. The test will be performed every Saturday. Here is the first one. Score = 1 - (6/138) = 95.65%. One vocabulary error: the verb ‘égaler’ (to equal). It should also read ‘da u 1923 à u 1939’, ‘u 16u’ instead of ‘u 16e’, and ‘a prova maestra’.
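The arithmetic of the score can be checked with a short sketch. It assumes that 6 is the number of errors and 138 the number of scored units in the first test; the function names are illustrative, and combining the last 10 tests by a plain average is an assumption, since the protocol does not say how they are aggregated.

```python
def single_score(errors: int, units: int) -> float:
    """Score of one test: 1 - errors / units."""
    if units <= 0:
        raise ValueError("units must be positive")
    return 1 - errors / units


def overall_score(scores, window: int = 10) -> float:
    """Mean of the most recent `window` scores (averaging is an assumption)."""
    if not scores:
        raise ValueError("no scores yet")
    recent = scores[-window:]
    return sum(recent) / len(recent)


first = single_score(6, 138)
print(f"{first:.2%}")  # 95.65%
```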

The test is termed ‘open’ in the sense that:

  • its protocol is clearly described
  • the ‘article of the day’ of the French Wikipedia is pseudo-random (there is room for discussion here; arguably, it is not perfectly random, but it can be considered to provide an acceptable level of randomness)
  • it is ‘open’ since it can be verified via the ochakko translator app for Android

Another case of first-name ambiguity: ‘Noël’

Translation of the French word ‘Noël’ yields another case of ambiguity. ‘Noël’ can translate:

  • either into Natali (Christmas, Christmas Day): the annual festival commemorating Jesus Christ’s birth
  • or into, identically, Natali (‘Noel’): the first name

Now it seems there is nothing to disambiguate, since in either case ‘Noël’ in French translates into Natali (Natali in the sartinese and taravese variants; Natale in the cismuntincu variant). But ambiguity lurks in some sentences that include ‘Noël’. Consider the following sentence: ‘Je l’ai donné à Noël.’ It can be translated:

  • either into: L’aghju datu in Natali. (I gave it at Christmas.)
  • or into: L’aghju datu à Natali (I gave it to Noel.)

since the French preposition ‘à’ translates differently in the two cases. A phenomenon of the same nature occurs in translation from French to English.

Interestingly, when the two consecutive ambiguous words (‘à Noël’) are repeated, the ambiguity vanishes: ‘Je l’ai donné à Noël à Noël.’ translates unambiguously into L’aghju datu à Natali in Natali (I gave it to Noel at Christmas.). The order can be ignored, since L’aghju datu in Natali à Natali (I gave it at Christmas to Noel.) amounts to the same. In this last case, the translation is meaning-preserving.
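The reasoning above can be made explicit with a small enumeration. The sketch assumes that each of the two roles (recipient, rendered à Natali, and time, rendered in Natali) is filled at most once in the sentence; the names `RENDERINGS`, `surface_translations` and `distinct_meanings` are illustrative only.

```python
from itertools import permutations

# Corsican rendering of French 'à Noël' for each role, as given in the text.
RENDERINGS = {"recipient": "à Natali", "time": "in Natali"}


def surface_translations(n: int):
    """Renderings of n occurrences of 'à Noël', each role used at most once."""
    return [" ".join(RENDERINGS[role] for role in roles)
            for roles in permutations(RENDERINGS, n)]


def distinct_meanings(n: int):
    """Meanings as unordered sets of roles: word order is ignored."""
    return {frozenset(roles) for roles in permutations(RENDERINGS, n)}


print(surface_translations(1))    # ['à Natali', 'in Natali']
print(len(distinct_meanings(1)))  # 2: the sentence is ambiguous
print(surface_translations(2))    # ['à Natali in Natali', 'in Natali à Natali']
print(len(distinct_meanings(2)))  # 1: both orders mean the same
```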

Word-sense disambiguation: first test of new engine

Now testing the new engine with the semantically ambiguous French ‘échecs’ = fiaschi/scacchi (failures/chess).

What is interesting here is that semantic disambiguation transfers successfully into English (although the French/English engine is still in its infancy and still makes many grammatical errors).

Now further tests are needed with some other semantically ambiguous words:

  • ‘défense’: defense/tusk; Corsican: difesa/sanna
  • ‘fils’: sons/wires; Corsican: figlioli/fili
  • ‘comprendre’: understand/comprise; Corsican: capisce/cumprende
  • ‘vol’: flight/theft; Corsican: bulu/arrubecciu
  • ‘voler’: fly/steal; Corsican: bulà/arrubà
  • ‘échecs’: chess/failures; Corsican: scacchi/fiaschi
  • ‘palais’: palace/palaces/palate/palates; Corsican: palazzu/palazzi/palate/palates
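As an illustration of what such tests exercise, here is a minimal sketch of word-sense disambiguation by context cues. It is not the method used by the engine, which is not described here: the Corsican translations come from the list above, while the French cue words in `SENSES` and the function `disambiguate` are purely hypothetical.

```python
import re

# Corsican translations from the list above; the cue words are assumptions.
SENSES = {
    "vol": [
        ("bulu", {"avion", "oiseau", "pilote"}),          # flight
        ("arrubecciu", {"banque", "bijoux", "police"}),   # theft
    ],
    "échecs": [
        ("scacchi", {"joueur", "tournoi", "échiquier"}),  # chess
        ("fiaschi", {"série", "répétés", "malgré"}),      # failures
    ],
}


def disambiguate(word: str, sentence: str):
    """Pick the translation whose cue words overlap most with the sentence."""
    context = set(re.findall(r"\w+", sentence.lower()))
    best, best_overlap = None, 0
    for translation, cues in SENSES[word]:
        overlap = len(context & cues)
        if overlap > best_overlap:
            best, best_overlap = translation, overlap
    return best  # None when no cue word is present


print(disambiguate("vol", "Le vol de l'avion a duré deux heures"))  # bulu
print(disambiguate("vol", "Le vol de bijoux a été signalé à la police"))  # arrubecciu
```

A sentence without any cue word returns `None`, which makes the unresolved cases easy to collect for further testing.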

In the background, the unresolved threefold ambiguity of French ‘partie’ = parti/partita/partita (part/game/gone) is lurking…