Monthly Archives: January 2017

False positives

There are sometimes false positives. Some words should remain untranslated, notably proper names. Interestingly, the one observed here arises because the English word ‘transport’ is the same in French: transport (fr) = transport (en) = trasportu (co).

Proper noun elision

Testing #machine translation, now facing a new elision problem:

Riventosa (fr) = A Riventosa (co)
proper noun (fr) = definite article + proper noun (co)
it should read: in u paese di A Riventosa (without elision)
Elision rules are not trivial:
le village d’Arbellara (fr) = the village of Arbellara, should be translated as:
u paese d’Arbellara (with elision)
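The two cases above can be sketched as follows. This is only a minimal illustration, not the translator's actual rule set: the helper `link_di`, the vowel list and the set `NAMES_WITH_ARTICLE` are assumptions introduced here, and real elision rules cover more cases than a check on the initial letter.

```python
# Hypothetical illustration: attach the preposition 'di' to a place name.
VOWELS = "aeiouàèìòù"

# Assumed list of place names whose Corsican form carries a definite article.
NAMES_WITH_ARTICLE = {"A Riventosa"}

def link_di(place: str) -> str:
    """Return 'di' + place, eliding to d’ only before a bare vowel-initial name."""
    if place in NAMES_WITH_ARTICLE:
        # The leading 'A' is an article, so 'di' is kept in full.
        return f"di {place}"
    if place[0].lower() in VOWELS:
        return f"d’{place}"
    return f"di {place}"

print("u paese " + link_di("Arbellara"))    # u paese d’Arbellara
print("u paese " + link_di("A Riventosa"))  # u paese di A Riventosa
```

Checking the article list before the vowel test is what prevents the wrong elision in front of ‘A Riventosa’.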

Adjective agreement

Rule-based translation: adjective agreement raises an interesting case:
sur les réseaux japonais et américain (fr) = annantu à e rete sgiappunesa è americana (co) = on the Japanese and American networks (en)
The noun (networks) is plural, but the adjectives (Japanese and American) are singular.

The excerpt refers to the (single) Japanese network and the (single) American network.
I guess (to be confirmed) that the following phrase is ambiguous in English:
‘the Japanese and American networks’: is there one Japanese network or several? Is there one American network or several?

Gender reversal

Now handling gender reversal:

– mer (FR, feminine) = sea (EN) = mare (CO, masculine)
– saveur (FR, feminine) = flavor (EN) = sapore (CO, masculine)
– liqueur (FR, feminine) = liquor (EN) = licore (CO, masculine)
‘c’est une bonne liqueur’ (FR) = ‘it is a good liquor’ (EN) = hè un bonu licore (CO) requires gender reversal of: indefinite article ‘un’ (instead of ‘una’) + adjective bonu (instead of ‘bona’)
‘la mer est belle’ (FR) = ‘the sea is beautiful’ (EN) = u mare hè bellu (CO) requires gender reversal of: definite article ‘u’ (instead of ‘a’) + adjective bellu (instead of ‘bella’)

u mare hè bellu requires uppercase and should be written: U mare hè bellu.
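A minimal sketch of the idea, assuming a tiny hand-written lexicon: the article and the adjective are chosen from the gender of the Corsican noun, not from the French one. The tables and the function `agree` are hypothetical names used only for this illustration; the word forms are those quoted above.

```python
# Hypothetical mini-lexicon: French noun -> (Corsican noun, Corsican gender).
NOUNS = {
    "mer": ("mare", "m"),
    "liqueur": ("licore", "m"),
}
DEFINITE = {"m": "u", "f": "a"}
INDEFINITE = {"m": "un", "f": "una"}
# French adjective form as found in the source text -> Corsican forms by gender.
ADJECTIVES = {
    "belle": {"m": "bellu", "f": "bella"},
    "bonne": {"m": "bonu", "f": "bona"},
}

def agree(fr_noun: str, fr_adjective: str, definite: bool) -> tuple:
    """Return (article, noun, adjective) agreeing with the Corsican gender."""
    co_noun, gender = NOUNS[fr_noun]
    article = (DEFINITE if definite else INDEFINITE)[gender]
    adjective = ADJECTIVES[fr_adjective][gender]
    return article, co_noun, adjective

article, noun, adjective = agree("mer", "belle", definite=True)
sentence = f"{article} {noun} hè {adjective}"
# Sentence-initial capital and final period, as noted above.
sentence = sentence[0].upper() + sentence[1:] + "."
print(sentence)  # U mare hè bellu.
```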

Verbal locutions

Introducing a new feature for #MachineTranslation:
some verbal locutions:
prendre d’assaut = assaltà
mettre à sac = sacchighjà
prendre au collet = incappià
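One simple way to handle such locutions is a phrase table applied before word-by-word translation. The sketch below assumes exactly that; `LOCUTIONS` and `replace_locutions` are hypothetical names, and only the infinitive forms listed above are covered (conjugated forms would need more work).

```python
# Phrase table built from the three locutions listed above.
LOCUTIONS = {
    "prendre d’assaut": "assaltà",
    "mettre à sac": "sacchighjà",
    "prendre au collet": "incappià",
}

def replace_locutions(text: str) -> str:
    """Replace each known French locution with its one-word Corsican verb."""
    # Longest entries first, so a longer locution is never split by a shorter one.
    for fr in sorted(LOCUTIONS, key=len, reverse=True):
        text = text.replace(fr, LOCUTIONS[fr])
    return text

print(replace_locutions("mettre à sac"))  # sacchighjà
```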

Semantic disambiguation

Now considering the issue of semantic disambiguation.
Some instances for French to Corsican are:
– ‘défense’ = sanna/difesa = tusk/defense
– ‘vol’ = bulu/furtu = flight/theft
– ‘comprend’ = capisce/cumprende = understands/comprises
– ‘palais’ = palatu/palazzu = palate/palace
– ‘expérience’ = sperienza/sperimentu = experience/experiment
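To make the problem concrete, here is a rough sketch of cue-word disambiguation. It is one possible approach, not necessarily the one used here: the cue words in `SENSES` are invented for the example, and the fallback sense (last entry of each list) is an arbitrary choice.

```python
import re

# Hypothetical cue words per sense; the last entry of each list is the fallback.
SENSES = {
    "défense": [("sanna", {"éléphant", "ivoire"}), ("difesa", set())],
    "palais": [("palatu", {"langue", "bouche"}), ("palazzu", set())],
}

def disambiguate(word: str, sentence: str) -> str:
    """Pick the first sense that has a cue word in the sentence, else the fallback."""
    tokens = set(re.findall(r"\w+", sentence.lower()))
    for translation, cues in SENSES[word]:
        if cues & tokens:
            return translation
    return SENSES[word][-1][0]

print(disambiguate("défense", "une défense d’éléphant"))    # sanna
print(disambiguate("défense", "la défense du territoire"))  # difesa
```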

Threefold ambiguity: French ‘nouvelle’

Let us mention the issue of threefold ambiguity: French ‘nouvelle’ can translate into:
‘nutizia’ (‘piece of news’)
or ‘nuvella’ (‘short story’)
or ‘nova’ (‘new’)

The disambiguation between ‘nutizia’ (‘piece of news’) and ‘nuvella’ (‘short story’) is semantic (hard),
while the disambiguation between ‘nutizia’/‘nuvella’ (nouns) and ‘nova’ (adjective) is grammatical (medium difficulty).
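The two levels can be sketched as a two-step decision: the grammatical step first, the semantic step only for nouns. The part-of-speech tag is assumed to come from an upstream tagger, and `LITERARY_CUES` is an invented cue list; both are illustrative assumptions.

```python
# Invented cue words suggesting the 'short story' sense.
LITERARY_CUES = {"recueil", "écrivain", "roman"}

def translate_nouvelle(pos: str, context: set) -> str:
    """Translate French 'nouvelle' given its part of speech and context words."""
    # Step 1 (grammatical): an adjective can only be 'nova'.
    if pos == "ADJ":
        return "nova"
    # Step 2 (semantic): choose between the two noun senses.
    if context & LITERARY_CUES:
        return "nuvella"
    return "nutizia"

print(translate_nouvelle("ADJ", set()))         # nova
print(translate_nouvelle("NOUN", {"recueil"}))  # nuvella
print(translate_nouvelle("NOUN", {"bonne"}))    # nutizia
```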

Further reflections on the definition of ‘above human level’ translation

Some further reflections on the definition of ‘above human level’ translation:
– the answer may not rest solely on a quantitative criterion of the type ‘above 96%’, ‘above 97%’, ‘above 98%’, etc.
– it seems the answer should also incorporate a qualitative criterion, i.e. the translation should not contain gross errors.
Semantic disambiguation errors would most often be termed ‘gross errors’: for example, translating ‘défense d’éléphant’ as ‘elephant’s defense’ instead of ‘elephant’s tusk’.
– to fix ideas, it could be proposed: ‘above human level’ =
above 98% AND without gross errors
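Written as a predicate, the proposed criterion looks like this. The function name and the way accuracy and gross errors are counted are assumptions; the text above does not specify how either is measured.

```python
def above_human_level(accuracy: float, gross_errors: int,
                      threshold: float = 0.98) -> bool:
    """Proposed criterion: accuracy above the threshold AND no gross errors."""
    return accuracy > threshold and gross_errors == 0

print(above_human_level(0.985, 0))  # True
print(above_human_level(0.985, 1))  # False: one gross error is enough to fail
print(above_human_level(0.97, 0))   # False: accuracy too low
```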

Defining an instance of Feigenbaum test

Defining an instance of the Feigenbaum test (from Wikipedia: generally defined as a variant of the Turing test where a computer software attempts to imitate a human expert in a given field): translating French into Corsican.
We expect 98% accuracy and an absence of gross errors in order to pass this Feigenbaum test.