Feigenbaum test and semantic disambiguation

It is now clear that there can be no successful Feigenbaum test (that is, not just occasional Feigenbaum hits, but regular, average-case performance) without an adequate treatment of semantic disambiguation. This is arguably one of the hard problems of machine translation. Here are some typical instances:

  • ‘défense’: defense/tusk; Corsican: difesa/sanna
  • ‘fils’: sons/wires; Corsican: figlioli/fili
  • ‘comprendre’: understand/comprise; Corsican: capisce/cumprende
  • ‘vol’: flight/theft; Corsican: bulu/arrubecciu
  • ‘voler’: fly/steal; Corsican: bulà/arrubà
  • ‘échecs’: chess/failures; Corsican: scacchi/fiaschi
  • and the fourfold ambiguous ‘palais’: palace/palaces/palate/palates; Corsican: palazzu/palazzi/palate/palates
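To make the problem concrete, here is a minimal sketch of how such ambiguous entries could be stored and resolved with context cues. It is not the engine's actual method: the Sense class, the LEXICON table, the disambiguate function and every cue word are hypothetical, chosen only for illustration. Only the French words and their English and Corsican renderings come from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sense:
    english: str
    corsican: str
    cues: frozenset  # hypothetical French context words favouring this sense


# Senses come from the list above; every cue word is an illustrative assumption.
LEXICON = {
    "vol": [
        Sense("flight", "bulu", frozenset({"avion", "oiseau", "pilote"})),
        Sense("theft", "arrubecciu", frozenset({"police", "bijoux", "banque"})),
    ],
    "échecs": [
        Sense("chess", "scacchi", frozenset({"joueur", "partie", "tournoi"})),
        Sense("failures", "fiaschi", frozenset({"succès", "tentatives"})),
    ],
}


def disambiguate(word, context):
    """Return the sense whose cues overlap most with the context words."""
    context_words = {w.lower() for w in context}
    # max() keeps the first sense on a tie, so the first entry acts as the default.
    return max(LEXICON[word], key=lambda s: len(s.cues & context_words))


print(disambiguate("vol", "le vol de l' avion".split()).corsican)            # bulu
print(disambiguate("vol", "la police enquête sur le vol".split()).corsican)  # arrubecciu
```

A bag-of-cues lookup like this fails as soon as no cue word is present, which is precisely why occasional hits are easy and regular performance is not.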

In short: no successful semantic disambiguation, no genuinely successful Feigenbaum test. The semantic disambiguation engine needs to be rewritten.

Double adjective accordance: scoring 98.43%

The score is now 1 – 2/128 = 98.43%. There are only two errors, and they are related: both belong to a special case of adjective accordance. ‘aux xxie et XXe siècles’ (in the 21st and 20th centuries) should translate into: à i XXIu è XXu seculi. There are 3 ambiguous words here:

  • ‘aux’ i.e. ‘à les’ (in the): à i (masculine plural)/à e (feminine plural)
  • ‘xxie’ i.e. ‘vingt-et-unième’ (21st): XXIu (masculine singular)/XXIa (feminine singular)
  • ‘xxe’ i.e. ‘vingtième’ (20th): XXu (masculine singular)/XXa (feminine singular)

Proper accordance should be performed as follows:

  • ‘aux’: à i (masculine plural), since it depends on ‘siècles’ (centuries), which is masculine plural
  • ‘xxie’ i.e. ‘vingt-et-unième’ (21st): XXIu (masculine singular)
  • ‘xxe’ i.e. ‘vingtième’ (20th): XXu (masculine singular)

Of the same type are:

  • ‘les langues italienne et française’: e lingue taliana è francese (the Italian and French languages). English is ambiguous in this case, since ‘les langues italiennes et françaises’ translates the same way although the meaning differs: it refers explicitly to the several varieties of the Italian and French languages. In French, the ambiguity only affects speech, since the written sentence is unambiguous. In Corsican, both the written and the spoken sentence are unambiguous.
  • ‘les codes pénal et civil’: i codici penale è civile (the penal and civil codes)
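The pattern shared by these cases can be sketched as follows: the article takes the number of the plural noun, while each coordinated adjective stays singular because it qualifies a single member of the group. This is only an illustration under stated assumptions, not the software's implementation: the function name coordinate and its parameters are hypothetical, and the adjectives are passed in already inflected, exactly as they appear in the examples above.

```python
# Plural articles as they appear in the examples above: i (masculine), e (feminine).
PLURAL_ARTICLE = {"m": "i", "f": "e"}


def coordinate(gender, noun_plural, singular_adjectives):
    """Build 'article + plural noun + adj1 è adj2' with singular adjectives.

    The article agrees with the plural noun; each adjective keeps its
    singular form because it applies to one member of the group only.
    """
    article = PLURAL_ARTICLE[gender]
    adjectives = " è ".join(singular_adjectives)
    return f"{article} {noun_plural} {adjectives}"


print(coordinate("f", "lingue", ["taliana", "francese"]))  # e lingue taliana è francese
print(coordinate("m", "codici", ["penale", "civile"]))     # i codici penale è civile
```

The hard part, which this sketch leaves out, is detecting from the French input that the adjectives are singular and distributive in the first place.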

Should this now be considered an instance of a successful Feigenbaum test? Arguably yes, although this is debatable. From a Feigenbaum test perspective, these two errors cannot be considered gross errors; they are errors a human could make.

But caution: at present, this is only one exceptional successful instance. Call it a Feigenbaum hit. What we are interested in is a regularly successful Feigenbaum test, and for the moment the software is not capable of that. New target: 99% and/or more frequent Feigenbaum hits.
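For reference, the scores quoted in this post follow the formula 1 – errors/total, and can be checked against the 99% target with a few lines of Python. This is a sketch under assumptions: the post does not say what the denominator counts, so the parameter is simply called units here, and the function names are hypothetical. The percentages are truncated rather than rounded to two decimals, since that is what the quoted figures appear to do.

```python
from decimal import Decimal, ROUND_DOWN
from fractions import Fraction

TARGET = Decimal("99")  # the new target stated above, in percent


def score_percent(errors, units):
    """Return (1 - errors/units) as a percentage truncated to two decimals."""
    exact = (1 - Fraction(errors, units)) * 100
    value = Decimal(exact.numerator) / Decimal(exact.denominator)
    # ROUND_DOWN truncates, which matches the figures quoted in the post.
    return value.quantize(Decimal("0.01"), rounding=ROUND_DOWN)


def meets_target(errors, units):
    """True when the truncated score reaches the 99% target."""
    return score_percent(errors, units) >= TARGET


print(score_percent(2, 128))  # 98.43
print(score_percent(2, 127))  # 98.42
print(meets_target(2, 128))   # False
```

With two errors, the denominator would have to reach 200 for the score to hit 99%.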

The disambiguation of French ‘fils’ again: scoring 98.42%

Scoring 1 – 2/127 = 98.42%. Of interest:

  • ‘de 839 à sa mort’ (from 839 to his death) should read: da u 839 à a so morte. French ‘de’ translates into either di or da in Corsican (simplifying somewhat: in certain cases, as a partitive article, it is not translated at all).
  • we again face the multiply ambiguous French ‘fils’, which can translate into: i) figliolu, masculine singular (son); ii) figlioli, masculine plural (sons); iii) fili, masculine plural (wires). In the present case, ‘Fils du roi…’ should translate as Figliolu di u rè… (Son of King…).
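Why is ‘fils’ so awkward? A minimal sketch, assuming the only evidence used is the French determiner immediately before the word, shows the issue. The table names and the function candidate_readings are hypothetical; the three readings are the ones listed above.

```python
# The three readings of French 'fils' listed above: (Corsican, English, number).
FILS_READINGS = [
    ("figliolu", "son", "singular"),
    ("figlioli", "sons", "plural"),
    ("fili", "wires", "plural"),
]

# Assumption: a small table of French determiners and the number they impose.
DETERMINER_NUMBER = {"le": "singular", "un": "singular", "les": "plural", "des": "plural"}


def candidate_readings(previous_word):
    """Filter the readings of 'fils' by the number of the preceding determiner."""
    key = previous_word.lower() if previous_word else None
    number = DETERMINER_NUMBER.get(key)
    if number is None:
        # No usable determiner: all three readings remain possible.
        return list(FILS_READINGS)
    return [reading for reading in FILS_READINGS if reading[2] == number]


print(candidate_readings("le"))         # [('figliolu', 'son', 'singular')]
print(len(candidate_readings("les")))   # 2
print(len(candidate_readings(None)))    # 3
```

In ‘Fils du roi…’ there is no determiner at all, so number agreement alone leaves all three readings open; choosing figliolu requires semantic evidence, such as the following ‘du roi’.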

Of note: five consecutive 100% sentences.

With regard to the Feigenbaum test: failed again. Arguably, the first error is of an acceptable kind in this context. But the ‘fils’ error is a gross one that a human would not make…