Tag Archives: nlp

Taxonomy of concepts project

I have just open-sourced a ‘taxonomy of concepts’ project. Its aim is to create a Python library based on a taxonomy of commonly used concepts, for use in projects related to artificial intelligence (AI) and natural language understanding and processing. The library is intended to allow an AI to answer a broad range of questions such as:

  • What is the opposite of generosity?
  • What is the contrary of prodigality?
  • Can you complete the following sentence, where the last word is missing: cowardice and cautiousness are in the same type of relationship as stinginess and …
  • Can you complete the following sentence, where the last word is missing: prodigality and avarice are in the same type of relationship as inclemency and …
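As a rough illustration of how such questions could be answered, here is a minimal sketch in Python. It is not the project's actual API: the `RELATIONS` table, the relation names (`opposite`, `moderate_form`) and the functions `related` and `complete_analogy` are all hypothetical, and the few entries are toy data chosen only for the example.

```python
from typing import Optional

# Toy taxonomy: each concept maps relation names to a related concept.
# Entries and relation names are illustrative assumptions, not project data.
RELATIONS = {
    "generosity": {"opposite": "stinginess"},
    "stinginess": {"opposite": "generosity", "moderate_form": "thrift"},
    "cowardice": {"moderate_form": "cautiousness"},
}


def related(concept: str, relation: str) -> Optional[str]:
    """Return the concept linked to `concept` by `relation`, if any."""
    return RELATIONS.get(concept, {}).get(relation)


def complete_analogy(a: str, b: str, c: str) -> Optional[str]:
    """Return d such that c relates to d the same way a relates to b."""
    for relation, target in RELATIONS.get(a, {}).items():
        if target == b:
            d = related(c, relation)
            if d is not None:
                return d
    return None


print(related("generosity", "opposite"))
print(complete_analogy("cowardice", "cautiousness", "stinginess"))
```

With this toy data, the analogy is completed by first finding which relation links the first pair, then following the same relation from the third concept.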

Creating new grammatical types

Italian has ‘prepositions followed by articles’ (preposizioni articolate). This is a specific grammatical type: a single word (e.g. della) that replaces a preposition (di) followed by an article (la):

	il	lo	l’	la	i	gli	le
di	del	dello	dell’	della	dei	degli	delle
a	al	allo	all’	alla	ai	agli	alle
da	dal	dallo	dall’	dalla	dai	dagli	dalle
in	nel	nello	nell’	nella	nei	negli	nelle
su	sul	sullo	sull’	sulla	sui	sugli	sulle

This specific grammatical type also corresponds to:

  • in French: du = de le, des = de les
  • in Corsican and especially in the Sartenese variant: ‘llu = di lu, ‘lla = di la, etc.

This raises the general problem of the number of grammatical types we should retain. Should we create new grammatical types beyond the classical ones, in order to optimise translators and NLP in general? What is the best grammatical type to retain for ‘prepositions followed by an article’: a new primitive one or a compound one (always keeping Occam’s razor in mind)? A preposition followed by an article behaves like a preposition for words on its left, and like an article for words on its right.
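To make the ‘compound’ option concrete, here is a minimal sketch in Python, assuming the contracted forms are stored as pairs of existing primitive types rather than as a new primitive type. The forms come from the Italian table above; the names `CONTRACTIONS` and `analyse` and the label `PREP+ART` are hypothetical.

```python
from typing import Dict, Optional, Tuple

PREPOSITIONS = ["di", "a", "da", "in", "su"]
ARTICLES = ["il", "lo", "l’", "la", "i", "gli", "le"]

# One row per preposition, one column per article (same layout as the table).
FORMS = [
    ["del", "dello", "dell’", "della", "dei", "degli", "delle"],
    ["al", "allo", "all’", "alla", "ai", "agli", "alle"],
    ["dal", "dallo", "dall’", "dalla", "dai", "dagli", "dalle"],
    ["nel", "nello", "nell’", "nella", "nei", "negli", "nelle"],
    ["sul", "sullo", "sull’", "sulla", "sui", "sugli", "sulle"],
]

# Map each contracted form to its (preposition, article) components.
CONTRACTIONS: Dict[str, Tuple[str, str]] = {
    form: (prep, art)
    for prep, row in zip(PREPOSITIONS, FORMS)
    for art, form in zip(ARTICLES, row)
}


def analyse(token: str) -> Optional[dict]:
    """Describe a contracted form as a compound of two primitive types."""
    if token not in CONTRACTIONS:
        return None
    prep, art = CONTRACTIONS[token]
    return {"type": "PREP+ART", "preposition": prep, "article": art}


print(analyse("della"))
```

Under this design, a rule that looks to the left of the token can read the `preposition` component and a rule that looks to the right can read the `article` component, without adding a new primitive type.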

Prototype of text search with optional grammatical type

Unconditional search

Let us expand the idea of text analysis derived from rule-based translation. Above is an example of a classic word-based search, in this case for the French word ‘été’. This word is ambiguous because it can be a common noun (‘summer’) or a past participle (‘been’). Below is an example of a search for the word ‘été’ restricted to the grammatical type ‘common noun’ (the ‘summer’ reading).

Conditional search based on ‘noun’ grammatical type

Finally, below is an example of a search for the word ‘été’ restricted to the grammatical type ‘past participle’ (the ‘been’ reading).

Conditional search based on ‘past participle’ grammatical type
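The three searches can be sketched in Python as a single function with an optional grammatical-type argument. This is only an illustration of the idea, not the prototype's code: the tiny hand-tagged corpus, the tag names and the `search` function are assumptions made for the example.

```python
from typing import List, Optional, Tuple

TaggedSentence = List[Tuple[str, str]]  # (token, grammatical type)

# Tiny hand-tagged corpus (illustrative assumption, not real tagger output).
CORPUS: List[TaggedSentence] = [
    [("l’", "article"), ("été", "common noun"),
     ("est", "verb"), ("chaud", "adjective")],
    [("il", "pronoun"), ("a", "verb"),
     ("été", "past participle"), ("malade", "adjective")],
]


def search(corpus: List[TaggedSentence], word: str,
           pos: Optional[str] = None) -> List[Tuple[int, int]]:
    """Return (sentence index, token index) for each match of `word`.

    If `pos` is given, keep only matches with that grammatical type.
    """
    hits = []
    for s_idx, sentence in enumerate(corpus):
        for t_idx, (token, tag) in enumerate(sentence):
            if token == word and (pos is None or tag == pos):
                hits.append((s_idx, t_idx))
    return hits


print(search(CORPUS, "été"))                     # unconditional search
print(search(CORPUS, "été", "common noun"))      # ‘summer’ reading only
print(search(CORPUS, "été", "past participle"))  # ‘been’ reading only
```

Leaving `pos` unset reproduces the classic word-based search, while setting it filters out the occurrences of ‘été’ that carry the other grammatical type.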

New: Part-of-speech tagger for French language API

I have just published a part-of-speech (POS) tagger API for the French language on RapidAPI. The API is free for up to 1000 requests per month. No training is necessary; it works immediately.