Strategic observation and extraction of knowledge: towards an ontological approach

Corpus analysis strategy

In his research, [SID 02] developed a morphosyntactic analysis platform for information retrieval (IR) and automatic indexing. It is built around an indexing core (i.e. the indexing process) that uses noun phrases (NPs) (or SN, “nominal syntagma”, in French) as descriptors (i.e. indexing concepts) of textual information. Following Michel Le Guern’s definition of an NP [LEG 89], placing a word from the lexicon in a discourse universe places it, de facto, in an extensional logical sequence, which gives the NP a referential status: it denotes the segment of reality associated with it. In our context, the NP carries a semantic load that makes it a central and pertinent element for analyzing informational content and for identifying and then extracting knowledge entities. In this respect, we regard our work as developing an approach in the field of KO. Our corpus analyses are oriented towards capturing these semantics.

The NP recognition grammar is therefore organized on three logical levels: (1) the intensional level (i.e. the properties of the language), represented by level N, where the units considered are free predicates, either simple (i.e. the properties of the noun) or complex (i.e. the properties of the noun modified by other elements: adjectives A’, prepositional expansions (PE), verbs V-inf, etc.); (2) the intermediary level, or level N’ (i.e. taking into account the discourse universe in question), which marks the transition from the intensional to the extensional; (3) the extensional level, or level N’’ (i.e. the NP and its complexity), which is the closing operation by means of a quantifier that selects a precise element in the class N of nouns. The elements selected in this way are objects that exist in the world, objects that are referred to, or objects constructed by thought.
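The three levels can be illustrated with a minimal Python sketch. It is not the grammar of [SID 02]: the tag set (DET, ADJ, N, PREP), the function names and the rewrite rules are assumptions chosen for illustration, V-inf modifiers are left out, and the N’ level is reduced to a pass-through because modelling the discourse universe is beyond a short example.

```python
from typing import List, Optional, Tuple

# A token is (word, tag). The tag set DET/ADJ/N/PREP is hypothetical.
Token = Tuple[str, str]


def parse_n(tokens: List[Token], i: int) -> Optional[int]:
    """Level N (intensional): a noun, simple or modified by adjectives
    and prepositional expansions (PE). Returns the end index or None."""
    while i < len(tokens) and tokens[i][1] == "ADJ":
        i += 1  # adjectives before the noun
    if i >= len(tokens) or tokens[i][1] != "N":
        return None  # no head noun, so no N-level unit
    i += 1
    while i < len(tokens) and tokens[i][1] == "ADJ":
        i += 1  # adjectives after the noun
    while i < len(tokens) and tokens[i][1] == "PREP":
        # Assumed PE rule: PREP followed by a full NP or an N-level unit.
        end = parse_np(tokens, i + 1)
        if end is None:
            end = parse_n(tokens, i + 1)
        if end is None:
            break  # dangling preposition: stop before it
        i = end
    return i


def parse_n_bar(tokens: List[Token], i: int) -> Optional[int]:
    """Level N' (intermediary): placeholder for anchoring the predicate
    in a discourse universe. Here it simply passes the N unit through."""
    return parse_n(tokens, i)


def parse_np(tokens: List[Token], i: int) -> Optional[int]:
    """Level N'' (extensional): a quantifier (here, DET) closes the unit."""
    if i < len(tokens) and tokens[i][1] == "DET":
        return parse_n_bar(tokens, i + 1)
    return None


def extract_nps(tokens: List[Token]) -> List[str]:
    """Scan left to right and collect every maximal NP found."""
    nps, i = [], 0
    while i < len(tokens):
        end = parse_np(tokens, i)
        if end is None:
            i += 1
        else:
            nps.append(" ".join(word for word, _ in tokens[i:end]))
            i = end
    return nps


sample = [("the", "DET"), ("automatic", "ADJ"), ("indexing", "N"),
          ("of", "PREP"), ("textual", "ADJ"), ("information", "N"),
          ("uses", "V"), ("a", "DET"), ("grammar", "N")]
print(extract_nps(sample))
```

In this sketch, a noun without a closing determiner stays at level N and is not returned as an NP, which mirrors the idea that only the quantifier gives the unit its extensional, referential status.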

Corpus experiments, designed to verify the regularity of the NP grammar and to consider, where possible, its interoperable use, have led us to work in several fields of study: INA, health, nanosciences and materials science.

