The paper proposes formal inductive definitions for linguistic feature structures (FSs) whose values belong to a class of value types, or sorts: single, disjunctive, (ordered) lists, multisets (or bags), po-multisets (multisets embedded into a partially ordered set), and indexed (re-entrant) values. A linguistic realization (semantics) is proposed for each of these sorts. FSs with such multi-sort values are organized as rooted directed acyclic graphs. The concrete model behind our set-theoretic definitions is the class of FSs used in the well-known HPSG (Head-driven Phrase Structure Grammar) linguistic theory. General set-theoretic definitions are given for the proposed multi-sort FSs. These constructive definitions start from atomic values and recursively build multi-sorted values and structures, which naturally yields a fixed-point semantics for the resulting FSs, as a counterpart to the large class of logical semantic models of FSs. A linguistic unification algorithm based on tableau-subsumption is outlined. The Prolog code of the unification algorithm is provided, and the results of running it on some of the main multi-sort FSs are enclosed in the appendices. We consider the proposed formal approach to FS definitions and unification a necessary step towards set-theoretic implementations of natural language processing systems.
Neculai Curteanu, Paul-Gabriel Holban
Research Institute of Computer Science,
Romanian Academy, Iasi Branch
B-dul Carol I, No. 22A, 6600, Iasi, Romania
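
To give a feel for what unifying multi-sort FSs involves, the following is a minimal Python sketch. It is not the paper's tableau-subsumption algorithm and not its Prolog code. It assumes a simplified encoding chosen only for illustration: atomic values are strings, disjunctive values are frozensets of atoms, ordered lists are tuples, and nested FSs are dicts. Multisets, po-multisets, and indexed (re-entrant) values are left out. The function name `unify` and the feature names `CAT`, `AGR`, `NUM`, `PER` are hypothetical.

```python
def unify(a, b):
    """Unify two values under the simplified encoding; return None on failure."""
    # Nested feature structures (dicts): merge feature by feature.
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for feat, b_val in b.items():
            if feat in result:
                merged = unify(result[feat], b_val)
                if merged is None:
                    return None  # clash on a shared feature
                result[feat] = merged
            else:
                result[feat] = b_val
        return result

    # Atomic (str) and disjunctive (frozenset) values: intersect alternatives.
    if isinstance(a, (str, frozenset)) and isinstance(b, (str, frozenset)):
        alts_a = a if isinstance(a, frozenset) else frozenset([a])
        alts_b = b if isinstance(b, frozenset) else frozenset([b])
        common = alts_a & alts_b
        if not common:
            return None  # no shared alternative
        if len(common) == 1:
            return next(iter(common))  # collapse to a single atom
        return common

    # Ordered lists (tuples): same length, unified element by element.
    if isinstance(a, tuple) and isinstance(b, tuple):
        if len(a) != len(b):
            return None
        items = []
        for x, y in zip(a, b):
            merged = unify(x, y)
            if merged is None:
                return None
            items.append(merged)
        return tuple(items)

    # Values of different sorts do not unify in this sketch.
    return None


# Hypothetical example: a noun phrase underspecified for number,
# unified with a constraint requiring singular agreement.
np_fs = {"CAT": "np", "AGR": {"NUM": frozenset({"sg", "pl"}), "PER": "3"}}
sg_fs = {"AGR": {"NUM": "sg"}}
unified = unify(np_fs, sg_fs)
clash = unify({"CAT": "np"}, {"CAT": "vp"})
```

In this sketch a disjunctive value behaves like a set of admissible atoms, so unification narrows it by intersection, while a clash anywhere in a nested structure makes the whole unification fail.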