More on COPEs

What does a COPE do?

A COPE provides a computationally tractable mapping between a representation of the linguistic knowledge instantiated in GOLD and a representation more easily accessible to a given community of practitioners. It allows practitioners to relabel GOLD concepts with suitable labels and to express constraints on the combinatorics of GOLD concepts as appropriate (e.g. binding person and number). A COPE may contain community-specific terms that are not in GOLD. A COPE may equally collapse some of GOLD's distinctions, or leave some of its possibilities unexpressed.
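As a rough sketch of the three capabilities just listed (relabelling, binding concepts together, and adding community-only terms), a COPE can be pictured as a small mapping layer over GOLD concept names. Everything below is an assumption made for illustration: the `Cope` class, its fields, and the concept names are hypothetical and are not taken from GOLD or from any existing COPE.

```python
from dataclasses import dataclass, field


@dataclass
class Cope:
    """Hypothetical community-of-practice extension layered over GOLD concept names."""

    # Community label -> GOLD concepts it stands for. One concept is a plain
    # relabelling; several concepts form a bound bundle (e.g. person/number).
    labels: dict = field(default_factory=dict)
    # Community-specific terms with no GOLD counterpart.
    local_terms: set = field(default_factory=set)

    def expand(self, label):
        """Return the GOLD concepts behind a community label."""
        if label in self.local_terms:
            return frozenset()  # nothing in GOLD to point at
        return self.labels[label]

    def covers(self, gold_concept):
        """True if some community label expresses the given GOLD concept."""
        return any(gold_concept in concepts for concepts in self.labels.values())


# Illustrative concept names only; they are not quoted from GOLD.
cope = Cope(
    labels={
        "sg": frozenset({"Singular"}),                  # relabelling
        "1sg": frozenset({"FirstPerson", "Singular"}),  # bound person/number
        "pl": frozenset({"Plural"}),
    },
    local_terms={"honorific-b"},  # hypothetical community-only term
)

print(cope.expand("1sg"))
print(cope.covers("Dual"))  # this COPE does not express Dual at all
```

A COPE that leaves a concept such as Dual uncovered is the "not express all of the possibilities" case: the concept remains in the core ontology but is simply not reachable through this community's labels.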

Is this the only reason we need COPEs?

No, COPEs will also allow the ontology to handle competing definitions of concepts and concepts which have different degrees of importance for different sets of users.

If a linguistic ontology such as GOLD is to be useful for the linguistic community, it must provide an explicit semantics for every concept that is necessary for the kinds of linguistic descriptions that researchers wish to make available for worldwide collaborative research. This does not mean that every linguistic concept that is used in descriptive work must be defined in GOLD. For the GOLD Community as a whole, certain concepts are so fundamental and common across linguistic descriptions that we must seek to define them as carefully as possible within GOLD, obtain consensus from the whole community that we've got them basically right, and disseminate them so that they are available to every researcher, both for preparing their own descriptions and for understanding and evaluating those of others. Other concepts may feature prominently in some, even many, descriptions, but may not be as fundamental as others. It is important that explicit semantics be developed for them as well, but it may be useful to allow competing definitions to be available in case consensus cannot be quickly achieved, as long as a mechanism is in place to make clear which definition is being associated with which concept. We envision the development of the semantics for this second class of concepts taking place within specific communities of practice within the field. The results of these efforts would be codified in community of practice extensions (COPEs) to GOLD. If the entire GOLD community subsequently agrees that a particular COPE can be accepted for the field as a whole, it can then be incorporated as part of GOLD itself.

What concepts should be handled in GOLD itself, and what should be handled in a COPE?

Among the fundamental linguistic concepts requiring characterization in GOLD is that of GrammaticalFeature. GrammaticalFeatures (GFs) are properties of LinguisticExpressions (LEs) that are manifested by the roles they play in the grammatical analysis of the language containing those LEs; alternatively, GFs account for the occurrences of those LEs in that language viewed as a set of LEs. Each GF is associated with a domain (e.g. Number, Gender, Case, Tense, Aspect), which models a natural (but possibly abstract) domain (e.g. PositiveInteger, Sex, Role, Time, Event). In a particular language, each GF defined for it has a set of values that we tentatively assume (pending confirmation or disconfirmation from the community) partition its domain; that is, the values taken together exhaust the domain, and do so without overlap. For any GF in a language, its set of values may be called a grammatical feature system (GFS). A GFS, then, is a partition of a particular GF space. We consider the role of GOLD to be to define the class of GFs for language in general, including identifying the natural domains that they model, as well as the concept of a GFS. We leave it to individual communities of practice to characterize, using extensions (COPEs) and profiles, which particular GFSs occur and how they behave.
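The partition requirement on a GFS (values that jointly exhaust the domain and do not overlap) can be stated as a short check. This is only a sketch: the function name `is_partition` is hypothetical, and the five-element set stands in for the PositiveInteger domain purely so that the example is finite.

```python
def is_partition(gfs, domain):
    """True if the value sets in `gfs` are non-empty, pairwise disjoint,
    and together exhaust `domain`."""
    seen = set()
    for members in gfs.values():
        if not members or seen & members:
            return False  # an empty value, or overlap with an earlier value
        seen |= members
    return seen == set(domain)


# Assumption: a small finite sample standing in for PositiveInteger.
NUMBER_DOMAIN = {1, 2, 3, 4, 5}

singular_plural = {"Singular": {1}, "Plural": {2, 3, 4, 5}}
overlapping = {"Singular": {1}, "Paucal": {1, 2, 3}, "Multal": {4, 5}}
incomplete = {"Singular": {1}, "Multal": {4, 5}}

print(is_partition(singular_plural, NUMBER_DOMAIN))  # exhaustive, no overlap
print(is_partition(overlapping, NUMBER_DOMAIN))      # Singular overlaps Paucal
print(is_partition(incomplete, NUMBER_DOMAIN))       # 2 and 3 are left uncovered
```

The concrete integers are a convenience of the sketch; nothing in the definition of a GFS depends on assigning such arithmetic meanings to the values.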

How would COPEs handle particular grammatical feature systems (GFS)?

Suppose languages A and B each have two distinct values for the GF Number. In A, the values are identified as Singular and Plural, and in B as Paucal and Multal. Without reference to a natural domain to model, such as that provided by the PositiveInteger domain, we would not be able to distinguish GFSs, except by name. However, we do not have to go so far as to identify these values with any particular semantics in the arithmetic domain, such as that Singular means 'one', Paucal means 'a small number', etc., in order to justify the choice of names. Instead, we may say that the GF values retain the same structural (logical) relationships as their natural counterparts, so that the Number value Singular entails Paucal but not conversely, and Multal entails Plural but not conversely. These relations can be diagrammed as follows as a partial description of the Number space for language in general.

The GFSs consisting of {Singular, Plural} and {Paucal, Multal} are the only non-trivial partitions of the Number space as described so far ({HasNumber} is a trivial partition), and so are likely candidates for inclusion in COPEs.
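The claim that these are the only non-trivial partitions can be checked mechanically. The sketch below models the Number space with three abstract regions, which is an assumption of the illustration (the region names `low`, `mid`, `high` are hypothetical and carry no arithmetic meaning); entailment is modelled as set inclusion.

```python
from itertools import combinations

# Assumption: three abstract regions are enough to reproduce the stated
# entailments; they are not meant as numeric ranges.
SPACE = frozenset({"low", "mid", "high"})
VALUES = {
    "HasNumber": SPACE,
    "Singular": frozenset({"low"}),
    "Paucal": frozenset({"low", "mid"}),
    "Plural": frozenset({"mid", "high"}),
    "Multal": frozenset({"high"}),
}


def entails(a, b):
    """Value `a` entails value `b` if a's region lies inside b's region."""
    return VALUES[a] <= VALUES[b]


def partitions():
    """Enumerate every set of named values that partitions SPACE."""
    found = []
    names = sorted(VALUES)
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            regions = [VALUES[name] for name in combo]
            union = frozenset().union(*regions)
            # Disjoint exactly when no region element is counted twice.
            disjoint = sum(len(r) for r in regions) == len(union)
            if disjoint and union == SPACE:
                found.append(set(combo))
    return found


print(entails("Singular", "Paucal"), entails("Paucal", "Singular"))
print(entails("Multal", "Plural"), entails("Plural", "Multal"))
for gfs in partitions():
    print(sorted(gfs))
```

Under these assumptions the enumeration yields the trivial partition {HasNumber} together with {Multal, Paucal} and {Plural, Singular}, and nothing else.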

Continuing this illustration one step further, suppose in language A, both the binary GFS {Singular, Plural} and the ternary GFS {Singular, Dual, Multal} occur, and that Plural serves as the disjunction of Dual with Multal, a property that emerges from the fact that a verb marked Plural agrees with a nominal that is marked either Dual or Multal. This observation results in the slightly fuller, but presumably still partial, description of the Number space for language in general.
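The relationship between the binary and ternary systems can be sketched in the same set-based style. The atom names and the `agrees` rule below are assumptions of this illustration: agreement is modelled, hypothetically, as the nominal's value falling inside the verb's value, which is one simple way to obtain the behaviour described for Plural.

```python
# Assumption: one abstract atom per value of the ternary system.
SINGULAR = frozenset({"sg"})
DUAL = frozenset({"du"})
MULTAL = frozenset({"mult"})
PLURAL = DUAL | MULTAL  # Plural as the disjunction of Dual with Multal
SPACE = SINGULAR | DUAL | MULTAL


def is_partition(cells, space):
    """True if the cells are non-empty, pairwise disjoint, and cover `space`."""
    seen = set()
    for cell in cells:
        if not cell or seen & cell:
            return False
        seen |= cell
    return seen == set(space)


def agrees(verb_value, nominal_value):
    """Hypothetical agreement rule: the nominal's value lies within the verb's."""
    return nominal_value <= verb_value


print(is_partition([SINGULAR, PLURAL], SPACE))        # binary GFS
print(is_partition([SINGULAR, DUAL, MULTAL], SPACE))  # ternary GFS
print(agrees(PLURAL, DUAL), agrees(PLURAL, MULTAL), agrees(PLURAL, SINGULAR))
```

Both systems partition the same space; the binary one is simply coarser, with its Plural cell splitting into Dual and Multal in the ternary one.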

In the PositiveInteger domain, there is also a concept corresponding to the disjunction of Singular with Multal, which might be called NonDual, but that does not mean that the concept is also required for the Number domain. If languages don't use it, it can be omitted from the description of the Number space for language in general. Not every PositiveInteger concept has to have a grammatical Number counterpart. In fact, very few do.

This brief account is meant only as an illustration of the kind of work that will be needed to create and maintain a knowledge base for linguistic description. Such a knowledge base has at its core an ontology that explicitly defines the central and essential concepts of linguistic description for the field as a whole, together with extensions that define the remaining concepts for use, at least initially, within specific communities of practice.