Classification Theory


After a long analysis of the more than one hundred questions at UKLO, we have developed a theory of classification that can sort and organise them logically and precisely. The theory rests on the following assumptions:

  • Linguistics as a main subject can be divided into smaller subjects.
    • We have observed this not only in linguistics but in every teachable subject. The smaller subjects of linguistics are better known by their names, e.g. Morphology, Phonetics, Phonology, Pragmatics, Semantics and Syntax. They form the main branches (smaller subjects) of Linguistics.
    • In the fine details of the original implementation, a few borderline cases were observed where certain subject matter sits in between two or more smaller subjects. We found that these cases are not a theoretical consequence of the subject itself (as was originally hypothesised) but arise from the constraints and construction of exam questions. For such cases, decision sorting trees have been proposed for each of the smaller subjects, and each question is run through them. This results in a branching where a borderline case may be treated as belonging to all of those smaller subjects, to some of them, or to just one. We have found that borderline cases are no longer a problem.
  • The theory assumes that every question from UKLO is and will be related to linguistics. There are consequences from this:
    • Every question relates to at least one smaller subject of linguistics. This follows from the previous assumption that linguistics is divisible into its smaller counterparts.
    • A question may therefore also relate to more than one smaller subject of linguistics.
  • A classification system should describe the unchangeable components of its members
    • This makes the classification system a description of the question itself. It removes ambiguity and fluctuation of descriptions.
    • If applied to this chair, the unchangeable components (without exterior alteration) of the chair are its height, width, weight (ignoring negligible natural decay), number of legs, material (wood) and time taken for completion. A classification system of this sort would not be interested in changeable components (e.g. age, the number of people who have sat in it) or changeable exteriors (e.g. public appeal, voting or ratings, ‘shininess’, ‘how comfortable it is’). A physical alteration would be a permanent change, so a reclassification would be made and the previous one discarded.
    • We have found this sturdy style of classification system to be most appropriate for UKLO questions, which are stable, largely unchangeable, non-animate, and discuss logical and description-driven conclusions.
    • It is possible that, once all unchangeable descriptions and foundations are specified for UKLO questions (which is the current challenge), different classifications will become possible, but at this stage of research and development that would not be recommended. General maintenance of the classification system must also be taken into account, so that what remains is undebatable and ultimately helpful.
    • This does not eliminate creative descriptors that would be outside the classification system
  • Therefore a classification system for UKLO questions should include its linguistics subject, as well as other unchangeable aspects that are relevant enough to be helpful for selection.
    • We have found 3 other unchangeable aspects: Question Format, Volume and Texture, as well as Theme
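
As a minimal sketch of the idea that a classification describes only unchangeable components, the four aspects named above could be held in an immutable record. The class name, field names and example values below are hypothetical and are not part of the UKLO system itself.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class QuestionClassification:
    """Hypothetical record of the four unchangeable aspects of one question."""
    linguistics_formula: str  # e.g. "MoPhePho"
    question_format: str      # e.g. "Answer"
    volume_and_texture: str   # e.g. "Words"
    theme: str                # e.g. "Numbers" or "No Theme (N/A)"


# Illustrative values only; they do not describe a real UKLO question.
example = QuestionClassification("MoPhePho", "Answer", "Words", "No Theme (N/A)")

try:
    example.theme = "Numbers"  # a frozen dataclass rejects reassignment
    mutated = True
except FrozenInstanceError:
    mutated = False

print(mutated)  # False
```

A frozen record mirrors the chair example: a genuine change to a question would mean building a new classification and discarding the old one, rather than editing it in place.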

Classification Process

To successfully classify a question, a judgement is made on 4 key aspects found in every UKLO question in this order:

  1. Linguistics Formula from the UKLO Periodic Table (63 elements theoretically possible, 17 found)
  2. Question Format (3 found)
  3. Volume and Texture (4 found)
  4. Theme (7 found)
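
The figure of 63 theoretically possible elements is consistent with counting every non-empty combination of the six linguistics subjects (2^6 - 1 = 63). The sketch below assumes that this is how the figure arises, which the text does not state explicitly, and that bolding is ignored. The abbreviations are those used in the Glossary of Terms; the function name is hypothetical.

```python
from itertools import combinations

# Abbreviations from the Glossary of Terms, already in alphabetical order.
SUBJECTS = ["Mo", "Phe", "Pho", "Pr", "Se", "Sy"]


def all_formulae(subjects=SUBJECTS):
    """List every non-empty combination of subjects as a formula string."""
    return [
        "".join(combo)
        for size in range(1, len(subjects) + 1)
        for combo in combinations(subjects, size)
    ]


formulae = all_formulae()
print(len(formulae))           # 63, i.e. 2**6 - 1
print("MoPhePho" in formulae)  # True
```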

1. Linguistics Formula

Every UKLO question has a linguistics formula, which states the linguistics subject or subjects present in (or applied by) the question. The subjects are grouped together in alphabetical order, e.g. MoPhePho means that the question applies Morphology, Phonetics and Phonology in its problem solving. See the Glossary of Terms below for a full description of these subjects and how they have been classified for UKLO questions.

Some formulae have a bolded section (e.g. MoPhe, with Mo in bold). This means that the question mostly concerns the application of the bolded subject (in this example, the question is mostly focused on morphology).

For search box purposes, the formula is written with a different syntax: every linguistics subject has an asterisk before it (*Mo*Phe), and for direct searches an underscore is placed before and after the whole formula (_*Mo*Phe_).
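
The search-box syntax can be sketched as a small conversion. This is an illustration only: the function names are hypothetical, and it assumes a formula is a plain concatenation of the abbreviations from the Glossary of Terms.

```python
import re

# Abbreviations from the Glossary of Terms.
SUBJECT_PATTERN = re.compile(r"Mo|Phe|Pho|Pr|Se|Sy")


def split_formula(formula):
    """Split a formula such as 'MoPhePho' into ['Mo', 'Phe', 'Pho']."""
    parts = SUBJECT_PATTERN.findall(formula)
    if not parts or "".join(parts) != formula:
        raise ValueError(f"unrecognised formula: {formula!r}")
    return parts


def search_term(formula, direct=False):
    """Build the search-box form; direct=True adds the surrounding underscores."""
    term = "".join("*" + part for part in split_formula(formula))
    return f"_{term}_" if direct else term


print(search_term("MoPhe"))               # *Mo*Phe
print(search_term("MoPhe", direct=True))  # _*Mo*Phe_
```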

2. Question Format

There are three different question formats for UKLO. They ultimately fall into 2 categories: one where the answer is given on the question paper, and one where the answer is not given on the question paper.

The answer is not given on the question paper:

Answer: This means that the student must write in their own answer to a question (usually in an empty box) using the data given to them.

The answer is given on the question paper:

Match-up: Questions that involve matching a known Set A with another known Set B. Here the answers are given but jumbled up, and the question asks you to match them up.

Multiple-Choice: Questions that involve ticking/crossing/selecting a correct answer out of a list of possible answers. It could also be selecting an option out of a variety of given options.

N.b. Match-up and Multiple-Choice questions are different by design: a correctly answered match-up cannot have any unpaired elements (it is a bijection), whereas a multiple-choice question leaves the wrong answers unselected.

Some UKLO Questions have a mix of these 3 question formats:

Mixed: Where the exam paper includes more than one question format (4 possible)

Answer and Match-up

Answer and Multiple-Choice

Match-up and Multiple-Choice

Answer, Match-up and Multiple-Choice
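
The four mixed possibilities listed above are simply the combinations of two or more of the three formats. A minimal sketch of how a label could be derived from the formats present in a question follows; the function name and the "Mixed: " prefix are assumptions made for illustration.

```python
# The three formats, in the order the text lists them.
FORMATS = ["Answer", "Match-up", "Multiple-Choice"]


def format_label(formats_present):
    """Return the single format, or a 'Mixed' label naming every format present."""
    present = [f for f in FORMATS if f in formats_present]
    if not present or len(present) != len(set(formats_present)):
        raise ValueError("expected one or more of: " + ", ".join(FORMATS))
    if len(present) == 1:
        return present[0]
    return "Mixed: " + ", ".join(present[:-1]) + " and " + present[-1]


print(format_label({"Match-up"}))
print(format_label({"Multiple-Choice", "Answer"}))
print(format_label({"Answer", "Match-up", "Multiple-Choice"}))
```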

3. Volume and Texture

This aspect answers two questions. The first is a question of texture: does the question data use the Latin alphabet (which includes IPA and numerals in this classification) or not? The second is a question of volume: if the data is written using the Latin alphabet (including IPA and numerals), how long are the data set and the answers that students are required to give?

It uses the Latin alphabet, IPA, or numerals

These categories have no further embellishment in terms of script use.

Words: The question data and the required answer are 1 to 3 words long. This category also includes determiner phrases (e.g. your houses) and their translations, as well as affixes (e.g. answer: -s for plural). For UKLO classification purposes only, we define 1 word as becoming 2 words when there is a space between them.
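
The word-counting rule above (a space turns 1 word into 2) can be sketched in one line. Treating any run of whitespace as a single separator is an assumption of this sketch, and the function name is hypothetical.

```python
def count_words(text):
    """Count words by splitting on whitespace, so 'your houses' counts as 2."""
    return len(text.split())


print(count_words("your houses"))  # 2
print(count_words("-s"))           # 1
```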

Sentences: The question data and the required answer are sentences (3+ words). N.b. the definition of a sentence is not discussed here; the word ‘sentence’ is used to mean ‘long string’, where we define a string as long at 3+ words. This threshold is hypothesised and is meant to highlight the change in problem solving as strings go past 3 words: problem solving becomes more sentential and involves different cognitive tasks, and this is also the point at which strings are generally recognised as sentences.

Words & Sentences: The question data and the required answer have a mixture of both words and sentences. N.b. if there were 20 instances of sentences and one word, the question may be classified as Sentences and the single word would not be picked up by the classification. This category usually marks a substantial mixture of both.

It does not use the Latin alphabet, IPA, or numerals

Writing: Characterised by a foreign script. The question or the required answer is written in a foreign script. This is a sensitive measure: any data set with a foreign script is assigned Writing.

So far, there has not been a need for a volume measure for Writing questions; this may change in the future.


4. Theme

Some UKLO questions are characterised by a theme. A theme is another subject area or topic area being applied to the question. N.b. a theme must have some linguistic significance (e.g. a question about fishing won’t have ‘fishing’ qualified as a theme unless there was something linguistically relevant, e.g. the grammar of fishing techniques used by narwhals, for argument’s sake). It can be seen as a form of application to real-life circumstances. There are many:

Encrypted: These questions are thematic, involving decoding a secret message. These questions have been found to not easily fall into any other category.

Maps: Questions that deal with topological space, diagrammatic representation by relation or distance, networks and navigation. This is a cognitive category that deals with spatial reasoning and orientation in a topological map or space. It includes questions with locational maps (like train maps or maps of cities) that involve spatial reasoning to varying degrees; these are given the general label ‘Maps’. There are two subcategories of Maps:

  • Maps (Grid): Questions that involve moving on a grid map (or topological grid) and deal with navigation and perspective.
  • Maps (Family): Questions that involve family tree maps and networks (kinships) and deal with relational reasoning up and down the tree, and with the language terminology used to represent them.

Numbers: Tasks that involve calculation or semantic matching of numbers required in the question. These also include numbers in equations as well as number bases. The vast majority of Numbers questions are Sentences.

Senses and Feelings: Linguistic content that describes emotional concepts or senses (like smells or sounds).

No Theme (N/A): A question centered on the main subjects of linguistics

Updated Report

The new classification system offers a more dynamic and detailed classification of questions. It is also simpler and breaks the questions down to their most basic forms. It removes grey areas from the previous system (e.g. hybrids and hybrid mutation, Combination Tables, and Multiple-Choice not being given a volume). We have tested the system and it has classified all questions comfortably. Many holes in the previous system have been patched (e.g. the indeterministic C Class), and the system is now theoretically complete. It also maps out all possibilities and leaves room for expansion in question diversity.

Glossary of Terms

Linguistics Subjects:

Disclaimer: definitions in quotation marks are formal definitions of the subject; the sentences that follow them describe how the subjects have been interpreted in the context of UKLO questions and how they have been classified in the table.

Morphology (Mo): “the study of the forms of words, in particular inflected forms.” A question that involves working with morphological rules or morpheme translation to solve questions OR is totally thematic, apparent and based in the study of morphology either strongly or loosely.

Phonetics (Phe): “the study and classification of speech sounds.” A question that involves working with the articulation of speech sounds to solve questions OR is totally thematic, apparent and based in the study of phonetics either strongly or loosely.

Phonology (Pho): “the branch of linguistics that deals with systems of sounds (including or excluding phonetics), within a language or between different languages.” A question that involves working with phonological rules to solve questions OR is totally thematic, apparent and based in the study of phonology either strongly or loosely.

Pragmatics (Pr): “the branch of linguistics dealing with language in use and the contexts in which it is used, including such matters as deixis, the taking of turns in conversation, text organization, presupposition, and implicature.” A question that involves observing and formulating conclusions based on the use of language in particular social contexts and scenarios to solve questions OR is totally thematic, apparent and based in the study of pragmatics either strongly or loosely.

Syntax (Sy): “the arrangement of words and phrases to create well-formed sentences in a language.” A question that involves working with syntactic/grammatical rules to solve questions OR is totally thematic, apparent and based in the study of syntax expressed strongly or loosely.

Semantics (Se): “the branch of linguistics and logic concerned with meaning. The two main areas are logical semantics, concerned with matters such as sense and reference and presupposition and implication, and lexical semantics, concerned with the analysis of word meanings and relations between them.” A question that involves word or symbol-symbol translation, postulating set sizes of the meanings of words/morphemes, or rules that change the metaphysical meaning of a word (this may include, but is not limited to, word identity, changes in word extension or intension, cause-effect or actionable change) to solve questions OR is totally thematic, apparent and based in the study of semantics either strongly or loosely.

Specific Categorisations

Disclaimer: in an attempt to keep categories separate and ease classification, certain adjustments have been made so that categories can be classified independently. We are confident that the classifications here accurately resemble the common encounters and subject topics you will find in each linguistic subject, but some theoretical adjustments have been made to create a more vibrant and diverse classification. Some topic areas that come with the style of UKLO questions and cognitive reasoning have also been classified under these categories. Subject boundaries in linguistics are heavily discussed; for more information on the theoretical accuracy of the boundaries in relation to the information below, please contact your local linguist.

Phonology: Consonants, (Long/Short) Vowels, Syllable Stress, Diacritics, Nasalisation, Assimilation, Word Forming (through consonant-vowel-consonant deduction), Mutations, Dissimilation, Deletion, Insertion, Tones and Tonal Patterns, Onset and Coda, Vowel Harmony, Phonotactics

Phonetics: Articulation of consonants and vowels from the IPA alphabet.

Pragmatics: Deciphering meaning from the appropriate usage of pronouns in a particular context. Applying word semantics to a mixture of scenarios.

Morphology: Affixes, Infixes, Noun Compounding, Affixal/Morphemes (a morpheme, as defined here, is the simplest unit of grammatical meaning, or of other meaning exterior to its own marking; it does not include single consonants, diphthongs or vowels that have no associated semantic meaning or grammatical marking in the language), Word Formation (through morpheme compounding), Word Segmentation, Picture Segmentation (in scripts). N.b. if the target language is presented only in single words with morphemes and semantic meaning attached, the question is categorised as morphology and not syntax.

Semantics: Noun/Verb Translation, Morpheme Translation, Semantic Matching (e.g. symbol – IPA transliteration or tasks in match-ups/writing scripts), Decoding (Cryptography), Word Formation (through cognitive/pattern reasoning), Animate/Inanimate Nouns, Classifiers (Shape, Verb), Correspondences in meanings with English (either similar words or pronunciations), Polarity (positive/negative), Different types of Verbs/Nouns, Inherent Plurality, Polysemy.

Syntax: Word Order, Grammatical Functions of Subjects, Objects and Verbs, Grammatical Case, Focus, Transitivity, Affixes (does not include Affixal/Morpheme Order), Tense, Word Formation (through grammatical affixes), Normalisation, Noun Phrases, Inflectional Rules, Grammatical Gender, Singular and Plural, Chunking, Clauses and Phrases, Adjectives, Definite and Indefinite Articles. N.b. if grammatical/morphological rules apply to more than one word, the question is also classed as syntax.


Some questions sit in between two question styles. Here are decision processes that we have found help to categorise them (N.b. research is ongoing and these may change).

Syntax or Morphology (for classification purposes only)?

Current working hypothesis for questions that are difficult to classify (only):

One word? Y >> Morphology

N: Affixes only? Y >> Syntax, N >> ?
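
The working hypothesis above can be written as a short function. This is a sketch under the assumption that the two checks are simple yes/no judgements; the function and parameter names are hypothetical, and the final branch is left undecided because the hypothesis itself leaves it open.

```python
from typing import Optional


def syntax_or_morphology(one_word: bool, affixes_only: bool) -> Optional[str]:
    """Apply the working hypothesis; None means the tree gives no answer."""
    if one_word:
        return "Morphology"
    if affixes_only:
        return "Syntax"
    return None  # this branch is still marked "?" in the hypothesis


print(syntax_or_morphology(one_word=True, affixes_only=False))
print(syntax_or_morphology(one_word=False, affixes_only=True))
print(syntax_or_morphology(one_word=False, affixes_only=False))
```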

Sentences or Words?

Does the data set have both words and sentences? Y > Words & Sentences

N> Are the data set and the answers given words? Y > Words

N> Are the data set and the answers given sentences? Y > Sentences

N> Words & Sentences
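
The Sentences or Words decision process can be sketched as a chain of checks in the same order as above. The function and parameter names are hypothetical, and each parameter stands for a yes/no judgement made by the person classifying the question.

```python
def words_or_sentences(has_both: bool, all_words: bool, all_sentences: bool) -> str:
    """Walk the decision process top to bottom and return the volume label."""
    if has_both:
        return "Words & Sentences"
    if all_words:
        return "Words"
    if all_sentences:
        return "Sentences"
    # Anything that fits neither pure category falls back to the mixed label.
    return "Words & Sentences"


print(words_or_sentences(has_both=False, all_words=True, all_sentences=False))
print(words_or_sentences(has_both=False, all_words=False, all_sentences=False))
```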

Morphology or Semantics?

Is there any mention of word parts or working out the parts of words? N > Semantics

Y > Does each data set contain more than one word to translate? Y > Semantics and Morphology

N> Morphology
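
A minimal sketch of the Morphology or Semantics decision process follows. The function and parameter names are hypothetical, and the result is returned as a tuple of subject names in alphabetical order, matching the ordering used for formulae.

```python
def morphology_or_semantics(mentions_word_parts: bool, several_words_per_data_set: bool):
    """Return the subject or subjects chosen by the decision process."""
    if not mentions_word_parts:
        return ("Semantics",)
    if several_words_per_data_set:
        return ("Morphology", "Semantics")
    return ("Morphology",)


print(morphology_or_semantics(mentions_word_parts=True, several_words_per_data_set=True))
```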

Phonetics or Phonology?

Is there any application of phonological rules? Y> Phonology (continue)

N> Are there detailed phonetic transcriptions (characterised by square brackets)? Y > Phonetics/Phonetics and Phonology (if 1st question Y)

N> Is there a mention of the IPA, or a drawing of the IPA Chart? Y > Phonetics /Phonetics and Phonology (if 1st question Y)

N> Is there a mention of the physical pronunciation, usually accompanied by diagrams (how to articulate sounds in the language; n.b. not simple descriptions like ‘sh like in shoe’)? Y > Phonetics/Phonetics and Phonology (if 1st question Y)

N> not Phonetics or Phonology/just Phonology (if 1st question Y)
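
The Phonetics or Phonology process reduces to two independent judgements: the first question decides Phonology, and any one of the three later questions decides Phonetics. The sketch below assumes that reading; the function and parameter names are hypothetical.

```python
def phonetics_or_phonology(
    phonological_rules: bool,
    phonetic_transcriptions: bool,
    mentions_ipa: bool,
    describes_articulation: bool,
):
    """Return the subjects chosen; an empty tuple means neither applies."""
    subjects = []
    # Any one of the three phonetics checks is enough to assign Phonetics.
    if phonetic_transcriptions or mentions_ipa or describes_articulation:
        subjects.append("Phonetics")
    # The first question alone decides whether Phonology is assigned.
    if phonological_rules:
        subjects.append("Phonology")
    return tuple(subjects)


print(phonetics_or_phonology(True, False, True, False))
print(phonetics_or_phonology(False, False, False, False))
```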