109K516-TÜBİTAK-SOBAG
Corpus Verification of Verbs in the Turkish Vocabulary and a Corpus Based Dictionary of Verbs

SUMMARY

Today, in parallel with advances in technology, the practice of lexicology has reached a point where lexemes can be processed faster and more reliably. An important reason is that lexicology draws on the principles and procedures of corpus linguistics to describe these lexemes in detail. That is, lexicology and corpus linguistics are now regarded as two distinct but closely connected branches of linguistics.

When published dictionaries of the Turkish vocabulary are analyzed in terms of the principles and procedures of lexicology, they turn out to be “general purpose and encyclopedic”. The reasons include the spellings of lexemes, deficient headword and intra-word explanations, inadequate examples (with many lexemes having no examples at all), deficient word-class labeling, and words that do not reflect current usage.

The reason is that a qualified, comprehensive, and standard corpus has not yet been compiled for Turkish. As a result, dictionary studies are far from being corpus based. They are also far from an approach that builds on the results of corpus-based studies, takes account of lexemes and the frequencies of their senses, and aims to be up to date and user-friendly.

The aim of this study is to check the head-verbs and intra-verbs compiled from dictionaries so far against a qualified, comprehensive, and standard corpus that reflects current usage, in terms of their orthography, head-explanations, examples (synonyms), and word-class labels, and to recompose our dictionary entries in the light of the data obtained from this corpus. The expected results are as follows:

1. To determine head-verbs and intra-verbs based on their frequencies in a highly representative corpus.
2. To specify head-verbs and intra-verbs exactly in terms of their orthographic features in the Turkish vocabulary.
3. To overcome the deficiencies in the intra-explanations of head-verbs.
4. To overcome the inadequacies of the dictionary examples given for head-verbs and intra-verbs.
5. To label the word classes of head-verbs and intra-verbs exactly.
6. To compile up-to-date lexemes as material for special-purpose dictionary studies (language teaching, collocations, synonyms and near-synonyms, antonyms, etc.).
7. To compile head-verbs and intra-verbs morphologically, reflecting current usage.
8. To contribute to studies of lexicology through the Corpus Based Dictionary of Verbs in Turkish, and to present the lexicological materials gathered throughout this study in a virtual environment for further studies.

The project has three steps: forming the corpus, compiling in all their aspects the almost 6,400 head-verbs and intra-verbs defined in the Turkish vocabulary, and publishing the results to conclude the study.

In the first step, various texts of written Turkish, together with texts selected from the internet, amounting to about 30 million words in total, will be digitized using software developed for our purposes and prepared for lexicological compilation through a database application. In the second step, the verbs in the Turkish vocabulary will be compiled by checking them against this corpus. In the third step, the results will be published in a virtual environment as the Corpus Based Dictionary of Verbs in Turkish.
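The second step, checking dictionary verbs against the corpus, can be illustrated with a minimal Python sketch. It is not the project's software: the function names and the toy data are hypothetical, and it assumes the corpus has already been reduced to dictionary forms (lemmas), which for Turkish would require a morphological analyzer that is not shown here.

```python
from collections import Counter


def verb_frequencies(corpus_lemmas, dictionary_verbs):
    """Count how often each dictionary verb occurs in a lemmatized corpus.

    Assumption: corpus_lemmas already holds dictionary forms such as
    "gelmek"; inflected forms would first need morphological analysis.
    """
    counts = Counter(corpus_lemmas)
    # Counter returns 0 for missing keys, so unattested verbs get 0.
    return {verb: counts[verb] for verb in dictionary_verbs}


def unattested(frequencies):
    """Return the dictionary verbs that never occur in the corpus."""
    return sorted(verb for verb, n in frequencies.items() if n == 0)


# Toy data standing in for the corpus and the verb list (hypothetical).
corpus_lemmas = ["gelmek", "gitmek", "gelmek", "ev", "almak", "gelmek"]
dictionary_verbs = ["gelmek", "gitmek", "almak", "bakmak"]

freqs = verb_frequencies(corpus_lemmas, dictionary_verbs)

# Rank verbs by descending frequency, breaking ties alphabetically.
ranked = sorted(freqs.items(), key=lambda item: (-item[1], item[0]))

for verb, n in ranked:
    print(f"{verb}\t{n}")
print("Not attested in the corpus:", unattested(freqs))
```

A frequency table of this kind would support the first expected result (determining verbs by corpus frequency), while the list of unattested verbs flags entries whose currency in the language may need review.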

(Translated by Nurdan ARMUTÇU)
