The columns in the total lexicon correspond to the following variables: Frequency Counts (F), Zipf, Frequency per million (F/m), Log Frequency per million (LogF/m), Dispersion (D), Estimated frequency per million (U), Standard Frequency Index (SFI), Contextual Diversity (CD), orthographic and phonological Levenshtein distance (OLD20), orthographic and phonological neighbourhood (based on Coltheart's N), number of letters, syllables, number of syllables, stress position, phonemic transcription, number of phonemes, unstressed orthographic and phonological form, pronunciation (sound files), and translation in English.
Users can apply search criteria and choose filter variables, paste lists of words and export the results to CSV files, or search word by word. They can also download the lexicons as Excel files.
Frequency Counts (F): raw number of word occurrences in the textbooks.
Zipf: A standardised frequency value expressed on a logarithmic scale, first introduced by van Heuven et al. (2014). The scale ranges from 1 (very low frequency) to 7 (very high frequency).
Frequency per million (Freq_pm): Calculated as Frequency Counts × 1,000,000 / Number of tokens.
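As an illustration of the two frequency measures above, here is a minimal Python sketch. The function names and the corpus size in the usage note are hypothetical. The Zipf formula is the basic definition from van Heuven et al. (2014), log10 of the frequency per million words plus 3; the database may additionally apply smoothing or other corrections that are not described here.

```python
import math


def freq_per_million(count: int, n_tokens: int) -> float:
    """Frequency Counts * 1,000,000 / Number of tokens."""
    return count * 1_000_000 / n_tokens


def zipf(fpm: float) -> float:
    """Basic Zipf value: log10(frequency per million words) + 3.

    Assumption: no smoothing is applied, so fpm must be greater than 0.
    """
    return math.log10(fpm) + 3
```

For example, a word that occurs 50 times in a hypothetical corpus of 2,000,000 tokens has a frequency per million of 25, and a word with a frequency of 1 per million has a Zipf value of 3.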
Orthographic Levenshtein Distance (OLD20 Orth): The mean Levenshtein distance between a word and its 20 closest orthographic neighbours, where the Levenshtein distance is the minimum number of letter substitutions, deletions and insertions needed to turn one word into the other.
Phonological Levenshtein Distance (OLD20 Phon): The mean Levenshtein distance between a word and its 20 closest phonological neighbours, where the Levenshtein distance is the minimum number of phoneme substitutions, deletions and insertions needed to turn one word into the other.
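The two Levenshtein measures above can be sketched as follows. This is an illustrative Python implementation, not the code used to build the database; the function names and the toy English lexicon in the usage note are hypothetical. It assumes that neighbours are drawn from the whole lexicon, excluding the target word itself.

```python
from typing import Iterable, Sequence


def levenshtein(a: Sequence, b: Sequence) -> int:
    """Minimum number of substitutions, deletions and insertions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        curr = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def mean_closest_distance(word: Sequence, lexicon: Iterable[Sequence], n: int = 20) -> float:
    """Mean Levenshtein distance from word to its n closest neighbours in lexicon."""
    distances = sorted(levenshtein(word, other) for other in lexicon if other != word)
    closest = distances[:n]
    if not closest:
        raise ValueError("lexicon contains no word other than the target")
    return sum(closest) / len(closest)
```

With the toy lexicon ["cat", "cot", "coat", "dog", "cart", "at"], the distances from "cat" are 1, 1, 3, 1 and 1, so the mean over the 3 closest neighbours is 1.0 and the mean over all 5 is 1.4. Passing phoneme sequences (for instance lists of IPA symbols) instead of letter strings gives the phonological variant.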
Orthographic Neighbourhood (ColtNorth): The number of words that can be formed by replacing exactly one letter of the word with another letter in the same position (Coltheart et al., 1977).
Phonological Neighbourhood (ColtPhon): The number of words that can be formed by replacing exactly one phoneme of the word with another phoneme in the same position (Coltheart et al., 1977).
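A minimal Python sketch of the neighbourhood count described above, given as an illustration rather than the database's actual procedure. The function name and the toy lexicon are hypothetical, and the sketch assumes that each letter (or phoneme) is represented by one character of the string.

```python
from typing import Iterable


def coltheart_n(word: str, lexicon: Iterable[str]) -> int:
    """Count lexicon words that differ from word in exactly one position."""
    count = 0
    for other in set(lexicon):  # ignore duplicate entries
        if other == word or len(other) != len(word):
            continue  # neighbours have the same length as the target
        mismatches = sum(a != b for a, b in zip(word, other))
        if mismatches == 1:
            count += 1
    return count
```

In the toy lexicon ["cat", "cot", "cut", "car", "coat", "dog", "bat"], the word "cat" has 4 neighbours: "cot", "cut", "car" and "bat". "coat" is excluded because its length differs, and "dog" because it differs in more than one position.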
Number of letters (letter_length): The number of letters in a word (word length).
Syllables: Syllabified words.
Number of syllables (syl_length): Number of syllables per syllabified word.
Stress position (stress_pos): Numbered position of the stressed syllable, counted from the end of the word (1 = stress on the ultimate syllable, 2 = stress on the penultimate syllable, 3 = stress on the antepenultimate syllable).
Phonemes: IPA phonemic transcription of words.
Number of Phonemes (phon_length): Number of phonemes per word.
Unstressed orthographic and phonological form (unstressed_orth, unstressed_phon): Orthographic and phonological form of the word without stress
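The stress-related variables above (stress_pos and the unstressed orthographic form) could be derived from Greek spelling roughly as follows. This is a hedged Python sketch: the function names are hypothetical, the input to `stress_position` is assumed to be a list of syllables, and stress is assumed to be marked with the Greek tonos (acute accent). The phonological form is not covered, since it depends on how stress is encoded in the transcription.

```python
import unicodedata
from typing import Optional, Sequence

ACUTE = "\u0301"  # combining acute accent: the Greek tonos after NFD decomposition


def strip_stress(word: str) -> str:
    """Return the word without its stress mark, keeping other diacritics."""
    decomposed = unicodedata.normalize("NFD", word)
    return unicodedata.normalize("NFC", decomposed.replace(ACUTE, ""))


def stress_position(syllables: Sequence[str]) -> Optional[int]:
    """Position of the stressed syllable counted from the end (1 = ultimate).

    Returns None if no syllable carries a stress mark.
    """
    for pos, syllable in enumerate(reversed(syllables), start=1):
        if ACUTE in unicodedata.normalize("NFD", syllable):
            return pos
    return None
```

For example, `strip_stress("γάτα")` gives "γατα", and `stress_position(["γά", "τα"])` gives 2 because the stress falls on the penultimate syllable. Only the acute accent is removed, so a diaeresis is kept.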
Sounds: Recordings of the words pronounced by native speakers of Greek.
Translations: The first three translations as they appear in the Oxford Greek-English Learner's Dictionary (Stavropoulos, 2008).