Explaining computerized English testing in plain English

Pearson Languages

Research has shown that automated scoring can give more reliable and objective results than human examiners when evaluating a person’s mastery of English. This is because an automated scoring system is impartial, unlike humans, who can be influenced by irrelevant factors such as a test taker’s appearance or body language. Additionally, automated scoring treats regional accents equally, unlike human examiners, who may favor accents they are more familiar with. Automated scoring also allows the individual features of a spoken or written response to be analyzed independently of one another, so that a weakness in one area of language does not affect the scoring of other areas.

PTE Academic was created in response to the demand for a more accurate, objective, secure and relevant test of English. Our automated scoring system is a central feature of the test, and vital to ensuring the delivery of accurate, objective and relevant results – no matter who the test taker is or where the test is taken.

Development and validation of the scoring system to ensure accuracy

PTE Academic’s automated scoring system was developed after extensive research and field testing. A prototype test was developed and administered to a sample of more than 10,000 test takers from 158 different countries, speaking 126 different native languages. This data was collected and used to train the automated scoring engines for both the written and spoken PTE Academic items.

To do this, multiple trained human markers assess each answer. Those results are used as the training material for machine learning algorithms, similar to those used by systems like Google Search or Apple’s Siri. The model makes initial guesses at the score each response should get, consults the actual human scores to see how well it did, adjusts its parameters accordingly, then goes through the training set over and over again, adjusting and improving until it converges on a solution – one that ideally comes very close to predicting the full set of human ratings.
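To make the guess-compare-adjust loop concrete, here is a minimal sketch that fits a simple linear model to human-assigned scores with gradient descent. The features, weights and data are invented for illustration; a production scoring engine uses far richer features and models, but the training loop follows the same basic idea.

```python
# A minimal sketch of training a scoring model against human ratings.
# All features and numbers here are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical features for 1,000 responses (e.g. vocabulary range,
# grammar-error rate, coherence estimate) plus the human scores to match.
X = rng.normal(size=(1000, 3))
true_w = np.array([1.5, -0.8, 0.6])            # hidden "human" weighting
y = X @ true_w + rng.normal(scale=0.1, size=1000)

w = np.zeros(3)       # the model's initial blind guess
lr = 0.1              # learning rate

for epoch in range(200):                 # repeated passes over the data
    pred = X @ w                         # guess a score for each response
    grad = X.T @ (pred - y) / len(y)     # compare with the human scores
    w -= lr * grad                       # adjust and try again

print("learned weights:", w)             # converges toward true_w
```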

Once trained and performing at a high level, this model is used as a marking algorithm, able to score new responses just as human markers would. Correlations between scores given by this system and trained human markers are very high. The standard error of measurement between Pearson’s system and a human rater is smaller than that between one human rater and another – in other words, the machine agrees with a human rater more closely than two human raters agree with each other, because much of the bias and unreliability has been removed. In general, you can think of a machine scoring system as one that distills the most consistent judgments from human ratings, then applies them like an idealized human marker.
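As an illustration of the agreement statistics involved, the sketch below computes the Pearson correlation between machine and human scores and a standard error of measurement derived from it. The score arrays are hypothetical stand-ins, and the SEM formula shown is the classical-test-theory definition (with the correlation standing in for reliability), not necessarily the exact statistic Pearson reports.

```python
# Agreement statistics between two sets of scores (illustrative data).
import numpy as np

human   = np.array([61, 72, 55, 80, 67, 74, 58, 69], dtype=float)
machine = np.array([62, 70, 57, 79, 66, 75, 60, 68], dtype=float)

r = np.corrcoef(human, machine)[0, 1]           # Pearson correlation
# Classical SEM: SD * sqrt(1 - reliability), with r as a reliability proxy.
sem = np.std(human, ddof=1) * np.sqrt(1 - r)

print(f"correlation: {r:.3f}, SEM: {sem:.2f} score points")
```

A smaller SEM between machine and human than between two humans is what supports the claim above: the machine tracks a careful human rater more tightly than human raters track each other.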

Pearson conducts scoring validation studies to ensure that the machine scores remain consistently comparable to ratings given by skilled human raters. In these studies, a new set of test taker responses (never seen by the machine) is scored both by human raters and by the automated scoring system. Research has demonstrated that the automated scoring technology underlying PTE Academic produces scores comparable to those of careful human experts. In other words, the automated system “acts” like a human rater when assessing test takers’ language skills, but does so with a machine’s precision, consistency and objectivity.

Scoring speaking responses with Pearson’s Ordinate technology

The spoken portion of PTE Academic is automatically scored using Pearson’s Ordinate technology, the result of years of research in speech recognition, statistical modeling, linguistics and testing theory. The technology uses a proprietary speech processing system specifically designed to analyze and automatically score speech from fluent and second-language English speakers. Beyond the words themselves, the Ordinate scoring system collects hundreds of pieces of information from each spoken response, such as pace, timing and rhythm, as well as vocal power, emphasis, intonation and accuracy of pronunciation. It is trained to recognize even somewhat mispronounced words, and quickly evaluates the content, relevance and coherence of the response. In particular, the meaning of the spoken response is evaluated, making it possible for the models to assess whether what was said deserves a high score.
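To give a feel for this kind of multi-feature analysis, here is a deliberately simplified sketch in which trait sub-scores such as fluency and content are computed from a handful of speech features. Every type, function and threshold here is hypothetical; Ordinate’s actual pipeline is proprietary and far more sophisticated.

```python
# Illustrative only: combining a few speech features into trait scores.
from dataclasses import dataclass

@dataclass
class SpokenResponse:
    words: list[str]         # words recognized by the speech processor
    pause_ratio: float       # share of the response spent in silence
    words_per_second: float  # speaking pace
    pron_score: float        # pronunciation accuracy estimate, 0-1

def fluency_score(r: SpokenResponse) -> float:
    """Combine pace and pausing into a rough 0-1 fluency estimate."""
    pace = min(r.words_per_second / 3.0, 1.0)  # treat ~3 words/sec as fluent
    return 0.6 * pace + 0.4 * (1.0 - r.pause_ratio)

def content_score(r: SpokenResponse, expected: set[str]) -> float:
    """Crude relevance check: share of expected keywords actually spoken."""
    said = {w.lower() for w in r.words}
    return len(said & expected) / len(expected)

resp = SpokenResponse(words=["climate", "change", "affects", "crop", "yields"],
                      pause_ratio=0.2, words_per_second=2.4, pron_score=0.9)
print(fluency_score(resp), content_score(resp, {"climate", "crop", "yields"}))
```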

Scoring writing responses with Intelligent Essay Assessor™ (IEA)

The written portion of PTE Academic is scored using the Intelligent Essay Assessor™ (IEA), an automated scoring tool powered by Pearson’s state-of-the-art Knowledge Analysis Technologies™ (KAT) engine. Based on more than 20 years of research and development, the KAT engine automatically evaluates the meaning of text, such as an essay written by a student in response to a particular prompt. It evaluates writing as accurately as skilled human raters using a proprietary application of the mathematical approach known as Latent Semantic Analysis (LSA). LSA infers the meaning of words and passages by analyzing large bodies of relevant text, which allows the KAT engine to understand the meaning of an essay much as a human reader would.
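For readers curious about the underlying idea, the sketch below demonstrates generic LSA with off-the-shelf scikit-learn components: texts are projected into a reduced semantic space and compared by meaning rather than by exact word overlap. This illustrates the general technique only, not Pearson’s proprietary KAT implementation, and the corpus is a tiny stand-in for the large text collections LSA actually needs.

```python
# Generic LSA: TF-IDF term-document matrix reduced with truncated SVD.
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

corpus = [                      # tiny stand-in for a large training corpus
    "The essay discusses the causes of climate change.",
    "Global warming is driven by greenhouse gas emissions.",
    "The novel follows a detective in Victorian London.",
    "Rising temperatures result from carbon dioxide in the atmosphere.",
]

tfidf = TfidfVectorizer().fit_transform(corpus)   # term-document matrix
lsa = TruncatedSVD(n_components=2, random_state=0).fit_transform(tfidf)

# Texts on the same topic land close together in the semantic space,
# even when they share few exact words.
print(cosine_similarity(lsa[:1], lsa[1:]))
```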

What aspects of English does PTE Academic assess?

Written scoring

  • Word choice
  • Grammar and mechanics
  • Progression of ideas
  • Organization
  • Style, tone
  • Paragraph structure
  • Development, coherence
  • Point of view
  • Task completion

Spoken scoring

  • Sentence mastery
  • Content
  • Vocabulary
  • Accuracy
  • Pronunciation
  • Intonation
  • Fluency
  • Expressiveness
  • Pragmatics

More blogs from Pearson

  • 5 STEAM myths debunked

    By Sarah Hillyard
    Reading time: 4 minutes

    STEAM (Science, Technology, Engineering, Art and Maths) sounds like an overwhelming combination of subjects to teach – and only suitable for expert educators. But the reality is that doing STEAM is simpler than you think. Here are 5 common STEAM myths and the truth behind them. We also outline a number of simple activities you can try with your students.

    1. STEAM requires a lot of time

    STEAM projects encourage curiosity, creativity and collaboration in the classroom – but they have a reputation for being preparation-heavy and for requiring a lot of teaching time and energy.

    But to get the full benefit of STEAM, there’s no need to plan out a full-blown project that lasts a whole month. In fact, you might integrate just one STEAM lesson into your syllabus. Or a lesson could contain a one-off 10-minute STEAM challenge.

    Here are some easy, low-preparation challenges your classes can take part in:

    10-minute STEAM challenges:

    • Winter unit: How tall can you build a snowman using paper cups?
    • Shapes theme: Using five toothpicks, make a pentagon, two triangles, or a letter of the alphabet.
    • Bug project: Can you create a symmetrical butterfly?

    2. You need fancy materials to do STEAM

    The biggest misconception is around technology. When you think of STEAM, you might imagine you need apps, computers, tablets and robots to teach it successfully. It’s true that some STEAM challenges involve extensive supply lists, expensive equipment, and knowledge of programming and robotics.

    However, in reality, you probably have everything you need already. Technology doesn’t have to be expensive or complicated. It can refer to simple, non-electronic tools and machines, too. Think funnels, measuring cups and screwdrivers, for example. You can use low-cost regular classroom or household items and recyclable materials that learners' families can donate. Toilet paper rolls and cardboard boxes are very popular items in STEAM.

    Here is a low-tech activity you can try:

    Combine engineering, art and math using cardboard and a pair of scissors

    This challenge involves creating 3D self-portrait sculptures using only cardboard. First, teach about the parts of the face by observing and analyzing some Cubist portraits (e.g., by Georges Braque and Pablo Picasso). Then have learners cut out cardboard shapes and make slits in them so the pieces can be attached together. They create their self-portrait sculptures by fitting the pieces together using the slits, so that the final product stands by itself. Display the self-portraits and talk about them.

    3. STEAM is targeted at older learners

    Young children are naturally curious about the world around them, and STEAM experiences begin very early in life. They explore with their senses and test their hypotheses about the world, just like scientists do. Much of their play is based on engineering skills, such as building houses with LEGO® bricks. They learn to manipulate tools while they develop their fine motor skills and their awareness of non-electronic technology. They use dramatic play and enjoy getting their hands full of paint while engaged in art. They also learn maths concepts very early on, such as size (big and small toys) and quantities of things – even babies start using the word “more” when they’re still hungry.

    Check out this simple STEAM experiment to learn about plants and their needs.

    How do plants eat and drink?

    Have students put water and food dye in a pot. Put a white flower in the water. Ask students to guess what will happen.

    After a few days, students should check their flowers and observe how they have changed color. They must then record their results. Extend the experiment by asking if they can make their flowers two colors.