UXit Documentation

Methodology

Overview

This methodology defines how the system evaluates an interface against a structured set of usability and design rules. Each rule has a clear outcome: it either passes or it fails. The resulting score shows the percentage of rules that were met during evaluation, reflecting how well the interface aligns with established standards and expectations. This approach does not rely on opinion or interpretation; it provides a stable, repeatable measure of conformance. The results help identify where the design meets requirements, where it falls short, and how it changes over time.

Scoring Model

Each guideline is marked with one of the following:

  • Pass = 1
  • Fail = 0
  • N/A and Unanswered = excluded from scoring

Only Pass and Fail are counted in the score calculation.
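As a minimal sketch of these marking rules (function and variable names here are illustrative, not part of the methodology):

```python
# Map each verdict to its score contribution. N/A and Unanswered have no
# entry, so they are excluded from scoring entirely.
VALUES = {"Pass": 1, "Fail": 0}

def countable(marks):
    """Keep only the marks that participate in score calculation."""
    return [VALUES[m] for m in marks if m in VALUES]

print(countable(["Pass", "Fail", "N/A", "Pass", "Unanswered"]))  # [1, 0, 1]
```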

Category Scoring

For each category:

Let:

  • Pass = number of guidelines marked Pass
  • Fail = number of guidelines marked Fail

Then: CategoryScore = Pass / (Pass + Fail)

Example:

Pass = 6  
Fail = 2  
CategoryScore = 6 / (6 + 2) = 0.75 = 75%

If Pass + Fail = 0, the category is excluded from aggregation.
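The category rule, including the exclusion case, can be sketched as follows (a hypothetical helper, not the product's actual API):

```python
def category_score(passes, fails):
    """CategoryScore = Pass / (Pass + Fail); None when the category has
    no countable guidelines and is excluded from aggregation."""
    total = passes + fails
    if total == 0:
        return None
    return passes / total

print(category_score(6, 2))  # 0.75
print(category_score(0, 0))  # None
```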

Overall Score

If there are k valid category scores:

OverallScore = (Sum of all CategoryScores) / k

The result is expressed as a percentage.
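A sketch of the aggregation step, assuming excluded categories are represented as None (an implementation choice, not specified by the methodology):

```python
def overall_score(category_scores):
    """Mean of the k valid (non-None) category scores."""
    valid = [s for s in category_scores if s is not None]
    if not valid:
        return None
    return sum(valid) / len(valid)

# One excluded category does not change k for the remaining three... it is
# simply dropped before averaging.
print(round(overall_score([0.80, 0.50, None, 1.00]), 4))  # 0.7667
```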

Flat Score Variant

Without categories:

Let:

  • TotalPass = total number of Pass across all guidelines
  • TotalFail = total number of Fail across all guidelines

Then: Score = TotalPass / (TotalPass + TotalFail)
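A sketch of the flat variant (names are illustrative). Note that pooling all guidelines can yield a slightly different result than averaging per-category scores, because categories with fewer countable guidelines carry less weight in the pooled total:

```python
def flat_score(total_pass, total_fail):
    """Flat variant: pool all guidelines and ignore category boundaries."""
    counted = total_pass + total_fail
    return total_pass / counted if counted else None

# Pooling the counts 9 Pass / 3 Fail gives 75%, versus 76.67% when the
# same guidelines are first averaged within categories.
print(flat_score(9, 3))  # 0.75
```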

Worked Example

Category  Pass  Fail  N/A
A         4     1     0
B         2     2     1
C         3     0     2

Category scores:

A = 4 / (4 + 1) = 0.80
B = 2 / (2 + 2) = 0.50
C = 3 / (3 + 0) = 1.00

OverallScore = (0.80 + 0.50 + 1.00) / 3 = 0.7667 = 76.67%
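The worked example end to end, as a sketch (row tuples and helper names are illustrative):

```python
def category_score(passes, fails):
    """Pass / (Pass + Fail); None when nothing is countable."""
    total = passes + fails
    return passes / total if total else None

# (Category, Pass, Fail, N/A) rows from the table; N/A never counts.
rows = [("A", 4, 1, 0), ("B", 2, 2, 1), ("C", 3, 0, 2)]
scores = [category_score(p, f) for _, p, f, _ in rows]
valid = [s for s in scores if s is not None]
overall = sum(valid) / len(valid)
print([round(s, 2) for s in scores])  # [0.8, 0.5, 1.0]
print(round(overall, 4))              # 0.7667
```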

Grade Thresholds

Grade  Interval
A      90 to 100%
B      80 to 89.9%
C      70 to 79.9%
D      60 to 69.9%
F      0 to 59.9%
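The grade intervals map onto a simple floor lookup. A sketch (the function name is an assumption, not part of the product):

```python
def grade(score_pct):
    """Map a percentage score (0-100) to its letter grade interval."""
    thresholds = [(90, "A"), (80, "B"), (70, "C"), (60, "D"), (0, "F")]
    for floor, letter in thresholds:
        if score_pct >= floor:
            return letter

print(grade(76.67))  # C  (the worked-example score lands in 70-79.9%)
```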

Evaluation Method

A reviewer goes through each guideline and marks whether the current interface satisfies it. Each guideline is treated as a yes-or-no check:

gᵢ(flow) ∈ {0, 1}

The final score is the average of these binary outcomes across the set.

Binary Rules

Using binary values keeps results consistent, traceable, and easy to compare. There is no subjectivity, scaling ambiguity, or weighting. Every failure can be traced back to a specific guideline, which allows full traceability to individual failures and enables linear aggregation over time. When evaluations apply similar guideline sets to similar flows, adherence improvements can be tracked and used to target refactoring where needed.

What the Score Represents

  • Percentage of defined rules that the interface satisfies
  • Clear signal of how closely the interface matches standards
  • Stable metric for comparing the same system over time
  • Directional trend to track improvement or regression

What the Score Does Not Represent

  • User satisfaction or emotional response
  • Perceived ease of use or aesthetic appeal
  • Efficiency, speed, or task success
  • Cognitive demand or user behavior patterns

Using Results

  • Focus on changes over time rather than single scores
  • Review failed items to understand specific gaps
  • Keep older evaluations to track trends and regressions
  • Use the score to guide decisions, not to define success or failure
