COURSE

AI Reliability and Critical Evaluation

Strengthen logic, scepticism and judgement for working with AI-generated code, documentation and analysis through hands-on exercises.

  • 1 Day
  • All Levels
  • In-person / Online
  • £ On Request

Your team will learn...

  • Assess the reliability and accuracy of AI-generated outputs systematically
  • Identify bias, errors and hallucinations in AI responses
  • Apply structured thinking to complex problems with AI assistance
  • Make sound decisions when working alongside AI tools
  • Build critical evaluation reflexes for AI-generated code and documentation
  • Distinguish between what AI does well and where human judgement excels
  • Develop scepticism and verification habits for AI outputs

Overview

AI tools produce increasingly convincing outputs. Code that looks correct but contains subtle bugs. Documentation that reads well but misses critical details. Analysis that sounds authoritative but rests on flawed assumptions. As these tools become more sophisticated, the ability to think critically about their outputs becomes more essential, not less.

This intensive one-day workshop develops the reasoning skills engineers need to work effectively with AI systems. Through hands-on exercises that mirror real-world scenarios, participants learn practical techniques for assessing the reliability and bias of AI outputs, applying structured thinking to complex problems and making sound decisions when working alongside AI tools.

This is not a course about testing AI applications - that's covered in our Testing in a GenAI World workshop. This is about building the critical thinking capabilities that enable you to evaluate AI-generated code, documentation, analysis and recommendations effectively. It's about strengthening the human judgement that automated tests cannot replace.

By the end of this workshop, you'll have practical frameworks for critical evaluation, reflexes for spotting issues in AI outputs and the confidence to trust your own judgement when AI confidently suggests something incorrect.

Outline

Foundation: What AI is and isn't

Foundations of critical thinking with AI

  • Why critical evaluation matters more as AI improves
  • The cognitive biases that make us accept AI outputs
  • Building scepticism whilst maintaining productivity
  • The human judgement that remains irreplaceable

Understanding AI capabilities and limitations

  • What LLMs actually do vs what they appear to do
  • The probabilistic nature of LLMs (see the sketch after this list)
  • Understanding how context windows work
  • Pattern matching vs true understanding
  • When AI excels and where it struggles
  • Understanding confidence vs correctness
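
A toy illustration of the probabilistic point above - this is not a real model, just a hypothetical next-token distribution, but it shows why the same prompt can yield different answers and why fluency is not evidence of correctness:

    import random

    # Hypothetical next-token distribution for the prompt "2 + 2 =".
    # An LLM samples by probability; it does not check facts.
    next_token_probs = {"4": 0.90, "5": 0.06, "four": 0.04}

    def sample_token(probs):
        # Weighted random choice: the output is likely, not verified.
        tokens, weights = zip(*probs.items())
        return random.choices(tokens, weights=weights)[0]

    print([sample_token(next_token_probs) for _ in range(10)])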

Evaluation skills: Assessing outputs

Systematic evaluation of AI-generated code

  • Framework for assessing code quality
  • Identifying subtle bugs and security vulnerabilities (example below)
  • Evaluating performance and maintainability implications
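
As a flavour of the "subtle bugs" point above, here is a hypothetical snippet of the kind an assistant might produce - it reads cleanly and passes a quick one-off test, yet carries a classic Python defect:

    # Looks correct, but the default list is created once and
    # shared across every call that omits the argument.
    def add_tag(tag, tags=[]):
        tags.append(tag)
        return tags

    print(add_tag("draft"))   # ['draft']
    print(add_tag("review"))  # ['draft', 'review'] - state leaked between calls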

Evaluating AI-generated documentation

  • Assessing technical accuracy
  • Identifying missing context and edge cases
  • Verifying examples and usage patterns (sketch below)
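
One concrete way to verify documentation examples is to make them executable. A minimal sketch using Python's built-in doctest module, with an invented function for illustration:

    import doctest

    def normalise(scores):
        """Scale scores so they sum to 1.

        >>> normalise([1, 1, 2])
        [0.25, 0.25, 0.5]
        """
        total = sum(scores)
        return [s / total for s in scores]

    # Runs every example embedded in docstrings and reports mismatches,
    # so documentation that drifts from the code fails loudly.
    doctest.testmod()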

Critical analysis of AI recommendations

  • Evaluating architectural suggestions
  • Assessing design pattern appropriateness
  • Validating best practice claims
  • Building intuition for sound advice

Structured problem-solving with AI

  • Applying first principles thinking
  • Breaking complex problems into verifiable steps (see the sketch after this list)
  • Using AI for exploration whilst maintaining rigour
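
The "verifiable steps" idea above, in miniature - toy data, with an assertion guarding each stage so an error surfaces where it occurs rather than in the final answer:

    raw = ["200 0.12", "500 0.90", "200 0.08"]  # "status latency" log lines

    # Step 1: parse - check the shape before building on it.
    parsed = [(int(code), float(t)) for code, t in (line.split() for line in raw)]
    assert all(isinstance(code, int) for code, _ in parsed)

    # Step 2: filter - check the invariant the next step relies on.
    ok = [t for code, t in parsed if code == 200]
    assert all(t >= 0 for t in ok)

    # Step 3: aggregate - a result small enough to sanity-check by hand.
    print(f"mean latency of successful requests: {sum(ok) / len(ok):.3f}s")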

Detection skills: Identifying issues

Detecting hallucinations and fabrications

  • Recognising when AI invents information
  • Identifying non-existent APIs and libraries (sketch below)
  • Verifying technical claims systematically
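
A cheap first check for the "non-existent APIs" problem above: before trusting an AI-suggested import, confirm the module and attribute actually resolve. The invented json.parse_fast stands in for a hallucinated function:

    import importlib

    def api_exists(module_name, attr=None):
        # Does the module import, and does the named attribute exist on it?
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            return False
        return attr is None or hasattr(module, attr)

    print(api_exists("json", "loads"))       # True
    print(api_exists("json", "parse_fast"))  # False - plausible but invented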

Bias identification in AI outputs

  • Understanding different types of AI bias
  • Recognising biased assumptions in code and analysis (example below)
  • Mitigating bias through prompt engineering
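
A small, hypothetical example of a biased assumption baked into code, of the kind this module teaches you to spot - a validator that quietly assumes every name is a single unaccented ASCII word:

    import re

    # Rejects perfectly valid names such as "Zoë", "O'Brien" and "李明"
    # because of an unstated assumption about what names look like.
    def is_valid_name(name):
        return re.fullmatch(r"[A-Za-z]+", name) is not None

    for name in ["Alice", "O'Brien", "Zoë", "李明"]:
        print(name, is_valid_name(name))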

Decision-making: Verification and judgement

Verification strategies

  • Techniques for validating AI claims
  • Using multiple sources and perspectives
  • Running experiments to verify suggestions (sketch after this list)
  • Building efficient verification workflows
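
"Running experiments" can be as light as a randomised check of an AI suggestion against a trusted reference. A minimal sketch - suggested_sort stands in for whatever "optimised" implementation the assistant proposed:

    import random

    def reference_sort(xs):
        return sorted(xs)  # trusted baseline

    def suggested_sort(xs):
        return sorted(xs)  # stand-in for the AI-suggested implementation

    # Compare the suggestion with the reference on many random inputs
    # before adopting it; any mismatch reports the failing case.
    for _ in range(1000):
        xs = [random.randint(-100, 100) for _ in range(random.randint(0, 20))]
        assert suggested_sort(xs) == reference_sort(xs), xs
    print("suggestion matches the reference on 1000 random inputs")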

Decision-making with uncertain information

  • Making sound decisions with AI assistance
  • Evaluating trade-offs systematically (illustration below)
  • Assessing confidence levels and risk
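
Trade-offs can often be made explicit with rough numbers. An illustrative back-of-the-envelope comparison - every figure here is an assumption, which is precisely the point: writing them down makes them arguable:

    p_subtle_bug = 0.15       # assumed chance the AI suggestion is subtly wrong
    cost_of_incident = 40.0   # assumed hours to diagnose and fix in production
    cost_of_review = 1.5      # assumed hours to verify the suggestion up front

    adopt_blindly = p_subtle_bug * cost_of_incident
    verify_first = cost_of_review

    print(f"expected hours lost - adopt blindly: {adopt_blindly:.1f}, "
          f"verify first: {verify_first:.1f}")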

Praxis: Building reflexes and workflows

Building critical evaluation reflexes

  • Developing automatic quality checks
  • Pattern recognition for common issues
  • Creating personal evaluation frameworks (sketch below)
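
A personal evaluation framework can be as simple as a checklist you actually run. A sketch with illustrative items only - the workshop helps you build your own:

    CHECKLIST = [
        "Do the imports and APIs actually exist?",
        "Are edge cases (empty input, None, huge input) handled?",
        "Does it follow our security and style conventions?",
        "Did I run it, not just read it?",
    ]

    def review(output_description):
        # Print the checklist so nothing is skipped under time pressure.
        print(f"Reviewing: {output_description}")
        for item in CHECKLIST:
            print(f"  [ ] {item}")

    review("AI-generated pagination helper")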

Complementing automated testing

  • What tests catch and what they miss
  • Human judgement in code review
  • Evaluating AI-generated tests (example below)
  • Integration with quality assurance processes
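
Why AI-generated tests deserve scrutiny: a test can pass while checking very little. In this hypothetical example the test duplicates the implementation's formula, so a shared misunderstanding (say, treating rate as a percentage) would pass unnoticed:

    def apply_discount(price, rate):
        return price * (1 - rate)

    def test_apply_discount():
        price, rate = 100, 0.2
        # Re-derives the expected value with the same formula it is testing,
        # so it validates consistency, not correctness.
        assert apply_discount(price, rate) == price * (1 - rate)

    test_apply_discount()
    print("test passed - an independently computed expected value (80) would be stronger")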

Organisational and ethical considerations

  • Building teams with strong critical thinking
  • Establishing organisational standards
  • Ethical use of AI in decision-making
  • Maintaining and growing expertise

Requirements

This course is designed for engineers and technical professionals at all levels who use AI tools in their daily work. No specific technical prerequisites are required beyond general engineering experience.

The course complements our technical testing and prompt engineering workshops but focuses on a different skill set - the human judgement and critical thinking that automated processes cannot replace. Participants who have taken Testing in a GenAI World will find this course addresses the "what tests can't catch" aspects of quality assurance.

Participants should bring laptops with internet access and their preferred development environment. Access to AI tools like ChatGPT, Claude or Copilot is beneficial for hands-on exercises.

Bringing examples of AI-generated code, documentation or analysis from your own work significantly enhances the practical value of the course. The most effective learning comes from applying critical evaluation techniques to real outputs you encounter.
