EEG-Based ADHD Classification

Machine Learning for Objective Neurodiagnostics Using Traditional ML Models

Project Type

Academic Research

Domain

Neuroscience & Machine Learning

Duration

4 months

Project Overview

This project explores the use of traditional machine learning models to classify Attention Deficit Hyperactivity Disorder (ADHD) versus neurotypical controls using electroencephalogram (EEG) signals. The goal is to develop accessible, objective tools to support clinical diagnoses, which are currently reliant on subjective behavioral assessments.

Research Motivation

ADHD diagnoses are often based on subjective reports, leading to potential misdiagnosis. EEG offers a non-invasive, cost-effective approach to capture neural signatures. This project investigates whether EEG-derived features like theta/beta ratio, alpha power, and slow wave activity can serve as reliable biomarkers to distinguish ADHD from neurotypical individuals.

Dataset & Preprocessing

Data Source

IEEE dataset containing EEG recordings from children aged 7–12 years, with balanced representation of ADHD and neurotypical control groups.

EEG Setup

19 EEG channels (Fz, Cz, Pz, C3, T3, etc.) recorded at a 128 Hz sampling rate using the standard 10-20 electrode placement system.

Signal Processing

Bandpass filtering (1–50 Hz), Independent Component Analysis (ICA) for artifact removal, and 2-second epoching with 50% overlap.
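
The filtering and epoching steps can be sketched as follows. This is a minimal illustration on synthetic data, not the project's actual pipeline: the Butterworth design, the filter order, and the function names `bandpass` and `epoch` are assumptions, and the ICA artifact-removal step is omitted.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS = 128  # sampling rate in Hz, as stated in the EEG setup


def bandpass(data, low=1.0, high=50.0, fs=FS, order=4):
    """Zero-phase Butterworth band-pass along the last (time) axis.

    The filter family and order are assumptions; only the 1-50 Hz
    pass band comes from the project description.
    """
    sos = butter(order, [low, high], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, data, axis=-1)


def epoch(data, fs=FS, length_s=2.0, overlap=0.5):
    """Split (n_channels, n_samples) into (n_epochs, n_channels, epoch_len)."""
    win = int(length_s * fs)            # 256 samples per 2-second epoch
    step = int(win * (1.0 - overlap))   # 128-sample hop for 50% overlap
    starts = range(0, data.shape[-1] - win + 1, step)
    return np.stack([data[..., s:s + win] for s in starts])


# Synthetic stand-in for one recording: 19 channels, 10 seconds of noise.
rng = np.random.default_rng(0)
raw = rng.standard_normal((19, 10 * FS))

epochs = epoch(bandpass(raw))
print(epochs.shape)  # (9, 19, 256)
```

With a 128-sample hop, the second half of each epoch is the first half of the next one, which is what 50% overlap means here.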

Feature Extraction

Three neurophysiologically relevant feature types: Theta/Beta Ratio, Alpha Power (8–13 Hz), and Slow Wave Activity (<12.5 Hz).
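
One common way to compute such band features is from a Welch power spectral density estimate, sketched below. Only the alpha range (8–13 Hz) and the 12.5 Hz upper limit for slow wave activity come from the text above. The theta and beta edges, the 1 Hz lower edge for slow waves (borrowed from the band-pass filter), and the helper names are assumptions.

```python
import numpy as np
from scipy.signal import welch

FS = 128  # Hz

# Band edges in Hz. Theta, beta, and the lower slow-wave edge are assumed.
BANDS = {
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
    "slow": (1.0, 12.5),
}


def band_power(epoch, band, fs=FS):
    """Mean PSD per channel within [low, high) for one (n_channels, n_samples) epoch."""
    freqs, psd = welch(epoch, fs=fs, nperseg=epoch.shape[-1], axis=-1)
    low, high = band
    mask = (freqs >= low) & (freqs < high)
    return psd[..., mask].mean(axis=-1)


def theta_beta_ratio(epoch, fs=FS):
    """Per-channel ratio of theta-band to beta-band power."""
    return band_power(epoch, BANDS["theta"], fs) / band_power(epoch, BANDS["beta"], fs)


# Synthetic 2-second epoch: strong 6 Hz (theta) tone, weak 20 Hz (beta) tone.
t = np.arange(2 * FS) / FS
signal = np.sin(2 * np.pi * 6 * t) + 0.1 * np.sin(2 * np.pi * 20 * t)
epoch = np.tile(signal, (19, 1))

tbr = theta_beta_ratio(epoch)
alpha = band_power(epoch, BANDS["alpha"])
slow = band_power(epoch, BANDS["slow"])
print(tbr.shape, bool(np.all(tbr > 1)))
```

Each function returns one value per channel, so a single epoch yields a 19-element feature vector for each feature type.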

Data Quality

Rigorous quality control including artifact detection, channel validation, and statistical outlier removal to ensure clean datasets.
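
The description does not say which outlier rule was used. As one possible illustration, the sketch below drops epochs whose peak-to-peak amplitude is a robust outlier, using the median absolute deviation. The rule, the threshold of 3.0, and the function name `reject_outlier_epochs` are all assumptions.

```python
import numpy as np


def reject_outlier_epochs(epochs, z_thresh=3.0):
    """Drop epochs whose worst-channel peak-to-peak amplitude is a robust outlier.

    epochs: array of shape (n_epochs, n_channels, n_samples).
    Returns the kept epochs and the boolean keep-mask.
    """
    ptp = np.ptp(epochs, axis=-1).max(axis=-1)   # worst channel per epoch
    med = np.median(ptp)
    mad = np.median(np.abs(ptp - med))
    mad = max(mad, np.finfo(float).eps)          # guard against division by zero
    robust_z = 0.6745 * (ptp - med) / mad        # MAD-based z-score
    keep = robust_z <= z_thresh
    return epochs[keep], keep


# Synthetic epochs with one simulated high-amplitude artifact at index 7.
rng = np.random.default_rng(0)
epochs = rng.standard_normal((20, 19, 256))
epochs[7] *= 50.0

clean, keep = reject_outlier_epochs(epochs)
print(clean.shape[0], bool(keep[7]))
```

A median-based score is used instead of the mean and standard deviation because a single large artifact would otherwise inflate the very statistics used to detect it.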

Machine Learning Approach

We implemented and compared three traditional machine learning models, each chosen for its established use in biomedical signal classification:

Support Vector Machine (SVM)

Linear and RBF kernels for both linear and non-linear pattern recognition, with hyperparameter optimization via grid search.
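
A grid search over both kernels might look like the sketch below, written with scikit-learn. The parameter values, the standardization step, and the synthetic data are assumptions for illustration, so the printed score is not a project result.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in: 200 epochs x 19 per-channel features, two balanced classes.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 100)
X = rng.standard_normal((200, 19)) + 0.8 * y[:, None]

# Scaling lives inside the pipeline so it is refit on each training fold.
pipe = make_pipeline(StandardScaler(), SVC())

# Hypothetical search grid; the values actually searched are not given.
param_grid = [
    {"svc__kernel": ["linear"], "svc__C": [0.1, 1, 10]},
    {"svc__kernel": ["rbf"], "svc__C": [0.1, 1, 10], "svc__gamma": ["scale", 0.01, 0.1]},
]

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(pipe, param_grid, cv=cv, scoring="accuracy")
search.fit(X, y)

print(search.best_params_, round(search.best_score_, 3))
```

Listing the linear and RBF grids separately avoids fitting redundant `gamma` values for the linear kernel, which ignores that parameter.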

Random Forest

Ensemble method providing feature importance insights and robust performance across different feature types.
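
The feature-importance idea can be sketched with scikit-learn's `RandomForestClassifier`. The channel list is the conventional 19-channel 10-20 montage, assumed here because only some channel names are given above. The data is synthetic, with class signal injected into two arbitrarily chosen channels, so the ranking illustrates the mechanism only and says nothing about real EEG.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Conventional 19-channel 10-20 montage (an assumption about this dataset).
CHANNELS = ["Fp1", "Fp2", "F7", "F3", "Fz", "F4", "F8", "T3", "C3", "Cz",
            "C4", "T4", "T5", "P3", "Pz", "P4", "T6", "O1", "O2"]

rng = np.random.default_rng(3)
y = np.repeat([0, 1], 150)
X = rng.standard_normal((300, len(CHANNELS)))

# Inject a class difference into two arbitrary channels (illustration only).
for name in ("P3", "O2"):
    X[:, CHANNELS.index(name)] += 2.0 * y

forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Impurity-based importances, one per input feature, summing to 1.
ranking = sorted(zip(CHANNELS, forest.feature_importances_),
                 key=lambda pair: pair[1], reverse=True)
top_two = {name for name, _ in ranking[:2]}
print(ranking[:3])
```

Impurity-based importances can favor features with many distinct values, so `sklearn.inspection.permutation_importance` is a common cross-check.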

K-Nearest Neighbors (KNN)

Instance-based learning with optimized k-value selection and distance metric evaluation for local pattern recognition.
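
Selecting k and the distance metric can be done by comparing cross-validated accuracy across candidates, as in this sketch. The candidate values and the synthetic data are assumptions.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in: 200 epochs x 19 features, two balanced classes.
rng = np.random.default_rng(1)
y = np.repeat([0, 1], 100)
X = rng.standard_normal((200, 19)) + 0.6 * y[:, None]

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Mean cross-validated accuracy for each (metric, k) candidate.
scores = {}
for metric in ("euclidean", "manhattan"):
    for k in (1, 3, 5, 7, 9, 11):  # odd k avoids tied votes with two classes
        model = make_pipeline(
            StandardScaler(),
            KNeighborsClassifier(n_neighbors=k, metric=metric),
        )
        scores[(metric, k)] = cross_val_score(
            model, X, y, cv=cv, scoring="accuracy"
        ).mean()

best_metric, best_k = max(scores, key=scores.get)
print(best_metric, best_k, round(scores[(best_metric, best_k)], 3))
```

Standardizing inside the pipeline matters for KNN in particular, because distances are otherwise dominated by whichever feature has the largest numeric range.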

Model Validation

  • Cross-Validation: 5-fold stratified cross-validation to ensure robust performance estimates
  • Hyperparameter Tuning: Grid search optimization for each model's parameters
  • Performance Metrics: Accuracy, precision, recall, and F1-score evaluation
  • Statistical Significance: Multiple runs with different random seeds for statistical validation
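
The validation scheme in the list above can be sketched as repeated stratified 5-fold cross-validation, with the four metrics averaged per seed. The number of seeds, the RBF-kernel model, and the synthetic data are assumptions, so the numbers produced are not project results.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in: 200 epochs x 19 features, two balanced classes.
rng = np.random.default_rng(2)
y = np.repeat([0, 1], 100)
X = rng.standard_normal((200, 19)) + 0.5 * y[:, None]

model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
metrics = ["accuracy", "precision", "recall", "f1"]

# One 5-fold run per seed; store the fold-averaged score for each metric.
per_seed = {m: [] for m in metrics}
for seed in range(5):
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    out = cross_validate(model, X, y, cv=cv, scoring=metrics)
    for m in metrics:
        per_seed[m].append(out[f"test_{m}"].mean())

# Mean and standard deviation across seeds.
summary = {m: (float(np.mean(v)), float(np.std(v))) for m, v in per_seed.items()}
for m, (mean, std) in summary.items():
    print(f"{m}: {mean:.3f} +/- {std:.3f}")
```

When several epochs come from the same child, placing them in both training and test folds can inflate these scores. scikit-learn's `StratifiedGroupKFold` with subject IDs as groups is one way to keep each subject in a single fold. The description above does not say whether splitting was done per subject.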

Results & Performance Analysis

Key Performance Summary

Best Result: SVM with Theta/Beta Ratio features achieved 80.6% accuracy and 77% precision, the highest values of any model and feature combination tested. This is well above the 50% chance level for the balanced two-class setup, approaches clinically relevant thresholds, and points to potential for clinical application.

Feature-Specific Results

Theta/Beta Ratio Classification (Best Performance)

| Model | Accuracy | Precision | Clinical Relevance |
|---|---|---|---|
| KNN | 73.2% | 75% | Good baseline performance |
| Random Forest | 77.3% | 71% | Strong feature importance insights |
| SVM | 80.6% | 77% | Best overall performance |

Alpha Power Classification

| Model | Accuracy | Precision | Performance Note |
|---|---|---|---|
| KNN | 63.0% | 60% | Moderate discriminative power |
| Random Forest | 60.6% | 54% | Lower performance on alpha features |
| SVM | 68.0% | 67% | Most consistent across features |

Slow Wave Activity Classification

| Model | Accuracy | Precision | Clinical Insights |
|---|---|---|---|
| KNN | 62.7% | 57% | Captures local neural patterns |
| Random Forest | 67.6% | 69% | Good precision for slow waves |
| SVM | 68.3% | 67% | Balanced classification performance |

Key Findings & Insights

Best Performance

SVM with Theta/Beta Ratio features achieved the highest accuracy (80.6%), demonstrating the clinical relevance of this biomarker.

Feature Ranking

Theta/Beta Ratio > Slow Wave Activity > Alpha Power in terms of discriminative power for ADHD classification, with best accuracies of 80.6%, 68.3%, and 68.0%, respectively.

Model Consistency

SVM achieved the highest accuracy for every feature type (80.6%, 68.0%, and 68.3%), making it the most consistent classifier in this comparison.

Neural Signatures

Frontal and central EEG channels showed the highest discriminative power, aligning with neurobiological understanding of ADHD.

Clinical Significance

The 80.6% accuracy achieved with traditional machine learning models demonstrates that EEG-based biomarkers can effectively support ADHD diagnosis. This performance level approaches the reliability needed for clinical decision support tools, potentially reducing diagnostic subjectivity and improving access to objective assessment methods.

Tools & Technologies

Python, scikit-learn, SciPy, NumPy, Pandas, Matplotlib, EEG Analysis, Signal Processing, Machine Learning, Cross-Validation

Future Directions & Clinical Impact

Immediate Next Steps

  • Feature Enhancement: Explore connectivity measures, nonlinear features, and time-frequency analysis for improved discrimination
  • Deep Learning Integration: Compare traditional models with CNN, LSTM, and EEGNet architectures for potential performance gains
  • Dataset Expansion: Address current limitations in dataset size and demographic diversity

Clinical Applications

  • Diagnostic Support: Develop screening tools to supplement traditional behavioral assessments
  • Treatment Monitoring: Track therapeutic response through EEG biomarker changes

Broader Impact

This research contributes to the development of objective, data-driven diagnostic tools for ADHD, potentially improving diagnostic accuracy and accessibility. The methodology can be extended to other neuropsychiatric conditions, advancing the field of computational psychiatry and precision medicine.