EEG-Based ADHD Classification

Project Overview

This project explores the use of traditional machine learning models to distinguish children with Attention Deficit Hyperactivity Disorder (ADHD) from neurotypical controls using electroencephalogram (EEG) signals. The goal is to develop accessible, objective tools to support clinical diagnosis, which currently relies on subjective behavioral assessments.

Research Motivation

ADHD diagnoses are often based on subjective reports, leading to potential misdiagnosis. EEG offers a non-invasive, cost-effective approach to capture neural signatures. This project investigates whether EEG-derived features like theta/beta ratio, alpha power, and slow wave activity can serve as reliable biomarkers to distinguish ADHD from neurotypical individuals.

Dataset & Preprocessing

Data Source

IEEE dataset containing EEG recordings from children aged 7–12 years, with balanced representation of ADHD and neurotypical control groups.

EEG Setup

19 EEG channels (Fz, Cz, Pz, C3, T3, etc.) recorded at a 128 Hz sampling rate using the standard 10–20 electrode placement system.

Signal Processing

Bandpass filtering (1–50 Hz), Independent Component Analysis (ICA) for artifact removal, and 2-second epoching with 50% overlap.
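
As a rough illustration of the filtering and epoching steps, the sketch below applies a 1–50 Hz band-pass and cuts the signal into 2-second windows with 50% overlap. The Butterworth design, the filter order, the function names, and the synthetic input are assumptions made for the example, not details of the project's pipeline; ICA is omitted.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS = 128  # sampling rate in Hz, as stated in the EEG setup

def bandpass(data, low=1.0, high=50.0, fs=FS, order=4):
    """Zero-phase Butterworth band-pass along the last (time) axis.

    The Butterworth design and order=4 are assumptions for this sketch.
    """
    sos = butter(order, [low, high], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, data, axis=-1)

def make_epochs(data, fs=FS, epoch_sec=2.0, overlap=0.5):
    """Cut (n_channels, n_samples) into (n_epochs, n_channels, epoch_len)."""
    epoch_len = int(epoch_sec * fs)           # 256 samples per epoch
    step = int(epoch_len * (1.0 - overlap))   # 128-sample hop for 50% overlap
    starts = range(0, data.shape[-1] - epoch_len + 1, step)
    return np.stack([data[:, s:s + epoch_len] for s in starts])

# Synthetic stand-in for one 10-second, 19-channel recording.
rng = np.random.default_rng(0)
raw = rng.standard_normal((19, 10 * FS))

filtered = bandpass(raw)
epochs = make_epochs(filtered)
print(epochs.shape)  # (9, 19, 256)
```

With a 128-sample hop, a 10-second recording yields nine overlapping 2-second epochs; any trailing samples that do not fill a whole window are dropped.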

Feature Extraction

Three neurophysiologically relevant feature types: Theta/Beta Ratio, Alpha Power (8–13 Hz), and Slow Wave Activity (<12.5 Hz).
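
One common way to compute features of this kind is to integrate a Welch power spectral density over each band. The sketch below does this per channel for a single epoch. The theta (4–8 Hz) and beta (13–30 Hz) band edges are assumed conventional values that the text does not state, the 1 Hz lower edge of the slow-wave band is assumed from the band-pass filter, and the function names are hypothetical.

```python
import numpy as np
from scipy.signal import welch

FS = 128  # sampling rate in Hz

def band_power(epoch, low, high, fs=FS):
    """Power per channel in [low, high) Hz from a Welch PSD estimate."""
    freqs, psd = welch(epoch, fs=fs, nperseg=epoch.shape[-1], axis=-1)
    mask = (freqs >= low) & (freqs < high)
    df = freqs[1] - freqs[0]                  # frequency resolution in Hz
    return psd[..., mask].sum(axis=-1) * df   # rectangle-rule integration

def extract_features(epoch):
    """Return the three feature types for one (n_channels, n_samples) epoch."""
    theta = band_power(epoch, 4.0, 8.0)     # assumed theta band
    beta = band_power(epoch, 13.0, 30.0)    # assumed beta band
    return {
        "theta_beta_ratio": theta / beta,
        "alpha_power": band_power(epoch, 8.0, 13.0),  # alpha band as given
        "slow_wave": band_power(epoch, 1.0, 12.5),    # <12.5 Hz as given; 1 Hz floor assumed
    }

# Synthetic epoch: a 6 Hz (theta-range) sine plus weak noise on 19 channels.
rng = np.random.default_rng(0)
t = np.arange(2 * FS) / FS
epoch = np.sin(2 * np.pi * 6 * t) + 0.1 * rng.standard_normal((19, t.size))

features = extract_features(epoch)
print(features["theta_beta_ratio"].shape)  # (19,)
```

Because the synthetic signal is dominated by a 6 Hz component, its theta/beta ratio comes out far above 1 on every channel; real EEG will show much smaller contrasts.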

Data Quality

Rigorous quality control including artifact detection, channel validation, and statistical outlier removal to ensure clean datasets.
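
The text does not specify how artifacts and outliers were detected. As one plausible sketch, the code below rejects epochs by peak-to-peak amplitude and drops feature rows with extreme robust z-scores. Both thresholds (150 and 3.0), the function names, and the synthetic data are assumptions chosen for illustration.

```python
import numpy as np

def reject_artifacts(epochs, ptp_limit=150.0):
    """Drop epochs where any channel exceeds a peak-to-peak limit.

    ptp_limit is an assumed threshold in the signal's own units.
    """
    ptp = epochs.max(axis=-1) - epochs.min(axis=-1)   # (n_epochs, n_channels)
    keep = (ptp < ptp_limit).all(axis=1)
    return epochs[keep], keep

def remove_outliers(features, z_limit=3.0):
    """Drop rows whose robust (median/MAD) z-score exceeds z_limit in any column."""
    median = np.median(features, axis=0)
    mad = np.median(np.abs(features - median), axis=0)
    mad = np.where(mad == 0, 1.0, mad)                # avoid division by zero
    z = 0.6745 * (features - median) / mad
    keep = (np.abs(z) < z_limit).all(axis=1)
    return features[keep], keep

rng = np.random.default_rng(0)

# Five synthetic epochs; epoch 2 gets an artificial spike on one channel.
demo_epochs = rng.standard_normal((5, 19, 256))
demo_epochs[2, 0, 10] = 1000.0
clean_epochs, epoch_mask = reject_artifacts(demo_epochs)

# Fifty synthetic feature rows; row 7 is an obvious outlier.
demo_features = rng.standard_normal((50, 3))
demo_features[7] = 100.0
clean_features, row_mask = remove_outliers(demo_features)

print(epoch_mask.tolist())  # [True, True, False, True, True]
```

The median/MAD form is used instead of mean and standard deviation so that a few extreme values do not inflate the scale estimate they are being compared against.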

Machine Learning Approach

We implemented and compared three traditional machine learning models, each chosen for its proven effectiveness in biomedical signal classification:

Support Vector Machine (SVM)

Linear and RBF kernels for both linear and non-linear pattern recognition, with hyperparameter optimization via grid search.

Random Forest

Ensemble method providing feature importance insights and robust performance across different feature types.
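
To show what feature importance insights can look like in practice, the sketch below fits a scikit-learn random forest on one feature per channel and ranks the channels by impurity-based importance. The data are synthetic and deliberately constructed so that only the Fz column carries class information; the ranking illustrates the mechanics and is not a project result.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

channels = ["Fz", "Cz", "Pz", "C3", "T3"]  # the subset of channels named in the EEG setup

# Synthetic data: the label depends only on the first column (Fz) plus noise.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, len(channels)))
y = (X[:, 0] + 0.5 * rng.standard_normal(200) > 0).astype(int)

forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Impurity-based importances are non-negative and sum to 1 across features.
ranking = sorted(
    zip(channels, forest.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Impurity-based importances can be biased toward high-variance features, so a permutation-based check on held-out data is a reasonable complement before drawing conclusions about specific channels.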

K-Nearest Neighbors (KNN)

Instance-based learning with optimized k-value selection and distance metric evaluation for local pattern recognition.

Model Validation

Cross-Validation: 5-fold stratified cross-validation to ensure robust performance estimates
Hyperparameter Tuning: Grid search optimization for each model's parameters
Performance Metrics: Accuracy, precision, recall, and F1-score evaluation
Statistical Significance: Multiple runs with different random seeds for statistical validation
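
The validation steps above can be sketched with scikit-learn as follows. The parameter grids, the three seeds, the feature scaling step, and the synthetic data from make_classification are all assumptions for illustration; the project's actual grids and seeds are not given in the text.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a (n_samples, 19 channels) feature matrix.
X, y = make_classification(n_samples=200, n_features=19, n_informative=5, random_state=0)

# Hypothetical search spaces, one per model.
searches = {
    "SVM": (make_pipeline(StandardScaler(), SVC()),
            {"svc__kernel": ["linear", "rbf"], "svc__C": [0.1, 1, 10]}),
    "Random Forest": (RandomForestClassifier(random_state=0),
                      {"n_estimators": [50, 100], "max_depth": [None, 5]}),
    "KNN": (make_pipeline(StandardScaler(), KNeighborsClassifier()),
            {"kneighborsclassifier__n_neighbors": [3, 5, 7],
             "kneighborsclassifier__metric": ["euclidean", "manhattan"]}),
}
scoring = ["accuracy", "precision", "recall", "f1"]

results = {}
for name, (model, grid) in searches.items():
    per_seed = []
    for seed in (0, 1, 2):  # repeat with different fold shuffles
        cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
        search = GridSearchCV(model, grid, cv=cv, scoring=scoring, refit="accuracy")
        search.fit(X, y)
        best = search.best_index_  # configuration with the best mean accuracy
        per_seed.append([search.cv_results_[f"mean_test_{m}"][best] for m in scoring])
    results[name] = dict(zip(scoring, np.mean(per_seed, axis=0)))

for name, metrics in results.items():
    print(name, {m: round(float(v), 3) for m, v in metrics.items()})
```

Scores read from the same folds that selected the hyperparameters tend to be somewhat optimistic; wrapping the search in an outer cross-validation loop (nested CV) gives a stricter estimate at higher compute cost.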

Results & Performance Analysis

Key Performance Summary

Best Result: SVM with Theta/Beta Ratio features achieved 80.6% accuracy and 77% precision, the highest of any model and feature combination tested. This is well above the 50% chance level expected for a balanced two-class problem and suggests potential for clinical decision support.
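
One way to check that a cross-validated accuracy exceeds chance is a label permutation test, available in scikit-learn as permutation_test_score. The sketch below runs it on synthetic data with an RBF-kernel SVM; the data, the number of permutations, and the model settings are assumptions, and the printed values are not project results.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, permutation_test_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the feature matrix and diagnostic labels.
X, y = make_classification(n_samples=200, n_features=19, n_informative=5, random_state=0)

model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Refit the model on label-shuffled copies of the data to build a null distribution.
score, perm_scores, p_value = permutation_test_score(
    model, X, y, cv=cv, scoring="accuracy", n_permutations=100, random_state=0
)

print(f"accuracy on true labels: {score:.3f}")
print(f"mean accuracy on shuffled labels: {perm_scores.mean():.3f}")
print(f"p-value: {p_value:.4f}")
```

The p-value is the fraction of shuffled-label runs that score at least as well as the true labels, so with 100 permutations it cannot fall below 1/101.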

Feature-Specific Results

Theta/Beta Ratio Classification (Best Performance)

Model	Accuracy	Precision	Clinical Relevance
KNN	73.2%	75%	Good baseline performance
Random Forest	77.3%	71%	Strong feature importance insights
SVM	80.6%	77%	Best overall performance

Alpha Power Classification

Model	Accuracy	Precision	Performance Note
KNN	63.0%	60%	Moderate discriminative power
Random Forest	60.6%	54%	Lower performance on alpha features
SVM	68.0%	67%	Most consistent across features

Slow Wave Activity Classification

Model	Accuracy	Precision	Clinical Insights
KNN	62.7%	57%	Captures local neural patterns
Random Forest	67.6%	69%	Good precision for slow waves
SVM	68.3%	67%	Balanced classification performance

Key Findings & Insights

Best Performance

SVM with Theta/Beta Ratio features achieved the highest accuracy (80.6%), demonstrating the clinical relevance of this biomarker.

Feature Ranking

Theta/Beta Ratio > Slow Wave Activity > Alpha Power in terms of discriminative power for ADHD classification.

Model Consistency

SVM demonstrated the most consistent performance across all feature types, making it the most reliable classifier.

Neural Signatures

Frontal and central EEG channels showed the highest discriminative power, aligning with neurobiological understanding of ADHD.

Clinical Significance

The 80.6% accuracy achieved with traditional machine learning models suggests that EEG-based biomarkers can support ADHD diagnosis. This performance level approaches the reliability needed for clinical decision support tools, potentially reducing diagnostic subjectivity and improving access to objective assessment methods.

Tools & Technologies

Python, Scikit-learn, SciPy, NumPy, Pandas, Matplotlib, EEG Analysis, Signal Processing, Machine Learning, Cross-Validation

Future Directions & Clinical Impact

Immediate Next Steps

Feature Enhancement: Explore connectivity measures, nonlinear features, and time-frequency analysis for improved discrimination
Deep Learning Integration: Compare traditional models with CNN, LSTM, and EEGNet architectures for potential performance gains
Dataset Expansion: Address current limitations in dataset size and demographic diversity

Clinical Applications

Diagnostic Support: Develop screening tools to supplement traditional behavioral assessments
Treatment Monitoring: Track therapeutic response through EEG biomarker changes

Broader Impact

This research contributes to the development of objective, data-driven diagnostic tools for ADHD, potentially improving diagnostic accuracy and accessibility. The methodology can be extended to other neuropsychiatric conditions, advancing the field of computational psychiatry and precision medicine.
