Models and Training¶
Dual-Model Severity Classification Framework¶
The system employs a dual-model architecture to independently predict depression and anxiety severity from Arabic text.
The design prioritizes robustness, interpretability, and reproducibility while maintaining clear non-clinical boundaries.
1. Model Architecture¶
Two independent multi-class classifiers are trained:
- Depression Severity Prediction Model
- Anxiety Severity Prediction Model
Each model predicts severity on a four-level ordinal scale:
| Score | Severity |
|---|---|
| 0 | None |
| 1 | Mild |
| 2 | Moderate |
| 3 | Severe |
Independent modeling enables each classifier to learn condition-specific semantic patterns without conflating overlapping symptom expressions.
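The dual-classifier setup can be sketched as follows. This is a minimal illustration assuming scikit-learn as the implementation library (the source does not name one); the model variables and helper function are hypothetical names.

```python
from sklearn.svm import SVC

# Four-level ordinal severity scale shared by both classifiers
SEVERITY_LABELS = {0: "None", 1: "Mild", 2: "Moderate", 3: "Severe"}

# Two independent multi-class classifiers trained on the same embeddings,
# so each can learn condition-specific patterns without interference.
depression_model = SVC(kernel="rbf")
anxiety_model = SVC(kernel="rbf")

def predict_severities(embedding):
    """Return (depression, anxiety) severity labels for one embedding vector."""
    dep = SEVERITY_LABELS[int(depression_model.predict([embedding])[0])]
    anx = SEVERITY_LABELS[int(anxiety_model.predict([embedding])[0])]
    return dep, anx
```

Because the two models never share parameters, a text can receive, for example, "Severe" for anxiety and "None" for depression.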
2. Text Representation¶
All Arabic text inputs are transformed into dense numerical vectors using embedding-based representation learning.
Embedding Model¶
- Model: EmbeddingGemma-300M
- Output dimension: 768
- Language coverage: Multilingual with Arabic support
Each text entry is mapped to a 768-dimensional vector in semantic space, preserving contextual and syntactic relationships without manual feature engineering.
This representation reduces sparsity compared to traditional bag-of-words or TF-IDF methods and supports non-linear decision boundaries.
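The embedding step can be sketched as below. How EmbeddingGemma-300M is actually loaded is an assumption (the commented `SentenceTransformer` wrapper and model identifier are illustrative), so the encoder is injected as a parameter rather than hard-coded.

```python
import numpy as np

def embed_texts(texts, encoder):
    """Map a list of Arabic text entries to dense vectors using any model
    that exposes an ``encode(list[str]) -> array`` method, e.g. a
    sentence-transformers wrapper around EmbeddingGemma-300M."""
    return np.asarray(encoder.encode(texts), dtype=np.float32)

# With the real model (not run here; requires downloading the weights):
# from sentence_transformers import SentenceTransformer
# encoder = SentenceTransformer("google/embeddinggemma-300m")
# X = embed_texts(arabic_texts, encoder)  # expected shape: (n, 768)
```

Each row of the resulting matrix is the dense vector fed to the SVM classifiers; no manual feature engineering is applied on top of it.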
3. Classification Algorithm¶
Support Vector Machine¶
Support Vector Machines are used as the primary classification algorithm due to their effectiveness in high-dimensional embedding spaces and strong generalization properties.
For a training set \((x_i, y_i)\), the SVM seeks a decision boundary that maximizes the margin between classes while minimizing classification error.
The optimization objective balances:
- Margin maximization
- Regularization strength
This trade-off, controlled by the regularization parameter, improves performance on unseen data.
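For reference, the standard soft-margin objective this describes (the textbook binary formulation, not quoted from the project) is:

\[
\min_{w,\,b,\,\xi}\; \frac{1}{2}\|w\|^2 + C \sum_{i} \xi_i
\quad \text{subject to} \quad y_i\bigl(w^\top \phi(x_i) + b\bigr) \ge 1 - \xi_i,\;\; \xi_i \ge 0,
\]

where \(C\) controls the trade-off between margin width and training error, and \(\phi\) is the feature map induced by the kernel. The four-level severity task is then handled by a standard multi-class decomposition (one-vs-one or one-vs-rest) over such binary problems.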
Kernel Selection: Radial Basis Function (RBF)¶
The RBF kernel enables non-linear separation in embedding space.
Motivation for using RBF:
- Semantic overlap between adjacent severity levels
- Non-linear linguistic boundaries
- Morphological richness of Arabic
The kernel maps input vectors into a higher-dimensional feature space where linear separation becomes feasible.
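Concretely, the RBF kernel scores similarity as \(K(x, z) = \exp(-\gamma \|x - z\|^2)\). A minimal sketch, checked against scikit-learn's implementation (an assumed dependency):

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

def rbf(x, z, gamma):
    """RBF kernel value K(x, z) = exp(-gamma * ||x - z||^2)."""
    diff = np.asarray(x, dtype=float) - np.asarray(z, dtype=float)
    return float(np.exp(-gamma * np.dot(diff, diff)))

# Nearby embeddings (small ||x - z||) score close to 1.0, while distant
# embeddings decay toward 0.0 -- this locality is what lets the SVM draw
# non-linear boundaries between adjacent severity levels.
```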
4. Training Procedure¶
The training workflow follows a structured pipeline:
- Load synthetic labeled dataset
- Generate 768-dimensional embeddings for each entry
- Perform stratified 70/30 train–test split
- Train separate SVM models for depression and anxiety
- Tune hyperparameters such as the regularization parameter \(C\) and kernel coefficient \(\gamma\)
- Evaluate on held-out test data
- Serialize trained models for reuse
Hyperparameter selection is performed to optimize generalization performance rather than training accuracy.
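The steps above can be sketched as one function (scikit-learn assumed; the grid values and seed are illustrative, not the project's actual search space):

```python
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

def train_severity_model(X, y, seed=42):
    """Stratified 70/30 split, grid search over C and gamma via
    cross-validation on the training portion, then evaluation on the
    held-out test portion."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.30, stratify=y, random_state=seed)
    grid = GridSearchCV(
        SVC(kernel="rbf"),
        param_grid={"C": [0.1, 1, 10], "gamma": ["scale", 0.01, 0.1]},
        cv=3)
    grid.fit(X_tr, y_tr)                 # selects C, gamma by CV accuracy
    test_acc = grid.score(X_te, y_te)    # generalization estimate
    return grid.best_estimator_, test_acc
```

Running this once per condition yields the two independent models; selecting hyperparameters by cross-validated score rather than training accuracy is what targets generalization.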
5. Reproducibility and Deployment¶
- Stratified sampling preserves severity balance
- Trained models are stored as serialized `.pkl` artifacts
- The embedding pipeline is deterministic for identical inputs
- Evaluation is performed on a fixed held-out test set
This ensures consistent and reproducible results.
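Serialization of the two models might look like the following sketch; `joblib` and the file names are assumptions, not confirmed project details.

```python
from joblib import dump, load

def save_models(depression_model, anxiety_model, out_dir="."):
    """Persist both trained classifiers as .pkl artifacts for reuse."""
    dump(depression_model, f"{out_dir}/depression_svm.pkl")
    dump(anxiety_model, f"{out_dir}/anxiety_svm.pkl")

def load_models(out_dir="."):
    """Restore both classifiers for inference without retraining."""
    return (load(f"{out_dir}/depression_svm.pkl"),
            load(f"{out_dir}/anxiety_svm.pkl"))
```

A loaded model reproduces the predictions of the original exactly, which, combined with deterministic embeddings and a fixed test set, is what makes evaluation repeatable.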
6. Design Rationale¶
The architecture reflects deliberate trade-offs:
- Classical ML is preferred over deep sequence models to improve interpretability and reduce opacity.
- Independent classifiers avoid conflating depression and anxiety patterns.
- Embedding-based features reduce reliance on handcrafted linguistic rules.
- The framework remains computationally lightweight and deployable without GPU acceleration.
7. Known Constraints¶
- Predictions are generated at the sentence level without sequence-aware modeling.
- Temporal relationships are handled through post-prediction alert logic rather than integrated sequential learning.
- Embeddings are not fine-tuned on domain-specific mental health corpora.
These constraints inform future extensions toward sequence modeling and domain adaptation.