Models and Training¶
Dual-Model Severity Classification Framework¶
The system employs a dual-model architecture to independently predict depression and anxiety severity from Arabic text.
The design prioritizes robustness, interpretability, and reproducibility while maintaining clear non-clinical boundaries.
1. Model Architecture¶
Two independent multi-class classifiers are trained:
- Depression Severity Prediction Model
- Anxiety Severity Prediction Model
Each model predicts severity on a four-level ordinal scale:
| Score | Severity |
|---|---|
| 0 | None |
| 1 | Mild |
| 2 | Moderate |
| 3 | Severe |
Independent modeling enables each classifier to learn condition-specific semantic patterns without conflating overlapping symptom expressions.
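The dual-classifier setup can be sketched as follows. This is a minimal illustration assuming scikit-learn as the implementation library (the source does not name one); the model variables and helper function are hypothetical names.

```python
from sklearn.svm import SVC

# Four-level ordinal severity scale shared by both classifiers
SEVERITY_LABELS = {0: "None", 1: "Mild", 2: "Moderate", 3: "Severe"}

# Two independent multi-class classifiers trained on the same embeddings,
# so each can learn condition-specific patterns without interference.
depression_model = SVC(kernel="rbf")
anxiety_model = SVC(kernel="rbf")

def predict_severities(embedding):
    """Return (depression, anxiety) severity labels for one embedding vector."""
    dep = SEVERITY_LABELS[int(depression_model.predict([embedding])[0])]
    anx = SEVERITY_LABELS[int(anxiety_model.predict([embedding])[0])]
    return dep, anx
```

Because the two models never share parameters, a text can receive, for example, "Severe" for anxiety and "None" for depression.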
2. Text Representation¶
All Arabic text inputs are transformed into dense numerical vectors using embedding-based representation learning.
Embedding Model¶
- Model: EmbeddingGemma-300M
- Output dimension: 768
- Language coverage: Multilingual with Arabic support
Each text entry is mapped to a 768-dimensional vector in semantic space, preserving contextual and syntactic relationships without manual feature engineering.
This representation reduces sparsity compared to traditional bag-of-words or TF-IDF methods and supports non-linear decision boundaries.
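The embedding step can be sketched as below. How EmbeddingGemma-300M is actually loaded is an assumption (the commented `SentenceTransformer` wrapper and model identifier are illustrative), so the encoder is injected as a parameter rather than hard-coded.

```python
import numpy as np

def embed_texts(texts, encoder):
    """Map a list of Arabic text entries to dense vectors using any model
    that exposes an ``encode(list[str]) -> array`` method, e.g. a
    sentence-transformers wrapper around EmbeddingGemma-300M."""
    return np.asarray(encoder.encode(texts), dtype=np.float32)

# With the real model (not run here; requires downloading the weights):
# from sentence_transformers import SentenceTransformer
# encoder = SentenceTransformer("google/embeddinggemma-300m")
# X = embed_texts(arabic_texts, encoder)  # expected shape: (n, 768)
```

Each row of the resulting matrix is the dense vector fed to the SVM classifiers; no manual feature engineering is applied on top of it.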
3. Classification Algorithm¶
Support Vector Machine¶
Support Vector Machines are used as the primary classification algorithm due to their effectiveness in high-dimensional embedding spaces and strong generalization properties.
For a training set \((x_i, y_i)\), the SVM seeks a decision boundary that maximizes the margin between classes while minimizing classification error.
The optimization objective balances:
- Margin maximization
- Regularization strength
This trade-off, controlled by the regularization parameter, improves performance on unseen data.
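For reference, the standard soft-margin objective this describes (the textbook binary formulation, not quoted from the project) is:

\[
\min_{w,\,b,\,\xi}\; \frac{1}{2}\|w\|^2 + C \sum_{i} \xi_i
\quad \text{subject to} \quad y_i\bigl(w^\top \phi(x_i) + b\bigr) \ge 1 - \xi_i,\;\; \xi_i \ge 0,
\]

where \(C\) controls the trade-off between margin width and training error, and \(\phi\) is the feature map induced by the kernel. The four-level severity task is then handled by a standard multi-class decomposition (one-vs-one or one-vs-rest) over such binary problems.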
Kernel Selection: Radial Basis Function (RBF)¶
The RBF kernel enables non-linear separation in embedding space.
Motivation for using RBF:
- Semantic overlap between adjacent severity levels
- Non-linear linguistic boundaries
- Morphological richness of Arabic
The kernel maps input vectors into a higher-dimensional feature space where linear separation becomes feasible.
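Concretely, the RBF kernel scores similarity as \(K(x, z) = \exp(-\gamma \|x - z\|^2)\). A minimal sketch, checked against scikit-learn's implementation (an assumed dependency):

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

def rbf(x, z, gamma):
    """RBF kernel value K(x, z) = exp(-gamma * ||x - z||^2)."""
    diff = np.asarray(x, dtype=float) - np.asarray(z, dtype=float)
    return float(np.exp(-gamma * np.dot(diff, diff)))

# Nearby embeddings (small ||x - z||) score close to 1.0, while distant
# embeddings decay toward 0.0 -- this locality is what lets the SVM draw
# non-linear boundaries between adjacent severity levels.
```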
4. Training Procedure¶
The training workflow follows a structured pipeline:
- Load synthetic labeled dataset
- Generate 768-dimensional embeddings for each entry
- Perform stratified 70/30 train–test split
- Train separate SVM models for depression and anxiety
- Tune hyperparameters such as the regularization parameter \(C\) and kernel coefficient \(\gamma\)
- Evaluate on held-out test data
- Serialize trained models for reuse
Hyperparameter selection is performed to optimize generalization performance rather than training accuracy.
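The steps above can be sketched as one function (scikit-learn assumed; the grid values and seed are illustrative, not the project's actual search space):

```python
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

def train_severity_model(X, y, seed=42):
    """Stratified 70/30 split, grid search over C and gamma via
    cross-validation on the training portion, then evaluation on the
    held-out test portion."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.30, stratify=y, random_state=seed)
    grid = GridSearchCV(
        SVC(kernel="rbf"),
        param_grid={"C": [0.1, 1, 10], "gamma": ["scale", 0.01, 0.1]},
        cv=3)
    grid.fit(X_tr, y_tr)                 # selects C, gamma by CV accuracy
    test_acc = grid.score(X_te, y_te)    # generalization estimate
    return grid.best_estimator_, test_acc
```

Running this once per condition yields the two independent models; selecting hyperparameters by cross-validated score rather than training accuracy is what targets generalization.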
5. Reproducibility and Deployment¶
- Stratified sampling preserves severity balance
- Trained models are stored as serialized `.pkl` artifacts
- The embedding pipeline is deterministic for identical inputs
- Evaluation is performed on a fixed held-out test set
This ensures consistent and reproducible results.
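Serialization of the two models might look like the following sketch; `joblib` and the file names are assumptions, not confirmed project details.

```python
from joblib import dump, load

def save_models(depression_model, anxiety_model, out_dir="."):
    """Persist both trained classifiers as .pkl artifacts for reuse."""
    dump(depression_model, f"{out_dir}/depression_svm.pkl")
    dump(anxiety_model, f"{out_dir}/anxiety_svm.pkl")

def load_models(out_dir="."):
    """Restore both classifiers for inference without retraining."""
    return (load(f"{out_dir}/depression_svm.pkl"),
            load(f"{out_dir}/anxiety_svm.pkl"))
```

A loaded model reproduces the predictions of the original exactly, which, combined with deterministic embeddings and a fixed test set, is what makes evaluation repeatable.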
6. Design Rationale¶
The architecture reflects deliberate trade-offs:
- Classical ML is preferred over deep sequence models to improve interpretability and reduce opacity.
- Independent classifiers avoid conflating depression and anxiety patterns.
- Embedding-based features reduce reliance on handcrafted linguistic rules.
- The framework remains computationally lightweight and deployable without GPU acceleration.
7. Known Constraints¶
- Predictions are generated at the sentence level without sequence-aware modeling.
- Temporal relationships are handled through post-prediction alert logic rather than integrated sequential learning.
- Embeddings are not fine-tuned on domain-specific mental health corpora.
These constraints inform future extensions toward sequence modeling and domain adaptation.