TruthSeek is an email classifier that distinguishes spam from legitimate email. Its text model is a multi-stage architecture that combines RoBERTa with a CNN, a BiLSTM, and a hierarchical attention network, with the aim of high classification accuracy. The system is multi-modal: in addition to the message text, it takes into account the results of email authentication protocols such as SPF, DKIM, and DMARC. It also uses web agents to verify the authenticity of email senders, which makes the classification more reliable.
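To make the authentication signals concrete, here is a minimal sketch of how SPF, DKIM, and DMARC verdicts could be read from a message. It assumes the receiving mail server has already recorded them in an `Authentication-Results` header; the function name `auth_features` and the choice of `"none"` for a missing verdict are illustrative, not part of TruthSeek.

```python
import re
from email import message_from_string

AUTH_METHODS = ("spf", "dkim", "dmarc")


def auth_features(raw_email: str) -> dict:
    """Return the SPF/DKIM/DMARC verdicts found in Authentication-Results.

    Hypothetical helper: methods with no recorded verdict map to "none".
    """
    msg = message_from_string(raw_email)
    results = {method: "none" for method in AUTH_METHODS}
    for header in msg.get_all("Authentication-Results", []):
        # Each verdict appears as "<method>=<result>", e.g. "spf=pass".
        for method, verdict in re.findall(r"\b(spf|dkim|dmarc)=(\w+)", header.lower()):
            results[method] = verdict
    return results


raw = (
    "Authentication-Results: mx.example.com;\n"
    " spf=pass smtp.mailfrom=example.org;\n"
    " dkim=fail header.d=example.org;\n"
    " dmarc=fail header.from=example.org\n"
    "From: alice@example.org\n"
    "Subject: hello\n"
    "\n"
    "Body text\n"
)
features = auth_features(raw)
print(features)  # {'spf': 'pass', 'dkim': 'fail', 'dmarc': 'fail'}
```

Verdicts like these could be fed to the classifier as categorical features next to the text representation. In practice, only `Authentication-Results` headers added by your own receiving server should be trusted, since a sender can forge the others.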
The text model is built from the following components:

- RoBERTa: Encodes the raw email text into contextual token embeddings.
- CNN: Extracts local features from the encoded text.
- BiLSTM: Builds sentence-level embeddings that capture context from both directions.
- Hierarchical Attention Network: Focuses the model on the most important words and sentences.
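
One way the components above could be wired together is sketched below in PyTorch. This is an illustration under several assumptions, not TruthSeek's actual implementation: the RoBERTa encoder is replaced by random tensors of the same shape (768-dimensional token states), the layer sizes are arbitrary, and the class names `AttentionPool` and `EmailTextClassifier` are hypothetical. The CNN and word-level attention operate within each sentence, and the BiLSTM and sentence-level attention operate across sentences.

```python
import torch
import torch.nn as nn


class AttentionPool(nn.Module):
    """Additive attention that pools (batch, steps, dim) into (batch, dim)."""

    def __init__(self, dim: int):
        super().__init__()
        self.proj = nn.Linear(dim, dim)
        self.context = nn.Linear(dim, 1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        scores = self.context(torch.tanh(self.proj(x)))  # (batch, steps, 1)
        weights = torch.softmax(scores, dim=1)            # normalize over steps
        return (weights * x).sum(dim=1)


class EmailTextClassifier(nn.Module):
    """Hypothetical wiring: CNN -> word attention -> BiLSTM -> sentence attention."""

    def __init__(self, enc_dim=768, conv_channels=128, lstm_hidden=64, num_classes=2):
        super().__init__()
        self.conv = nn.Conv1d(enc_dim, conv_channels, kernel_size=3, padding=1)
        self.word_attn = AttentionPool(conv_channels)
        self.bilstm = nn.LSTM(conv_channels, lstm_hidden,
                              batch_first=True, bidirectional=True)
        self.sent_attn = AttentionPool(2 * lstm_hidden)
        self.out = nn.Linear(2 * lstm_hidden, num_classes)

    def forward(self, token_states: torch.Tensor) -> torch.Tensor:
        # token_states: (batch, sentences, tokens, enc_dim), e.g. RoBERTa outputs.
        b, s, t, d = token_states.shape
        x = token_states.reshape(b * s, t, d).transpose(1, 2)  # (b*s, enc_dim, tokens)
        x = torch.relu(self.conv(x)).transpose(1, 2)           # (b*s, tokens, channels)
        sent_vecs = self.word_attn(x).reshape(b, s, -1)        # one vector per sentence
        ctx, _ = self.bilstm(sent_vecs)                        # (b, sentences, 2*hidden)
        doc_vec = self.sent_attn(ctx)                          # one vector per email
        return self.out(doc_vec)                               # (b, num_classes) logits


model = EmailTextClassifier()
fake_states = torch.randn(2, 4, 16, 768)  # 2 emails, 4 sentences, 16 tokens each
logits = model(fake_states)
print(logits.shape)  # torch.Size([2, 2])
```

In a real pipeline, `fake_states` would be replaced by the hidden states of a pretrained RoBERTa model, and the email-level vector could be concatenated with authentication features before the final linear layer.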