Use AI to improve spam detection in Gmail and find legitimate emails that are wrongly marked as spam by Google algorithms.
False positives in Gmail are uncommon but can happen, meaning an important email could mistakenly end up in your spam folder. When you are dealing with hundreds of spam messages daily, identifying these legitimate emails becomes even more challenging.
You can create filters in Gmail to ensure that emails from specific senders or with certain keywords are never marked as spam. But these filters obviously don’t work for emails from new or unknown senders.
Find misclassified messages in Gmail spam
What if we used AI to analyze our spam emails in Gmail and predict which ones might be false positives? With this list of misclassified emails, we can automatically move these emails to the inbox or generate a report for manual review.
Here is a sample report generated from Gmail. It contains a list of emails with a low spam score that are probably legitimate and should be moved to the inbox. The report also includes a summary of email content in your preferred language.
To get started, open this Google Script and copy it to your Google Drive. Switch to the Apps Script editor and provide your email address, OpenAI API key, and preferred language for the email summary.
Select reportFalsePositives from the function dropdown and click the play button to run the script. It will look for unread spam emails in your Gmail account, analyze them using OpenAI’s API, and send you a report of emails with a low spam score.
If you want to run this script automatically at regular intervals, go to the “Triggers” menu in the Google Apps Script editor and set up a time-driven trigger to run this script once every day. You can also choose the time of day you want to receive the report.
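If you prefer to create the daily trigger from code instead of the Triggers menu, the following is a minimal sketch. The function name createDailyTrigger_, the REPORT_HOUR constant, and the injectable scriptApp parameter are assumptions added for illustration; only the ScriptApp trigger builder calls come from Apps Script itself.

```javascript
// Assumption: deliver the report around 8 AM in the script's time zone.
const REPORT_HOUR = 8;

// `scriptApp` defaults to the Apps Script global ScriptApp. It is a parameter
// only so that the function can be exercised outside Apps Script.
const createDailyTrigger_ = (scriptApp = ScriptApp) => {
  const handler = 'reportFalsePositives';
  // Skip creation if a trigger for this function already exists.
  const exists = scriptApp
    .getProjectTriggers()
    .some((trigger) => trigger.getHandlerFunction() === handler);
  if (exists) return false;
  scriptApp
    .newTrigger(handler)
    .timeBased()
    .everyDays(1)
    .atHour(REPORT_HOUR)
    .create();
  return true;
};
```

Run createDailyTrigger_ once from the editor. It returns false when a trigger for reportFalsePositives is already installed, so running it again should not create duplicates.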
How AI Spam Classification Works – The Technical Part
If you are curious to know how the script works, here is a brief overview:
The script uses the Gmail service in Google Apps Script (GmailApp) to search for unread spam emails in your Gmail account. It then sends the email content to OpenAI’s API to assign a spam score and generate a summary in your preferred language. Emails with a low spam score are likely false positives and may be moved to the inbox.
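The script described here only emails a report, but moving low-scoring threads to the inbox could look roughly like the sketch below. The helper name moveFalsePositives_ and the parallel scores array are hypothetical. It relies on the moveToInbox() method of GmailThread, and it assumes that this call also takes the thread out of spam, which is worth verifying on a test message first.

```javascript
// threads: array of GmailThread objects (or anything with moveToInbox()).
// scores: spam scores in the same order as threads.
// threshold: assumed cutoff; scores at or below it count as false positives.
const moveFalsePositives_ = (threads, scores, threshold = 2) => {
  let moved = 0;
  threads.forEach((thread, index) => {
    if (scores[index] <= threshold) {
      thread.moveToInbox();
      moved += 1;
    }
  });
  return moved; // number of threads moved to the inbox
};
```

Keeping the manual report for a while before switching to automatic moves is the safer choice, since a misjudged score would put real spam back in the inbox.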
1. User Configuration
You can provide the email address where the report should be sent, your OpenAI API key, your preferred OpenAI model, and the language for the email summary.
const USER_EMAIL = 'email@domain.com';
const OPENAI_API_KEY = 'sk-proj-123';
const OPENAI_MODEL = 'gpt-4o';
const USER_LANGUAGE = 'English';
2. Find unread emails in Gmail spam folder
We use epoch time to find spam emails that have arrived in the last 24 hours and have not yet been read.
const HOURS_AGO = 24;
const MAX_THREADS = 25;
const getSpamThreads_ = () => {
  // Gmail search accepts Unix timestamps (in seconds) for after: and before:
  const epoch = (date) => Math.floor(date.getTime() / 1000);
  const beforeDate = new Date();
  const afterDate = new Date();
  afterDate.setHours(afterDate.getHours() - HOURS_AGO);
  const searchQuery = `is:unread in:spam after:${epoch(afterDate)} before:${epoch(beforeDate)}`;
  return GmailApp.search(searchQuery, 0, MAX_THREADS);
};
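To check the search query without touching Gmail, the query building can be pulled into a small pure function. This is only a sketch: buildSpamQuery_ and its now parameter are hypothetical names, not part of the original script.

```javascript
// Builds the same search query as above for a given reference time.
// `now` is injectable so that the result is deterministic.
const buildSpamQuery_ = (now = new Date(), hoursAgo = 24) => {
  const toEpoch = (date) => Math.floor(date.getTime() / 1000);
  const after = new Date(now.getTime() - hoursAgo * 60 * 60 * 1000);
  return `is:unread in:spam after:${toEpoch(after)} before:${toEpoch(now)}`;
};

console.log(buildSpamQuery_(new Date('2024-06-02T00:00:00Z'), 24));
```

You can paste the printed query into the Gmail search box to confirm that it returns the messages you expect.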
3. Create a prompt for the OpenAI model
We create a prompt for the OpenAI model using an email message. The prompt asks the AI model to analyze the email content and assign a spam score on a scale of 0 to 10. The response must be in JSON format.
const SYSTEM_PROMPT = `You are an AI email classifier. Given the content of an email, analyze it and assign a spam score on a scale from 0 to 10, where 0 indicates a legitimate email and 10 indicates a definite spam email. Provide a short summary of the email in ${USER_LANGUAGE}. Your response should be in JSON format.`;
const MAX_BODY_LENGTH = 200;
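const getMessagePrompt_ = (message) => {
  const body = message
    .getPlainBody()
    .replace(/https?:\/\/[^\s>]+/g, '') // remove links
    .replace(/[\n\r\t]/g, ' ') // replace line breaks and tabs with spaces
    .replace(/\s+/g, ' ') // collapse repeated whitespace
    .trim();
  // Send only the subject, sender, and the first part of the body
  return [
    `Subject: ${message.getSubject()}`,
    `Sender: ${message.getFrom()}`,
    `Body: ${body.substring(0, MAX_BODY_LENGTH)}`,
  ].join('\n');
};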
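To see what the body cleanup does to a message, here is a small standalone sketch that runs outside Apps Script. The cleanBody_ helper and the sample text are made up for illustration.

```javascript
// Strips links, collapses all whitespace, and truncates the result.
const cleanBody_ = (text, maxLength = 200) =>
  text
    .replace(/https?:\/\/[^\s>]+/g, '') // remove links
    .replace(/\s+/g, ' ') // line breaks, tabs, and repeated spaces become one space
    .trim()
    .substring(0, maxLength);

const sampleBody = 'Hello,\r\n\tVisit https://example.com/offer?id=1 now.\n\nThanks';
console.log(cleanBody_(sampleBody)); // prints "Hello, Visit now. Thanks"
```

Truncating after the cleanup, rather than before, means the character budget is spent on readable text instead of long tracking URLs.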
4. Call the OpenAI API to get the spam score
We pass the message prompt to the OpenAI API and get back the spam score and a summary of the email content. The spam score is used to determine whether an email is a false positive.
The tokens variable keeps track of the number of tokens used in OpenAI API calls and is included in the email report. You can use this information to monitor your API usage.
let tokens = 0;
const getMessageScore_ = (messagePrompt) => {
  const apiUrl = `https://api.openai.com/v1/chat/completions`;
  const headers = {
    'Content-Type': 'application/json',
    Authorization: `Bearer ${OPENAI_API_KEY}`,
  };
  const response = UrlFetchApp.fetch(apiUrl, {
    method: 'POST',
    headers,
    payload: JSON.stringify({
      model: OPENAI_MODEL,
      messages: [
        { role: 'system', content: SYSTEM_PROMPT },
        { role: 'user', content: messagePrompt },
      ],
      temperature: 0.2,
      max_tokens: 124,
      response_format: { type: 'json_object' },
    }),
  });
  const data = JSON.parse(response.getContentText());
  // Accumulate token usage for the report
  tokens += data.usage.total_tokens;
  // The reply is a JSON string expected to contain spam_score and summary
  const content = JSON.parse(data.choices[0].message.content);
  return content;
};
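The model is asked for JSON, but the reply may still be malformed or carry a score outside the 0 to 10 range. Below is a hedged sketch of a validation step. The parseScore_ helper and its fallback of treating unusable replies as spam are assumptions, not part of the original script.

```javascript
// Parses the model reply and validates the score.
// Assumption: an unusable reply is treated as spam (score 10) so that the
// message stays out of the false-positive report.
const parseScore_ = (rawContent) => {
  const fallback = { spam_score: 10, summary: '' };
  try {
    const parsed = JSON.parse(rawContent);
    const score = Number.parseFloat(parsed.spam_score);
    if (!Number.isFinite(score) || score < 0 || score > 10) return fallback;
    return { spam_score: score, summary: String(parsed.summary || '') };
  } catch (error) {
    return fallback;
  }
};

console.log(parseScore_('{"spam_score": 1, "summary": "Invoice from your bank"}'));
console.log(parseScore_('not json'));
```

In the script, this would replace the plain JSON.parse call on the message content. Defaulting to a high score is a conservative choice: a broken reply leaves the email in spam instead of flagging it as legitimate.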
5. Process spam emails and email report
You can run this Google Script manually or set up a time-driven trigger to run it automatically at regular intervals. It marks the spam emails as read so they are not processed again.
const SPAM_THRESHOLD = 2;
const reportFalsePositives = () => {
  const html = [];
  const threads = getSpamThreads_();
  for (let i = 0; i < threads.length; i += 1) {
    // Score only the first message of each thread
    const [message] = threads[i].getMessages();
    const messagePrompt = getMessagePrompt_(message);
    const { spam_score, summary } = getMessageScore_(messagePrompt);
    if (spam_score <= SPAM_THRESHOLD) {
      // One table row per likely false positive
      html.push(`<tr><td>${message.getFrom()}</td><td>${summary}</td></tr>`);
    }
  }
  // Mark everything as read so the next run skips these threads
  threads.forEach((thread) => thread.markRead());
  if (html.length > 0) {
    const htmlBody = [
      `<table border="1">`,
      '<tr><th>Email Sender</th><th>Summary</th></tr>',
      html.join(''),
      '</table>',
    ].join('');
    const subject = `Gmail Spam Report - ${tokens} tokens used`;
    GmailApp.sendEmail(USER_EMAIL, subject, '', { htmlBody });
  }
};
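Sender strings returned by getFrom() often look like Jane <jane@example.com>, and the angle brackets can be swallowed when the value is placed directly into an HTML report. The sketch below escapes both fields before building a row. It assumes the report is rendered as an HTML table, and the helper names escapeHtml_ and buildRow_ are hypothetical.

```javascript
// Escapes characters that have special meaning in HTML.
const escapeHtml_ = (text) =>
  String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');

// Builds one report row from a sender and a summary.
const buildRow_ = (sender, summary) =>
  `<tr><td>${escapeHtml_(sender)}</td><td>${escapeHtml_(summary)}</td></tr>`;

console.log(buildRow_('Jane <jane@example.com>', 'Invoice & receipt'));
// prints <tr><td>Jane &lt;jane@example.com&gt;</td><td>Invoice &amp; receipt</td></tr>
```

Escaping the summary matters too, since it is generated from the content of an untrusted spam email.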
See also: Authenticate your Gmail messages