Grab Certain Texts From Gmail via Google Spreadsheet (Script)

I had worked on this a few weeks back, and without any prior knowledge of using GmailApp call on Google Spreadsheet, I had to struggle for a while to get it working. The code to do this is fairly simple and can be understood by just about anyone. The problem I was trying to solve was also fairly simple. First, some background : there’s a new task initiated by my college, although as ridiculous as it might sound to you, a student (based on the order of the roll number) has to send an email blast each day to other students. The email blast should contain a word, it’s meaning, example of its usage etc. There’s a certain format that needs to be followed, and what I wanted to do was grab just the words within the email and store them so that at some point, I’d have a database full of every words that have ever been sent. (Not really my idea, though.)

So, here’s how I tackled the problem, and I hope this would be helpful to you in case you want to do something similar. What I did at first was that I went to my Gmail and assigned all the emails with the subject “word of the day”, the label : “Word”. Since all the email blasts followed the same format, they all had the same subject as well. The advantage of assigning a label to emails based on subject is that, it works for new emails as well. That means, any new email with the same subject, will also automatically be assigned the label “Word”. I now only needed to focus on retrieving mails from the label “Word” via Google Script. And this code exactly does that :

The first line does nothing more than focusing on the Spreadsheet named “MySpreadsheet” (just a name I gave to the spreadsheet). The second and third line uses the GmailApp function provided by Google Script in order to retrieve the list of emails under label “Word”. The loop takes in individual mail and gets its raw content.  It’s similar to clicking “Show original” option on Gmail. Doing this will show the email along with its HTML source. Then, I have used Regular Expression to parse the HTML source, and match the word between “*” and “*”.

While grabbing raw content, any bold, italic, underlined or formatted texts are put inside two asterisks. And according to the format that everyone had to follow, the first bold text on the email would be the word of the day, so, simply grabbing the word between the first occurrence of a  pair of asterisks would get me the word that I want. The For-loop would execute for as many emails as there are under “Word” labels (threads.length). The if condition adds the retrieved word to the spreadsheet.

There’s a certain limit imposed by Google as to the number of emails you can retrieve at a time, so you might need to make some changes if there’s a huge number of emails that need to be checked in order to retrieve the things you need.

Hope this has given you a fair idea as to how to retrieve words from Gmail using Google Script. Good luck!

Leave a Reply

Your email address will not be published. Required fields are marked *