For many beginners like myself, learning to code for a data science project can be just as intimidating as flying a Boeing 757 for the first time. Fortunately, Python offers a myriad of powerful libraries to get us beginners quickly into the cockpit to do some cool analyses. One such library is TextBlob, which provides a simple API for diving into common natural language processing (NLP) tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, and much more. In this blog, we are going to learn how to write a simple Python program that performs sentiment analysis of Martin Luther King, Jr.’s ‘I Have a Dream’ speech and writes the results to a .csv file.
Before we hit the throttle and leave the runway, I want to give a brief background of myself and how Convergence Consulting Group (CCG) cultivated my interest in data science. I started at CCG fresh out of college in the summer of 2017. Because I had no professional experience in either consulting or data science, CCG put me through their consultant boot camp, a rigorous 3-week program that covers the ins and outs of consulting, data warehousing, cloud services, data modeling, advanced analytics, and many other tools and soft skills that every CCGer needs to succeed. The experience was immensely useful and enjoyable, but the course that really grabbed my attention was the one on advanced analytics. That course was my first introduction to Python as a data science language. Our instructor, Ahmed Sherif, gave us real data sets to work with and really pushed us to “think in Python”. By the end of that 2-hour session, I knew that I wanted to do some more cool analyses with Python, which led me to write the program I’m about to walk you through.
So let’s get started, shall we?
First, I chose to use Notepad++ to write my code, but feel free to use whatever IDE or text-writing tool you’re comfortable with (FYI Jupyter is great if you’re a beginner).
In lines 4 and 5, we import the TextBlob and csv libraries. The former is how we will invoke the NLP sentiment analysis functions. The latter is how we will invoke the functions necessary to write our sentiment analysis results to a .csv file.
In lines 9 and 10, we declare two file path variables: file_path is the location of the “I Have a Dream” speech, and sentiment_csv_path is the eventual location of the sentiment analysis results .csv file.
In line 15, we create the list fieldnames, which will be used to populate the headers of our .csv file. A quick note about each of these five headers: Sentence_ID is a unique number that identifies each sentence of the speech. Polarity and Subjectivity hold the respective sentiment analysis scores for each sentence. Sentence holds the text of each sentence. Lastly, Strong Opinion? holds a Boolean flag (0 = false, 1 = true) indicating whether the program determines the sentence to be a ‘strong opinion’. I’ll elaborate on this shortly.
In line 19, we declare the sentence_ID variable which is set to 0. It will later be incremented as the program traverses the text file and identifies each sentence.
In lines 22-25, we do three things. First, we create the .csv file at the location stored in sentiment_csv_path. Then, we write the headers to the .csv file using fieldnames. Finally, we close the file.
Line 29 is where we open the speech file found at the location stored in file_path. Line 30 reads the contents of the open speech file into TextBlob.
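For readers following along without the original script, here is a minimal sketch of the header-writing step. It assumes the script uses csv.DictWriter (the post does not show the exact call) and writes to a placeholder path in the system temp folder; the column names are the five headers described above.

```python
import csv
import os
import tempfile

# Placeholder location for the results file (an assumption); use your own path.
sentiment_csv_path = os.path.join(tempfile.gettempdir(), "sentiment_results.csv")

# The five column headers described in the post.
fieldnames = ["Sentence_ID", "Polarity", "Subjectivity", "Sentence", "Strong Opinion?"]

# Mode "w" creates the file (or overwrites an existing one). newline="" keeps
# the csv module from writing blank lines between rows on Windows.
with open(sentiment_csv_path, "w", newline="", encoding="utf-8") as csv_file:
    writer = csv.DictWriter(csv_file, fieldnames=fieldnames)
    writer.writeheader()
# Leaving the with block closes the file, so no explicit close() call is needed.
```

Using a with block is one common way to make sure the file gets closed; an explicit close() call after writing would work as well.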
Line 33 takes the open speech file and splits it up into sentences.
Now we get to the meat and potatoes of this program. Line 38 is the beginning of a for loop that loops through each sentence of the speech file. Lines 39-40 assign polarity and subjectivity scores to a sentence coming in from the speech file.
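A rough sketch of the splitting and scoring steps might look like the following. The function name score_sentences is made up for illustration, and the sketch assumes TextBlob is installed and its corpora have been downloaded (python -m textblob.download_corpora).

```python
def score_sentences(text):
    """Return a list of (polarity, subjectivity, sentence_text) tuples."""
    # Imported inside the function so TextBlob is only required when the
    # function is actually called.
    from textblob import TextBlob

    blob = TextBlob(text)
    results = []
    # blob.sentences splits the text into individual Sentence objects.
    for sentence in blob.sentences:
        polarity = sentence.sentiment.polarity          # -1.0 (negative) to 1.0 (positive)
        subjectivity = sentence.sentiment.subjectivity  # 0.0 (objective) to 1.0 (subjective)
        results.append((polarity, subjectivity, str(sentence)))
    return results
```

To use it, read the speech file into a string first, for example with open(file_path, encoding="utf-8") as speech_file: scores = score_sentences(speech_file.read()).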
Then in lines 45-48, the program determines whether the sentence is a strong opinion based on its polarity and subjectivity scores. If the absolute value of the polarity is greater than 0.7 and the subjectivity is greater than 0.7, the sentence is assigned a value of 1 (true). Otherwise, it is assigned a value of 0 (false).
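The strong-opinion rule can be sketched as a small helper. The function name is_strong_opinion and the threshold parameter are illustrative and not taken from the original script; only the 0.7 cutoffs come from the post.

```python
def is_strong_opinion(polarity, subjectivity, threshold=0.7):
    """Return 1 if the sentence counts as a 'strong opinion', otherwise 0."""
    # abs() lets strongly negative sentences qualify as well as strongly positive ones.
    if abs(polarity) > threshold and subjectivity > threshold:
        return 1
    return 0


print(is_strong_opinion(0.8, 0.9))   # 1: strongly positive and highly subjective
print(is_strong_opinion(-0.8, 0.9))  # 1: strongly negative and highly subjective
print(is_strong_opinion(0.8, 0.5))   # 0: subjectivity is too low
print(is_strong_opinion(0.7, 0.9))   # 0: a polarity of exactly 0.7 is not greater than 0.7
```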
Lines 51-54 open up that .csv file once more, then populate it with data corresponding to each of the 5 headers (Sentence_ID, Polarity, Subjectivity, Sentence, and Strong Opinion?).
In line 56, we reach the end of the loop. If there is another sentence to be processed, sentence_ID is incremented by 1. Once there are no more sentences left, the loop completes.
Finally, once the program has looped through each sentence of the speech file, analyzed it with TextBlob, and written the results to the .csv file, we close the speech file in line 59. Although you’ll still be able to view the complete results if you skip this step, closing the file is a programming best practice, especially if you need to access the speech file later in the program via a different library.
Once your program completes successfully, you should be able to open the .csv file (I opened it in Excel here) and have all the analysis results ready for easy consumption. Now that the results are neatly organized in a .csv file, there are a multitude of options to store, consume, and creatively visualize this data!
To those of you reading this post who are new to programming and to programming for data science: I hope you’ve found this post as helpful, and maybe just as inspiring, as I found my first exposure to it in CCG’s consultant boot camp.
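Putting the loop together, a sketch of the append-and-increment pattern could look like this. To keep it runnable without TextBlob, it uses two made-up (polarity, subjectivity, sentence) tuples in place of real scores, and it assumes csv.DictWriter and a placeholder output path; neither is confirmed by the post.

```python
import csv
import os
import tempfile

# Placeholder output path (an assumption); use your own location.
sentiment_csv_path = os.path.join(tempfile.gettempdir(), "sentiment_loop_demo.csv")
fieldnames = ["Sentence_ID", "Polarity", "Subjectivity", "Sentence", "Strong Opinion?"]

# Made-up scores standing in for what TextBlob would return per sentence.
scored_sentences = [
    (0.0, 0.0, "Example sentence one."),
    (0.9, 0.95, "Example sentence two."),
]

# Create the file and write the header row once.
with open(sentiment_csv_path, "w", newline="", encoding="utf-8") as csv_file:
    csv.DictWriter(csv_file, fieldnames=fieldnames).writeheader()

sentence_ID = 0
for polarity, subjectivity, text in scored_sentences:
    # Strong-opinion rule from the post: both scores must exceed 0.7.
    strong_opinion = 1 if abs(polarity) > 0.7 and subjectivity > 0.7 else 0

    # Mode "a" appends a row to the existing file instead of overwriting it.
    with open(sentiment_csv_path, "a", newline="", encoding="utf-8") as csv_file:
        writer = csv.DictWriter(csv_file, fieldnames=fieldnames)
        writer.writerow({
            "Sentence_ID": sentence_ID,
            "Polarity": polarity,
            "Subjectivity": subjectivity,
            "Sentence": text,
            "Strong Opinion?": strong_opinion,
        })

    # Move on to the next sentence's identifier.
    sentence_ID += 1
```

Reopening the file on every pass mirrors the walkthrough above. Opening it once before the loop and writing all the rows inside a single with block would be a reasonable alternative.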
To read more on CCG’s capabilities click here, or contact a CCG representative at (813) 265-3239.