What is the Impact of Capstone Courses on Students’ Employability?

by Swapnil Lokhande

Supervised by Dr. Julia Ivy

May 14, 2020


This research analyzes the impact of capstone programs on the employability of students and job seekers. It identifies the keywords that authors use when discussing the benefits of capstone work in the academic curriculum, then analyzes those keywords using the BE-EDGE methodology to see how far a capstone project helps students develop their Personal, Social and Professional Capital.


Article selection and filtering

For the keyword analysis, the main data source is publicly available articles about capstone projects. The dataset consists of articles on the benefits of capstone.

  1. Benefits of capstone: The articles were selected from the results of the following Google searches, together with other recommended articles: What is capstone, Benefits of capstone projects, Importance of capstone project, Why capstone projects are important.

Only articles that discuss capstone project benefits in general were selected. This avoids bias toward a particular program or degree, so that the findings represent capstone projects in general rather than a specific capstone required by a particular degree or university program.

The first 5 pages of Google search results yielded relevant articles; beyond that, the articles were more specific to a particular type of program or course. Thus, only articles from the first 5 pages of the Google search results were selected.

Data Extraction Method

To analyze the selected articles and draw insights from them, their content, which is published in HTML format, must be extracted, stored in a simple text format (CSV or JSON), and then used for further processing and analysis. The first step in this project is therefore to build an application that extracts the desired content from different websites and stores it in the required format. To accomplish this, different web scrapers are deployed to gather the data.
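The storage step described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names `save_articles` and `load_articles` and the record fields `url` and `text` are assumptions chosen for the example.

```python
import csv
import json
from pathlib import Path


def save_articles(articles, json_path, csv_path):
    """Write article records (dicts with 'url' and 'text') to JSON and CSV files."""
    # JSON keeps each record as-is and copes well with long text.
    Path(json_path).write_text(
        json.dumps(articles, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    # CSV stores one article per row; the csv module quotes embedded
    # commas and line breaks so the text survives a round trip.
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["url", "text"])
        writer.writeheader()
        writer.writerows(articles)


def load_articles(json_path):
    """Read the stored records back for further processing."""
    return json.loads(Path(json_path).read_text(encoding="utf-8"))
```

Either format is enough for the later keyword analysis; JSON is usually the more forgiving choice when article text contains many commas, quotes, or line breaks.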


Web scraper for articles – Purpose and Technique

  • A simple web scraper was designed in Python to pull the content from an article.
  • The user passes the application the URL of a freely available article (for example, an article from The Conversation or The New York Times).
  • The application uses Python's requests and BeautifulSoup packages to retrieve the HTML of the given article and process it to pull the required content.



  • Limitation: the scraper cannot extract content from articles that require a login to the hosting portal.
  • Example: articles on the Northeastern Library portal can be accessed only after logging in to a student account through myneu. The Chronicle of Higher Education also requires a login to access its articles.
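The scraper described in this section can be sketched roughly as below. The source names the requests and BeautifulSoup packages; this sketch deliberately swaps in Python's standard library (`urllib.request` and `html.parser`) so that it runs without third-party installs. The names `ParagraphExtractor`, `extract_article_text`, and `fetch_article` are hypothetical, and treating the `<p>` elements as the article body is an assumption that will not hold for every site.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class ParagraphExtractor(HTMLParser):
    """Collect the visible text of every <p> element in an HTML page."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._in_paragraph = False
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.flush_paragraph()  # an unclosed <p> ends where the next begins
            self._in_paragraph = True

    def handle_endtag(self, tag):
        if tag == "p":
            self.flush_paragraph()
            self._in_paragraph = False

    def handle_data(self, data):
        if self._in_paragraph:
            self._chunks.append(data)

    def flush_paragraph(self):
        # Join the collected pieces and normalize runs of whitespace.
        text = " ".join("".join(self._chunks).split())
        if text:
            self.paragraphs.append(text)
        self._chunks = []


def extract_article_text(html):
    """Return the paragraph text of an HTML document, one paragraph per line."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    parser.flush_paragraph()  # keep a trailing paragraph that was never closed
    return "\n".join(parser.paragraphs)


def fetch_article(url, timeout=10):
    """Download a freely available page; pages behind a login will not work."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

A typical call would be `extract_article_text(fetch_article(url))`. With the packages the source names, the same two steps would be `requests.get(url).text` followed by `BeautifulSoup(html, "html.parser").find_all("p")`.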


The main goal of this research is to identify keywords that are frequently used and highly relevant to the topic, the benefits of capstone programs. To identify such keywords, a Natural Language Processing technique is used: TF-IDF (term frequency-inverse document frequency). It is commonly used when processing human-readable language to convert words into numerical form, with the scores for all words across all documents arranged as a matrix (Gajare, n.d.).

How to calculate the TF-IDF score

TF-IDF for a word in a document is calculated by multiplying two different metrics:

  • The term frequency (TF) of a word in a document: a raw count of the number of times the word appears in that document.
  • The inverse document frequency (IDF) of the word across a set of documents: the logarithm of the total number of documents divided by the number of documents that contain the word. The IDF shows how common or rare a word is across the entire document set. The closer it is to 0, the more common the word is; the larger it is, the rarer the word is.
  • Multiplying these two numbers gives the TF-IDF score of a word in a document. The higher the score, the more relevant that word is in that particular document (Stecanella, 2019).
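The calculation above can be written out in a few lines. This is a minimal sketch under stated assumptions: whitespace tokenization, the natural logarithm (the text does not fix a base), and no smoothing. Library implementations often use smoothed or normalized variants, so their numbers will differ. The function name `tf_idf` is illustrative.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: score} dict per document.

    TF is the raw count of the term in the document; IDF is
    log(total documents / documents containing the term).
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents in which each term occurs.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        scores.append({
            term: count * math.log(n_docs / doc_freq[term])
            for term, count in term_freq.items()
        })
    return scores
```

A term that occurs in every document gets an IDF of log(1) = 0 and therefore a score of 0, however often it appears. A term that occurs in only one of N documents gets the maximum IDF, log(N).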


The result consists of the bi-grams and tri-grams found in the articles on the benefits and importance of capstone. The keywords are listed in descending order of rank and frequency. Here, rank is the TF-IDF score, which indicates how important or relevant the word is in a given article. For example, the keyword “sponsor organization” is expected to score higher in articles about the benefits of capstone, since a capstone project typically involves a sponsor organization with which students work to accomplish the goals of both the project and the organization.
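One way to produce such a ranked list of bi-grams and tri-grams is sketched below. It is an illustration under assumptions, not the pipeline used in this study: words are taken to be runs of letters, no stop words are removed, the score is raw count times the natural log of (documents / documents containing the phrase), and the names `ngrams` and `rank_ngrams` are hypothetical.

```python
import math
import re
from collections import Counter


def ngrams(text, sizes=(2, 3)):
    """Return every phrase of the given word lengths found in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    return [
        " ".join(words[i:i + n])
        for n in sizes
        for i in range(len(words) - n + 1)
    ]


def rank_ngrams(documents, doc_index, sizes=(2, 3)):
    """Rank the phrases of one document as (phrase, score, frequency) rows."""
    per_doc = [Counter(ngrams(doc, sizes)) for doc in documents]
    n_docs = len(documents)
    # Iterating a Counter yields each distinct phrase once per document.
    doc_freq = Counter(phrase for counts in per_doc for phrase in counts)
    ranked = [
        (phrase, count * math.log(n_docs / doc_freq[phrase]), count)
        for phrase, count in per_doc[doc_index].items()
    ]
    # Highest score first; ties go to the higher frequency, then alphabetical.
    ranked.sort(key=lambda row: (-row[1], -row[2], row[0]))
    return ranked
```

Calling `rank_ngrams(articles, 0)` returns the phrases of the first article with the most distinctive ones at the top. Phrases that occur in every article score 0 and sink to the bottom.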

Classification of words based on EDGE

Words related to Personal capital

EDGE Required Words | Other meaning | Words related to the analysis
Identity | Self-esteem, individuality | knowledge and skills, academic work, field of study, intellectual property
Focus | Center of interest or activity | Develop expertise specific to problem
Strategy | Plan of action | Deep understanding, geared towards working
Choice | Making a decision when faced with two or more possibilities or well chosen/good fit | (none listed)
Vision | The ability to think or plan about the future | Learn leadership relevant courses
Goals | Aim or desired result | Academic work
Ownership | Right of possessing something | Receive academic credit
Empowerment | Power given to someone to do something or becoming stronger and more confident | knowledge and skills gained, serve as culminating academic

Words related to Social capital

EDGE Required Words | Other meaning | Words related to the analysis
Trust | Quality of being true, reliability, reliable, shared understanding | (none listed)
Empathy | Ability to understand and share the feelings of others | Receive strong support
Relationships | The state of being connected | Faculty advisor supervises teams
Rapport | Understanding each other's feelings and sharing ideas | (none listed)

Words related to Professional capital

EDGE Required Words | Other meaning | Words related to the analysis
Justification | Action of showing something reasonable | Literature review, conduct research, facing host/sponsor organization, develop research plan
Proof | Evidence to help establish a fact | Oral presentation, practical experience, provide opportunity, view exciting opportunity
Design thinking (preferably used by designers and design teams) | Cognitive, strategic and practical processes by which design concepts are developed by designers | Critical thinking skills, real world problems, solve problems, apply skills, apply knowledge, develop expertise specific to problem, develop and use public speaking, address strategic challenges, receive objective study of critical issue

Capstone Dashboard

The frequency of occurrence of the words with high TF-IDF scores is used to compare the words in the different categories. The analysis above shows that the majority of the words belong to Personal and Professional capital, whereas only two belong to Social capital, and those two occur with very low frequency. With respect to Social capital, the analysis suggests that a capstone program helps students build a relationship with the organization that is involved in or supporting the project, and collaborate with the advisor to seek guidance and support.
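The comparison across categories amounts to summing phrase frequencies per capital. The sketch below shows one way to do that. The mapping is only a subset of the phrases classified in the tables above, the names `PHRASE_CAPITAL` and `tally_by_capital` are hypothetical, and no frequencies from the study are reproduced, since the text does not list them.

```python
from collections import Counter

# Subset of the phrase-to-capital classification from the tables above.
PHRASE_CAPITAL = {
    "knowledge and skills": "Personal",
    "field of study": "Personal",
    "receive academic credit": "Personal",
    "receive strong support": "Social",
    "faculty advisor supervises teams": "Social",
    "literature review": "Professional",
    "critical thinking skills": "Professional",
    "solve problems": "Professional",
}


def tally_by_capital(phrase_counts, mapping=PHRASE_CAPITAL):
    """Sum phrase frequencies per capital; unclassified phrases are skipped."""
    totals = Counter()
    for phrase, count in phrase_counts.items():
        capital = mapping.get(phrase)
        if capital is not None:
            totals[capital] += count
    return totals
```

The input is a dictionary from phrase to its frequency in the collected articles. The returned totals can then be charted or compared directly, for instance to see how far Social capital trails the other two.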


Note: The analysis was performed on a sample of data, and the results may vary if more articles are collected. Future work could extend this research by applying other Natural Language Processing and Machine Learning algorithms to capture the relationships between words more accurately and to group and classify words based on those relationships.

About the Author

Swapnil Lokhande, a graduate student at Northeastern University, Boston, has completed a Master's in Analytics.

Swapnil’s LinkedIn Profile