What is the Impact of Capstone Courses on Students’ Employability?
by Swapnil Lokhande
Supervised by Dr. Julia Ivy
May 14, 2020
This research analyzes the impact of capstone programs on the employability of students and job seekers. The core of the analysis is identifying the keywords that authors use when discussing the benefits of capstone projects in the academic curriculum. The goal is to analyze these keywords using the BE-EDGE methodology and to assess how far a capstone project helps students develop their Personal, Social and Professional Capital.
METHODOLOGY FOR DATA SELECTION AND EXTRACTION
Article selection and filtering
The main source of data for the keyword analysis is publicly available articles about capstone projects. The dataset comprises articles on the benefits of capstone.
- Benefits of capstone: The articles were selected from Google search results, along with other recommended articles, for the following queries: What is capstone, Benefits of capstone projects, Importance of capstone project, Why capstone projects are important.
Only articles that discuss capstone project benefits in general were selected. This avoids bias toward a particular program or degree, so that the findings represent capstone projects in general rather than a capstone required by a specific degree or university program.
The first 5 pages of Google search results contained relevant articles; beyond that, the articles became specific to a particular type of program or course. Thus, only articles from the first 5 pages of the Google search results were selected.
Data Extraction Method
To analyze the selected articles and draw insights from them, their content, which is in HTML format, needs to be extracted, stored in a simple text format (CSV or JSON), and then used for further processing and analysis. Thus, the first step in this project is to build an application that extracts the desired content from different websites and stores it in the required format. To accomplish this, web scrapers are deployed to gather the data.
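The storage step can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name `save_articles` and the record fields `url`, `title` and `text` are assumptions chosen for the example.

```python
import csv
import json
from pathlib import Path


def save_articles(articles, json_path, csv_path):
    """Write a list of {"url", "title", "text"} dicts to a JSON and a CSV file.

    The field names are hypothetical; the article does not specify a schema.
    """
    # JSON keeps the records as-is and preserves non-ASCII characters.
    Path(json_path).write_text(
        json.dumps(articles, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    # CSV stores one article per row; newline="" lets the csv module
    # control line endings, as the csv documentation recommends.
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["url", "title", "text"])
        writer.writeheader()
        writer.writerows(articles)
```

JSON is convenient when the extracted text contains line breaks or when extra fields may be added later; CSV is easier to open in a spreadsheet.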
Web scraper for articles – Purpose and Technique
- A simple web scraper was designed in Python to pull the content from an article.
- The user passes the application the URL of a freely available article (for example, articles from The Conversation or The New York Times).
- The application uses the requests and BeautifulSoup packages for Python to retrieve the HTML of the given article and process it to pull the required content.
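A scraper of this kind might look like the sketch below. It uses the requests and BeautifulSoup packages named above, but the function names and the assumption that the article body sits in `<p>` tags are illustrative; the author's actual extraction rules are not given, and real sites often need site-specific selectors.

```python
import requests
from bs4 import BeautifulSoup


def extract_article_text(html):
    """Return the title and paragraph text found in an HTML page."""
    soup = BeautifulSoup(html, "html.parser")
    # Drop elements that do not hold article prose.
    for tag in soup(["script", "style", "nav", "footer"]):
        tag.decompose()
    title = soup.title.get_text(strip=True) if soup.title else ""
    # Assumption: the article body lives in <p> tags.
    paragraphs = [p.get_text(" ", strip=True) for p in soup.find_all("p")]
    text = "\n".join(p for p in paragraphs if p)
    return {"title": title, "text": text}


def scrape_article(url, timeout=10):
    """Download a freely available article and extract its content."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # raises on HTTP 4xx/5xx responses
    record = extract_article_text(response.text)
    record["url"] = url
    return record
```

Keeping the parsing in its own function means it can be checked against a saved HTML string without making a network request.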
Drawback
- The scraper cannot extract content from articles that require a login to the portal.
- Example: articles on the Northeastern Library portal can only be accessed after logging into a student account through myneu. The Chronicle of Higher Education also requires a login to access its articles.
METHODOLOGY FOR DATA ANALYSIS AND KEYWORD DETECTION
The main goal of this research is to identify keywords that are frequently used and highly relevant to the topic: benefits of capstone programs. To identify such keywords, a text-weighting technique commonly used in Machine Learning for Natural Language Processing is applied: TF-IDF (term frequency-inverse document frequency). It is generally used when processing human-readable language and converts words into a numerical format, in which the text is represented as a matrix of word scores (Gajare, n.d.).
How to calculate the TF-IDF score
TF-IDF for a word in a document is calculated by multiplying two different metrics:
- The term frequency (TF) of a word in a document is a raw count of how many times the word appears in that document.
- The inverse document frequency (IDF) of the word across a set of documents. This is calculated by taking the total number of documents, dividing it by the number of documents that contain the word, and taking the logarithm. The IDF indicates how common or rare a word is in the entire document set: the closer it is to 0, the more common the word is, and the higher the value, the rarer the word.
- Multiplying these two numbers results in the TF-IDF score of a word in a document. The higher the score, the more relevant that word is in that particular document (Stecanella, 2019).
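The calculation described in the steps above can be sketched in a few lines of Python. This is an illustrative implementation only: the whitespace tokenization and the use of the natural logarithm are assumptions, since the article does not state which tokenizer or log base was used.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: score} dict per document (raw-count TF times log IDF)."""
    # Assumption: lower-casing and whitespace splitting stand in for a real tokenizer.
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: the number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)  # raw count of each term in this document
        scores.append({
            # Assumption: natural logarithm; other bases only rescale the scores.
            term: count * math.log(n_docs / doc_freq[term])
            for term, count in term_freq.items()
        })
    return scores
```

With this definition, a word that appears in every document gets an IDF of log(1) = 0 and therefore a TF-IDF score of 0, regardless of how often it occurs.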
Findings
The results consist of the bi-grams and tri-grams found in the articles on the benefits and importance of capstone. The keywords are listed in descending order of rank and frequency. Here, rank is the TF-IDF score, which shows the importance or relevance of the keyword in the given article. For example, the keyword “sponsor organization” will have a higher score for articles on the benefits of capstone, since a capstone project is assumed to involve a sponsor organization with which students work to accomplish the goals of both the project and the organization.
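As an illustration of how bi-grams and tri-grams can be pulled from article text and ordered by frequency, consider the sketch below. The stop-word list, the regular-expression tokenizer and the function names are assumptions made for the example; the article does not describe the exact tooling used, and the ranking here uses raw counts rather than TF-IDF scores.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"a", "an", "the", "of", "and", "to", "in", "with", "for", "is"}


def ngrams(text, sizes=(2, 3)):
    """Yield bi-grams and tri-grams (as space-joined strings) from the text."""
    # Stop words are removed before the n-grams are formed, so an n-gram
    # may join words that were separated by a stop word in the original.
    tokens = [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]
    for n in sizes:
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


def top_ngrams(text, limit=10):
    """Return the most frequent n-grams in descending order of count."""
    return Counter(ngrams(text)).most_common(limit)
```

For a text in which “sponsor organization” occurs twice and every other phrase once, `top_ngrams(text, 1)` returns that bi-gram with a count of 2.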
Classification of words based on EDGE
Words related to Personal capital
EDGE Required Words | Other meaning | Words related to the analysis |
Identity | Self-esteem, individuality | knowledge and skills, academic work, field of study, intellectual property |
Focus | Center of interest or activity | Develop expertise specific to problem |
Strategy | Plan of action | Deep understanding, geared towards working |
Choice | Making a decision when faced with two or more possibilities or well chosen/good fit | |
Vision | The ability to think or plan about the future | Learn leadership relevant courses |
Goals | Aim or desired result | Academic work |
Ownership | Right of possessing something | Receive academic credit |
Empowerment | Power given to someone to do something or becoming stronger and more confident | knowledge and skills gained, serve as culminating academic |
Words related to Social capital
EDGE Required Words | Other meaning | Words related to the analysis |
Trust | Quality of being true, reliability, reliable, shared understanding | |
Empathy | Ability to understand and share the feelings of others | Receive strong support |
Relationships | The state of being connected | Faculty advisor supervises teams |
Rapport | Understanding each other's feelings and sharing ideas | |
Listening | | |
Words related to Professional capital
EDGE Required Words | Other meaning | Words related to the analysis |
Justification | Action of showing something to be reasonable | Literature review, conduct research, facing host/sponsor organization, develop research plan |
Proof | Evidence to help establish a fact | Oral presentation, practical experience, provide opportunity, view exciting opportunity |
Design thinking (preferably used by designers and design teams) | cognitive, strategic and practical processes by which design concepts are developed by designers | Critical thinking skills, real world problems, solve problems, apply skills, apply knowledge, develop expertise specific to problem, develop and use public speaking, address strategic challenges, receive objective study of critical issue |
Capstone Dashboard
The frequency of occurrence of keywords with a high TF-IDF score is used to compare the categories. The analysis above shows that the majority of the keywords belong to Personal and Professional capital, while only two belong to Social capital, and their frequency is very low. In terms of Social capital, the analysis suggests that a capstone program helps students build a relationship with the organization that is involved in or supporting the project, and collaborate with the advisor to seek guidance and support.
Note: The analysis was performed on a sample of data, and the results may vary if more articles are collected. Further research could apply other Machine Learning algorithms for Natural Language Processing to obtain more accurate results and to group and classify words based on their relationships with other words. This would expand this research in the future.
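The comparison across categories can be sketched as a simple tally. In the example below, the phrase lists are a subset of those in the classification tables above, while the function name `capital_totals` and any frequencies passed to it are hypothetical; this is not the code behind the dashboard.

```python
from collections import Counter

# Subset of the phrases listed in the classification tables above.
EDGE_PHRASES = {
    "Personal": ["knowledge and skills", "academic work", "field of study"],
    "Social": ["receive strong support", "faculty advisor supervises teams"],
    "Professional": [
        "literature review",
        "conduct research",
        "critical thinking skills",
        "real world problems",
    ],
}


def capital_totals(phrase_counts):
    """Sum phrase frequencies per capital; phrases not in the map are ignored."""
    lookup = {p: capital for capital, phrases in EDGE_PHRASES.items() for p in phrases}
    # Start every capital at 0 so that empty categories still appear in the result.
    totals = Counter({capital: 0 for capital in EDGE_PHRASES})
    for phrase, count in phrase_counts.items():
        capital = lookup.get(phrase.lower())
        if capital:
            totals[capital] += count
    return dict(totals)
```

Given a mapping of keyword to frequency, the function returns one total per capital, which is the kind of figure a dashboard could plot side by side.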
About the Author
Swapnil Lokhande is a graduate student in the Master's in Analytics program at Northeastern University, Boston.