Senior Design Research Project

Project Title: What Are We Teaching? Automated Evaluation of CS Curricula Content Using Topic Modeling

This research was performed under the supervision of Huzefa Rangwala at George Mason University. We explored the application of Topic Modeling to textual university curriculum data, specifically course descriptions. Automatically extracting content from this corpus allows for larger-scale analysis of university curricula than would be possible with manual inspection.

In addition to extracting topics from course documents, I independently developed an easy-to-use, web-based visualization platform to display the results of information extraction and analysis. The visualization tool is called Trajectory and is built on the Python Flask framework.


Paper: pdf (196K)
Poster: pdf (1M)
Slides: pdf (374K)

Web Demo: Trajectory
Code Repository: GitHub

UA REU Empirical Software Engineering

Project Title: Usability and Suitability Survey of Features in Visual IDEs for Non-Programmers

In summer 2014, I was an undergraduate researcher with the University of Alabama and the National Science Foundation, conducting research in empirical software engineering with the UA REU ESE program.

The project involved the systematic empirical study of user interfaces in visual language integrated development environments (IDEs). Together with Dr. Eugene Syriani, we analyzed a set of visual language IDEs and developed formal metrics for comparing them. The goal of this research was to understand the important positive and negative features of visual language IDEs in a formal setting.


Conference Paper: ACM Digital Library
Conference Slides: pdf (1.5M)

Amalthea REU

Project Title: Large-scale Clustering for Big Data Analytics: A MapReduce Implementation of Hierarchical Affinity Propagation

In summer 2013, I was an undergraduate researcher with the Florida Institute of Technology and the National Science Foundation, conducting research into machine learning in theory and application with the Amalthea REU program.

The project involved architecting and implementing a Big Data-scale data-mining framework with a research partner and a graduate mentor. The application was built on existing free and open source software frameworks: Apache Hadoop for its MapReduce and distributed filesystem implementations, the Apache Mahout libraries for mathematical support and algorithm benchmarking, and Apache Hive for data warehousing and post-processing data manipulation.

Specifically, we parallelized the exemplar-based clustering algorithm Hierarchical Affinity Propagation using a MapReduce framework. Preliminary results from running the distributed algorithm on Amazon's Elastic MapReduce (EMR) service indicate superior performance compared to competing algorithms from the Apache Mahout libraries.
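
As a rough illustration of why this kind of algorithm fits MapReduce, the sketch below phrases one responsibility update of plain (single-layer) Affinity Propagation as a map, shuffle, and reduce pass. It is not the project's code: it runs in-process in Python rather than on Hadoop, it ignores the hierarchical layers and damping, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `responsibility_step`) are hypothetical. The update itself is the standard one, r(i, k) = s(i, k) - max over k' != k of [a(i, k') + s(i, k')].

```python
from collections import defaultdict


def map_phase(similarity, availability):
    """Emit one (row, (column, s, a + s)) record per matrix entry."""
    for i, row in enumerate(similarity):
        for k, s in enumerate(row):
            yield i, (k, s, availability[i][k] + s)


def shuffle(records):
    """Group mapped records by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in records:
        groups[key].append(value)
    return groups


def reduce_phase(row, values):
    """Compute r(row, k) for every column k of a single row.

    Assumes each row has at least two entries, so a competing
    candidate k' != k always exists.
    """
    out = {}
    for k, s, _ in values:
        best_other = max(total for j, _, total in values if j != k)
        out[(row, k)] = s - best_other
    return out


def responsibility_step(similarity, availability):
    """Run one map/shuffle/reduce pass and return {(i, k): r(i, k)}."""
    responsibilities = {}
    for row, values in shuffle(map_phase(similarity, availability)).items():
        responsibilities.update(reduce_phase(row, values))
    return responsibilities


# Toy input: three points, negative similarities, preference -2 on the diagonal.
S = [[-2, -1, -4],
     [-1, -2, -1],
     [-4, -1, -2]]
A = [[0, 0, 0] for _ in range(3)]  # availabilities start at zero

r = responsibility_step(S, A)
# Point 0 favours point 1 as its exemplar: r[(0, 1)] is 1, r[(0, 2)] is -3.
```

Each row's update depends only on that row of the similarity and availability matrices, which is what lets the reduce calls run independently. The availability update has the same shape, but grouped by column instead of by row.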


Conference Paper: pdf (1M)
Conference Slides: pptx (27M)

Source Package: tar.gz (4.5M)
Git Repository: GitHub
API: JavaDoc

Demo: available
Project Video: YouTube
Project Poster: pdf (3M)
Technical Report: pdf (2M)