- Stakeholder: Instituto Politecnico de Portalegre
- Business Case: Researchers at the Instituto Politecnico de Portalegre created this research challenge to classify students’ academic success, with the goal of reducing the rate of student academic failure in higher education.
Project Details
The aim is to create a dependable model using machine learning techniques to spot students who may be at risk of dropping out of higher education after enrolling. By identifying these students early, we can implement strategies to support them and reduce their likelihood of leaving college.
The model was trained and tested on higher education student data from Portugal, but the problem is relevant to colleges and universities in the United States as well. U.S. institutions also face the challenge of identifying at-risk students and providing interventions to help them graduate on time and succeed in their studies.
Project Requirements
Data Collection and Cleaning:
- Gather comprehensive data from reliable sources: the UCI Machine Learning Repository (University of California, Irvine) and publicly available datasets.
- Implement the OSEMN process (Obtain, Scrub, Explore, Model, iNterpret) to ensure accuracy, consistency, and integrity of the collected data.
- Develop clear documentation outlining the data collection methods and cleaning techniques employed.
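As a rough illustration of the "Scrub" step, the sketch below drops rows with missing values, rows whose label is not one of the three expected classes, and exact duplicates. It is a minimal example under stated assumptions, not the project's actual cleaning code: the column names (`age`, `admission_grade`, `Target`), the semicolon delimiter, and the sample rows are all hypothetical.

```python
import csv
import io

# The three classes named in this README.
VALID_TARGETS = {"Dropout", "Enrolled", "Graduate"}


def scrub(rows, target_col="Target"):
    """Drop rows with missing values, unknown targets, or exact duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        # Normalize whitespace around every value (None becomes "").
        row = {key: (value or "").strip() for key, value in row.items()}
        if any(value == "" for value in row.values()):
            continue  # at least one missing value
        if row[target_col] not in VALID_TARGETS:
            continue  # label outside the three expected classes
        fingerprint = tuple(sorted(row.items()))
        if fingerprint in seen:
            continue  # exact duplicate of an earlier row
        seen.add(fingerprint)
        cleaned.append(row)
    return cleaned


# Tiny made-up sample standing in for the real data file (assumption).
SAMPLE = """age;admission_grade;Target
19;142.5;Graduate
19;142.5;Graduate
22;;Dropout
20;130.0;Unknown
21;127.3; Enrolled
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE), delimiter=";"))
cleaned_rows = scrub(rows)
print(len(rows), "rows read,", len(cleaned_rows), "rows kept")
```

Logging how many rows each rule removes is a simple way to feed the cleaning documentation mentioned above.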
Exploratory Data Analysis:
- Perform thorough exploratory data analysis to uncover patterns and trends in the dataset that indicate which students are more likely to drop out of college.
- Apply machine learning models to this ternary classification problem.
- Create insightful visualizations, statistical summaries, and interactive charts to communicate key findings effectively.
- Identify feature importances and refine the best-performing model with hyperparameter tuning.
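One model-agnostic way to estimate feature importances is permutation importance: shuffle one feature column at a time and measure how much accuracy drops. This README does not say which importance method the project uses, so the sketch below is only an illustration. The `toy_model` rule, the two features, and the sample rows are hypothetical.

```python
import random


def accuracy(model, X, y):
    """Fraction of rows for which the model predicts the true label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when a single feature column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            # Rebuild each row with only this column replaced.
            X_perm = [row[:col] + [value] + row[col + 1:]
                      for row, value in zip(X, shuffled)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


def toy_model(row):
    """Hypothetical rule that looks only at feature 0 (units passed)."""
    units_passed = row[0]
    if units_passed < 2:
        return "Dropout"
    if units_passed < 5:
        return "Enrolled"
    return "Graduate"


# Made-up rows: [units_passed, age]. Rows must be lists for the slicing above.
X = [[0, 18], [1, 25], [3, 19], [4, 30], [6, 21], [7, 22]]
y = ["Dropout", "Dropout", "Enrolled", "Enrolled", "Graduate", "Graduate"]

importances = permutation_importance(toy_model, X, y)
print(importances)
```

Because `toy_model` ignores the second feature, its importance is exactly zero, while shuffling the first feature lowers accuracy. On real models the scores should be computed on held-out data.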
Iterative Approach to Modeling:
- Utilize advanced statistical techniques, such as regression analysis and machine learning algorithms, to identify significant predictors of graduation rates.
- Develop at least 3 models to classify students as Enrolled, Graduate or Dropout.
- Document the methodology and rationale behind the evaluation metrics for each of the models employed.
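When the three classes are not equally frequent, accuracy alone can hide poor performance on a minority class, so per-class precision and recall and a macro-averaged F1 are common choices. The sketch below shows how these could be computed by hand for the Enrolled, Graduate, and Dropout labels. The predictions are made up for illustration, and the project's actual metrics may differ.

```python
LABELS = ("Dropout", "Enrolled", "Graduate")


def per_class_scores(y_true, y_pred, labels=LABELS):
    """Precision, recall, and F1 for each class, treated one-vs-rest."""
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores[label] = {"precision": precision, "recall": recall, "f1": f1}
    return scores


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_scores(y_true, y_pred, labels)
    return sum(s["f1"] for s in scores.values()) / len(labels)


# Made-up labels and predictions for illustration only.
y_true = ["Dropout", "Dropout", "Enrolled", "Enrolled", "Graduate", "Graduate"]
y_pred = ["Dropout", "Enrolled", "Enrolled", "Enrolled", "Graduate", "Graduate"]

class_scores = per_class_scores(y_true, y_pred)
print(class_scores["Dropout"])
print(round(macro_f1(y_true, y_pred), 4))
```

Here one Dropout student is misclassified as Enrolled, which lowers Dropout recall to 0.5 even though five of six predictions are correct. Reporting the same metrics for every candidate model keeps the comparison consistent.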
Recommendations and Policy Implications:
- Based on the analysis results, provide actionable recommendations for policymakers, educators, and stakeholders to minimize academic failure rates and promote educational equity.
- Propose strategies to address academic failure rates, enhance school resources, and implement evidence-based interventions.
- Clearly articulate the policy implications of the analysis and potential impact on graduation outcomes.
Documentation and Codebase:
- Provide comprehensive documentation that includes clear explanations of the methodology, data sources, and analytical techniques used in the analysis.
- Ensure the codebase is well-documented and organized, facilitating ease of understanding, replication, and further development by other users.
- Follow best practices for code readability, efficiency, and maintainability.
Reproducibility and Open Access:
- Structure the repository in a way that allows others to easily replicate the analysis and verify the results.
- Provide clear instructions on how to obtain and preprocess the necessary data for the analysis.
- Ensure the repository and its contents are publicly accessible, promoting open access to the analysis, data, and code.
Collaboration and Feedback:
- Encourage collaboration by welcoming contributions from the open-source community, such as bug fixes, enhancements, and additional analyses.
- Provide guidelines and instructions for contributing to the project, ensuring a streamlined collaborative process.
- Actively engage with users, address inquiries, and consider feedback to improve the repository and its analysis.
Ethical Considerations:
- Respect privacy regulations and data protection policies while collecting and analyzing sensitive information.
- Safeguard the anonymity of individuals and schools involved in the dataset.
- Clearly communicate any limitations or ethical implications associated with the analysis.
By adhering to these project requirements, the “Student Academic Success” repository will provide a reliable, comprehensive, and accessible resource for researchers, educators, and policymakers interested in understanding and improving graduation outcomes for students in higher education.