Objective
Refactor the Jupyter Notebook from the Titanic Kaggle competition into a well-structured repository with proper software engineering practices.
Tasks
- Getting started:
- Code Modularisation:
- Rewrite the code from the Jupyter Notebook as functions.
- Create appropriate Python modules to house these functions.
- Write clear docstrings for all functions.
- Repository Structure:
- Organise the project into a clear directory structure.
- Create an environment.yml file for dependency management.
- Notebook Refactoring:
- Maintain the Jupyter Notebook to present your analysis.
- Replace the code in its cells with calls to functions imported from your own modules, so that implementation details are abstracted away.
- Implement solutions for cells marked with "TODO"; this often means calling a function you have already written.
- Version Control and GitHub:
- Use Git for version control.
- Create separate branches for different features/modules.
- Open pull requests (PRs) for your changes and conduct thorough code reviews.
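As an illustration of the Code Modularisation and Notebook Refactoring tasks above, here is a minimal sketch of one piece of notebook logic moved into a module. The module path (src/features.py), the function name extract_title, and the choice of feature are assumptions made for this example, not requirements of the exercise. The only thing it relies on is that the Name column of the Titanic data is formatted like "Braund, Mr. Owen Harris".

```python
"""Hypothetical module, e.g. src/features.py (path and names are assumptions)."""
import re

# Captures the honorific between the comma and the first period,
# e.g. "Braund, Mr. Owen Harris" -> "Mr".
_TITLE_PATTERN = re.compile(r",\s*([^.,]+)\.")


def extract_title(name: str) -> str:
    """Return the honorific embedded in a Titanic passenger name.

    Parameters
    ----------
    name : str
        Value from the ``Name`` column, expected to look like
        ``"Surname, Title. Given names"``.

    Returns
    -------
    str
        The title without its trailing period, or ``"Unknown"`` when
        the name does not follow the expected format.
    """
    match = _TITLE_PATTERN.search(name)
    return match.group(1).strip() if match else "Unknown"
```

With a module like this in place, the corresponding notebook cell could shrink to an import plus a single call, for example `df["Title"] = df["Name"].map(extract_title)`, assuming the data has been loaded into a pandas DataFrame named df. The notebook then shows what is computed, while the module and its docstring document how.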
Workflow
- Fork and clone the repository.
- Create a new branch for each module or major feature.
- Implement the modularisation and any TODO items.
- Create a pull request for review.
- Address feedback and iterate as necessary.
- Merge approved changes into the main branch.
Remember, the goal is to transform the initial Jupyter Notebook into a well-structured, maintainable, and collaborative Python project while preserving the analysis workflow.