class: center, middle, inverse, title-slide

# Research Project Management
### Jinliang Yang
### Jan. 23rd, 2020

---
# Research Project Management

## Challenges in research project management

- Spanning years
  - Transferability
  - Data backup

--
- Many steps and many rounds of revisions
  - Version control
  - Tracing the changes

--
- Disseminating to your co-workers, collaborators, etc.
  - Reproducible
  - Transparent
  - Visualization

---
# Research Project Management

### 1. Employ __git__ for version control

### 2. Construct your own project __directory system__

### 3. Some tips regarding best practice for project management

---
# Project Management

### 1. Employ __git__ for version control

__Git__ is a [free and open source](https://git-scm.com/) distributed __version control system__ designed to handle everything from small to very large projects with speed and efficiency.

--
- GitHub: a git-based repository hosting platform
  - GitHub Education: [student pack](https://education.github.com/students)
- GitLab: another repository manager that lets teams collaborate on code
  - GitLab UNL edition: https://git.unl.edu/

#### Clone the template git repo:

Type `git` in your terminal to list the most commonly used git commands.

```bash
git
```

---
# Project Management

### 1. Employ __git__ for version control

Git [cheat-sheet](https://github.github.com/training-kit/downloads/github-git-cheat-sheet.pdf)

### Synchronize changes

```bash
# Uploads all local branch commits to GitHub
git push
# Downloads all history from the remote branches
git fetch
# Combines remote branch into current local branch
git merge
# Updates your current local working branch with all new commits from the corresponding remote branch
git pull
```

--
### Make changes

```bash
# Snapshots all files in preparation for versioning
git add --all
# Records file snapshots permanently in version history
git commit -m "descriptive message"
```

---
# Project Management

### 1. Employ __git__ for version control

- `git clone`: creates a local copy of a repository, including all commits and branches.
- `fork`: a copy of a repository on GitHub owned by a different user.
- `remote`: a common repository on GitHub that all team members use to exchange their changes.

```bash
### Go to GitHub and fork the repository:
### https://github.com/jyanglab/agro932-lab
### Clone a local version (replace "USER" with your GitHub username)
git clone git@github.com:"USER"/agro932-lab.git
```

---
# Project Management

### 2. Construct your own project __directory system__

In a typical research project, I copy the following folders into the project directory. The layout is based on the idea from [ProjectTemplate](http://projecttemplate.net/architecture.html).

- __cache__: Intermediate datasets generated during the preprocessing steps.
- __data__: Raw data of small size.
  - Note that large data, i.e., > 100 MB, is stored in a `largedata/` folder that is ignored using `.gitignore`.
- __doc__: Documentation code (i.e., Rmd files) for generating the figures.
- __graphs__: Graphs produced during the analysis.
- __lib__: Functions used within this project.
- __profiling__: Contains the main scripts for the project, organized in sub-directories.

---
# Project Management

### 2. Construct your own project __directory system__

- __.gitignore__: specifies intentionally untracked files to ignore.
- __.git/__: git-related files.
- __TODO__: A to-do list, as a markdown file.
- __README__: readme file.

--
- __largedata__: Untracked folder containing large files, e.g., sequencing data.
- __*.Rproj__: RStudio projects make it straightforward to divide your work into multiple contexts, each with their own working directory, workspace, history, and source documents.

---
# Project Management

### 3. Some tips regarding best practice for project management

A __path__ specifies a unique location in a file system.

- An __absolute or full path__ points to the same location in a file system, regardless of the current working directory.
> "/Users/jyang/Documents/courses/AGRO-931-2018"

- A __relative path__ specifies the location of a directory relative to another directory.
> "courses/AGRO-931-2018"

---
# Project Management

### 3. Some tips regarding best practice for project management

I employ a numbering system to sort the research code.

- Scripts are named with a number, a letter, and another number, separated by dots. For example:
  - `1.A.1_pheno_processing.Rmd`
  - `1.A.2_pheno_plot.Rmd`

---
# Project Management

### 3. Some tips regarding best practice for project management

> A commit message shows whether a developer is a good collaborator (to others or to your future self)

Use informative commit messages. Read the following suggestions:

- [How to write a git commit message](https://chris.beams.io/posts/git-commit/)
- [On commit messages](http://who-t.blogspot.com/2009/12/on-commit-messages.html)
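
---
# Project Management

### 3. Some tips regarding best practice for project management

A commit subject line can be checked before committing. The following is a minimal sketch, assuming two of the conventions from the first article linked on the previous slide: a subject of at most 50 characters and no trailing period. The function name `check_subject` is hypothetical and is not part of git.

```bash
#!/usr/bin/env bash
# check_subject: hypothetical helper (not a git command).
# Returns 0 if the subject line is non-empty, at most 50 characters,
# and does not end with a period; returns 1 otherwise.
check_subject() {
  local subject="$1"
  if [ -z "$subject" ]; then
    echo "subject is empty" >&2
    return 1
  fi
  if [ "${#subject}" -gt 50 ]; then
    echo "subject too long: ${#subject} characters" >&2
    return 1
  fi
  case "$subject" in
    *.) echo "subject ends with a period" >&2; return 1 ;;
  esac
  return 0
}

# Example: validate the message, then commit by hand inside a repository.
msg="Add phenotype processing script"
check_subject "$msg" && echo "ok: $msg"
# git commit -m "$msg"
```

The check covers only the form of the subject line; whether the message is informative still depends on the author.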