This is an intermediate-level workshop that teaches best practices for using Python and R together within a single research project. I walk participants through the steps of a mini project involving data analysis and visualization, highlighting the use of free and open source tools and the challenges of collaborating on research code. The workshop uses economic, demographic and geographic data characterizing US communities, freely and publicly available from the US Census Bureau and USDA Economic Research Service websites. Prior experience with R or Python is recommended.
Workshop topics:
- introduction and challenges of multi-language collaborative projects
- setting up a portable computational environment with `conda` and `renv`
- data retrieval and preparation in Python
- using Python from R with `reticulate`
- basic regression analysis in R
- using R from Python with `rpy2`
- interactive presentation of results with Python in a Jupyter notebook
This workshop will be conducted by Anton Babkin at the 2024 Data Science Research Bazaar, University of Wisconsin-Madison, February 7, 2024.
To follow along with the workshop, you need the following stack of tools installed on your computer.
- Install `conda` + `mamba`. Instructions.

- Clone this repository to your computer. Open RStudio and select File -> New Project... -> Version Control -> Git, enter this repository URL:

  ```
  https://github.com/antonbabkin/workshop-pythonr
  ```

  You can also clone using your preferred Git tool and then open project in RStudio. Ignore the error message about 'renv/activate.R', it will be fixed after the next step.

- Install renv and required R packages. Packages will be installed according to specification listed in the `DESCRIPTION` file. Make sure you have the `workshop-pythonr` project open in RStudio and run the following commands in Console:

  ```r
  # install renv
  install.packages("renv")
  # initialize renv environment, choose option "1: Use only the DESCRIPTION file."
  renv::init(bare = TRUE)
  # install packages listed in DESCRIPTION file
  renv::install()
  ```

- Install required Python packages into a new conda environment. Using `mamba` for this will typically work better. In a terminal, navigate to the repository folder and create the environment specified in the `environment.yml` file.

  ```shell
  cd workshop-pythonr
  mamba env create -f environment.yml
  ```

- Activate conda environment and start Jupyter Lab to work with Python notebooks.

  ```shell
  conda activate workshop-pythonr
  jupyter lab
  ```
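Before the workshop, it can help to confirm that the tools from the setup steps are reachable from your terminal. The sketch below is not part of the workshop repository. It is a small standalone check that uses only the Python standard library, and the list of command names (`conda`, `mamba`, `git`, `Rscript`, `jupyter`) is an assumption based on the steps above. Note that `jupyter` will normally be found only after the `workshop-pythonr` conda environment is activated, and on some systems R is installed without `Rscript` being on the `PATH`.

```python
import shutil

# Command names assumed from the setup steps; adjust to match your installation.
REQUIRED_TOOLS = ["conda", "mamba", "git", "Rscript", "jupyter"]


def find_tools(names):
    """Map each command name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


def report(found):
    """Build one status line per tool: the resolved path or NOT FOUND."""
    lines = []
    for name, path in found.items():
        status = path if path else "NOT FOUND"
        lines.append(f"{name:<10} {status}")
    return lines


if __name__ == "__main__":
    for line in report(find_tools(REQUIRED_TOOLS)):
        print(line)
```

A `NOT FOUND` entry does not necessarily mean the tool is missing. It may only mean that the tool is not on the `PATH` of the shell you ran the check from.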
Project code is licensed under the MIT license. All other content is licensed under the Creative Commons Attribution 4.0 International license.