This lesson is being piloted (Beta version)
If you teach this lesson, please tell the authors and provide feedback by opening an issue in the source repository

Intermediate Research Software Development: Setup

You will need the following software installed and working correctly on your system to be able to follow the course.

Common Issues & Tips

If you are having issues installing or running some of the tools below, check the list of common issues that other course participants have encountered, along with some useful tips for using the tools and working through the material.

Command Line/Git Bash Tool

You will need a command line tool (shell/console) in order to run Python scripts and version control your code with Git.

To test your command line tool, start it up and type:

$ date

If your command line program is working, it should return the current date and time, similar to:

Wed 21 Apr 2021 11:38:19 BST

Git Version Control Tool

Git is a program that can be accessed from your command line tool.

To test your Git installation, start your command line tool and type:

$ git help

If your Git installation is working you should see something like:

usage: git [--version] [--help] [-C <path>] [-c name=value]
           [--exec-path[=<path>]] [--html-path] [--man-path] [--info-path]
           [-p | --paginate | --no-pager] [--no-replace-objects] [--bare]
           [--git-dir=<path>] [--work-tree=<path>] [--namespace=<name>]
           <command> [<args>]

These are common Git commands used in various situations:

start a working area (see also: git help tutorial)
   clone      Clone a repository into a new directory
   init       Create an empty Git repository or reinitialize an existing one

work on the current change (see also: git help everyday)
   add        Add file contents to the index
   mv         Move or rename a file, a directory, or a symlink
   reset      Reset current HEAD to the specified state
   rm         Remove files from the working tree and from the index

examine the history and state (see also: git help revisions)
   bisect     Use binary search to find the commit that introduced a bug
   grep       Print lines matching a pattern
   log        Show commit logs
   show       Show various types of objects
   status     Show the working tree status

grow, mark and tweak your common history
   branch     List, create, or delete branches
   checkout   Switch branches or restore working tree files
   commit     Record changes to the repository
   diff       Show changes between commits, commit and working tree, etc
   merge      Join two or more development histories together
   rebase     Reapply commits on top of another base tip
   tag        Create, list, delete or verify a tag object signed with GPG

collaborate (see also: git help workflows)
   fetch      Download objects and refs from another repository
   pull       Fetch from and integrate with another repository or a local branch
   push       Update remote refs along with associated objects

'git help -a' and 'git help -g' list available subcommands and some
concept guides. See 'git help <command>' or 'git help <concept>'
to read about a specific subcommand or concept.
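As a quicker check than reading the full help output, the following minimal sketch reports whether Git is on your PATH and which version is installed. The function name check_git is made up for this illustration and is not part of Git or the course material.

```shell
# check_git is a hypothetical helper name used only in this sketch.
check_git() {
    if command -v git >/dev/null 2>&1; then
        # Prints a single line starting with "git version".
        git --version
    else
        echo "git was not found on your PATH"
    fi
}

check_git
```

If the second message appears, Git is either not installed or your command line tool cannot find it.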

When you use Git on a machine for the first time, you need to configure a few things: your name, your email address and your preferred text editor. This can be done from the command line as follows:

$ git config --global user.name "Your Name"
$ git config --global user.email "name@example.com"
$ git config --global core.editor "nano -w"

Make sure to use the same email address you used to open an account on GitHub that you will use for this course (see below for GitHub setup instructions).
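To confirm that the settings were stored, you can read them back. The sketch below prints each of the three values configured above, or flags the ones that are missing; the function name show_git_settings is an assumption made for this example.

```shell
# show_git_settings is a hypothetical helper name used only in this sketch.
show_git_settings() {
    for key in user.name user.email core.editor; do
        # "git config --global --get" exits with a non-zero status when the key is unset.
        if value=$(git config --global --get "$key" 2>/dev/null); then
            echo "$key = $value"
        else
            echo "$key is not set"
        fi
    done
}

show_git_settings
```

Running git config --global --list shows every global setting at once, if you prefer a single command.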

Proxy Settings for Git

When you run Git commands from the command line/Git Bash, your computer connects to the Internet. For security reasons, AstraZeneca computers block all Internet connections that do not go through the AstraZeneca proxy server. However, on the AstraZeneca wired network and the AZ-Corporate wifi, connections from the command line/Git Bash are not automatically routed through the proxy. This often results in “Time out” error messages: the command line/Git Bash keeps trying to connect to the Internet and is blocked by the AstraZeneca network until it reaches the time limit and fails with the error.

To allow your command line/Git Bash to connect to the Internet when on these networks, you need to specify that the Git commands should use the AstraZeneca proxy, by amending the .gitconfig file in your home directory (e.g. C:\Users\abcd057\.gitconfig on Windows, /Users/abcd057/.gitconfig on Mac or /home/abcd057/.gitconfig on Linux). If the .gitconfig file does not exist on your system, you can create it yourself. You will need to append the following lines to it:

[alias]
unproxy = config --global --remove-section http
unproxy2 = config --global --remove-section https
proxy = config --global http.proxy <proxy-server>:<proxy:port>
proxy2 = config --global https.proxy <proxy-server>:<proxy:port>

You can obtain your <proxy-server>:<proxy:port> settings from the AstraZeneca workshop organisers ahead of the workshop, from the trainers at your workshop or, if they are not available, from AstraZeneca IT support.

If you are on the AstraZeneca wired network or the AZ-Corporate wifi, you can turn on the proxy by issuing the following command from your command line/Git Bash:

$ git proxy

If you are on the AZ-Guest wifi or your home wifi, turn off the proxy by issuing the following command from your command line/Git Bash:

$ git unproxy
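If you lose track of whether the proxy is currently on or off, you can ask Git directly. The sketch below reads the http.proxy setting that the proxy alias above writes; the function name git_proxy_status is made up for this illustration.

```shell
# git_proxy_status is a hypothetical helper name used only in this sketch.
git_proxy_status() {
    # http.proxy is the setting written by the "proxy" alias; --get fails if it is unset.
    if proxy=$(git config --global --get http.proxy 2>/dev/null); then
        echo "Git proxy is ON: $proxy"
    else
        echo "Git proxy is OFF"
    fi
}

git_proxy_status
```

This only reports what is stored in your global Git configuration; it does not test whether the proxy server itself is reachable.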

GitHub Account

GitHub is a free, online host for Git repositories that you will use during the course to store your code. You will need to open a free GitHub account if you don’t already have one.

Secure Access To GitHub Using Git From Command Line

In order to access GitHub using Git from your machine securely, you need to set up a way of authenticating yourself with GitHub through Git. For the purposes of this training, by default and to avoid any issues, you should create and use a “classic” GitHub personal access token by following these instructions. When creating the token, give it a memorable name, and ensure it is set with the following:

What About Using my GitHub Password?

GitHub no longer accepts account passwords for authenticating Git operations: on 13 August 2021, GitHub strengthened security requirements for all authenticated Git operations. For this reason, and for expediency in delivering this training, we recommend by default that you use a personal access token to authenticate yourself to GitHub from the command line (e.g. when you want to push your local changes to your code repository on GitHub).

Sharing Sensitive Information on GitHub

Warning

As part of this course we are not working with or sharing any sensitive data. However, you should still be careful not to upload any AstraZeneca-related work to a public repository outside the AstraZeneca Enterprise GitHub. Make sure to choose a working directory for this course outside of any AstraZeneca work, and follow AstraZeneca guidelines on using GitHub.

Testing your Git and Proxy Set Up

Once you have created a personal GitHub account (if you don’t already have one), installed and configured your Git installation, and configured your proxy settings as above, to save time at the workshop please do the following to verify that you have done this correctly:

  1. Using your personal GitHub account, create a public repository on GitHub.
  2. If you are on an AstraZeneca Windows computer, from Git Bash navigate to your user directory (otherwise you may not have permission to write files):
    $ cd
    
  3. Clone the new repository on your local machine, e.g.:
    $ git clone <github_repository_url>
    
  4. Create a new text file named README.txt in the root directory of the cloned repository, containing anything you like.
  5. Add and commit the new file to the repository, and push the change to GitHub, e.g.

    $ git add README.txt
    $ git commit -m "Initial readme file commit"
    $ git push -u origin main
    
  6. Copy and paste the URL of the GitHub repository, along with your name, into the shared Google Document.
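If you would like to rehearse steps 3 to 5 before touching GitHub, the sequence can be tried entirely offline. In the sketch below a local bare repository stands in for the GitHub repository, so no network, proxy or token is involved; the directory names and the README contents are arbitrary choices for this example, and the name and email are the placeholder values used earlier on this page.

```shell
# Offline rehearsal of steps 3-5: a local bare repository plays the role of GitHub.
set -e
sandbox=$(mktemp -d)
remote="$sandbox/remote.git"    # stands in for <github_repository_url>
work="$sandbox/work"            # the local clone

git init --quiet --bare "$remote"
# Cloning an empty repository prints a warning on stderr, which is hidden here.
git clone --quiet "$remote" "$work" 2>/dev/null

echo "Test file for the course setup" > "$work/README.txt"
git -C "$work" add README.txt
# -c supplies an identity for this one command only, so the sketch neither
# depends on nor changes your global configuration.
git -C "$work" -c user.name="Your Name" -c user.email="name@example.com" \
    commit --quiet -m "Initial readme file commit"
git -C "$work" branch -M main   # make sure the branch is called main
git -C "$work" push --quiet -u origin main

# The commit should now be listed on the stand-in remote.
git --git-dir="$remote" log --oneline main
```

If this works but the same steps against GitHub fail, the problem is more likely to lie with your proxy settings or your personal access token than with your Git installation.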

Python Distribution

The material has been developed using the standard Python distribution version 3.8 and uses venv for virtual environments and pip for package management. The material has not been extensively tested with other Python distributions and package managers, but most sections are expected to work with some modifications. For example, package installation and virtual environments would need to be managed differently, but Python script invocations should remain the same regardless of the Python distribution used.

To download a Python distribution for your operating system, please head to Python.org.

For AstraZeneca-managed computers, you can obtain Python 3.9.7 from the AstraZeneca Software Store. Please make sure not to use Anaconda as it is not free for commercial use.

We recommend using Python 3.8 or later, but any supported version should work (i.e. version 3.7 onward). In particular, we recommend upgrading from Python 2.7 wherever possible; continuing to use it will likely result in syntax errors or difficulty finding supported dependencies.

You can test your Python installation from the command line with:

$ python3 --version

If all is well with your installation, you should see something like:

Python 3.8.2
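Rather than comparing version numbers by eye, you can let Python do the comparison. This is a minimal sketch that checks the python3 on your PATH against the recommended minimum of 3.8; the variable name python_check is an assumption made for this example.

```shell
# python3 -c exits with status 0 only if the interpreter is version 3.8 or later.
if python3 -c 'import sys; sys.exit(0 if sys.version_info >= (3, 8) else 1)' 2>/dev/null; then
    python_check="OK: $(python3 --version) meets the recommended minimum of 3.8"
else
    python_check="Python 3.8 or later was not found as python3"
fi
echo "$python_check"
```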

To make sure you are using the standard Python distribution and not some other distribution you may have on your system, type the following in your shell:

$ python3

This should enter you into a Python console and you should see something like:

Python 3.8.2 (default, Jun  8 2021, 11:59:35) 
[Clang 12.0.5 (clang-1205.0.22.11)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> 

Press CONTROL-D or type exit() to exit the Python console.
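The same information can be obtained without entering the console. The sketch below shows which python3 your shell picks up and where that interpreter is installed; the install path is often a good hint as to which distribution you are running. The variable name python_path is made up for this example.

```shell
# Find out which python3 the shell resolves to, without starting the console.
python_path=$(command -v python3 || echo "not found")
echo "python3 resolves to: $python_path"

if [ "$python_path" != "not found" ]; then
    # sys.executable is the interpreter's own location; sys.version holds the
    # version and build details that also appear in the console banner.
    python3 -c 'import sys; print(sys.executable); print(sys.version)'
fi
```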

venv and pip

If you are using a Python 3 distribution from Python.org, venv and pip will be automatically installed for you. If not, please make sure you have these two tools (that correspond to your Python distribution) installed on your machine.
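One way to check that both tools belong to the python3 you tested above is to launch them as modules of that interpreter. This is a sketch under the assumption that python3 is the command you will use in the course; the function name check_python_tools is hypothetical.

```shell
# check_python_tools is a hypothetical helper name used only in this sketch.
check_python_tools() {
    for module in venv pip; do
        # "python3 -m <module>" runs the copy that belongs to this interpreter.
        if python3 -m "$module" --help >/dev/null 2>&1; then
            echo "$module: available"
        else
            echo "$module: not available for python3"
        fi
    done
}

check_python_tools
```

Note that this only confirms that the two modules can be launched; it does not create a virtual environment or install any packages.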

PyCharm IDE

We use JetBrains’s PyCharm Python Integrated Development Environment for the course. PyCharm can be downloaded from the JetBrains website. The Community edition is fine, though if you are developing software for the purpose of academic research you may be eligible for a free license for the Professional edition which contains extra features.

For AstraZeneca-managed computers, PyCharm Community Edition is available from the AZ Software Store.