Wednesday, May 1, 2019

GitHub Repo setup

Problem: We needed a handy way to create GitHub repositories, enforce some naming restrictions and add appropriate team access. So, I automated this with Python 3.6+.

Along the way, I pulled some important tips from Bill Hunt's Gist addressing the same need.

I will discuss my code below, in sections, but look for the full code at the end of this post. The first section holds the comments on the requirements that must be satisfied for this script to do its job. Note that the shebang line targets Python 3.6. You should really be on 3.7 by now, however!


#!/usr/bin/env python3.6

# Creates a new GitHub repo and adds a team owner

# Setting up to run this:
#   1) Install python 3.6 (brew install python). Ensure
#      the shebang line of this script works for your
#      system.
#   2) pip3 install PyGithub (maybe pip or pip3.6 - whatever 
#      works for the python3.6 or 3.7 install)
#   3) Create a GitHub personal access token at 
#      https://github.com. Click your user picture in the
#      upper right, choose settings, then Developer settings
#      and Personal access tokens. Create a new token.
#      Scopes only need to include repo plus the read:org
#      scope under the admin:org heading.
#   4) Insert that personal access token (PAT) in the git
#      global configuration file with:
#        git config --global user.pat <your PAT>
#   5) Also, ensure you have ssh set up to be used for
#      GitHub access
#   6) Edit this script to change the org_name and team_id
#      constants to match your needs

Next come the import statements. Most refer to Python standard libraries. However, there are two others: requests and PyGithub.

import os
import sys
import argparse
import getpass
import subprocess
import re
import json
import datetime
import zipfile
from urllib.parse import urlparse

import requests
from github import Github

Some constants are defined. You need to edit org_name and team_id to match your GitHub configuration.

## Constants
org_name = 'dummy_org'
base_uri = f'https://git@github.com/{org_name}/'
repos_uri = f'https://api.github.com/orgs/{org_name}/repos'
teams_uri = f'https://api.github.com/orgs/{org_name}/teams'
# Find team ID via API call:
#   curl -H "Authorization: token <your PAT>"  \
#         https://api.github.com/orgs/{org-name}/teams
team_id = 12345678
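
The team ID lookup described in the comment above can also be done from Python once the teams JSON is in hand. The following is a minimal sketch and not part of the original script: `find_team_id` and the sample data are hypothetical, and it assumes the teams endpoint returns a JSON list of objects carrying `id` and `slug` fields.

```python
import json

def find_team_id(teams, slug):
    """Return the numeric id of the team whose slug matches, or None."""
    for team in teams:
        if team.get('slug') == slug:
            return team.get('id')
    return None

# Hypothetical sample shaped like the JSON the teams endpoint returns
sample = json.loads('[{"id": 12345678, "name": "Dev Ops", "slug": "dev-ops"}]')
print(find_team_id(sample, 'dev-ops'))
```

Matching on the slug rather than the display name avoids surprises with spaces and capitalization.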

Next comes the code that gets the list of existing repos, which lets us ensure that the new repo name does not conflict with an existing one. First, the getLastPage() function parses the Link header returned by the GitHub API call that lists repos and, from it, determines how many pages of repos exist. The getRepos() function then loops over that many pages.

# Used to determine how many pages there are on the repo list.
# Returns 0 when the header has no rel="last" link.
def getLastPage(link_header):
  links = link_header.split(',')
  for link in links:
    link = link.split(';')
    if len(link) > 1 and link[1].strip() == 'rel="last"':
      parsed_url = urlparse(link[0].strip(' <>'))
      link_data = parsed_url.query.split('&')
      for ld in link_data:
        if ld.startswith('page='):
          return int( ld.split('=')[1] )
  return 0

# Make a list of all repos in organization
def getRepos(payload):
  print('getting list of all repos, please wait...')
  repos = list()
  head = requests.get(repos_uri, params = payload)
  # No Link header means all repos fit on a single page
  last_page = getLastPage(head.headers.get('link', '')) or 1
  for page in range(1, last_page+1):
    repo_payload = payload.copy()
    repo_payload['page'] = page
    resp = requests.get(repos_uri,
                params = repo_payload, headers = {})
    repos_raw = json.loads(resp.text)
    new_repos = [ r['name'] for r in repos_raw ]
    repos += new_repos
  return repos
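
To make the Link header parsing easier to follow in isolation, here is a small standard-library sketch. It is not the script's own function: `last_page_from_link` and the sample header are hypothetical, and the sketch assumes a missing header or missing rel="last" entry means there is only one page.

```python
from urllib.parse import urlparse, parse_qs

def last_page_from_link(link_header):
    """Return the page number of the rel="last" link, or 1 if absent."""
    if not link_header:
        return 1
    for part in link_header.split(','):
        pieces = part.split(';')
        if len(pieces) < 2:
            continue
        url = pieces[0].strip(' <>')
        rels = [p.strip() for p in pieces[1:]]
        if 'rel="last"' in rels:
            query = parse_qs(urlparse(url).query)
            return int(query.get('page', ['1'])[0])
    return 1

# Hypothetical header in the shape used for paginated results
header = ('<https://api.github.com/orgs/dummy_org/repos?page=2>; rel="next", '
          '<https://api.github.com/orgs/dummy_org/repos?page=7>; rel="last"')
print(last_page_from_link(header))  # prints 7
```

If requests is already in use, its `Response.links` attribute exposes the same header as a parsed dictionary, which may be simpler than splitting strings by hand.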

Now, executed in the main body, we retrieve the GitHub personal access token that was stored in the Git global configuration (see the opening comments on required setup). The token is read by running a git command in a subprocess. From that token, we build a payload of request parameters for the GitHub API calls that follow.

# Get the github personal access token from the git
# global config
proc = subprocess.Popen(["git","config","--global",
                         "--get","user.pat"],
                        stdout=subprocess.PIPE,
                        stderr=subprocess.STDOUT)
access_token = proc.stdout.read().strip().decode()
payload = {'access_token': access_token}
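
The curl example in the constants section sends the token in an Authorization header, and the same approach can be used from Python instead of a query parameter. The sketch below is an illustration under assumptions, not the original script: `read_pat_from_git_config` and `auth_headers` are hypothetical helper names, and a missing git config key is assumed to mean "no token".

```python
import subprocess

def read_pat_from_git_config(key='user.pat'):
    """Return the value stored under `key` in the global git config, or ''."""
    result = subprocess.run(
        ['git', 'config', '--global', '--get', key],
        stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    # git exits non-zero when the key is missing; treat that as "no token"
    if result.returncode != 0:
        return ''
    return result.stdout.decode().strip()

def auth_headers(token):
    """Build request headers carrying the token instead of a query parameter."""
    return {'Authorization': f'token {token}'}

print(auth_headers('dummy-token'))
```

Keeping stderr separate from stdout means a git error message cannot be mistaken for the token. The headers would then be passed as `headers=auth_headers(token)` to each requests call.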

Next, build the list of existing repo names and parse the command-line argument, which is the name of the new repo. If that name is already taken, the script exits.

# Build a list of repo names so we can check for a collision
# repos come in a paged form, so have to loop over them
repos = getRepos(payload)

# Parse command line
parser = argparse.ArgumentParser()
parser.add_argument('repo_name', help='desired repo name')
args = parser.parse_args()

# See if the repo already exists
if args.repo_name in repos:
  print(f'Repository {args.repo_name} already exists!')
  sys.exit(1)
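
The naming restrictions mentioned at the top of this post could be enforced at the same point as the collision check. This is a sketch only: the pattern below is a hypothetical convention (lowercase letters, digits and hyphens), not a rule from the original script, and the case-insensitive comparison is an assumption about how name collisions should be treated.

```python
import argparse
import re

# Hypothetical convention: starts with a letter, then lowercase letters, digits, hyphens
NAME_PATTERN = re.compile(r'^[a-z][a-z0-9-]{0,99}$')

def repo_name_type(value):
    """argparse type callback that rejects names outside the convention."""
    if not NAME_PATTERN.match(value):
        raise argparse.ArgumentTypeError(
            f'{value!r} does not match {NAME_PATTERN.pattern}')
    return value

def parse_args(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument('repo_name', type=repo_name_type,
                        help='desired repo name')
    return parser.parse_args(argv)

def name_is_taken(name, existing):
    """Case-insensitive collision check against existing repo names."""
    return name.lower() in {r.lower() for r in existing}

args = parse_args(['my-new-repo'])
print(args.repo_name, name_is_taken(args.repo_name, ['My-New-Repo', 'other']))
```

Validating inside the argparse `type` callback means a bad name is reported with the usual usage message before any network call is made.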

Next, we actually create the repo via the PyGithub package. This is the start of a large try block.
 
try:
  # create the repo
  g = Github(access_token)
  my_org = g.get_organization(org_name)

  repo_name = args.repo_name
  repo_description = f'repo for {repo_name}'
  new_repo = my_org.create_repo(repo_name,
                               description = repo_description,
                               private = True,
                               auto_init = False)

Now we add team permission to this new repo.

  # add a team to the repo with admin permissions
  teamrepo_uri = f'https://api.github.com/teams/{team_id}' +  \
                 f'/repos/{org_name}/{repo_name}'
  team_payload = payload.copy()
  # choose the permission schema you want for this team
  team_payload['permission'] = 'admin'
  repo_r = requests.put(teamrepo_uri, params = payload,
                        data = json.dumps(team_payload))
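
The pieces of the team-permission request can be assembled and inspected separately before anything is sent. The sketch below is hypothetical (`build_team_repo_request` is not in the original script) and assumes the token travels in an Authorization header and the JSON body carries only the `permission` field.

```python
import json

def build_team_repo_request(team_id, org_name, repo_name, permission, token):
    """Return (url, headers, body) for granting a team access to a repo."""
    url = (f'https://api.github.com/teams/{team_id}'
           f'/repos/{org_name}/{repo_name}')
    headers = {'Authorization': f'token {token}',
               'Content-Type': 'application/json'}
    body = json.dumps({'permission': permission})
    return url, headers, body

url, headers, body = build_team_repo_request(
    12345678, 'dummy_org', 'my-new-repo', 'admin', 'dummy-token')
print(url)
print(body)
# With requests installed, the call could then be sent with:
#   resp = requests.put(url, headers=headers, data=body)
#   resp.raise_for_status()
```

Calling `raise_for_status()` on the response is worth considering, since a rejected PUT does not raise an exception on its own and would otherwise pass silently.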

Next, handle any exceptions. This is weak error handling, because it does not target specific exceptions: it catches anything raised, prints a message, and exits with status 1. It is good enough to run for now, but should be refined to handle whichever specific exceptions actually occur. So far, none have been encountered!

except Exception:
   # Adjacent string literals are joined before % is applied
   print("Exception type: %s, Exception arg: "
         "%s\nException Traceback:\n%s" %
         (sys.exc_info()[0], sys.exc_info()[1], sys.exc_info()[2]))
   print('\nError in creating or configuring repo.',
         'Check it out and try again')
   sys.exit(1)
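
One way to move toward handling specific exceptions is sketched below. It is an illustration, not the script's code: `describe_exception` and `run_step` are hypothetical names, and `ValueError` and `OSError` merely stand in for the errors you actually expect (for this script, likely candidates are PyGithub's `GithubException` and requests' `RequestException`).

```python
import traceback

def describe_exception(exc):
    """Format an exception's type, message and traceback as one string."""
    tb = ''.join(traceback.format_exception(type(exc), exc, exc.__traceback__))
    return f'Exception type: {type(exc).__name__}, Exception arg: {exc}\n{tb}'

def run_step(step):
    """Run a callable; return (True, '') or (False, report) on a known failure."""
    try:
        step()
    except (ValueError, OSError) as exc:  # stand-ins for the errors you expect
        return False, describe_exception(exc)
    return True, ''

def failing_step():
    raise ValueError('repo name rejected')

ok, report = run_step(failing_step)
print(ok)
print(report)
```

Formatting with the traceback module gives a readable stack trace, whereas printing the raw traceback object only shows its repr.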

That's it. So, here is the full code of this script:

#!/usr/bin/env python3.6

# Creates a new GitHub repo and adds a team owner

# Setting up to run this:
#   1) Install python 3.6 (brew install python). Ensure
#      the shebang line of this script works for your
#      system.
#   2) pip3 install PyGithub (maybe pip or pip3.6 - whatever 
#      works for the python3.6 or 3.7 install)
#   3) Create a GitHub personal access token at 
#      https://github.com. Click your user picture in the
#      upper right, choose settings, then Developer settings
#      and Personal access tokens. Create a new token.
#      Scopes only need to include repo plus the read:org
#      scope under the admin:org heading.
#   4) Insert that personal access token (PAT) in the git
#      global configuration file with:
#        git config --global user.pat <your PAT>
#   5) Also, ensure you have ssh set up to be used for
#      GitHub access
#   6) Edit this script to change the org_name and team_id
#      constants to match your needs

import os
import sys
import argparse
import getpass
import subprocess
import re
import json
import datetime
import zipfile
from urllib.parse import urlparse

import requests
from github import Github

## Constants
org_name = 'dummy_org'
base_uri = f'https://git@github.com/{org_name}/'
repos_uri = f'https://api.github.com/orgs/{org_name}/repos'
teams_uri = f'https://api.github.com/orgs/{org_name}/teams'
# Find team ID via API call:
#   curl -H "Authorization: token <your PAT>"  \
#         https://api.github.com/orgs/{org-name}/teams
team_id = 12345678

# Used to determine how many pages there are on the repo list.
# Returns 0 when the header has no rel="last" link.
def getLastPage(link_header):
  links = link_header.split(',')
  for link in links:
    link = link.split(';')
    if len(link) > 1 and link[1].strip() == 'rel="last"':
      parsed_url = urlparse(link[0].strip(' <>'))
      link_data = parsed_url.query.split('&')
      for ld in link_data:
        if ld.startswith('page='):
          return int( ld.split('=')[1] )
  return 0

# Make a list of all repos in organization
def getRepos(payload):
  print('getting list of all repos, please wait...')
  repos = list()
  head = requests.get(repos_uri, params = payload)
  # No Link header means all repos fit on a single page
  last_page = getLastPage(head.headers.get('link', '')) or 1
  for page in range(1, last_page+1):
    repo_payload = payload.copy()
    repo_payload['page'] = page
    resp = requests.get(repos_uri,
                params = repo_payload, headers = {})
    repos_raw = json.loads(resp.text)
    new_repos = [ r['name'] for r in repos_raw ]
    repos += new_repos
  return repos

# Get the github personal access token from the git
# global config
proc = subprocess.Popen(["git","config","--global",
                         "--get","user.pat"],
                        stdout=subprocess.PIPE,
                        stderr=subprocess.STDOUT)
access_token = proc.stdout.read().strip().decode()
payload = {'access_token': access_token}

# Build a list of repo names so we can check for a collision
# repos come in a paged form, so have to loop over them
repos = getRepos(payload)

# Parse command line
parser = argparse.ArgumentParser()
parser.add_argument('repo_name', help='desired repo name')
args = parser.parse_args()

# See if the repo already exists
if args.repo_name in repos:
  print(f'Repository {args.repo_name} already exists!')
  sys.exit(1)

try:
  # create the repo
  g = Github(access_token)
  my_org = g.get_organization(org_name)

  repo_name = args.repo_name
  repo_description = f'repo for {repo_name}'
  new_repo = my_org.create_repo(repo_name,
                               description = repo_description,
                               private = True,
                               auto_init = False)

  # add a team to the repo with admin permissions
  teamrepo_uri = f'https://api.github.com/teams/{team_id}' +  \
                 f'/repos/{org_name}/{repo_name}'
  team_payload = payload.copy()
  # choose the permission schema you want for this team
  team_payload['permission'] = 'admin'
  repo_r = requests.put(teamrepo_uri, params = payload,
                        data = json.dumps(team_payload))

except Exception:
   # Adjacent string literals are joined before % is applied
   print("Exception type: %s, Exception arg: "
         "%s\nException Traceback:\n%s" %
         (sys.exc_info()[0], sys.exc_info()[1], sys.exc_info()[2]))
   print('\nError in creating or configuring repo.',
         'Check it out and try again')
   sys.exit(1)
