Caching Git Credentials on Remote Hosts


I own several Raspberry Pis. I have recently been tinkering with Ansible to make setting them up easier, and to learn it along the way. One thing I want to set up is an easy Git flow: passwordless authentication, credential caching, and so on. On my daily driver, I use public key authentication with SSH. However, for these machines (remote hosts), I am considering a simpler method. I have used personal access tokens (PATs) before, and I think that using a PAT could be a more straightforward way.

A PAT can be used in conjunction with GitHub CLI. GitHub CLI will be downloaded and installed on the hosts, then the PAT will be used to authenticate to GitHub, and finally GitHub CLI will be configured as the Git credential helper. To accomplish this, an Ansible Playbook will be created.

Ansible

The docs describe Ansible as a tool that “automates the management of remote systems and controls their desired state.” The only requirements for Ansible to control and manage remote systems are that Python is installed and SSH is configured on them. Example uses include setting up a web server or deploying an application. These uses can be expressed as the desired state that remote hosts should reach. A file written in YAML describes this desired state declaratively, and Ansible uses this file to move remote hosts to the desired state.

Ansible runs from a Control Node (local host) to manage and control Managed Nodes (remote hosts). A list of logically organized remote hosts is stored in an Inventory file. A Playbook refers to the hosts and groups in the Inventory to map remote hosts to roles and tasks. A Role describes a behavior to be implemented, or a desired state to be reached. Roles achieve this by using tasks, handlers, variables, plugins, etc. A Task is a combination of an action (a module and its arguments) and logical statements such as loops and conditionals. A Module is a unit of code (or a binary) that Ansible copies to and runs on remote hosts.

Example:

Playbook: Some files should exist on certain hosts.
Role: Files should exist.
Task: Copy a file from a local host to remote host. 
Action: Use copy module with file source and destination as its arguments.
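
A minimal sketch of that example as an actual play might look like the following. The file paths are hypothetical placeholders, not taken from my setup.

```yaml
---
# Playbook: some files should exist on certain hosts.
- hosts: all
  tasks:
    # Task: copy a file from the local host to the remote host.
    - name: Ensure example file exists
      # Action: the copy module with source and destination as its arguments.
      copy:
        src: files/example.conf      # hypothetical path on the local host
        dest: /tmp/example.conf      # hypothetical path on the remote host
        mode: '0644'
```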

Roles, Playbooks, Variables, etc. can be stored separately in their own files.

├── inventory.yml
├── playbook.yml
├── roles
│   ├── gh
│   │   ├── defaults
│   │   │   └── main.yml
│   │   └── tasks
│   │       └── main.yml
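
Assuming the layout above, the pieces could be wired together roughly as follows. The variable shown for the defaults file only illustrates what such a file might hold.

```yaml
# playbook.yml: apply the gh role to every host in the inventory
---
- hosts: all
  roles:
    - gh

# roles/gh/defaults/main.yml would hold default variables, for example:
# github_cli_keyring_file: /usr/share/keyrings/githubcli-archive-keyring.gpg
#
# roles/gh/tasks/main.yml would hold the list of tasks, without the
# surrounding "hosts:" and "tasks:" keys.
```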

However, in this post, we are going to combine everything in a single playbook file to achieve the aforementioned goal. This playbook file will contain variables and an ordered list of tasks.

I attempted to provide an overview of Ansible, though there may be additional details I have missed. Nevertheless, it should be sufficient for following the rest of the post.

Installing GitHub CLI

Here, the steps listed in the GitHub CLI docs are reproduced for Ansible. I am targeting Debian-based machines, but it is straightforward to adapt them to other distributions.

Let's add the GitHub CLI software repository (the package source of GitHub CLI for Apt) to the Apt sources. The variables seen inside double braces are defined in the last section, where the whole Playbook is put together.

The Set machine architecture task runs the dpkg --print-architecture command and registers its result in the dpkg_architecture variable. The set_fact task that follows then overwrites dpkg_architecture with just the command's standard output and stores it as a fact. Facts are variables tied to a remote host that hold information about it, such as its operating system and architecture.

Double braces inside a string tell the templating engine used by Ansible to evaluate the expression they contain and inject its value into the string. The four separate blocks below are tasks, and these tasks perform actions using modules such as get_url and shell. For example, the task using the get_url module invokes it with the arguments url, dest, and mode.

- name: Copy GitHub CLI Archive Keyring
  become: yes
  get_url:
    url: '{{ github_cli_keyring_url }}'
    dest: '{{ github_cli_keyring_file }}'
    mode: '0644'

- name: Set machine architecture
  shell: dpkg --print-architecture
  register: dpkg_architecture

- set_fact:
    dpkg_architecture: '{{ dpkg_architecture.stdout }}'

- name: Add GitHub CLI repository into sources list
  become: yes
  apt_repository:
    repo: 'deb [arch={{ dpkg_architecture }} signed-by={{ github_cli_keyring_file }}] https://cli.github.com/packages stable main'
    state: present
    filename: github-cli
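
To confirm that the fact holds the expected value, a debug task could be placed after these tasks. This is an optional sketch and not part of the final playbook.

```yaml
- name: Show detected architecture
  debug:
    # Prints the value stored by the set_fact task, e.g. the output of
    # dpkg --print-architecture on that host.
    msg: 'Detected architecture: {{ dpkg_architecture }}'
```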

GitHub CLI is now available to install via the apt package manager. Before that, the package lists have to be updated.

- name: Update package list
  become: yes
  ignore_errors: yes
  apt:
    update_cache: yes

- name: Ensure latest GitHub CLI is installed
  become: yes
  apt:
    name: gh
    state: latest
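
As a possible alternative, the two tasks above can be folded into one, since the apt module accepts update_cache together with a package name. The cache_valid_time value of 3600 seconds is an arbitrary choice for illustration.

```yaml
- name: Ensure latest GitHub CLI is installed
  become: yes
  apt:
    name: gh
    state: latest
    update_cache: yes
    cache_valid_time: 3600   # only refresh the cache if it is older than one hour
```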

Authentication to GitHub

Note that GitHub CLI automatically stores the credentials when HTTPS is used as the preferred protocol for Git operations.

The steps for creating a PAT are described in the GitHub documentation. I prefer to use a single PAT for all the remote hosts. The created token has to be sent over to the hosts before the GitHub authentication process can be initiated. Before sending it over, it has to be stored on the local host in a medium such as a file, an environment variable, or Ansible Vault. Normally it is recommended to use the vault for tokens, secrets, and the like. Depending on how the token is stored, different operations are needed to transfer it to the remote hosts. In this section, only the environment variable and file options are explored.

The token is not actually copied or sent over to remote hosts in the traditional sense. The GitHub CLI authentication command expects the token on standard input. Ansible executes this command on remote hosts using the shell module with stdin as an argument. Whichever method is used to store the token, the value written to stdin is provided by the local host. Therefore, when the authentication command is executed on remote hosts, the token value is sent over the network along with the command and injected into the command's stdin.

Using environment variable

Log in to GitHub with the PAT using an environment variable. Here, a Lookup plugin is used to read the environment variable on the local host. Lookup plugins are executed and evaluated on the local host, so they can be used to retrieve data from it.

- name: Authenticate to GitHub
  shell: gh auth login --git-protocol https --with-token
  args:
    stdin: "{{ lookup('env', 'PAT') }}"
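
Since the token travels through a task argument, it may be worth keeping it out of Ansible's output. A hedged variant of the task above adds the no_log keyword, which hides the task's arguments and results from logs and console output.

```yaml
- name: Authenticate to GitHub
  shell: gh auth login --git-protocol https --with-token
  args:
    stdin: "{{ lookup('env', 'PAT') }}"
  no_log: true   # keep the token out of verbose output and logs
```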

To use an environment variable from the local host within Ansible, either export it or define it inline before executing the ansible-playbook or ansible command.

export PAT=TOKEN_FROM_GITHUB
ansible-playbook

# OR

PAT=TOKEN_FROM_GITHUB ansible-playbook

If you add a space before a command, it is not added to the shell history, provided the shell is configured for it (in Bash, HISTCONTROL must include ignorespace or ignoreboth).

Using file

Storing tokens in plaintext might be a concern for some. However, I think it's not much different from having plaintext SSH keys stored locally. Moreover, GitHub CLI itself caches the PAT inside a file located at $HOME/.config/gh/hosts.yml if you opt in to caching your credentials.

Here is how to do it. For this task, it is assumed that the file containing the token is located under the user's home directory on the local host. To expose the contents of the token file to the task, the path of the token file is first stored in a variable, and then that variable is used to read the file contents.

- name: Authenticate to GitHub
  shell: gh auth login --git-protocol https --with-token
  args:
    stdin: "{{ lookup('file', token_file) }}"
  vars:
    token_file: "{{ lookup('env', 'HOME') }}/PATH/TO/TOKEN"
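
If the token file is missing, the file lookup fails on its own. An empty file is harder to notice, so a guard like the following could run before the authentication task. This is a sketch: the task name and message are my own, and PATH/TO/TOKEN is the same placeholder as above.

```yaml
- name: Token file is not empty
  assert:
    that:
      # The lookup runs on the local host and returns the file contents.
      - lookup('file', token_file) | trim | length > 0
    fail_msg: The token file is empty
  vars:
    token_file: "{{ lookup('env', 'HOME') }}/PATH/TO/TOKEN"
```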

Use GitHub CLI as Git Credential Helper

The following task adds lines to your .gitconfig to make Git use GitHub CLI for Git credentials.

Those lines are (the path to gh depends on where it is installed):

.gitconfig
[credential "https://github.com/"]
  helper =
  helper = !/usr/local/bin/gh auth git-credential
[credential "https://gist.github.com/"]
  helper =
  helper = !/usr/local/bin/gh auth git-credential

gh auth git-credential is a command that is designed to be used only by a Git process.

Here is the task for achieving this:

- name: GitHub CLI as Git Credential Helper
  shell: gh auth setup-git

Authenticate only if not already authenticated

Idempotency means that performing a task once or multiple times produces the same outcome. For the playbook so far, this is mostly true. Most modules in Ansible achieve it by checking whether the desired state has already been reached. If it has, they stop without performing further actions. For example, the package module will not try to install a package if it is already installed. Tasks that use the shell module, however, run their command every time unless they are guarded.
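
As an illustration, the first task below is safe to repeat because the module checks the current state itself, while the second is a read-only command that needs a hint to report its state correctly. Git is used only as an example package here.

```yaml
# The package module checks whether git is already present and does
# nothing if it is.
- name: Ensure Git is installed
  become: yes
  package:
    name: git
    state: present

# Without changed_when, this task would report "changed" on every run,
# because Ansible cannot tell whether a command modified anything.
- name: Set machine architecture
  shell: dpkg --print-architecture
  register: dpkg_architecture
  changed_when: false   # read-only command, so never report a change
```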

Even though this is currently harmless, running some of these tasks unnecessarily can be prevented by adding guards that check whether the desired state has already been reached. One such guard is "don't try to reauthenticate if already authenticated".

This is done by checking the authentication status and attempting to authenticate only if that check fails. Blocks create logical groupings of tasks, which in turn allow exception handling. If a task in the block section fails, the tasks in the rescue section run to handle the failure.

- name: Ensure Authentication to GitHub
  block:
    - name: Check GitHub Authentication Status
      shell: gh auth status
  rescue:
    - name: Authenticate to GitHub
      shell: gh auth login --git-protocol https --with-token
      args:
        stdin: "{{ lookup('env', 'PAT') }}"

This task runs the authentication action only if the authentication status check fails. There are other ways to achieve something similar, such as registering the result of the authentication status check in a variable, and then running the authentication task only when that result indicates a failure.
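
A minimal sketch of that register-based alternative could look like this. The variable name gh_auth_status is my own choice, and the sketch assumes gh auth status exits with a non-zero code when no account is logged in.

```yaml
- name: Check GitHub Authentication Status
  command: gh auth status
  register: gh_auth_status
  failed_when: false    # a non-zero exit code is expected, not an error
  changed_when: false   # read-only check

- name: Authenticate to GitHub
  shell: gh auth login --git-protocol https --with-token
  args:
    stdin: "{{ lookup('env', 'PAT') }}"
  when: gh_auth_status.rc != 0   # only run when the status check failed
```

Compared to block and rescue, this keeps the play output free of a failed task, at the cost of one extra variable.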

All put together as a Playbook

Although Ansible looks for a default inventory on the local host (/etc/ansible/hosts), we can provide a per-project inventory file. Inventory files are typically written in INI format, but I prefer to have them in YAML.

Example Inventory file:

inventory.yml
---
all:
  hosts:
    server:
      ansible_connection: ssh
      ansible_host: '192.168.1.90'
      ansible_port: 22
      ansible_user: john
      ansible_ssh_private_key_file: "{{ lookup('env', 'HOME') }}/.ssh/john"
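
With several machines, the hosts can be organized into a group so that a play can target them together. The following is only a sketch: the group name, host names, and addresses are hypothetical placeholders.

```yaml
---
all:
  children:
    raspberries:                       # hypothetical group name
      hosts:
        pi1:
          ansible_host: '192.0.2.11'   # placeholder address
        pi2:
          ansible_host: '192.0.2.12'   # placeholder address
      vars:
        ansible_user: john             # shared by every host in the group
```

A play could then use hosts: raspberries instead of hosts: all.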

All tasks in the scope of a Play have access to the variables in the vars section of that Play. Here is the final version of the collection of tasks inside a playbook, along with the variables:

playbook.yml
---
- hosts: all
  vars:
    github_cli_keyring_file: /usr/share/keyrings/githubcli-archive-keyring.gpg
    github_cli_keyring_url: https://cli.github.com/packages/githubcli-archive-keyring.gpg

  tasks:
    - name: GitHub CLI archive keyring exists
      become: true
      get_url:
        url: '{{ github_cli_keyring_url }}'
        dest: '{{ github_cli_keyring_file }}'
        mode: '0644'

    - name: Set machine architecture
      shell: dpkg --print-architecture
      register: dpkg_architecture

    - set_fact:
        dpkg_architecture: '{{ dpkg_architecture.stdout }}'

    - name: GitHub CLI repository is in sources list
      become: true
      apt_repository:
        repo: 'deb [arch={{ dpkg_architecture }} signed-by={{ github_cli_keyring_file }}] https://cli.github.com/packages stable main'
        state: present
        filename: github-cli

    - name: Package list updated
      become: true
      apt:
        update_cache: true

    - name: Latest GitHub CLI installed
      become: true
      apt:
        name: gh
        state: latest

    - name: Ensure Authentication to GitHub
      block:
        - name: Check GitHub Authentication Status
          shell: gh auth status
      rescue:
        - name: Authenticate to GitHub
          shell: gh auth login --git-protocol https --with-token
          args:
            stdin: "{{ lookup('env', 'PAT') }}"

    - name: GitHub CLI as Git Credential Helper
      shell: gh auth setup-git

Finally, this Playbook can be run with:

ansible-playbook playbook.yml --inventory inventory.yml