Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bedevere-bot does not have 2FA enabled #531

Open
ewdurbin opened this issue Jan 18, 2023 · 10 comments
Open

bedevere-bot does not have 2FA enabled #531

ewdurbin opened this issue Jan 18, 2023 · 10 comments

Comments

@ewdurbin
Copy link
Member

Not sure who owns the login for this bot either.

The Python org will soon require 2FA for all member users, so as a stopgap enabling 2FA is sufficient (ideally making sure access credentials are shared with more than one person, PSF Infra staff for instance).

Long term though moving to a GitHub App or GitHub Action like #344 is a better solution.

@Mariatta
Copy link
Member

I don't have the login for bedevere, but I have it for miss-islington in case you need it.

I remembered it was @brettcannon who created the bot initially. Brett, do you still have the login for bedevere-bot?

@brettcannon
Copy link
Member

I do have the login credentials, although it would mean I'm the only one who can log into the account (which I would rather not have happen). Do we need to make it a priority to have @ambv move Bedevere over to GitHub Actions so the account isn't a concern (I take it the volunteer effort to do the port hasn't materialized)?

@nightlark
Copy link

Isn't PR #459 most of the port of Bedevere to GitHub Actions?

@arhadthedev
Copy link
Member

arhadthedev commented Feb 14, 2023

I have a lesser invasive port that replaces just __main__.py leaving the rest of gidgethub-based routing intact (however, I didn't test it fully plus documentation changes are required):

bot patch
From 50902410d3df1410caa94f425d2eb9e021071db4 Mon Sep 17 00:00:00 2001
From: Oleg Iarygin <[email protected]>
Date: Sun, 12 Feb 2023 22:41:37 +0400
Subject: [PATCH 1/2] Change a frontend to GitHub Actions

---
 action.yml           | 41 ++++++++++++++++++++++++++++++
 bedevere/__main__.py | 60 ++++++++++++++++++++++++++++----------------
 runtime.txt          |  1 -
 3 files changed, 80 insertions(+), 22 deletions(-)
 create mode 100644 action.yml
 delete mode 100644 runtime.txt

diff --git a/action.yml b/action.yml
new file mode 100644
index 00000000..b9941480
--- /dev/null
+++ b/action.yml
@@ -0,0 +1,41 @@
+name: Sir Bedevere of Python
+author: Python Software Foundation
+description: Determines if a pull request misses some important bits
+inputs:
+  token:
+    description: >
+      A token for reading pull request information and leaving comments.
+      Must be either:
+
+      - a repo scoped fine-grained personal access token (PAT) with
+        Commit Statuses, Issues, Pull Requests permissions set to Read
+        And Write, and Metadata set to Read (reminder: the rest must be
+        set to None for security reasons)
+      - an OAuth token
+    required: true
+  event:
+    description: GitHub event to check for shortfalls
+    required: true
+runs:
+  using: composite
+  steps:
+    - uses: actions/setup-python@v4
+      with:
+        python-version: 3.11
+    - run: python -m pip install -r requirements.txt
+      shell: bash
+      working-directory: ${{github.action_path }}
+    - run: echo "$EVENT_BODY" | python -m bedevere
+      env:
+        # Use environment variables to avoid scripting injection thus
+        # exposing Bedevere's personal access token. For details, see
+        # <https://docs.github.com/en/actions/security-guides/security-hardening-for-github-actions#understanding-the-risk-of-script-injections>.
+        #
+        # Note: in Linux, environment variables have no relevant length
+        # limit (it is ~1GB - <https://stackoverflow.com/questions/1078031/what-is-the-maximum-size-of-a-linux-environment-variable-value>).
+        EVENT_NAME: ${{ github.event_name }}
+        EVENT_BODY: ${{ toJSON(github.event) }}
+        GH_AUTH: ${{ inputs.token }}
+        RUN_ID: ${{ github.run_id }}
+      shell: bash
+      working-directory: ${{ github.action_path }}
diff --git a/bedevere/__main__.py b/bedevere/__main__.py
index 3a3c0740..deaeb3f7 100644
--- a/bedevere/__main__.py
+++ b/bedevere/__main__.py
@@ -1,15 +1,22 @@
+""" A dispatcher of GitHub Actions events to maintain CPython workflow.
+
+This file is run from ../action.yml via ``runs.steps.run``.
+
+Logs of runs triggered by a specific PR can be found at
+`https://github.com/python/cpython/actions/workflows/call-bedevere.yml?query=branch%3A{PR_BRANCH_NAME_WITHOUT_AUTHOR}`
+"""
+
 import asyncio
-import importlib
+import json
 import os
 import sys
 import traceback
 
 import aiohttp
-from aiohttp import web
 import cachetools
 from gidgethub import aiohttp as gh_aiohttp
 from gidgethub import routing
-from gidgethub import sansio
+from gidgethub.sansio import Event
 
 from . import backport, gh_issue, close_pr, filepaths, news, stage
 
@@ -22,36 +29,47 @@
 
 sentry_sdk.init(os.environ.get("SENTRY_DSN"))
 
-async def main(request):
+async def dispatch(event, oauth_token):
     try:
-        body = await request.read()
-        secret = os.environ.get("GH_SECRET")
-        event = sansio.Event.from_http(request.headers, body, secret=secret)
-        print('GH delivery ID', event.delivery_id, file=sys.stderr)
-        if event.event == "ping":
-            return web.Response(status=200)
-        oauth_token = os.environ.get("GH_AUTH")
         async with aiohttp.ClientSession() as session:
             gh = gh_aiohttp.GitHubAPI(session, "python/bedevere",
                                       oauth_token=oauth_token,
                                       cache=cache)
-            # Give GitHub some time to reach internal consistency.
-            await asyncio.sleep(1)
+            print(f'!!! event handlers: {router.fetch(event)}')
             await router.dispatch(event, gh, session=session)
         try:
             print('GH requests remaining:', gh.rate_limit.remaining)
         except AttributeError:
             pass
-        return web.Response(status=200)
     except Exception as exc:
         traceback.print_exc(file=sys.stderr)
-        return web.Response(status=500)
+
+# We do not use command line arguments so no use for argparse
+help_text = '''Identify missing information for CPython pull requests
+
+Inputs:
+- EVENT_NAME environment variable with an event name (like 'push')
+- GH_AUTH environment variable with a GitHub PAT or OAuth token
+- RUN_ID environment variable with a string for logging
+- SENTRY_DSN optional environment variable with an error report key
+- stdin with an event to process in JSON format
+'''
+
+def abort(text):
+    print(f'error: {text}', file=sys.stderr)
+    print(help_text, file=sys.stderr)
+    sys.exit(1)
 
 
 if __name__ == "__main__":  # pragma: no cover
-    app = web.Application()
-    app.router.add_post("/", main)
-    port = os.environ.get("PORT")
-    if port is not None:
-        port = int(port)
-    web.run_app(app, port=port)
+    try:
+        event_dict = json.load(sys.stdin)
+        event_name = os.environ['EVENT_NAME']
+        run_id = os.environ['RUN_ID']
+        event = Event(event_dict, event=event_name, delivery_id=run_id)
+        access_token = os.environ['GH_AUTH']
+        asyncio.run(dispatch(event, access_token))
+    except json.decoder.JSONDecodeError:
+        abort('empty standard output')
+    except KeyError as e:
+        abort(f'{e.args[0]} environment variable not set')
diff --git a/runtime.txt b/runtime.txt
deleted file mode 100644
index a5da7cc4..00000000
--- a/runtime.txt
+++ /dev/null
@@ -1 +0,0 @@
-python-3.10.5

From c51d1d2d5c7a9cec31ce16af209f42a5905070dd Mon Sep 17 00:00:00 2001
From: Oleg Iarygin <[email protected]>
Date: Sun, 12 Feb 2023 21:45:57 +0400
Subject: [PATCH 2/2] Add tests

---
 tests/test___main__.py | 61 +++++++++++++++++++++---------------------
 1 file changed, 30 insertions(+), 31 deletions(-)

diff --git a/tests/test___main__.py b/tests/test___main__.py
index e9b49172..4f407222 100644
--- a/tests/test___main__.py
+++ b/tests/test___main__.py
@@ -1,38 +1,37 @@
-from aiohttp import web
-import pytest
+from io import StringIO
+from itertools import chain, compress, product, repeate
+from test.support import SHORT_TIMEOUT
+from test.support.script_helper import spawn_python
+from unittest import TestCase
 
 from bedevere import __main__ as main
 
+class CliTests(TestCase):
 
-async def test_ping(aiohttp_client):
-    app = web.Application()
-    app.router.add_post("/", main.main)
-    client = await aiohttp_client(app)
-    headers = {"x-github-event": "ping",
-               "x-github-delivery": "1234"}
-    data = {"zen": "testing is good"}
-    response = await client.post("/", headers=headers, json=data)
-    assert response.status == 200
+    def _run_python(self, *args, *, env, expect_success):
+        with StringIO('{}') as valid:
+            proc = spawn_python(*args, stdin=valid, env=env)
+            proc.wait(SHORT_TIMEOUT)
+            if expect_success:
+                self.assertEqual(proc.returncode, 0)
+            else:
+                self.assertNotEqual(proc.returncode, 0)
 
+    def run_bedevere(environ_keys, *, feed_stdin, expect_success):
+        env = dict.fromkeys(environ_keys, 'example')
+        with StringIO('{}' if feed_stdin else '') as mock_stdin:
+            self._run_python('-m', 'bedevere', env=env,
+                             expect_success=expect_success)
 
-async def test_success(aiohttp_client):
-    app = web.Application()
-    app.router.add_post("/", main.main)
-    client = await aiohttp_client(app)
-    headers = {"x-github-event": "project",
-               "x-github-delivery": "1234"}
-    # Sending a payload that shouldn't trigger any networking, but no errors
-    # either.
-    data = {"action": "created"}
-    response = await client.post("/", headers=headers, json=data)
-    assert response.status == 200
+    def test_environment_variables(self):
+        env_names = ('EVENT_NAME', 'GH_AUTH', 'RUN_ID')
+        env_selectors = product((False, True), repeat=len(env_names))
+        env_combinations = reversed(compress(env_names, env_selectors))
+        env_correctness = zip(env_combinations, chain((True,), repeat(False)))
 
-
-async def test_failure(aiohttp_client):
-    """Even in the face of an exception, the server should not crash."""
-    app = web.Application()
-    app.router.add_post("/", main.main)
-    client = await aiohttp_client(app)
-    # Missing key headers.
-    response = await client.post("/", headers={})
-    assert response.status == 500
+        feed_stdin = (True, False)
+        test_scenarios = product(env_correctness, feed_stdin)
+        for env, is_correct, set_stdin in test_scenarios:
+            with self.subTest(env=env, set_stdin=set_stdin):
+                self.run_bedevere(env, feed_stdin=set_stdin,
+                                  expect_success=is_correct)
bot invocation
# .github/workflows/trigger-bots.yml
name: Bots

on:
  # PR synchronization is a push into a branch associated with the PR
  pull_request:
    types: [opened, edited, synchronize, labeled, unlabeled]
  issue_comment:
    types: [created, edited]
  pull_request_review_comment:
    types: [created, edited]

jobs:
  call_bedevere:
    name: Call Bedevere
    runs-on: ubuntu-latest
    steps:
      - uses: arhadthedev/bedevere@c51d1d2d5c7a9cec31ce16af209f42a5905070dd
        with:
          token: ${{ secrets.BEDEVERE_PAT }}
          event: ${{ toJSON(github.event) }}

However, I found out that only account-bound personal access tokens can use the REST API to query issues/PRs, modify statuses and post comments. Without tokens we must either use GitHub-provided @github/* JavaScript libraries from our Python code or rewrite everything from scratch in JS. I would refuse to go such two marvelous ways of pain and sufferings.

@ewdurbin
Copy link
Member Author

If the desire is to retain running bedevere as is via the REST API, the smallest "lift" would be to migrate it to a GitHub App which can be granted permissions in the same way that an account/PAT can. gidgethub has support for this

@Mariatta
Copy link
Member

I have a PR for having bedevere run as GitHub App here #569.

It will solve the issue for 2FA requirements, except that bedevere's token is also being used for buildbots, and buildbots doesn't seem to support GitHub App.

@ewdurbin
Copy link
Member Author

Thank you @Mariatta, I've assigned this to myself for review. I'll also assist in getting it deployed for bedevere.

Can you detail more about how buildbots are using the token?

@Mariatta
Copy link
Member

Thanks.

From what I understand (@pablogsal or @zware may have more info), we use bedevere's oauth token for the buildbots server. The token is stored in Buildbot's config. Buildbot doesn't currently support authenticating as a GitHub App. It seems that it's storing bedevere's OAuth Token as a secret, and when it needs to make GitHub API calls (to post status checks, etc), it uses the secret.
I'm invesitgating buildbot's codebase to see how we can change it into GitHub App.

@pradyunsg
Copy link
Member

With #569 merged and deployed, is the next step here to delete the bot account?

@Mariatta
Copy link
Member

Not yet because it is still a dependency for buildbot, but that's the next thing I want to tackle

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants