If you’d like to skip the background and technical information, you can find the script for converting OpenAPI specs to Anki decks and usage examples here
Introduction
My last two posts on this site have been related to my preparation for the SAUTO 300-735 exam. Primarily dealing with Cisco security appliance automation, the exam predictably requires good knowledge of many separate APIs. Furthermore, the exam will expect the taker to have memorized certain API-specific details like URI formats and style conventions.
For this reason, it is important that I develop a healthy familiarity with the APIs listed in the exam blueprint, the majority being REST APIs. The documentation associated with these REST APIs often include OpenAPI (or formerly Swagger) specifications, which contain a veritable treasure trove of information about the REST APIs format and offerings. These are commonly published in JSON or YAML formats, which can easily parsed by scripts or tools used for API interaction.
In addition to being useful intrinsically as reference material, OpenAPI files contain important bits of data that can prove more useful on their own than in large, complex data structures. The most useful bit of information for my endeavors was the list of all URIs for an API. At the very least, a text file containing all API paths for a given service could prove useful for scripting and getting a full picture of the capabilities of said service. That was my original thought process, any way.
After this, I quickly began writing an informal Python script called swagger2urls.py which would extract of the URIs and write them to a simple newline separated text file. However, soon after completing the minimum viable product, the greater potential for this script was realized. Not only could URIs be sent to a text file, but they could be enriched with extra data as to make them especially useful for certain circumstances. For example, the output could optionally include the specification provided descriptions for each path. The output could also be manipulated for our benefit. For example, sections of the URIs containing variable information could be replaced with specialized keywords used for fuzzing, brute-force attacks, or object enumeration.
The last aspect of the script which I wrote was the ability to convert these OpenAPI schemas to Anki decks. Anki is a very neat open-source spaced repetition tool that I primarily use for the memorization tasks that I encounter in exams. As it turns out, memorizing API URIs certainly qualifies as a rout memorization task!
For my generated Anki decks, I used a very nice Python library called genanki. I simplified the flash cards by having the format include a description and domain name for the front side and a URI/method on the back.
Front: “petstore.example.com - uploads an image to a pet”
Back: “POST: petstore.example.com/pet/{petId}/uploadImage”
Once the necessary data has been extracted and the format created, all we have to do is package and export each card in the deck to a single .apkg file via the genanki library. On the whole, this process can be repeated for each REST API with an associated spec that is required or useful to have memorized for the exam.
I hope that in sharing this small tool, somebody might also derive value from it (I know I certainly have).
Usage
$ ./swagger2urls.py https://petstore.swagger.io/v2/swagger.json
Extracted URLs:
https://petstore.swagger.io/pet/{petId}/uploadImage
https://petstore.swagger.io/pet
https://petstore.swagger.io/pet/findByStatus
https://petstore.swagger.io/pet/findByTags
https://petstore.swagger.io/pet/{petId}
https://petstore.swagger.io/store/inventory
https://petstore.swagger.io/store/order
https://petstore.swagger.io/store/order/{orderId}
https://petstore.swagger.io/user/createWithList
https://petstore.swagger.io/user/{username}
https://petstore.swagger.io/user/login
https://petstore.swagger.io/user/logout
https://petstore.swagger.io/user/createWithArray
https://petstore.swagger.io/user
URLs saved to extracted_urls.txt
$ ./swagger2urls.py -h
usage: swagger2urls.py [-h] [--output OUTPUT] [--replace-domain REPLACE_DOMAIN] [--verbose] [--wfuzz] [--methods] [--anki] url
Fetch Swagger API documentation and extract URLs.
positional arguments:
url The URL of the Swagger API documentation
options:
-h, --help show this help message and exit
--output OUTPUT Output file to save the extracted URLs
--replace-domain REPLACE_DOMAIN
Replace domain in the extracted URLs
--verbose Enable verbose output (contains summary of URLs)
--wfuzz Replace parameters with FUZZ in the output
--methods Include HTTP methods in the output
--anki Create Anki cards for each URL
Conclusion
This script and its associated files are available as a repository on my Gitea server or its GitHub mirror.
Full Script
#!/usr/bin/env python3
"""
Benjamin Hays - swagger2urls.py
This script fetches Swagger API documentation and extracts URLs from it.
"""
import requests
import json
import argparse
import re
import genanki
import random
parser = argparse.ArgumentParser(description='Fetch Swagger API documentation and extract URLs.')
parser.add_argument('url', type=str, help='The URL of the Swagger API documentation')
parser.add_argument('--output', type=str, help='Output file to save the extracted URLs', default='extracted_urls.txt')
parser.add_argument('--replace-domain', type=str, help='Replace domain in the extracted URLs', default=None)
parser.add_argument('--verbose', action='store_true', default=False, help='Enable verbose output (contains summary of URLs)')
parser.add_argument('--wfuzz', action='store_true', default=False, help='Replace parameters with FUZZ in the output')
parser.add_argument('--methods', action='store_true', default=False, help='Include HTTP methods in the output')
parser.add_argument('--anki', action='store_true', default=False, help='Create Anki cards for each URL')
args = parser.parse_args()
swagger_url = args.url
if swagger_url.endswith('.json'):
# JSON format
try:
response = requests.get(swagger_url)
response.raise_for_status() # Raise an error for bad responses
swagger_data = response.json()
urls = []
if 'paths' in swagger_data:
for path, methods in swagger_data['paths'].items():
for method, details in methods.items():
if 'operationId' in details:
url = f"{swagger_url.split('/')[0]}//{swagger_url.split('/')[2]}{path}"
summary = details.get('summary', '')
urls.append((method.upper(), url, summary))
if urls:
print("Extracted URLs:")
seen = []
with open(args.output, 'w') as f:
for method, url, summary in urls:
if args.replace_domain:
url = url.replace(swagger_url.split('/')[2], args.replace_domain)
if args.wfuzz:
url = re.sub(r'\{[^}]+\}', 'FUZZ', url)
if args.verbose and summary:
output_line = f"{method}: {url} - {summary}" if args.methods else f"{url} - {summary}"
else:
output_line = f"{method}: {url}" if args.methods else url
if args.methods:
print(output_line)
f.write(f"{output_line}\n")
else:
if url not in seen:
seen.append(url)
print(output_line)
f.write(f"{output_line}\n")
print(f"URLs saved to {args.output}")
f.close()
if args.anki:
# Create Anki cards
new_deck = genanki.Deck(
random.randrange(1 << 30, 1 << 31),
'Swagger API URLs'
)
for method, url, summary in urls:
my_note = genanki.Note(
model=genanki.Model(
random.randrange(1 << 30, 1 << 31),
'Simple Model',
fields=[
{'name': 'Method'},
{'name': 'URL'},
{'name': 'Summary'},
],
templates=[
{
'name': 'Card 1',
'qfmt': args.replace_domain if args.replace_domain else swagger_url.split('/')[2] + ' -- {{Summary}}',
'afmt': '{{FrontSide}}<hr id="answer">{{Method}}: {{URL}}',
},
],
),
fields=[method, url, summary]
)
new_deck.add_note(my_note)
genanki.Package(new_deck).write_to_file('swagger_api_urls.apkg')
print("Anki deck created: swagger_api_urls.apkg")
else:
print("No URLs found in the Swagger documentation.")
except requests.RequestException as e:
print(f"Error fetching Swagger documentation: {e}")
except json.JSONDecodeError as e:
print(f"Error decoding JSON: {e}")
elif swagger_url.endswith('.yaml') or swagger_url.endswith('.yml'):
# YAML format
raise NotImplementedError("YAML format is not yet supported.")
else:
raise ValueError("Unsupported file format. Please provide a URL ending with .json, .yaml, or .yml.")