Friday, August 8, 2014

Fetching Artifactory Maven Dependencies via Python

This clever Python script fetches the dependencies listed in a YAML file.  Remember that Python uses whitespace for scope, so if you copy and paste the script, make sure the indentation survives intact.  The script uses Artifactory's GAVC search API.  This is the same API used by the Maven plugin for Artifactory.  A nice feature is that you can run it multiple times: by comparing MD5 hashes, it only downloads JAR files that have changed.  Also note that Javadoc, sources and POM files are skipped by the link.endswith condition inside the results loop.
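
The skip rule is easy to try on its own. Here is a minimal sketch of it, written for Python 3 (the script itself is Python 2). The sample response body and the example.invalid host are made up for illustration, and the helper name binary_artifact_uris is mine, not part of the script.

```python
import json

# Suffixes the script skips: POMs, source JARs and Javadoc JARs.
SKIPPED_SUFFIXES = ("pom", "sources.jar", "javadoc.jar")


def binary_artifact_uris(gavc_response_text):
    """Return the result URIs that do not end in a skipped suffix."""
    results = json.loads(gavc_response_text)["results"]
    # str.endswith accepts a tuple, so one call covers all three suffixes.
    return [r["uri"] for r in results if not r["uri"].endswith(SKIPPED_SUFFIXES)]


# Hypothetical response body: a "results" list of objects with a "uri" key,
# which is the shape the script reads.
BASE = "http://example.invalid/artifactory/api/storage/repo/"
sample = json.dumps({"results": [
    {"uri": BASE + "accumulo-core-1.5.1.jar"},
    {"uri": BASE + "accumulo-core-1.5.1.pom"},
    {"uri": BASE + "accumulo-core-1.5.1-sources.jar"},
    {"uri": BASE + "accumulo-core-1.5.1-javadoc.jar"},
]})

for uri in binary_artifact_uris(sample):
    print(uri)
```

Only the plain JAR entry should survive the filter; the other three are dropped by their suffix.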

Python also has a great API for Docker, so I will be using this code to implement something like fig that automatically deploys containers in my environment and pulls the latest dependencies from Artifactory for installation.

First, an example of the YAML:

artifacts:
  - artifact:
      artifactid: accumulo-core
      groupid: org.apache.accumulo
      version: 1.5.1

  - artifact:
      artifactid: accumulo-fate
      groupid: org.apache.accumulo
      version: 1.5.1

  - artifact:
      artifactid: accumulo-trace
      groupid: org.apache.accumulo
      version: 1.5.1
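
To show how each entry turns into a search request, here is a small Python 3 sketch that builds the GAVC query path from already-parsed YAML. It assumes the structure the script reads (a top-level artifacts key holding a list of artifact entries). The gavc_query helper is my own name, and I use urlencode instead of string concatenation so unusual characters get escaped.

```python
from urllib.parse import urlencode

# Same search path the script uses for its API call.
GAVC_PATH = "/artifactory/api/search/gavc"


def gavc_query(entry):
    """Build the GAVC search path for one 'artifact' entry from the YAML."""
    params = [
        ("g", entry["groupid"]),
        ("a", entry["artifactid"]),
        ("v", str(entry["version"])),  # str() in case YAML parsed a number
    ]
    return GAVC_PATH + "?" + urlencode(params)


# What yaml.safe_load would hand back for the first entry of the example.
parsed = {"artifacts": [
    {"artifact": {"artifactid": "accumulo-core",
                  "groupid": "org.apache.accumulo",
                  "version": "1.5.1"}},
]}

for item in parsed["artifacts"]:
    print(gavc_query(item["artifact"]))
```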

The Script:

#!/usr/bin/env python
import yaml
import hashlib
import os
import sys
import httplib
import json
import urllib2
from urlparse import urlparse

__author__ = 'champion'
artifactory_url = ""
#local download folder
local_folder = "./deps"
conn = httplib.HTTPConnection(artifactory_url)

def download(filename, remote):
    print "\nDownloading: " + remote
    req = urllib2.urlopen(remote)
    blocksize = 16 * 1024
    with open(local_folder + "/" + filename, 'wb') as fp:
        while True:
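            # Read the response in fixed-size blocks until it is exhausted
            chunk = req.read(blocksize)
            if not chunk:
                break
            fp.write(chunk)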

def main():
    if not os.path.exists(local_folder):
        os.makedirs(local_folder)

    # Take last arg as filename
    filename = sys.argv[-1]

    if os.path.isfile(filename):
        stream = open(filename, 'r')
        yaml_instance = yaml.safe_load(stream)

        artifacts = yaml_instance["artifacts"]

        print "\nFetching Artifacts in '" + filename + "' from Artifactory... "

        #for each element in YAML...
        for artifact in artifacts:
            entry = artifact["artifact"]
            artifact_version = str(entry["version"])
            artifact_groupid = str(entry["groupid"])
            artifact_artifactid = str(entry["artifactid"])

            # Create API call
            api_call = "/artifactory/api/search/gavc?g=" + artifact_groupid + "&a=" + artifact_artifactid + "&v=" + artifact_version

            # GET the results
            conn.request("GET", api_call)
            r1 = conn.getresponse()

            # If GET was Successful
            if r1.status == 200 and r1.reason == "OK":
                uris = json.loads(r1.read())["results"]
                # Omit Javadoc, Sources, POMs...
                for uri in uris:
                    link = uri["uri"]
                    if not link.endswith("pom") and not link.endswith("sources.jar") and not link.endswith("javadoc.jar"):
                        #Request the Artifact information
                        conn.request("GET", link)
                        artifact_json = conn.getresponse().read()
                        artifact_props = json.loads(artifact_json)

                        downloaduri = artifact_props["downloadUri"]
                        md5 = artifact_props["checksums"]["md5"]
                        fname = urlparse(downloaduri).path.split('/')[-1]

                        #Always Download Dep, unless conditions change.
                        omit_dl = False
                        if os.path.exists(local_folder + "/" + fname):
                            print "\nLocal Copy of '" + fname + "' Exists, checking md5..."
                            print "Remote MD5: " + md5
                            # Read in binary mode so the hash matches on every platform
                            curr_md5 = hashlib.md5(open(local_folder + "/" + fname, 'rb').read()).hexdigest()
                            print " Local MD5: " + curr_md5
                            if curr_md5 == md5:
                                omit_dl = True  # conditions changed

                        if not omit_dl:
                            download(fname, downloaduri)
                        else:
                            print "Hashes match, omitting download..."
                        #artifact is not the binary jar

            else:
                # Drain the response so the connection can be reused
                r1.read()
                print "Artifact was not found in Artifactory."

        print "Done."
    else:
        print "YAML file: '" + sys.argv[-1] + "' not found."


if __name__ == "__main__":
    main()
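
If you are on Python 3, the download-only-when-changed step could look something like the sketch below. It is a rough equivalent under my own assumptions, not a drop-in port. The names file_md5 and fetch_if_changed are hypothetical, and the local file is hashed in blocks rather than read into memory all at once.

```python
import hashlib
import os
import shutil
import urllib.request


def file_md5(path, blocksize=16 * 1024):
    """Return the hex MD5 of a file, reading it in fixed-size blocks."""
    digest = hashlib.md5()
    with open(path, "rb") as fp:
        for chunk in iter(lambda: fp.read(blocksize), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch_if_changed(url, dest, remote_md5):
    """Download url to dest unless dest already has the expected MD5.

    Returns True if a download happened, False if it was skipped.
    """
    if os.path.exists(dest) and file_md5(dest) == remote_md5:
        return False
    with urllib.request.urlopen(url) as resp, open(dest, "wb") as fp:
        shutil.copyfileobj(resp, fp, 16 * 1024)
    return True
```

Here url and remote_md5 would come from the downloadUri and checksums md5 fields the script reads from the artifact information, and dest would be the target path under the local download folder.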

