
Wednesday, August 20, 2014

Using USB Devices on Solaris (ZFS)

UPDATE:
Solaris 11 cannot import pools created with the 'native' ZFS-on-Linux implementation, and vice versa. So if the drive needs to be readable everywhere, put your data on an NTFS-formatted drive instead, since NTFS is the most widely supported filesystem for read-only access; otherwise this procedure is Solaris-to-Solaris only. I have not tested import/export between Linux hosts running native ZFS, but I suspect a matching zpool version (5000 or something like that) will work fine between Fedora-type distros and Ubuntu.


Recently I needed to export a large amount of data from our Solaris NFS server.  Getting this information straight off this robust server is not as intuitive or straightforward as you might first think.  Many filesystems are not natively supported by Solaris, which can cause a lot of headache when figuring out how to use fdisk and format.  Additionally, a drive with a 4096-byte block size (like a Seagate 3TB hard drive) may not even be compatible with UFS.

Before you begin, take a look at this chart on Wikipedia, which breaks down the various versions of zpool and zfs.  This will only work if you are using compatible zpool/zfs versions on both hosts.  Currently the native Linux ZFS project uses a zpool version not supported by Solaris 11, and vice versa, so pools cannot be imported between those hosts: http://en.wikipedia.org/wiki/ZFS#List_of_operating_systems_supporting_ZFS
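To compare what each host supports before moving data, both Solaris and ZFS-on-Linux provide these commands ('mypool' is a placeholder for an existing pool name):
$ zpool upgrade -v # list the zpool versions this host supports
$ zpool get version mypool # show the version an existing pool uses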

ZFS is a good solution, and pools can be imported on other Linux distributions such as CentOS or Ubuntu.  Here is a synopsis in which I exported a large amount of data using the pool name "TwitterFeeds".

List available drives
$ format -e
...    other disks likely shown here...
54. c9t0d0 <Seagate-Expansion Desk-0604 cyl 45597 alt 2 hd 255 sec 63>
          /pci@0,0/pci108e,cb84@2,1/hub@6/storage@2/disk@0,0
Create a ZFS pool on the USB drive, add a ZFS file system, chown it to your default user, and start rsync under nohup, logging to rsync.out:
$ sudo zpool create TwitterFeeds c9t0d0
$ sudo zfs create TwitterFeeds/export 
$ sudo chown -R user:group /TwitterFeeds/export
$ nohup rsync -r --progress /sasdata/TwitterFeeds /TwitterFeeds/export > rsync.out 2>&1&
Example entry of output in rsync log:
bytesize percent% xferRate time (xfer#N, to-check=files-remaining/estimated-total)Absolute/File/Path/Filename.ext
805313211 100%   29.23MB/s    0:00:26 (xfer#364, to-check=1008/1417)TwitterFeeds/08/19/2014/05/17/22/10FEB25021318-S3DM_R5C4-053771096010_01_P001.DAT
You can use tail -f on the rsync log to periodically view progress.
$ tail -f /TwitterFeeds/export/rsync.out
Once rsync has completed, export the pool; this unmounts the ZFS file system and removes the pool from your list of available zpools.
$ sudo zpool export TwitterFeeds
The USB drive can now be physically removed and plugged into the target system.  ZFS detects any moved or renamed devices and adjusts the configuration appropriately.  To discover available pools, run the zpool import command with no options.  To import a pool, pass its name as an argument to the import command (TwitterFeeds here).  By default, zpool import only searches devices within the /dev/dsk directory.  If devices exist in another directory, or you are using pools backed by files, you must use the -d option to search alternate directories.  This may be required when using CentOS/Ubuntu with ZFS.
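For example, a discovery pass first, before importing by name:
$ zpool import #with no arguments, lists pools available for import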
$ zpool import TwitterFeeds
$ zpool import -d /dev/disk/by-id TwitterFeeds #search an alternate device directory (e.g. on Linux)
Once imported, "$ sudo zfs get all" should show a mountpoint PROPERTY for the pool, where the data can be accessed. From this point you should be able to use native OS filesystem utilities like cp, rm, chmod, and others.
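For example, to see just the mount points rather than every property:
$ sudo zfs get -r mountpoint TwitterFeeds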

Reference Links:
Managing ZFS Storage Pools
Managing ZFS File Systems (Not the Physical Devices!)
Using ZFS on Linux

Friday, August 8, 2014

Fetching Artifactory Maven Dependencies via Python

This clever Python script will fetch the dependencies identified in a YAML file.  Don't forget that Python uses whitespace for scope, so if you're going to copy and paste, get it right.  The script uses Artifactory's GAVC search API, the same API used by the Artifactory Maven plugin.  A nice feature is that you can run it multiple times: by comparing MD5 hashes, it will only download JAR files that have changed.  Also note that javadoc, sources, and POM artifacts are skipped by the filter condition inside the results loop.

Python also has a great API for Docker.  I will be using this code to implement something like Fig to automatically deploy containers in my environment that can pull the latest dependencies from Artifactory for installation.
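As a taste of that Docker side, here is a minimal sketch using the docker-py client of this era; the image name and the flow are illustrative assumptions, not part of the fetch script below:

#!/usr/bin/env python
# minimal docker-py sketch (docker-py ~0.x API): start a throwaway container
from docker import Client

client = Client(base_url='unix://var/run/docker.sock')
# image name is an illustrative assumption
container = client.create_container(image='ubuntu:14.04', command='/bin/bash', tty=True)
client.start(container)
print client.containers()  # list running containers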

First, an example of the YAML:

artifacts:
- artifact:
     artifactid:     accumulo-core
     groupid:     org.apache.accumulo
     version:     1.5.1

- artifact:
     artifactid:     accumulo-fate
     groupid:     org.apache.accumulo
     version:     1.5.1

- artifact:
     artifactid:     accumulo-trace
     groupid:     org.apache.accumulo
     version:     1.5.1



The Script:

#!/usr/bin/env python
import yaml
import hashlib
import os
import sys
import httplib
import json
import urllib2
from urlparse import urlparse

__author__ = 'champion'
artifactory_url = "art.mydomain.com:8081"
#local download folder
local_folder = "./deps"
conn = httplib.HTTPConnection(artifactory_url)


def download(filename, remote):
    print "\nDownloading: " + remote
    req = urllib2.urlopen(remote)
    blocksize = 16 * 1024
    with open(local_folder + "/" + filename, 'wb') as fp:
        while True:
            chunk = req.read(blocksize)
            if not chunk:
                break
            fp.write(chunk)
        # the with-statement closes the file automatically


def main():
    if not os.path.exists(local_folder):
        os.mkdir(local_folder)

    # Take last arg as filename
    filename = sys.argv[-1]

    if os.path.isfile(filename):
        stream = open(filename, 'r')
        yaml_instance = yaml.safe_load(stream)
        stream.close()

        artifacts = yaml_instance["artifacts"]

        print "\nFetching Artifacts in '" + filename + "' from Artifactory... "

        #for each element in YAML...
        for artifact in artifacts:
            entry = artifact["artifact"]
            artifact_version = str(entry["version"])
            artifact_groupid = str(entry["groupid"])
            artifact_artifactid = str(entry["artifactid"])

            # Create API call
            api_call = "/artifactory/api/search/gavc?g=" + artifact_groupid + "&a=" + artifact_artifactid + "&v=" + artifact_version

            # GET the results
            conn.request("GET", api_call)
            r1 = conn.getresponse()

            # If GET was Successful
            if r1.status == 200 and r1.reason == "OK":
                uris = json.loads(r1.read())["results"]
                # Omit Javadoc, Sources, POMs...
                for uri in uris:
                    link = uri["uri"]
                    if not link.endswith("pom") and not link.endswith("sources.jar") and not link.endswith("javadoc.jar"):
                        #Request the Artifact information
                        conn.request("GET", link)
                        artifact_json = conn.getresponse().read()
                        artifact_props = json.loads(artifact_json)

                        downloaduri = artifact_props["downloadUri"]
                        md5 = artifact_props["checksums"]["md5"]
                        fname = urlparse(downloaduri).path.split('/')[-1]

                        #Always download the dep, unless a matching local copy exists.
                        omit_dl = False
                        if os.path.exists(local_folder + "/" + fname):
                            print "\nLocal Copy of '" + fname + "' Exists, checking md5..."
                            print "Remote MD5: " + md5
                            # hash the local file in binary mode to compare with the remote md5
                            curr_md5 = hashlib.md5(open(local_folder + "/" + fname, 'rb').read()).hexdigest()
                            print " Local MD5: " + curr_md5
                            if curr_md5 == md5:
                                omit_dl = True  # hashes match, skip the download

                        if not omit_dl:
                            download(fname, downloaduri)
                        else:
                            print "Hashes match, omitting download..."
                    else:
                        #artifact is not the binary jar
                        continue

            else:
                r1.read()  # drain the unread response so the connection can be reused
                print "Artifact was not found in Artifactory."

        conn.close()
        print "Done."
    else:
        print "YAML file: '" + sys.argv[-1] + "' not found."



if __name__ == '__main__':
    main()
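Usage, assuming the script is saved as fetch_deps.py (the name is arbitrary) alongside the YAML above:

$ python fetch_deps.py artifacts.yaml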

Saturday, August 2, 2014

Ubuntu 14.04 LTSP Docker Container

Just finished creating a redundant LTSP server as a Docker container.  Using a Dockerfile, we can now build portable LTSP servers to host both thick and thin client PCs. This of course depends on the LAN having at least DHCP and DNS already provided.  DHCP is used to support the PXE boot process, while DNS is used by the Dockerfile to resolve Ubuntu APT Repository package host IPs.  Thus, an Internet connection is also needed.

In this scenario, the container is running on the "docker01.devops.mydomain.com" CentOS 6.5 server.  LTSP ports were forwarded through the "devops.mydomain.com" VLAN Gateway (Zentyal server with two ethernet cards) to the docker01 host.  This allows clients on the normal domain LAN to hit the docker01 host behind the gateway on the VLAN.  The container running LTSP then has the gateway's forwarded ports mapped to the docker01 host so the clients can access the LTSP services.

Port 69/udp - TFTPD, serves the PXE boot configuration to clients referred by DHCPd.
Port 10809/tcp - Network Block Device daemon (NBDd), serves the base chroot up as a block device for the client. The external port is 10809 as well.
Port 22/tcp - SSH, serves the authentication and home folder via "SSHFS".  The external port is 2222 so that it does not conflict with the default SSH port 22 on the docker01 host.

Initially, I had some trouble with SSHFS due to custom security configurations in the Docker Hub provided Ubuntu 14.04 image.  Essentially, my SSH connections were being closed by the server as soon as authentication completed.  I suspected a PAM configuration issue, but didn't want to dig through all the undocumented changes.  So I performed a fresh install of Ubuntu 14.04 Server edition (~900MB) on a VM, exported the file system as a gzipped tarball, excluding /sys and /proc, and then imported that file system as a new Docker base image.
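A minimal sketch of that export/import, assuming the tarball is copied from the VM to the docker host and tagged champion:ubuntu-base to match the Dockerfile below:

# on the Ubuntu 14.04 VM: tar the root filesystem, excluding virtual filesystems
$ sudo tar --numeric-owner --exclude=/proc --exclude=/sys -czf /tmp/ubuntu-base.tar.gz -C / .
# on the docker host: import the tarball as a new base image
$ cat ubuntu-base.tar.gz | docker import - champion:ubuntu-base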

Dockerfile:
#custom ubuntu 14.04 image, imported from the VM tarball above
FROM champion:ubuntu-base
MAINTAINER championofcyrodiil@blogspot.com

RUN apt-get update

RUN apt-get install -y ltsp-server
RUN ltsp-build-client --arch="amd64" --fat-client
RUN ltsp-config lts.conf

Manually Update lts.conf:
Port 22 would likely conflict with the docker host's SSH port, so manually add 'SSH_OVERRIDE_PORT=2222' below the [Default] tag inside /var/lib/tftpboot/ltsp/amd64/lts.conf.  Also add 'SERVER=' so that the client hits the docker host with the mapped ports, since it can't reach the container's resolved hostname that is used by default.  The value should be the IP of your Docker container's host.
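The relevant portion of lts.conf would then look something like this (the IP is a placeholder for your docker host):

[Default]
SERVER=192.168.1.50
SSH_OVERRIDE_PORT=2222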

Run Command:
$ docker run -h ltsp -it -p 2222:22 -p 10809:10809 -p 69:69/udp -v /data/ltsp-home/:/home/ -v /data/ltsp-opt/:/opt/ --privileged=true champion:ltsp /bin/bash

-h Sets the hostname to 'ltsp'. Although this does not really matter, it helps to know which container you're attached to.
-p Maps specific ports. (Uppercase P will use the docker range, check docker run usage)
-v The first mount point specified is the docker host path, the second mount point (after the colon) is the mount point inside the container.
--privileged=true Gives the container permission to access kernel parameters and other information from /proc; this is definitely needed when you build the LTSP fat client.  I left it on at run time as well, but it may not be needed there.

Once the container has started and you're at the bash prompt, you just need to fire up the daemons:

$ service ssh start
$ service tftpd-hpa start
$ service nbd-server start

Press Ctrl+P, Q to drop from the container's bash prompt without killing the container's root bash process.  Ultimately you would want these service calls in a custom bash script, specified in your Dockerfile as the default command, with something at the end to keep the root process alive.  But for sanity's sake, the above shows the manual process to get the gist of what needs to happen.
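A sketch of such a start script (hypothetical name start-ltsp.sh); note that because these init scripts fork their daemons into the background, a plain 'wait' would return immediately, so 'tail -f /dev/null' is used here instead to keep the container's root process alive:

#!/bin/bash
# start-ltsp.sh: default command for the LTSP container
service ssh start
service tftpd-hpa start
service nbd-server start
# the services daemonize, so block here to keep the container running
tail -f /dev/null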

[user@docker01 ~]$ docker ps -a
CONTAINER ID        IMAGE                                      COMMAND                CREATED             STATUS              PORTS                                                                                                NAMES
2c09964e1882        champion:ltsp                                  /bin/bash              2 hours ago         Up 2 hours          0.0.0.0:69->69/udp, 0.0.0.0:2222->22/tcp, 0.0.0.0:10809->10809/tcp                                   ltsp_server

LTSP Client Login Screen




Ultimately, the configuration needs to be adjusted so that the fat/thin client connects to the Docker container's host: all required LTSP ports (69, 10809, 22) must be mapped to the container using the -p options of the docker 'run' command, and the client's TFTP-served LTSP configuration (/var/lib/tftpboot/ltsp/amd64/lts.conf) must be modified to support the changed SSHFS port, which is 2222 in this post.

Geoserver 2.5.2 Dockerfile

This GeoServer Dockerfile also requires that the Oracle JDK tar.gz reside in the Dockerfile build context.

#
# example Dockerfile for Geoserver 2.5.2
#

FROM ubuntu:14.04
MAINTAINER championofcyrodiil@blogger.com

#JDK
RUN mkdir /opt/java
ADD jdk-7u60-linux-x64.tar.gz /opt/java
#Line below commented out because docker auto extracts a tar.gz when ADDed above
#RUN tar -C /opt/java -xf /opt/java/jdk-7u60-linux-x64.tar.gz
ENV JAVA_HOME /opt/java/jdk1.7.0_60
ENV PATH $PATH:$JAVA_HOME/bin

#Geoserver: wget the zip (as below) or put it in the Dockerfile folder and ADD it.
#ADD geoserver-2.5.2-bin.zip /opt/
#the stock ubuntu:14.04 image ships without wget and unzip, so install them first
RUN apt-get update && apt-get install -y wget unzip
RUN wget -P /opt/ http://sourceforge.net/projects/geoserver/files/GeoServer/2.5.2/geoserver-2.5.2-bin.zip
RUN unzip -d /opt/ /opt/geoserver-2.5.2-bin.zip
ENV GEOSERVER_HOME /opt/geoserver-2.5.2/
RUN chmod -R 755 /opt/geoserver-2.5.2

EXPOSE 8080
CMD ["/opt/geoserver-2.5.2/bin/startup.sh"]