
Ben McCann

Co-founder at Connectifier.
ex-Googler. CMU alum.


OAuth in a command line script

08/05/2015

Many APIs today use OAuth. If you want to use an OAuth API from the command line, I recommend starting a web server locally to handle the OAuth callback. Here’s a quick and dirty example of doing that in Python.

#!/usr/bin/env python

from flask import Flask, redirect, request
import logging
import threading
import time
import urllib
import urllib2
import webbrowser

CLIENT_ID = 'xxxx'
CLIENT_SECRET = 'yyyyyyyy'

SCOPE = 'repo:read'
AUTH_URL = 'https://quay.io/oauth/authorize'
IMAGES_URL = 'https://quay.io/api/v1/repository/myorg/myrepo/image/'

oauth_access_token = None

app = Flask(__name__)

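# Step 1: send the user's browser to the OAuth provider's authorization page.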
@app.route('/oauth_request_token')
def oauth_request_token():
  url = AUTH_URL + '?response_type=token&redirect_uri=' + urllib.quote('http://localhost:7777/oauth_callback') + '&realm=realm&client_id=' + urllib.quote(CLIENT_ID) + '&scope=' + urllib.quote(SCOPE)
  print 'Redirecting to ' + url
  return redirect(url)

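# Step 2: the provider redirects back here with the access token in the URL
# fragment, which only the browser can see, so serve JavaScript that extracts
# it and POSTs it back to this server.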
@app.route('/oauth_callback')
def oauth_callback():
  result = """
  <script>
    getHashParams = function() {
      var hashParams = {};
      var e,
        a = /\+/g,  // Regex for replacing addition symbol with a space
        r = /([^&;=]+)=?([^&;]*)/g,
        d = function (s) { return decodeURIComponent(s.replace(a, " ")); },
        q = window.location.hash.substring(1);

      while (e = r.exec(q))
        hashParams[d(e[1])] = d(e[2]);
      return hashParams;
    };
    
    ajax = function(url, callback, data) {
      try {
        var x = new(this.XMLHttpRequest || ActiveXObject)('MSXML2.XMLHTTP.3.0');
        x.open(data ? 'POST' : 'GET', url, 1);
        x.setRequestHeader('X-Requested-With', 'XMLHttpRequest');
        x.setRequestHeader('Content-type', 'application/x-www-form-urlencoded');
        x.onreadystatechange = function () {
            x.readyState > 3 && callback && callback(x.responseText, x);
        };
        x.send(data)
      } catch (e) {
        window.console && console.log(e);
      }
    };

    hashParams = getHashParams();
    ajax('/receive_token', function() { window.close(); }, 'access_token=' + hashParams['access_token']);
  </script>
  """
  return result

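# Step 3: receive the token from the callback page and hand it to the main thread.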
@app.route('/receive_token', methods=['POST'])
def receive_token():
  global oauth_access_token
  oauth_access_token = request.form['access_token']
  return '{}'

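# Run Flask on a background thread so the main thread can wait for the token.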
class ServerThread(threading.Thread):

  def __init__(self):
    threading.Thread.__init__(self)

  def run(self):
    app.run(
      port=7777,
      host='localhost'
    )

if __name__ == '__main__':
  logging.getLogger().addHandler(logging.StreamHandler())

  thread = ServerThread()
  thread.daemon = True
  thread.start()

  webbrowser.open('http://localhost:7777/oauth_request_token')

  while oauth_access_token is None:
    time.sleep(0.2)

  print 'Retrieved access token ' + oauth_access_token

  opener = urllib2.build_opener()
  opener.addheaders = [('Authorization', 'Bearer ' + oauth_access_token)]
  images = opener.open(IMAGES_URL)
  print images.read()

Building Docker images with SBT

07/26/2015

A typical way to set up Jenkins is to connect it to your source repository (e.g. with the Git Plugin), run your tests after each commit, and then build a package for deployment when the tests pass. We’ll use SBT’s sbt-native-packager for this last step, which allows you to package your applications in numerous formats including zip, deb, rpm, dmg, msi, and docker.

To set up sbt-native-packager to publish your Docker images, you need to add sbt-native-packager to your project and specify your Docker repo in your build.sbt, e.g. dockerRepository := Some("quay.io/myorganization"). You then need to set up the credentials to publish to your Docker repository. These typically go in ~/.dockercfg. You can place the .dockercfg in the Jenkins home directory, which on Ubuntu is located by default at /var/lib/jenkins/.

The next thing you need to set up is the build step that builds the Docker image. This can be a bit confusing because Jenkins has both build steps and post-build actions, and it’s not completely clear what the difference is. I’ve found that a build step does what we want. You can use the Jenkins SBT Plugin to run your sbt tests with each commit. Then, to build a Docker image, click “Add build step” followed by “Build using sbt” and enter “docker:publish” in the Actions field.

Another thing you may need to deal with is SBT sub-projects. E.g. let’s assume you have a project named “myproj” which depends on other libraries. You can set "project myproj" docker:publish in the Jenkins build step so that SBT switches to the myproj project before building the Docker image rather than trying to run docker:publish on your sub-projects. If you’re using SBT’s aggregation to compile or run the tests of these sub-projects whenever you do the same for myproj, you’ll probably want to disable this when publishing the Docker image. You can do so by adding the setting aggregate in Docker := false to your build.sbt:

lazy val myproj = project
    .enablePlugins(DockerPlugin, GitVersioning, PlayJava, SbtWeb)
    .dependsOn(subproj).aggregate(subproj)
    .settings(
      aggregate in Docker := false  // when building Docker image, don't build images for sub-projects
    )

Note that you’ll have to handle garbage collection of old Docker images. Docker has this on their roadmap. Until then, I recommend Spotify’s Docker GC.

MongoDB data migration

07/07/2015

Here is some benchmarking data for transferring MongoDB data from one machine to another. These benchmarks were run on the AWS i2 instance class; a sketch of the full pipeline follows the list.

  • mongodump – 15 min / 100 GB
  • gzip using pigz – 15 min / 100 GB
  • network transfer – 20 min / 100 GB
  • extract archive – 30 min / 100 GB
  • mongorestore -j 12 – 2 hr / 100 GB
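
Here’s a minimal sketch in Python of the pipeline those numbers correspond to. The host name, paths, and worker count are assumptions, and it assumes passwordless SSH to the destination machine.

#!/usr/bin/env python
# Sketch of the dump / compress / transfer / restore pipeline benchmarked
# above. Hosts and paths are hypothetical -- adjust for your own setup.
import subprocess

DUMP_DIR = '/storage/dump'
ARCHIVE = '/storage/dump.tar.gz'
DEST = 'mongo-replica.example.com'

# Dump the database to BSON files (~15 min / 100 GB).
subprocess.check_call(['mongodump', '--out', DUMP_DIR])

# Compress with pigz, which parallelizes gzip across cores (~15 min / 100 GB).
subprocess.check_call('tar -cf - -C %s . | pigz > %s' % (DUMP_DIR, ARCHIVE),
                      shell=True)

# Ship the archive to the destination machine (~20 min / 100 GB).
subprocess.check_call(['scp', ARCHIVE, '%s:%s' % (DEST, ARCHIVE)])

# On the destination: extract (~30 min / 100 GB) and restore with 12 parallel
# collection workers (~2 hr / 100 GB).
subprocess.check_call(['ssh', DEST,
    'mkdir -p %s && pigz -dc %s | tar -xf - -C %s && mongorestore -j 12 %s' %
    (DUMP_DIR, ARCHIVE, DUMP_DIR, DUMP_DIR)])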

Vision and Culture at Connectifier

05/18/2015

There are an infinite number of things to focus on when building a company – building a product, marketing it, selling it, keeping the servers running, finding office space, recruiting a team, fundraising, accounting, payroll, benefits, legal, training. The list goes on forever. With so many things to work on, it’s a necessity to delegate. At the same time, all oars must be rowing in the same direction. One of the ways we’ve accomplished that at Connectifier is by having a strong vision and culture.

Connectifier makes talent search engine technology that helps recruiters discover, qualify, and connect with exceptional job candidates with roughly twice the efficiency of existing methods. The vision for Connectifier is to be able to instantly connect companies and job candidates with perfect matches given just a job description or resume. The few companies that have tried to put people into the right jobs have fallen far short of what’s possible today, for numerous reasons. E.g. there are tons of latent variables which are not taken into consideration. It’s so difficult to tell whether a position is a good match for someone’s professional abilities, skills, and interests. Recruiters aren’t equipped with tools that help them understand how the keywords on resumes and job postings relate. The culture of a company is often not easy for a candidate to discern until well into the interview process. Our vision for Connectifier is an exciting one, and one we can all relate to, having conducted our own job searches. It’s so important to each of us to be professionally fulfilled and to help others find professional fulfillment.

The other half of the equation for us in aligning the organization is Connectifier’s company culture. Culture is more than just a nice office with a ping pong table and cool t-shirts. When folks first visit the Connectifier office and meet everyone, they say they’re struck by the intellectual curiosity and high caliber of the team. We place a lot of emphasis on hiring the very best. Not every company requires that, and many perform better focusing on other combinations of traits. But we’re solving a very difficult technical problem where the team with the highest intellectual horsepower will outperform. We’ve seen this already as we’ve outperformed much larger competitors like Monster and Dice, which have tried to do some of the things that we’re doing with larger teams. But it’s highly unlikely that those companies are modifying the source code of the databases and web browsers used by their products in the same way that we are. Having the team that we do has enabled us to be bold in ways others can’t.

Building the highest quality team isn’t limited to our employees. We have great investors and advisers. Connectifier joined Launchpad LA early on. We thought it was important that Connectifier join the So Cal startup scene, so that folks interested in joining high-growth companies would know who we are. We also have great members of our team at our accounting firm and law firm. And we’ve consulted with numerous firms for projects such as penetration testing (we were recently nominated for an award due to our security efforts). There are so many ways to organize a company – build around functional roles, divisions, or geographies, create a cross-functional matrix, or a no-hierarchy holacracy. But without the right vision and culture, no organizational structure can be successful.

Learn more about working at Connectifier.

Injecting JUnit tests with Guice using a Rule

05/04/2015

GuiceBerry is a pretty helpful library for injecting JUnit tests with Guice. However, it’s not super actively maintained, and many of its methods and members are private, making it difficult to change its behavior. Here’s a class which essentially does what GuiceBerry does in a single class that you can edit yourself.

import org.junit.rules.MethodRule;
import org.junit.runners.model.FrameworkMethod;
import org.junit.runners.model.Statement;

import com.connectifier.data.mongodb.MongoConnectionFactory;
import com.connectifier.data.mongodb.MongoDBConfig;
import com.google.inject.Guice;
import com.google.inject.Injector;
import com.google.inject.Module;
import com.mongodb.DB;
import com.mongodb.MongoClientOptions;

public class DbRule implements MethodRule {

  private final Injector injector;
  
  public DbRule(Class<? extends Module> envClass) {
    try {
      this.injector = Guice.createInjector(envClass.newInstance());
    } catch (InstantiationException | IllegalAccessException e) {
      throw new IllegalStateException(e);
    }
  }
  
  @Override
  public Statement apply(Statement base, FrameworkMethod method, Object target) {
    return new Statement() {
      @Override
      public void evaluate() throws Throwable {
        try {
          injector.injectMembers(target);
          base.evaluate();
        } finally {
          runAfterTest();
        }
      }
    };
  }

  protected void runAfterTest() {
    DB db = MongoConnectionFactory.createDatabaseConnection(
        injector.getInstance(MongoDBConfig.class),
        injector.getInstance(MongoClientOptions.class));
    db.dropDatabase();
    db.getMongo().close();
  }

}

To use:

  @Rule
  public final DbRule env = new DbRule(DataEnv.class);

IntelliJ Setup

05/04/2015

The font rendering in IntelliJ is horrendous and makes you want to gouge your eyes out. This is because it uses Swing. In order to make this not completely horrible, you’ll need to install tuxjdk, which contains a series of patches to OpenJDK to enhance the user experience with Java-based and Swing-based tools. I also recommend installing the Monokai Sublime Text 3 theme.

If you install the Lombok plugin, then you’ll also need to set: Settings > Build …. > Compiler > Annotation Processing > Enable Annotation Processors

Formatting a Disk on Amazon EC2

02/10/2015

The following commands will format and mount your disk on a newly created EC2 machine:

sudo mkfs -t ext4 /dev/xvdb 
sudo mkdir /storage
sudo sed -i '\|^/dev/xvdb| d' /etc/fstab # delete existing entry if it exists
sudo sh -c 'echo "/dev/xvdb /storage ext4 defaults,nobootwait,noatime,nodiratime 0 2" >> /etc/fstab'
sudo mount -a

HTTP API Design

12/12/2014

Here are some things I consider when designing a web API. A small Flask sketch pulling several of them together follows the lists.

Consider using the following response codes:

  • 200 – OK
  • 400 – Bad Request
  • 401 – Unauthorized (i.e. authentication error)
  • 403 – Forbidden (i.e. not authorized)
  • 404 – Not Found
  • 500 – Internal Server Error

  • Version your API
  • Use limit and offset for pagination
  • Return JSON responses by default with camel-case property names
  • Append an extension to the URL to indicate other types (e.g. /person/123.xml)
  • Host APIs on a subdomain like api.yelp.com
  • Use OAuth 2.0 for authentication
  • Pretty print the results by default
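
Here’s a minimal Flask sketch of several of these guidelines; the /v1/people resource and its fields are hypothetical.

#!/usr/bin/env python
# A sketch of the guidelines above; the /v1/people resource is hypothetical.
import json

from flask import Flask, Response, request

app = Flask(__name__)

PEOPLE = [{'id': i, 'firstName': 'Person %d' % i} for i in range(100)]

# Version the API in the URL path.
@app.route('/v1/people')
def list_people():
  try:
    # Use limit and offset for pagination.
    limit = int(request.args.get('limit', 10))
    offset = int(request.args.get('offset', 0))
  except ValueError:
    # 400 Bad Request for malformed input.
    return Response('{"error": "limit and offset must be integers"}',
                    status=400, mimetype='application/json')
  page = PEOPLE[offset:offset + limit]
  # Camel-case property names, pretty printed by default, returned with 200 OK.
  body = json.dumps({'results': page, 'limit': limit, 'offset': offset}, indent=2)
  return Response(body, mimetype='application/json')

if __name__ == '__main__':
  app.run(port=8080)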

Running Marathon and Mesos with Panamax

09/03/2014

Technology Overview

Panamax is a new tool that allows you to manage multiple Docker containers and to link them together. In this post, I’ll talk about creating a Panamax template which will allow you to run Marathon and Mesos in Docker containers. Mesos is a cluster manager, which allows you to run many jobs in a fault-tolerant manner. It can scale to thousands of machines and is well suited for running large jobs like Hadoop or running many different services in a microservice architecture. Marathon is a Mesos framework which provides a UI for scheduling jobs on Mesos. Marathon and Mesos both rely on a distributed application called Zookeeper to store configuration information. Panamax is very helpful in wiring together Marathon, Mesos masters, Mesos slaves, and Zookeeper instances.

Running Panamax

Panamax has some great installation instructions. Locally it depends on Vagrant and VirtualBox to create a CoreOS instance on which to run the Docker containers. I got a bit hung up running it for the first time because the VM wouldn’t start. I debugged this problem by opening the VirtualBox UI and running the VM manually. It turns out that I didn’t have virtualization extensions turned on in my BIOS on this computer yet, so I got the error message “VT-x is disabled in the BIOS.” Most computers have VT-x disabled by default as a security precaution, so if you’ve never turned VT-x on, you’ll have to do so.

Creating a Panamax application

The first step of creating a Panamax application is to find Docker containers to use. This part was trickier than I imagined, given that this was my first time using Docker. I first tried to use the thefactory/marathon Docker image. However, it turned out that the version they published did not match what was in the Docker description because the DockerHub automated build didn’t build one of their commits, and so Marathon wouldn’t actually run. I filed a bug on this issue and it has since been fixed, so it would be a great image to try again. It’s always good to review the Docker images you use. E.g. I ended up using the redjack/mesos-master image and saw that it was doing some of its software installation over insecure HTTP, so I sent them a commit, which they merged, to change it to HTTPS. I also saw that it was using Ubuntu 14.04 but the Mesos install for 12.04, so I sent a pull request to have it use the correct Mesos install and upgrade to Mesos 0.20 at the same time.

One problem with the way I set things up was that the initial download of all the images takes a long time. I used images from a few different sources, and they all used slightly different base OS images. They’re quite big, nearing 1GB each, and each needs to be downloaded. If they shared the same base image, it would only have to be downloaded once. Now that the issue with the thefactory images is fixed, it’d probably be worth giving those images another shot in order to speed up usage of the Panamax template.

One of the things you’ll have to figure out is how to pass configuration information to your Docker containers. I passed some command line flags directly in the Docker run command. Another great strategy is to run services with a wrapper script that reads config from environment variables, as is done in this script in the CenturyLink MySQL Docker image. A rough sketch of that pattern follows.
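
Here’s a minimal sketch of such a wrapper in Python; the service name, flags, config path, and variable names are all hypothetical.

#!/usr/bin/env python
# Hypothetical wrapper: read config from environment variables, write the
# service's config file, then replace this process with the service itself.
import os

CONFIG_TEMPLATE = 'master_host=%(MESOS_MASTER)s\nport=%(PORT)s\n'

config = {
  'MESOS_MASTER': os.environ.get('MESOS_MASTER', 'localhost:5050'),
  'PORT': os.environ.get('PORT', '8080'),
}

with open('/etc/myservice.conf', 'w') as f:
  f.write(CONFIG_TEMPLATE % config)

# exec so the service receives signals directly from Docker.
os.execvp('myservice', ['myservice', '--config', '/etc/myservice.conf'])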

Running the template

You can find my template by searching for “Marathon on Mesos and Zookeeper” from the Panamax Contest Templates. It has some great instructions for getting started, so I won’t rehash them here. After the various images are up and running and you’ve set the required settings, you should be able to see a Marathon screen like the following:

[Screenshot: a sample app running in Marathon]

Things to watch out for

Panamax seems to struggle with being disconnected from the internet while downloading an image, so be sure you have time to wait for your downloads to complete. As long as you’re plugged in and not going anywhere, you shouldn’t have any problems. The other issue I had was trouble saving my Panamax template because it wasn’t dealing well with GitHub accounts with lots of repos. That issue has already been fixed, which is evidence of how quickly this project is moving. I also wasn’t sure whether it’s possible to test local Docker images as part of a Panamax application, so it seems you’ll want to publish any images you plan to use.

You’ll also have to be careful to create good documentation for your Panamax templates and to use templates with good docs. I saw that someone else had posted a Mesos template, so I tried it out to see how it would compare to mine, but was unable to run it. I thought for a while that it was broken and wouldn’t work, but I now think it’s probably a case of missing documentation instead. However, those missing docs could cause hours of debugging. Panamax is really easy to use and has a nice UI, but there’s still technology under the covers that has to be configured correctly when using it.

Future improvements

The thing I’d most like to see change is for Marathon to offer better authentication and authorization support. I’ve submitted a pull request to the Chaos Web Framework, which was created for use by Marathon and Chronos, to make this possible.

Marathon on Panamax template interest

This blog post was mentioned in the CenturyLink Labs newsletter. I tweeted about this template and it was favorited or retweeted by several folks including Marc Averitt (Managing Director of Okapi Venture Capital) and AllThingsMesos. Ross Jimenez (Director of Software at CenturyLink Labs) tweeted as well and was retweeted and favorited by several folks including Florian Leibert (Founder of Mesosphere), the Panamax Project, and Lucas Carlson (CIO of CenturyLink Labs and CEO of AppFog). Grégory Horion said this was his favorite Panamax template (besides the Locomotive CMS template he created :-) and this Tweet was favorited by the engineering team at Twitter. Seen this template mentioned other places? Let me know!

What’s next for Panamax

Panamax is a very cool project. One of the biggest things that the Panamax team is working on is support for multiple hosts. Things will really start to get fun then. It will be very cool to see this deployed in production. I can see web hosts really loving something like this, since it’s great for running software like WordPress where there are multiple components that need to be linked together such as PHP, Apache, and MySQL.

How to take over the computer of a Jenkins user

08/14/2014

I recently began using Jenkins and found quite a bit of security indifference. This is unfortunate because Jenkins is the world’s leading continuous integration server, used for testing, building, and deploying code. According to RebelLabs, Jenkins has 70% market share, with the next closest competitor having only 9%. I’ve raised these issues with the Jenkins team and have received only dismissive responses thus far. The responses I’ve received, and the fact that Jenkins has over 50 open bugs filed against it which are categorized as critical security issues, leave me with little confidence that the team will move on these issues unless attention is drawn to them, which is why I’ve written this post.

Unsecure installation

Let’s start at the beginning and walk through the install instructions. The very first step on Ubuntu is:

wget -q -O - http://pkg.jenkins-ci.org/debian/jenkins-ci.org.key | sudo apt-key add -

Here are the first two steps on Redhat:

sudo wget -O /etc/yum.repos.d/jenkins.repo http://pkg.jenkins-ci.org/redhat-stable/jenkins.repo
sudo rpm --import http://pkg.jenkins-ci.org/redhat-stable/jenkins-ci.org.key

If you haven’t noticed anything wrong yet, you’re not alone. I didn’t either the first time I followed these instructions. The issue here is the http://. When you download software from a Linux repository, the system verifies downloaded packages against a gpg signature. Debian has been using strong crypto to validate downloaded packages since 2005, so this is a long standing best practice. However, if you download this signature over an insecure channel, then there is little point because anyone who could deliver a malicious package could also deliver a malicious signature. For this reason, you should only use https with “apt-key add” or else you are rendering void any security it provides. Indeed if you Google “apt-key add” the very first result you get is a StackOverflow post which says “adding keys you fetch over non-HTTPS breaks any security that signing packages added. Wherever possible, you should download keys over a secure channel (https://)”. If only Jenkins would properly configure their SSL certificate for downloading this file and update their docs to suggest https!

Unsecure updates

Jenkins by default loads the URLs to use for updating plugins from http://updates.jenkins-ci.org/update-center.json. This is a problem because Jenkins will download and install whatever package URLs are listed in this file, so if an attacker can modify this file they can install whatever malicious plugins they want. I attempted to remedy this with a one-character pull request to change http to https, which was rejected as being too load-intensive for the Jenkins servers. I was told on the bug that I filed for the issue that there’s a signature embedded within the file which makes it secure. The problem here is that you need a key which you received securely to check that signature. Because the key is delivered over HTTP, as already discussed, much of its value is lost.

Unsecure plugins

A response I’ve gotten to the preceding issue is “You realize that anyone with a Jenkins-ci.org account can release updates to any plugin, right?” So why bother delivering widely used plugins securely when they could be malicious before they ever leave the Jenkins servers? I could update all the most popular Jenkins plugins with malicious code and no doubt thousands of people would update their plugins and find themselves running malicious code. The plugins are all open source, but I have no idea if I’m running the code that I see open sourced. An attacker could download the code for a plugin, modify it in an evil manner, and release an update to that plugin and there’s no way to know whether the code downloaded matches what is in the open source repository.

The irony here is almost killing me. Using Jenkins to build the plugins instead of letting “anyone with a Jenkins-ci.org account” build them would be a great solution to this problem. I was told that fixing this problem would violate “Jenkins project core principles, so you should probably build a better case than ‘this is wrong’ before you bring it up on the dev list.” Without further explanation, I’m left wondering why closing security holes would violate Jenkins project core principles. Looking at the core principles only seems to reinforce the idea that these problems should be fixed. It would lower the barrier to entry by making it so plugin developers don’t need to figure out how to publish plugins, since a continuous integration server could do it for them. It seems meritocratic to fix security issues raised by the community. It would increase transparency to know that you’re running the code you see available on GitHub and not some attacker’s code. It would not affect compatibility or code licensing. It certainly would be a more automated solution (someone get Alanis Morissette on the phone before I die).

Unsecure for contributors

You can’t even work on Jenkins without facing security problems. If you try to write a plugin for Jenkins, for example, the docs suggest you add the following to your Maven settings:

      <repositories>
        <repository>
          <id>repo.jenkins-ci.org</id>
          <url>http://repo.jenkins-ci.org/public/</url>
        </repository>
      </repositories>
      <pluginRepositories>
        <pluginRepository>
          <id>repo.jenkins-ci.org</id>
          <url>http://repo.jenkins-ci.org/public/</url>
        </pluginRepository>
      </pluginRepositories>

Again, downloading software over http is not secure. I was told this is a “cosmetic issue” when I filed a bug, though I’m hoping the engineer the bug is assigned to will see that telling users to connect over http is a bit more than that. To help demonstrate this point, I linked in my bug report to an article which shows how to exploit exactly this problem. As a result of that article, Sonatype (who host the most popular Maven repository) is turning on SSL for all users. It is not yet apparent that this will sway anyone working on Jenkins.

Consequences

So what can you do by getting someone to install a malicious version of a Jenkins server or plugin, and how hard is it? Well, there’s already a proof-of-concept for launching a man-in-the-middle attack against a Maven repository http download, and it’s pretty basic code, so I think it’s fair to say it can be done. If you go to a Jenkins Meetup, there’s a chance you’ll be able to snag someone downloading some Jenkins-related software over an unsecured wi-fi connection and be able to infect them. The types of folks who would install Jenkins on their laptops are also somewhat likely to have access to production systems at their companies. And because Jenkins is used to build software, a malicious version could potentially inject further maliciousness into the software it’s building or leak that software’s source code to an attacker.

If you care about building secure software, I hope that you’ll ask the Jenkins team to fix these issues and make sure other Jenkins users are familiar with these holes until then. You can also check out https://www.connectifier.com/careers.
