Tag Archives: Code

Looking Back at EMC World 2016

Wow! How quickly a week can go by. Like many of you, EMC World 2016 was my first time in attendance and it also happen to be the first time I have been given the opportunity to be a presenter for a larger audience. I though the experience exceeded my expectations and based on some of the preliminary numbers and feedback that we have been getting on the sessions the EMC {code} team had presented, a good number of you agree the sessions content and presentations were of value to you. Thanks again for attending the sessions and providing your feedback.

Couldn’t make it this year?

For those that couldn’t make it out this year, a number of people in {code} have starting posting the materials and slide decks for our sessions. Official EMC World slide decks should be posted in the coming weeks, but there have been a large number of requests to get a hold of the material ASAP and many of us on the team have been happy to oblige. As for my sessions, you can find the materials below.

Introduction To Mesos & Mesosphere

Here is the session material for Introduction To Mesos & Mesosphere (Monday May 2 at 8:30) which just as the title says is a Apache Mesos 101 type session.

You can download the “Introduction To Mesos & Mesosphere” powerpoint presentation HERE. The video of the demonstration used at the end of the session highlighting Mesos using persistent external storage can be found on YouTube below:

The source code for the MVC Web Application written in Golang can be found in my GitHub repo. The two projects used in demo were RestServer and RestClient.

To launch the MVC Application with external persistent storage, you first need to have each of your Mesos Agent/Slave nodes running Mesos DNS and configured for persistent external storage using this Guide. Once you have those prerequisites in your Mesos Cluster, you can find the Marathon JSON files to launch tasks here. To start up the application, perform the following:

Start PostgreSQL:
curl -k -XPOST -d @postgres-mvc.json -H "Content-Type: application/json" YourMarathonIP:8080/v2/apps

Start RestServer:
curl -k -XPOST -d @restapi.json -H "Content-Type: application/json" YourMarathonIP:8080/v2/apps

Start RestClient:
curl -k -XPOST -d @ui.json -H "Content-Type: application/json" YourMarathonIP:8080/v2/apps

Deep Dive With Mesos & Persistent Storage For Applications

Here is the session material for Deep Dive With Mesos & Persistent Storage For Applications (Tuesday May 3 at 3:00) which covered the importance of Apache Mesos Frameworks and the powerful capabilities that 2 layer scheduling provides in your Datacenter and Mesos cluster.

You can download the “Deep Dive With Mesos & Persistent Storage For Applications” powerpoint presentation HERE. The video of the demonstration used at the end of the session highlighting the Elastic Search Mesos Framework using persistent external storage can be found on YouTube below:

To launch the Elastic Search Framework with external persistent storage, you first need to have at least a 3 Agent/Slave nodes in your Mesos cluster and each of your Mesos Agent/Slave nodes configured for persistent external storage using this Guide. To start the ElasticSearch scheduler, you can find the Marathon JSON files to launch task here. To start up the Scheduler, perform the following:

Start ElasticSearch Scheduler:
curl -k -XPOST -d @elasticsearch.json -H "Content-Type: application/json" YourMarathonIP:8080/v2/apps

If you want to run some of the advanced ElasticSearch functionality used in the demo, you can find additional information in this file here.

What’s Next…

After recharging for a bit, we have already started on our post-EMC World plans and deliverables. Hopefully this will bring a forth a bunch of interesting ideas and projects for the community. To keep up to date with the things that I will be working on, please follow me on Twitter at @dvonthenen. If anyone has any questions about the EMC World presentations, you can always catch me on the {code} Community Slack channel.

twittergoogle_plusredditlinkedinmail

Getting Ready for EMC World 2016

We are getting closer and closer to EMC World 2016. I have to admit, its approaching crazy fast. This will be my first time attending EMC World. Seems odd saying that as I have attended many conferences in my career, but never the one my company throws. This time its going to be a different conference going experience as I will be presenting two sessions in the “Code and Modern Operations” track this year. I am very excited for this opportunity to talk about things that are interesting to me and I hope that are of interest to others out there in the open source community.

Apache Mesos

The first session is Introduction To Mesos & Mesosphere. This is basically an Apache Mesos 101 type session with a focus on the company, Mesosphere, whom pushes the direction of Mesos. I will be co-presenting this session with Somik Behera from Mesosphere. For those that haven’t heard about Mesos or looking to learn more about it, this is a excellent session outlining why Mesos is among the best workload schedulers in the datacenter and why its the preferred choice among companies looking to scale their applications. You can catch us both Monday morning (May 2) at 8:30am. Yes, you read that right. Going to be difficult for people to drag themselves out of bed that early in the morning… this is Las Vegas after all. Hope you all can make it!

Deep Dive

My second session is Deep Dive With Mesos & Persistent Storage For Applications. After getting some time to digest the information in my previous Mesos 101 session, this will dive into some of the internals of Mesos as we explore Mesos Frameworks and 2 layer scheduling. We will discuss what 2 layer scheduling means and how external storage can enhance the story around applications leveraging Frameworks. For the architects, operators, and consumers of Mesos, this session is packed with things you need to know in order to make your applications function efficiently and be highly available in order to avoid the train wreck. Ultimately the goal of talk is to enable you to put things on autopilot so you don’t need to manage the application.

Train Wreck

If you haven’t purchases tickets for EMC World, I would highly recommend you do as soon as possible. The EMC {code} team has a huge presence this year as we have our own booth along with our own session track, Code and Modern Operations, which I eluded to earlier. The {code} team collectively has 21 sessions at the conference this year talking about everything from Docker, managing larger open source communities, Mesos, and contributing to open source just to name a few. I will have a follow up blog post just before EMC World highlighting some of the other sessions you might want to check out. Catch you all later!

twittergoogle_plusredditlinkedinmail

Enabling External Storage on Mesos Frameworks

There has been a huge push to take containers to the next level by twisting them to do much more. So much more in fact that many are starting to use them in ways that were never originally intended and even going against the founding principles of containers. The most noteworthy principle of containers being left on the designing room floor is without a doubt is being “stateless”. It is pretty evident that this trend is only accelerating… just doing a simple search of popular traditional databases in Docker Hub yields results like MySQL, MariaDB, Postgres, and OracleLinux in Docker Hub (Oracle suggests you might try running an Oracle instance in a Docker container. LAF!). Then there is all the NoSQL implementations like Elastic Search, Cassandra, MongoDB, and CouchBase just to name a few. We are going to take a look to see how we can bring these workloads into the next evolution of stateful containers using the Mesos Elastic Search Framework as a proposed model.

After Though

The problem with stateful containers today is that pretty much every implementation of a container whether its Docker, Apache Mesos, or etc has been architected with those original principles, such as being stateless, in mind. When the container goes away, anything associated with the container is gone. This obviously makes it difficult to maintain configuration or keep long term data around. Why make this tradeoff to begin with? It keeps the design simple. The problem is useful applications all have state somewhere. As a response, container implementations enabled the ability to store data on the local disks of compute nodes; thus tying workloads to a single node. However on the failure of a particular node, you could potentially lose all data on the direct attached storage. Not good for high availability, but at least there was some answer to this need.

Enter the Elastic Search Mesos Framework

This brings me to a recently submitted Pull Request to the Elastic Search (ES) Mesos Framework project on GitHub to add support for External Storage orchestration, but more importantly to enable management of those external storage resources among Mesos slave/agent nodes. Before I jump into talking about the ES Framework, I probably should quickly talk about Mesos Frameworks in general. A Mesos Framework is a way to specialize a particular workload or application. This specialization can come in the form of tuning the application to best utilize hardware, like a GPU for heavy graphics processing, on a given slave node or even distributing tasks that are scaled out to place them in different racks within a datacenter for high availability. A Framework is consists of 2 components a Scheduler and an Executor. When an resource offer is passed along to a Scheduler, the Scheduler can evaluate the offers, apply its unique view or knowledge of its particular workload, and deploy specialized tasks or applications in the form of Executors to slave/agent nodes (seen below).

Mesos Architecture - Picture Thanks to DigitalOcean

The ES Framework behaves in the same way described above. The design of the ES Scheduler and Executor have been done in such a way that both components have been implemented in Docker containers. The ES Scheduler is deployed to Marathon via Docker and by default the Scheduler will create 3 Elastic Search nodes based on a special list of criteria to meet. If the offer meets that criteria, an Elastic Search Executor in the form of a Docker container will be created on the Mesos slave/agent node representing the output for that offer. Within that Executor image holds the task which in this case is an Elastic Search node.

Deep Dive: How the External Storage Works

Let’s do a deep dive on the Pull Request and discuss why I made some of the decisions that I did. I first broke apart the OfferStrategy.java into a base class containing everything common to a “normal” Elastic Search strategy versus one that will make use of external storage. Then the OfferStrategyNormal.java retains the original functionality and behavior of the ES Scheduler which is on by default. Then I created the OfferStrategyExternalStorage.java which removes all checks for storage requirements. Since the storage used in this mode is all managed externally, the Scheduler does not need to take storage requirements into account when it looks at the criteria for deployment.

The critical piece in assigning External Volumes to Elastic Search nodes is to be able to uniquely associate a set of Volumes containing configuration and data of each elastic search node represented by /tmp/config and /data. That means we need to create, at minimum, runtime unique IDs. What do I mean runtime unique? It means that if there exists a ES node with an identifier of ID2, there exists no other node with an ID2. If an ID is freed lets say ID2 from a Mesos slave/agent node failure, we make every attempt to reuse that ID2. This identifier is defined as a task environment variable as seen in ExecutorEnvironmentalVariables.java.

addToList(ELASTICSEARCH_NODE_ID, Long.toString(lNodeId));
LOGGER.debug("Elastic Node ID: " + lNodeId);

private void addToList(String key, String value) {
  envList.add(getEnvProto(key, value));
}

Why an environment variable? Because when the task and therefore the Executor is lost, the reference to the ES Node ID is freed so that when a new ES Node is created, it will replace the failed node and the ES Node ID will be recycled. How do we determine what Node ID we should be using when selecting a new or recycling a Node ID? We do this using the following function in ClusterState.java:

public long getElasticNodeId() {
    List taskList = getTaskList();

    //create a bitmask of all the node ids currently being used
    long bitmask = 0;
    for (TaskInfo info : taskList) {
        LOGGER.debug("getElasticNodeId - Task:");
        LOGGER.debug(info.toString());
        for (Variable var : info.getExecutor().getCommand().getEnvironment().getVariablesList()) {
            if (var.getName().equalsIgnoreCase(ExecutorEnvironmentalVariables.ELASTICSEARCH_NODE_ID)) {
                bitmask |= 1 << Integer.parseInt(var.getValue()) - 1;
                break;
            }
        }
    }
    LOGGER.debug("Bitmask: " + bitmask);

    //the find out which node ids are not being used
    long lNodeId = 0;
    for (int i = 0; i < 31; i++) {
        if ((bitmask & (1 << i)) == 0) {
            lNodeId = i + 1;
            LOGGER.debug("Found Free: " + lNodeId);
            break;
        }
    }

    return lNodeId;
}

We get the current running task list, find out which tasks have the environment variable set, build a bit mask, then walk the bitmask starting from the least significant bit until we have a free ID. Fairly simple. As someone who doesn’t run Elastic Search in production, it was pointed out to me this would only support 32 nodes so there is a future commit that will be done to make this generic for an unlimited number of nodes.

Let’s Do This Thing

To run this, you need to have a Mesos configuration running 0.25.0 (version supported by the ES Framework) with at least 3 slave/agent nodes, you need to have your Docker Volume Driver installed, like REX-Ray for example, you need to pre-create the volumes you plan on using based on the parameter --frameworkName (default: elasticsearch) appended with the node id and config/data (example: elasticsearch1config and elasticsearch1data), and then start the ES Scheduler with the command line parameter --externalVolumeDriver=rexray or what ever volume driver you happen to be using. You are all set! Pretty easy huh? Interested in seeing more? You can find a demo on YouTube located below.

BONUS! The Elastic Search Framework has a facility (although only recommended for very advanced users) for using the Elastic Search JAR directly on the Mesos slave/agent node and in that case, code was also added in this Pull Request to use the mesos-module-dvdi, which is a Mesos Isolator, to create and mount your external volumes. You just need to install mesos-module-dvdi and DVDCLI.

The good news is that the Pull Request has been accepted and it is currently slated for the 8.0 release of Elastic Search Framework. The bad news is the next release looks like to be version 7.2. So you are going to have to wait a little longer before you get an official release with this External Volume support. HOWEVER if you are interested in test driving the functionality, I have the Elastic Search Docker Images used for the YouTube video up on my Docker Hub page. If you want to kick the tires first hand, you can visit https://hub.docker.com/r/dvonthenen/ for images and instructions on how to get up and running. The both the Scheduler image (and the Executor image) were auto created as a result of the gradle build done for the demo.

Frameworks.NEXT

What’s up next? This was a good exercise in adding on to an existing Mesos Scheduler and Executor and the {code} team may potentially have a Framework of our own on the way. Stay tuned!

Drops the Mic

twittergoogle_plusredditlinkedinmail

Difficulties with Multi-Version Mesos Support

I have been working with Apache Mesos for some time now and after a recent meeting with the Mesosphere team (the company largely driving the roadmap and development effort), I have come to learn that some Mesos users sit on the bleeding edge of releases and others that are sensitive to things like security, change management, and stability are more inclined to be a couple or several versions behind. This practice isn’t anything new, it took EMC IT a long time to adopt Windows 7… heck, Windows 10 is out now and we are still sitting at Windows 7. Not saying that is necessarily a bad thing..

Why you wait for the first Service Pack

So we have to accept the fact that we need to support multiple versions and take that into consideration when building components for Mesos. This takes us to the most recent release (0.4.0) of mesos-module-dvdi which supports multiple versions spanning from 0.23.1 to the most recent release of mesos, 0.26.0. A side note, for those not familiar with mesos-module-dvdi its a Mesos isolator module that can create and mount external storage volumes, such as ScaleIO, Amazon EBS, or etc, and provide persistent storage for Applications created on Mesos. If you are interested, you can find more information on its GitHub page.

Where’s the Beef?

The mesos-module-dvdi is a C++11 project and as such needs to implement defined Mesos interfaces so that the module can be loaded and invoked from a Mesos slave (think of it as a simple node that launches tasks). If you take a look at the Isolator interface between versions, you will immediately notice that the interface varies wildly between versions and that modules created for one version will not work on another version of Mesos. Heck, even the interface class name you need to implement has changed:

In 0.23.1:
class IsolatorProcess : public process::Process

process::Future<Nothing> recover(
const std::list<ExecutorRunState>& states,
const hashset<ContainerID>& orphans);

process::Future<Option<CommandInfo>> prepare(
const ContainerID& containerId,
const ExecutorInfo& executorInfo,
const std::string& directory,
const Option<std::string>& rootfs,
const Option<std::string>& user);

In 0.24.1:
class DockerVolumeDriverIsolator: public mesos::slave::Isolator

virtual process::Future<Nothing> recover(
const std::list<ContainerState>& states,
const hashset<ContainerID>& orphans);

virtual process::Future<Option<ContainerPrepareInfo>> prepare(
const ContainerID& containerId,
const ExecutorInfo& executorInfo,
const std::string& directory,
const Option<std::string>& user);

This is not good. This is basically Mesos’ version of Microsoft’s DLL Hell. As an experienced C++ developer, defining a C++ interface that is going to ensure backwards compatibility is very difficult. In previous projects I have worked on, the interfaces not only needed to be backwards compatible but also cross-platform (yes, Windows support) all from a single code base. That adds further complexity. These days some of these difficulties can easily be side stepped by standing up a REST endpoint. Even the worst REST APIs by in large “get the job done” even if the API is horribly designed. For example, some might argue if the API does not follow CRUD, HATEOAS, or etc they are bad designs, but I digress.

DLL Hell

A Meh Solution

So the current solution to support multiple versions of the Mesos Isolator interface from a single code base has been to use ifdefs. Yea, unfortunately the solution isn’t as elegant as it could be, but with the break in backwards compatibility in the Isolator interface, it leaves us with very little choice. This is also compounded by the fact that this isn’t the only interface that is changing between versions. There are other Mesos support libraries in which the interface is in flux and adding an abstraction layer among all these functions, classes, and structures might take you down a rabbit hole that will leave you rocking yourself in the fetal position for months crying “so many functions”.

Since we are using ifdefs, that unfortunately means our solution is a compile time solution and we therefore need to build every version that we plan on supporting. We do that in our Makefile as see below.

In the Makefile on lines 8 and 818 respectively:
ISO_VERSIONS := 0.23.1 0.24.1 0.25.0 0.26.0

$(foreach V,$(ISO_VERSIONS),$(eval $(call ISOLATOR_BUILD_RULES,$(V))))

Then we take that version and create an integer representation of the version by stripping out the periods from the version we are compiling. Represented by the MESOS_VERSION_INT preprocessor directive below. An example of that would be version 0.24.1 becomes 0241.

In the Makefile lines 790-795:
$$(ISO_MAKEFILE_$1): CXXFLAGS=-I$$(GLOG_OPT_DIR)/include
-I$$(PICOJSON_OPT_DIR)/include
-I$$(BOOST_OPT_DIR)
-I$$(PBUF_OPT_DIR)/include
-DMESOS_VERSION_INT=$$(subst .,,$1)
$$(ISO_MAKEFILE_$1): $$(ISO_CONFIGURE_$1) $$(ISO_DEPS_$1)

Then we compile the project using the MESOS_VERSION_INT and make alterations to the code to handle the ifdefs.

In the docker_volume_driver_isolator.hpp:
if MESOS_VERSION_INT != 0 && MESOS_VERSION_INT < 0240<br />
class DockerVolumeDriverIsolator: public mesos::slave::IsolatorProcess<br />
else<br />
class DockerVolumeDriverIsolator: public mesos::slave::Isolator<br />
endif

if MESOS_VERSION_INT != 0 && MESOS_VERSION_INT < 0240<br />
process::Future<Nothing> recover(
const std::list<ExecutorRunState>& states,
const hashset<ContainerID>& orphans);<br />
else<br />
virtual process::Future<Nothing> recover(
const std::list<ContainerState>& states,
const hashset<ContainerID>& orphans);<br />
endif

if MESOS_VERSION_INT != 0 && MESOS_VERSION_INT < 0240<br />
process::Future<Option<CommandInfo>> prepare(
const ContainerID& containerId,
const ExecutorInfo& executorInfo,
const std::string& directory,
const Option<std::string>& rootfs,
const Option<std::string>& user);<br />
else<br />
process::Future<Option<ContainerPrepareInfo>> prepare(
const ContainerID& containerId,
const ExecutorInfo& executorInfo,
const std::string& directory,
const Option<std::string>& user);<br />
endif

So that is it. Again, not the most elegant solution but unfortunately addressing the backwards compatibility is going to be very difficult to resolve going forward as it would require retrofitting older versions of Mesos. That could get pretty ugly to change architecture in the form of patches or even a minor release. Odds are this is not going to go away any time soon. I hope this helps other developers out there looking to create Mesos Isolators (and Mesos Frameworks because it looks like the Framework interfaces may have the same problem).

twittergoogle_plusredditlinkedinmail