Note (2017): this draft document was never completed. I am publishing it in its incomplete, draft state because there are a number of good links in the document – reader beware.
In this second part of my trip report I have included my notes on each of the sessions I attended. The purpose of this post is twofold: first, to archive my notes and catalog the presentation materials; second, to convey my impressions of the sessions I was able to attend. This post is not so much an opinion piece as an archive for future reference.
1. An Elasticsearch Crash Course
- Just a distributed document store.
- Uses URL for key into data
- Lucene as the underlying index engine.
- TODO: BLOG - “Hello, world Elastic Search”
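The "URL as key" note can be sketched as follows. This is only a minimal sketch, assuming a node on `localhost:9200` and a hypothetical index named `talks`; the `_doc` path segment is the form used by recent Elasticsearch releases (2014-era versions put a type name in that position). The code builds the request but does not send it.

```python
import json
import urllib.request

# Assumption: a local single-node cluster on the default HTTP port.
BASE_URL = "http://localhost:9200"


def doc_url(index, doc_id):
    """The URL is the key: index name plus document id addresses one document."""
    return f"{BASE_URL}/{index}/_doc/{doc_id}"


def put_request(index, doc_id, document):
    """Build (but do not send) the HTTP request that stores a JSON document."""
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        doc_url(index, doc_id),
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# "talks" and the document fields are hypothetical, for illustration only.
req = put_request("talks", 1, {"title": "An Elasticsearch Crash Course"})
print(req.get_method(), req.full_url)
# Sending it would be: urllib.request.urlopen(req)  (requires a running node)
```

Fetching the document back would be a plain GET against the same URL, which is what makes the URL behave like a key.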
2. Is it safe to run applications in Linux Containers?
- root is root, inside the container as well.
A strategy: Rings of Security
- “defang it” - take away root privileges.
- place limits on resource usage
- use more secure kernels and secure them
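The first two rings can be illustrated at the level of a single process. This is a rough sketch, not what a container runtime does: it uses POSIX resource limits as a stand-in for control-group limits, and the helper names (`lower_limit`, `make_preexec`) and the limit values are my own assumptions. It is Unix-only.

```python
import os
import resource
import subprocess
import sys


def lower_limit(which, value):
    """Set soft and hard limit to `value`, never above the current hard limit."""
    _, hard = resource.getrlimit(which)
    if hard != resource.RLIM_INFINITY:
        value = min(value, hard)
    resource.setrlimit(which, (value, value))


def make_preexec(max_open_files, cpu_seconds, drop_to_uid=None):
    """Return a function that runs in the child just before exec()."""
    def preexec():
        # Ring: place limits on resource usage.
        lower_limit(resource.RLIMIT_NOFILE, max_open_files)
        lower_limit(resource.RLIMIT_CPU, cpu_seconds)
        # Ring: "defang it". Only applies if we are root and a uid was given.
        if drop_to_uid is not None and os.geteuid() == 0:
            os.setgroups([])
            os.setgid(drop_to_uid)
            os.setuid(drop_to_uid)
    return preexec


# The child reports the open-file limit it actually runs under.
child_code = "import resource; print(resource.getrlimit(resource.RLIMIT_NOFILE)[0])"
result = subprocess.run(
    [sys.executable, "-c", child_code],
    preexec_fn=make_preexec(max_open_files=64, cpu_seconds=10),
    capture_output=True,
    text=True,
    check=True,
)
child_limit = int(result.stdout)
print("child open-file limit:", child_limit)
```

Because the hard limit is lowered along with the soft limit, the child cannot raise it again without extra privileges.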
System Services (upper level):
- (sshd, cron, syslog …) - typically run as root
- usually don’t NEED to run as root
- often interact w/ /dev – Fix: “devices” control group
- /proc & /sys are “not perfectly namespaced”
System services (lower level):
- much less necessary, down to completely unnecessary, inside a “container”
VM in Container
The general impression given was that placing a VM inside a container provides the same level of isolation (i.e. security) as a VM running directly under a hypervisor with no containerization present.
- This position was neither defended nor questioned, but the audience seemed to accept the conclusion. It sounds like the community has accepted this. – [eg: look for “proof” of this]
VM in Container
The following is really VM in Docker
- https://github.com/jpetazzo/docker2docker – (Author: speaker of this session)
Immutable infrastructure
- build a new “operating system” for each change
- copy on write mentality.
- container becomes Read-Only – (a model for security)
- scalability becomes easier … it’s read-only.
3. Graph Theory - you need to know
- Based on lectures from Arthur Benjamin
- has a set of online lectures
“Wikipedia is good at 2 things: comic books and computer science” – Tim Berglund
4. Building a Massively Scalable Cloud Service from the Ground Up
This session was a presentation of JFrog’s experiences as part of the design and deployment of their Bintray project. Of particular note: JFrog chose to evolve away from AWS back to Physical infrastructure for this project.
- Were unhappy with the virtualization penalty
- Determined their elasticity did not require “dynamic” elasticity.
- Their hosting provider (physical hosting) can spin a new instance up in a few hours.
- Their demand is predictable to daily-ish levels.
5. Monitoring Distributed Systems in Real-time with Riemann and Cassandra
- TODO: Blog - Standing up Riemann
- Note: Used Coda Hale’s Metrics
6. Real-time Analytics with Open Source Technologies
- System’s goal: arbitrary exploration of analytical data.
- Hadoop was the obvious starting point, but was rejected due to its non-interactive nature.
“Exploration of analytical data” is NOT:
- Batch processing of bulk data
- looking for an individual event (thresholding, anomaly detection, ...) [eg: exploration is a middle ground]
Druid http://druid.io/ (@druidio) was started by the speakers.
- started in 2011
- targeting this problem specifically
- Low latency ingestion
- good “exploration” characteristics
The straw-man extended the prototype by continuing to use Hadoop in parallel with Storm. This addressed the long-tail problem of late-arriving data while also allowing for interactive exploration:

    Kafka --> Storm ---> Druid     == Real-Time path
          |--> Hadoop --^          == Batch, warehouse path
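One way to picture how the two paths combine is sketched below. This is my own illustration, not the speakers' code: it assumes the batch path periodically rebuilds complete hour buckets and that a rebuilt bucket replaces whatever the real-time path reported for that hour. All data and function names are hypothetical.

```python
from collections import Counter


def hourly_counts(events):
    """Count events per hour bucket; each event is (event_hour, payload)."""
    return Counter(hour for hour, _ in events)


def merged_view(realtime, batch):
    """Prefer the batch count for any hour the batch path has rebuilt."""
    view = dict(realtime)
    view.update(batch)
    return view


# Hypothetical data: one event for hour 9 arrived too late for the real-time path.
on_time = [(9, "a"), (9, "b"), (10, "c")]
late = [(9, "d")]

# Real-time path: only sees what arrived on time.
realtime = hourly_counts(on_time)

# Batch path: has reprocessed hour 9 from the warehouse, including the late event.
batch = hourly_counts([e for e in on_time + late if e[0] == 9])

view = merged_view(realtime, batch)
print(view)  # {9: 3, 10: 1}
```

Hour 10 is still served from the real-time numbers until the batch path catches up with it.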
7. Application Deployment and Auto-scaling On OpenStack using Heat
All about OpenShift. Almost no Heat.
8. How We Built a Cloud Platform Using Netflix OSS
Get the slide deck. There’s lots of good stuff in the deck.
- built their own discovery
- built their own configuration service
- Netflix “Bakery” - frozen image – good reasons this works better than config through composition.
- Multiple levels of configuration
- Program defaults
- baked in
- pushed with bootstrap
- pulled from config service
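The layering above can be sketched with a simple lookup chain. The precedence is an assumption on my part (each later level overrides the earlier ones, so the config service wins), and every key and value is hypothetical.

```python
from collections import ChainMap

# Each level only contains the settings it wants to override.
program_defaults = {"port": 8080, "log_level": "INFO", "region": "us-east-1"}
baked_in = {"log_level": "WARN"}          # frozen into the image at bake time
bootstrap = {"region": "eu-west-1"}       # pushed when the instance boots
config_service = {"port": 9090}           # pulled at runtime

# ChainMap searches left to right, so the most specific level comes first.
config = ChainMap(config_service, bootstrap, baked_in, program_defaults)

print(config["port"], config["log_level"], config["region"])
```

A setting that no level overrides falls through to the program default, which keeps each layer small.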
9. Mesos: Elastically Scalable Operations, Simplified
- Slides: http://mesosphere.io/slides/oscon-mesos-2014/ (reveal.js slide deck; use arrow keys)
- http://elastic.mesosphere.io - sandbox, pay AWS for instances.
- Mesos has a programmable API - focus of prior workshop.
- Focus of this talk is Operations.
- Mesos is “below PaaS, but above IaaS” - treats stuff as a giant resource pool.
- Speaks protobuf
- Mesos creates partitioned spaces using CGroups.
- Mesos has its own container, but can use other containers.
- Uses ZooKeeper for “leader election”, probably for distributed execution for the scheduler framework.
- Tutorials (that can be used with the sandbox) - http://mesosphere.io/learn
10. Multiple Datastores Working Together: Will It Blend?
11. Migrating to the Web Using Dart and Polymer - A guide for Legacy OOP Developers
12. Thinking in a Highly Concurrent, Mostly Functional Language
13. Big Data Pipeline and Analytics Platform Using NetflixOSS and Other Open Source Libraries
- “Data is the most important asset at Netflix”
- 3.2M msg/s
- This talk was about Netflix’s “Suro” project
- “Suro” - NetflixOSS “Data Pipeline” Java library
- feeds S3, and then Hadoop processes the data out of S3
- Get the deck for this talk, it goes through what some of the tooling is. (surprise, surprise)
- Netflix is using Druid for ad-hoc analytics investigation.
- Druid is in parallel to ElasticSearch, both feed from Kafka
- RxJava - functional REACTIVE programming model
- (Fault tolerant features slide)
- disk backed queue – “big-queue” <– OSS package
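The idea behind a disk-backed queue can be sketched in a few lines. This is not the “big-queue” API and says nothing about how that package is implemented; it is a toy under my own assumptions (one writer, one reader, JSON messages, hypothetical file names) that only shows why messages survive a restart.

```python
import json
import os
import tempfile


class DiskQueue:
    """Append-only file of JSON lines plus a persisted read offset."""

    def __init__(self, directory):
        self.data_path = os.path.join(directory, "queue.log")
        self.offset_path = os.path.join(directory, "queue.offset")
        open(self.data_path, "ab").close()  # create the log if it is missing

    def _read_offset(self):
        try:
            with open(self.offset_path) as f:
                return int(f.read() or 0)
        except FileNotFoundError:
            return 0

    def put(self, message):
        """Append one message and force it to disk before returning."""
        with open(self.data_path, "ab") as f:
            f.write(json.dumps(message).encode("utf-8") + b"\n")
            f.flush()
            os.fsync(f.fileno())

    def get(self):
        """Return the next unread message, or None if the queue is drained."""
        offset = self._read_offset()
        with open(self.data_path, "rb") as f:
            f.seek(offset)
            line = f.readline()
            if not line:
                return None
            new_offset = f.tell()
        with open(self.offset_path, "w") as f:
            f.write(str(new_offset))
        return json.loads(line)


workdir = tempfile.mkdtemp()
queue = DiskQueue(workdir)
queue.put({"event": "play"})
queue.put({"event": "pause"})
first = queue.get()

# Simulate a process restart: a new object picks up where the old one stopped.
restarted = DiskQueue(workdir)
second = restarted.get()
third = restarted.get()
print(first, second, third)
```

A real implementation would also need log truncation and protection against a crash between reading a message and saving the offset.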
14. A Presentation Toolbox that Just Might Blow Your Audience Away
Not covered, but recommended:
- Presentation Zen (have the book)
- Presentation Aikido - Damian Conway (YouTube, … it’s a presentation)
reveal.js - https://github.com/hakimel/reveal.js
- alternates: impress.js / “slide-e” ??
- author preso in markdown.
shellinabox - http://code.google.com/p/shellinabox
- demo tool / terminal emulator in html5+ajax
- generally then fires off ‘screen’
- has a script that reattaches to an existing screen session
- shell in a box is included in the markdown
- using inline html
- alternate: novnc which does X
- engage your audience while presenting - point them to resources
- use a URL shortener to make the QR code smaller. The anchor is the QR code, and the body of the href was the text above/below the QR code.
- push published presentation to gh-pages
- keep source on trunk
- combine with git submodules
- includes reveal.js and qrcode.js through submodule
Trick: embeds a google analytics cookie to track popularity
- Google Fonts
- Visuals from Flickr (many CC licensed)
- Flickr community is very good at tagging
- trick from Baron Schwartz
- Chrome search
- same trick w/ google search, see slides for url