I’d like to use TDD wherever I can.

A very brief description of TDD with a few useful links.

TDD in Java walkthrough.
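
To make the test-first loop concrete, here is a minimal sketch of one red-green cycle in plain Java. It is not taken from the linked walkthrough: the class name TddSketch, the add method and the check helper are all hypothetical, and it deliberately avoids JUnit so that it runs with nothing but a JDK. In a real project the checks would normally live in a JUnit test class.

```java
// Hypothetical example; names are illustrative only.
class TddSketch {

    // Step 1 (red): these checks are written first and fail until add() exists.
    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    // Step 2 (green): the simplest implementation that makes the checks pass.
    static int add(String numbers) {
        if (numbers.isEmpty()) {
            return 0;
        }
        int sum = 0;
        for (String part : numbers.split(",")) {
            sum += Integer.parseInt(part.trim());
        }
        return sum;
    }

    public static void main(String[] args) {
        check(add("") == 0, "empty input sums to 0");
        check(add("7") == 7, "a single number is returned as-is");
        check(add("1, 2,3") == 6, "comma-separated numbers are summed");
        System.out.println("all checks passed");
    }
}
```

Step 3 (refactor) would then clean up add() while the same checks keep passing.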

I was impressed by the quality and the conciseness of “Pragmatic Unit Testing with JUnit” the first time I read it in ~2006. The latest release is named “Pragmatic Unit Testing in Java 8 with JUnit”. I am looking forward to reading it again!


Reactive Programming Notes and Links

A very well-written introduction to reactive programming that implements a Twitter recommendation feature using Rx. Link by andrestaltz

What is reactive programming: how message-driven communication (directed from a sender to a receiver) differs from event-driven communication (undirected), and how a message-driven, actor-based design avoids callback hell.

What is Callback Hell
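
As a rough illustration of the shape the term describes, the sketch below runs three dependent steps twice: once with nested callbacks, once as a flat chain. Everything here is an assumption made for illustration: the fetch methods are hypothetical stand-ins that return immediately, and the flat version uses Java's CompletableFuture rather than the actor-based approach mentioned above.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

// Hypothetical example; the fetch* methods only simulate asynchronous calls.
class CallbackSketch {

    // Callback style: each step hands its result to the callback it was given.
    static void fetchUser(int id, Consumer<String> callback) {
        callback.accept("user-" + id);
    }

    static void fetchOrders(String user, Consumer<String> callback) {
        callback.accept(user + ":orders");
    }

    static void fetchTotal(String orders, Consumer<String> callback) {
        callback.accept(orders + ":total");
    }

    // Nesting grows one level per dependent step: the "callback hell" shape.
    static void nested(int id, Consumer<String> done) {
        fetchUser(id, user ->
            fetchOrders(user, orders ->
                fetchTotal(orders, total ->
                    done.accept(total))));
    }

    // The same three steps composed as a flat chain of futures.
    static CompletableFuture<String> chained(int id) {
        return CompletableFuture.supplyAsync(() -> "user-" + id)
                .thenApply(user -> user + ":orders")
                .thenApply(orders -> orders + ":total");
    }

    public static void main(String[] args) {
        nested(1, System.out::println);
        System.out.println(chained(1).join());
    }
}
```

Both calls produce the same string; the difference is that the chained version stays at one indentation level no matter how many steps are added.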

I have not read the following yet:

Art of Node (node.js)

Fancy a diploma thesis on Concurrent Programming for Scalable Web Architectures by Benjamin Erb?

Futures and Promises from Scala Documentation

Spark and Scala and Big Data in General


Process JSON in Scala for Spark.

Effective Scala

A guide for advanced Scala users.

Apache Spark

Spark is an in-memory platform for fast computation (eBay uses Spark); Hazelcast is a data grid that is more about storage than computing; Cascading is a tool for building data processing pipelines.

Spark setup script for AWS EMR, hosted on S3 and provided by Amazon

Best Practice for Testing Spark

Report Analysis Using Spark: Spark Report Patterns

Have you tried the Spark API yet? I mean all of it? Examples for all the API functions.

Experience with Spark: The author has extensive hands-on experience with Spark and seems to have run into many issues. One of the comments is from a Spark contributor, who points readers to some ways of improving application performance when using Spark.

AWS and Hadoop related

hdfs (or hadoop fs) is a client-side tool for working with a Hadoop cluster. It needs a proper client-side configuration to reach the cluster. The hdfs user is typically the superuser.

Boost performance of S3

Hadoop for dummies by Yahoo!

Deploying a Hadoop 2.0 Cluster on EC2 with HDP2