Spark and Scala and Big Data in General


Processing JSON in Scala for Spark.

Effective Scala

A guide for advanced Scala users.

Apache Spark

Spark is an in-memory platform for fast computation (eBay uses Spark); Hazelcast is a data grid that is more about storage than computing; Cascading is a tool for building data processing pipelines.

Spark setup script for AWS EMR, provided by Amazon and hosted on S3

Best Practice for Testing Spark

Report Analysis Using Spark: Spark Report Patterns

Have you tried the Spark API yet? I mean all of it? Examples for all the API functions.

Experience with Spark: The author has extensive hands-on experience with Spark and seems to have encountered many issues. One of the comments is from a Spark contributor, who points readers to some ways of improving application performance when using Spark.

AWS and Hadoop related

hdfs (or hadoop fs) is a client-side tool for working with a Hadoop cluster. It requires proper configuration on the client. In a typical installation, the hdfs user is the superuser.

Boost performance of S3

Hadoop for dummies by Yahoo!

Deploying a Hadoop 2.0 Cluster on EC2 with HDP2


A Post About Nothing in Scala

A wonderful post about Nothings (Null, null, Nil, Nothing, None, and Unit) in Scala

Matt Malone's Old-Fashioned Software Development Blog

One of the main complaints you hear about the Scala language is that it’s too complicated compared to Java. The average developer will never be able to achieve a sufficient understanding of the type system, the functional programming idioms, etc. That’s the argument. To support this position, you’ll often hear it pointed out that Scala includes several notions of nothingness (Null, null, Nil, Nothing, None, and Unit) and that you have to know which one to use in each situation. I’ve read an argument like this more than once.

It’s not as bad as all that. Yes, each of those things is part of Scala, and yes, you have to use the right one in the right situation. But the situations are so wildly different it’s not hard to figure out once you know what each of these things means.
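
To give a rough feel for how different those situations are, here is a minimal sketch. It is not taken from the original post; the object and every method name in it (Nothingness, findUser, log, fail, parsePositive) are hypothetical and exist only for illustration.

```scala
object Nothingness {
  // null is the only value of type Null, and Null is a subtype of
  // every reference type, so null can be assigned to a String.
  val missingName: String = null

  // Nil is the empty List.
  val noItems: List[Int] = Nil

  // None is the empty Option; Some(x) is the non-empty case.
  // (Hypothetical lookup: only id 1 exists.)
  def findUser(id: Int): Option[String] =
    if (id == 1) Some("alice") else None

  // Unit is the result type of code run only for its side effects;
  // its single value is ().
  def log(msg: String): Unit = println(msg)

  // Nothing has no values at all: a method returning Nothing
  // never returns normally (here it always throws).
  def fail(msg: String): Nothing = throw new IllegalStateException(msg)

  // Nothing is a subtype of every type, so fail(...) type-checks
  // in a branch where an Int is expected.
  def parsePositive(n: Int): Int =
    if (n > 0) n else fail(s"not positive: $n")
}
```

In this sketch, Nil and None describe "empty" data, Unit describes "no interesting result", Nothing describes "no result at all", and null is mostly what you get handed when working with Java APIs.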

Null and null

First, let’s tackle Null and null. Null…
