SQL on Java Collections with streams18 Jul 2015
We’re going to look at an example of how to represent SQL queries over java collections using streams from Java 8. Pretty cool right?
This post assumes you have some familiarty with basic SQL and Java 8 lambda expressions and streams.
Getting our hands dirty
Let’s say we have a list of key-value pairs defined to mark the number of bugs that each team member has squashed. Think of it like a map that can contain duplicate key values. We can define it and initialize it as such:
Now, let’s say we want to retrieve the names of the team members and the total number of bugs they have squashed. But let’s add two constraints:
We only sum up entries that have less than 25 bugs squashed, because 25 or higher seems suspiciously high.
We only report team member names with a total of 5 or more bugs squashed.
We can express this fairly easily in SQL:
We can express this same logic with Java 8 streams.
You’re probably either thinking that Java 8 streams are amazing, or that you have no idea why anyone would want to use them. Either way, we’ll now go into more details about what’s actually happening.
Let’s break it down.
We create a stream from our list of pairs. Then, we filter out all the pairs that have 25 or more bugs squashed.
While seemingly tricky, this isn’t so bad if we look at it piece by piece.
collect lets us store our stream back into a Java collections object. By passing in
Collectors.groupingBy, we are storing our stream into a map, where we smoosh all pairs with common keys into aggregated entries in our map.
This first argument to
Collectors.groupingBy is a function (via Java 8 named function syntax) that defines the grouping key for our final aggregated hash-map as the result of calling
getKey on each entry in our stream.
This second argument to
Collectors.groupingBy describes how to aggregate each set of entries in our map that has the same key.
We are saying that we want to add up the number of bugs squashed. And we pass the function
SimpleEntry<String, Integer>::getValue to say that the values of our key-value pairs are what we want to sum up.
Finally, this takes our aggregate result map, and turns it back into a stream for further processing.
Admittedly, it’s unfortunate that we have to store our data into a map, then read it back into a stream mid-way through our calculation, but that’s a necessary limitation of using
The last step straightforward. It filters out any aggregated entries that have less than 5 squashed bugs, then prints each remaining entry out.
Do not worry if some of the details (like collectors) seem confusing. The takeaway here is that Java 8 streams are extremely powerful and versatile.
Feel free to check out the entire example on github.