

August 23, 2015

Using Google’s Protocol Buffer library to write GeoWave Filters for HBase datastore

by viggy. Categories: Uncategorized

Accumulo provides Iterators, which run on the Tablet Servers and can act as filters during a scan. GeoWave uses this mechanism in two forms: local Client Filters, and Distributable Filters that run on the Tablet Servers whenever a scan is performed. As part of adding HBase support, I needed to implement these filters for HBase. So far I have implemented two filters, SingleEntryFilter and CqlHBaseQueryFilter, which are the counterparts of SingleEntryFilterIterator and CqlQueryFilterIterator in Accumulo.

HBase uses Google’s Protocol Buffers to serialize data on the client side and send it across to the region servers (HBase’s counterpart to Accumulo’s tablet servers). In this post, I explain how I used the protobuf-java library to write the SingleEntryFilter for HBase in GeoWave.

Protobuf generates part of the code automatically from a .proto file, using its own code generator. The .proto file needs to describe the arguments your class accepts in its constructor, the package the generated class should go in, and so on. The arguments the class accepts need to be serializable. Since we are migrating from iterators, all the required data is already serializable, because it has to be serialized for the Accumulo iterators as well. I created a ‘protobuf’ directory inside the source directory ‘extensions/datastores/hbase/src/main/’ of the GeoWave source tree and created the following SingleEntryFilters.proto file in it.

 

option java_package = "mil.nga.giat.geowave.datastore.hbase.query.generated";
option java_outer_classname = "FilterProtos";
option java_generic_services = true;
option java_generate_equals_and_hash = true;
option optimize_for = SPEED;

message SingleEntryFilter {
  required bytes adapterId = 1;
  required bytes dataId = 2;
}
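
To make it concrete what those two `required bytes` fields turn into when the filter is serialized, here is a minimal sketch that hand-encodes the message with nothing but the JDK. It is only an illustration of protobuf's length-delimited wire encoding, not the code that protoc generates; the class name WireFormatSketch and its helper methods are made up for this example.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical illustration class: NOT part of GeoWave or of the generated FilterProtos.
class WireFormatSketch {

    // Protobuf wire type 2 means "length-delimited" (bytes, strings, nested messages).
    private static final int WIRE_TYPE_LENGTH_DELIMITED = 2;

    /** Writes a non-negative int as a base-128 varint, lowest 7 bits first. */
    static void writeVarint(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80); // set the continuation bit
            value >>>= 7;
        }
        out.write(value);
    }

    /** Writes one bytes field: tag, then payload length, then the payload itself. */
    static void writeBytesField(ByteArrayOutputStream out, int fieldNumber, byte[] payload) {
        writeVarint(out, (fieldNumber << 3) | WIRE_TYPE_LENGTH_DELIMITED);
        writeVarint(out, payload.length);
        out.write(payload, 0, payload.length);
    }

    /** Encodes the SingleEntryFilter message: adapterId is field 1, dataId is field 2. */
    static byte[] encodeSingleEntryFilter(byte[] adapterId, byte[] dataId) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeBytesField(out, 1, adapterId);
        writeBytesField(out, 2, dataId);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] encoded = encodeSingleEntryFilter(new byte[] {0x61}, new byte[] {0x62, 0x63});
        StringBuilder hex = new StringBuilder();
        for (byte b : encoded) {
            hex.append(String.format("%02x ", b));
        }
        System.out.println(hex.toString().trim()); // prints "0a 01 61 12 02 62 63"
    }
}
```

The field numbers (1 and 2) are what end up in the serialized bytes, not the field names, which is why they have to stay stable once a filter is in use on both client and server.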

 

Now, to generate the classes with protobuf, we need to install the protobuf compiler on the machine. You can download the compiler from the Protocol Buffers project. The README.txt that comes with the compiler explains the installation well.

After a successful installation, the protoc compiler executable is by default in the src directory of the protobuf installation. Go to the source directory to which you want the generated package to be added. In my case, it was <geowave-src-directory>/extensions/datastores/hbase/src/main/.

Now you can run the following command.

<path-to-protoc-installation-dir>/src/protoc -I=. --java_out=java/ protobuf/SingleEntryFilters.proto

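protobuf/SingleEntryFilters.proto is the path to the .proto file relative to your current directory. The -I=. option makes the current directory the import root, and --java_out=java/ writes the generated Java sources under java/.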

 

This generated the necessary FilterProtos class. Next we need to create the SingleEntryFilter class itself. Custom filters in HBase are written by extending the FilterBase class that HBase provides. Lars George’s book, HBase: The Definitive Guide, explains how to develop custom filters for HBase, and I used the example from the book’s GitHub repo as the basis for SingleEntryFilter.

It was only through that example that I learned I also need to implement the toByteArray and parseFrom methods in the custom filter. Later I also found in the HBase log that the base class’s parseFrom method throws a DeserializationException, whose message tells the user to implement parseFrom in the derived class.
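
The toByteArray/parseFrom contract can be sketched without any HBase dependency. The example below is only a stand-in: BaseFilter, IncompleteFilter, EntryFilter and SketchDeserializationException are hypothetical names, the byte layout is a simple length-prefixed format rather than the real protobuf message, and the reflective lookup reflects my understanding of how the server side finds parseFrom on the filter class. It shows why a filter that forgets to declare its own static parseFrom ends up in the base class's method and its exception.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.nio.ByteBuffer;

// Hypothetical stand-ins that only mimic the contract; these are NOT the HBase classes.
class ParseFromSketch {

    /** Stand-in for HBase's DeserializationException. */
    static class SketchDeserializationException extends Exception {
        SketchDeserializationException(String message) { super(message); }
    }

    /** Stand-in for the filter base class: its parseFrom always fails. */
    static class BaseFilter {
        public byte[] toByteArray() { return new byte[0]; }

        public static BaseFilter parseFrom(byte[] bytes) throws SketchDeserializationException {
            throw new SketchDeserializationException(
                "parseFrom called on base filter, but should be called on derived type");
        }
    }

    /** A filter that forgot to declare its own static parseFrom. */
    static class IncompleteFilter extends BaseFilter { }

    /** A filter that implements both halves of the round trip. */
    static class EntryFilter extends BaseFilter {
        final byte[] adapterId;
        final byte[] dataId;

        EntryFilter(byte[] adapterId, byte[] dataId) {
            this.adapterId = adapterId;
            this.dataId = dataId;
        }

        @Override
        public byte[] toByteArray() {
            // Length-prefixed layout; the real filter would build the protobuf message here.
            return ByteBuffer.allocate(8 + adapterId.length + dataId.length)
                .putInt(adapterId.length).put(adapterId)
                .putInt(dataId.length).put(dataId)
                .array();
        }

        // Static methods are hidden, not overridden, so each filter must declare its own.
        public static EntryFilter parseFrom(byte[] bytes) {
            ByteBuffer buf = ByteBuffer.wrap(bytes);
            byte[] adapterId = new byte[buf.getInt()];
            buf.get(adapterId);
            byte[] dataId = new byte[buf.getInt()];
            buf.get(dataId);
            return new EntryFilter(adapterId, dataId);
        }
    }

    /** Receiving side of the sketch: look up the static parseFrom reflectively and call it. */
    static BaseFilter deserialize(Class<? extends BaseFilter> type, byte[] bytes) {
        try {
            Method parseFrom = type.getMethod("parseFrom", byte[].class);
            return (BaseFilter) parseFrom.invoke(null, (Object) bytes);
        } catch (InvocationTargetException e) {
            throw new IllegalStateException(e.getCause().getMessage(), e.getCause());
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] wire = new EntryFilter(new byte[] {1, 2}, new byte[] {3}).toByteArray();
        EntryFilter back = (EntryFilter) deserialize(EntryFilter.class, wire);
        System.out.println("round trip ok: " + (back.adapterId.length == 2 && back.dataId.length == 1));
        try {
            deserialize(IncompleteFilter.class, new byte[0]);
        } catch (IllegalStateException e) {
            System.out.println("incomplete filter failed: " + e.getMessage());
        }
    }
}
```

In the real SingleEntryFilter, toByteArray would delegate to the generated FilterProtos message and parseFrom would parse that message back, but the shape of the contract is the same.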