Pigeon Overview
Pigeon is a set of user-defined functions (UDFs) that lets you write Pig Latin scripts that work with spatial data. The extension is unobtrusive: it consists purely of UDFs, which makes it compatible with whichever version of Pig you use. It also combines easily with existing Pig operators such as FILTER, GROUP and JOIN.
Installing Pigeon
Pigeon is distributed as a JAR file that you register in your Pig script. Pigeon uses ESRI-geometry-API to create and process spatial data types, so you should also register the JAR file of the ESRI-geometry-API library. Once both JARs are registered in your script, you can call all the functions available in Pigeon directly. As a shortcut, the Pigeon installation includes a file that defines short names for all functions, similar to the function names in PostGIS.
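As a rough illustration of the setup described above, the sketch below registers both JARs and defines one short name by hand. The JAR file names are taken from the usage example in this document. The fully qualified class name edu.umn.cs.pigeon.Area is an assumption made for illustration, so check the short-names file shipped with your Pigeon download for the actual definitions.

```pig
-- Register the Pigeon UDFs and the ESRI geometry library they depend on.
-- Adjust the file names and paths to match the versions you downloaded.
REGISTER pigeon-0.1.jar;
REGISTER esri-geometry-api-1.1.1.jar;

-- Give a UDF a PostGIS-style short name.
-- ASSUMPTION: 'edu.umn.cs.pigeon.Area' is a hypothetical class name used
-- here for illustration; the short-names file bundled with Pigeon is the
-- authoritative source for the real class names.
DEFINE ST_Area edu.umn.cs.pigeon.Area();
```

In practice the bundled short-names file should make hand-written DEFINE statements like this unnecessary. The sketch only shows what such a definition looks like.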
Prerequisites
To use Pigeon, you need Pig installed and configured on your system. Check this tutorial
Download
Download the latest version of Pigeon here.
Usage examples
Once you have Pigeon configured correctly, you are ready to run some sample scripts. Below are a few examples that you can use as a starting point.
Load ZIP codes and calculate the area of each one
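-- Register Pigeon and the ESRI geometry library it depends on
REGISTER pigeon-0.1.jar;
REGISTER esri-geometry-api-1.1.1.jar;
-- Load the comma-separated ZIP code file
zips = LOAD 'zcta510.csv.bz2' USING PigStorage(',');
-- Column 6 holds the ZIP code and column 0 holds the geometry
zips_areas = FOREACH zips GENERATE $6 AS ZIP, ST_Area($0) AS area;
STORE zips_areas INTO 'zips_areas';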
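Because Pigeon functions are ordinary UDFs, their results can feed standard Pig operators. The following sketch extends the ZIP code example with FILTER and ORDER. It assumes that the ST_Area short name is already defined and that ST_Area returns a numeric value. The threshold 0.01 and the output directory name are hypothetical, and the units of the area depend on the coordinate system of the input data.

```pig
REGISTER pigeon-0.1.jar;
REGISTER esri-geometry-api-1.1.1.jar;

-- Same input and columns as the example above
zips = LOAD 'zcta510.csv.bz2' USING PigStorage(',');
zips_areas = FOREACH zips GENERATE $6 AS ZIP, ST_Area($0) AS area;

-- Keep only ZIP codes whose area exceeds a hypothetical threshold.
-- ASSUMPTION: 'area' is numeric; 0.01 is an arbitrary illustrative value.
large_zips = FILTER zips_areas BY area > 0.01;

-- Sort the remaining ZIP codes from largest to smallest area
sorted_zips = ORDER large_zips BY area DESC;

STORE sorted_zips INTO 'large_zips_sorted';
```

The spatial function is evaluated once in the FOREACH step, so the later FILTER and ORDER steps operate on a plain numeric column and need no spatial logic of their own.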