go-whosonfirst-rasterzen

Version: v0.2.1
Published: Sep 30, 2019 License: BSD-3-Clause

Go package for working with Who's On First data and go-rasterzen tiles.

Install

You will need to have both Go (specifically version 1.12 or higher) and the make program installed on your computer. Assuming you do, just type:

make tools

All of this package's dependencies are bundled with the code in the vendor directory.

Important

This is work in progress. It works. Until it doesn't.

For background, you should read the following blog posts:

Tools

wof-rasterzen-seed

Pre-seed go-rasterzen tiles (and their vector/raster derivatives) for one or more go-whosonfirst-index compatible indices.

> go run cmd/wof-rasterzen-seed/main.go -h
Usage of /var/folders/_k/h7ndzcyx3dq027gsrg1q45xm0000gn/T/go-build375480020/b001/exe/main:
  -exclude value
    	Exclude records not matching one or path '{PATH}={VALUE}' statements. Paths are evaluated using the gjson package's 'dot' syntax.
  -fs-cache
    	Cache tiles with a filesystem-based cache.
  -fs-root string
    	The root of your filesystem cache. If empty rasterd will try to use the current working directory.
  -go-cache
    	Cache tiles with an in-memory (go-cache) cache.
  -include value
    	Include only those records matching one or path '{PATH}={VALUE}' statements. Paths are evaluated using the gjson package's 'dot' syntax.
  -lambda-dsn value
    	A valid go-whosonfirst-aws DSN string. Required paremeters are 'credentials=CREDENTIALS' and 'region=REGION'
  -lambda-function string
    	A valid AWS Lambda function name. (default "Rasterzen")
  -log-level string
    	Log level to use for logging (default "status")
  -max-zoom int
    	 (default 16)
  -min-zoom int
    	 (default 1)
  -mode string
    	 (default "repo")
  -nextzen-apikey string
    	A valid Nextzen API key.
  -nextzen-debug
    	Log requests (to STDOUT) to Nextzen tile servers.
  -nextzen-origin string
    	An optional HTTP 'Origin' host to pass along with your Nextzen requests.
  -nextzen-uri string
    	A valid URI template (RFC 6570) pointing to a custom Nextzen endpoint.
  -png-options string
    	The path to a valid RasterzenPNGOptions JSON file.
  -rasterzen-options string
    	The path to a valid RasterzenOptions JSON file.
  -refresh-all
    	Force all tiles to be generated even if they are already cached.
  -refresh-png
    	Force PNG tiles to be generated even if they are already cached.
  -refresh-rasterzen
    	Force rasterzen tiles to be generated even if they are already cached.
  -refresh-svg
    	Force SVG tiles to be generated even if they are already cached.
  -s3-cache
    	Cache tiles with a S3-based cache.
  -s3-dsn string
    	A valid go-whosonfirst-aws DSN string
  -s3-opts string
    	A valid go-whosonfirst-cache-s3 options string
  -seed-all
    	See all the tile formats
  -seed-max-workers runtime.NumCPU()
    	The maximum number of concurrent workers to invoke when seeding tiles. The default is the value of runtime.NumCPU() * 2.
  -seed-png
    	Seed PNG tiles.
  -seed-rasterzen
    	Seed Rasterzen tiles.
  -seed-svg
    	Seed SVG tiles.
  -seed-tileset-catalog-dsn string
    	A valid tile.SeedCatalog DSN string. Required parameters are 'catalog=CATALOG' (default "catalog=sync")
  -seed-worker string
    	The type of worker for seeding tiles. Valid workers are: lambda, local, sqs. (default "local")
  -sqs-dsn value
    	A valid go-whosonfirst-aws DSN string. Required paremeters are 'credentials=CREDENTIALS' and 'region=REGION' and 'queue=QUEUE'
  -svg-options string
    	The path to a valid RasterzenSVGOptions JSON file.
  -timings
    	Display timings for tile seeding.

For example:

./bin/wof-rasterzen-seed -fs-cache -fs-root cache -nextzen-apikey {APIKEY} -max-zoom 10 -mode repo /usr/local/data/sfomuseum-data-whosonfirst/
...time passes, your fan whirrrrrrrrrs...

Or if you just want to know how many tiles you'll end up fetching:

./bin/wof-rasterzen-seed -fs-cache -fs-root cache -nextzen-apikey {APIKEY} -max-zoom 10 -count -mode repo /usr/local/data/sfomuseum-data-whosonfirst/
164993

Filtering records

You can filter the records to process by passing one or more -include or -exclude flags. The -exclude flag will exclude records not matching one or more '{PATH}={VALUE}' statements. The -include flag will include only those records matching one or more '{PATH}={VALUE}' statements. Paths are evaluated using the gjson package's 'dot' syntax. For example:

./bin/wof-rasterzen-seed -count -max-zoom 10 -include 'properties.wof:placetype=country' /usr/local/data/sfomuseum-data-whosonfirst/
00:16:49.630981 [wof-rasterzen-seed] STATUS SKIP Dayton (101712161) because properties.wof:placetype != country (is locality)
00:16:49.727171 [wof-rasterzen-seed] STATUS SKIP Columbus (101712381) because properties.wof:placetype != country (is locality)
00:16:49.741976 [wof-rasterzen-seed] STATUS SKIP Middletown (101712653) because properties.wof:placetype != country (is locality)
00:16:49.749884 [wof-rasterzen-seed] STATUS SKIP Portland (101715829) because properties.wof:placetype != country (is locality)
00:16:49.765511 [wof-rasterzen-seed] STATUS SKIP McMinnville (101716117) because properties.wof:placetype != country (is locality)
...and so on
...time passes
00:18:37.793127 [wof-rasterzen-seed] STATUS SKIP Mills Field Municipal Airport of San Francisco (1360695653) because properties.wof:placetype != country (is campus)
00:18:37.804110 [wof-rasterzen-seed] STATUS SKIP San Francisco Airport (1360695655) because properties.wof:placetype != country (is campus)
00:18:37.815119 [wof-rasterzen-seed] STATUS SKIP Denver Stapleton International Airport (1360695657) because properties.wof:placetype != country (is campus)
152956
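
To make the '{PATH}={VALUE}' matching a little more concrete, here is a minimal sketch of an -include style check. It is not the tool's actual code: it uses only the standard library, the lookup function is a simplified stand-in for gjson's path syntax (plain object keys only), and the function names are made up for illustration. It also assumes that a record has to satisfy every statement, which the tool may or may not do.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// parseStatement splits a '{PATH}={VALUE}' statement at the first '='.
func parseStatement(stmt string) (path, value string, err error) {
	parts := strings.SplitN(stmt, "=", 2)
	if len(parts) != 2 || parts[0] == "" {
		return "", "", fmt.Errorf("invalid statement %q", stmt)
	}
	return parts[0], parts[1], nil
}

// lookup walks a decoded JSON document one dot-separated key at a time.
// It is a simplified stand-in for gjson paths: no arrays, wildcards or escapes.
func lookup(doc map[string]interface{}, path string) (string, bool) {
	var cur interface{} = doc
	for _, key := range strings.Split(path, ".") {
		obj, ok := cur.(map[string]interface{})
		if !ok {
			return "", false
		}
		cur, ok = obj[key]
		if !ok {
			return "", false
		}
	}
	return fmt.Sprintf("%v", cur), true
}

// matches reports whether a record satisfies every statement (an assumption,
// the real tool may combine multiple statements differently).
func matches(record []byte, stmts []string) (bool, error) {
	var doc map[string]interface{}
	if err := json.Unmarshal(record, &doc); err != nil {
		return false, err
	}
	for _, stmt := range stmts {
		path, want, err := parseStatement(stmt)
		if err != nil {
			return false, err
		}
		got, ok := lookup(doc, path)
		if !ok || got != want {
			return false, nil
		}
	}
	return true, nil
}

func main() {
	// A heavily trimmed record, for illustration only.
	record := []byte(`{"properties": {"wof:name": "Dayton", "wof:placetype": "locality"}}`)

	ok, err := matches(record, []string{"properties.wof:placetype=country"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("include record:", ok)
}
```

A record like the one above would be skipped by -include 'properties.wof:placetype=country', which is what the STATUS SKIP lines in the output are reporting.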

Docker

Yes. There is a Dockerfile for the wof-rasterzen-seed tool.

ECS (AWS)

The details of using the above-mentioned Dockerfile with the AWS ECS service are outside the scope of this document, but your container will need an IAM role with the following (minimum) built-in policies:

  • AmazonEC2ContainerServiceforEC2Role

Additionally, if you plan to use the SQS seed-worker you'll need to add a policy for sending messages to the SQS queue in question. For example:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "sqs:SendMessage",
                "sqs:GetQueueUrl",
                "sqs:GetQueueAttributes"
            ],
            "Resource": [
                "arn:aws:sqs:{AWS_REGION}:{AWS_ACCOUNT_ID}:{SQS_QUEUE}"
            ]
        }
    ]
}

If you are going to use the Lambda seed-worker you'll need to add a policy for invoking the Lambda function in question. For example:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowLambdaRasterzenSeeder",
            "Effect": "Allow",
            "Action": "lambda:InvokeFunction",
            "Resource": "arn:aws:lambda:{AWS_REGION}:{AWS_ACCOUNT_ID}:function:{LAMBDA_FUNCTION}"
        }
    ]
}

Your role should have the following trust relationship:

{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "ecs-tasks.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

Finally, assuming you haven't baked a specific wof-rasterzen-seed command into your container image, you would invoke it as an (ECS) task with a "container override" like this:

/usr/local/bin/wof-rasterzen-seed,-seed-max-workers,25,-nextzen-apikey,{NEXTZEN_APIKEY},-seed-worker,sqs,-sqs-dsn,credentials=iam: region={AWS_REGION} queue={SQS_QUEUE},-seed-all,-min-zoom,14,-max-zoom,16,-timings,-mode,git,https://github.com/sfomuseum-data/sfomuseum-data-architecture.git

See the commas between everything and the lack of quotes around the -sqs-dsn flag? I love that about ECS...
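
If you are generating these overrides from code, a small sketch like the following may help. It assumes, as the example above suggests, that each comma-separated element ends up as exactly one argument, which is why the DSN with its spaces needs no quoting. The containerOverride function is hypothetical and the values in braces are placeholders, not real settings.

```go
package main

import (
	"fmt"
	"strings"
)

// containerOverride joins argv-style arguments with commas. Each slice
// element is meant to become one argument, so values containing spaces
// are passed through unquoted. Arguments that themselves contain commas
// are not handled by this sketch.
func containerOverride(args []string) string {
	return strings.Join(args, ",")
}

func main() {
	// Placeholders in braces are left for you to fill in.
	dsn := "credentials=iam: region={AWS_REGION} queue={SQS_QUEUE}"

	args := []string{
		"/usr/local/bin/wof-rasterzen-seed",
		"-seed-worker", "sqs",
		"-sqs-dsn", dsn,
		"-seed-all",
		"-min-zoom", "14",
		"-max-zoom", "16",
		"-mode", "git",
		"https://github.com/sfomuseum-data/sfomuseum-data-architecture.git",
	}

	fmt.Println(containerOverride(args))
}
```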

The command above will fetch a copy of the sfomuseum-data-architecture repo, calculate all the tiles between zoom levels 14 and 16 (for all the WOF records in the repo) and then "fetch" each one of those tiles using the SQS seed-worker. That just means an entry for each tile will be created in an SQS queue; it assumes you've configured the go-rasterzen rasterzen-seed-sqs tool as a Lambda trigger for new queue items.

For a pretty picture of everything just described, see go-rasterzen/docs/rasterzen-seed-sqs-arch.jpg. (Note that the image assumes the go-rasterzen rasterzen-seed tool, which doesn't know how to seed WOF repos, but the principle is the same.)

Because this (wof-rasterzen-seed) is being run from an ECS instance you could also invoke the Lambda seed-worker directly rather than queueing everything up in SQS. That's your business. The point is you can do either. In both cases though it's assumed that there is a Lambda seed-worker that is fetching data from a copy of the go-rasterzen rasterd application.

See also: go-rasterzen workers.
