disadis

command module
v1.0.3-0...-49128d4
Published: Oct 27, 2021 License: Apache-2.0 Imports: 18 Imported by: 0

README

Disadis

Disadis is a download proxy for Hydra-based applications. It proxies content out of a Fedora 3 instance so that your Ruby application doesn't have to devote a valuable app instance to an otherwise mindless task. The Rails application handles the download request initially and then, if the user is authorized, redirects to disadis by way of an nginx internal redirect. Disadis then starts and monitors the actual download to the client.

Features of Disadis include

  • provides E-tags based on datastream version numbers
  • responds to GET and HEAD requests
  • handles range requests
  • requires every datastream that can be downloaded to be whitelisted
  • assumes the filename is the label of the datastream
  • can handle an arbitrary number of simultaneous downloads
  • uses a minimal amount of memory since all downloads are streamed
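
The ETag, HEAD, and range features above can be sketched with the Go standard library. This is a minimal illustration and not the code disadis uses: the names etagFor, serveDatastream, and request are hypothetical, the quoted-integer ETag format is an assumption, and a strings.Reader stands in for content that disadis would stream from Fedora.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strconv"
	"strings"
	"time"
)

// etagFor derives an ETag from a datastream version number.
// The quoted-integer format is an assumption made for this sketch.
func etagFor(version int) string {
	return `"` + strconv.Itoa(version) + `"`
}

// serveDatastream is a hypothetical handler body. The in-memory reader
// stands in for a datastream fetched from Fedora.
func serveDatastream(w http.ResponseWriter, r *http.Request, version int, body string) {
	w.Header().Set("ETag", etagFor(version))
	// With the ETag header already set, http.ServeContent answers
	// If-None-Match, Range, and HEAD requests on its own.
	http.ServeContent(w, r, "", time.Time{}, strings.NewReader(body))
}

// request runs one in-memory request against serveDatastream and
// returns the recorded response.
func request(method string, headers map[string]string) *httptest.ResponseRecorder {
	req := httptest.NewRequest(method, "/abc123", nil)
	for k, v := range headers {
		req.Header.Set(k, v)
	}
	rec := httptest.NewRecorder()
	serveDatastream(rec, req, 3, "hello world")
	return rec
}

func main() {
	cases := []map[string]string{
		nil,                        // plain GET
		{"Range": "bytes=0-3"},     // partial content
		{"If-None-Match": `"3"`},   // client already has version 3
	}
	for _, headers := range cases {
		rec := request(http.MethodGet, headers)
		fmt.Println(rec.Code, rec.Header().Get("ETag"), rec.Body.String())
	}
}
```

http.ServeContent needs a seekable reader, so a proxy that streams from an upstream server would have to forward or compute ranges itself; the sketch only shows the client-visible behavior.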

Use

The daemon listens on several ports for incoming HTTP requests. The exact ports, and how many of them there are, are determined by the configuration file. Each port can have a number of handlers attached to it. Usually each datastream name you wish to proxy will have its own handler. On each port, requests are expected to have the form /:id or /:id?datastream_id=XXX. The handler can attach a prefix to the id; Fedora is then checked for the object and the given datastream, and the content is proxied back if it exists.
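
As a rough illustration of the request format described above, the following Go sketch turns a request URL into a Fedora object id and a datastream_id. The type target and the function parseTarget are hypothetical names, and treating a missing datastream_id as "default" follows the handler description later in this README.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// target describes what one request asks for (hypothetical type).
type target struct {
	ObjectID     string // Fedora object identifier, prefix included
	DatastreamID string // value of datastream_id, or "default" when absent
}

// parseTarget accepts URLs of the form /:id or /:id?datastream_id=XXX
// and prepends the configured prefix to the id.
func parseTarget(rawURL, prefix string) (target, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return target{}, err
	}
	id := strings.Trim(u.Path, "/")
	if id == "" || strings.Contains(id, "/") {
		return target{}, fmt.Errorf("expected a path of the form /:id, got %q", u.Path)
	}
	ds := u.Query().Get("datastream_id")
	if ds == "" {
		ds = "default"
	}
	return target{ObjectID: prefix + id, DatastreamID: ds}, nil
}

func main() {
	for _, raw := range []string{"/abc123", "/abc123?datastream_id=thumbnail", "/"} {
		t, err := parseTarget(raw, "sufia:")
		fmt.Println(raw, t, err)
	}
}
```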

Configuration

The daemon takes a command line argument that names a configuration file. The file specifies how to determine the current user from a request, which handlers to set up, and the URL used to address Fedora.

The configuration file consists of a number of sections, which may appear in any order. The [general] section has three variables to set:

  • log-filename is the name of the log file to use. If none is provided, logging is sent to stdout.
  • fedora-addr is the root URL to use to access your fedora instance. It should include the fedora username and password if those are needed to download content from your fedora.
  • bendo-token is a token to use for content stored at external URLs via E or R datastreams. (optional)

Sample section:

[general]
fedora-addr = http://fedoraAdmin:fedoraAdmin@localhost:8983/fedora
log-filename = /var/log/disadis/log.txt

The other sections each specify a handler, with one additional section per handler. The section name is [Handler "name"], where name is the name you want to use for this handler. Inside the section there are a few variables to set for that handler.

  • port is the port number disadis should listen on for this handler.
  • versioned is whether disadis should support the versioned url. One of true or false. Defaults to false.
  • prefix is the prefix, if any, to add to the identifier in the URL.
  • datastream is the datastream of the Fedora item to proxy.
  • datastream-id is the datastream_id name you want to associate this handler with. Either not setting it or using the name default makes this the handler used when there is no datastream_id parameter on the incoming request.

A sample handler would look like

[Handler "thumbnail"]
datastream = thumbnail
prefix = sufia:
port = 4000
datastream-id = thumbnail

This configuration will have disadis listen on localhost:4000, and any request of the form /{id}?datastream_id=thumbnail will result in the download of the datastream thumbnail from the object sufia:{id}.

Example

A complete configuration file would look similar to the following.

[general]
fedora-addr = http://fedoraAdmin:fedoraAdmin@localhost:8983/fedora

[Handler "thumbnail"]
datastream = thumbnail
prefix = sufia:
port = 4000
datastream-id = thumb

[Handler "dl"]
datastream = content
port = 4000
prefix = sufia:

This configuration will have disadis listen on port 4000 for connections. HTTP requests to path /{id} result in the download of the content datastream of the fedora object sufia:{id}. Requests to the path /{id}?datastream_id=thumb result in the download of the thumbnail datastream.
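
The routing just described can be modeled in a few lines of Go. This is a sketch under stated assumptions and not disadis's own data structures: handlerConfig, normalizeID, selectHandler, and example are hypothetical names, and the struct fields simply mirror the configuration variables listed earlier.

```go
package main

import "fmt"

// handlerConfig mirrors one [Handler "name"] section (hypothetical type).
type handlerConfig struct {
	Name         string
	Port         int
	Prefix       string
	Datastream   string
	DatastreamID string
}

// normalizeID treats an unset datastream-id the same as "default".
func normalizeID(id string) string {
	if id == "" {
		return "default"
	}
	return id
}

// selectHandler picks the handler on the given port whose datastream-id
// matches the datastream_id parameter of the request.
func selectHandler(handlers []handlerConfig, port int, datastreamID string) (handlerConfig, bool) {
	want := normalizeID(datastreamID)
	for _, h := range handlers {
		if h.Port == port && normalizeID(h.DatastreamID) == want {
			return h, true
		}
	}
	return handlerConfig{}, false
}

// example holds the two handlers from the complete configuration above.
var example = []handlerConfig{
	{Name: "thumbnail", Port: 4000, Prefix: "sufia:", Datastream: "thumbnail", DatastreamID: "thumb"},
	{Name: "dl", Port: 4000, Prefix: "sufia:", Datastream: "content"},
}

func main() {
	for _, id := range []string{"", "thumb", "other"} {
		h, ok := selectHandler(example, 4000, id)
		if !ok {
			fmt.Printf("datastream_id=%q: no handler\n", id)
			continue
		}
		fmt.Printf("datastream_id=%q: %s datastream of %sxyz\n", id, h.Datastream, h.Prefix)
	}
}
```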

Versioned

If a datastream handler has versioned set to true, then paths of the form /{id}/{version} are also handled, where version is the integer Fedora datastream version number. Requests without a version are assigned the most current version of that datastream. For the moment, requests for any version other than the most current one are denied with a 404 error.
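
The versioned path rule can be sketched as follows. The function resolveVersion is a hypothetical name, and answering a non-integer version with 404 is an assumption, since this README only specifies the 404 for versions other than the most current one.

```go
package main

import (
	"fmt"
	"net/http"
	"strconv"
	"strings"
)

// resolveVersion interprets /{id} and /{id}/{version} for a versioned
// handler. current is the most current datastream version in Fedora.
func resolveVersion(path string, current int) (id string, version int, status int) {
	parts := strings.Split(strings.Trim(path, "/"), "/")
	switch {
	case len(parts) == 1 && parts[0] != "":
		// No version given: use the most current one.
		return parts[0], current, http.StatusOK
	case len(parts) == 2 && parts[0] != "":
		v, err := strconv.Atoi(parts[1])
		if err != nil || v != current {
			// Only the most current version is served for now.
			return parts[0], 0, http.StatusNotFound
		}
		return parts[0], v, http.StatusOK
	default:
		return "", 0, http.StatusNotFound
	}
}

func main() {
	for _, p := range []string{"/abc123", "/abc123/3", "/abc123/2", "/abc123/x"} {
		id, v, status := resolveVersion(p, 3)
		fmt.Println(p, id, v, status)
	}
}
```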

Nginx Redirects

The nginx internal redirect is handled by first defining an internal location in your nginx config file. The following block provides a template you can use.

location ^~ /download-internal/ {
    internal;
    proxy_set_header Host              $host;
    proxy_set_header X-Real-IP         $remote_addr;
    proxy_set_header X-Forwarded-For   $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $http_x_forwarded_proto;
    proxy_redirect   off;
    proxy_buffering  off;
    proxy_pass       http://127.0.0.1:4000/;
}

The Rails application can then pass control to the disadis daemon by setting the X-Accel-Redirect header to the route /download-internal/{id} and returning without writing a response body. The following Rails code shows one way of doing it. (In this case the Fedora id is in the variable asset.noid.)

response.headers['X-Accel-Redirect'] = "/download-internal/#{asset.noid}"
head :ok

Nginx will then send the request to disadis. The client does not see any of the internal redirects. As far as the client is concerned, there is only a single request and a single response.

Future

  • Is there a simpler way to configure the whole thing? It seems too complicated to me.
  • Support config reloading and graceful shutdowns

Directories

Path Synopsis
Package fedora provides a thin wrapper around the Fedora REST API.
