Documentation
Overview
api is a part of dataset
Authors R. S. Doiel, <rsdoiel@library.caltech.edu> and Tom Morrell, <tmorrell@library.caltech.edu>
Copyright (c) 2022, Caltech. All rights not granted herein are expressly reserved by Caltech.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
api_docs.go is a part of dataset
Package dataset includes the operations needed for processing collections of JSON documents and their attachments.
cli is part of dataset
This is part of the dataset package.
compatibility.go provides some wrapping methods for backward compatibility with v1 of dataset. These are likely to go away at some point.
config is a part of dataset
Dataset Project
===============
The Dataset Project provides tools for working with collections of JSON Object documents stored on the local file system or via a dataset web service. Two tools are provided, a command line interface (dataset) and a web service (datasetd).
dataset command line tool
-------------------------
_dataset_ is a command line tool for working with collections of JSON objects. Collections are stored on the file system in a pairtree directory structure or can be accessed via dataset's web service. For collections that use pairtree storage, JSON objects are stored as plain UTF-8 text files. This means the objects can be accessed with common Unix text processing tools as well as most programming languages.
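Because the stored objects are plain UTF-8 JSON text, they can be read with nothing more than a language's standard library. The following is a minimal sketch in Go, not code from the dataset package; the helper name `readObject` and the temporary file standing in for a stored document are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// readObject decodes one JSON object from a plain UTF-8 text file.
// It is a hypothetical helper, not part of the dataset package.
func readObject(name string) (map[string]interface{}, error) {
	src, err := os.ReadFile(name)
	if err != nil {
		return nil, err
	}
	obj := map[string]interface{}{}
	if err := json.Unmarshal(src, &obj); err != nil {
		return nil, err
	}
	return obj, nil
}

func main() {
	// A temporary file stands in for a document stored in a collection.
	dir, err := os.MkdirTemp("", "example")
	if err != nil {
		fmt.Println(err)
		return
	}
	defer os.RemoveAll(dir)
	name := filepath.Join(dir, "doc.json")
	if err := os.WriteFile(name, []byte(`{"title": "Hello"}`), 0o644); err != nil {
		fmt.Println(err)
		return
	}
	obj, err := readObject(name)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(obj["title"])
}
```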
The _dataset_ command line tool supports common data management operations such as initialization of collections; document creation, reading, updating and deleting; listing keys of JSON objects in the collection; and associating non-JSON documents (attachments) with specific JSON documents in the collection.
### enhanced features include
- aggregate objects into data frames
- generate sample sets of keys and objects
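One way a sample set of keys could be drawn is by shuffling a copy of the key list and keeping the first n entries. The sketch below assumes that approach; it is not taken from dataset's source, and `sampleKeys` and its `seed` parameter are hypothetical names.

```go
package main

import (
	"fmt"
	"math/rand"
)

// sampleKeys returns up to n keys chosen at random without replacement.
// It is a hypothetical helper, not part of the dataset package.
func sampleKeys(keys []string, n int, seed int64) []string {
	if n > len(keys) {
		n = len(keys)
	}
	if n < 0 {
		n = 0
	}
	// Work on a copy so the caller's slice keeps its order.
	shuffled := make([]string, len(keys))
	copy(shuffled, keys)
	r := rand.New(rand.NewSource(seed))
	r.Shuffle(len(shuffled), func(i, j int) {
		shuffled[i], shuffled[j] = shuffled[j], shuffled[i]
	})
	return shuffled[:n]
}

func main() {
	keys := []string{"alpha", "beta", "gamma", "delta", "epsilon"}
	fmt.Println(sampleKeys(keys, 3, 42))
}
```

Passing an explicit seed keeps a sample reproducible, which can be handy when the same sample set needs to be regenerated later.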
datasetd, dataset as a web service
----------------------------------
_datasetd_ is a web service implementation of the _dataset_ command line program. It features a subset of the capabilities found in the command line tool. This allows dataset collections to be integrated safely into web applications or used concurrently by multiple processes. It achieves this by storing the dataset collection in a SQL database using JSON columns.
Design choices
--------------
_dataset_ and _datasetd_ are intended to be simple tools for managing collections of JSON object documents in a predictable, structured way.
_dataset_ is guided by the idea that you should be able to work with JSON documents as easily as you can any plain text document on the Unix command line. _dataset_ is intended to be simple to use with minimal setup (e.g. `dataset init mycollection.ds` creates a new collection called 'mycollection.ds').
- _dataset_ and _datasetd_ store JSON object documents in collections. The storage of the JSON documents differs.
    - dataset collections are defined in a directory containing a collection.json file
    - collection.json is a metadata file describing the collection, e.g. storage type, name, description, and whether versioning is enabled
    - collection objects are accessed by their key, which is case insensitive
    - collection names are lower case and usually have a `.ds` extension for easy identification; the collection directory name must also be lower case
- _dataset_ stores JSON object documents in a pairtree
    - the pairtree path is always lowercase
    - a pairtree of JSON object documents
    - non-JSON attachments can be associated with a JSON document and are found in directories organized by semver (semantic version number)
    - versioned JSON documents are created in a sub directory incorporating a semver
- _datasetd_ stores JSON object documents in a table named for the collection
    - objects are versioned into a collection history table by semver and key
    - attachments are not supported
    - can be exported to a collection using pairtree storage (e.g. a zip file will be generated holding a pairtree representation of the collection)
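The idea of a lowercase pairtree path can be sketched as follows. This assumes the simplest rule, lower-casing the key and splitting it into two-character directory segments. The function name `pairtreePath` is hypothetical, and dataset's own pairtree code may differ (for example, full pairtree implementations also escape characters that are unsafe in file names).

```go
package main

import (
	"fmt"
	"strings"
)

// pairtreePath lower-cases key and splits it into two-character
// segments joined by "/". Simplified sketch: no escaping of
// characters that are unsafe in file names is attempted.
func pairtreePath(key string) string {
	runes := []rune(strings.ToLower(key))
	segments := []string{}
	for i := 0; i < len(runes); i += 2 {
		end := i + 2
		if end > len(runes) {
			end = len(runes)
		}
		segments = append(segments, string(runes[i:end]))
	}
	return strings.Join(segments, "/")
}

func main() {
	fmt.Println(pairtreePath("Wilson1930")) // prints "wi/ls/on/19/30"
}
```

Lower-casing before splitting means that "Wilson1930" and "wilson1930" resolve to the same path, which is why keys behave as case insensitive on any file system.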
The choice of plain UTF-8 is intended to help future proof reading dataset collections. Care has been taken to keep _dataset_ simple enough and lightweight enough that it will run on a machine as small as a Raspberry Pi Zero while being equally comfortable on a more resource rich server or desktop environment. _dataset_ can be re-implemented in any programming language supporting file input and output, common string operations, and JSON encoding and decoding functions. The current implementation is in the Go language.
Features
--------
_dataset_ supports
- Initialize a new dataset collection
    - Define metadata about the collection using a codemeta.json file
    - Define a keys file holding a list of allocated keys in the collection
    - Creates a pairtree for object storage
- Listing _keys_ in a collection
- Object level actions
    - create
    - read
    - update
    - delete
- Documents as attachments
    - attachments (list)
    - attach (create/update)
    - retrieve (read)
    - prune (delete)
- The ability to create data frames from whole collections or based on keys lists
    - frames are defined using a list of keys and a list of "dot paths" describing what is to be pulled out of the stored JSON objects and into the frame
    - frame level actions
        - frames, list the frame names in the collection
        - frame, define a frame, does not overwrite an existing frame with the same name
        - frame-def, show the frame definition (in case we need it for some reason)
        - frame-objects, return a list of objects in the frame
        - refresh, using the current frame definition reload all the objects in the frame
        - reframe, replace the frame definition then reload the objects in the frame using the old frame key list
        - has-frame, check to see if a frame exists
        - delete-frame, remove the frame
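To illustrate how a frame combines a list of keys with a list of dot paths, here is a minimal sketch. It is not dataset's frame implementation: `lookup` and `buildFrame` are hypothetical names, the objects live in an in-memory map rather than a collection, and only object fields (not array indexes) are followed.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// lookup follows a dot path such as ".author.family" through a decoded
// JSON object. Only object fields are handled; array indexes are not.
func lookup(obj map[string]interface{}, dotPath string) (interface{}, bool) {
	var cur interface{} = obj
	for _, name := range strings.Split(strings.TrimPrefix(dotPath, "."), ".") {
		m, ok := cur.(map[string]interface{})
		if !ok {
			return nil, false
		}
		if cur, ok = m[name]; !ok {
			return nil, false
		}
	}
	return cur, true
}

// buildFrame pulls the value at each dot path out of every object named
// in keys, keeping the order of keys. Unknown keys and missing paths
// are skipped.
func buildFrame(objects map[string]map[string]interface{}, keys []string, dotPaths []string) []map[string]interface{} {
	frame := []map[string]interface{}{}
	for _, key := range keys {
		obj, ok := objects[key]
		if !ok {
			continue
		}
		row := map[string]interface{}{}
		for _, p := range dotPaths {
			if val, found := lookup(obj, p); found {
				row[p] = val
			}
		}
		frame = append(frame, row)
	}
	return frame
}

func main() {
	src := `{"title": "On Frames", "author": {"family": "Doe"}, "year": 2022}`
	obj := map[string]interface{}{}
	if err := json.Unmarshal([]byte(src), &obj); err != nil {
		fmt.Println("decode error:", err)
		return
	}
	objects := map[string]map[string]interface{}{"doe2022": obj}
	frame := buildFrame(objects, []string{"doe2022"}, []string{".title", ".author.family"})
	out, err := json.Marshal(frame)
	if err != nil {
		fmt.Println("encode error:", err)
		return
	}
	fmt.Println(string(out))
}
```

In this sketch a "refresh" would amount to calling `buildFrame` again with the same keys and dot paths after the underlying objects have changed.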
_datasetd_ supports
- List collections available from the web service
- List or update a collection's metadata
- List a collection's keys
- Object level actions
    - create
    - read
    - update
    - delete
- Documents as attachments
    - attachments (list)
    - attach (create/update)
    - retrieve (read)
    - prune (delete)
- A means of importing to or exporting from pairtree based dataset collections
- The ability to create data frames from whole collections or based on keys lists
    - frames are defined using "dot paths" describing what is to be pulled out of the stored JSON objects
Both _dataset_ and _datasetd_ may be useful for general data science applications needing JSON object management or for implementing repository systems in research libraries and archives.
Limitations of _dataset_ and _datasetd_
-------------------------------------------
_dataset_ has many limitations, some are listed below
- the pairtree implementation is not a multi-process, multi-user data store
- it is not a general purpose database system
- it stores all keys in lower case in order to deal with file systems that are not case sensitive, compatibility needed by pairtrees
- it stores collection names as lower case to deal with file systems that are not case sensitive
- it does not have a built-in query language, search or sorting
- it should NOT be used for sensitive or secret information
_datasetd_ is a simple web service intended to run on "localhost:8485".
- it is a RESTful service
- it does not include support for authentication
- it does not support a query language, search or sorting
- it does not support access control by users or roles
- it does not provide auto key generation
- it limits the size of JSON documents stored to the size supported by the host SQL JSON columns
- it limits the size of attached files to less than 250 MiB
- it does not support partial JSON record updates or retrieval
- it does not provide an interactive Web UI for working with dataset collections
- it does not support HTTPS or "at rest" encryption
- it should NOT be used for sensitive or secret information
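A client can avoid a rejected upload by checking a file's size against the attachment limit before sending it. The sketch below only works through the arithmetic of the 250 MiB limit listed above; `maxAttachmentSize` and `checkAttachmentSize` are hypothetical names and are not part of datasetd.

```go
package main

import (
	"errors"
	"fmt"
)

// maxAttachmentSize is 250 MiB expressed in bytes (250 * 1024 * 1024).
const maxAttachmentSize int64 = 250 << 20

// errTooLarge is returned when a file is at or above the limit.
var errTooLarge = errors.New("attachment is 250 MiB or larger")

// checkAttachmentSize returns nil only when size is below the limit.
// It is a hypothetical client-side check, not datasetd's own code.
func checkAttachmentSize(size int64) error {
	if size < 0 {
		return errors.New("negative size")
	}
	if size >= maxAttachmentSize {
		return errTooLarge
	}
	return nil
}

func main() {
	fmt.Println(maxAttachmentSize) // prints 262144000
	fmt.Println(checkAttachmentSize(10 << 20))
	fmt.Println(checkAttachmentSize(300 << 20))
}
```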
Authors and history
-------------------
- R. S. Doiel
- Tommy Morrell
import_export.go provides methods to import and export JSON content to and from tables or CSV files.
ptstore is a part of the dataset
sqlstore is a part of dataset
Package dataset includes the operations needed for processing collections of JSON documents and their attachments.
table.go provides some utility functions to move string one and two dimensional slices into/out of one and two dimensional slices.
texts is part of dataset
Index ¶
- Constants
- Variables
- func Analyzer(cName string, verbose bool) error
- func ApiDisplayUsage(out io.Writer, appName string, flagSet *flag.FlagSet)
- func ApiVersion(w http.ResponseWriter, r *http.Request, api *API, cName string, verb string, ...)
- func Attach(w http.ResponseWriter, r *http.Request, api *API, cName, verb string, ...)
- func AttachmentVersions(w http.ResponseWriter, r *http.Request, api *API, cName, verb string, ...)
- func Attachments(w http.ResponseWriter, r *http.Request, api *API, cName, verb string, ...)
- func BytesProcessor(varMap map[string]string, text []byte) []byte
- func CliDisplayHelp(in io.Reader, out io.Writer, eout io.Writer, args []string) error
- func CliDisplayUsage(out io.Writer, appName string, flagSet *flag.FlagSet)
- func Codemeta(w http.ResponseWriter, r *http.Request, api *API, cName string, verb string, ...)
- func Collections(w http.ResponseWriter, r *http.Request, api *API, cName string, verb string, ...)
- func Create(w http.ResponseWriter, r *http.Request, api *API, cName string, verb string, ...)
- func Delete(w http.ResponseWriter, r *http.Request, api *API, cName string, verb string, ...)
- func DeleteVersion(w http.ResponseWriter, r *http.Request, api *API, cName, verb string, ...)
- func DisplayLicense(out io.Writer, appName string)
- func DisplayVersion(out io.Writer, appName string)
- func FixMissingCollectionJson(cName string) error
- func FmtHelp(src string, appName string, version string, releaseDate string, ...) string
- func FrameCreate(w http.ResponseWriter, r *http.Request, api *API, cName, verb string, ...)
- func FrameDef(w http.ResponseWriter, r *http.Request, api *API, cName, verb string, ...)
- func FrameDelete(w http.ResponseWriter, r *http.Request, api *API, cName, verb string, ...)
- func FrameKeys(w http.ResponseWriter, r *http.Request, api *API, cName, verb string, ...)
- func FrameObjects(w http.ResponseWriter, r *http.Request, api *API, cName, verb string, ...)
- func FrameUpdate(w http.ResponseWriter, r *http.Request, api *API, cName, verb string, ...)
- func Frames(w http.ResponseWriter, r *http.Request, api *API, cName, verb string, ...)
- func HasFrame(w http.ResponseWriter, r *http.Request, api *API, cName, verb string, ...)
- func JSONIndent(src []byte, prefix string, indent string) []byte
- func JSONMarshal(data interface{}) ([]byte, error)
- func JSONMarshalIndent(data interface{}, prefix string, indent string) ([]byte, error)
- func JSONUnmarshal(src []byte, data interface{}) error
- func Keys(w http.ResponseWriter, r *http.Request, api *API, cName string, verb string, ...)
- func MakeCSV(src []byte, attributes []string) ([]byte, error)
- func MakeGrid(src []byte, attributes []string) ([]byte, error)
- func Migrate(srcName string, dstName string, verbose bool) error
- func ObjectVersions(w http.ResponseWriter, r *http.Request, api *API, cName, verb string, ...)
- func ParseDSN(uri string) (string, error)
- func Prune(w http.ResponseWriter, r *http.Request, api *API, cName, verb string, ...)
- func PruneVersion(w http.ResponseWriter, r *http.Request, api *API, cName, verb string, ...)
- func Read(w http.ResponseWriter, r *http.Request, api *API, cName string, verb string, ...)
- func ReadKeys(keysName string, in io.Reader) ([]string, error)
- func ReadSource(fName string, in io.Reader) ([]byte, error)
- func ReadVersion(w http.ResponseWriter, r *http.Request, api *API, cName, verb string, ...)
- func Repair(cName string, verbose bool) error
- func Retrieve(w http.ResponseWriter, r *http.Request, api *API, cName, verb string, ...)
- func RetrieveVersion(w http.ResponseWriter, r *http.Request, api *API, cName, verb string, ...)
- func RowInterfaceToString(r []interface{}) []string
- func RowStringToInterface(r []string) []interface{}
- func RunAPI(appName string, settingsFile string) error
- func RunCLI(in io.Reader, out io.Writer, eout io.Writer, args []string) error
- func SetupApiTestCollection(cName string, dsnURI string, records map[string]map[string]interface{}) error
- func StringProcessor(varMap map[string]string, text string) string
- func TableInterfaceToString(t [][]interface{}) [][]string
- func TableStringToInterface(t [][]string) [][]interface{}
- func Update(w http.ResponseWriter, r *http.Request, api *API, cName string, verb string, ...)
- func ValueInterfaceToString(val interface{}) (string, error)
- func ValueStringToInterface(s string) (interface{}, error)
- func WriteKeys(keyFilename string, out io.Writer, keys []string) error
- func WriteSource(fName string, out io.Writer, src []byte) error
- type API
- func (api *API) Init(appName string, settingsFile string) error
- func (api *API) RegisterRoute(prefix string, method string, ...) error
- func (api *API) Reload(sigName string) error
- func (api *API) Router(w http.ResponseWriter, r *http.Request)
- func (api *API) Shutdown(sigName string) int
- func (api *API) WebService() error
- type Attachment
- type Collection
- func (c *Collection) AttachFile(key string, filename string) error
- func (c *Collection) AttachStream(key string, filename string, buf io.Reader) error
- func (c *Collection) AttachVersionFile(key string, filename string, version string) error
- func (c *Collection) AttachVersionStream(key string, filename string, version string, buf io.Reader) error
- func (c *Collection) AttachmentPath(key string, filename string) (string, error)
- func (c *Collection) AttachmentVersionPath(key string, filename string, version string) (string, error)
- func (c *Collection) AttachmentVersions(key string, filename string) ([]string, error)
- func (c *Collection) Attachments(key string) ([]string, error)
- func (c *Collection) Clone(cloneName string, cloneDsnURI string, keys []string, verbose bool) error
- func (c *Collection) CloneSample(trainingName string, trainingDsnURI string, testName string, testDsnURI string, ...) error
- func (c *Collection) Close() error
- func (c *Collection) Codemeta() ([]byte, error)
- func (c *Collection) Create(key string, obj map[string]interface{}) error
- func (c *Collection) CreateJSON(key string, src []byte) error
- func (c *Collection) CreateObject(key string, obj interface{}) error
- func (c *Collection) CreateObjectsJSON(keyList []string, src []byte) error
- func (c *Collection) Delete(key string) error
- func (c *Collection) DocPath(key string) (string, error)
- func (c *Collection) ExportCSV(fp io.Writer, eout io.Writer, f *DataFrame, verboseLog bool) (int, error)
- func (c *Collection) ExportTable(eout io.Writer, f *DataFrame, verboseLog bool) (int, [][]interface{}, error)
- func (c *Collection) FrameClear(name string) error
- func (c *Collection) FrameCreate(name string, keys []string, dotPaths []string, labels []string, verbose bool) (*DataFrame, error)
- func (c *Collection) FrameDef(name string) (map[string]interface{}, error)
- func (c *Collection) FrameDelete(name string) error
- func (c *Collection) FrameKeys(name string) []string
- func (c *Collection) FrameNames() []string
- func (c *Collection) FrameObjects(fName string) ([]map[string]interface{}, error)
- func (c *Collection) FrameRead(name string) (*DataFrame, error)
- func (c *Collection) FrameReframe(name string, keys []string, verbose bool) error
- func (c *Collection) FrameRefresh(name string, verbose bool) error
- func (c *Collection) HasFrame(frameName string) bool
- func (c *Collection) HasKey(key string) bool
- func (c *Collection) ImportCSV(buf io.Reader, idCol int, skipHeaderRow bool, overwrite bool, verboseLog bool) (int, error)
- func (c *Collection) ImportTable(table [][]interface{}, idCol int, useHeaderRow bool, ...) (int, error)
- func (c *Collection) Join(key string, obj map[string]interface{}, overwrite bool) error
- func (c *Collection) Keys() ([]string, error)
- func (c *Collection) Length() int64
- func (c *Collection) MergeFromTable(frameName string, table [][]interface{}, overwrite bool, verbose bool) error
- func (c *Collection) MergeIntoTable(frameName string, table [][]interface{}, overwrite bool, verbose bool) ([][]interface{}, error)
- func (c *Collection) ObjectList(keys []string, dotPaths []string, labels []string, verbose bool) ([]map[string]interface{}, error)
- func (c *Collection) Prune(key string, filename string) error
- func (c *Collection) PruneAll(key string) error
- func (c *Collection) PruneVersion(key string, filename string, version string) error
- func (c *Collection) Read(key string, obj map[string]interface{}) error
- func (c *Collection) ReadJSON(key string) ([]byte, error)
- func (c *Collection) ReadJSONVersion(key string, semver string) ([]byte, error)
- func (c *Collection) ReadObject(key string, obj interface{}) error
- func (c *Collection) ReadObjectVersion(key string, version string, obj interface{}) error
- func (c *Collection) ReadVersion(key string, version string, obj map[string]interface{}) error
- func (c *Collection) RetrieveFile(key string, filename string) ([]byte, error)
- func (c *Collection) RetrieveStream(key string, filename string, out io.Writer) error
- func (c *Collection) RetrieveVersionFile(key string, filename string, version string) ([]byte, error)
- func (c *Collection) RetrieveVersionStream(key string, filename string, version string, buf io.Writer) error
- func (c *Collection) Sample(size int) ([]string, error)
- func (c *Collection) SaveFrame(name string, f *DataFrame) error
- func (c *Collection) SetVersioning(versioning string) error
- func (c *Collection) Update(key string, obj map[string]interface{}) error
- func (c *Collection) UpdateJSON(key string, src []byte) error
- func (c *Collection) UpdateMetadata(fName string) error
- func (c *Collection) UpdateObject(key string, obj interface{}) error
- func (c *Collection) UpdatedKeys(start string, end string) ([]string, error)
- func (c *Collection) Versions(key string) ([]string, error)
- func (c *Collection) WorkPath() string
- type Config
- type DSImport
- type DSQuery
- type DataFrame
- type PTStore
- func (store *PTStore) Close() error
- func (store *PTStore) Create(key string, src []byte) error
- func (store *PTStore) Delete(key string) error
- func (store *PTStore) DocPath(key string) (string, error)
- func (store *PTStore) HasKey(key string) bool
- func (store *PTStore) Keymap() map[string]string
- func (store *PTStore) KeymapName() string
- func (store *PTStore) Keys() ([]string, error)
- func (store *PTStore) Length() int64
- func (store *PTStore) Read(key string) ([]byte, error)
- func (store *PTStore) ReadVersion(key string, version string) ([]byte, error)
- func (store *PTStore) SetVersioning(setting int) error
- func (store *PTStore) Update(key string, src []byte) error
- func (store *PTStore) UpdateKeymap(keymap map[string]string) error
- func (store *PTStore) Versions(key string) ([]string, error)
- type SQLStore
- func (store *SQLStore) Close() error
- func (store *SQLStore) Create(key string, src []byte) error
- func (store *SQLStore) Delete(key string) error
- func (store *SQLStore) HasKey(key string) bool
- func (store *SQLStore) Keys() ([]string, error)
- func (store *SQLStore) Length() int64
- func (store *SQLStore) Read(key string) ([]byte, error)
- func (store *SQLStore) ReadVersion(key string, version string) ([]byte, error)
- func (store *SQLStore) SetVersioning(setting int) error
- func (store *SQLStore) Update(key string, src []byte) error
- func (store *SQLStore) UpdatedKeys(start string, end string) ([]string, error)
- func (store *SQLStore) Versions(key string) ([]string, error)
- type Settings
- type StorageSystem
Constants ¶
const (
	// PTSTORE describes the storage type using a pairtree
	PTSTORE = "pairtree"
	// SQLSTORE describes the SQL storage type
	SQLSTORE = "sqlstore"
)
const (
	// None means versioning is turned off for collection
	None = iota
	// Major means increment the major semver value on creation or update
	Major
	// Minor means increment the minor semver value on creation or update
	Minor
	// Patch means increment the patch semver value on creation or update
	Patch
)
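The versioning constants above name which semver component is incremented when a document is created or updated. The following is a minimal sketch of that idea, not the package's own implementation: the constants are redeclared locally so the snippet compiles on its own, `bump` is a hypothetical helper, and resetting the lower components to zero follows the usual semver.org convention, which is an assumption about how dataset behaves.

```go
package main

import "fmt"

// Local copies of the versioning constants shown above, so this
// sketch compiles without importing the dataset package.
const (
	None = iota
	Major
	Minor
	Patch
)

// bump is a hypothetical helper (not part of dataset). It returns the
// next version for the given versioning setting. Resetting the lower
// components to zero is an assumption based on semver.org.
func bump(major, minor, patch, setting int) (int, int, int) {
	switch setting {
	case Major:
		return major + 1, 0, 0
	case Minor:
		return major, minor + 1, 0
	case Patch:
		return major, minor, patch + 1
	}
	// None: versioning is off, so the version is left unchanged.
	return major, minor, patch
}

func main() {
	ma, mi, pa := bump(0, 1, 2, Minor)
	fmt.Printf("%d.%d.%d\n", ma, mi, pa) // prints "0.2.0"
}
```

With `None` the version passes through unchanged, which mirrors the comment that versioning is turned off for the collection.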
const (
	// Version number of release
	Version = "2.1.7"
	// ReleaseDate, the date version.go was generated
	ReleaseDate = "2023-10-02"
	// ReleaseHash, the Git hash when version.go was generated
	ReleaseHash = "d1b3172"
	LicenseText = `` /* 1524-byte string literal not displayed */
)
const (
	// License is a formatted form for dataset package based command line tools
	License = `` /* 1545-byte string literal not displayed */
)
Variables ¶
var (
	//
	// documentation for running Daemon
	//
	WebDescription = `` /* 455-byte string literal not displayed */
	WebExamples = `` /* 829-byte string literal not displayed */
	// EndPointREADME a copy of docs/datasetd.md
	EndPointREADME = `
{app_name}
==========

Overview
--------

{app_name} is a minimal web service intended to run on localhost port 8485. It presents one or more dataset collections as a web service. It features a subset of the functionality available with the dataset command line program. {app_name} does support multi-process/asynchronous update to a dataset collection.

{app_name} is notable in what it does not provide. It does not provide user/role access restrictions to a collection. It is not intended to be a standalone web service on the public internet or local area network. It does not provide support for search or complex querying. If you need these features I suggest looking at existing mature NoSQL data management solutions like Couchbase, MongoDB, MySQL (which now supports JSON objects) or Postgres (which also supports JSON objects). {app_name} is a simple, minimal service.

NOTE: {app_name} could be combined with a front end web service like Apache 2 or NginX and through them provide access control based on {app_name}'s predictable URL paths. That would require a robust understanding of the front end web server, its access control mechanisms and how to defend a proxied service. That is beyond the scope of this project.

Configuration
-------------

{app_name} can make one or more dataset collections visible over HTTP. The dataset collections hosted need to be available on the same file system as where {app_name} is running. {app_name} is configured by reading a "settings.json" file, either in the local directory where it is launched or from a path given on the command line that points to an appropriate JSON settings file.
The "settings.json" file has the following structure

` + "```" + `
{
    "host": "localhost:8485",
    "sql_type": "mysql",
    "dsn": "<DSN_STRING>"
}
` + "```" + `

Running {app_name}
----------------

{app_name} runs as an HTTP service and as such can be exploited in the same manner as other services using HTTP. You should only run {app_name} on localhost on a trusted machine. If the machine is a multi-user machine all users can have access to the collections exposed by {app_name} regardless of the file permissions they may have in their account.

Example: If all dataset collections are in a directory that only the "web-data" user is allowed to access, but other users on the machine have access to curl, they can access the dataset collections with the rights of the "web-data" user by accessing the HTTP service. This is a typical situation for most localhost based web services and you need to be aware of it if you choose to run {app_name}.

{app_name} should NOT be used to store confidential, sensitive or secret information.

Supported Features
------------------

{app_name} provides a RESTful web service for accessing a collection's metadata, keys, documents, frames and attachments. The form of the path is generally '/rest/<COLLECTION_NAME>/<DOC_TYPE>/<ID>[/<NAME>]'. REST maps the CRUD operations to POST (create), GET (read), PUT (update), and DELETE (delete).

There are four general types of objects in a dataset collection

1. keys (point to a JSON document, these are unique identifiers)
2. docs are the JSON documents
3. frames hold data frames (aggregations of a collection's content)
4. attachments hold files attached to JSON documents

Additionally you can list all the collections available in the web service as well as collection level metadata (as a codemeta.json document).

Collections can have their CRUD operations turned on or off based on the columns set in the "_collections" table of the database hosting the web service.
Use case
--------

In this use case a dataset collection called "recipes.ds" has been previously created and populated using the command line tool.

If I have a settings file for "recipes" based on the collection "recipes.ds" and want to make it read only, I would set the attribute "read" to true, and if I want the option of listing the keys in the collection I would set "keys" to true as well.

` + "```" + `
{
    "host": "localhost:8485",
    "collections": {
        "recipes": {
            "dataset": "recipes.ds",
            "keys": true,
            "read": true
        }
    }
}
` + "```" + `

I would start {app_name} with the following command line.

` + "```" + `shell
{app_name} settings.json
` + "```" + `

This would display the start up message and log output of the service.

In another shell session I could then use curl to list the keys and read a record. In this example I assume that "waffles" is a JSON document in dataset collection "recipes.ds".

` + "```" + `shell
curl http://localhost:8485/recipes/read/waffles
` + "```" + `

This would return the "waffles" JSON document or a 404 error if the document was not found.

Listing the keys for "recipes.ds" could be done with this curl command.

` + "```" + `shell
curl http://localhost:8485/recipes/keys
` + "```" + `

This would return a list of keys, one per line. You could show all JSON documents in the collection by retrieving a list of keys and iterating over them using curl. Here's a simple example in Bash.

` + "```" + `shell
for KEY in $(curl http://localhost:8485/recipes/keys); do
    curl "http://localhost:8485/recipes/read/${KEY}"
done
` + "```" + `

Add a new JSON object to a collection.

` + "```" + `shell
KEY="sunday"
curl -X POST -H 'Content-Type:application/json' \
    "http://localhost:8485/recipes/create/${KEY}" \
    -d '{"ingredients":["banana","ice cream","chocolate syrup"]}'
` + "```" + `

Online Documentation
--------------------

{app_name} provides documentation as plain text output via requests to the service end points without parameters. Continuing with our "recipes" example.
Try the following URLs with curl.

` + "```" + `
curl http://localhost:8485
curl http://localhost:8485/recipes
curl http://localhost:8485/recipes/create
curl http://localhost:8485/recipes/read
curl http://localhost:8485/recipes/update
curl http://localhost:8485/recipes/delete
curl http://localhost:8485/recipes/attach
curl http://localhost:8485/recipes/retrieve
curl http://localhost:8485/recipes/prune
` + "```" + `

End points
----------

The following end points are supported by {app_name}

- '/' returns documentation for {app_name}
- '/collections' returns a list of available collections.

The following end points are per collection. They are available for each collection where the settings are set to true. Some end points require the POST HTTP method and specific content types.

The terms '<COLLECTION_ID>', '<KEY>' and '<SEMVER>' refer to the collection path, the string representing the "key" to a JSON document and the semantic version number for an attachment. Unless specified otherwise, end points support the GET method exclusively.

- '/<COLLECTION_ID>' returns general dataset documentation with some tailoring to the collection.
- '/<COLLECTION_ID>/keys' returns a list of keys available in the collection
- '/<COLLECTION_ID>/create' returns documentation on the 'create' end point
- '/<COLLECTION_ID>/create/<KEY>' requires the POST method with content type header of 'application/json'. It can accept a JSON document up to 1 MiB in size. It will create a new JSON document in the collection or return an HTTP error if that fails
- '/<COLLECTION_ID>/read' returns documentation on the 'read' end point
- '/<COLLECTION_ID>/read/<KEY>' returns a JSON object for key or an HTTP error
- '/<COLLECTION_ID>/update' returns documentation on the 'update' end point
- '/<COLLECTION_ID>/update/<KEY>' requires the POST method with content type header of 'application/json'. It can accept a JSON document up to 1 MiB in size.
  It will replace an existing document in the collection or return an HTTP error if that fails
- '/<COLLECTION_ID>/delete' returns documentation on the 'delete' end point
- '/<COLLECTION_ID>/delete/<KEY>' requires the GET method. It will delete a JSON document for the key provided or return an HTTP error
- '/<COLLECTION_ID>/attach' returns documentation on attaching a file to a JSON document in the collection.
- '/<COLLECTION_ID>/attach/<KEY>/<SEMVER>/<FILENAME>' requires a POST method and expects a multi-part web form providing the filename in the 'filename' field. The <FILENAME> in the URL is used in storing the file. The document will be written to the JSON document directory indicated by '<KEY>', in the sub directory indicated by '<SEMVER>'. See https://semver.org/ for more information on semantic version numbers.
- '/<COLLECTION_ID>/retrieve' returns documentation on how to retrieve a versioned attachment from a JSON document.
- '/<COLLECTION_ID>/retrieve/<KEY>/<SEMVER>/<FILENAME>' returns the versioned attachment from a JSON document or an HTTP error if that fails
- '/<COLLECTION_ID>/prune' removes a versioned attachment from a JSON document or returns an HTTP error if that fails.
- '/<COLLECTION_ID>/prune/<KEY>/<SEMVER>/<FILENAME>' removes a versioned attachment from a JSON document.
`
	EndPointCollections = `
Collections (end point)
=======================

Interacting with the _{app_name}_ web service can be done with any web client. For documentation purposes I am assuming you are using [curl](https://curl.se/). This command line program is available on most POSIX systems including Linux, macOS and Windows.

This provides a JSON list of collections available from the running _{app_name}_ service.

Example
=======

The assumption is that we have _{app_name}_ running on port "8485" of "localhost" and a set of collections, "t1" and "t2", defined in the "settings.json" used at launch.
` + "```" + `{.json}
[
    "t1",
    "t2"
]
` + "```" + `

`
	EndPointCollection = `
Collection (end point)
=======================

Interacting with the _{app_name}_ web service can be done with any web client. For documentation purposes I am assuming you are using [curl](https://curl.se/). This command line program is available on most POSIX systems including Linux, macOS and Windows.

This provides metadata as JSON for a specific collection. This may include attributes like authorship, funding and contributions.

If this end point is requested with a GET method then the data is returned; if requested with a POST method the data is updated and the updated metadata returned. The POST must submit a JSON encoded object with the mime type of "application/json".

The metadata fields are

- "dataset" (string, semver, version of dataset managing collection)
- "name" (string) name of dataset collection
- "contact" (string) free format contact info
- "description" (string)
- "doi" (string) a DOI assigned to the collection
- "created" (string) a date string in RFC1123 format
- "version" (string) the collection's version as a semver
- "author" (array of PersonOrOrg) a list of authors of the collection
- "contributor" (array of PersonOrOrg) a list of contributors to a collection
- "funder" (array of PersonOrOrg) a list of funders of the collection
- "annotations" (an object) this is a map to any ad-hoc fields for the collection's metadata

The PersonOrOrg structure holds the metadata for either a person or organization. This is inspired by codemeta's person or organization object scheme.
For a person you'd have a structure like

- "@type" (the string "Person")
- "@id" (string) the person's ORCID
- "givenName" (string) person's given name
- "familyName" (string) person's family name
- "affiliation" (array of PersonOrOrg) a list of affiliated organizations

For an organization you'd have a structure like

- "@type" (the string "Organization")
- "@id" (string) the organization's ROR
- "name" (string) name of organization

Example
=======

The assumption is that we have _{app_name}_ running on port "8485" of "localhost" and a collection named characters is defined in the "settings.json" used at launch.

Retrieving metadata

` + "```" + `{.shell}
curl -X GET http://localhost:8485/collection/characters
` + "```" + `

This would return the metadata found for our characters' collection.

` + "```" + `
{
    "dataset_version": "v0.1.10",
    "name": "characters.ds",
    "created": "2021-07-28T11:32:36-07:00",
    "version": "v0.0.0",
    "author": [
        {
            "@type": "Person",
            "@id": "https://orcid.org/0000-0000-0000-0000",
            "givenName": "Jane",
            "familyName": "Doe",
            "affiliation": [
                {
                    "@type": "Organization",
                    "@id": "https://ror.org/05dxps055",
                    "name": "California Institute of Technology"
                }
            ]
        }
    ],
    "contributor": [
        {
            "@type": "Person",
            "givenName": "Martha",
            "familyName": "Doe",
            "affiliation": [
                {
                    "@type": "Organization",
                    "@id": "https://ror.org/05dxps055",
                    "name": "California Institute of Technology"
                }
            ]
        }
    ],
    "funder": [
        {
            "@type": "Organization",
            "name": "Caltech Library"
        }
    ],
    "annotation": {
        "award": "00000000000000001-2021"
    }
}
` + "```" + `

Updating metadata requires a POST with content type "application/json". In this example the dataset collection is named "t1" and only the "name" and "dataset_version" are set.
` + "```" + `{.shell}
curl -X POST -H 'Content-Type: application/json' \
    http://localhost:8485/collection/t1 \
    -d '{"author":[{"@type":"Person","givenName":"Jane","familyName":"Doe"}]}'
` + "```" + `

The curl call returns

` + "```" + `{.json}
{
    "dataset_version": "1.0.1",
    "name": "T1.ds",
    "author": [
        {
            "@type": "Person",
            "givenName": "Robert",
            "familyName": "Doiel"
        }
    ]
}
` + "```" + `

`
	EndPointKeys = `
Keys (end point)
================

Interacting with the _{app_name}_ web service can be done with any web client. For documentation purposes I am assuming you are using [curl](https://curl.se/). This command line program is available on most POSIX systems including Linux, macOS and Windows.

This end point lists keys available in a collection.

    'http://localhost:8485/<COLLECTION_ID>/keys'

Requires a "GET" method.

The keys are returned as a JSON array, or an HTTP error if not found.

Example
-------

In this example '<COLLECTION_ID>' is "t1".

` + "```" + `{.shell}
curl http://localhost:8485/t1/keys
` + "```" + `

The document returned looks something like

` + "```" + `
[
    "one",
    "two",
    "three"
]
` + "```" + `

For a "t1" containing the keys of "one", "two" and "three".

`
	EndPointDocument = `
Create (end point)
==================

Interacting with the _{app_name}_ web service can be done with any web client. For documentation purposes I am assuming you are using [curl](https://curl.se/). This command line program is available on most POSIX systems including Linux, macOS and Windows.

Create a JSON document in the collection. Requires a unique key in the URL and the content must be JSON less than 1 MiB in size.

    'http://localhost:8485/<COLLECTION_ID>/create/<KEY>'

Requires a "POST" HTTP method.

Creates a JSON document for the '<KEY>' in collection '<COLLECTION_ID>'. On success it returns HTTP 201 Created. Otherwise an HTTP error if creation fails.

The "POST" needs to be JSON encoded and use a Content-Type of "application/json" in the request header.
Example ------- The '<COLLECTION_ID>' is "t1", the '<KEY>' is "one" The content posted is ` + "```" + `{.json} { "one": 1 } ` + "```" + ` Posting using CURL is done like ` + "```" + `shell curl -X POST -H 'Content-Type: application/json' \ -d '{"one": 1}' \ http://localhost:8485/t1/create/one ` + "```" + ` ` EndPointRead = ` Read (end point) ================ Interacting with the _{app_name}_ web service can be done with any web client. For documentation purposes I am assuming you are using [curl](https://curl.se/). This command line program is available on most POSIX systems including Linux, macOS and Windows. Retrieve a JSON document from a collection. 'http://localhost:8485/<COLLECTION_ID>/read/<KEY>' Requires a "GET" HTTP method. Returns the JSON document for the given '<KEY>' found in '<COLLECTION_ID>' or an HTTP error if not found. Example ------- Curl accessing "t1" with a key of "one" ` + "```" + `{.shell} curl http://localhost:8485/t1/read/one ` + "```" + ` An example JSON document (this example happens to have an attachment) returned. ` + "```" + ` { "_Attachments": [ { "checksums": { "0.0.1": "bb327f7bcca0f88649f1c6acfdc0920f" }, "created": "2021-10-11T11:09:51-07:00", "href": "T1.ds/pairtree/on/e/0.0.1/a1.png", "modified": "2021-10-11T11:09:51-07:00", "name": "a1.png", "size": 32511, "sizes": { "0.0.1": 32511 }, "version": "0.0.1", "version_hrefs": { "0.0.1": "T1.ds/pairtree/on/e/0.0.1/a1.png" } } ], "_Key": "one", "four": "four", "one": 1, "three": 3, "two": 2 } ` + "```" + ` ` EndPointUpdate = ` Update (end point) ================== Interacting with the _{app_name}_ web service can be done with any web client. For documentation purposes I am assuming you are using [curl](https://curl.se/). This command line program is available on most POSIX systems including Linux, macOS and Windows. Update a JSON document in the collection. Requires a key to an existing JSON record in the URL and the content must be JSON less than 1 MiB in size.
'http://localhost:8485/<COLLECTION_ID>/update/<KEY>' Requires a "POST" HTTP method. Updates a JSON document for the '<KEY>' in collection '<COLLECTION_ID>'. On success it returns HTTP 200 OK. Otherwise it returns an HTTP error if the update fails. The "POST" needs to be JSON encoded and use a Content-Type of "application/json" in the request header. Example ------- The '<COLLECTION_ID>' is "t1", the '<KEY>' is "one" The revised content posted is ` + "```" + `{.json} { "one": 1, "two": 2, "three": 3, "four": 4 } ` + "```" + ` Posting using CURL is done like ` + "```" + `shell curl -X POST -H 'Content-Type: application/json' \ -d '{"one": 1, "two": 2, "three": 3, "four": 4}' \ http://localhost:8485/t1/update/one ` + "```" + ` ` EndPointDelete = ` Delete (end point) ================== Interacting with the _{app_name}_ web service can be done with any web client. For documentation purposes I am assuming you are using [curl](https://curl.se/). This command line program is available on most POSIX systems including Linux, macOS and Windows. Delete a JSON document in the collection. Requires the document key and collection name. 'http://localhost:8485/<COLLECTION_ID>/delete/<KEY>' Requires a 'GET' HTTP method. Deletes a JSON document for the '<KEY>' in collection '<COLLECTION_ID>'. On success it returns HTTP 200 OK. Otherwise it returns an HTTP error if the deletion fails. Example ------- The '<COLLECTION_ID>' is "t1", the '<KEY>' is "one" The request using CURL is done like ` + "```" + `shell curl -X GET -H 'Content-Type: application/json' \ http://localhost:8485/t1/delete/one ` + "```" + ` ` EndPointAttach = ` Attach (end point) ================== Interacting with the _{app_name}_ web service can be done with any web client. For documentation purposes I am assuming you are using [curl](https://curl.se/). This command line program is available on most POSIX systems including Linux, macOS and Windows. Attaches a document to a JSON Document using '<KEY>', '<SEMVER>' and '<FILENAME>'.
'http://localhost:8485/<COLLECTION_ID>/attach/<KEY>/<SEMVER>/<FILENAME>' Requires a "POST" method. The "POST" is expected to be a multi-part web form providing the source filename in the field "filename". The document will be written to the JSON document directory identified by '<KEY>' in the sub directory indicated by '<SEMVER>'. See https://semver.org/ for more information on semantic version numbers. Example ======= In this example the '<COLLECTION_ID>' is "t1", the '<KEY>' is "one" and the content uploaded is "a1.png" in the home directory "/home/jane.doe". The '<SEMVER>' is "0.0.1". ` + "```" + `shell curl -X POST -H 'Content-Type: multipart/form-data' \ -F 'filename=@/home/jane.doe/a1.png' \ http://localhost:8485/t1/attach/one/0.0.1/a1.png ` + "```" + ` NOTE: The URL contains the filename used in the saved attachment. If I didn't want to call it "a1.png" I could have provided a different name in the URL path. ` EndPointRetrieve = ` Retrieve (end point) ==================== Interacting with the _{app_name}_ web service can be done with any web client. For documentation purposes I am assuming you are using [curl](https://curl.se/). This command line program is available on most POSIX systems including Linux, macOS and Windows. Retrieves an attached document from a JSON record using '<KEY>', '<SEMVER>' and '<FILENAME>'. 'http://localhost:8485/<COLLECTION_ID>/retrieve/<KEY>/<SEMVER>/<FILENAME>' Requires a "GET" method. The document is read from the JSON document directory identified by '<KEY>' in the sub directory indicated by '<SEMVER>'. See https://semver.org/ for more information on semantic version numbers. Example ------- In this example we're retrieving the '<FILENAME>' of "a1.png", with the '<SEMVER>' of "0.0.1" from the '<COLLECTION_ID>' of "t1" and '<KEY>' of "one" using curl.
` + "```" + `{.shell} curl http://localhost:8485/t1/retrieve/one/0.0.1/a1.png ` + "```" + ` This should trigger a download of the "a1.png" image file in the collection for document "one". ` EndPointPrune = ` Prune (end point) ================= Removes an attached document from a JSON record using '<KEY>', '<SEMVER>' and '<FILENAME>'. 'http://localhost:8485/<COLLECTION_ID>/prune/<KEY>/<SEMVER>/<FILENAME>' Requires a GET method. Returns an HTTP 200 OK on success or an HTTP error code if not. See https://semver.org/ for more information on semantic version numbers. Example ------- In this example '<COLLECTION_ID>' is "t1", '<KEY>' is "one", '<SEMVER>' is "0.0.1" and '<FILENAME>' is "a1.png". Once again our example uses curl. ` + "```" + ` curl http://localhost:8485/t1/prune/one/0.0.1/a1.png ` + "```" + ` This will cause the attached file to be removed from the record and collection. ` WEBDescription = ` USAGE ===== {app_name} SETTINGS_FILENAME SYNOPSIS -------- {app_name} is a web service for serving dataset collections via HTTP/HTTPS. DETAIL ------ {app_name} is a minimal web service typically run on localhost port 8485 that exposes a dataset collection as a web service. It features a subset of functionality available with the dataset command line program. {app_name} does support multi-process/asynchronous update to a dataset collection. {app_name} is notable in what it does not provide. It does not provide user/role access restrictions to a collection. It is not intended to be a stand alone web service on the public internet or local area network. It does not provide support for search or complex querying. If you need these features I suggest looking at existing mature NoSQL style solutions like Couchbase, MongoDB, MySQL (which now supports JSON objects) or Postgres (which also supports JSON objects). {app_name} is a simple, minimal service.
NOTE: You could run {app_name} with access control based on a set of URL paths by running {app_name} behind a full featured web server like Apache 2 or NginX but that is beyond the scope of this project. Configuration ------------- {app_name} can make one or more dataset collections visible over HTTP/HTTPS. The dataset collections hosted need to be available on the same file system as where {app_name} is running. {app_name} is configured by reading a "settings.json" file in either the local directory where it is launched or in a directory specified on the command line. The "settings.json" file has the following structure ` + "```" + ` { "host": "localhost:8483", "sql_type": "mysql", "dsn": "DB_USER:DB_PASSWORD3@/DB_NAME" } ` + "```" + ` The "host" is the URL listened to by the dataset daemon, the "sql_type" is usually "mysql" though it could be "sqlite", and the "dsn" is the data source name used to initialize the connection to the SQL engine. It is SQL engine specific. E.g. if "sql_type" is "sqlite" then the "dsn" might be "file:DB_NAME?cache=shared". Running {app_name} ------------------ {app_name} runs as a HTTP/HTTPS service and as such can be exploited as other network based services can be. It is recommended you only run {app_name} on localhost on a trusted machine. If the machine is a multi-user machine all users can have access to the collections exposed by {app_name} regardless of the file permissions they may have in their account. E.g. If all dataset collections are in a directory only allowed access by the "web-data" user but another user on the system can run cURL then they can access the dataset collections based on the rights of the "web-data" user. This is a typical situation for most web services and you need to be aware of it if you choose to run {app_name}. A way to minimize the problem would be to run {app_name} in a container restricted to the specific user.
Supported Features ------------------ {app_name} provides a limited subset of actions supported by the standard dataset command line tool. It only supports the following verbs 1. init (create a new SQL based collection) 2. keys (return a list of all keys in the collection) 3. has-key (return true if has key false otherwise) 4. create (create a new JSON document in the collection) 5. read (read a JSON document from a collection) 6. update (update a JSON document in the collection) 7. delete (delete a JSON document in the collection) 8. frames (list frames available) 9. frame (define a frame) 10. frame-def (show frame definition) 11. frame-objects (return list of framed objects) 12. refresh (refresh all the objects in a frame) 13. reframe (update the definition and reload the frame) 14. delete-frame (remove the frame) 15. has-frame (returns true if frame exists, false otherwise) 16. codemeta (imports a codemeta JSON file providing collection metadata) Each of these "actions" can be restricted in the _collections table by setting the value to "false". If the attribute for the action is not specified in the JSON settings file then it is assumed to be "false". Working with {app_name} --------------------- E.g. if I have a settings file for "recipes" based on the collection "recipes.ds" and want to make it read only I would set the attribute "read" to true and if I want the option of listing the keys in the collection I would set that to true also. { "host": "localhost:8485", "collections": { "recipes": { "dataset": "recipes.ds", "keys": true, "read": true } } } I would start {app_name} with the following command line. {app_name} settings.json This would display the start up message and log output of the service. In another shell session I could then use cURL to list the keys and read a record. In this example I assume that "waffles" is a JSON document in dataset collection "recipes.ds".
curl http://localhost:8485/recipes/read/waffles This would return the "waffles" JSON document or a 404 error if the document was not found. Listing the keys for "recipes.ds" could be done with this cURL command. curl http://localhost:8485/recipes/keys This would return a list of keys, one per line. You could show all JSON documents in the collection by retrieving a list of keys and iterating over them using cURL. Here's a simple example in Bash. for KEY in $(curl http://localhost:8485/recipes/keys); do curl "http://localhost:8485/recipes/read/${KEY}"; done Access Documentation -------------------- {app_name} provides documentation as plain text output via requests to the service end points without parameters. Continuing with our "recipes" example, try the following URLs with cURL. curl http://localhost:8485 curl http://localhost:8485/recipes curl http://localhost:8485/recipes/read {app_name} is intended to be combined with other services like Solr 8.9. {app_name} only implements the simplest of object storage. ` )
Functions ¶
func Analyzer ¶
Analyzer checks the collection version and analyzes current state of collection reporting on errors.
NOTE: the collection MUST BE CLOSED when Analyzer is called otherwise the results will not be accurate.
func ApiDisplayUsage ¶
ApiDisplayUsage displays a usage message.
func ApiVersion ¶
func ApiVersion(w http.ResponseWriter, r *http.Request, api *API, cName string, verb string, options []string)
ApiVersion returns the version of the web service running. This will normally be the same version of dataset you installed.
```shell
curl -X GET http://localhost:8485/api/version
```
func Attach ¶
Attach will add or replace an attachment for a JSON object in the collection.
```shell
KEY="123"
FILENAME="mystuff.zip"
curl -X POST \
  http://localhost:8585/api/journals.ds/attachment/$KEY/$FILENAME \
  -H "Content-Type: application/zip" \
  --data-binary "@./mystuff.zip"
```
func AttachmentVersions ¶
func Attachments ¶
func Attachments(w http.ResponseWriter, r *http.Request, api *API, cName, verb string, options []string)
Attachments lists the attachments available for a JSON object in the collection.
```shell
KEY="123"
curl -X GET http://localhost:8585/api/journals.ds/attachments/$KEY
```
func BytesProcessor ¶
BytesProcessor takes a text and replaces all the keys (e.g. "{app_name}") with their values (e.g. "dataset"). It is used to prepare command line and daemon documentation for display.
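The key substitution described here can be sketched with the standard library alone. This is a minimal illustration, not the package's implementation: the helper name `replaceKeys` and the map-based parameter are assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// replaceKeys is a hypothetical helper (not part of dataset). It swaps each
// key found in vars, e.g. "{app_name}", for its value in the text.
func replaceKeys(vars map[string]string, text string) string {
	for key, val := range vars {
		text = strings.ReplaceAll(text, key, val)
	}
	return text
}

func main() {
	vars := map[string]string{"{app_name}": "dataset"}
	fmt.Println(replaceKeys(vars, "{app_name} is a command line tool"))
}
```

A map keeps the sketch short; it assumes the replacement values do not themselves contain other keys, since map iteration order is not fixed.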
func CliDisplayHelp ¶
CliDisplayHelp writes out help on a supported topic
func CliDisplayUsage ¶
CliDisplayUsage displays a usage message.
func Codemeta ¶
func Codemeta(w http.ResponseWriter, r *http.Request, api *API, cName string, verb string, options []string)
Codemeta returns the codemeta JSON for a specific collection. Example collection name "journals.ds"
```shell
curl -X GET http://localhost:8485/api/collection/journals.ds
```
func Collections ¶
func Collections(w http.ResponseWriter, r *http.Request, api *API, cName string, verb string, options []string)
Collections returns a list of dataset collections supported by the running web service.
```shell
curl -X GET http://localhost:8485/api/collections
```
func Create ¶
func Create(w http.ResponseWriter, r *http.Request, api *API, cName string, verb string, options []string)
Create deposits a JSON object in the collection for a given key.
In this example the json document is in the working directory called "record-123.json" and the environment variable KEY holds the document key which is the string "123".
```shell
KEY="123"
curl -X POST http://localhost:8585/api/journals.ds/object/$KEY \
  -H "Content-Type: application/json" \
  --data-binary "@./record-123.json"
```
func Delete ¶
func Delete(w http.ResponseWriter, r *http.Request, api *API, cName string, verb string, options []string)
Delete removes a JSON object from the collection for a given key.
In this example the environment variable KEY holds the document key which is the string "123".
```shell
KEY="123"
curl -X DELETE http://localhost:8585/api/journals.ds/object/$KEY
```
func DeleteVersion ¶
func DisplayLicense ¶
DisplayLicense returns the license associated with dataset application.
func DisplayVersion ¶
DisplayVersion returns the version of the dataset application.
func FixMissingCollectionJson ¶
FixMissingCollectionJson will scan the collection directory and environment, making an educated guess as to the type of collection.
func FmtHelp ¶ added in v2.1.3
func FmtHelp(src string, appName string, version string, releaseDate string, releaseHash string) string
FmtHelp lets you process a text block with simple curly brace markup.
func FrameCreate ¶
func FrameCreate(w http.ResponseWriter, r *http.Request, api *API, cName, verb string, options []string)
FrameCreate creates a new frame in a collection. It accepts the frame definition as a POST of JSON.
```shell
FRM_NAME="names"
cat <<EOT >frame-def.json
{
  "dot_paths": [ ".given", ".family" ],
  "labels": [ "Given Name", "Family Name" ],
  "keys": [ "Miller-A", "Stienbeck-J", "Topez-T", "Valdez-L" ]
}
EOT
curl -X POST http://localhost:8585/api/journals.ds/frame/$FRM_NAME \
  -H "Content-Type: application/json" \
  --data-binary "@./frame-def.json"
```
func FrameDef ¶
func FrameDef(w http.ResponseWriter, r *http.Request, api *API, cName, verb string, options []string)
FrameDef retrieves the frame definition associated with a frame
```shell
FRM_NAME="names"
curl -X GET http://localhost:8585/api/journals.ds/frame-def/$FRM_NAME
```
func FrameDelete ¶
func FrameDelete(w http.ResponseWriter, r *http.Request, api *API, cName, verb string, options []string)
FrameDelete removes a frame from a collection.
```shell
FRM_NAME="names"
curl -X DELETE http://localhost:8585/api/journals.ds/frame/$FRM_NAME
```
func FrameKeys ¶
func FrameKeys(w http.ResponseWriter, r *http.Request, api *API, cName, verb string, options []string)
FrameKeys retrieves the list of keys associated with a frame
```shell
FRM_NAME="names"
curl -X GET http://localhost:8585/api/journals.ds/frame-keys/$FRM_NAME
```
func FrameObjects ¶
func FrameObjects(w http.ResponseWriter, r *http.Request, api *API, cName, verb string, options []string)
FrameObjects retrieves the frame objects associated with a frame
```shell
FRM_NAME="names"
curl -X GET http://localhost:8585/api/journals.ds/frame-objects/$FRM_NAME
```
func FrameUpdate ¶
func FrameUpdate(w http.ResponseWriter, r *http.Request, api *API, cName, verb string, options []string)
FrameUpdate updates a frame. With no request body it refreshes the current frame objects for the keys already associated with the frame. If a JSON array of keys is provided it reframes the objects using the new list of keys.
```shell
FRM_NAME="names"
curl -X PUT http://localhost:8585/api/journals.ds/frame/$FRM_NAME
```
Reframing a frame by providing new keys looks something like this:
```shell
FRM_NAME="names"
cat <<EOT >frame-keys.json
[ "Gentle-M", "Stienbeck-J", "Topez-T", "Valdez-L" ]
EOT
curl -X PUT http://localhost:8585/api/journals.ds/frame/$FRM_NAME \
  -H "Content-Type: application/json" \
  --data-binary "@./frame-keys.json"
```
func Frames ¶
Frames retrieves a list of available frames in a collection.
```shell
curl -X GET http://localhost:8585/api/journals.ds/frames
```
func HasFrame ¶
func HasFrame(w http.ResponseWriter, r *http.Request, api *API, cName, verb string, options []string)
HasFrame checks a collection for a frame by its name
```shell
FRM_NAME="name"
curl -X GET http://localhost:8585/api/journals.ds/has-frame/$FRM_NAME
```
func JSONIndent ¶ added in v2.1.4
JSONIndent takes a byte slice of JSON source and returns an indented version.
func JSONMarshal ¶ added in v2.1.2
JSONMarshal provides a custom JSON encoder to solve an issue with HTML entities getting converted to UTF-8 code points by json.Marshal(), json.MarshalIndent().
func JSONMarshalIndent ¶ added in v2.1.2
JSONMarshalIndent provides a custom JSON encoder to solve an issue with HTML entities getting converted to UTF-8 code points by json.Marshal(), json.MarshalIndent().
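For context, the standard library's json.Marshal() rewrites "<", ">" and "&" as \u003c, \u003e and \u0026. One common way to avoid that is an encoder with HTML escaping switched off. The sketch below shows that technique only; `marshalNoEscape` is a hypothetical name and this is not necessarily how JSONMarshal is implemented.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// marshalNoEscape is a hypothetical helper (not the dataset function). It
// encodes v as JSON while leaving "<", ">" and "&" as-is.
func marshalNoEscape(v interface{}) ([]byte, error) {
	buf := &bytes.Buffer{}
	enc := json.NewEncoder(buf)
	enc.SetEscapeHTML(false)
	if err := enc.Encode(v); err != nil {
		return nil, err
	}
	// Encode appends a trailing newline; trim it to mirror json.Marshal.
	return bytes.TrimRight(buf.Bytes(), "\n"), nil
}

func main() {
	src, err := marshalNoEscape(map[string]string{"title": "R&D <1>"})
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(string(src))
}
```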
func JSONUnmarshal ¶ added in v2.1.2
JSONUnmarshal is a custom JSON decoder that makes numbers easier to work with.
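By default encoding/json decodes every JSON number held in an interface{} as a float64, which can lose the distinction between integers and floats. A decoder with UseNumber() keeps the original digits as a json.Number. The sketch below illustrates that standard library technique; whether JSONUnmarshal works this way is an assumption, and `decodeObject` is a hypothetical name.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// decodeObject is a hypothetical helper (not the dataset function). It
// decodes a JSON object and keeps numbers as json.Number, not float64.
func decodeObject(src []byte) (map[string]interface{}, error) {
	dec := json.NewDecoder(bytes.NewReader(src))
	dec.UseNumber()
	obj := map[string]interface{}{}
	if err := dec.Decode(&obj); err != nil {
		return nil, err
	}
	return obj, nil
}

func main() {
	obj, err := decodeObject([]byte(`{"one": 1, "ratio": 2.5}`))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	// Each value is a json.Number, so the caller decides how to convert it.
	if n, ok := obj["one"].(json.Number); ok {
		i, _ := n.Int64()
		fmt.Println("one as int64:", i)
	}
}
```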
func Keys ¶
func Keys(w http.ResponseWriter, r *http.Request, api *API, cName string, verb string, options []string)
Keys returns the available keys in a collection as a JSON array. Example collection name "journals.ds"
```shell
curl -X GET http://localhost:8485/api/journals.ds/keys
```
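The same request can be made from Go. The sketch below decodes the JSON array of keys with the standard library only. `fetchKeys` and `fakeKeysServer` are hypothetical names, and the test server stands in for a running service so the example runs without one.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// fetchKeys requests a keys end point and decodes the JSON array it returns.
func fetchKeys(url string) ([]string, error) {
	resp, err := http.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %s", resp.Status)
	}
	keys := []string{}
	if err := json.NewDecoder(resp.Body).Decode(&keys); err != nil {
		return nil, err
	}
	return keys, nil
}

// fakeKeysServer is a stand-in for the real web service. It answers every
// request with a fixed JSON array of keys.
func fakeKeysServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `["one","two","three"]`)
	}))
}

func main() {
	srv := fakeKeysServer()
	defer srv.Close()
	// Against a real service the URL would be the one used with curl above.
	keys, err := fetchKeys(srv.URL + "/api/journals.ds/keys")
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println(keys)
}
```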
func MakeCSV ¶ added in v2.1.6
MakeCSV takes JSON source holding an array of objects and uses the attribute list to render a CSV file from the list. It returns the CSV content as a byte slice along with an error.
func MakeGrid ¶ added in v2.1.5
MakeGrid takes JSON source holding an array of objects and uses the attribute list to render a 2D grid of values where the columns match the attribute name list provided. If an attribute is missing a nil is inserted. MakeGrid returns the grid as JSON source along with an error value.
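The grid rendering described above can be sketched as follows. This is an illustration of the idea under stated assumptions, not the package's code: `makeGrid` is a hypothetical name, and the JSON null stands in for the "nil" inserted for a missing attribute.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// makeGrid is a hypothetical sketch (not the dataset function). It reads a
// JSON array of objects and returns a JSON 2D array whose columns follow the
// order of attributes. A missing attribute leaves a nil (JSON null) cell.
func makeGrid(src []byte, attributes []string) ([]byte, error) {
	objects := []map[string]interface{}{}
	if err := json.Unmarshal(src, &objects); err != nil {
		return nil, err
	}
	grid := make([][]interface{}, 0, len(objects))
	for _, obj := range objects {
		row := make([]interface{}, len(attributes)) // cells start as nil
		for i, attr := range attributes {
			if val, ok := obj[attr]; ok {
				row[i] = val
			}
		}
		grid = append(grid, row)
	}
	return json.Marshal(grid)
}

func main() {
	src := []byte(`[{"given":"Jane","family":"Doe"},{"given":"Martha"}]`)
	out, err := makeGrid(src, []string{"given", "family"})
	if err != nil {
		fmt.Println("grid failed:", err)
		return
	}
	fmt.Println(string(out))
}
```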
func Migrate ¶
Migrate a dataset v1 collection to a v2 collection. Both collections need to already exist. Records from v1 will be read out of v1 and created in v2.
NOTE: Migrate does not currently copy attachments.
func ObjectVersions ¶
func Prune ¶
Prune removes an attachment from a JSON object in the collection.
```shell
KEY="123"
FILENAME="mystuff.zip"
curl -X DELETE \
  http://localhost:8585/api/journals.ds/attachment/$KEY/$FILENAME
```
func PruneVersion ¶
func Read ¶
func Read(w http.ResponseWriter, r *http.Request, api *API, cName string, verb string, options []string)
Read retrieves a JSON object from the collection for a given key.
In this example the JSON retrieved will be saved as "record-123.json" and the environment variable KEY holds the document key as the string "123".
```shell
KEY="123"
curl -o "record-123.json" -X GET \
  http://localhost:8585/api/journals.ds/object/$KEY
```
func ReadKeys ¶
ReadKeys reads a list of keys given a filename or an io.Reader (e.g. standard input) as fallback. The key file should be formatted one key per line with a line delimiter of "\n".
```
keys, err := dataset.ReadKeys(keysFilename, os.Stdin)
if err != nil {
    // ... handle error
}
```
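The "one key per line" format can be read with a bufio.Scanner. The sketch below is not the ReadKeys implementation. `readKeyLines` and `keysFromString` are hypothetical helpers, and trimming whitespace and skipping blank lines are assumptions made for the example.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// readKeyLines is a hypothetical helper (not the dataset function). It reads
// one key per "\n" delimited line, trimming whitespace and skipping blanks.
func readKeyLines(r io.Reader) ([]string, error) {
	keys := []string{}
	scanner := bufio.NewScanner(r)
	for scanner.Scan() {
		key := strings.TrimSpace(scanner.Text())
		if key != "" {
			keys = append(keys, key)
		}
	}
	return keys, scanner.Err()
}

// keysFromString is a small convenience wrapper for in-memory key lists.
func keysFromString(text string) ([]string, error) {
	return readKeyLines(strings.NewReader(text))
}

func main() {
	keys, err := keysFromString("Miller-A\nTopez-T\n\nValdez-L\n")
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	fmt.Println(len(keys), keys)
}
```

Passing os.Stdin to `readKeyLines` would cover the standard input fallback mentioned above.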
func ReadSource ¶
ReadSource reads the source text from a filename or io.Reader (e.g. standard input) as fallback.
```
src, err := ReadSource(inputName, os.Stdin)
if err != nil {
    // ... handle error
}
```
func ReadVersion ¶
func Repair ¶
Repair takes a collection name, walks the pairtree and repairs collection.json as appropriate.
NOTE: the collection MUST BE CLOSED when Repair is called otherwise the repaired collection may revert.
func Retrieve ¶
func Retrieve(w http.ResponseWriter, r *http.Request, api *API, cName, verb string, options []string)
Retrieve retrieves an attachment from a JSON object in the collection.
```shell
KEY="123"
FILENAME="mystuff.zip"
curl -X GET \
  http://localhost:8585/api/journals.ds/attachment/$KEY/$FILENAME
```
func RetrieveVersion ¶
func RowInterfaceToString ¶ added in v2.1.0
func RowInterfaceToString(r []interface{}) []string
RowInterfaceToString takes a 1D slice of interface{} and returns a 1D slice of string. If a conversion fails then the cell will be set to an empty string.
func RowStringToInterface ¶ added in v2.1.0
func RowStringToInterface(r []string) []interface{}
RowStringToInterface takes a 1D slice of string and returns a 1D slice of interface{}
func RunAPI ¶
RunAPI takes a JSON configuration file and opens all the collections to be used by web service.
```
appName := path.Base(os.Args[0])
settingsFile := "settings.json"
if err := api.RunAPI(appName, settingsFile); err != nil {
    // ... handle error
}
```
func SetupApiTestCollection ¶
func StringProcessor ¶
StringProcessor takes a text and replaces all the keys (e.g. "{app_name}") with their values (e.g. "dataset"). It is used to prepare command line and daemon documentation for display.
func TableInterfaceToString ¶ added in v2.1.0
func TableInterfaceToString(t [][]interface{}) [][]string
TableInterfaceToString takes a 2D slice of interface{} holding simple types (e.g. string, int, int64, float, float64, rune) and returns a 2D slice of string suitable for working with the csv encoder package. It uses ValueInterfaceToString() for conversion, storing an empty string if there is an error.
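The conversion pattern described for these helpers (convert each cell, fall back to an empty string on error) can be sketched as below. The functions `valueToString` and `rowToStrings` are hypothetical, and the set of types handled is an assumption for illustration. The package's own conversion rules may differ.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"os"
	"strconv"
)

// valueToString is a hypothetical converter (not the dataset function) for a
// few simple types. Anything else is reported as an error.
func valueToString(val interface{}) (string, error) {
	switch v := val.(type) {
	case string:
		return v, nil
	case int:
		return strconv.Itoa(v), nil
	case int64:
		return strconv.FormatInt(v, 10), nil
	case float64:
		return strconv.FormatFloat(v, 'f', -1, 64), nil
	}
	return "", fmt.Errorf("unsupported type %T", val)
}

// rowToStrings converts a row, storing an empty string when a cell fails.
func rowToStrings(row []interface{}) []string {
	out := make([]string, len(row))
	for i, cell := range row {
		if s, err := valueToString(cell); err == nil {
			out[i] = s
		}
	}
	return out
}

func main() {
	w := csv.NewWriter(os.Stdout)
	// The nil cell cannot be converted, so it becomes an empty CSV field.
	if err := w.Write(rowToStrings([]interface{}{"waffles", 2, 1.5, nil})); err != nil {
		fmt.Fprintln(os.Stderr, "csv write failed:", err)
		return
	}
	w.Flush()
}
```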
func TableStringToInterface ¶ added in v2.1.0
func TableStringToInterface(t [][]string) [][]interface{}
TableStringToInterface takes a 2D slice of string and returns a 2D slice of interface{}.
func Update ¶
func Update(w http.ResponseWriter, r *http.Request, api *API, cName string, verb string, options []string)
Update replaces a JSON object in the collection for a given key.
In this example the json document is in the working directory called "record-123.json" and the environment variable KEY holds the document key which is the string "123".
```shell
KEY="123"
curl -X PUT http://localhost:8585/api/journals.ds/object/$KEY \
  -H "Content-Type: application/json" \
  --data-binary "@./record-123.json"
```
func ValueInterfaceToString ¶ added in v2.1.0
ValueInterfaceToString takes an interface{} and renders it as a string.
func ValueStringToInterface ¶ added in v2.1.0
ValueStringToInterface takes a string and returns an interface{}
Types ¶
type API ¶
type API struct {
	// AppName is the name of the running application. E.g. os.Args[0]
	AppName string
	// SettingsFile is the path to the settings file.
	SettingsFile string
	// Version is the version of the API running
	Version string
	// Settings is the configuration read from SettingsFile
	Settings *Settings
	// CMap is a map to the collections supported by the web service.
	CMap map[string]*Collection
	// Routes holds a double map of prefix path and HTTP method that
	// points to the function that will be dispatched if found.
	//
	// The first level map identifies the prefix path for the route
	// e.g. "api/version". No leading slash is expected.
	// The second level map is organized by HTTP method, e.g. "GET",
	// "POST". The second map points to the function to call when
	// the route and method match.
	Routes map[string]map[string]func(http.ResponseWriter, *http.Request, *API, string, string, []string)
	// Process ID
	Pid int
}
API holds the information for running a web service instance. One web service may host many collections.
func (*API) RegisterRoute ¶
func (api *API) RegisterRoute(prefix string, method string, fn func(http.ResponseWriter, *http.Request, *API, string, string, []string)) error
RegisterRoute assigns a prefix path to a route handler.
prefix is the URL path prefix, minus the leading slash, that is targeted by this handler.
method is the HTTP method the func will process.
fn is the function that handles this route.
```
func Version(w http.ResponseWriter, r *http.Request, api *API, cName string, verb string, options []string) {
    // ... write the response
}

// ...
err := api.RegisterRoute("version", http.MethodGet, Version)
if err != nil {
    // ... handle error
}
```
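The double map behind Routes (prefix path first, HTTP method second) can be sketched with a small stand-alone router. Everything here is hypothetical and simplified for illustration: the `router` type, the two-argument `handlerFunc`, the exact-match lookup and the duplicate-registration error are assumptions, not the package's behavior.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// handlerFunc is a simplified stand-in for the six-argument handler type.
type handlerFunc func(http.ResponseWriter, *http.Request)

// router maps prefix path -> HTTP method -> handler.
type router struct {
	routes map[string]map[string]handlerFunc
}

// register adds a handler; rejecting duplicates is an assumption.
func (rt *router) register(prefix, method string, fn handlerFunc) error {
	if rt.routes == nil {
		rt.routes = map[string]map[string]handlerFunc{}
	}
	if rt.routes[prefix] == nil {
		rt.routes[prefix] = map[string]handlerFunc{}
	}
	if _, exists := rt.routes[prefix][method]; exists {
		return fmt.Errorf("%s %s is already registered", method, prefix)
	}
	rt.routes[prefix][method] = fn
	return nil
}

// ServeHTTP looks up the path (without its leading slash), then the method.
func (rt *router) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	methods, ok := rt.routes[strings.TrimPrefix(r.URL.Path, "/")]
	if !ok {
		http.NotFound(w, r)
		return
	}
	fn, ok := methods[r.Method]
	if !ok {
		http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
		return
	}
	fn(w, r)
}

// newExampleRouter registers a single GET route at "api/version".
func newExampleRouter() *router {
	rt := &router{}
	_ = rt.register("api/version", http.MethodGet, func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "v0.0.0-example")
	})
	return rt
}

// statusFor runs one request through the router and returns the status code.
func statusFor(rt *router, method, path string) int {
	rec := httptest.NewRecorder()
	rt.ServeHTTP(rec, httptest.NewRequest(method, path, nil))
	return rec.Code
}

func main() {
	rt := newExampleRouter()
	fmt.Println(statusFor(rt, "GET", "/api/version"))
	fmt.Println(statusFor(rt, "POST", "/api/version"))
	fmt.Println(statusFor(rt, "GET", "/api/unknown"))
}
```

Keying on the path first means an unknown path can be told apart from a known path called with the wrong method.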
func (*API) Reload ¶
Reload performs a Shutdown and an init after re-reading in the settings.json file.
func (*API) WebService ¶
WebService starts and runs a web server implementation of dataset.
type Attachment ¶
type Attachment struct {
	// Name is the filename and path to be used inside the generated tar file
	Name string `json:"name"`
	// Size remains to help us migrate pre v0.0.61 collections.
	// It should reflect the last size added.
	Size int64 `json:"size"`
	// Sizes is the sizes associated with the version being attached
	Sizes map[string]int64 `json:"sizes"`
	// Version holds the semver of the last added version
	Version string `json:"version"`
	// Checksums, currently implemented as MD5 checksums.
	// You should have one checksum per attached version.
	Checksums map[string]string `json:"checksums"`
	// HRef points at the last attached version of the attached document.
	// If you moved an object out of the pairtree it should be a URL.
	HRef string `json:"href"`
	// VersionHRefs is a map to all versions of the attached document
	// {
	//    "0.0.0": "... /photo.png",
	//    "0.0.1": "... /photo.png",
	//    "0.0.2": "... /photo.png"
	// }
	VersionHRefs map[string]string `json:"version_hrefs"`
	// Created a date string in RFC3339 format
	Created string `json:"created"`
	// Modified a date string in RFC3339 format
	Modified string `json:"modified"`
	// Metadata is a map for application specific metadata about attachments.
	Metadata map[string]interface{} `json:"metadata,omitempty"`
}
Attachment is a structure for holding non-JSON content metadata you wish to store alongside a JSON document in a collection. Attachments reside in their own pairtree of the collection directory (even when using a SQL store for the JSON document). The attachment metadata is read as needed from disk where the collection folder resides.
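To show how the JSON tags line up with stored metadata, the sketch below decodes one attachment record into a local struct. The `attachment` type mirrors only a subset of the fields above and is a stand-in, not the package type. The sample values are copied from the "_Attachments" example in the read end point documentation.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// attachment mirrors a subset of the Attachment fields and JSON tags.
// It is a local stand-in for illustration only.
type attachment struct {
	Name      string            `json:"name"`
	Size      int64             `json:"size"`
	Sizes     map[string]int64  `json:"sizes"`
	Version   string            `json:"version"`
	Checksums map[string]string `json:"checksums"`
	HRef      string            `json:"href"`
}

// exampleAttachment is one record from the read end point example.
const exampleAttachment = `{
  "checksums": { "0.0.1": "bb327f7bcca0f88649f1c6acfdc0920f" },
  "href": "T1.ds/pairtree/on/e/0.0.1/a1.png",
  "name": "a1.png",
  "size": 32511,
  "sizes": { "0.0.1": 32511 },
  "version": "0.0.1"
}`

// decodeAttachment parses attachment metadata from JSON source.
func decodeAttachment(src []byte) (*attachment, error) {
	a := &attachment{}
	if err := json.Unmarshal(src, a); err != nil {
		return nil, err
	}
	return a, nil
}

func main() {
	a, err := decodeAttachment([]byte(exampleAttachment))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	// Sizes and Checksums are keyed by semver, so Version indexes both.
	fmt.Printf("%s (version %s): %d bytes, checksum %s\n",
		a.Name, a.Version, a.Sizes[a.Version], a.Checksums[a.Version])
}
```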
type Collection ¶
type Collection struct {
	// DatasetVersion of the collection
	DatasetVersion string `json:"dataset,omitempty"`
	// Name of collection
	Name string `json:"name"`
	// StoreType can be either "pairtree" (default or if attribute is
	// omitted) or "sqlstore". If sqlstore the connection string, DSN URI,
	// will determine the type of SQL database being accessed.
	StoreType string `json:"storage_type,omitempty"`
	// DsnURI holds protocol plus dsn string. The protocol can be
	// "sqlite://", "mysql://" or "postgres://" and the dsn conforming to the Golang
	// database/sql driver name in the database/sql package.
	DsnURI string `json:"dsn_uri,omitempty"`
	// Created
	Created string `json:"created,omitempty"`
	// Repaired
	Repaired string `json:"repaired,omitempty"`
	// PTStore points to the pairtree implementation of storage
	PTStore *PTStore `json:"-"`
	// SQLStore points to a SQL database with JSON column support
	SQLStore *SQLStore `json:"-"`
	// Versioning holds the type of versioning implemented in the collection.
	// It can be set to an empty string (the default) which means no versioning.
	// It can be set to "patch" which means objects and attachments are versioned by
	// a semver patch value (e.g. 0.0.X where X is incremented), "minor" where
	// the semver minor value is incremented (e.g. 0.X.0 where X is incremented),
	// or "major" where the semver major value is incremented (e.g. X.0.0 where X is
	// incremented). Versioning affects storage of JSON objects and their attachments
	// across the whole collection.
	Versioning string `json:"versioning,omitempty"`
	// contains filtered or unexported fields
}
Collection holds the operational metadata for collection level operations on collections of JSON objects. General metadata is stored in a codemeta.json file in the root directory alongside the collection.json file.
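The three Versioning settings map onto the three semver components. A minimal sketch of the increment follows. `nextVersion` is a hypothetical helper, and resetting the lower components to zero follows common semver practice rather than anything confirmed about the package.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// nextVersion is a hypothetical helper (not part of dataset). It increments
// the component of a MAJOR.MINOR.PATCH string selected by versioning.
func nextVersion(current, versioning string) (string, error) {
	parts := strings.Split(current, ".")
	if len(parts) != 3 {
		return "", fmt.Errorf("%q is not MAJOR.MINOR.PATCH", current)
	}
	nums := make([]int, 3)
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return "", fmt.Errorf("%q is not a valid version component", p)
		}
		nums[i] = n
	}
	switch versioning {
	case "": // no versioning, leave the version untouched
		return current, nil
	case "patch":
		nums[2]++
	case "minor":
		nums[1]++
		nums[2] = 0 // assumption: lower components reset to zero
	case "major":
		nums[0]++
		nums[1], nums[2] = 0, 0 // assumption: lower components reset to zero
	default:
		return "", fmt.Errorf("unknown versioning %q", versioning)
	}
	return fmt.Sprintf("%d.%d.%d", nums[0], nums[1], nums[2]), nil
}

func main() {
	for _, setting := range []string{"patch", "minor", "major"} {
		next, err := nextVersion("1.2.3", setting)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s: 1.2.3 -> %s\n", setting, next)
	}
}
```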
func Init ¶
func Init(name string, dsnURI string) (*Collection, error)
Init creates a new collection and opens it. It takes a name (e.g. the directory holding the collection.json and codemeta.json files) and an optional DSN in URI form. The default storage engine is a pairtree (i.e. PTSTORE) but some SQL storage engines are supported.
If the DSN URI is a non-empty string then the SQL storage engine is used. The database and user access in the SQL engine need to be set up before you can successfully initialize your dataset collection. Currently three SQL database engines are supported: SQLite3, MySQL 8 and Postgres. You select the SQL storage engine by forming a URI consisting of a "protocol" (e.g. "sqlite", "mysql", "postgres"), the protocol delimiter "://" and a Go SQL supported DSN based on the database driver implementation.
A MySQL 8 DSN URI would look something like
`mysql://DB_USER:DB_PASSWD@PROTOCOL_EXPR/DB_NAME`
The one for SQLite3
`sqlite://FILENAME_FOR_SQLITE_DATABASE`
NOTE: The DSN URI is stored in the collection.json. The file should NOT be world readable as that will expose your database password. You can remove the DSN URI after initializing your collection but will then need to provide the DATASET_DSN_URI environment variable so you can open your database successfully.
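The "protocol", "://" delimiter and DSN layout described above can be taken apart with strings.Cut (Go 1.18 or later). The sketch below is illustrative only: `splitDsnURI` is a hypothetical helper, and the list of accepted protocols is taken from the names mentioned in this documentation.

```go
package main

import (
	"fmt"
	"strings"
)

// splitDsnURI is a hypothetical helper (not part of dataset). It splits a
// DSN URI into its protocol and the driver specific DSN.
func splitDsnURI(dsnURI string) (string, string, error) {
	protocol, dsn, found := strings.Cut(dsnURI, "://")
	if !found || dsn == "" {
		return "", "", fmt.Errorf("%q is not of the form PROTOCOL://DSN", dsnURI)
	}
	switch protocol {
	case "sqlite", "mysql", "postgres":
		return protocol, dsn, nil
	}
	return "", "", fmt.Errorf("unsupported protocol %q", protocol)
}

func main() {
	for _, uri := range []string{
		"sqlite://collection.db",
		"mysql://DB_USER:DB_PASSWD@/DB_NAME",
		"collection.db",
	} {
		protocol, dsn, err := splitDsnURI(uri)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("protocol=%s dsn=%s\n", protocol, dsn)
	}
}
```

Only the first "://" is treated as the delimiter, so a DSN that itself contains "://" passes through unchanged.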
For PTSTORE the DSN URI can be left blank.
```
var (
    c   *dataset.Collection
    err error
)
name := "my_collection.ds"
c, err = dataset.Init(name, "")
if err != nil {
    // ... handle error
}
defer c.Close()
```
For a sqlstore collection we need to pass the DSN URI that identifies the SQL engine and database.
```
var (
    c   *dataset.Collection
    err error
)
name := "my_collection.ds"
dsnURI := "sqlite://collection.db"
c, err = dataset.Init(name, dsnURI)
if err != nil {
    // ... handle error
}
defer c.Close()
```
func Open ¶
func Open(name string) (*Collection, error)
Open reads in a collection's operational metadata and returns a new collection structure and error value.
```
var (
    c   *dataset.Collection
    err error
)
c, err = dataset.Open("collection.ds")
if err != nil {
    // ... handle error
}
defer c.Close()
```
func (*Collection) AttachFile ¶
func (c *Collection) AttachFile(key string, filename string) error
AttachFile reads the named file from the file system and attaches it to the JSON document identified by key.
```
key, filename := "123", "report.pdf"
err := c.AttachFile(key, filename)
if err != nil {
	// ... handle error
}
```
func (*Collection) AttachStream ¶
AttachStream is for attaching a non-JSON file via an io.Reader. It requires the JSON document key, the filename and an io.Reader. It does not close the reader. If the collection is versioned then the document attached is automatically versioned per the collection's versioning setting.
Example: attach the file "report.pdf" to JSON document "123" in an open collection.
```
key, filename := "123", "report.pdf"
buf, err := os.Open(filename)
if err != nil {
	// ... handle error
}
// err is already declared above, so assign rather than redeclare.
err = c.AttachStream(key, filename, buf)
if err != nil {
	// ... handle error
}
buf.Close()
```
func (*Collection) AttachVersionFile ¶
func (c *Collection) AttachVersionFile(key string, filename string, version string) error
AttachVersionFile attaches a file to a JSON document in the collection. This does NOT increment the version number of the attachment(s). It is used to explicitly replace an attached version of a file. It does not update the symbolic link to the "current" attachment.
```
key, filename, version := "123", "report.pdf", "0.0.3"
err := c.AttachVersionFile(key, filename, version)
if err != nil {
	// ... handle error
}
```
func (*Collection) AttachVersionStream ¶
func (c *Collection) AttachVersionStream(key string, filename string, version string, buf io.Reader) error
AttachVersionStream is for attaching an open non-JSON file buffer (via an io.Reader) to a specific version of a file. If the attached file exists it is replaced.
Example: attach the file "helloworld.txt", version "0.0.3" to JSON document "123" in an open collection.
```
key, filename, version := "123", "helloworld.txt", "0.0.3"
buf, err := os.Open(filename)
if err != nil {
	// ... handle error
}
// err is already declared above, so assign rather than redeclare.
err = c.AttachVersionStream(key, filename, version, buf)
if err != nil {
	// ... handle error
}
buf.Close()
```
func (*Collection) AttachmentPath ¶
func (c *Collection) AttachmentPath(key string, filename string) (string, error)
AttachmentPath takes a key and filename and returns the file system path to the attached file (if found). For versioned collections this is the path to the symbolic link for the "current" version.
```
key, filename := "123", "report.pdf"
docPath, err := c.AttachmentPath(key, filename)
if err != nil {
	// ... handle error
}
```
func (*Collection) AttachmentVersionPath ¶
func (c *Collection) AttachmentVersionPath(key string, filename string, version string) (string, error)
AttachmentVersionPath takes a key, filename and semver returning the path to the attached versioned file (if found).
```
key, filename, version := "123", "report.pdf", "0.0.3"
docPath, err := c.AttachmentVersionPath(key, filename, version)
if err != nil {
	// ... handle error
}
```
func (*Collection) AttachmentVersions ¶
func (c *Collection) AttachmentVersions(key string, filename string) ([]string, error)
AttachmentVersions returns a list of versions for an attached file to a JSON document in the collection.
Example: retrieve a list of versions of an attached file. "key" is a key in the collection, filename is name of an attached file for the JSON document referred to by key.
```
versions, err := c.AttachmentVersions(key, filename)
if err != nil {
	// ... handle error
}
// The index is not needed, so it is discarded with "_".
for _, version := range versions {
	fmt.Printf("key: %q, filename: %q, version: %q\n", key, filename, version)
}
```
func (*Collection) Attachments ¶
func (c *Collection) Attachments(key string) ([]string, error)
Attachments returns a list of filenames for a key name in the collection.
Example: "c" is a dataset collection previously opened, "key" is a string. The "key" is for a JSON document in the collection. It returns a slice of filenames and err.
```
filenames, err := c.Attachments(key)
if err != nil {
	// ... handle error
}
// Print the names of the files attached to the JSON document
// referred to by "key".
for _, filename := range filenames {
	fmt.Printf("key: %q, filename: %q\n", key, filename)
}
```
func (*Collection) Clone ¶
Clone initializes a new collection based on the list of keys provided. If the keys list is empty all the objects are copied from one collection to the other. The collections do not need to be the same storage type.
NOTE: The cloned copy is not open after cloning is complete.
```
newName, dsnURI := "new-collection.ds", "sqlite://new-collection.ds/collection.db"
c, err := dataset.Open("old-collection.ds")
if err != nil {
	// ... handle error
}
defer c.Close()
nc, err := c.Clone(newName, dsnURI, []string{}, false)
if err != nil {
	// ... handle error
}
defer nc.Close()
```
func (*Collection) CloneSample ¶
func (c *Collection) CloneSample(trainingName string, trainingDsnURI string, testName string, testDsnURI string, keys []string, sampleSize int, verbose bool) error
CloneSample initializes two new collections based on a training and test sampling of the keys in the original collection. If the keys list is empty all the objects are used for creating the training and test sample collections. The collections do not need to be the same storage type.
NOTE: The cloned copy is not open after cloning is complete.
```
trainingSetSize := 10000
trainingName, trainingDsnURI := "training.ds", "sqlite://training.ds/collection.db"
testName, testDsnURI := "test.ds", "sqlite://test.ds/collection.db"
c, err := dataset.Open("old-collection")
if err != nil {
	// ... handle error
}
defer c.Close()
// CloneSample returns only an error; the cloned collections are not left open.
err = c.CloneSample(trainingName, trainingDsnURI, testName, testDsnURI, []string{}, trainingSetSize, false)
if err != nil {
	// ... handle error
}
```
func (*Collection) Close ¶
func (c *Collection) Close() error
Close closes a collection. For a pairtree that means flushing the keymap to disk. For a SQL store it means closing a database connection. Close is often called in conjunction with "defer" keyword.
```
c, err := dataset.Open("my_collection.ds")
if err != nil {
	/* .. handle error ... */
}
/* do some stuff with the collection */
defer func() {
	if err := c.Close(); err != nil {
		/* ... handle closing error ... */
	}
}()
```
func (*Collection) Codemeta ¶
func (c *Collection) Codemeta() ([]byte, error)
Codemeta returns a copy of the codemeta.json file content found in the collection directory. The collection must be previously opened.
```
name := "my_collection.ds"
c, err := dataset.Open(name)
if err != nil {
	// ... handle error
}
defer c.Close()
src, err := c.Codemeta()
if err != nil {
	// ... handle error
}
// File permissions are written in octal (0664).
ioutil.WriteFile("codemeta.json", src, 0664)
```
func (*Collection) Create ¶
func (c *Collection) Create(key string, obj map[string]interface{}) error
Create stores an object in the collection. The object will get converted to JSON source then stored. The collection must be open. A Go `map[string]interface{}` is a common way to handle ad-hoc JSON data in Go. Use `CreateObject()` to store structured data.
```
key := "123"
obj := map[string]interface{}{"one": 1, "two": 2}
if err := c.Create(key, obj); err != nil {
	// ... handle error
}
```
func (*Collection) CreateJSON ¶
func (c *Collection) CreateJSON(key string, src []byte) error
CreateJSON is used to store JSON directly into a dataset collection. NOTE: the JSON is NOT validated.
```
import (
	"fmt"
	"os"
	// ... plus the dataset package
)

func main() {
	c, err := dataset.Open("friends.ds")
	if err != nil {
		fmt.Fprintf(os.Stderr, "%s", err)
		os.Exit(1)
	}
	defer c.Close()

	src := []byte(`{ "ID": "mojo", "Name": "Mojo Sam", "EMail": "mojo.sam@cosmic-cafe.example.org" }`)
	// The key matches the record's ID.
	if err := c.CreateJSON("mojo", src); err != nil {
		fmt.Fprintf(os.Stderr, "%s", err)
		os.Exit(1)
	}
	fmt.Printf("OK\n")
	os.Exit(0)
}
```
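Because CreateJSON does not validate its input, a caller may want to check the source first. The following is a minimal sketch using only the standard library; `checkJSONObject` is a hypothetical helper, not part of dataset, and the requirement that the source be a JSON object (rather than an array or scalar) is an assumption made for this illustration.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
)

// checkJSONObject is a hypothetical pre-flight check (not part of dataset).
// It rejects source that is not well-formed JSON or is not a JSON object.
func checkJSONObject(src []byte) error {
	if !json.Valid(src) {
		return errors.New("source is not valid JSON")
	}
	// Valid JSON whose first non-space byte is '{' must be an object.
	if trimmed := bytes.TrimSpace(src); len(trimmed) == 0 || trimmed[0] != '{' {
		return errors.New("source is valid JSON but not an object")
	}
	return nil
}

func main() {
	for _, src := range []string{
		`{ "ID": "mojo", "Name": "Mojo Sam" }`,
		`{ "ID": "mojo", `,
		`[1, 2, 3]`,
	} {
		if err := checkJSONObject([]byte(src)); err != nil {
			fmt.Printf("rejected %q: %s\n", src, err)
			continue
		}
		// At this point the source could be handed to c.CreateJSON(key, src).
		fmt.Printf("accepted %q\n", src)
	}
}
```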
func (*Collection) CreateObject ¶
func (c *Collection) CreateObject(key string, obj interface{}) error
CreateObject is used to store structured data in a dataset collection. The object needs to be defined as a Go struct annotated appropriately with the struct tags used for working with JSON.
```
import (
	"fmt"
	"os"
	// ... plus the dataset package
)

type Record struct {
	ID    string `json:"id"`
	Name  string `json:"name,omitempty"`
	EMail string `json:"email,omitempty"`
}

func main() {
	c, err := dataset.Open("friends.ds")
	if err != nil {
		fmt.Fprintf(os.Stderr, "%s", err)
		os.Exit(1)
	}
	defer c.Close()

	obj := &Record{
		ID:    "mojo",
		Name:  "Mojo Sam",
		EMail: "mojo.sam@cosmic-cafe.example.org",
	}
	if err := c.CreateObject(obj.ID, obj); err != nil {
		fmt.Fprintf(os.Stderr, "%s", err)
		os.Exit(1)
	}
	fmt.Printf("OK\n")
	os.Exit(0)
}
```
func (*Collection) CreateObjectsJSON ¶ added in v2.1.0
func (c *Collection) CreateObjectsJSON(keyList []string, src []byte) error
CreateObjectsJSON takes a list of keys and creates a default object for each key as quickly as possible. This is useful in very narrow situations like quickly creating test data. Use with caution.
NOTE: if an object already exists its creation is skipped without reporting an error.
func (*Collection) Delete ¶
func (c *Collection) Delete(key string) error
Delete removes an object from the collection. If the collection is versioned then all versions are deleted. Any attachments to the JSON document are also deleted including any versioned attachments.
```
key := "123"
if err := c.Delete(key); err != nil {
	// ... handle error
}
```
func (*Collection) DocPath ¶ added in v2.1.0
func (c *Collection) DocPath(key string) (string, error)
DocPath provides access to a PTStore's document path. If the collection is not a PTStore then an empty path and an error are returned. NOTE: the path returned is a full path including the JSON document stored.
```
// Open takes only the collection name.
c, err := dataset.Open(cName)
// ... handle error ...
key := "2488"
s, err := c.DocPath(key)
// ... handle error ...
fmt.Printf("full path to JSON document %q is %q\n", key, s)
```
func (*Collection) ExportCSV ¶ added in v2.1.0
func (c *Collection) ExportCSV(fp io.Writer, eout io.Writer, f *DataFrame, verboseLog bool) (int, error)
ExportCSV takes a writer and a frame, iterates over the frame's objects generating rows and exports them as CSV.
func (*Collection) ExportTable ¶ added in v2.1.0
func (c *Collection) ExportTable(eout io.Writer, f *DataFrame, verboseLog bool) (int, [][]interface{}, error)
ExportTable takes a frame, iterates over the frame's objects generating rows and returns them as a table (i.e. [][]interface{}).
func (*Collection) FrameClear ¶
func (c *Collection) FrameClear(name string) error
FrameClear empties the frame's object and key lists but leaves the Frame definition in place. Use FrameReframe() to re-populate a frame based on a new key list.
```
frameName := "journals"
err := c.FrameClear(frameName)
if err != nil {
	// ... handle error
}
```
func (*Collection) FrameCreate ¶
func (c *Collection) FrameCreate(name string, keys []string, dotPaths []string, labels []string, verbose bool) (*DataFrame, error)
FrameCreate takes a set of collection keys, dot paths and labels, builds an ObjectList and assembles additional metadata, returning a new Frame associated with the collection as well as an error value. If there is a mismatch in the number of labels and dot paths an error will be returned. If the frame already exists an error will be returned.
Conceptually a frame is an ordered list of objects. Frames are associated with a collection and the objects in a frame can easily be refreshed. Frames also serve as the basis for indexing a dataset collection and provide the data paths (expressed as a list of "dot paths"), labels (aka attribute names), and type information needed for indexing and search.
If you need to update a frame's objects use FrameRefresh(). If you need to change a frame's objects or ordering use FrameReframe().
```
frameName := "journals"
keys := []string{"123", "124", "125"}
dotPaths := []string{".title", ".description"}
labels := []string{"Title", "Description"}
verbose := true
frame, err := c.FrameCreate(frameName, keys, dotPaths, labels, verbose)
if err != nil {
	// ... handle error
}
```
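The pairing of dot paths with labels can be illustrated with a standalone sketch. The helpers `lookup` and `project` are hypothetical and use a deliberately simplified dot path (object attributes only, no array indexes); dataset's own dot path handling may support more. The sketch also shows the mismatch check between the number of dot paths and labels described above.

```go
package main

import (
	"fmt"
	"strings"
)

// lookup resolves a simplified dot path such as ".title" or ".author.name"
// against a decoded JSON object. Array indexes are not handled here.
func lookup(obj map[string]interface{}, dotPath string) (interface{}, bool) {
	var cur interface{} = obj
	for _, part := range strings.Split(strings.TrimPrefix(dotPath, "."), ".") {
		m, ok := cur.(map[string]interface{})
		if !ok {
			return nil, false
		}
		if cur, ok = m[part]; !ok {
			return nil, false
		}
	}
	return cur, true
}

// project builds one frame-style object: each dot path's value is stored
// under the label at the same position.
func project(obj map[string]interface{}, dotPaths, labels []string) (map[string]interface{}, error) {
	if len(dotPaths) != len(labels) {
		return nil, fmt.Errorf("%d dot paths but %d labels", len(dotPaths), len(labels))
	}
	row := map[string]interface{}{}
	for i, p := range dotPaths {
		if val, ok := lookup(obj, p); ok {
			row[labels[i]] = val
		}
	}
	return row, nil
}

func main() {
	obj := map[string]interface{}{
		"title":  "Journal of Examples",
		"author": map[string]interface{}{"name": "Mojo Sam"},
	}
	row, err := project(obj, []string{".title", ".author.name"}, []string{"Title", "Author"})
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(row["Title"], "/", row["Author"])
}
```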
func (*Collection) FrameDef ¶
func (c *Collection) FrameDef(name string) (map[string]interface{}, error)
FrameDef retrieves the frame definition and returns it as a map[string]interface{}.
```
frameName := "journals"
definition, err := c.FrameDef(frameName)
if err != nil {
	// ... handle error
}
```
func (*Collection) FrameDelete ¶
func (c *Collection) FrameDelete(name string) error
FrameDelete removes a frame from a collection, returns an error if frame can't be deleted.
```
frameName := "journals"
err := c.FrameDelete(frameName)
if err != nil {
	// ... handle error
}
```
func (*Collection) FrameKeys ¶
func (c *Collection) FrameKeys(name string) []string
FrameKeys retrieves a list of keys associated with a data frame.
```
frameName := "journals"
keys := c.FrameKeys(frameName)
```
func (*Collection) FrameNames ¶
func (c *Collection) FrameNames() []string
FrameNames retrieves a list of available frame names associated with a collection.
```
frameNames := c.FrameNames()
for _, name := range frameNames {
	// do something with each frame name
	objects, err := c.FrameObjects(name)
	// ...
}
```
func (*Collection) FrameObjects ¶
func (c *Collection) FrameObjects(fName string) ([]map[string]interface{}, error)
FrameObjects returns a copy of a DataFrame's object list given a collection's frame name.
```
var (
	err     error
	objects []map[string]interface{}
)
frameName := "journals"
objects, err = c.FrameObjects(frameName)
if err != nil {
	// ... handle error
}
```
func (*Collection) FrameRead ¶
func (c *Collection) FrameRead(name string) (*DataFrame, error)
FrameRead retrieves a frame from a collection. Returns the DataFrame and an error value
```
frameName := "journals"
data, err := c.FrameRead(frameName)
if err != nil {
	// ... handle error
}
```
func (*Collection) FrameReframe ¶
func (c *Collection) FrameReframe(name string, keys []string, verbose bool) error
FrameReframe **replaces** a frame's object list based on the keys provided. It uses the frame's existing definition.
```
frameName, verbose := "journals", false
keys := []string{ /* ... keys for the new object list ... */ }
err := c.FrameReframe(frameName, keys, verbose)
if err != nil {
	// ... handle error
}
```
func (*Collection) FrameRefresh ¶
func (c *Collection) FrameRefresh(name string, verbose bool) error
FrameRefresh updates a DataFrame's object list based on the existing keys in the frame. It doesn't change the order of objects. It is used when objects in a collection that are included in the frame have been updated. It uses the frame's existing definition.
NOTE: If an object is missing in the collection it gets pruned from the object list.
```
frameName, verbose := "journals", true
err := c.FrameRefresh(frameName, verbose)
if err != nil {
	// ... handle error
}
```
func (*Collection) HasFrame ¶
func (c *Collection) HasFrame(frameName string) bool
HasFrame checks if a frame is already defined. The collection needs to have been opened previously.
```
frameName := "journals"
if c.HasFrame(frameName) {
	// ... the frame exists
}
```
func (*Collection) HasKey ¶
func (c *Collection) HasKey(key string) bool
HasKey takes a collection and checks if a key exists. NOTE: collection must be open otherwise false will always be returned.
```
key := "123"
if c.HasKey(key) {
	// ... the key exists
}
```
func (*Collection) ImportCSV ¶ added in v2.1.0
func (c *Collection) ImportCSV(buf io.Reader, idCol int, skipHeaderRow bool, overwrite bool, verboseLog bool) (int, error)
ImportCSV takes a reader, iterates over the rows and imports them as JSON records into dataset. BUG: returns the number of lines processed; it should probably return the number of rows imported.
func (*Collection) ImportTable ¶ added in v2.1.0
func (c *Collection) ImportTable(table [][]interface{}, idCol int, useHeaderRow bool, overwrite, verboseLog bool) (int, error)
ImportTable takes a [][]interface{}, iterates over the rows and imports them as JSON records into dataset.
func (*Collection) Join ¶ added in v2.1.0
func (c *Collection) Join(key string, obj map[string]interface{}, overwrite bool) error
Join takes a key, a map[string]interface{}{} and overwrite bool and merges the map with an existing JSON object in the collection. BUG: This is a naive join, it assumes the keys in object are top level properties.
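A naive top-level join of this kind can be sketched as follows. `mergeTopLevel` is a hypothetical helper, not dataset's code, and the treatment of the overwrite flag (existing attributes are kept when it is false, replaced when it is true) is an assumption made for illustration.

```go
package main

import "fmt"

// mergeTopLevel is a hypothetical helper (not part of dataset). It copies
// the existing object and then applies the update's top-level attributes.
// Nested objects are replaced whole, never merged, which is what makes the
// join "naive".
func mergeTopLevel(existing, update map[string]interface{}, overwrite bool) map[string]interface{} {
	merged := make(map[string]interface{}, len(existing)+len(update))
	for k, v := range existing {
		merged[k] = v
	}
	for k, v := range update {
		if _, found := merged[k]; found && !overwrite {
			continue // keep the existing value
		}
		merged[k] = v
	}
	return merged
}

func main() {
	existing := map[string]interface{}{"name": "Mojo Sam", "email": "old@example.org"}
	update := map[string]interface{}{"email": "new@example.org", "phone": "555-0100"}

	fmt.Println(mergeTopLevel(existing, update, false)["email"])
	fmt.Println(mergeTopLevel(existing, update, true)["email"])
}
```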
func (*Collection) Keys ¶
func (c *Collection) Keys() ([]string, error)
Keys returns a slice of strings holding all the keys in the collection.
```
keys, err := c.Keys()
if err != nil {
	// ... handle error
}
for _, key := range keys {
	// ... do something with each key
}
```
func (*Collection) Length ¶
func (c *Collection) Length() int64
Length returns the number of objects in a collection. NOTE: Returns a -1 (as int64) on error, e.g. collection not open or Length not available for storage type.
```
var x int64
x = c.Length()
```
func (*Collection) MergeFromTable ¶ added in v2.1.0
func (c *Collection) MergeFromTable(frameName string, table [][]interface{}, overwrite bool, verbose bool) error
MergeFromTable - uses a DataFrame associated in the collection to map columns from a table into JSON object attributes saving the JSON object in the collection. If overwrite is true then JSON objects for matching keys will be updated, if false only new objects will be added to collection. Returns an error value
func (*Collection) MergeIntoTable ¶ added in v2.1.0
func (c *Collection) MergeIntoTable(frameName string, table [][]interface{}, overwrite bool, verbose bool) ([][]interface{}, error)
MergeIntoTable - uses a DataFrame associated in the collection to map attributes into table appending new content and optionally overwriting existing content for rows with matching ids. Returns a new table (i.e. [][]interface{}) or error.
func (*Collection) ObjectList ¶
func (c *Collection) ObjectList(keys []string, dotPaths []string, labels []string, verbose bool) ([]map[string]interface{}, error)
ObjectList (on a collection) takes a set of collection keys and builds an ordered array of objects from the array of keys, dot paths and labels provided.
```
keys := []string{"123", "124", "125"}
dotPaths := []string{".title", ".description"}
labels := []string{"Title", "Description"}
verbose := true
// mapList is a []map[string]interface{}.
mapList, err := c.ObjectList(keys, dotPaths, labels, verbose)
if err != nil {
	// ... handle error
}
```
func (*Collection) Prune ¶
func (c *Collection) Prune(key string, filename string) error
Prune removes an attached document from the JSON record given a key and filename. NOTE: In versioned collections this includes removing all versions of the attached document.
```
key, filename := "123", "report.pdf"
err := c.Prune(key, filename)
if err != nil {
	// ... handle error
}
```
func (*Collection) PruneAll ¶
func (c *Collection) PruneAll(key string) error
PruneAll removes all attachments from a JSON record in the collection. When the collection is versioned it removes all versions of all attachments too.
```
key := "123"
err := c.PruneAll(key)
if err != nil {
	// ... handle error
}
```
func (*Collection) PruneVersion ¶
func (c *Collection) PruneVersion(key string, filename string, version string) error
PruneVersion removes an attached version of a document.
```
key, filename, version := "123", "report.pdf", "0.0.3"
err := c.PruneVersion(key, filename, version)
if err != nil {
	// ... handle error
}
```
func (*Collection) Read ¶
func (c *Collection) Read(key string, obj map[string]interface{}) error
Read retrieves a JSON document from the collection, unmarshals it and updates the map[string]interface{} passed in.
```
obj := map[string]interface{}{}
key := "123"
// Read takes the map itself, per the signature above.
if err := c.Read(key, obj); err != nil {
	// ... handle error
}
```
func (*Collection) ReadJSON ¶
func (c *Collection) ReadJSON(key string) ([]byte, error)
ReadJSON retrieves JSON stored in a dataset collection for a given key. NOTE: It does not validate the JSON
```
key := "123"
src, err := c.ReadJSON(key)
if err != nil {
	// ... handle error
}
```
func (*Collection) ReadJSONVersion ¶ added in v2.1.1
func (c *Collection) ReadJSONVersion(key string, semver string) ([]byte, error)
ReadJSONVersion retrieves versioned JSON record stored in a dataset collection for a given key and semver. NOTE: It does not validate the JSON
```
key := "123"
semver := "0.0.2"
src, err := c.ReadJSONVersion(key, semver)
if err != nil {
	// ... handle error
}
```
func (*Collection) ReadObject ¶
func (c *Collection) ReadObject(key string, obj interface{}) error
ReadObject retrieves structured data via Go's general interface{} type. The JSON document is retrieved from the collection, unmarshaled and the variable holding the struct is updated.
```
type Record struct {
	ID    string `json:"id"`
	Name  string `json:"name,omitempty"`
	EMail string `json:"email,omitempty"`
}

// ...

obj := &Record{}
key := "123"
if err := c.ReadObject(key, obj); err != nil {
	// ... handle error
}
```
func (*Collection) ReadObjectVersion ¶
func (c *Collection) ReadObjectVersion(key string, version string, obj interface{}) error
ReadObjectVersion retrieves a specific version from the collection for the given object.
```
type Record struct {
	// ... structure def goes here.
}

obj := &Record{}
key, version := "123", "0.0.1"
if err := c.ReadObjectVersion(key, version, obj); err != nil {
	// ... handle error
}
```
func (*Collection) ReadVersion ¶
func (c *Collection) ReadVersion(key string, version string, obj map[string]interface{}) error
ReadVersion retrieves a specific version from the collection for the given object.
```
// The map must be initialized before it can be populated.
obj := map[string]interface{}{}
key, version := "123", "0.0.1"
if err := c.ReadVersion(key, version, obj); err != nil {
	// ... handle error
}
```
func (*Collection) RetrieveFile ¶
func (c *Collection) RetrieveFile(key string, filename string) ([]byte, error)
RetrieveFile retrieves a file attached to a JSON document in the collection.
```
key, filename := "123", "report.pdf"
src, err := c.RetrieveFile(key, filename)
if err != nil {
	// ... handle error
}
err = ioutil.WriteFile(filename, src, 0664)
if err != nil {
	// ... handle error
}
```
func (*Collection) RetrieveStream ¶
RetrieveStream takes a key, a filename and a buffer to write to (as in the example below), writes the attached file to that buffer and returns an error. If the collection is versioned then the stream is for the "current" version of the attached file.
```
key, filename := "123", "report.pdf"
buf := new(bytes.Buffer)
err := c.RetrieveStream(key, filename, buf)
if err != nil {
	// ... handle error
}
// Write out what was streamed into the buffer.
ioutil.WriteFile(filename, buf.Bytes(), 0664)
```
func (*Collection) RetrieveVersionFile ¶
func (c *Collection) RetrieveVersionFile(key string, filename string, version string) ([]byte, error)
RetrieveVersionFile retrieves a file version attached to a JSON document in the collection.
```
key, filename, version := "123", "report.pdf", "0.0.3"
src, err := c.RetrieveVersionFile(key, filename, version)
if err != nil {
	// ... handle error
}
err = ioutil.WriteFile(filename+"_"+version, src, 0664)
if err != nil {
	// ... handle error
}
```
func (*Collection) RetrieveVersionStream ¶
func (c *Collection) RetrieveVersionStream(key string, filename string, version string, buf io.Writer) error
RetrieveVersionStream takes a key, filename, version and an io.Writer, writes that version of the attached file to the writer and returns an error.
```
key, filename, version := "123", "helloworld.txt", "0.0.3"
buf := new(bytes.Buffer)
err := c.RetrieveVersionStream(key, filename, version, buf)
if err != nil {
	// ... handle error
}
// Write out what was streamed into the buffer.
ioutil.WriteFile(filename+"_"+version, buf.Bytes(), 0664)
```
func (*Collection) Sample ¶
func (c *Collection) Sample(size int) ([]string, error)
Sample takes a sample size and returns a list of randomly selected keys and an error. Sample size must be greater than zero and less than or equal to the number of keys in the collection. The collection needs to have been opened previously.
```
sampleSize := 1000
keys, err := c.Sample(sampleSize)
```
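The size constraint on Sample can be illustrated with a standalone sketch. `sampleKeys` is a hypothetical helper, not dataset's implementation; it picks keys without replacement using math/rand and enforces the same bounds described above.

```go
package main

import (
	"fmt"
	"math/rand"
)

// sampleKeys is a hypothetical helper (not part of dataset). It returns
// size distinct keys chosen at random from keys.
func sampleKeys(keys []string, size int) ([]string, error) {
	if size <= 0 || size > len(keys) {
		return nil, fmt.Errorf("sample size %d must be between 1 and %d", size, len(keys))
	}
	sample := make([]string, 0, size)
	// rand.Perm yields a random ordering of the indexes 0..len(keys)-1;
	// taking the first size of them samples without replacement.
	for _, i := range rand.Perm(len(keys))[:size] {
		sample = append(sample, keys[i])
	}
	return sample, nil
}

func main() {
	keys := []string{"123", "124", "125", "126", "127"}
	sample, err := sampleKeys(keys, 3)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(sample)
}
```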
func (*Collection) SaveFrame ¶
func (c *Collection) SaveFrame(name string, f *DataFrame) error
SaveFrame saves a frame in a collection or returns an error
```
frameName := "journals"
data, err := c.FrameRead(frameName)
if err != nil {
	// ... handle error
}
// do stuff with the frame's data
// ...
// Save the changed frame data
err = c.SaveFrame(frameName, data)
```
func (*Collection) SetVersioning ¶
func (c *Collection) SetVersioning(versioning string) error
SetVersioning sets the versioning on a collection. The version string can be "major", "minor", "patch". Any other value (e.g. "", "off", "none") will turn off versioning for the collection.
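What the "major", "minor" and "patch" settings mean for a version string can be sketched as below. `bump` is a hypothetical helper, not dataset's code. Resetting the lower components when a higher one is incremented follows the Semantic Versioning convention and is an assumption about how dataset behaves.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// bump is a hypothetical helper (not part of dataset). It increments the
// component of a MAJOR.MINOR.PATCH version selected by versioning.
func bump(version string, versioning string) (string, error) {
	parts := strings.Split(version, ".")
	if len(parts) != 3 {
		return "", fmt.Errorf("%q is not in MAJOR.MINOR.PATCH form", version)
	}
	nums := make([]int, 3)
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return "", fmt.Errorf("%q is not in MAJOR.MINOR.PATCH form", version)
		}
		nums[i] = n
	}
	switch versioning {
	case "major":
		nums[0], nums[1], nums[2] = nums[0]+1, 0, 0
	case "minor":
		nums[1], nums[2] = nums[1]+1, 0
	case "patch":
		nums[2]++
	default:
		// Any other value means versioning is off; leave the version as is.
		return version, nil
	}
	return fmt.Sprintf("%d.%d.%d", nums[0], nums[1], nums[2]), nil
}

func main() {
	for _, versioning := range []string{"patch", "minor", "major", "none"} {
		next, err := bump("0.0.3", versioning)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Printf("%s: 0.0.3 -> %s\n", versioning, next)
	}
}
```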
func (*Collection) Update ¶
func (c *Collection) Update(key string, obj map[string]interface{}) error
Update replaces a JSON document in the collection with a new one. If the collection is versioned then it creates a new versioned copy and updates the "current" version to use it.
```
key := "123"
// obj is a map[string]interface{} previously read from the collection.
obj["three"] = 3
if err := c.Update(key, obj); err != nil {
	// ... handle error
}
```
func (*Collection) UpdateJSON ¶
func (c *Collection) UpdateJSON(key string, src []byte) error
UpdateJSON replaces a JSON document in the collection with a new one. NOTE: It does not validate the JSON
```
src := []byte(`{"Three": 3}`)
key := "123"
if err := c.UpdateJSON(key, src); err != nil {
	// ... handle error
}
```
func (*Collection) UpdateMetadata ¶
func (c *Collection) UpdateMetadata(fName string) error
UpdateMetadata imports new codemeta citation information replacing the previous version. Collection must be open.
```
name := "my_collection.ds"
codemetaFilename := "../codemeta.json"
c, err := dataset.Open(name)
if err != nil {
	// ... handle error
}
defer c.Close()
if err := c.UpdateMetadata(codemetaFilename); err != nil {
	// ... handle error
}
```
func (*Collection) UpdateObject ¶
func (c *Collection) UpdateObject(key string, obj interface{}) error
UpdateObject replaces a JSON document in the collection with a new one. If the collection is versioned then it creates a new versioned copy and updates the "current" version to use it.
```
type Record struct {
	// ... structure def goes here.
	Three int `json:"three"`
}

key := "123"
obj := &Record{
	Three: 3,
}
if err := c.UpdateObject(key, obj); err != nil {
	// ... handle error
}
```
func (*Collection) UpdatedKeys ¶
func (c *Collection) UpdatedKeys(start string, end string) ([]string, error)
UpdatedKeys takes a start and end time and returns a list of keys for records that were modified in that time range. The start and end values are expected to be in YYYY-MM-DD HH:MM:SS notation or empty strings.
NOTE: This currently only supports SQL stored collections.
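The YYYY-MM-DD HH:MM:SS notation corresponds to the Go time layout "2006-01-02 15:04:05". The sketch below formats a time range in that notation and checks candidate strings against it; `validStamp` and `stampLayout` are hypothetical names, not part of dataset, and how UpdatedKeys itself validates its arguments is not stated here.

```go
package main

import (
	"fmt"
	"time"
)

// stampLayout is the Go layout for YYYY-MM-DD HH:MM:SS.
const stampLayout = "2006-01-02 15:04:05"

// validStamp is a hypothetical helper (not part of dataset). It reports
// whether s is empty or parses with the YYYY-MM-DD HH:MM:SS layout.
func validStamp(s string) bool {
	if s == "" {
		return true
	}
	_, err := time.Parse(stampLayout, s)
	return err == nil
}

func main() {
	end := time.Date(2022, time.June, 30, 23, 59, 59, 0, time.UTC)
	start := end.AddDate(0, 0, -7) // one week earlier

	s, e := start.Format(stampLayout), end.Format(stampLayout)
	fmt.Println(s, "to", e, validStamp(s) && validStamp(e))

	// With an open SQL stored collection c the range could then be used as:
	//   keys, err := c.UpdatedKeys(s, e)
}
```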
func (*Collection) Versions ¶
func (c *Collection) Versions(key string) ([]string, error)
Versions retrieves a list of versions available for a JSON document if versioning is enabled for the collection.
```
key := "123"
versions, err := c.Versions(key)
if err != nil {
	// ... handle error
}
// ... do something with versions
```
func (*Collection) WorkPath ¶
func (c *Collection) WorkPath() string
WorkPath returns the working path to the collection.
type Config ¶
type Config struct {
	// CName holds the dataset collection name/path.
	CName string `json:"dataset,omitempty"`

	// Dsn URI describes how to connect to a SQL storage engine
	// used by the collection(s).
	// e.g. "sqlite://my_collection.ds/collection.db".
	//
	// The Dsn URI may be passed in from the environment via the
	// variable DATASET_DSN_URI. E.g. where all the collections
	// are stored in a common database.
	DsnURI string `json:"dsn_uri,omitemtpy"`

	// Keys lets you get a list of keys in a collection
	Keys bool `json:"keys,omitempty"`

	// Create allows you to add objects to a collection
	Create bool `json:"create,omitempty"`

	// Read allows you to retrieve an object from a collection
	Read bool `json:"read,omitempty"`

	// Update allows you to replace objects in a collection
	Update bool `json:"update,omitempty"`

	// Delete allows you to remove objects, object versions,
	// and attachments from a collection
	Delete bool `json:"delete,omitempty"`

	// Attachments allows you to list the documents attached to an
	// object in the collection.
	Attachments bool `json:"attachments,omitempty"`

	// Attach allows you to store an attachment for an object in
	// the collection
	Attach bool `json:"attach,omitempty"`

	// Retrieve allows you to get an attachment in the collection for
	// a given object.
	Retrieve bool `json:"retrieve,omitempty"`

	// Prune allows you to remove an attachment from an object in
	// a collection
	Prune bool `json:"prune,omitempty"`

	// FrameRead allows you to see a list of frames, check for
	// a frame's existence and read the content of a frame, e.g.
	// its definition, keys, object list.
	FrameRead bool `json:"frame_read,omitempty"`

	// FrameWrite allows you to create a frame, change the frame's
	// content or remove the frame completely.
	FrameWrite bool `json:"frame_write,omitempty"`

	// Versions allows you to list versions, read and delete
	// versioned objects and attachments in a collection.
	Versions bool `json:"versions,omitempty"`
}
Config holds the collection specific configuration.
type DSImport ¶ added in v2.1.7
type DSQuery ¶ added in v2.1.4
type DSQuery struct {
	CName      string   `json:"c_name,omitempty"`
	Stmt       string   `json:"stmt,omitempty"`
	Pretty     bool     `json:"pretty,omitempty"`
	AsGrid     bool     `json:"as_grid,omitempty"`
	AsCSV      bool     `json:"csv,omitempty"`
	Attributes []string `json:"attributes,omitempty"`
	PTIndex    bool     `json:"pt_index,omitempty"`
	// contains filtered or unexported fields
}
type DataFrame ¶
type DataFrame struct {
	// Explicit at creation
	Name string `json:"frame_name"`

	// CollectionName holds the name of the collection the frame was generated from. In theory you could
	// define a frame in one collection and use its results in another. A DataFrame can be rendered as a JSON
	// document.
	CollectionName string `json:"collection_name"`

	// DotPaths is a slice holding the definitions of what each Object attribute's data source is.
	DotPaths []string `json:"dot_paths"`

	// Labels are new attribute names for fields created from the provided
	// DotPaths. Typically this is used to surface a deeper dotpath's
	// value as something more useful in the frame's context (e.g.
	// first_title from an array of titles might be labeled "title")
	Labels []string `json:"labels"`

	// NOTE: Keys is an ordered list of object keys in the frame.
	Keys []string `json:"keys"`

	// NOTE: Object map provides a quick index by key to object index.
	ObjectMap map[string]interface{} `json:"object_map"`

	// Created is the date the frame is originally generated and defined
	Created time.Time `json:"created"`

	// Updated is the date the frame is updated (e.g. reframed)
	Updated time.Time `json:"updated"`
}
DataFrame is the basic structure holding a list of objects as well as the definition of the list (so you can regenerate an updated list from a changed collection). It persists with the collection.
func (*DataFrame) Grid ¶
Grid returns a table representation of a DataFrame's ObjectList.
```
frameName, includeHeader := "journals", true
data, err := c.FrameRead(frameName)
if err != nil {
	// ... handle error
}
rows, err := data.Grid(includeHeader)
if err != nil {
	// ... handle error
}
// ... now do something with the rows ...
```
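The shape of such a table can be sketched without the library. `toGrid` is a hypothetical helper, not the Grid method itself; it assumes each object is keyed by the frame's labels and that a missing attribute becomes an empty (nil) cell.

```go
package main

import "fmt"

// toGrid is a hypothetical helper (not part of dataset). It lays out a list
// of objects as rows, with one column per label and an optional header row.
func toGrid(objects []map[string]interface{}, labels []string, includeHeader bool) [][]interface{} {
	grid := [][]interface{}{}
	if includeHeader {
		header := make([]interface{}, len(labels))
		for i, label := range labels {
			header[i] = label
		}
		grid = append(grid, header)
	}
	for _, obj := range objects {
		row := make([]interface{}, len(labels))
		for i, label := range labels {
			row[i] = obj[label] // nil when the attribute is missing
		}
		grid = append(grid, row)
	}
	return grid
}

func main() {
	objects := []map[string]interface{}{
		{"Title": "Journal A", "Description": "First"},
		{"Title": "Journal B"},
	}
	for _, row := range toGrid(objects, []string{"Title", "Description"}, true) {
		fmt.Println(row...)
	}
}
```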
type PTStore ¶
type PTStore struct {
	// Working path to the directory where the collection.json is found.
	WorkPath string

	// Versioning holds the type of versioning active for the stored
	// collection. The options are None (no versioning, the default),
	// Major (major value in semver is incremented), Minor (minor value
	// in semver is incremented) and Patch (patch value in semver is incremented)
	Versioning int
	// contains filtered or unexported fields
}
func PTStoreOpen ¶
PTStoreOpen opens the storage system and returns a storage struct and an error. It is passed a directory name that holds collection.json. The second parameter is for a DSN URI which is ignored in a pairtree implementation.
```
name := "testout/T1.ds" // a collection called "T1.ds"
store, err := c.Store.Open(name, "")
if err != nil {
	// ... handle error
}
defer store.Close()
```
func (*PTStore) Close ¶
Close closes the storage system freeing resources as needed.
```
if err := store.Close(); err != nil {
	// ... handle error
}
```
func (*PTStore) Create ¶
Create stores a new JSON object in the collection. It takes a string as a key and a byte slice of encoded JSON.
```
err := store.Create("123", []byte(`{"one": 1}`))
if err != nil {
	// ... handle error
}
```
func (*PTStore) Delete ¶
Delete removes all versions of the JSON document and attachments indicated by the key provided.
NOTE: If you're versioning your collection then you never really want to delete. An approach could be to use update with an empty JSON document to indicate the document is retired, thus avoiding the deletion problem of versioned content.
```
key := "123"
if err := store.Delete(key); err != nil {
	// ... handle error
}
```
func (*PTStore) HasKey ¶
HasKey will look up and make sure the key is in the collection. PTStore must be open or false will always be returned.
```
key := "123"
if store.HasKey(key) {
	// ... the key exists
}
```
func (*PTStore) KeymapName ¶
func (*PTStore) Keys ¶
Keys returns all keys in a collection as a slice of strings.
```
var keys []string
keys, _ = store.Keys()
/* iterate over the keys retrieved */
for _, key := range keys {
    ...
}
```
NOTE: the error will always be nil; the func signature needs to match the other storage engines.
func (*PTStore) Length ¶
Length returns the number of records (len(store.keys)) in the collection. Requires the collection to be open.
```
var x int64
x = store.Length()
```
func (*PTStore) Read ¶
Read takes a string as a key and returns the encoded JSON document from the collection. If versioning is enabled this is always the "current" version of the object. Use Versions() and ReadVersion() for versioned copies.
```
src, err := store.Read("123")
if err != nil {
    ...
}
obj := map[string]interface{}{}
if err := json.Unmarshal(src, &obj); err != nil {
    ...
}
```
func (*PTStore) ReadVersion ¶
ReadVersion retrieves a specific version of a JSON document stored in a collection.
```
key, version := "123", "0.0.1"
src, err := store.ReadVersion(key, version)
if err != nil {
    ...
}
```
func (*PTStore) SetVersioning ¶
SetVersioning sets the type of versioning associated with the stored collection.
func (*PTStore) Update ¶
Update takes a key and encoded JSON object and updates a JSON document in the collection.
```
key := "123"
src := []byte(`{"one": 1, "two": 2}`)
if err := store.Update(key, src); err != nil {
    ...
}
```
type SQLStore ¶
    type SQLStore struct {
        // WorkPath holds the path to where the collection definition is held.
        WorkPath string

        // versioning
        Versioning int
        // contains filtered or unexported fields
    }
func SQLStoreInit ¶
SQLStoreInit creates a table to hold the collection if it doesn't already exist.
func SQLStoreOpen ¶
SQLStoreOpen opens the storage system and returns a storage struct and an error. It is passed a filename: for a pairtree this would be the path to collection.json, and for a SQL store a file holding a DSN URI. The DSN URI is formed from a protocol prefixed to the DSN. E.g. for a SQLite connection to the test.ds database the DSN URI might be "sqlite://collections.db".
```
store, err := c.Store.Open(c.Name, c.DsnURI)
if err != nil {
    ...
}
```
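The description above says a DSN URI is a protocol prefixed to the DSN. As a rough illustration, the two parts can be separated at the first "://". The `splitDsnURI` helper below is hypothetical and written only for this example; how the package itself parses a DSN URI is not shown in this document.

```go
package main

import (
	"fmt"
	"strings"
)

// splitDsnURI is a hypothetical helper that separates the protocol
// from the DSN at the first "://" in a DSN URI.
func splitDsnURI(dsnURI string) (protocol string, dsn string, err error) {
	parts := strings.SplitN(dsnURI, "://", 2)
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return "", "", fmt.Errorf("expected protocol://dsn, got %q", dsnURI)
	}
	return parts[0], parts[1], nil
}

func main() {
	protocol, dsn, err := splitDsnURI("sqlite://collections.db")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(protocol, dsn)
}
```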
func (*SQLStore) Close ¶
Close closes the storage system freeing resources as needed.
```
if err := storage.Close(); err != nil { ... }
```
func (*SQLStore) Create ¶
Create stores a new JSON object in the collection. It takes a string as a key and a byte slice of encoded JSON.

    err := storage.Create("123", []byte(`{"one": 1}`))
    if err != nil {
        ...
    }
func (*SQLStore) Delete ¶
Delete removes a JSON document from the collection.

    key := "123"
    if err := storage.Delete(key); err != nil {
        ...
    }
func (*SQLStore) HasKey ¶
HasKey looks up the key and reports whether it is in the collection. The SQLStore must be open; otherwise false is always returned.
```
key := "123"
if store.HasKey(key) {
    ...
}
```
func (*SQLStore) Keys ¶
Keys returns all keys in a collection as a slice of strings.

    var keys []string
    keys, _ = storage.Keys()
    /* iterate over the keys retrieved */
    for _, key := range keys {
        ...
    }
func (*SQLStore) Length ¶
Length returns the number of records (count of rows in collection). Requires collection to be open.
func (*SQLStore) Read ¶
Read takes a string as a key and returns the encoded JSON document from the collection.

    src, err := storage.Read("123")
    if err != nil {
        ...
    }
    obj := map[string]interface{}{}
    if err := json.Unmarshal(src, &obj); err != nil {
        ...
    }
func (*SQLStore) ReadVersion ¶
ReadVersion returns a specific version of a JSON object.
func (*SQLStore) SetVersioning ¶
SetVersioning sets versioning to Major, Minor, Patch or None. If versioning is set to Major, Minor or Patch, a table in the open SQL storage engine will be created.
func (*SQLStore) Update ¶
Update takes a key and encoded JSON object and updates a JSON document in the collection.

    key := "123"
    src := []byte(`{"one": 1, "two": 2}`)
    if err := storage.Update(key, src); err != nil {
        ...
    }
func (*SQLStore) UpdatedKeys ¶
UpdatedKeys returns all keys updated in a time range
```
var (
    keys  []string
    start = "2022-06-01 00:00:00"
    end   = "2022-06-30 23:23:59"
)
keys, _ = storage.UpdatedKeys(start, end)
/* iterate over the keys retrieved */
for _, key := range keys {
    ...
}
```
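Because the time range is passed as plain strings, a malformed timestamp is easy to miss. The sketch below checks a range before it is used. It assumes the "YYYY-MM-DD HH:MM:SS" shape seen in the example above; the `checkRange` helper is hypothetical and not part of the dataset package, and the format the package actually accepts is not confirmed here.

```go
package main

import (
	"fmt"
	"time"
)

// layout matches the "YYYY-MM-DD HH:MM:SS" shape used in the example
// above. That this is the format the package expects is an assumption.
const layout = "2006-01-02 15:04:05"

// checkRange is a hypothetical helper that parses both timestamps and
// verifies that the end does not come before the start.
func checkRange(start, end string) error {
	s, err := time.Parse(layout, start)
	if err != nil {
		return fmt.Errorf("invalid start: %w", err)
	}
	e, err := time.Parse(layout, end)
	if err != nil {
		return fmt.Errorf("invalid end: %w", err)
	}
	if e.Before(s) {
		return fmt.Errorf("end %q is before start %q", end, start)
	}
	return nil
}

func main() {
	if err := checkRange("2022-06-01 00:00:00", "2022-06-30 23:59:59"); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("range ok")
}
```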
type Settings ¶
    type Settings struct {
        // Host holds the URL to listen to for the web API
        Host string `json:"host"`

        // Htdocs holds the path to static content that will be
        // provided by the web service.
        Htdocs string `json:"htdocs"`

        // Collections holds an array of collection configurations that
        // will be supported by the web service.
        Collections []*Config `json:"collections"`
    }
Settings holds the specific settings for the web service.
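The JSON tags on Settings indicate what a settings file looks like on disk. The sketch below decodes such a document using a local mirror of the struct. The `settings` type and `parseSettings` helper are hypothetical: the collection entries are kept as raw JSON because the Config field tags are not shown here, and the "host is required" check is an assumed validation rule, not the package's own.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// settings is a local mirror of the documented Settings struct.
// Collection entries stay as raw JSON because the Config layout
// is not shown in this document.
type settings struct {
	Host        string            `json:"host"`
	Htdocs      string            `json:"htdocs"`
	Collections []json.RawMessage `json:"collections"`
}

// parseSettings is a hypothetical helper that decodes a settings
// document and applies one assumed validation rule.
func parseSettings(src []byte) (*settings, error) {
	s := new(settings)
	if err := json.Unmarshal(src, s); err != nil {
		return nil, err
	}
	if s.Host == "" {
		return nil, errors.New("host is required")
	}
	return s, nil
}

func main() {
	src := []byte(`{"host": "localhost:8001", "htdocs": "/usr/local/www/htdocs", "collections": []}`)
	s, err := parseSettings(src)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(s.Host, s.Htdocs, len(s.Collections))
}
```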
func ConfigOpen ¶
ConfigOpen reads the JSON configuration file provided, validates it and returns a Settings structure and error.
NOTE: if the dsn string isn't specified
```
fName := "settings.json"
settings, err := ConfigOpen(fName)
if err != nil {
    ...
}
```
func (*Settings) WriteFile ¶
WriteFile will save a configuration to the filename provided.
```
fName := "new-settings.json"
dsnURI := os.Getenv("DATASET_DSN_URI")
settings := new(Settings)
settings.Host = "localhost:8001"
settings.Htdocs = "/usr/local/www/htdocs"
cfg := &Config{
    DsnURI:   dsnURI,
    CName:    "my_collection.ds",
    Keys:     true,
    Create:   true,
    Read:     true,
    Update:   true,
    Delete:   true,
    Attach:   false,
    Retrieve: false,
    Prune:    false,
}
settings.Collections = append(settings.Collections, cfg)
// WriteFile is a method on *Settings, so it is called on settings.
if err := settings.WriteFile(fName, 0664); err != nil {
    ...
}
```
type StorageSystem ¶
    type StorageSystem interface {
        // Open opens the storage system and returns a storage struct and error
        // It is passed either a filename. For a Pairtree the would be the
        // path to collection.json and for a sql store file holding a DSN
        //
        // “`
        // store, err := c.Store.Open(c.Access)
        // if err != nil {
        //    ...
        // }
        // “`
        //
        Open(name string, dsnURI string) (*StorageSystem, error)

        // Close closes the storage system freeing resources as needed.
        //
        // “`
        // if err := storage.Close(); err != nil {
        //    ...
        // }
        // “`
        //
        Close() error

        // Create stores a new JSON object in the collection
        // It takes a string as a key and a byte slice of encoded JSON
        //
        //   err := storage.Create("123", []byte(`{"one": 1}`))
        //   if err != nil {
        //      ...
        //   }
        //
        Create(string, []byte) error

        // Read takes a string as a key and returns the encoded
        // JSON document from the collection
        //
        //   src, err := storage.Read("123")
        //   if err != nil {
        //      ...
        //   }
        //   obj := map[string]interface{}{}
        //   if err := json.Unmarshal(src, &obj); err != nil {
        //      ...
        //   }
        Read(string) ([]byte, error)

        // Versions returns a list of semver formatted version strings available for a JSON object
        Versions(string) ([]string, error)

        // ReadVersion takes a key and semver version string and returns that version of the
        // JSON object.
        ReadVersion(string, string) ([]byte, error)

        // Update takes a key and encoded JSON object and updates a
        // JSON document in the collection.
        //
        //   key := "123"
        //   src := []byte(`{"one": 1, "two": 2}`)
        //   if err := storage.Update(key, src); err != nil {
        //      ...
        //   }
        //
        Update(string, []byte) error

        // Delete removes all versions and attachments of a JSON document.
        //
        //   key := "123"
        //   if err := storage.Delete(key); err != nil {
        //      ...
        //   }
        //
        Delete(string) error

        // Keys returns all keys in a collection as a slice of strings.
        //
        //   var keys []string
        //   keys, _ = storage.Keys()
        //   /* iterate over the keys retrieved */
        //   for _, key := range keys {
        //      ...
        //   }
        //
        Keys() ([]string, error)

        // HasKey returns true if collection is open and key exists,
        // false otherwise.
        HasKey(string) bool

        // Length returns the number of records in the collection
        Length() int64
    }
StorageSystem describes the functions required to implement a dataset storage system. Currently two types of storage systems are supported -- pairtree and SQL storage (via MySQL 8 and JSON columns). If the funcs described are not supported by the storage system they must return a "Not Implemented" error value.
Source Files ¶
- api.go
- api_cmd.go
- api_docs.go
- api_routes.go
- api_setup.go
- attachments.go
- checksums.go
- cli.go
- cli_doc.go
- clone.go
- collection.go
- compatibility.go
- config.go
- doc.go
- dsimport.go
- dsquery.go
- frames.go
- import_export.go
- json_handlers.go
- license.go
- ptstore.go
- repair.go
- sqlstore.go
- storage.go
- sync.go
- tables.go
- texts.go
- timefmt.go
- version.go
Directories ¶
Path | Synopsis |
---|---|
cmd | |
dataset | dataset is a command line tool, Go package, shared library and Python package for working with JSON objects as collections on local disc. |
datasetd | datasetd implements a web service for working with dataset collections. |
dsimport | dsimport.go is a command line program for working with dataset collections. |
dsquery | dsquery.go is a command line program for working with dataset collections using the dataset v2 SQL store for the JSON documents (e.g. |