Commit 5879c5ae authored by PTSEFTON
edits

parent 39e9da50
@@ -12,13 +12,15 @@ discuss DataCrate but this could just be treated as an abstract.
This source file gets built into a [PDF file](./build/paper.pdf) - I'll update that periodically, but
this is the source file.
If you want/deserve to be an author, add your name to the [metadata file](./metadata.yaml).
I am managing the bibliography in Zotero and exporting to bibliography.bib. If
you can contribute by finding more references that would be great - let me know
and I'll share the library with you or we can talk about formats.
# Note to reviewers
The specification here is currently in draft at v0.2. A version one release is
The specification described here is currently in draft at v0.2. A version one release is
planned for October 2018. If accepted this paper would be updated for
presentation and subsequent publication.
@@ -106,9 +108,9 @@ We were not able to find any general-purpose packaging specification with
anything like the HTML+RDFa index that HIEv data packages have, allowing for
human and machine readable metadata. In light of that our approach was to choose
a base standard that covered the other requirements and add to it, as with the
HIEv approach. Our starting point was that using BagIt plus extra files worked
well in our initial implementations - the descisions were around formalising
metadata standards.
HIEv approach. Using BagIt plus extra files worked well in our initial
implementations so that was to be kept unless a better alternative surfaced --
the decisions were around formalising metadata standards.
BagIt, which had been used in HIEv and Cr8it, is an obvious standard on which to
base a packaging format - it is widely used in the research data community,
@@ -119,14 +121,14 @@ integrity aspects of packaging data.
### Alternatives considered
Frictionless data packages [TODO ref], which uses a simple JSON format as a
manifest has roughly equivalent packaging features to BagIt and they have
manifest has roughly equivalent packaging features to BagIt having
checksum features built in. In their favour, frictionless data packages have the
ability to describe the headers in tabular data files. However, they do not
meet requirement `7` of having linked-data metadata, so while the JSON
metadata is technically machine readable, it is not easy to relate to the
semantic web as it does not use linked-data standards, and the terms are defined
locally to the specification. It was also unclear how to extend the
specification, contrasting with linked-data approaches which *automatically*
locally to the specification. It is also unclear how to extend the
specification in a standardised way, contrasting with linked-data approaches which *automatically*
allow extension by the use of URIs.
As an example, [the spec](https://frictionlessdata.io/specs/data-package/) does not give a single way to describe temporal coverage
@@ -156,10 +158,10 @@ As an example, [the spec](https://frictionlessdata.io/specs/data-package/) does
> and stored in CSV.
> <https://github.com/frictionlessdata/specs/blob/0860ecd6bbb7685425e6493165c9b1a1c91eb16b/specs/data-package.md>
This extension mechanism in frictionless Data Pacakges is likely to result in a
proliferation of highly divergent non-standardised metadata - by using JSON-LD
and specifying how to represent temporal and geographical coverage, etc
DataCrate aims to encourage common behaviours.
This *laissez faire* extension mechanism in frictionless Data Packages is likely
to result in a proliferation of highly divergent non-standardised metadata - by
using JSON-LD and specifying how to represent temporal and geographical
coverage, etc., DataCrate aims to encourage common behaviours.
The other main alternative was the Research Object Bundle specification
[@soiland-reyesResearchObjectBundle2014]. At the time we started the DataCrate
@@ -191,7 +193,7 @@ TODO: Spell this out better.
that we don't think implementers will get right
- Uses lots of little files
The initial version of DataCrate (v0.1) was developed in 2017. V1 persisted with
The initial version of DataCrate (v0.1) was developed in 2017. V0.1 persisted with
HTML+RDFa for human and machine readability but this was cumbersome and was
removed in favour of an approach where the human-centred HTML page is generated
from a machine-readable JSON-LD file rather than the other way around.
@@ -220,11 +222,13 @@ TBA - what to do for scientific discipline metadata.
# Implementation
The specification.
The specification. A quick summary
# Conclusion
Link to showcase examples.
Points to note (TODO):
Looking good so far - tool developers are coming on board (Western Sydney, MIF,
......