|
1 |
| -# agnosticmapper |
| 1 | +# agnosticmapper |
| 2 | + |
| 3 | +The agnosticmapper is a Python package that depends on rdflib. |
| 4 | + |
| 5 | +It parses a [canonical JSON](#Canonical-JSON-example), generates assertions (abox) and returns them as a Turtle string that |
| 6 | + |
| 7 | +- uses the rdfs:labels described in the provided terminology boxes to determine which classes, object properties and data properties are instantiated |
| 8 | +- creates class instances with uuids in the IRIs |
| 9 | +- cross-references instances within the abox |
| 10 | + |
| 11 | + |
| 12 | +## Getting Started |
| 13 | + |
| 14 | +### Installation of the module |
| 15 | + |
| 16 | +To install the module locally for use in another Python script, clone the repository, navigate to the module path and build a wheel file. |
| 17 | + |
| 18 | +``` |
| 19 | +$ cd path/to/agnosticmapper/ |
| 20 | +``` |
| 21 | + |
| 22 | +The structure of the directory should look like this: |
| 23 | +``` |
| 24 | +$ ls . |
| 25 | +.... |
| 26 | +agnosticmapper |
| 27 | +setup.py |
| 28 | +pyproject.toml |
| 29 | +... |
| 30 | +``` |
| 31 | + |
| 32 | +Build the wheel file: |
| 33 | +``` |
| 34 | +python3 setup.py bdist_wheel |
| 35 | +``` |
| 36 | + |
| 37 | +Check that the wheel file was created in the `dist` directory: |
| 38 | +``` |
| 39 | +$ ls ./dist |
| 40 | +agnosticmapper-1.0-py3-none-any.whl |
| 41 | +``` |
| 42 | + |
| 43 | +Navigate to the parent directory and install the Python module: |
| 44 | +``` |
| 45 | +$ cd ../ |
| 46 | +$ ls |
| 47 | +... |
| 48 | +agnosticmapper |
| 49 | +... |
| 50 | +$ pip install agnosticmapper/dist/agnosticmapper-1.0-py3-none-any.whl |
| 51 | +... |
| 52 | +Successfully installed agnosticmapper-1.0 |
| 53 | +``` |
| 54 | + |
| 55 | +If necessary, you can uninstall the module again with: |
| 56 | +``` |
| 57 | +pip uninstall agnosticmapper |
| 58 | +``` |
| 59 | + |
| 60 | +### Usage |
| 61 | +Once the module is installed, you can import it in your code: |
| 62 | +``` |
| 63 | +from agnosticmapper import Mapper |
| 64 | +``` |
| 65 | + |
| 66 | +You can then create a new Mapper instance: |
| 67 | +``` |
| 68 | +mapper = Mapper() |
| 69 | +``` |
| 70 | + |
| 71 | +The mapper provides a single method, `map(...)`. |
| 72 | +The method creates the Turtle output from the given canonical JSON, based on the provided ontology terminologies. It instantiates classes by their labels, gives each instance a uuid and uses the given entity context as the namespace. |
| 73 | + |
| 74 | +``` |
| 75 | +mapper.map(canon, ontos, context, entityContextTuple, ignoreEntityInstantiationList) |
| 76 | +``` |
| 77 | + |
| 78 | +#### Parameter description |
| 79 | + |
| 80 | + |
| 81 | +| Parameter | Type | Description | |
| 82 | +| -------- | -------- | -------- | |
| 83 | canon | list[dict]\|dict | Canonical JSON to be converted to Turtle, given as a dict or a list of dicts | |
| 84 | ontos | list[str] | List of ontology terminologies as strings, which are used to resolve the labels | |
| 85 | context | dict | Dictionary of all namespaces used in the canonical JSON, with the prefix as key and the IRI as value | |
| 86 | entityContextTuple | tuple | Tuple with exactly 2 elements, where the first element is the prefix and the second element is the IRI. The prefix is used for the instantiated classes. | |
| 87 | ignoreEntityInstantiationList | list[str] | List of labels that will not be instantiated. The associated value is kept as it is in the given canonical JSON. | |
| 88 | + |
| 89 | +#### Example usage |
| 90 | +``` |
| 91 | +from agnosticmapper import Mapper |
| 92 | +import json |
| 93 | +from pathlib import Path |
| 94 | +
|
| 95 | +mapper = Mapper() |
| 96 | +ontos = [Path(file).read_text() for file in ["/path/to/agnosticmapper/example/foaf.ttl", |
| 97 | +                                             "/path/to/agnosticmapper/example/rdf-schema.ttl", |
| 98 | +                                             "/path/to/agnosticmapper/example/dublin_core_terms.ttl"]] |
| 99 | + |
| 100 | +canon_json = json.loads(Path("/path/to/agnosticmapper/example/foaf_canon.json").read_text()) |
| 101 | +
|
| 102 | +context = { |
| 103 | + "foaf": "http://xmlns.com/foaf/0.1/", |
| 104 | + "rdfs": "http://www.w3.org/2000/01/rdf-schema#", |
| 105 | + "dcterms": "http://purl.org/dc/terms/" |
| 106 | +} |
| 107 | + |
| 108 | +entityContextTuple = ("entity", "http://example.org/entity/") |
| 109 | + |
| 110 | +ignoreEntityInstantiationList = ["interest"] |
| 111 | +
|
| 112 | +result = mapper.map(canon=canon_json, |
| 113 | + ontos=ontos, |
| 114 | + context=context, |
| 115 | + entityContextTuple=entityContextTuple, |
| 116 | + ignoreEntityInstantiationList=ignoreEntityInstantiationList) |
| 117 | + |
| 118 | +print(result) |
| 119 | +``` |
| 120 | + |
| 121 | + |
| 122 | +### Canonical JSON example |
| 123 | +Essentially, a canonical JSON (canon JSON for short) is an ontology-independent linked data JSON file, similar to JSON-LD. |
| 124 | + |
| 125 | +``` |
| 126 | +[{ |
| 127 | + "Group": { |
| 128 | + "listHandler": ["member"], |
| 129 | + "member": [{ |
| 130 | + "Person": { |
| 131 | + "hasIdentifier": "a" |
| 132 | + } |
| 133 | + }, { |
| 134 | + "Person": { |
| 135 | + "hasIdentifier": "b" |
| 136 | + } |
| 137 | + }] |
| 138 | + } |
| 139 | +}, { |
| 140 | + "Person": { |
| 141 | + "hasIdentifier": "a", |
| 142 | + "additionalTypes": ["Agent"], |
| 143 | + "name": "Alice", |
| 144 | + "interest": "http://purl.org/dc/terms/BibliographicResource" |
| 145 | + } |
| 146 | +}, { |
| 147 | + "Person": { |
| 148 | + "hasIdentifier": "b", |
| 149 | + "name": "Bob", |
| 150 | + "knows": [{ |
| 151 | + "Person": { |
| 152 | + "name": "Charlie" |
| 153 | + } |
| 154 | + }, { |
| 155 | + "Person": { |
| 156 | + "name": "Dave" |
| 157 | + } |
| 158 | + }] |
| 159 | + } |
| 160 | +}] |
| 161 | +``` |
| 162 | + |
| 163 | +The JSON keys are strings that match rdfs:labels in the provided terminology boxes. |
| 164 | + |
| 165 | +Special keys that must not be used as an rdfs:label are "hasIdentifier", "additionalTypes" and "listHandler". |
| 166 | + |
| 167 | +JSON values can be primitive datatypes, in which case the provided value is written directly as a value in the assertion box. |
| 168 | +If a JSON value is itself a JSON object, a new class instance is created, unless the label is listed in the parameter "ignoreEntityInstantiationList", in which case the provided value is kept in the abox as is (see the object property "interest" in the example). |
| 169 | + |
| 170 | +For each element in a JSON array, a new class instance is created and added to the domain of the object. |
| 171 | +If the label is listed in "listHandler", the array is handled as an ordered list (see "member" in the example). Several labels can be marked as lists. A "listHandler" entry is only valid within the JSON object that contains it. |
| 172 | + |
| 173 | +"hasIdentifier" is used to cross-reference class instances within the canon JSON. If you refer to the same class instance in more than one place, you must give it the same "hasIdentifier" value each time; otherwise two different class instances with different uuids are created. |
| 174 | + |
| 175 | +"additionalTypes" adds further classes to the class instance, in addition to the label that is used as the key. |
| 176 | + |
| 177 | +The prefix and namespace that are prepended to the uuids of the class instance IRIs are provided via the "entityContextTuple" parameter. |
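
The cross-referencing behaviour of "hasIdentifier" described above can be pictured with a small sketch. It is not the package's implementation: `make_iri_resolver` and `iri_for` are hypothetical names, and the use of `uuid.uuid4().hex` is an assumption based only on the shape of the IRIs in the output example.

```python
import uuid

# Second element of entityContextTuple in the usage example.
ENTITY_NS = "http://example.org/entity/"


def make_iri_resolver(namespace):
    """Return a function that hands out one IRI per identifier (hypothetical helper)."""
    known = {}  # identifier -> IRI already generated for it

    def iri_for(identifier=None):
        # Without an identifier, every call yields a fresh instance IRI.
        if identifier is None:
            return namespace + uuid.uuid4().hex
        # With an identifier, the first call creates the IRI and later calls reuse it.
        if identifier not in known:
            known[identifier] = namespace + uuid.uuid4().hex
        return known[identifier]

    return iri_for


iri_for = make_iri_resolver(ENTITY_NS)
alice_in_group = iri_for("a")   # Person with "hasIdentifier": "a" inside the Group
alice_top_level = iri_for("a")  # the same Person described at the top level
charlie = iri_for()             # no identifier: always a new instance
dave = iri_for()
```

Under these assumptions both references to identifier "a" resolve to one IRI, while Charlie and Dave each receive their own.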
| 178 | + |
| 179 | +### Output example |
| 180 | +``` |
| 181 | +@prefix dcterms: <http://purl.org/dc/terms/> . |
| 182 | +@prefix entity: <http://example.org/entity/> . |
| 183 | +@prefix foaf: <http://xmlns.com/foaf/0.1/> . |
| 184 | +@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> . |
| 185 | +@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> . |
| 186 | +
|
| 187 | +entity:ef4cc7abeebf4ca2b0db3e9625bfaba9 a foaf:Group ; |
| 188 | + rdfs:label "ef4cc7 Group" ; |
| 189 | + rdfs:member ( entity:e78d4393bc3d4decb38e2c335294b480 entity:b2c30466cbf34bc18429d3aa9e3d7a8b ) . |
| 190 | +
|
| 191 | +entity:e78d4393bc3d4decb38e2c335294b480 a dcterms:Agent, |
| 192 | + foaf:Person ; |
| 193 | + rdfs:label "e78d43 Person" ; |
| 194 | + foaf:interest dcterms:BibliographicResource ; |
| 195 | + foaf:name "Alice" . |
| 196 | + |
| 197 | +entity:b2c30466cbf34bc18429d3aa9e3d7a8b a foaf:Person ; |
| 198 | + rdfs:label "b2c304 Person" ; |
| 199 | + foaf:knows entity:c466dac731024d0a88d493ceaab70ea1, |
| 200 | + entity:e054f53709a84c27aade3b2c097375e5 ; |
| 201 | + foaf:name "Bob" . |
| 202 | +
|
| 203 | +entity:e054f53709a84c27aade3b2c097375e5 a foaf:Person ; |
| 204 | + rdfs:label "e054f5 Person" ; |
| 205 | + foaf:name "Charlie" . |
| 206 | +
|
| 207 | +entity:c466dac731024d0a88d493ceaab70ea1 a foaf:Person ; |
| 208 | + rdfs:label "c466da Person" ; |
| 209 | + foaf:name "Dave" . |
| 210 | +``` |
| 211 | + |
| 212 | +The order of searching for the label is: |
| 213 | +1. skos:altLabel |
| 214 | + a. en |
| 215 | + b. en-us |
| 216 | + c. en-gb |
| 217 | + d. nolang |
| 218 | +2. rdfs:label |
| 219 | + a. en |
| 220 | + b. en-us |
| 221 | + c. en-gb |
| 222 | + d. nolang |
| 223 | +3. skos:altLabel |
| 224 | + a. de |
| 225 | +4. rdfs:label |
| 226 | + a. de |
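
The search order above can be read as a priority list of (property, language) pairs. The following sketch illustrates that reading in plain Python; it does not use rdflib and is not the package's code. The function name `find_term`, the tuple layout of `labels`, exact string matching and the `ex:` terms are all assumptions made for illustration.

```python
# "nolang" in the list above is represented as None (a label without a language tag).
ENGLISH = ["en", "en-us", "en-gb", None]
SEARCH_ORDER = (
    [("skos:altLabel", lang) for lang in ENGLISH]
    + [("rdfs:label", lang) for lang in ENGLISH]
    + [("skos:altLabel", "de"), ("rdfs:label", "de")]
)


def find_term(key, labels):
    """Return the first term whose label equals `key`, following SEARCH_ORDER.

    `labels` is a list of (term, property, language, text) tuples.
    """
    for wanted_property, wanted_language in SEARCH_ORDER:
        for term, prop, language, text in labels:
            normalized = language.lower() if language else None
            if prop == wanted_property and normalized == wanted_language and text == key:
                return term
    return None


labels = [
    ("ex:PersonDe", "rdfs:label", "de", "Person"),
    ("foaf:Person", "rdfs:label", "en", "Person"),
    ("ex:Individual", "skos:altLabel", "en-GB", "Person"),
]
match = find_term("Person", labels)  # "ex:Individual": an English altLabel is tried before any rdfs:label
```

In practice this means that an English skos:altLabel shadows an rdfs:label with the same text, so labels in the terminology boxes should be chosen with the whole order in mind.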
| 227 | + |
| 228 | + |
| 229 | + |
| 230 | +### Standalone program |
| 231 | + |
| 232 | +This Python module can also be run as a standalone program that works with command line arguments and files, without being imported as a module into another Python program. |
| 233 | + |
| 234 | +#### Parameters |
| 235 | + |
| 236 | +#### Using python call with parameters |
| 237 | + |
| 238 | +| Parameter | Description | Multiple Possible | Required | |
| 239 | +| -------- | -------- | -------- | -------- | |
| 240 | -h, --help | Show help message and exit | No | No |
| 241 | -o ONTOS, --ontologies ONTOS | ONTOS is the path to the ontology turtle file | Yes | Yes |
| 242 | -j CANON_JSON, --jsoncanon CANON_JSON | CANON_JSON is the path to the canonical json file | No | Yes |
| 243 | -c CONTEXT, --context CONTEXT | CONTEXT is the path to the context json file | No | Yes |
| 244 | -p, --entitycontextprefix | Prefix which is used for the generated entities | No | Yes |
| 245 | -e, --entitycontext | URI which is used for the generated entities | No | Yes |
| 246 | -i, --ignoreinstantiation | Label that is not instantiated as an entity | Yes | No |
| 247 | -w, --writefilepath | File path to write the generated Turtle to, instead of printing it to the console | No | No |
| 248 | + |
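
To make the "Multiple Possible" and "Required" columns concrete, here is a minimal `argparse` sketch that mirrors the table. Whether the program actually uses `argparse`, and the attribute names chosen here, are assumptions; only the flags themselves come from the table.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the parameter table (illustrative sketch only)."""
    parser = argparse.ArgumentParser(prog="agnosticmapper.py")
    # "Multiple Possible: Yes" is modelled with action="append": repeat the flag per value.
    parser.add_argument("-o", "--ontologies", dest="ontos", metavar="ONTOS",
                        action="append", required=True,
                        help="path to an ontology turtle file")
    parser.add_argument("-j", "--jsoncanon", metavar="CANON_JSON", required=True,
                        help="path to the canonical json file")
    parser.add_argument("-c", "--context", metavar="CONTEXT", required=True,
                        help="path to the context json file")
    parser.add_argument("-p", "--entitycontextprefix", required=True,
                        help="prefix used for the generated entities")
    parser.add_argument("-e", "--entitycontext", required=True,
                        help="URI used for the generated entities")
    parser.add_argument("-i", "--ignoreinstantiation", action="append", default=[],
                        help="label that is not instantiated as an entity")
    parser.add_argument("-w", "--writefilepath",
                        help="write the generated turtle to this file")
    return parser


args = build_parser().parse_args([
    "-o", "foaf.ttl", "-o", "rdf-schema.ttl",
    "-j", "foaf_canon.json", "-c", "context.json",
    "-p", "entity", "-e", "https://example.org/entity/",
    "-i", "interest",
])
```

With this layout, repeating `-o` or `-i` collects the values into a list, and omitting `-w` leaves `args.writefilepath` as `None`, which a program could take as the signal to print to the console.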
| 249 | +#### Example call |
| 250 | +`python3 path/to/agnosticmapper/agnosticmapper.py -o agnosticmapper/agnosticmapper/example/foaf.ttl -o agnosticmapper/agnosticmapper/example/rdf-schema.ttl -o agnosticmapper/agnosticmapper/example/dublin_core_terms.ttl -j agnosticmapper/agnosticmapper/example/foaf_canon.json -c agnosticmapper/agnosticmapper/example/context.json -p entity -e "https://example.org/entity/" -i interest -w /tmp/myabox.ttl` |
| 251 | + |
| 252 | +#### Building standalone program with its requirements using pyinstaller |
| 253 | + |
| 254 | +It is also possible to build a standalone program that bundles all requirements and can be executed directly on the command line. |
| 255 | + |
| 256 | +Install the requirement: |
| 257 | +`pip install pyinstaller` |
| 258 | + |
| 259 | +Go to the directory of the agnosticmapper module and run pyinstaller there: |
| 260 | + |
| 261 | +``` |
| 262 | +$ cd path/to/agnosticmapper/agnosticmapper |
| 263 | +``` |
| 264 | + |
| 265 | +The structure of the directory should look like this: |
| 266 | +``` |
| 267 | +$ ls . |
| 268 | +.... |
| 269 | +agnosticmapper.py |
| 270 | +__init__.py |
| 271 | +... |
| 272 | +``` |
| 273 | + |
| 274 | +Build the standalone program file: |
| 275 | +``` |
| 276 | +python3 -m PyInstaller agnosticmapper.py |
| 277 | +``` |
| 278 | + |
| 279 | +The executable is now located in the `dist` directory. |
| 280 | + |
| 281 | +Usage and parameters are the same as when calling the Python program directly (see the example call above). |