Import functionality


This topic contains 14 replies, has 4 voices, and was last updated by SarahE 9 months ago.

  • #68608

    speter
    Member

    Hi,

    I’m trying to use the import functionality nodegoat offers, but I don’t quite get it.
    I’ve seen there is no video tutorial yet, and the information in the FAQ concerning the import is not detailed enough to help me with my issues.
    Is there another documentation resource I could consult?

    Best regards,
    Sven

    #68610

    nodegoat
    Keymaster

    We hope to update the import process in the near future to make it more user friendly. For now, we can give you a couple of tips that should get you started.

    First, create a data design that reflects the needs of your project and manually enter a few Objects that cover the range of differences within your data set. This will give you a good idea of how to best prepare your import process.

    Secondly, prepare the file(s) you want to import. Currently, the best option is to go for a csv file structured like this:

    family_name;given_name;dob;dod;pob;pod
    Štúr;Ľudovít;28-10-1815;12-1-1856;Uhrovec;Modra
    Караџић;Вук;6-11-1787;7-2-1864;Tršić;Vienna
    Grimm;Jacob;4-1-1785;20-9-1863;Hanau;Berlin

    Or a json file structured like this:

    [{
    "family_name": "Štúr",
    "given_name" : "Ľudovít",
    "dob" : "28-10-1815",
    "dod" : "12-1-1856",
    "pob" : "Uhrovec",
    "pod" : "Modra"
    },{
    "family_name": "Караџић",
    "given_name" : "Вук",
    "dob" : "6-11-1787",
    "dod" : "7-2-1864",
    "pob" : "Tršić",
    "pod" : "Vienna"
    },{
    "family_name": "Grimm",
    "given_name" : "Jacob",
    "dob" : "4-1-1785",
    "dod" : "20-9-1863",
    "pob" : "Hanau",
    "pod" : "Berlin"
    }]

    For creating UTF-8 encoded csv files, we recommend LibreOffice. For creating json files, OpenRefine can be really helpful. You can also use XSLT to transform your XML files into one of these structures.
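If you prefer a script over LibreOffice or OpenRefine, the two structures above can be converted into one another with a few lines of Python. This is only a minimal sketch using the standard library; the function names are illustrative and not part of nodegoat, and it assumes a semicolon-delimited file with a header row like the example above.

```python
import csv
import io
import json


def csv_to_records(csv_text, delimiter=";"):
    """Parse delimited text with a header row into a list of dicts."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    return [dict(row) for row in reader]


def records_to_json(records):
    """Serialise records as a JSON array, keeping non-ASCII characters readable."""
    return json.dumps(records, ensure_ascii=False, indent=2)


# The csv example from this thread, inlined so the sketch is self-contained.
source = """family_name;given_name;dob;dod;pob;pod
Štúr;Ľudovít;28-10-1815;12-1-1856;Uhrovec;Modra
Караџић;Вук;6-11-1787;7-2-1864;Tršić;Vienna
Grimm;Jacob;4-1-1785;20-9-1863;Hanau;Berlin
"""

records = csv_to_records(source)
print(records_to_json(records))
```

When reading from or writing to real files, open them with `encoding="utf-8"` so that names like Štúr and Караџић survive the round trip.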

    Once you’ve uploaded this file as a ‘Source’ in the nodegoat import module, you can use its structure when you create a new ‘Import Template’ (we’ve separated the source files from the import templates, so you can use multiple similar source files while reusing one import template).

    You can then map the structure of your source file to your data design in nodegoat. Like this:

    [Image: nodegoat Import]

    After you’ve saved this template, you can run it. Select the relevant source file and use the tick boxes to specify, for example, whether you want to instantly ‘Create new Objects’. If you do so and hit ‘Next’, the template will run and instantly add new Objects to nodegoat. If you don’t check the box to instantly ‘Create new Objects’, the template will stop at every row and let you decide what to do with that row (add or discard).

    While you are instantly creating new Objects, the template halts if it finds any ambiguity. In the example we use above, we are referencing Objects of cities that are available in the nodegoat Type ‘City’ by means of strings like ‘Vienna’. As there are many Viennas in the world and nodegoat does not know (from this string alone) which Vienna you mean, the import process pauses and lets you select the right one. You can then save this ‘string to Object pair’ by selecting the checkbox at the end of this line:

    [Image: nodegoat Import]

    After you’ve hit ‘Commit’, the run continues.

    Importing relational data can best be achieved by importing data per Type. First you import a list of occupations in the Classification ‘Occupation’. Once you start importing persons, you just map the column in the source file where the occupation of the person is specified to the Object Description of the Type Person that relates to the classification Occupation. The import process performs a ‘quick search’ in the related Type, so check the quick search settings for this Type (you can set these in your Data Design using the Object Descriptions).
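To prepare the per-Type source files described above, you could pull the distinct values of a column out of your persons file and save them as a separate source for the Classification. The sketch below assumes a hypothetical `occupation` column and a hypothetical `name` header for the output file; the occupation values are illustrative only, and none of these names are prescribed by nodegoat.

```python
import csv
import io


def distinct_column_values(csv_text, column, delimiter=";"):
    """Return the distinct non-empty values of one column, in first-seen order."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    seen = {}
    for row in reader:
        value = (row.get(column) or "").strip()
        if value:
            seen.setdefault(value, None)
    return list(seen)


def to_single_column_csv(values, header, delimiter=";"):
    """Write the values as a one-column csv with the given header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerow([header])
    for value in values:
        writer.writerow([value])
    return buffer.getvalue()


# Illustrative persons file; the 'occupation' column is an assumption.
persons = """family_name;given_name;occupation
Štúr;Ľudovít;linguist
Караџић;Вук;linguist
Grimm;Jacob;philologist
"""

occupations = distinct_column_values(persons, "occupation")
occupation_source = to_single_column_csv(occupations, "name")
print(occupation_source)
```

The resulting list would be imported into the Classification first; the persons file is imported afterwards, with its `occupation` column mapped to the related Object Description.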

    Most of the options you can set for each run are meant for enriching or updating data that is already present in nodegoat. You use these in combination with the three tick boxes at the end of each line that you see when you edit your import template. If you first import a list of persons and include an identifier (one of your own or, even better, an ID/URI from an authority file like VIAF or Wikidata), you can use this identifier in a second run to find the relevant Objects that are already in nodegoat (tick the box ‘Use data from this column to filter similar Objects.’ in your import template). In your run, you would then select ‘Disallow Object Creation’ and check instantly ‘Append new Object Descriptions’ to import additional information.

    We hope this helps! Please ask if you have any other questions.

    #68625

    speter
    Member

    Thanks a lot for the detailed description. It works and it works really well!

    #68629

    tuba_nick
    Member

    Hello,

    I’ve gotten up to the point where I am importing the information, but I have yet to find out how to get the imported data into the “Data” section. Once we run the template, the data should be in the “Data” section, correct? Or are there other steps to follow? Sorry if this question seems redundant, I’m just trying to get this working without entering 325 lines of data by hand.

    Thanks!

    #68630

    nodegoat
    Keymaster

    Yes, your imported data will become available in the ‘Data’ section.

    The best way to test this is to set up your template and run it without any settings set in the run (click the grey button ‘run template’, select your source file and click ‘Next’). The process will just go to the first row in your source file and pause there, showing you all the data from your source file assigned to their corresponding Object Descriptions/Sub-Objects in nodegoat.

    Now you can click ‘Add’ or ‘Discard’. If you click Add, a new Object will be created that should show up in the Data section. If this is not the case, you will need to review your template/source file to see if everything is configured correctly. If the data does show up, you can exit the run and run the template again with the option to instantly ‘Create new Objects’ checked.

    Does that work for you?

    #68894

    SarahE
    Member

    Hello!

    I would like to import sub-objects (birth and death) into my biographical database. I made a csv file with a couple of existing object descriptions in order to be able to match the imported data to the existing objects. How do I make sure the import function adds new sub-objects rather than overwriting existing objects? I tried it with just one object, and the former object was replaced with a new one, deleting existing data.

    Many thanks in advance!

    Sarah

    #68960

    nodegoat
    Keymaster

    Hi! The best way to do this is to work with some kind of unique identifier for your objects, so you can link new data from a CSV to objects in nodegoat without having to worry about disambiguation. You can use nodegoat IDs for this (to quickly get a list of these, export the relevant objects and include the IDs in the export). You can also enter an ID of your own, or preferably (if possible for your data) a pre-existing identifier like VIAF, in an object description (https://nodegoat.net/blog.s/12/linked-data-vs-curation-island). Once you have entered one of these IDs in a column, you can use it to find the objects you want to enrich with new sub-objects.

    Your CSV would look something like this:

    id;dob;dod;pob;pod
    234;28-10-1815;12-1-1856;Uhrovec;Modra
    631;6-11-1787;7-2-1864;Tršić;Vienna
    135;4-1-1785;20-9-1863;Hanau;Berlin

    In your Type ‘Person’ you have created an object description that contains the IDs that are also present in the CSV file.
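If your new data does not yet carry the IDs, a CSV like the one above can be assembled by matching it against an export that does. The following is a rough sketch only: the layout of the export (`nodegoat_id;family_name;given_name`) is an assumption, not the actual nodegoat export format, and matching on names is just one possible key. Rows that match no object, or more than one, are set aside for manual review rather than guessed.

```python
import csv
import io


def read_rows(csv_text, delimiter=";"):
    """Parse delimited text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(csv_text), delimiter=delimiter))


def attach_ids(export_rows, data_rows, key_columns, id_column):
    """Pair each data row with the ID of the single export row sharing its key."""
    ids_by_key = {}
    for row in export_rows:
        key = tuple(row[c] for c in key_columns)
        ids_by_key.setdefault(key, []).append(row[id_column])

    matched, unmatched = [], []
    for row in data_rows:
        key = tuple(row[c] for c in key_columns)
        candidates = ids_by_key.get(key, [])
        if len(candidates) == 1:
            rest = {k: v for k, v in row.items() if k not in key_columns}
            matched.append({"id": candidates[0], **rest})
        else:
            unmatched.append(row)  # no match, or several: resolve by hand
    return matched, unmatched


def write_rows(rows, fieldnames, delimiter=";"):
    """Serialise dict rows back to delimited text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames,
                            delimiter=delimiter, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical export layout; the real export columns may differ.
export = """nodegoat_id;family_name;given_name
234;Štúr;Ľudovít
631;Караџић;Вук
135;Grimm;Jacob
"""

data = """family_name;given_name;dob;dod;pob;pod
Štúr;Ľudovít;28-10-1815;12-1-1856;Uhrovec;Modra
Караџић;Вук;6-11-1787;7-2-1864;Tršić;Vienna
Grimm;Jacob;4-1-1785;20-9-1863;Hanau;Berlin
"""

matched, unmatched = attach_ids(read_rows(export), read_rows(data),
                                key_columns=["family_name", "given_name"],
                                id_column="nodegoat_id")
enrichment_csv = write_rows(matched, ["id", "dob", "dod", "pob", "pod"])
print(enrichment_csv)
```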

    Now you upload the CSV as a source file and you create a new Import Template (like the example above) where you map the column that contains the IDs in the CSV to the object description in your Type ‘Person’ that contains the IDs in nodegoat. If you have used your own IDs you tick the first checkbox ‘Use data from this column to filter similar Objects.’ If you use nodegoat IDs you don’t select an object description and you check the second checkbox ‘Use data from this column to establish a relation via a pre-existing nodegoat object ID.’

    If you now run this template, without selecting any of the ‘instant’ options, the import process should find the first object and present you with the new data that you can append or discard. If no object is found, check if it is searchable through ‘quick search’ in your data design.

    If you want to run this instantly, you can check the box ‘Disallow Object Creation’ since you will be only appending new data to existing objects. And you check the checkbox ‘Append new Sub-Objects’.

    We recommend first testing your setup by running it manually before you run it instantly. It is also good to test your workflow with a smaller CSV file first, so you don’t end up with hundreds of objects that have incorrect data.
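One way to get such a smaller test file is to cut the first few rows out of the full CSV. A minimal sketch, assuming a semicolon-delimited file with a single header row (the function name is illustrative):

```python
import csv
import io


def first_rows(csv_text, n_rows, delimiter=";"):
    """Return csv text holding the header and the first n_rows data rows."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    for index, row in enumerate(reader):
        if index > n_rows:  # index 0 is the header row
            break
        writer.writerow(row)
    return buffer.getvalue()


full = """id;dob;dod;pob;pod
234;28-10-1815;12-1-1856;Uhrovec;Modra
631;6-11-1787;7-2-1864;Tršić;Vienna
135;4-1-1785;20-9-1863;Hanau;Berlin
"""

sample = first_rows(full, 2)
print(sample)
```

Upload the sample as its own source file, run the template manually against it, and switch to the full file once the result in the Data section looks right.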

    Does this help you?

    #69006

    SarahE
    Member

    Hi!
    I have tried several times, using just a couple of my objects, to try and test it, without any luck.
    I’ve tried using names (object descriptions) to filter similar objects, as well as adding the nodegoat identifier to both the csv and the object descriptions and establishing a relation to a similar nodegoat object ID, but it does not find it. I do find it with the quick search, but that makes the import much more time-consuming.
    On the other hand, I’ve tried linking the source of my data using nodegoat IDs and it worked straight away. So I have no idea what I’m doing wrong…

    Any idea?

    Many thanks, again, for your help and support.

    Sarah

    #69009

    nodegoat
    Keymaster

    Okay! It seems our last answer was not complete. To filter on objects with a nodegoat ID, you should select the relevant type, leave the object description empty and tick both ‘Use data from this column to filter similar Objects.’ and ‘Use data from this column to establish a relation via a pre-existing nodegoat object ID.’ (the first two checkboxes). That should do the trick! Sorry for the inconvenience.

    #69011

    SarahE
    Member

    Hey! I’m very very sorry but there is no change here when I try this…

    #69012

    nodegoat
    Keymaster

    Ok! Can you perhaps share a screenshot of your import template, so we can see how it’s currently configured?

    #69013

    SarahE
    Member

    Sorry, but how can I upload the screenshots on the forum?

    #69015

    SarahE
    Member

    Hey! I managed in the end.

    Here is what my csv looks like

    [Image: Csv birth death SarahE]

    Here is the import template


    https://drive.google.com/open?id=0B4hG_Rw_TCbyby10cU96WGJydGc

    And what I get once I try

    https://drive.google.com/open?id=0B4hG_Rw_TCbySHJhRDBVWVg0OXM

    https://drive.google.com/open?id=0B4hG_Rw_TCbyU0pIRzdKeWpDeDA

    Many thanks again!

    • This reply was modified 9 months, 1 week ago by  SarahE.
    #69020

    nodegoat
    Keymaster

    Ok! Thanks for this! What happens if you select the relevant type for the column containing the IDs (so for the column ‘ID’ you select ‘People’) and leave the object description empty?

    #69022

    SarahE
    Member

    Yes! I now get a little notification saying “1 similar object”, if I click on it I can check it’s the right person and then “append to similar”.

    Thank you SO MUCH for your help!
