Data Ingest
Overview of where and how data is ingested including procedures for setting up staging points.
The archive ingests data from two locations: live data from the WASTAC downlinks and historical data from the NASA archive.
WASTAC Ingest
The final destination for data is the iVec petascale data store, which is accessed over the network via pbstore.ivec.org.
The WASTAC reception facility at Murdoch transfers scenes to iVec over the network to pbstore.ivec.org. Two staging servers are available to act as backup buffers should the petascale storage server be inaccessible for some reason. For additional resilience, one of these staging servers is separate from the iVec facility.
A script (copytoivec.sh) is responsible for running the transfers from the Murdoch server and the staging servers. Information on how to configure and health check it is detailed below.
Scp was selected as the transport because it is the lowest common denominator protocol available across multiple drop off points. A future improvement may be to use rsync over ssh.
If a transfer from the Murdoch downlink fails, the scene will not automatically be entered into the archive and will have to be recovered manually from the main WASTAC archive.
Ingest Servers
| FQDN (IP) | Location | Function | Ingest Directory | Ingest Account | Key |
|---|---|---|---|---|---|
| pbstore.ivec.org (192.65.130.205) | iVec | Long term storage | /pbstore/wastac/incoming | hlynch | n/a |
| wastac-4240.ivec.org (192.65.130.242) | iVec | First Buffer | /export/incoming | hlynch | public key |
| TBA | Curtin | Second Buffer | TBA | markg | |
| wastac.murdoch.edu.au (134.115.224.66) | Murdoch | Receiving station | TBA | terascan | public key |
Setting Up copytoivec.sh
copytoivec.sh requires trusted ssh keys to operate.
Several people are involved in configuring a drop off point. In the steps below, DA refers to a user of the downstream systems (rows 1-3 of the Ingest Servers table) and UA refers to a user of the upstream data sources (rows 2-4).
The steps involved are as follows:
- UA - download copytoivec.sh
- UA - run copytoivec.sh first time to create a trusted key
- UA - email the public part of the key to contacts for all of the forward drop off points
- DA - configure key as trusted for scp operations
- UA - wait for the contacts to confirm that they have configured your key as trusted
- UA - verify that authentication works
- UA - configure a job to routinely run copytoivec.sh to transfer files as they arrive
1. UA Download copytoivec.sh
$ wget https://wastac.ivec.org/wastacsvn/scripts/trunk/copytoivec/copytoivec.sh
$ chmod +x copytoivec.sh
2. UA Run copytoivec.sh first time to create a trusted key
NB This must be completed using the user id that you intend to use to receive and transfer files.
myhost$ ./copytoivec.sh
First time initialization.
... creating wastac identity
... email this key to xxx@obscured.com to configure authorisation
... then reattempt this script.
... full contents in /home/me/.ssh/id_dsa_ivec.pub
ssh-dss AAAAB3NzaC1kc3MAAACBAKphSLPjR556/wz+4p/Ovn5/DIIYUjujsXqG71mCZwXGQOISxsFlBbC3Ucg8zE
0CqjLbvXXO/SBC4GllVpzkU50hQOSinNDxd9UVF8cdbfUrBaxv0R4tdyw3TysaZGKiqLfTSGUC7wzn+lnwICc36W+1
0n8/ZWV+GCh9q2xLsbURAAAAFQD3rCMKTvgaQ9YpGrJfTlcGkbW0XQAAAIEAiinXer/eJEHjB6JwQrNzcArPB5Qbl8
DFTMUrOh2it00gS4sI7Txcvlvzmp/wDu7jUNWyFjhs4IEd7PLn/q90HBqQI3HzSNcx+tvSN/NvEyUvb2Zc9UKf2Wkg
92S/qRundXpJj4c4LnMhTlJVN1Y0bDrv5PsE5rsrliLaRiLNFUoAAACBAJtb4kNKXIATTPJ1YQgQo0kw3+9rS9xpIb
fRiRkL63pjDwnOXpyvNZWSo4xQs9WTJJLN+tdfEqFkpH/gojknt4NwZIRQs7sQQD5PlQjFChECNn4ghdZuK1AKJvKK
naEaEQi1v7TKyOHN9AZPWLD4hvKO1j4CZ3nlLhe+U/t+G269 Me@myhost
3. UA Email Public Key
Email the text printed by copytoivec.sh to the administrators of all of the downstream drop off points, requesting them to configure it for trusted scp.
4. DA Configure Key As Trusted For SCP Operations
Edit the authorised keys file of the ingest account (i.e. $HOME/.ssh/authorized_keys2) and add the key in a format similar to the following.
# downstream trust for xxx@xxx.xxx for copytoivec.sh
from="wastac.murdoch.edu.au,134.115.224.66",command="scp -v -p -t /export/incoming" ssh-dss AAAAB3N
zaC1kc3MAAACBAKphSLPjR556/wz+4p/Ovn5/DIIYUjujsXqG71mCZwXGQOISxsFlBbC3Ucg8zE0CqjLbvXXO/SBC4GllVpzkU50hQOSi
nNDxd9UVF8cdbfUrBaxv0R4tdyw3TysaZGKiqLfTSGUC7wzn+lnwICc36W+10n8/ZWV+GCh9q2xLsbURAAAAFQD3rCMKTvgaQ9YpGrJfT
lcGkbW0XQAAAIEAiinXer/eJEHjB6JwQrNzcArPB5Qbl8DFTMUrOh2it00gS4sI7Txcvlvzmp/wDu7jUNWyFjhs4IEd7PLn/q90HBqQI3
HzSNcx+tvSN/NvEyUvb2Zc9UKf2Wkg92S/qRundXpJj4c4LnMhTlJVN1Y0bDrv5PsE5rsrliLaRiLNFUoAAACBAJtb4kNKXIATTPJ1YQg
Qo0kw3+9rS9xpIbfRiRkL63pjDwnOXpyvNZWSo4xQs9WTJJLN+tdfEqFkpH/gojknt4NwZIRQs7sQQD5PlQjFChECNn4ghdZuK1AKJvKK
naEaEQi1v7TKyOHN9AZPWLD4hvKO1j4CZ3nlLhe+U/t+G269 Me@myhost
Notes:
- the above must cover 2 lines only: the first line is a comment and the remaining text is all one line (it is wrapped here for display only)
- replace the key (ssh-dss onwards) with the information you received by email
- change the directory used to copy files into (/export/incoming) as appropriate for your system
- consider adding from=ip.ip.ip.ip to limit the source address from which this key can be used
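As a rough aid for the DA step, the two-line entry can be generated rather than typed by hand, which avoids accidental line breaks inside the key. The sketch below is illustrative only: the function name make_trusted_entry and its argument order are hypothetical, and it assumes that any wrapping introduced by the email client fell inside the base64 text of the key rather than at a space.

```shell
#!/bin/sh
# Hypothetical helper; not part of copytoivec.sh.
# make_trusted_entry CONTACT SOURCES INGEST_DIR PUBKEY_FILE
#   CONTACT     - email address of the upstream user (used in the comment)
#   SOURCES     - comma-separated hosts/IPs allowed to use the key
#   INGEST_DIR  - directory that scp is restricted to writing into
#   PUBKEY_FILE - file holding the public key received by email
make_trusted_entry() {
    contact="$1"
    sources="$2"
    ingest_dir="$3"
    pubkey_file="$4"

    # Strip carriage returns and newlines so the key occupies a single line.
    # Assumes any wrapping fell inside the base64 text, not at a space.
    key=$(tr -d '\r\n' < "$pubkey_file")

    # Line 1: comment. Line 2: options, key type, key and key comment.
    printf '# downstream trust for %s for copytoivec.sh\n' "$contact"
    printf 'from="%s",command="scp -v -p -t %s" %s\n' \
        "$sources" "$ingest_dir" "$key"
}
```

For example, `make_trusted_entry xxx@xxx.xxx "wastac.murdoch.edu.au,134.115.224.66" /export/incoming received_key.pub >> "$HOME/.ssh/authorized_keys2"` would append an entry in the form shown above, where received_key.pub is a placeholder name for the saved key.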
