This repository was archived by the owner on Oct 18, 2019. It is now read-only.

How to submit data to IOOS? #364

Open
robragsdale opened this issue Feb 18, 2015 · 1 comment


@robragsdale

This email thread captures how RAs handle data from data collectors that do not have the personnel or resources to handle the submission process themselves. It stems from Rich Signell's post about what an organization unfamiliar with IOOS should do so that the data they collect flows into IOOS.

Rich's Post:
When folks ask "I have obs data. How can I contribute this data to
IOOS", what do we say?

For example, last week I found out (by accident) that there is a
station measuring met variables, water temperature and ADCP currents
at Thompson Island in Boston Harbor. For example:
http://thompson.nortek.no/?view=current_data

This data is not currently in IOOS (they had never heard of IOOS).

What do I tell them?

This is a small collaborative project between Nortek and the Thompson
Island School, so they don't have a lot of resources to be
implementing services as described here:
http://www.ioos.noaa.gov/data/contribute_data.html

Should I just tell them to contact NERACOOS to see if they could help
get their data flowing into IOOS somehow?

I'm posting this here just because I'm wondering what other folks do
in situations like this.

Thanks,
Rich

Regional DAC Manager responses:

PacIOOS
Hi all,

Like MARACOOS, PacIOOS has a default "sure, we'll take your data" response, usually given to the contributor by someone not involved with data management. Within the data group, we try to prioritize data sets by applicability (e.g., how ocean-related is it), perceived popularity (who will use it, is it available elsewhere, etc.), and region. This last one might be specific to our region, which is widespread and data-sparse, but a data set coming in from some remote place would get more interest than something from an area where we already have coverage.

Eoin's point about QC procedures is important. We now add "are the data QC'd and how" to our list of questions for the data provider. In the end, 99% of data that come in like this need some sort of reformatting (conversion to netCDF, for example), and that is provided locally. Ideally you get a data provider who is at least a data consumer as well, or better yet has a "customer base". In this way you gain not only a data set but an advocate as well. It is less compelling to manage data for someone simply so they can check a box on their NSF annual report. In this latter case we try to extol the virtues of open data services.

Jim
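
As an illustration of the kind of local reformatting Jim describes, here is a minimal sketch that turns a small provider CSV into CDL, the text form of netCDF. Everything specific in it is an assumption made for the example: the column names, the sample values, the fill value, and the dataset name are hypothetical, and the output is not a complete CF discrete-sampling-geometry file (it has no station identifier or coordinates).

```python
import csv
import io
from datetime import datetime, timezone

# Hypothetical provider file: column names and readings are invented for illustration.
SAMPLE_CSV = """time,water_temp_c
2015-02-18T00:00:00Z,2.1
2015-02-18T01:00:00Z,2.0
2015-02-18T02:00:00Z,
"""

FILL_VALUE = -9999.0  # assumed placeholder for missing readings


def to_epoch_seconds(stamp):
    """Convert an ISO 8601 UTC timestamp ending in 'Z' to seconds since 1970-01-01."""
    parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


def csv_to_cdl(csv_text, dataset_name="example_station"):
    """Render the two-column CSV as CDL text with time as the only dimension."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    times = [to_epoch_seconds(r["time"]) for r in rows]
    temps = [float(r["water_temp_c"]) if r["water_temp_c"] else FILL_VALUE for r in rows]
    lines = [
        f"netcdf {dataset_name} {{",
        "dimensions:",
        f"  time = {len(rows)} ;",
        "variables:",
        "  double time(time) ;",
        '    time:units = "seconds since 1970-01-01T00:00:00Z" ;',
        '    time:standard_name = "time" ;',
        "  double sea_water_temperature(time) ;",
        '    sea_water_temperature:units = "degree_Celsius" ;',
        '    sea_water_temperature:standard_name = "sea_water_temperature" ;',
        f"    sea_water_temperature:_FillValue = {FILL_VALUE} ;",
        "",
        "// global attributes:",
        '  :featureType = "timeSeries" ;',
        "data:",
        "  time = " + ", ".join(str(t) for t in times) + " ;",
        "  sea_water_temperature = " + ", ".join(str(v) for v in temps) + " ;",
        "}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(csv_to_cdl(SAMPLE_CSV))
```

Text like this can be compiled into a binary file with the netCDF `ncgen` utility. A real conversion would more likely write the file directly with a netCDF library and would carry the provider's own metadata rather than the placeholders used here.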

MARACOOS:
Within MARACOOS, we have pretty effectively used the approach “give us the data any way that you can and we’ll figure it out”. The alternate approach of asking the data providers to set up software and provide custom feeds was not productive.

This has been an efficient way to connect the Asset Map to sub-regional data providers, but the certification process is changing the requirements, and we’re currently focused on centralizing sub-regional data so that QA/QC, archiving, and other processes can be handled by the RA.

Eoin

SECOORA:
For SECOORA, like others probably, if the decision is made for us to convert the raw data directly for the provider, there is a basic decision tree that follows from the data type and the associated web services (THREDDS/ncSOS, ERDDAP, WMS, etc.) and file formats (netCDF, PNG, KML, etc.). That is more community knowledge (among folks on this listserv) than a fixed, documented wiki page, since those services, formats, and their associated vocabularies and encodings each have their own documentation, which also changes fairly often. One example of a netCDF decision tree is what NODC has provided with their template examples, but we also look at those alongside what the ncSOS service needs in terms of how dimensions are structured (time outermost, etc.) and alongside this requested metadata.
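
A decision tree like the one described above could be sketched as a simple lookup from data type to file format and services. This is only an illustration: the data type names and the specific pairings in `ROUTES` are assumptions made up for the example, not SECOORA's documented practice, and `choose_route` is a hypothetical helper.

```python
# Hypothetical routing table: the data type names and pairings are illustrative
# assumptions, not a documented SECOORA decision tree.
ROUTES = {
    "time_series": {"file_format": "netcdf", "services": ["thredds/ncSOS", "erddap"]},
    "gridded": {"file_format": "netcdf", "services": ["thredds", "WMS"]},
    "map_image": {"file_format": "png", "services": ["WMS"]},
    "map_overlay": {"file_format": "kml", "services": []},
}


def choose_route(data_type):
    """Return the assumed target format and services for a data type."""
    try:
        return ROUTES[data_type]
    except KeyError:
        # Unknown types are surfaced rather than guessed at.
        raise ValueError(f"no route defined for data type {data_type!r}") from None


if __name__ == "__main__":
    print(choose_route("time_series"))
```

Keeping the table as data rather than branching code makes it easier to update as services and formats change, which the paragraph above notes happens fairly often.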

So from an IOOS/RA certification perspective there is documentation online, but I'm not sure that it has been structured, or could be structured (without that being a full-time job), so that it stays current and is tailored to one of the several more common usage scenarios for interested data providers with varied technical skill, interest, background, etc. Technical acronyms that are familiar to folks on this listserv are usually a bewildering acronym soup to most people, and it's hard to determine how much to simplify things before you start losing metadata, data, or functionality of interest, depending on the application. Historically the DACs (buoy, station, HF radar, glider, etc.) tend to limit variables such as instrumentation type and provide funding for more operationally centralized (but geographically dispersed) personnel, server, and storage resources and application-oriented data or services. So generally I'd think it would have to be one or several people's jobs at this level to follow, document, and provide hardware/software resources for the latest IOOS/RA/DAC developments, best practices, and certifications, to help offload what currently happens within the RAs or elsewhere. Data conversion and servicing priorities would be a consideration in any case.

@ebridger

NERACOOS takes an approach similar to what Eoin mentions above. But NERACOOS has quite a few data providers who are funded or partially funded by NERACOOS dollars, and with those we do attempt to enforce stricter standards. In the past they all implemented SOS services for ingest into the NERACOOS aggregator. Now we are moving, more slowly than we had anticipated, toward insisting on CF 1.6 DSG NetCDF files for timeSeries, profiles, etc. These folks will also need to implement QA/QC and send us the flags as well.
With non-funded providers we still take the "give us your data and we will work it out" attitude. We've already contacted the Nortek folks on Thompson Island with Rich's help. We've found ERDDAP a very useful tool for ingesting and serving diverse data formats without too much effort, e.g. CSV files with a little metadata added.
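
For providers who are asked to send QA/QC flags, a minimal sketch of one common check, a gross range test, might look like the following. The thread does not say which tests or flag values NERACOOS expects; the 1/3/4/9 values here follow the QARTOD-style convention (pass, suspect, fail, missing) as an assumption, and the thresholds and readings are invented for the example.

```python
# Assumed flag values, following the QARTOD-style convention; the thread
# itself does not specify a flag scheme.
PASS, SUSPECT, FAIL, MISSING = 1, 3, 4, 9


def gross_range_flags(values, fail_span, suspect_span):
    """Flag each reading against an outer (fail) and inner (suspect) range.

    values       -- readings, with None marking a missing observation
    fail_span    -- (low, high); anything outside is flagged FAIL
    suspect_span -- (low, high); inside fail_span but outside this is SUSPECT
    """
    flags = []
    for value in values:
        if value is None:
            flags.append(MISSING)
        elif not (fail_span[0] <= value <= fail_span[1]):
            flags.append(FAIL)
        elif not (suspect_span[0] <= value <= suspect_span[1]):
            flags.append(SUSPECT)
        else:
            flags.append(PASS)
    return flags


if __name__ == "__main__":
    # Hypothetical water temperatures in degrees Celsius and made-up thresholds.
    readings = [2.1, None, 31.0, 45.0]
    print(gross_range_flags(readings, fail_span=(-2.5, 40.0), suspect_span=(-2.0, 30.0)))
```

The flags would then travel alongside the data, for example as an extra CSV column or an ancillary variable in the netCDF file, so the aggregator can filter on them.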
