-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathContDataQC_Guide_03_BasicFunctions.Rmd
180 lines (153 loc) · 6.89 KB
/
ContDataQC_Guide_03_BasicFunctions.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
---
title: ContDataQC Guide 01 Basic Functions
author: "[email protected]"
date: "`r Sys.time()`"
output:
html_notebook:
toc: yes
depth: 3
toc_float: no
---
# Purpose
Example code for `ContDataQC` Guide 03 Basic Functions
# Explanation
The `ContDataQC` package contains many functions. This is an overview of the core (basic) functions.
# Purpose of Package
The package was designed to help users handle basic Quality Control (QC) of their continuous data logger information.
# Help
Help files are built into the code for each function.
## Vingette
The vignette pulls together all of the help examples into a single document.
```{r Help_Vignette, eval=FALSE}
library(ContDataQC)
vignette("ContDataQC_Vignette",package="ContDataQC")
```
## List of All Functions
```{r Help_Index, eval=FALSE}
library(ContDataQC)
help(package="ContDataQC")
```
List of all objects in package.
```{r PackageListObjects, echo=TRUE}
ls("package:ContDataQC")
```
# Basic Functions
Examples can be accessed from the help index (above) or with the help search function "?" (see code below).
## ContDataQC
The ContDataQC function takes as input a data frame of continuous data. Most data loggers will output such a file. The examples use the output from Hobo data loggers.
The function can perform multiple sub-function that are intended to be run in sequence.
* GetGageData
+ Downloads to a file continuous data from USGS gages.
* QCRaw
+ Applies basic QC checks on unmodified data.
* Aggregate
+ This function is used to create a single file with a specified date range. It combines data from multiple files and/or trims data from files.
+ These files can then be used for summary statistics.
* SummaryStats
+ Generates outputs of basic statistics for the given file.
The output of each sub-function is different but is generally a new data file and a QC Report.
The QCRaw, Aggregate, and SummaryStats functions are intended to be run on a directory of files and the function performs operations based on the function inputs and file names. However, there are "file" versions of the same functions that will work on a vector of filenames.
```{r Example_ContDataQC, eval=FALSE}
library(ContDataQC)
?ContDataQC
```
All of the examples use data files that ship with the package but they also assume that certain directories are on the user's computer.
* Data1_RAW
* Data2_QC
* Data3_Aggregate
* Data4_SummaryStats
It is also intended that users proceed from one step to the next. For advanced users it is possible to skip steps but your data must be in the expected format for the functions to work as intended.
To create the directories and example data run the code below.
```{r Example_SetUp, eval=FALSE}
# Examples of each operation
# Parameters
Selection.Operation <- c("GetGageData","QCRaw", "Aggregate", "SummaryStats")
Selection.Type <- c("Air","Water","AW","Gage","AWG","AG","WG")
Selection.SUB <- c("Data1_RAW","Data2_QC","Data3_Aggregated","Data4_Stats")
myDir.BASE <- getwd()
# Create data directories
myDir.create <- paste0("./",Selection.SUB[1])
ifelse(dir.exists(myDir.create)==FALSE,dir.create(myDir.create),"Directory already exists")
myDir.create <- paste0("./",Selection.SUB[2])
ifelse(dir.exists(myDir.create)==FALSE,dir.create(myDir.create),"Directory already exists")
myDir.create <- paste0("./",Selection.SUB[3])
ifelse(dir.exists(myDir.create)==FALSE,dir.create(myDir.create),"Directory already exists")
myDir.create <- paste0("./",Selection.SUB[4])
ifelse(dir.exists(myDir.create)==FALSE,dir.create(myDir.create),"Directory already exists")
# Save example data (assumes directory ./Data1_RAW/ exists)
myData <- data_raw_test2_AW_20130426_20130725
write.csv(myData,paste0("./",Selection.SUB[1],"/test2_AW_20130426_20130725.csv"))
myData <- data_raw_test2_AW_20130725_20131015
write.csv(myData,paste0("./",Selection.SUB[1],"/test2_AW_20130725_20131015.csv"))
myData <- data_raw_test2_AW_20140901_20140930
write.csv(myData,paste0("./",Selection.SUB[1],"/test2_AW_20140901_20140930.csv"))
myData <- data_raw_test4_AW_20160418_20160726
write.csv(myData,paste0("./",Selection.SUB[1],"/test4_AW_20160418_20160726.csv"))
myFile <- "config.TZ.Central.R"
file.copy(file.path(path.package("ContDataQC"),"extdata",myFile)
,file.path(getwd(),Selection.SUB[1],myFile))
```
### GetGageData
Download USGS gage data for a specified time period.
```{r Example_GetGageData, eval=FALSE}
myDir.BASE <- getwd()
# Get Gage Data
myData.Operation <- "GetGageData" #Selection.Operation[1]
myData.SiteID <- "01187300" # Hubbard River near West Hartland, CT
myData.Type <- "Gage" #Selection.Type[4]
myData.DateRange.Start <- "2013-01-01"
myData.DateRange.End <- "2014-12-31"
myDir.import <- ""
myDir.export <- file.path(myDir.BASE, "Data1_RAW") #Selection.SUB[1]
ContDataQC(myData.Operation, myData.SiteID, myData.Type, myData.DateRange.Start
, myData.DateRange.End, myDir.import, myDir.export)
```
### QCRaw
Apply flags to data based on QC tests.
```{r Example_QCRaw, eval=FALSE}
myDir.BASE <- getwd()
# QC Raw Data
myData.Operation <- "QCRaw" #Selection.Operation[2]
myData.SiteID <- "test2"
myData.Type <- "AW" #Selection.Type[3]
myData.DateRange.Start <- "2013-01-01"
myData.DateRange.End <- "2014-12-31"
myDir.import <- file.path(myDir.BASE, "Data1_RAW") #Selection.SUB[1]
myDir.export <- file.path(myDir.BASE, "Data2_QC") #Selection.SUB[2]
myReport.format <- "docx"
ContDataQC(myData.Operation, myData.SiteID, myData.Type, myData.DateRange.Start
, myData.DateRange.End, myDir.import, myDir.export
, fun.myReport.format=myReport.format)
```
### Aggregate
Combine or split data based on date range.
```{r Example_Aggregate, eval=FALSE}
myDir.BASE <- getwd()
# Aggregate Data
myData.Operation <- "Aggregate" #Selection.Operation[3]
myData.SiteID <- "test2"
myData.Type <- "AW" #Selection.Type[3]
myData.DateRange.Start <- "2013-01-01"
myData.DateRange.End <- "2014-12-31"
myDir.import <- file.path(myDir.BASE, "Data2_QC") #Selection.SUB[2]
myDir.export <- file.path(myDir.BASE, "Data3_Aggregated") #Selection.SUB[3]
#Leave off myReport.format and get default (html).
ContDataQC(myData.Operation, myData.SiteID, myData.Type, myData.DateRange.Start
, myData.DateRange.End, myDir.import, myDir.export)
```
### SummaryStats
Calculate statistics.
```{r Example_SummaryStats, eval=FALSE}
myDir.BASE <- getwd()
# Summary Stats
myData.Operation <- "SummaryStats" #Selection.Operation[4]
myData.SiteID <- "test2"
myData.Type <- "AW" #Selection.Type[3]
myData.DateRange.Start <- "2013-01-01"
myData.DateRange.End <- "2014-12-31"
myDir.import <- file.path(myDir.BASE, "Data3_Aggregated") # Selection.SUB[3]
myDir.export <- file.path(myDir.BASE, "Data4_Stats") # Selection.SUB[4]
#Leave off myReport.format and get default (html).
ContDataQC(myData.Operation, myData.SiteID, myData.Type, myData.DateRange.Start
, myData.DateRange.End, myDir.import, myDir.export)
```