Commit 4cb7d08

committed
remove dbplyr
1 parent 207df16 commit 4cb7d08

6 files changed, +141 -69 lines changed

r/exercises_with_answers.Rmd

+2 -3
@@ -91,12 +91,11 @@ dbDisconnect(con)
 ```
 
 
-## Exercise: `dbplyr`
+## Exercise: `dplyr`
 
-Connect to the dvdrental database. Repeat [Exercise: Joining and Grouping 2](https://github.com/nuitrcs/databases_workshop/blob/master/sql/part2_exercises_with_answers.md) from Part 2 using `dbplyr`.
+Connect to the dvdrental database. Repeat [Exercise: Joining and Grouping 2](https://github.com/nuitrcs/databases_workshop/blob/master/sql/part2_exercises_with_answers.md) from Part 2 using `dplyr`.
 
 ```{r, echo=TRUE, eval=TRUE}
-library(dbplyr)
 library(dplyr)
 ```

r/exercises_with_answers.html

+9 -10
@@ -11,7 +11,7 @@
 
 <meta name="author" content="Christina Maimone" />
 
-<meta name="date" content="2019-08-09" />
+<meta name="date" content="2019-08-12" />
 
 <title>R Database Exercises</title>

@@ -377,7 +377,7 @@
 
 <h1 class="title toc-ignore">R Database Exercises</h1>
 <h4 class="author">Christina Maimone</h4>
-<h4 class="date">2019-08-09</h4>
+<h4 class="date">2019-08-12</h4>
 
 </div>

@@ -425,19 +425,18 @@ <h4>Solution</h4>
 }
 selectedrows</code></pre>
 <pre><code>## id name
-## 1 269 ZNUET
-## 2 475 VPMBT
-## 3 409 AZIQX</code></pre>
+## 1 99 UIBJN
+## 2 236 YLHQP
+## 3 596 BYEAM</code></pre>
 <p>We used paste function above because we have control over offset – it would be better to use a prepared query, but since we aren’t getting input from a user, it’s not super dangerous.</p>
 <p>An alternative approach, which could work well if the table isn’t too big, is to retrieve all of the IDs, and then randomly sample the IDs, and retrieve just those rows.</p>
 <pre class="r"><code>dbDisconnect(con)</code></pre>
 </div>
 </div>
-<div id="exercise-dbplyr" class="section level2">
-<h2>Exercise: <code>dbplyr</code></h2>
-<p>Connect to the dvdrental database. Repeat <a href="https://github.com/nuitrcs/databases_workshop/blob/master/sql/part2_exercises_with_answers.md">Exercise: Joining and Grouping 2</a> from Part 2 using <code>dbplyr</code>.</p>
-<pre class="r"><code>library(dbplyr)
-library(dplyr)</code></pre>
+<div id="exercise-dplyr" class="section level2">
+<h2>Exercise: <code>dplyr</code></h2>
+<p>Connect to the dvdrental database. Repeat <a href="https://github.com/nuitrcs/databases_workshop/blob/master/sql/part2_exercises_with_answers.md">Exercise: Joining and Grouping 2</a> from Part 2 using <code>dplyr</code>.</p>
+<pre class="r"><code>library(dplyr)</code></pre>
 <div id="solution-1" class="section level4">
 <h4>Solution</h4>
 <p>Set your connection information as appropriate for the workshop:</p>

r/r_databases.Rmd

+43 -14
@@ -35,7 +35,7 @@ library(RPostgres)
 
 We connect with a function call like the following.
 
-Note: this code was generated on my local machine connected to a local copy of the database.
+Note: this code was generated on my local machine connected to a local copy of the database. Your connection details will be different. Note I also have permissions to modify this database.
 
 ```{r}
 con <- dbConnect(RPostgres::Postgres(), host="localhost", dbname="dvdrental")
@@ -88,7 +88,7 @@ If you want part of your query to be determined by a variable -- especially if i
 ```{r}
 # YES
 myquery <- dbSendQuery(con, "select * from actor where actor_id = $1")
-dbBind(myquery, list(5))
+dbBind(myquery, list(4))
 dbFetch(myquery)
 ```

@@ -124,8 +124,12 @@ Which are ok, but could get annoying.
 
 If you're not a superuser on the `dvdrental` database, just try connecting to a database you can modify. Then the basic function is `dbSendQuery` for any command you want to execute where you aren't retrieving results.
 
+Note that by default, statements take effect immediately - they are not in a transaction that you need to commit. To use transactions, see below.
+
 ```{r, eval=FALSE}
-dbSendQuery(con, statement="update actor set actor_id=5000 where actor_id=5")
+res <- dbSendQuery(con, statement="update actor set first_name='Jenn' where actor_id=4")
+print(res) # contains info on result of update
+dbClearResult(res) # prevent warning messages
 ```
 
 To create a table, you can give it a data frame
@@ -144,6 +148,28 @@ dbRemoveTable(con, "mynewtable")
 ```
 
 
+## Transactions
+
+There are also methods for managing transactions if you need: `dbBegin`, `dbRollback`, `dbCommit`. Transactions are key for when you need to be sure that a sequence of SQL commands (e.g. `UPDATE`, `CREATE`, `DROP`, `DELETE`, etc.) execute correctly before they're made permanent (i.e. "committed").
+
+
+```{r, eval=FALSE}
+dbBegin(con)
+dbWriteTable(con, "mynewtable", mytbl)
+dbRollback(con)
+dbGetQuery(con, "SELECT * FROM mynewtable")
+```
+
+The above will produce error:
+
+```
+Error in result_create(conn@ptr, statement) :
+Failed to prepare query: ERROR: relation "mynewtable" does not exist
+LINE 1: SELECT * FROM mynewtable
+```
+
+because the transaction was rolled back, not committed.
+
 ## Close Connection
 
 Connections will get closed when you quit R, but it's good practice to explicitly close them.
@@ -152,17 +178,15 @@ Connections will get closed when you quit R, but it's good practice to explicitl
 dbDisconnect(con)
 ```
 
-## Transactions
 
-There are also methods for managing transactions if you need: `dbBegin`, `dbRollback`, `dbCommit`. Transactions are key for when you need to be sure that a sequence of SQL commands (e.g. `UPDATE`, `CREATE`, `DROP`, `DELETE`, etc.) execute correctly before they're made permanent (i.e. "committed").
 
 
 # Use `dplyr`
 
 For more complete info, see the [RStudio databases site](http://db.rstudio.com/dplyr/).
 
 ```{r, eval=FALSE}
-needToInstall <- c("dbplyr", "tidyverse")
+needToInstall <- c("tidyverse")
 needToInstall <- needToInstall[which(!needToInstall %in% installed.packages())]
 if(length(needToInstall) > 0){
 sapply(needToInstall, install.packages)
@@ -172,7 +196,6 @@ if(length(needToInstall) > 0){
 
 ```{r, message=FALSE, warning=FALSE}
 library(tidyverse)
-library(dbplyr)
 ```
 
 First, connect like normal
@@ -193,7 +216,7 @@ If we look at this object, it doesn't have data in it:
 str(actortbl)
 ```
 
-It just has connection information. `dbplyr` will try to perform operations within the database where it can, instead of pulling all of the data into R.
+It just has connection information. `dplyr` will try to perform operations within the database where it can, instead of pulling all of the data into R.
 
 Yet you can print the object and see observations:

@@ -229,7 +252,7 @@ rentaltbl %>%
 show_query()
 ```
 
-You can use `collect` to pull down all of the data (tell `dbplyr` to stop being lazy).
+You can use `collect` to pull down all of the data (tell `dplyr` to stop being lazy).
 
 ```{r, echo=TRUE}
 # First, without collecting
@@ -242,14 +265,20 @@ df1
 Looks OK, except:
 
 ```{r, eval=FALSE}
-df[1,]
+df1[1,]
 ```
 
 Gives you:
 
-`Error in df[1, ] : object of type 'closure' is not subsettable`
+`Error in df1[1, ] : incorrect number of dimensions`
+
+It's the wrong dimensions because `df1` isn't actually a data.frame:
+
+```{r}
+str(df1)
+```
 
-Which is a strange error, but it is telling us we need to collect the data.
+It is telling us we need to collect the data first to actually pull it into R.
 
 ```{r, echo=TRUE}
 # Then with collecting
@@ -276,14 +305,14 @@ custtbl %>%
 ```
 
 
-You could create a table with `copy_to` (if you have write permissions)
+You could create a table with `copy_to` (if you have the correct permissions)
 
 ```{r, scho=TRUE, eval=FALSE}
 mytbl <-data.frame(number=1:10 , letter=LETTERS[1:10])
 copy_to(con, mytbl, "mynewtable")
 ```
 
-By default, it creates a temporary table. But this is a setting you can change, and you can also specify what columns to index on the table.
+By default, it creates a **temporary** table. But this is a setting you can change, and you can also specify what columns to index on the table.
 
 
 Disconnect like we normally do
