Skip to content

Commit ac84d14

Browse files
committed
feat: infer uid/gid instead of accepting as options
BREAKING CHANGE: the uid gid options are no longer respected or necessary. As of this change, cacache will always match the cache contents to the ownership of the cache directory (or its parent directory), regardless of what the caller passes in. Reasoning: The number one reason to use a uid or gid option was to keep root-owned files from causing problems in the cache. In npm's case, this meant that CLI's ./lib/command.js had to work out the appropriate uid and gid, then pass it to the libnpmcommand module, which had to in turn pass the uid and gid to npm-registry-fetch, which then passed it to make-fetch-happen, which passed it to cacache. (For package fetching, pacote would be in that mix as well.) Added to that, `cacache.rm()` will actually _write_ a file into the cache index, but has no way to accept an option so that its call to entry-index.js will write the index with the appropriate uid/gid. Little ownership bugs were all over the place, and tricky to trace through. (Why should make-fetch-happen even care about accepting or passing uids and gids? It's an http library.) This change allows us to keep the cache from having mixed ownership in any situation. Of course, this _does_ mean that if you have a root-owned but user-writable folder (for example, `/tmp`), then the cache will try to chown everything to root. The solution is for the user to create a folder, make it user-owned, and use that, rather than relying on cacache to create the root cache folder. If we decide to restore the uid/gid opts, and use ownership inferrence only when uid/gid are unset, then take care to also make rm take an option object, and pass it through to entry-index.js.
1 parent 676cb32 commit ac84d14

9 files changed

+232
-90
lines changed

README.md

+34-17
Original file line numberDiff line numberDiff line change
@@ -1,12 +1,16 @@
1-
# cacache [![npm version](https://img.shields.io/npm/v/cacache.svg)](https://npm.im/cacache) [![license](https://img.shields.io/npm/l/cacache.svg)](https://npm.im/cacache) [![Travis](https://img.shields.io/travis/zkat/cacache.svg)](https://travis-ci.org/zkat/cacache) [![AppVeyor](https://ci.appveyor.com/api/projects/status/github/zkat/cacache?svg=true)](https://ci.appveyor.com/project/zkat/cacache) [![Coverage Status](https://coveralls.io/repos/github/zkat/cacache/badge.svg?branch=latest)](https://coveralls.io/github/zkat/cacache?branch=latest)
1+
# cacache [![npm version](https://img.shields.io/npm/v/cacache.svg)](https://npm.im/cacache) [![license](https://img.shields.io/npm/l/cacache.svg)](https://npm.im/cacache) [![Travis](https://img.shields.io/travis/npm/cacache.svg)](https://travis-ci.org/npm/cacache) [![AppVeyor](https://ci.appveyor.com/api/projects/status/github/npm/cacache?svg=true)](https://ci.appveyor.com/project/npm/cacache) [![Coverage Status](https://coveralls.io/repos/github/npm/cacache/badge.svg?branch=latest)](https://coveralls.io/github/npm/cacache?branch=latest)
22

3-
[`cacache`](https://github.com/zkat/cacache) is a Node.js library for managing
3+
[`cacache`](https://github.com/npm/cacache) is a Node.js library for managing
44
local key and content address caches. It's really fast, really good at
55
concurrency, and it will never give you corrupted data, even if cache files
66
get corrupted or manipulated.
77

8-
It was originally written to be used as [npm](https://npm.im)'s local cache, but
9-
can just as easily be used on its own.
8+
On systems that support user and group settings on files, cacache will
9+
match the `uid` and `gid` values to the folder where the cache lives, even
10+
when running as `root`.
11+
12+
It was written to be used as [npm](https://npm.im)'s local cache, but can
13+
just as easily be used on its own.
1014

1115
_Translations: [español](README.es.md)_
1216

@@ -414,13 +418,6 @@ may also use any anagram of `'modnar'` to use this feature.
414418
Currently only supports one algorithm at a time (i.e., an array length of
415419
exactly `1`). Has no effect if `opts.integrity` is present.
416420

417-
##### `opts.uid`/`opts.gid`
418-
419-
If provided, cacache will do its best to make sure any new files added to the
420-
cache use this particular `uid`/`gid` combination. This can be used,
421-
for example, to drop permissions when someone uses `sudo`, but cacache makes
422-
no assumptions about your needs here.
423-
424421
##### `opts.memoize`
425422

426423
Default: null
@@ -498,10 +495,11 @@ Completely resets the in-memory entry cache.
498495
Returns a unique temporary directory inside the cache's `tmp` dir. This
499496
directory will use the same safe user assignment that all the other stuff use.
500497

501-
Once the directory is made, it's the user's responsibility that all files within
502-
are made according to the same `opts.gid`/`opts.uid` settings that would be
503-
passed in. If not, you can ask cacache to do it for you by calling
504-
[`tmp.fix()`](#tmp-fix), which will fix all tmp directory permissions.
498+
Once the directory is made, it's the user's responsibility that all files
499+
within are given the appropriate `gid`/`uid` ownership settings to match
500+
the rest of the cache. If not, you can ask cacache to do it for you by
501+
calling [`tmp.fix()`](#tmp-fix), which will fix all tmp directory
502+
permissions.
505503

506504
If you want automatic cleanup of this directory, use
507505
[`tmp.withTmp()`](#with-tpm)
@@ -514,6 +512,27 @@ cacache.tmp.mkdir(cache).then(dir => {
514512
})
515513
```
516514

515+
#### <a name="tmp-fix"></a> `> tmp.fix(cache) -> Promise`
516+
517+
Sets the `uid` and `gid` properties on all files and folders within the tmp
518+
folder to match the rest of the cache.
519+
520+
Use this after manually writing files into [`tmp.mkdir`](#tmp-mkdir) or
521+
[`tmp.withTmp`](#with-tmp).
522+
523+
##### Example
524+
525+
```javascript
526+
cacache.tmp.mkdir(cache).then(dir => {
527+
writeFile(path.join(dir, 'file'), someData).then(() => {
528+
// make sure we didn't just put a root-owned file in the cache
529+
cacache.tmp.fix().then(() => {
530+
// all uids and gids match now
531+
})
532+
})
533+
})
534+
```
535+
517536
#### <a name="with-tmp"></a> `> tmp.withTmp(cache, opts, cb) -> Promise`
518537

519538
Creates a temporary directory with [`tmp.mkdir()`](#tmp-mkdir) and calls `cb`
@@ -591,8 +610,6 @@ of entries removed, etc.
591610

592611
##### Options
593612

594-
* `opts.uid` - uid to assign to cache and its contents
595-
* `opts.gid` - gid to assign to cache and its contents
596613
* `opts.filter` - receives a formatted entry. Return false to remove it.
597614
Note: might be called more than once on the same entry.
598615

lib/content/write.js

+3-3
Original file line numberDiff line numberDiff line change
@@ -121,7 +121,7 @@ function pipeToTmp (inputStream, cache, tmpTarget, opts, errCheck) {
121121
function makeTmp (cache, opts) {
122122
const tmpTarget = uniqueFilename(path.join(cache, 'tmp'), opts.tmpPrefix)
123123
return fixOwner.mkdirfix(
124-
path.dirname(tmpTarget), opts.uid, opts.gid
124+
cache, path.dirname(tmpTarget)
125125
).then(() => ({
126126
target: tmpTarget,
127127
moved: false
@@ -134,14 +134,14 @@ function moveToDestination (tmp, cache, sri, opts, errCheck) {
134134
const destDir = path.dirname(destination)
135135

136136
return fixOwner.mkdirfix(
137-
destDir, opts.uid, opts.gid
137+
cache, destDir
138138
).then(() => {
139139
errCheck && errCheck()
140140
return moveFile(tmp.target, destination)
141141
}).then(() => {
142142
errCheck && errCheck()
143143
tmp.moved = true
144-
return fixOwner.chownr(destination, opts.uid, opts.gid)
144+
return fixOwner.chownr(cache, destination)
145145
})
146146
}
147147

lib/entry-index.js

+5-7
Original file line numberDiff line numberDiff line change
@@ -32,9 +32,7 @@ module.exports.NotFoundError = class NotFoundError extends Error {
3232

3333
const IndexOpts = figgyPudding({
3434
metadata: {},
35-
size: {},
36-
uid: {},
37-
gid: {}
35+
size: {}
3836
})
3937

4038
module.exports.insert = insert
@@ -49,7 +47,7 @@ function insert (cache, key, integrity, opts) {
4947
metadata: opts.metadata
5048
}
5149
return fixOwner.mkdirfix(
52-
path.dirname(bucket), opts.uid, opts.gid
50+
cache, path.dirname(bucket)
5351
).then(() => {
5452
const stringified = JSON.stringify(entry)
5553
// NOTE - Cleverness ahoy!
@@ -63,7 +61,7 @@ function insert (cache, key, integrity, opts) {
6361
bucket, `\n${hashEntry(stringified)}\t${stringified}`
6462
)
6563
}).then(
66-
() => fixOwner.chownr(bucket, opts.uid, opts.gid)
64+
() => fixOwner.chownr(cache, bucket)
6765
).catch({ code: 'ENOENT' }, () => {
6866
// There's a class of race conditions that happen when things get deleted
6967
// during fixOwner, or between the two mkdirfix/chownr calls.
@@ -86,13 +84,13 @@ function insertSync (cache, key, integrity, opts) {
8684
size: opts.size,
8785
metadata: opts.metadata
8886
}
89-
fixOwner.mkdirfix.sync(path.dirname(bucket), opts.uid, opts.gid)
87+
fixOwner.mkdirfix.sync(cache, path.dirname(bucket))
9088
const stringified = JSON.stringify(entry)
9189
fs.appendFileSync(
9290
bucket, `\n${hashEntry(stringified)}\t${stringified}`
9391
)
9492
try {
95-
fixOwner.chownr.sync(bucket, opts.uid, opts.gid)
93+
fixOwner.chownr.sync(cache, bucket)
9694
} catch (err) {
9795
if (err.code !== 'ENOENT') {
9896
throw err

lib/util/fix-owner.js

+69-37
Original file line numberDiff line numberDiff line change
@@ -5,83 +5,115 @@ const BB = require('bluebird')
55
const chownr = BB.promisify(require('chownr'))
66
const mkdirp = BB.promisify(require('mkdirp'))
77
const inflight = require('promise-inflight')
8+
const inferOwner = require('./infer-owner.js')
9+
10+
// Memoize getuid()/getgid() calls.
11+
// patch process.setuid/setgid to invalidate cached value on change
12+
const self = { uid: null, gid: null }
13+
const getSelf = () => {
14+
if (typeof self.uid !== 'number') {
15+
self.uid = process.getuid()
16+
const setuid = process.setuid
17+
process.setuid = (uid) => {
18+
self.uid = null
19+
process.setuid = setuid
20+
return process.setuid(uid)
21+
}
22+
}
23+
if (typeof self.gid !== 'number') {
24+
self.gid = process.getgid()
25+
const setgid = process.setgid
26+
process.setgid = (gid) => {
27+
self.gid = null
28+
process.setgid = setgid
29+
return process.setgid(gid)
30+
}
31+
}
32+
}
833

934
module.exports.chownr = fixOwner
10-
function fixOwner (filepath, uid, gid) {
35+
function fixOwner (cache, filepath) {
1136
if (!process.getuid) {
1237
// This platform doesn't need ownership fixing
1338
return BB.resolve()
1439
}
15-
if (typeof uid !== 'number' && typeof gid !== 'number') {
16-
// There's no permissions override. Nothing to do here.
17-
return BB.resolve()
18-
}
19-
if ((typeof uid === 'number' && process.getuid() === uid) &&
20-
(typeof gid === 'number' && process.getgid() === gid)) {
40+
return inferOwner(cache).then(owner => {
41+
const { uid, gid } = owner
42+
getSelf()
43+
2144
// No need to override if it's already what we used.
22-
return BB.resolve()
23-
}
24-
return inflight(
25-
'fixOwner: fixing ownership on ' + filepath,
26-
() => chownr(
27-
filepath,
28-
typeof uid === 'number' ? uid : process.getuid(),
29-
typeof gid === 'number' ? gid : process.getgid()
30-
).catch({ code: 'ENOENT' }, () => null)
31-
)
45+
if (self.uid === uid && self.gid === gid) {
46+
return
47+
}
48+
49+
return inflight(
50+
'fixOwner: fixing ownership on ' + filepath,
51+
() => chownr(
52+
filepath,
53+
typeof uid === 'number' ? uid : self.uid,
54+
typeof gid === 'number' ? gid : self.gid
55+
).catch({ code: 'ENOENT' }, () => null)
56+
)
57+
})
3258
}
3359

3460
module.exports.chownr.sync = fixOwnerSync
35-
function fixOwnerSync (filepath, uid, gid) {
61+
function fixOwnerSync (cache, filepath) {
3662
if (!process.getuid) {
3763
// This platform doesn't need ownership fixing
3864
return
3965
}
40-
if (typeof uid !== 'number' && typeof gid !== 'number') {
41-
// There's no permissions override. Nothing to do here.
42-
return
43-
}
44-
if ((typeof uid === 'number' && process.getuid() === uid) &&
45-
(typeof gid === 'number' && process.getgid() === gid)) {
66+
const { uid, gid } = inferOwner.sync(cache)
67+
getSelf()
68+
if (self.uid === uid && self.gid === gid) {
4669
// No need to override if it's already what we used.
4770
return
4871
}
4972
try {
5073
chownr.sync(
5174
filepath,
52-
typeof uid === 'number' ? uid : process.getuid(),
53-
typeof gid === 'number' ? gid : process.getgid()
75+
typeof uid === 'number' ? uid : self.uid,
76+
typeof gid === 'number' ? gid : self.gid
5477
)
5578
} catch (err) {
79+
// only catch ENOENT, any other error is a problem.
5680
if (err.code === 'ENOENT') {
5781
return null
5882
}
83+
throw err
5984
}
6085
}
6186

6287
module.exports.mkdirfix = mkdirfix
63-
function mkdirfix (p, uid, gid, cb) {
64-
return mkdirp(p).then(made => {
65-
if (made) {
66-
return fixOwner(made, uid, gid).then(() => made)
67-
}
68-
}).catch({ code: 'EEXIST' }, () => {
69-
// There's a race in mkdirp!
70-
return fixOwner(p, uid, gid).then(() => null)
88+
function mkdirfix (cache, p, cb) {
89+
// we have to infer the owner _before_ making the directory, even though
90+
// we aren't going to use the results, since the cache itself might not
91+
// exist yet. If we mkdirp it, then our current uid/gid will be assumed
92+
// to be correct if it creates the cache folder in the process.
93+
return inferOwner(cache).then(() => {
94+
return mkdirp(p).then(made => {
95+
if (made) {
96+
return fixOwner(cache, made).then(() => made)
97+
}
98+
}).catch({ code: 'EEXIST' }, () => {
99+
// There's a race in mkdirp!
100+
return fixOwner(cache, p).then(() => null)
101+
})
71102
})
72103
}
73104

74105
module.exports.mkdirfix.sync = mkdirfixSync
75-
function mkdirfixSync (p, uid, gid) {
106+
function mkdirfixSync (cache, p) {
76107
try {
108+
inferOwner.sync(cache)
77109
const made = mkdirp.sync(p)
78110
if (made) {
79-
fixOwnerSync(made, uid, gid)
111+
fixOwnerSync(cache, made)
80112
return made
81113
}
82114
} catch (err) {
83115
if (err.code === 'EEXIST') {
84-
fixOwnerSync(p, uid, gid)
116+
fixOwnerSync(cache, p)
85117
return null
86118
} else {
87119
throw err

lib/util/infer-owner.js

+80
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,80 @@
1+
'use strict'
2+
3+
// This is only called by lib/util/fix-owner.js
4+
//
5+
// Get the uid/gid from the cache folder itself, not from
6+
// settings being passed in. Too flaky otherwise, because the
7+
// opts baton has to be passed properrly through half a dozen
8+
// different modules.
9+
//
10+
// This module keeps a Map of cache=>{uid,gid}. If not in the map,
11+
// then stat the folder, then the parent, ..., until it finds a folder
12+
// that exists, and use that folder's uid and gid as the owner.
13+
//
14+
// If we don't have getuid/getgid, then this never gets called.
15+
16+
const BB = require('bluebird')
17+
const fs = require('fs')
18+
const lstat = BB.promisify(fs.lstat)
19+
const lstatSync = fs.lstatSync
20+
const { dirname } = require('path')
21+
const inflight = require('promise-inflight')
22+
23+
const cacheToOwner = new Map()
24+
25+
const inferOwner = cache => {
26+
if (cacheToOwner.has(cache)) {
27+
// already inferred it
28+
return BB.resolve(cacheToOwner.get(cache))
29+
}
30+
31+
const statThen = st => {
32+
const { uid, gid } = st
33+
cacheToOwner.set(cache, { uid, gid })
34+
return { uid, gid }
35+
}
36+
// check the parent if the cache itself fails
37+
// likely it does not exist yet.
38+
const parent = dirname(cache)
39+
const parentTrap = parent === cache ? null : er => {
40+
return inferOwner(parent).then((owner) => {
41+
cacheToOwner.set(cache, owner)
42+
return owner
43+
})
44+
}
45+
return lstat(cache).then(statThen, parentTrap)
46+
}
47+
48+
const inferOwnerSync = cache => {
49+
if (cacheToOwner.has(cache)) {
50+
// already inferred it
51+
return cacheToOwner.get(cache)
52+
}
53+
54+
// the parent we'll check if it doesn't exist yet
55+
const parent = dirname(cache)
56+
// avoid obscuring call site by re-throwing
57+
// "catch" the error by returning from a finally,
58+
// only if we're not at the root, and the parent call works.
59+
let threw = true
60+
try {
61+
const st = lstatSync(cache)
62+
threw = false
63+
const { uid, gid } = st
64+
cacheToOwner.set(cache, { uid, gid })
65+
return { uid, gid }
66+
} finally {
67+
if (threw && parent !== cache) {
68+
const owner = inferOwnerSync(parent)
69+
cacheToOwner.set(cache, owner)
70+
return owner // eslint-disable-line no-unsafe-finally
71+
}
72+
}
73+
}
74+
75+
module.exports = cache => inflight(
76+
'inferOwner: detecting ownership of ' + cache,
77+
() => inferOwner(cache)
78+
)
79+
80+
module.exports.sync = inferOwnerSync

0 commit comments

Comments
 (0)