Commit 89e9bc3

Merge pull request #107 from UmbrellaDocs/header-support
Feat: Add support for HTTP headers
2 parents 2c438a5 + 4c1edf0 commit 89e9bc3

9 files changed, +263 -32 lines changed

README.md

+33 -15
@@ -21,7 +21,7 @@ Linkspector is a powerful tool for anyone who creates content using markup langu
 1. **Enhanced Link Checking with Puppeteer**: It uses [Puppeteer](https://pptr.dev/) to check links in Chrome's headless mode, reducing the number of false positives.
 2. **Addresses limitations and adds user-requested features**: It is built to adress the shortcomings in [GitHub Action - Markdown link check](https://github.com/gaurav-nelson/github-action-markdown-link-check) and adds many user requested features.
 3. **Single repository for seamless collaboration**: All the code it needs to run is in a single repository, making it easier for community to collaborate.
-4. **Focused for CI/CD use**: Linkspector is purposefully tailored to run into your CI/CD pipelines. This ensures that link checking becomes an integral part of your development workflow.
+4. **Focused for CI/CD use**: Linkspector ([action-linkspector](https://github.com/UmbrellaDocs/action-linkspector)) is purposefully tailored to run into your CI/CD pipelines. This ensures that link checking becomes an integral part of your development workflow.
 
 ## Installation
 
@@ -112,6 +112,7 @@ Following are the available configuration options:
 | [`aliveStatusCodes`](#alive-status-codes) | The list of HTTP status codes that are considered as "alive" links. | No |
 | [`useGitIgnore`](#use-gitignore) | Indicates whether to use the rules defined in the `.gitignore` file to exclude files and directories. | No |
 | [`modifiedFilesOnly`](#check-modified-files-only) | Indicates whether to check only the files that have been modified in the last git commit. | No |
+| [`httpHeaders`](#http-headers) | The list of URLs and their corresponding HTTP headers to be used during link checking. | No |
 
 ### Files to Check
 
@@ -225,6 +226,31 @@ When enabled, Linkspector will use `git` to find the list of modified files and
 
 Also, if no modified files are found in the list of files to check, Linkspector will skip link checking and exit with a message indicating that no modified files have been edited so it will skip checking.
 
+### HTTP headers
+
+The `httpHeaders` option allows you to specify HTTP headers for specific URLs that require authorization. You can use environment variables for secure values.
+
+1. Create a `.env` file in the root directory of your project and add the environment variables. For example:
+
+   ```env
+   AUTH_TOKEN=abcdef123456
+   ```
+
+1. Add the `httpHeaders` section to the configuration file and specify the URLs and headers. For example:
+
+   ```yaml
+   httpHeaders:
+     - url:
+         - https://example1.com
+       headers:
+         Foo: Bar
+     - url:
+         - https://example2.com
+       headers:
+         Authorization: ${AUTH_TOKEN}
+         Foo: Bar
+   ```
+
 ### Sample configuration
 
 ```yml
@@ -250,6 +276,12 @@ replacementPatterns:
     replacement: '$1/id/$3'
   - pattern: "\\[([^\\]]+)\\]\\((https?://example.com)/file\\)"
     replacement: '<a href="$2/file">$1</a>'
+httpHeaders:
+  - url:
+      - https://example1.com
+    headers:
+      Authorization: Basic Zm9vOmJhcg==
+      Foo: Bar
 aliveStatusCodes:
   - 200
   - 201
@@ -304,20 +336,6 @@ To use Linkspector with Docker, follow these steps:
 bash -c 'linkspector check -c /path/to/custom-config.yml'
 ```
 
-## What's planned
-
-- [x] Spinner for local runs.
-- [x] Create a GitHub action. See [action-linkspector](https://github.com/UmbrellaDocs/action-linkspector)
-- [x] Modified files only check.
-- [!] Asciidoc support. (Limited to hyperlinks only)
-- [ ] ReStructured Text support.
-- [ ] Disable binary files downlaod.
-- [x] JSON output for `failed-only` ~~or `all`~~ links.
-- ~~[ ] CSV output for `all` links.~~ (dropped for now)
-- ~~[ ] Experimaental mode to gather all links and check them in batches to study performance gains.~~ (dropped for now)
-- ~~[ ] Proxy support to connect puppeteer to a remote service.~~ (dropped for now)
-- ~~[ ] Puppeteer config support.~~ (dropped for now)
-
 ## Contributing
 
 If you would like to contribute to Linkspector, please read the [contributing guidelines](/CONTRIBUTING.md).

index.test.js

+2 -2
@@ -34,7 +34,7 @@ test('linkspector should check top-level relative links in Markdown file', async
   }
 
   expect(hasErrorLinks).toBe(false)
-  expect(results.length).toBe(21)
+  expect(results.length).toBe(22)
 })
 
 test('linkspector should track statistics correctly when stats option is enabled', async () => {
@@ -87,7 +87,7 @@ test('linkspector should track statistics correctly when stats option is enabled
 
   // Verify statistics are being tracked correctly
   expect(stats.filesChecked).toBeGreaterThan(0)
-  expect(stats.totalLinks).toBe(21)
+  expect(stats.totalLinks).toBe(22)
   expect(stats.totalLinks).toBe(
     stats.httpLinks +
       stats.fileLinks +

lib/batch-check-links.js

+22 -4
@@ -22,14 +22,22 @@ function createLinkStatus(link, status, statusCode, errorMessage = null) {
   }
 }
 
-async function processLink(link, page, aliveStatusCodes) {
+async function processLink(link, page, aliveStatusCodes, httpHeaders) {
   let status = null
   let statusCode = null
   let errorMessage = null
 
   try {
     if (isUrl(link.url)) {
-      const response = await page.goto(link.url, { waitUntil: 'load' })
+      const headers =
+        httpHeaders.find((header) =>
+          header.url.some((urlPattern) => link.url.includes(urlPattern))
+        )?.headers || {}
+
+      const response = await page.goto(link.url, {
+        waitUntil: 'load',
+        headers,
+      })
       statusCode = response.status()
       if (aliveStatusCodes && aliveStatusCodes.includes(statusCode)) {
         status = 'assumed alive'
@@ -46,7 +54,12 @@ async function processLink(link, page, aliveStatusCodes) {
 }
 
 async function checkHyperlinks(nodes, options = {}, filePath) {
-  const { batchSize = 100, retryCount = 3, aliveStatusCodes } = options
+  const {
+    batchSize = 100,
+    retryCount = 3,
+    aliveStatusCodes,
+    httpHeaders = [],
+  } = options
   const linkStatusList = []
   const tempArray = []
 
@@ -150,7 +163,12 @@ async function checkHyperlinks(nodes, options = {}, filePath) {
 
         while (retryCountLocal < retryCount) {
           try {
-            linkStatus = await processLink(link, page, aliveStatusCodes)
+            linkStatus = await processLink(
+              link,
+              page,
+              aliveStatusCodes,
+              httpHeaders
+            )
             break
           } catch (error) {
             retryCountLocal++
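
The lookup added to `processLink` selects the first `httpHeaders` entry whose `url` list contains a string that is a substring of the link URL. The sketch below isolates that lookup in a hypothetical helper, `findHeadersForUrl` (the name is not part of the commit), so the matching behavior can be checked without a browser. One point that may be worth verifying against the Puppeteer documentation: to my knowledge the documented `page.goto` options are `referer`, `referrerPolicy`, `timeout`, `waitUntil` and `signal`, so a `headers` key might be ignored. `page.setExtraHTTPHeaders` is the documented call for attaching headers, and it appears here only in a comment as an assumed alternative.

```javascript
// Hypothetical helper (not in the commit): returns the headers of the first
// httpHeaders entry with a url pattern that is a substring of linkUrl.
function findHeadersForUrl(linkUrl, httpHeaders = []) {
  const entry = httpHeaders.find((header) =>
    header.url.some((urlPattern) => linkUrl.includes(urlPattern))
  )
  return entry?.headers || {}
}

// Illustrative configuration, mirroring the README example.
const httpHeaders = [
  { url: ['https://example1.com'], headers: { Foo: 'Bar' } },
  { url: ['https://example2.com'], headers: { Authorization: 'token' } },
]

const matched = findHeadersForUrl('https://example1.com/docs/page', httpHeaders)
const unmatched = findHeadersForUrl('https://other.example/docs', httpHeaders)

// Substring matching also accepts hosts that merely start with the pattern.
const lookalike = findHeadersForUrl('https://example1.com.attacker.test/', httpHeaders)

console.log(matched, unmatched, lookalike)

// Assumed usage with Puppeteer (untested in this sketch):
//   await page.setExtraHTTPHeaders(matched)
//   const response = await page.goto(linkUrl, { waitUntil: 'load' })
```

Because the match is a plain `includes`, a look-alike host receives the same headers as the intended one. Comparing parsed origins instead would be stricter, at the cost of no longer supporting path-level patterns.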

lib/handle-links-modification.js

+8 -2
@@ -9,6 +9,8 @@
  *
  * @returns {Array} The modified nodes.
  */
+import { escapeRegExp } from 'lodash-es'
+
 function doReplacements(nodes, opts = {}) {
   const { ignorePatterns = [], replacementPatterns = [], baseUrl } = opts
 
@@ -17,7 +19,9 @@ function doReplacements(nodes, opts = {}) {
     // Skip link checking if it matches any ignore pattern
     if (
       ignorePatterns.some(({ pattern }) => {
-        const regex = new RegExp(pattern)
+        // Sanitize the pattern before creating the RegExp
+        const sanitizedPattern = escapeRegExp(pattern)
+        const regex = new RegExp(sanitizedPattern)
         return regex.test(url)
       })
     ) {
@@ -31,7 +35,9 @@ function doReplacements(nodes, opts = {}) {
 
     // Replace link URL based on replacement patterns
     replacementPatterns.forEach(({ pattern, replacement }) => {
-      url = url.replace(new RegExp(pattern), replacement)
+      // Sanitize the pattern before creating the RegExp
+      const sanitizedPattern = escapeRegExp(pattern)
+      url = url.replace(new RegExp(sanitizedPattern), replacement)
     })
     node.url = url
 
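
Escaping a pattern before passing it to `new RegExp` turns every metacharacter into a literal, so the resulting expression only matches URLs that contain the pattern text verbatim. The sketch below illustrates the difference with a local stand-in for lodash's `escapeRegExp`. The stand-in is an assumption made to keep the example self-contained; it is meant to escape the same character set (`^ $ \ . * + ? ( ) [ ] { } |`). The pattern and URL are illustrative values.

```javascript
// Local stand-in for lodash-es escapeRegExp (assumption: same escaped set).
function escapeRegExp(text) {
  return text.replace(/[\\^$.*+?()[\]{}|]/g, '\\$&')
}

// An ignore pattern written as a regular expression, as a config might do.
const pattern = '^https://example\\.com/skip/.*$'
const url = 'https://example.com/skip/page'

// Before the change: the pattern is interpreted as a regular expression.
const asRegex = new RegExp(pattern).test(url)

// After the change: the pattern is matched as literal text.
const asLiteral = new RegExp(escapeRegExp(pattern)).test(url)

console.log(asRegex) // true
console.log(asLiteral) // false
```

The same applies to `replacementPatterns`: replacements such as `$1/id/$3` depend on the parentheses in the pattern remaining capture groups, which they no longer are once escaped.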

lib/validate-config.js

+6 -9
@@ -23,15 +23,12 @@ async function validateConfig(config) {
     excludedDirs: Joi.array().items(Joi.string()),
     fileExtensions: Joi.array().items(Joi.string()),
     baseUrl: Joi.string(),
-    httpHeaders: Joi.object({
-      url: Joi.string(),
-      headers: Joi.array().items(
-        Joi.object({
-          name: Joi.string().required(),
-          value: Joi.string().required(),
-        })
-      ),
-    }),
+    httpHeaders: Joi.array().items(
+      Joi.object({
+        url: Joi.array().items(Joi.string().uri()).required(),
+        headers: Joi.object().pattern(Joi.string(), Joi.string()).required(),
+      })
+    ),
     aliveStatusCodes: Joi.array().items(Joi.number()),
     ignorePatterns: Joi.array().items(
       Joi.object({
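
The schema change moves `httpHeaders` from a single object with a `name`/`value` header list to an array of entries, each with a `url` array and a `headers` map. The sketch below is a hand-written approximation of that shape check in plain JavaScript. It is not Joi, and `looksLikeHttpHeaders` is a hypothetical name. It uses the `URL` constructor as a rough substitute for `Joi.string().uri()` and does not model stricter Joi defaults such as rejecting unknown keys.

```javascript
// Rough substitute for Joi.string().uri(): accepts strings the URL
// constructor can parse as absolute URLs.
function isAbsoluteUrl(value) {
  if (typeof value !== 'string') return false
  try {
    new URL(value)
    return true
  } catch {
    return false
  }
}

function isPlainObject(value) {
  return typeof value === 'object' && value !== null && !Array.isArray(value)
}

// Hypothetical approximation of the new httpHeaders schema.
function looksLikeHttpHeaders(value) {
  if (!Array.isArray(value)) return false
  return value.every(
    (entry) =>
      isPlainObject(entry) &&
      Array.isArray(entry.url) &&
      entry.url.every((item) => isAbsoluteUrl(item)) &&
      isPlainObject(entry.headers) &&
      Object.values(entry.headers).every((v) => typeof v === 'string')
  )
}

// Shape accepted after the commit.
const newShape = [{ url: ['https://example1.com'], headers: { Foo: 'Bar' } }]

// Shape described by the schema before the commit.
const oldShape = {
  url: 'https://example1.com',
  headers: [{ name: 'Foo', value: 'Bar' }],
}

console.log(looksLikeHttpHeaders(newShape))
console.log(looksLikeHttpHeaders(oldShape))
```

Note that the schema requires each `url` item to be a full URI, while the lookup in `lib/batch-check-links.js` treats the same strings as substrings of the link URL.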

linkspector.js

+17
@@ -2,6 +2,7 @@ import { execSync } from 'child_process'
 import { readFileSync } from 'fs'
 import path from 'path'
 import yaml from 'js-yaml'
+import dotenv from 'dotenv'
 import { validateConfig } from './lib/validate-config.js'
 import { prepareFilesList } from './lib/prepare-file-list.js'
 import { extractMarkdownHyperlinks } from './lib/extract-markdown-hyperlinks.js'
@@ -10,6 +11,19 @@ import { getUniqueLinks } from './lib/get-unique-links.js'
 import { checkHyperlinks } from './lib/batch-check-links.js'
 import { updateLinkStatusObj } from './lib/update-linkstatus-obj.js'
 
+// Load environment variables from .env file
+dotenv.config()
+
+// Function to replace placeholders with environment variables
+function replaceEnvVariables(config) {
+  const configString = JSON.stringify(config)
+  const replacedConfigString = configString.replace(
+    /\$\{(\w+)\}/g,
+    (_, name) => process.env[name] || ''
+  )
+  return JSON.parse(replacedConfigString)
+}
+
 // Function to check if git is installed
 function isGitInstalled() {
   try {
@@ -44,6 +58,9 @@ export async function* linkspector(configFile, cmd) {
     throw new Error('Failed to parse the YAML content.')
   }
 
+  // Replace environment variables in the configuration
+  config = replaceEnvVariables(config)
+
   try {
     const isValid = await validateConfig(config)
     if (!isValid) {
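
`replaceEnvVariables` serializes the whole config to JSON, substitutes every `${NAME}` placeholder in the text, and parses the result back. The sketch below reproduces that approach with the environment passed in as a parameter, an assumption made here so it runs without a `.env` file; the commit itself reads `process.env`. It shows two consequences: an unset variable becomes an empty string, and a value containing a double quote is spliced into the JSON text unescaped, which makes `JSON.parse` throw.

```javascript
// Same technique as the commit, with the environment injected for testing.
function replaceEnvVariables(config, env) {
  const configString = JSON.stringify(config)
  const replacedConfigString = configString.replace(
    /\$\{(\w+)\}/g,
    (_, name) => env[name] || ''
  )
  return JSON.parse(replacedConfigString)
}

// Illustrative config; NOT_SET is deliberately missing from the environment.
const config = {
  httpHeaders: [
    {
      url: ['https://example2.com'],
      headers: { Authorization: '${AUTH_TOKEN}', Trace: '${NOT_SET}' },
    },
  ],
}

const resolved = replaceEnvVariables(config, { AUTH_TOKEN: 'abcdef123456' })
console.log(resolved.httpHeaders[0].headers)

// A double quote in the value breaks the JSON text before it is parsed.
let quoteError = null
try {
  replaceEnvVariables(config, { AUTH_TOKEN: 'abc"def' })
} catch (error) {
  quoteError = error
}
console.log(quoteError instanceof SyntaxError)
```

Substituting values while walking the parsed object, rather than editing the serialized text, would avoid the quoting problem. Separately, if I recall Joi's defaults correctly, `Joi.string()` rejects empty strings, so an unset variable would likely surface as a validation error rather than as an empty header.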

package-lock.json

+20
Some generated files are not rendered by default.

package.json

+2
@@ -42,12 +42,14 @@
   "homepage": "https://github.com/UmbrellaDocs/linkspector#readme",
   "dependencies": {
     "commander": "^13.1.0",
+    "dotenv": "^16.4.7",
     "github-slugger": "^2.0.0",
     "glob": "^11.0.1",
     "ignore": "^7.0.3",
     "joi": "^17.13.3",
     "js-yaml": "^4.1.0",
     "kleur": "^4.1.5",
+    "lodash-es": "^4.17.21",
     "ora": "^8.2.0",
     "puppeteer": "^24.4.0",
     "remark-gfm": "^4.0.1",
