Skip to content
This repository was archived by the owner on Sep 2, 2023. It is now read-only.

Feature: Load ESM over https #127

Closed
viktor-ku opened this issue Jun 9, 2018 · 31 comments
Closed

Feature: Load ESM over https #127

viktor-ku opened this issue Jun 9, 2018 · 31 comments

Comments

@viktor-ku
Copy link

Hi,

Just closed my PR to the Node.js core about this feature. I'd like to propose a loader that could do this thing.

Example

import sum from 'https://gitsomething.com/origin/v1.1.0/main.js'

console.log(sum(1, 1)) // 2

How I think it should work

  1. Import over https reached
  2. Convert this https path (without any filename) to the hash representation (e.g. 'https://gitsomething.com/origin/v1.1.0' to 70b56bcaff85f7836cc4688021cfbfe385550fbbeb6e5e801eb16bd9e66d3134)
  3. Go to the main location with stored node modules over http (in PR it uses /tmp/ or the path specified by NODE_HTTP_MODULES env. Essential is that it is some base dir like node_modules)
  4. Look for the folder with this hashed name or download the package
  5. Look for the file matching the loading one (e.g. main.js in this case)
  6. Load the file content as usual esm

What I don't know yet

How to resolve package dependencies

@xtuc
Copy link
Contributor

xtuc commented Jun 9, 2018

I'm curious to know what do you think about allowing URL in Module specifier in WASM WebAssembly/esm-integration#11

@devsnek
Copy link
Member

devsnek commented Jun 9, 2018

once again i think we need to do this the standardised way: html.spec.whatwg.org/multipage/webappapis.html#fetching-scripts

there are 1000000000000 different things that servers do with compression, transport security, content types, etc and node trying to figure all that out itself won't help anyone.

also why do you want to cache files?

@xtuc
Copy link
Contributor

xtuc commented Jun 9, 2018

@devsnek Yes that's right, just wanted to bring that up because I think that's important to all agree on it.

also why do you want to cache files?

Sorry where's mentioned caching?

@devsnek
Copy link
Member

devsnek commented Jun 9, 2018

@xtuc steps 3-5 are working on files cached in /tmp/ (which won't work on plenty of oses)

i'm unclear as to why source text is leaving node's memory

@benjamingr
Copy link
Member

benjamingr commented Jun 9, 2018

@devsnek presumably for the same reason ry caches files in deno - so it’s fast and consistent between loads and doesn’t delay startup time.

@devsnek
Copy link
Member

devsnek commented Jun 9, 2018

@benjamingr wouldn't you just download the files in that case? i would expect an https import to hit the server every time.

@ljharb
Copy link
Member

ljharb commented Jun 9, 2018

It must; in case there’s redirects or updated content. It might get a 304, of course.

@viktor-ku
Copy link
Author

The reason behind caching is to avoid going to the internet if you already have this module downloaded. Why do you expect node to re-download module every time?

  • If you specify url with certain version (see example) then you wouldn't need it to be re-fetched
  • If you use master on un-versioned url then you will have this kind of issue, but then we can have a similar to ry idea with --reload command

@viktor-ku
Copy link
Author

About tmp:

Guys, it is a draft. I am not gonna leave it as is. We simply need some convenient place to store this modules. Perhaps we can specify this place with flag or env (like in PR).

So you specify env variable and everything goes there.

@guybedford
Copy link
Contributor

guybedford commented Jun 9, 2018 via email

@viktor-ku
Copy link
Author

@guybedford I don't know how to create a loader at the moment. But I will start to research that

@ljharb
Copy link
Member

ljharb commented Jun 9, 2018

A url is designed to have its contents cached, but urls always still hit the network (service workers aside). I would expect the same in node.

@viktor-ku
Copy link
Author

@ljharb so you will use --reload a lot I guess ⌨️

@ljharb
Copy link
Member

ljharb commented Jun 9, 2018

No, I’d expect that to be the default.

@benjamingr
Copy link
Member

Guys, it is a draft.

Prefer folks, people etc.

@benjamingr
Copy link
Member

@viktor-ku
Copy link
Author

Prefer folks, people etc.

@benjamingr Okay. I will try to remember that. Bad habits I guess

@SMotaal
Copy link

SMotaal commented Jun 12, 2018

@ljharb on the topic of service workers, could abstracting away https implementation into a similar mechanism make sense.

Say I have xyz origins, I create and register a worklet for those (I can use my imaginary favourite npm package to do so), and when a URL is requested, all node has to do is delegate to the worker and expect a Response-like object with content-type and body. It will involve a bit of origin trials... etc. so that node does delegate to the right worklet. It can also be statically defined for packages to use the specific worklet entry point specifiers.

@Fishrock123
Copy link
Contributor

I think this feature would be dangerous to implement.

If people would like to hot-load code from a network we should not make it easy for them. Node.js is too un-sandboxed to make this a viable default or built-in option.

@SMotaal
Copy link

SMotaal commented Jun 27, 2018

@bmek One thing I keep hitting with the 20+ isolated experiments I did with loaders is the fact that sometimes the resolved URL to a module's actual file on disk is not the same as the URL of the module it intends to resolve as. For instance, if a loader transpiles and caches a module in a temporary file, that URL is the resolved URL, however it is should not be the import.meta.url nor should it be the parentURL for it's static linking calls to Loader#<resolve, import, …>(specifier, parentURL).

I believe this will be common beyond experimental, which imho applies to this kind of loader.

@viktor-ku did you explore this further with --loader and if so, were there any other pain-points came across?

@bmeck
Copy link
Member

bmeck commented Jun 27, 2018

@SMotaal for now you can insert import.meta.url = ${JSON.stringify(desiredURL)} at top of your transpiled script.

However this is not enough, will continue why in #140

@guybedford
Copy link
Contributor

Note this contextual handling is exactly why a translate hook is useful.

@bmeck
Copy link
Member

bmeck commented Jun 27, 2018

@guybedford I don't think this implies we need a separate hook / translate, if you could explain that a bit it would help here.

@guybedford
Copy link
Contributor

Translation allows setting the JS source of a resource at a given path, thereby maintining the correct referrer context to that given path.

The alternatives involve large-scale re-resolving all modules that depend on that module into a new ID space, that will then not match the file system.

@bmeck
Copy link
Member

bmeck commented Jun 27, 2018

@guybedford couldn't that be rolled into the same hook? I'm asking because that matches the JS spec closer and seems to alleviate a series of issues.

@guybedford
Copy link
Contributor

@bmeck do you mean combining with resolve or combining with load? A load hook can certainly work as an alternative to any translate hook.

@bmeck
Copy link
Member

bmeck commented Jun 27, 2018

there is too much tribal knowledge in those terms of "load" vs "translate" for me to understand, I've been looking at having:

resolve(referrer : string, specifier : string, hostDefined: *) => Promise<{
  key? : string, // insertion point into cache
  body? : B, // the body, unclear on if this actually needs to be loaded ahead of return, can be on demand?
  data? : * // any message passing between loaders
}>

@guybedford
Copy link
Contributor

Could you not just return a Blob URL directly in such a case?

Blobs can work certainly, but are a bit more work as any importers need to be able to look up the mapping from the file URL to the blob for that resource, which is avoided with a translate.

What I mean by load is basically a fetch hook that can choose to do translation (load effectively covering both concepts).

@guybedford
Copy link
Contributor

Also the issue with resolvers getting data as you've written is that it's a many to one function, so precedence on body / data would apply.

@bmeck
Copy link
Member

bmeck commented Jun 27, 2018

@guybedford blob: URLs are insufficient for cycles: w3c/FileAPI#97 . We want full Blobs with ability to separate reservation of the key so that it can be done ahead of content creation.

Also the issue with resolvers getting data as you've written is that it's a many to one function, so precedence on body / data would apply.

I'm unclear on this, data is purely used as a message passing system for loaders, it wouldn't affect any default loader I would think.

@SMotaal
Copy link

SMotaal commented Jun 27, 2018

@guybedford I am +1 for a translation hook as long as chaining it is handled appropriately (which in my opinion is different from resolve's chaining).

Could you not just return a Blob URL directly in such a case?

My biggest concern in all my testing so far has been unchecked potential for redundant "stringing" garbage and mishandling outdated references to objects because we all do that sometimes. My second one is determinism of resulting URL, ie URL.createObjectURL does not allow for deterministic mapping for deadlock dependencies. So maybe the open web requires this kind of after-the-fact security however, Node's Loader specifically should meter both security and performance overhead.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

10 participants