Viewing media from different domains - cultural heritage interoperability #72
Hi! Thanks for filing. It's hard for me to follow with "publisher," "viewer," and "application." Can we reframe as top frame and partitioned iframe? Are you suggesting that the top frame requests storage access on behalf of the partitioned iframe? If so, why is that a requirement?
Thanks for the super prompt response! I hope the following helps clarify the use case we have. There are no iframes in this scenario. The scenario is, essentially, the "deep linking / hot linking" of images, where the images require cookies to be sent across domains. For example, try the demos for these applications, which pull in images from third parties ... but imagine that cookies needed to be sent from those application sites to the image-hosting sites.
In the first example there do not seem to be any third-party cookies (i.e., cookies sent to iiif.bodleian.ox.ac.uk).
The authentication looks like it is in the URL, e.g. https://iiif.bodleian.ox.ac.uk/iiif/image/e58b8c60-005c-4c41-a22f-07d49cb25ede/info.json
If that value is derived from first-party state (i.e. cookies) there should be no problem (with e.g. ITP)? There might be in future with first-party cookie restrictions, but those will probably only be about reducing expiry times in some circumstances.
Although real-world applications are usually far more complex, the essence of the problem can be reduced to a very simple scenario. Starting condition: the user is logged in at publisher.org and has a session. On a web page at publisher.org, the user can view access-controlled images because the session cookie is sent with the image request.
Over at viewer.org, the user is working on an art annotation project using this same image from publisher.org. Even though a direct first-party request for https://publisher.org/image.jpg works for the user (because they have a session), when that request is initiated from viewer.org, the image is broken because the browser policy no longer permits third-party cookies to be sent.
Cross-domain interoperability for access-controlled images relies on third-party cookies. Recently, implementations have had to make sure they were using SameSite headers correctly, but the pattern still worked. Real-world example: https://wellcomelibrary.org/item/b1987280x The image resources are on a different domain from the web page. Access to these resources requires a cookie, which in this case is granted by publisher.org if you accept the terms (this simple pattern is called "clickthrough" in our spec). This page still works in my version of Chrome, but it doesn't work in Safari any more, and soon won't work in Chrome if third-party cookies are removed. We understand that the web feature that enables this interoperability is the same web feature that is abused by trackers. Consensual opt-in to third-party cookies via a browser API solves this more elegantly. The other part of our spec is about how viewer.org client code learns whether the user has credentials for https://publisher.org/image.jpg without having to know anything about the credentials themselves - which might be a use case for isLoggedIn, but that's a different issue!
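A minimal sketch of the broken case, using the illustrative publisher.org / viewer.org names from above:

```html
<!-- Page served from https://viewer.org/annotation-project.html -->
<!-- The user already has a session cookie for publisher.org. -->
<img src="https://publisher.org/image.jpg" alt="access-controlled image">
<!-- On a publisher.org page, this request carries the session cookie and the
     image loads. Here it is a cross-site request, so with third-party cookies
     blocked the cookie is omitted, publisher.org responds 401/403, and the
     image is broken. -->
```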
@michael-oneill wrote:
Apologies for being unclear - that was what I meant by "Imagine" ... those examples don't have cookies, but there are many very similar situations where cross-site cookies are needed, as @tomcrane explains above.
That is the Storage Access API. Have the image source offer a document you can load in an iframe where there’s a button to request storage access. When the user taps, the image source calls the Storage Access API and the browser will apply its logic for user permission and open up cookie access if the user allows. At this point, the iframe can let the top frame know that it can now load the images. One of the reasons why the top frame is not allowed to request storage access on behalf of partitioned sites is that there’s no appropriate time to do so. On page load would take us back to modal prompts as soon as you land on a page. Tapping/clicking somewhere on the page would be only marginally better. I believe we would very quickly get to sites asking the user for permission to allow cookies for tracking on page load or first tap.
It’s highly unlikely that sites will be allowed to intentionally gain any knowledge about cross-site resources’ unpartitioned state. If that were allowed, it could be used for fingerprinting similar to login fingerprinting (see “Disables Login Fingerprinting” in our blog post). It would have to be the partitioned resource provider inspecting its state in some form. How to prevent collusion to achieve login fingerprinting is still unresolved.
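A minimal sketch of the flow described above, with illustrative domain names, button text, and message shape:

```html
<!-- Document served by publisher.org, loaded in an iframe on viewer.org -->
<button id="grant">Use your publisher.org session on this site</button>
<script>
  document.getElementById('grant').addEventListener('click', async () => {
    try {
      // Must be called from a user gesture inside the publisher's iframe;
      // the browser applies its own permission logic (and may prompt).
      await document.requestStorageAccess();
      // Tell the embedding page it can now load the access-controlled images.
      window.parent.postMessage({ type: 'storage-access-granted' }, 'https://viewer.org');
    } catch (e) {
      window.parent.postMessage({ type: 'storage-access-denied' }, 'https://viewer.org');
    }
  });
</script>
```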
This sounds great if I'm reading it right - I was looking at the README which says:
The key thing in the above scenario would be that the top page knows it is OK to set the image `src`.
(PS - related topic, you might be interested in how we use an iframe to allow the image source to pass a token to the client application, for it to gain knowledge of the client's current access to the image source's resources. The client uses this token as a proxy for the actual credential - a cookie, usually - on API interactions. The token can't be used to access the protected resources (it is not itself a credential), but it can be used to probe API endpoints, which return the HTTP status that the protected resource would return if the user requested it with the credential represented by the token.) https://iiif.io/api/auth/1.0/#interaction-for-browser-based-client-applications
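A rough sketch of that probe pattern, simplified and partly assumed (the bearer-style Authorization header follows the linked IIIF Auth 1.0 interaction; the real spec defines the exact messages):

```js
// The client holds a token obtained from the publisher's iframe, never the cookie.
async function probeAccess(infoJsonUrl, token) {
  const response = await fetch(infoJsonUrl, {
    headers: { Authorization: `Bearer ${token}` } // the token, not a credential
  });
  // The status mirrors what the protected resource would return if requested
  // with the credential that the token represents (e.g. 200 vs 401).
  return response.status;
}
```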
We may still have references to per-frame access in the spec. However, we have switched to a per-page storage access model based on developer feedback and cross-browser discussions: #3. WebKit/Safari has per-page storage access out in current betas: https://webkit.org/blog/11545/updates-to-the-storage-access-api/ I believe both Firefox and Edge are already shipping with per-page storage access.
Yes. But I would point out that the iframe crucially provides an execution context where storage access can be requested. JavaScript running in the iframe is running in the origin for which the desired cookies are stored. That’s how the browser knows which site is requesting access.
I know it might not be as smooth a UI, but have you considered opening another window (i.e. a new tab) to the image-serving site with window.open, then posting suitable authentication information back to the window.opener with postMessage? The opened window could inform the user what is happening, etc.
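A minimal sketch of that suggestion (domain names and the message shape are illustrative):

```js
// On viewer.org:
const popup = window.open('https://publisher.org/connect?from=viewer.org');
window.addEventListener('message', (event) => {
  if (event.origin !== 'https://publisher.org') return;
  // event.data could carry a token or a simple "you are authorised" flag
  console.log('auth info from publisher:', event.data);
});

// On publisher.org, in the opened window, once the user is signed in:
//   window.opener.postMessage({ token: '...' }, 'https://viewer.org');
//   window.close();
```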
This is pretty much how our current flow works. This flow, while fiddly, continues to work (although a more formal way of doing this would be nice). The exchange of information is still going to be possible, but the simple HTTP requests (e.g., from an `<img>` element) will no longer carry the cookie. I think our next step is to prepare a demonstrator of the auth flow that includes the storage-access request, and see if we can get the flow working again in Safari using that.
But couldn't you encode the authentication in the src url? Keep it in a first-party cookie between sessions. |
The goals/assumptions of the spec included:
Encoding the auth information into the request URLs for the assets themselves would mean that libraries, museums, universities, etc. would need our auth protocol to protect content resource requests, instead of whatever protocol they use at the moment. With the current spec they don't need to do that; acquisition and format of credentials are external to the spec. The spec's protocol is about information leakage from the publisher to compliant viewers, which is a simpler thing to have running alongside SSO or whatever, for these particular image resources. But yes, you're right - without third-party cookies, the current spec would need to be replaced by something completely different: a specific auth protocol that encodes credentials into the request URL, rather than an adjacent protocol for revealing information about the user's ability to access a resource (but crucially, never credentials), independent of the auth protocol in use.
I understand. So the cookie method would work, as long as the third-party cookies were partitioned and not ephemeral?
So this is a lot to process - sorry if this has already been answered, but where does the requirement for supporting plain `<img>` elements come from? With iframes, institutions could continue using their access control backend for the images, in addition to a JS file that runs in the iframe document to check for storage access, display UI to notify the user, etc.
To enable rich, dynamic interactions with the content via systems like OpenSeadragon, where images from multiple publishers can be composited together into a single scene. OpenLayers is another product with similar functionality, but focused more on maps and overlays (where the overlays and/or images could come from different publishers). Or just to embed them in blog posts (or wherever) without having to deal with iframes :)
Huh, okay, I get that first point, thanks, but I'm somewhat skeptical that in the latter case this would really be more work for the embedder. In the end, is there a big difference between copy-pasting (or generating) an iframe vs. an image element into my HTML? (Without having domain knowledge of your field) I would suspect that there is an opportunity to cover most use cases you're describing (except ones that need access to the raw image file) through "standardized" embedded iframes that use the Storage Access API. I guess my question is whether what you describe in the above comment is very common, or if, say, 80% of embeds could safely be replaced by simple iframes.
The dynamic interactions are 99.many9s% of the usage, especially when it comes to authenticated / authorized access. The embedding is certainly the 0.many0s1% case :)
The real-world user experience of IIIF is usually in a viewer like these:
The third-party cookies issue is the same in the simple scenario of an HTML `<img>` element. Crucially, whether for simple images or deep zoom, the client application is in charge of the user experience - where and how the images are displayed and laid out, what size(s) they are, etc. For example: on the left I have zoomed into van Gogh's eye; the client viewer application on projectmirador.org is causing the browser to make many (hundreds, even) of image tile requests to iiif.harvardartmuseums.org.
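For a sense of what the client is doing, a sketch of a deep-zoom setup along these lines (the info.json URL is a placeholder, and the credential-related option names are from OpenSeadragon's documented API - worth verifying against the version in use):

```js
// Illustrative deep-zoom viewer on viewer.org pulling tiles from a publisher.
const viewer = OpenSeadragon({
  id: 'viewer',                 // id of a <div> on the viewer.org page
  tileSources: 'https://images.publisher.example/iiif/some-image/info.json', // placeholder
  loadTilesWithAjax: true,
  ajaxWithCredentials: true     // each of the many tile requests needs the publisher's cookie
});
```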
Hi there, thanks for adding this to the agenda! We have a little more material. One is a sequence diagram that shows the (pre-Storage-Access-API, assuming third-party cookies) flow. It's a fairly conceptual diagram, rather verbose, but I hope it conveys our IIIF Auth spec.

We also have stripped-down versions of client and server implementations. Usually the client and server are more complex, but here they just provide a single image. This is not an implementation of the spec; it dispenses with everything that isn't relevant to exploring this problem. https://tomcrane.github.io/iiif-auth-client/

There are three versions of the server. One makes no nod to storage access - the crucial point is that an image such as https://iiif-auth-server.herokuapp.com/img/open.jpg has a service description: https://iiif-auth-server.herokuapp.com/img/open.jpg/info.json The other two versions are work-in-progress experiments with the Storage Access API. When hasStorageAccess is false, we find requestStorageAccess will immediately fail if the user has had no interaction with the publisher site before (we can't even ask for it).

One question in this thread is whether we couldn't just show the images in the publisher's iframe. The simplicity of this demo doesn't help explain why that isn't sufficient, but I hope the more complex examples linked earlier do demonstrate this. We are open to ANY approach / flow / interaction patterns that meet the use cases, whether they are refinements of the existing flow or even if they are radically different. If they can be based on web standards that accommodate the use cases, rather than playing catch-up with evolving third-party cookie policy in browsers, then we will have a simpler and more stable spec.
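A sketch of the check described in the experiment above, as it would run in the publisher's iframe document:

```js
async function ensureCookieAccess() {
  if (await document.hasStorageAccess()) return true;
  try {
    // Must be called from a user gesture. As noted above, this rejects
    // immediately if the user has never interacted with the publisher's site
    // in a first-party context - the browser won't even show a prompt.
    await document.requestStorageAccess();
    return true;
  } catch (e) {
    // Fall back to sending the user to the publisher in a top-level context first.
    return false;
  }
}
```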
To try to summarise, independently of APIs and specific techniques... Our spec depends on, in a cross-context setting:
...regardless of the access control mechanism protecting that resource (assuming, though, that it depends on cookies).
@tomcrane Hello again, we were recently revisiting this issue and wondering if you had any updates to share - were you able to work around your issues? If not, I noticed that @bvandersloot-mozilla recently published a proposal for non-iframe / popup auth Storage Access which seems like it fits nicely onto your scenario: https://github.com/bvandersloot-mozilla/top-level-storage-access Any thoughts? :) Thanks!
@johannhof Apologies for missing this notification. We recently published an update to our specification which gets round the current issues of whether a browser sends third-party cookies - e.g., by ensuring that the user has established a trust relationship with the publisher by having performed a significant gesture at the publisher in a first-party context. This also anticipates future uncertainty by not assuming that cookies are being used; we use the term authorizing aspect to describe the aspect of the HTTP request that grants access (which may be something ambient, like IP address). However, it doesn't get round the fundamental problem described above, which is encountered especially in this cultural heritage context: Site A at site-a.com simply wants to show an access-controlled image hosted by Site B at site-b.com, and that image request needs Site B's cookie to succeed.
This recent post brings this into focus: none of the remediations really apply to this very simplest expression of the problem above. Site B, the publisher, should not need any knowledge of Site A, or vice versa - no registration, lists of permitted third-party contexts, etc. Of course, this very simplest expression of the problem uses the same mechanism as tracking cookies. But we had hoped that the Storage Access API approach - separating out "good" third-party cookies from "bad" - would allow continued support for this most simple of cross-domain content reuse, even if that means the user granting permission by some explicit interaction, as well as or instead of the implicit permission granted by the user having performed a significant gesture.

Returning to https://github.com/bvandersloot-mozilla/top-level-storage-access - this certainly looks very promising, as it reintroduces the notion of asking for permission: site-a can ask "is it OK for me to send third-party cookies to site-b.com?" It introduces a symmetrical requirement, that the browser, while on site B, can be told that this is OK too. And we have a workflow that takes the user to site B in a first-party context at some point. I will need some more time to digest all this but wanted to get a response in as soon as I noticed your comment - and many thanks again for considering the use case. Where is the correct place to comment on and ask questions about this proposal?
Hi @tomcrane - I've published a proposal for HTTP request/response headers that may be helpful here: https://github.com/cfredric/storage-access-headers Please take a look and let me know if the proposal would be helpful for this, and/or how it could be improved! I acknowledge that it doesn't fully solve the problem presented here, but it might be a step forward anyway. (I also used the IIIF use case as a motivating example in my explainer; I'm happy to remove that or tweak it if the proposal isn't actually helpful for you.) |
Thank you @johnwilander for pointing us at the work of this group.
Our motivating use case is:
The user is logged in to Image Publisher with domain name `image-publisher.example` and is now visiting `viewer.example`, which allows users to view images from multiple image publishers using the accounts they have with those publishers. The user taps/clicks on a link to an image from `image-publisher.example` in order to view it at full resolution, which requires authenticated access via cookies. The onclick event handler in the `viewer.example` application calls the Storage Access API to request the cookie access needed to authenticate the user cross-site to `image-publisher.example`. The user has not used `viewer.example` before and thus gets prompted to allow or disallow storage access, decides to allow storage access, and the browser retrieves the full image from `image-publisher.example`, rendered to the user via an `<img>` tag (or `<canvas>`, or similar) in the `viewer.example` page.

This use case is further described in this 3 minute read.
Differences from current storage-access use cases
In current storage-access use cases, the publisher's code (in an iframe) is asking permission for the browser to send cookies to it (the publisher); whereas here the application page is asking permission for the browser to send cookies to the publisher. The application doesn't need access to the cookies, just that they be sent to the publisher to establish the identity of the user.
API
(staying as closely as possible to the existing example but acknowledging that it might be a different API)
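A hypothetical sketch only, to make the shape concrete - this is not an existing API; the `requestStorageAccessFor` call borrows from the top-level storage access proposal discussed earlier, and the `crossorigin` handling is an assumption:

```js
// Hypothetical sketch: the concrete API is exactly what this use case asks for.
async function onImageLinkClick(imageUrl) {
  try {
    // Runs in the top-level viewer.example page, inside the user's tap/click
    // handler. Asks the browser to send the user's image-publisher.example
    // cookies with cross-site requests made by this page; the browser may
    // prompt the user before resolving.
    await document.requestStorageAccessFor('https://image-publisher.example');
  } catch (e) {
    return; // the user declined, or the browser refused the request
  }
  const img = document.createElement('img');
  img.crossOrigin = 'use-credentials'; // assumption: a credentialed request is needed
  img.src = imageUrl;                  // the full-resolution, access-controlled image
  document.body.appendChild(img);
}
```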
Here the code belongs to the viewing application; the `document` is the viewer itself, rather than an embedded iframe from the publisher.