WICG/cross-origin-storage

Explainer for the Cross-Origin Storage (COS) API

Cross-Origin Storage (COS) logo, consisting of a folder icon with a crossing person.

This proposal outlines the design of the Cross-Origin Storage (COS) API, which allows web applications to store and retrieve files across different origins. Building on the File System Living Standard defined by the WHATWG, the COS API facilitates secure cross-origin file storage and retrieval for large assets, such as AI models, WebAssembly (Wasm) modules, and highly popular JavaScript libraries. Taking inspiration from Cache Digests for HTTP/2, the API identifies files by their hashes to ensure integrity.

Tip

Evaluate this proposal

While this API is not yet natively implemented in browsers, you can experiment with the proposed surface today. Install the Cross-Origin Storage extension to inject the navigator.crossOriginStorage polyfill on all pages and test the complete flow. See the source code of the extension and read the instructions for how to test it.

Authors

Participate

The Cross-Origin Storage (COS) API provides a secure mechanism for web applications to store and retrieve large files across different origins. This allows applications to share common assets—such as AI models, Wasm modules, and popular JavaScript libraries—without redundant downloads. Resources are identified by their cryptographic hashes to ensure data integrity. The API reuses concepts like FileSystemFileHandle from the File System Living Standard, specifically tailored for cross-origin scenarios. The following example demonstrates the basic flow for retrieving a file:

// Assumes this snippet runs inside an async function.
// The hash of the desired file.
const hash = {
  algorithm: 'SHA-256',
  value: '8f434346648f6b96df89dda901c5176b10a6d83961dd3c1ac88b59b2dc327aa4',
};
try {
  const [handle] = await navigator.crossOriginStorage.requestFileHandles([hash]);
  // The file exists in Cross-Origin Storage.
  const fileBlob = await handle.getFile();
  // Do something with the blob.
} catch (err) {
  if (err.name === 'NotAllowedError') {
    console.log('The user agent did not grant permission to access the file.');
  } else if (err.name === 'NotFoundError') {
    console.log('The file was not found in Cross-Origin Storage.');
  }
  return;
}

Risk awareness

Caution

The authors acknowledge that storage is usually isolated by origin to safeguard user security and privacy. Storing large resources such as AI models separately for each origin, as required by new use cases, presents a significant scalability and efficiency challenge. For instance, if both example.com and example.org each require the same 8 GB AI model, this would result in 16 GB of downloaded data and a total allocation of 16 GB on the user's device. This proposal introduces mechanisms that uphold protection standards while addressing the inefficiencies of duplicated downloads and storage.

Goals

COS aims to:

  • Provide a cross-origin storage mechanism for web applications to store and retrieve large files such as AI models, Wasm modules, and highly popular JavaScript libraries.
  • Guarantee data integrity and consistency for file identification (see Appendix B).
  • Make the web more sustainable and ethical by reducing redundant downloads of large resources the user agent may already have stored locally.

Non-goals

COS does not aim to:

  • Replace existing storage solutions such as the Origin Private File System, the Cache API, IndexedDB, or Web Storage.
  • Replace content delivery networks (CDNs).
  • Allow cross-origin file access without the possibility for the user agent to intervene.
  • Modify or supersede the same-origin policy.

User research

Feedback from developers working with large AI models, Wasm modules, and highly popular JavaScript libraries has highlighted the need for an efficient way to store and retrieve such large files across web applications on different origins. These developers are looking for a standardized solution that allows files to be stored once and accessed by multiple applications, without needing to download and store the files redundantly. COS ensures this is possible while maintaining privacy and security.

User needs example: Hugging Face

Joshua Lochner (aka Xenova) from Hugging Face said the following in his talk at the 2024 Chrome Web AI Summit:

"One can imagine a browser-based web store for models similar to the Chrome Web Store for extensions. From the user's perspective, they could search for web-compatible models on the Hugging Face hub, install it with a single click, and then access it across multiple domains. Currently, Transformers.js is limited in this regard, since models are cached on a per site or per extension basis."

User needs example: Web Machine Learning Working Group

Participants of the Web Machine Learning Working Group at the W3C in their meeting on September 21, 2023, discussed Storage APIs for caching large models. A proposal named Hybrid AI Explorations listed the following open issues:

"If the model runs on the client, large models need to be downloaded, possibly multiple times in different contexts. This incurs a startup latency."

"Models are large and can consume significant storage on the client, which needs to be managed."

This led to the creation of a dedicated Hybrid AI explainer, which in its introduction states:

"For example, ML models are large. This creates network cost, transfer time, and storage problems. As mentioned, client capabilities can vary. This creates adaptation, partitioning, and versioning problems. We would like to discuss potential solutions to these problems, such as shared caches, progressive model updates, and capability/requirements negotiation."

User needs example: Mozilla

In their standards position on the Writing Assistance APIs, Mozilla engineer Brian Grinstead wrote:

"We acknowledge a downside with this approach related to lack of shared client storage for model weights — it would be a better experience if the browser only had to download large weights one time. We don’t know of a privacy-preserving way to do this, short of high level APIs like these which abstract away the details of inference."

Use cases

Use case 1: Large AI models

Developers working with large AI models can store these models once and access them across multiple web applications. With the COS API, models are stored and retrieved based on their hashes, which minimizes repeated downloads and duplicate storage while ensuring file integrity. An example is Google's Gemma 2 model g-2b-it-gpu-int4.bin (1.35 GB). Another example is Google's Gemma 1.1 7B model gemma-1.1-7b-it (8.60 GB), which can be run in the browser. Yet another example is the Llama-3.1-70B-Instruct-q3f16_1-MLC model (33 GB), which likewise runs in the browser (choose the "Llama 3.1 70B Instruct" model in the picker).

Use case 2: Large Wasm modules

Web applications that utilize large Wasm modules can store these modules using COS and access them across different origins. This enables efficient sharing of files between applications, reducing redundant downloading and improving performance. A notable example is Google's Flutter framework, which uses several Wasm files that are requested millions of times daily across thousands of hosts:

Request (https://gstatic.com/flutter-canvaskit/) | Size | Hosts | Requests
36335019a8eab588c3c2ea783c618d90505be233/chromium/canvaskit.wasm | 5.1 MB | 1,938 | 596,900
a18df97ca57a249df5d8d68cd0820600223ce262/chromium/canvaskit.wasm | 5.1 MB | 1,586 | 579,380
36335019a8eab588c3c2ea783c618d90505be233/canvaskit.wasm | 6.4 MB | 1,142 | 597,240
a18df97ca57a249df5d8d68cd0820600223ce262/canvaskit.wasm | 6.4 MB | 1,014 | 288,800

(Source: Google-internal data from the Flutter team: "Flutter engine assets by unique hosts - one day - Dec 10, 2024".)

Use case 3: Highly popular JavaScript libraries and frameworks

Traditionally, bundlers have combined vendor code and user code, leading to low cache hit rates even before the regular HTTP cache was isolated. By bundling vendor code separately and in its entirety (for example, the complete React library) instead of using dead-code elimination, developers can ensure a higher cache hit rate. Storing such files once with the COS API allows multiple web apps to share the same highly popular libraries.

Use case 4: Game engines

Web games built with game engines that have browser support, such as Godot or Unity, can store the core game engine code in COS and load only game-specific assets, such as textures and game logic, from the network. Web gaming portals such as WebGamer, which host many casual games with a short path to gameplay in different cross-origin iframes, can benefit greatly from this.

Potential solution

File Storage Process

The COS API will be available through the navigator.crossOriginStorage interface. Files will be stored and retrieved based on their hashes, ensuring that each file is uniquely identified.

Storing files

  1. Hash the contents of the files using SHA-256 (or an equivalent secure algorithm, see Appendix B). The hash algorithm used is communicated as a valid HashAlgorithmIdentifier.
  2. Request a sequence of FileSystemFileHandle objects for the files, specifying the files' hashes.
  3. Write the files' data to the FileSystemFileHandle objects and store them in Cross-Origin Storage.
Example: Storing a single file
/**
 * Example usage to store a single file.
 */

// The hash of the desired file.
const hash = {
  algorithm: 'SHA-256',
  value: '8f434346648f6b96df89dda901c5176b10a6d83961dd3c1ac88b59b2dc327aa4',
};

// First, check if the file is already in COS.
try {
  const [handle] = await navigator.crossOriginStorage.requestFileHandles([
    hash,
  ]);
  // The file exists in COS.
  const fileBlob = await handle.getFile();
  // Do something with the blob.
  console.log('Retrieved', fileBlob);
  return;
} catch (err) {
  // If the file wasn't in COS, load it from the network and store it in COS.
  if (err.name === 'NotFoundError') {
    // Load the file from the network.
    const fileBlob = await loadFileFromNetwork();
    try {
      const [handle] = await navigator.crossOriginStorage.requestFileHandles(
        [hash],
        {
          create: true,
          // Optional: Only allow these origins to read the file.
          origins: ['https://example.com', 'https://example.org'],
        },
      );
      const writableStream = await handle.createWritable();
      await writableStream.write(fileBlob);
      await writableStream.close();
    } catch (err) {
      // The `write()` failed.
    }
    return;
  }
  // 'NotAllowedError', the user agent did not grant access to the file.
  console.log('The user agent did not grant access to the file.');
}
Example: Restricting resources to specific origins

The origins field is useful for sharing resources between a set of related origins without making them globally available. This option is recommended for proprietary resources or resources for which global COS cache hits are not anticipated. For example, if a company has two related sites, write.example.com and calculate.example.com, that both use the same AI model for proofreading, they can store the model in COS and restrict access to just these two origins. This way, the model is not globally available to all sites that use COS, but only to the two related sites that need it.

// The hash of an AI model for proofreading.
const hash = {
  algorithm: 'SHA-256',
  value: '8f434346648f6b96df89dda901c5176b10a6d83961dd3c1ac88b59b2dc327aa4',
};

// Site `write.example.com` stores the model and restricts it to itself and
// `calculate.example.com`.
const [handle] = await navigator.crossOriginStorage.requestFileHandles([hash], {
  create: true,
  origins: ['https://calculate.example.com', 'https://write.example.com'],
});

// Write the file…

// Now, `calculate.example.com` can request the same hash and it will be found.
// Any other origin NOT in the list (e.g., `https://unrelated.com`) will receive
// a `NotFoundError` when requesting this hash, even if it's stored in COS.
Example: Making a resource globally available

By specifying origins: '*' when storing a file, the file becomes globally available to all origins that use COS. This option is appropriate for widely used resources that many sites are likely to share, such as popular AI models, Wasm modules, or JavaScript libraries. This is an explicit opt-in to avoid developers accidentally making resources globally available, which could lead to cross-site leaks.

// The hash of a very common AI model.
const hash = {
  algorithm: 'SHA-256',
  value: '8f434346648f6b96df89dda901c5176b10a6d83961dd3c1ac88b59b2dc327aa4',
};

const [handle] = await navigator.crossOriginStorage.requestFileHandles([hash], {
  create: true,
  origins: '*',
});

// Write the file…

// Now, any origin can request the same hash and it will be found.
Example: Making a resource available to Same-Site origins only

By omitting the origins option altogether when storing a file, the file becomes available only to Same-Site origins that use COS. This is a good option for resources that are expected to be shared across multiple subdomains of the same site, but not across completely unrelated sites.

// The hash of a company's proprietary AI model.
const hash = {
  algorithm: 'SHA-256',
  value: '8f434346648f6b96df89dda901c5176b10a6d83961dd3c1ac88b59b2dc327aa4',
};

const [handle] = await navigator.crossOriginStorage.requestFileHandles([hash], {
  create: true,
});

// Write the file…

// Now, any Same-Site origin can request the same hash and it will be found.
Resource visibility upgrades

The visibility of a resource in COS can be upgraded but never downgraded:

  • Restricted to more permissive: If a resource was initially stored with an origins list, any site (including the original storer or a completely different site) can later call requestFileHandles() for the same hash with create: true and change the origins field to a more permissive value. If the user agent verifies the hash matches, the resource is then marked as available according to the new origins value. The new site must still write the full file using the returned FileSystemFileHandle object, to prevent sites from using this behavior to detect whether a file was previously stored.
  • Permissive to more restricted: If a resource is already permissively available in COS, any attempt to store it again with a more restrictive origins value is ignored. The resource keeps its existing, more permissive visibility, and the user agent should log a warning to the console to inform the developer that the restriction was not applied.
Example: Storing multiple files
/**
 * Example usage to store multiple files.
 */

// The hashes of the desired files.
const hashes = [{
  algorithm: 'SHA-256',
  value: '8f434346648f6b96df89dda901c5176b10a6d83961dd3c1ac88b59b2dc327aa4',
}, {
  algorithm: 'SHA-256',
  value: 'ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad',
}];

// First, check if the files are already in COS.
try {
  const handles = await navigator.crossOriginStorage.requestFileHandles(hashes);
  // The files exist in COS.
  for (const handle of handles) {
    const fileBlob = await handle.getFile();
    // Do something with the blob.
    console.log('Retrieved', fileBlob);
  }
  return;
} catch (err) {
  // If the files weren't in COS, load them from the network and store them in
  // COS. The method throws a `NotFoundError` `DOMException` if _any_ of the
  // files is not found.
  if (err.name === 'NotFoundError') {
    try {
      // Load the files from the network.
      const fileBlobs = await loadFilesFromNetwork();
      const handles = await navigator.crossOriginStorage.requestFileHandles(
        hashes,
        {
          create: true,
        },
      );
      // Use a `for...of` loop so that `await` is valid inside the loop body.
      for (const [i, handle] of handles.entries()) {
        const writableStream = await handle.createWritable();
        await writableStream.write(fileBlobs[i]);
        await writableStream.close();
      }
    } catch (err) {
      // The `write()` failed.
    }
    return;
  }
  // 'NotAllowedError', the user agent did not grant access to the file.
  console.log('The user agent did not grant access to the file.');
}

Retrieving files

  1. Request a sequence of FileSystemFileHandle objects for the files, specifying the files' hashes.
  2. Check if each resource exists in COS and make sure it can be shared without causing privacy issues.
  3. Retrieve the sequence of FileSystemFileHandle objects after the user agent has granted access.

Note

A NotFoundError DOMException does not necessarily mean the file is absent from COS. User agents may suppress availability of a file for privacy reasons (see Availability gating). Callers should handle NotFoundError by falling back to a network fetch, regardless of the cause.

Example: Retrieving a single file
/**
 * Example usage to retrieve a single file.
 */

// The hash of the desired file.
const hash = {
  algorithm: 'SHA-256',
  value: '8f434346648f6b96df89dda901c5176b10a6d83961dd3c1ac88b59b2dc327aa4',
};

try {
  const [handle] = await navigator.crossOriginStorage.requestFileHandles([
    hash,
  ]);
  // The file exists in COS.
  const fileBlob = await handle.getFile();
  console.log('Retrieved file', fileBlob);
  // Do something with the blob.
} catch (err) {
  if (err.name === 'NotFoundError') {
    // Load the file from the network.
    const fileBlob = await loadFileFromNetwork();
    // Return the file as a Blob.
    console.log('Obtained file from network', fileBlob);
    return;
  }
  // 'NotAllowedError', the user agent did not grant access to the file.
  console.log('The user agent did not grant access to the file.');
}
Example: Retrieving multiple files
/**
 * Example usage to retrieve multiple files.
 */

// The hashes of the desired files.
const hashes = [
  {
    algorithm: 'SHA-256',
    value: '8f434346648f6b96df89dda901c5176b10a6d83961dd3c1ac88b59b2dc327aa4',
  },
  {
    algorithm: 'SHA-256',
    value: 'ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad',
  },
];

try {
  const handles = await navigator.crossOriginStorage.requestFileHandles(hashes);
  // The files exist in COS.
  for (const handle of handles) {
    const fileBlob = await handle.getFile();
    // Do something with the blob.
    console.log('Retrieved file', fileBlob);
  }
} catch (err) {
  if (err.name === 'NotFoundError') {
    // Load the files from the network.
    const fileBlobs = await loadFilesFromNetwork();
    // Do something with the blobs.
    console.log('Obtained files from network', fileBlobs);
    return;
  }
  // 'NotAllowedError', the user agent did not grant access to the files.
  console.log('The user agent did not grant access to the files.');
}

Storing and retrieving a file across unrelated sites

To illustrate the capabilities of the COS API, consider the following example where two unrelated sites want to interact with the same common large language model. The first site stores the model in COS and makes it globally available, while the second site retrieves it.

Site A: Storing a large language model

On Site A, a web application stores a large language model in COS.

// The hash of the desired file.
const hash = {
  algorithm: 'SHA-256',
  value: '8f434346648f6b96df89dda901c5176b10a6d83961dd3c1ac88b59b2dc327aa4',
};

try {
  const [handle] = await navigator.crossOriginStorage.requestFileHandles([
    hash,
  ]);

  // Use the file and return.
  // …
  return;
} catch (err) {
  if (err.name === 'NotFoundError') {
    // Load the file from the network.
    const fileBlob = await loadFileFromNetwork();
    // Compute the control hash using the method in Appendix B.
    const controlHash = await getBlobHash(fileBlob);
    // Check if control hash and known hash are the same.
    if (controlHash !== hash.value) {
      // Downloaded file and desired file are different.
      // …
      return;
    }
    try {
      const [handle] = await navigator.crossOriginStorage.requestFileHandles(
        [hash],
        {
          create: true,
          origins: '*', // Make the file globally available.
        },
      );
      const writableStream = await handle.createWritable();
      await writableStream.write(fileBlob);
      await writableStream.close();

      console.log('File stored.');
    } catch (err) {
      // The `write()` failed.
    }
    return;
  }
  // 'NotAllowedError', the user agent did not grant access to the file.
  console.log('The user agent did not grant access to the file.');
}
Site B: Retrieving the same model

On Site B, entirely unrelated to Site A, a different web application retrieves the same popular model from COS.

// The hash of the desired file.
const hash = {
  algorithm: 'SHA-256',
  value: '8f434346648f6b96df89dda901c5176b10a6d83961dd3c1ac88b59b2dc327aa4',
};

try {
  const [handle] = await navigator.crossOriginStorage.requestFileHandles([
    hash,
  ]);
  const fileBlob = await handle.getFile();
  console.log('File retrieved', fileBlob);
  // Use the fileBlob as needed.
} catch (err) {
  if (err.name === 'NotFoundError') {
    // The file wasn't available from COS (it may be absent, or the user agent
    // may have withheld it for privacy reasons).
    console.error(err.name, err.message);
    return;
  }
  // 'NotAllowedError', the user agent did not grant access to the file.
  console.log('The user agent did not grant access to the file.');
}
Key points
  • Unrelated sites: The two sites belong to different origins and do not share any context, ensuring the example demonstrates cross-origin capabilities.
  • Strictly opt-in: Site A explicitly opts in to make the file globally available by setting origins: '*' when storing the file. This ensures that the file is not accidentally made available to all sites.
  • Cross-origin sharing: Despite the different origins, the files are securely identified by their hashes, demonstrating the API's ability to facilitate cross-origin file storage and retrieval.

Detailed design discussion

Hashing

The current hashing algorithm is SHA-256, implemented by the Web Crypto API. If hashing best practices should change, COS will reflect the implementers' recommendation in the Web Crypto API.

Each hash object in the hashes array names its hashing algorithm in its algorithm field, expressed as a HashAlgorithmIdentifier. This flexible design allows the hashing algorithm to be changed in the future.

const hashes = [
  {
    algorithm: 'SHA-256',
    value: '8f434346648f6b96df89dda901c5176b10a6d83961dd3c1ac88b59b2dc327aa4',
  },
];

Web sustainability

In the context of evaluating carbon emissions in digital data usage, current methodologies predominantly utilize a kilowatt-hour (kWh) per gigabyte (GB) framework to estimate the operational energy intensity of data transmission and storage. This approach provides the following energy consumption benchmarks:

  • Network transmission: 0.013 kWh/GB
  • User devices: 0.081 kWh/GB

While this document does not aim to critically assess the precision of these estimates, it is an established principle that minimizing redundant data downloads and storage is inherently beneficial for sustainability. The Ethical Web Principles specifically highlight that the Web "is an environmentally sustainable platform" and suggest "lowering carbon emissions by minimizing data storage and processing requirements" as measures to achieve this. Consequently, one of the key objectives of the COS API is to enhance Web sustainability by reducing redundant large file downloads when such files are possibly already stored locally on the user's device.
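
To make the benchmarks above concrete, the following sketch multiplies them by a file size. It is a back-of-the-envelope illustration only: the function name estimateEnergyKWh is hypothetical, and it assumes that each benchmark scales linearly with the number of gigabytes, which the cited methodology may not guarantee.

```javascript
// Benchmarks quoted above, in kWh per GB.
const NETWORK_KWH_PER_GB = 0.013;
const DEVICE_KWH_PER_GB = 0.081;

// Hypothetical helper: rough energy estimate for a given amount of data,
// assuming energy use scales linearly with the number of gigabytes.
function estimateEnergyKWh(gigabytes) {
  return {
    network: gigabytes * NETWORK_KWH_PER_GB,
    device: gigabytes * DEVICE_KWH_PER_GB,
  };
}
```

Under these assumptions, avoiding one redundant download of an 8 GB model corresponds to roughly 0.104 kWh for network transmission (8 × 0.013) and 0.648 kWh for user devices (8 × 0.081).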

Important

AI has undeniable implications for sustainability efforts. It's essential to adhere to the Web Sustainability Guidelines when integrating AI solutions. Before implementing AI, it's recommended to assess and research visitor needs to ensure that AI is a justifiable and effective solution that truly improves the experience, for example, by increasing the privacy of video calls through AI-based background blurring.

Open questions

Concurrency

What should happen if two tabs depend on the same file, check COS, see that the file is not there, and both start downloading it? Should this be handled more efficiently? How often does this happen in practice? This proposal does not address this case. In the worst case, the file is downloaded twice but stored only once in COS, which is considered an acceptable outcome.

Partial COS matches

If the developer wants to check whether two files A and B, with the hashes hash_A and hash_B, are stored in COS, but only one of the two is stored, the API still fails with a NotFoundError DOMException without revealing the partial match. The current position is that revealing partial matches would complicate error handling, particularly since the expected use cases commonly require all files to be present simultaneously, for example, the tokenizer, configuration files, weights, and graph for an AI model. Partial-match disclosure is also undesirable from a privacy perspective, as it would enable limited enumeration of COS contents.

As an alternative, the developer can always check for each file separately if it is stored in COS. This way, the developer can handle partial matches as they see fit, for example, by only downloading the missing files from the network.
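
The per-file check described above could be sketched as follows. The helper name partitionByAvailability is hypothetical and not part of the proposal. The storage parameter stands in for navigator.crossOriginStorage so that the function can also be exercised with a stub.

```javascript
// Hypothetical helper: requests each hash individually so that the caller
// learns which files are available and which ones still need to be fetched.
// `storage` stands in for `navigator.crossOriginStorage`.
async function partitionByAvailability(storage, hashes) {
  const found = [];
  const missing = [];
  for (const hash of hashes) {
    try {
      const [handle] = await storage.requestFileHandles([hash]);
      found.push({ hash, handle });
    } catch (err) {
      if (err.name !== 'NotFoundError') {
        // For example 'NotAllowedError': let the caller decide what to do.
        throw err;
      }
      // Not available from COS: the caller should load it from the network.
      missing.push(hash);
    }
  }
  return { found, missing };
}
```

The caller can then download only the entries in missing. Note that this issues one requestFileHandles() call per file, and user agents may count each call as a separate probe.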

Minimum file size

Should there be a required minimum file size for a file to be eligible for COS? No minimum file size is proposed. It would be trivial to inflate a file's size to meet any such threshold, for example by appending padding bytes or comments.

Handling of eviction

Under critical storage pressure, user agents could offer a dialog that invites the user to manually free up storage. The user agent could also delete files automatically based on, for example, a least recently used approach.
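
A least recently used selection could look roughly like the sketch below. It only illustrates the idea from a user agent's perspective: the function name selectFilesToEvict and the shape of the file records (hash, size, lastAccessed) are assumptions made for this example, not something the proposal defines.

```javascript
// Hypothetical user-agent-side sketch: picks files to evict, least recently
// used first, until at least `bytesNeeded` bytes would be freed.
// Each record is assumed to look like { hash, size, lastAccessed }.
function selectFilesToEvict(files, bytesNeeded) {
  // Copy before sorting so that the caller's array is left untouched.
  const byOldestAccess = [...files].sort(
    (a, b) => a.lastAccessed - b.lastAccessed,
  );
  const toEvict = [];
  let freed = 0;
  for (const file of byOldestAccess) {
    if (freed >= bytesNeeded) break;
    toEvict.push(file);
    freed += file.size;
  }
  return toEvict;
}
```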

User agents are further expected to provide settings UI through which users can inspect which files are stored in COS and which origins have most or least recently accessed each file. Users may then choose to delete files from COS through this UI.

When the user clears site data, all usage information associated with the origin should be removed from files in COS. If a file in COS, after the removal of usage information, is deemed unused, the user agent may delete it from COS.

Manual COS management

If a user has already manually downloaded a file such as a large AI model, should the user agent offer a way to let the user put the file in COS? This could be an affordance provided by the user agent.

Considered alternatives

Adding a description for each file apart from the hash

To facilitate manual COS management, one approach would be to allow developers to store a human-readable description alongside the resource. Apps could refer to the same file, identified by a unique hash, using different descriptions. For example, an English site could refer to the g-2b-it-gpu-int4.bin AI model as "Gemma AI model from Google", whereas a Spanish site could refer to it as "modelo de IA grande de Google". Instead, we envision that user agents enrich the COS management UI based on the hashes. For example, a user agent could know that a file identified by a given hash is a well-known AI model and optionally surface this information to the user in the user agent settings UI.

Storing files without hashing

Storing files by their names rather than using hashes would risk name collisions, especially in a cross-origin environment. The use of hashes guarantees unique identification of each file, ensuring that the contents are consistently recognized and retrieved. Storing files based on their URLs would work if apps reference the same URLs, for example, on the same CDN, but wouldn't work if apps reference the same file stored at different locations.

Manually accessing files from a local disk

Different origins can manually open the same file on disk, either using the File System Access API's showOpenFilePicker() method or using the classic <input type="file"> approach. This requires the file to be stored once, and access to the file can then be shared as explained in Cache AI models in the browser. While this works, it's manual and error-prone, as it requires the user to know what file to choose from their hard drive in the file picker.

Integrating cross-origin storage in the fetch() API

On the server, cross-origin isolation is not really a problem. At the same time, server runtimes like Node.js, Bun, or Deno implement fetch() as well. To avoid fragmentation and to keep the present fetch() API simple, it does not make sense to add COS to fetch(). Since fetch() is URL-based, this would also not solve the case where the same file is stored at different locations.

Integrating cross-origin storage in the Cache API

The Cache API is fundamentally modeled around Request objects or URL strings and Response objects, as in Cache.match() or Cache.put(). In contrast, what makes COS unique is that it uses file hashes as the keys to files to avoid duplicates.

Solving the problem only for AI models

AI models are admittedly the biggest motivation for working on COS, so one alternative would be to solve the problem exclusively for AI models. This raises the question of how it could be enforced that stored files actually are AI models. Given this open question, this approach does not seem like a good fit, and the non-AI use cases are well worth addressing, too.

Additionally, common AI inference solutions like Transformers.js rely on WebAssembly in the underlying ONNX Runtime, regardless of whether the backend is WebGPU or Wasm. The same applies to MediaPipe, which requires Wasm files as so-called WasmFileset objects for its various MediaPipe Tasks APIs.

Security and privacy considerations

See the complete questionnaire for details.

Security considerations

Access is scoped to individual files, each identified by its hash. Developers cannot access arbitrary files or obtain the complete list of resources in COS, which keeps access control limited and precise. Files are uniquely identified by their cryptographic hashes (for example, SHA-256), ensuring data integrity. Hashes prevent tampering with the file contents: a site can be sure it gets the same contents from COS as if it had downloaded the file itself, because COS guarantees that each file's contents match its hash. For enhanced protection, user agents can check file hashes against virus databases like VirusTotal, and integrate with in-browser security features like Safe Browsing even before storing a file.

User agents are expected to provide settings UI for managing COS files, showing stored files and their associated origins. Users can manually evict files or clear all COS data, maintaining control over their storage.

User agents are expected to enrich settings UI based on the file hashes. For example, a user agent could know that a file identified by a given hash is a well-known AI model and optionally surface this information to the user in the settings UI.

Privacy considerations

User agents are expected to make this API available only in contexts where third-party cookies are enabled.

Cross-site probing

If a file is only used on certain kinds of websites, an attacker can discover that the user visited those sites by checking for the file's presence. For example, if someone has a game engine stored in COS, they probably play games on the web, which an attacker might exploit, for example, for targeted advertising. The attacker site would need to probe hashes of resources it's interested in. The origins field mitigates this risk by allowing origins to restrict resource access to a specific set of trusted origins, ensuring the resource is not globally "probeable". Sites are expected to use this field for proprietary resources or when global COS cache hits are not expected.

Beyond the origins field, user agents apply availability gating as a second line of defense: even for globally available resources, the user agent may decline to confirm a file's presence if the resource has not been encountered on a sufficient number of distinct origins.

User agents are expected to implement safeguards against such attacks, for example, by limiting the number of probes, or by returning false negatives when a site known to be malicious is probing. Each call to requestFileHandles() can be considered a probe, independent of the number of files requested, and user agents can limit the number of probes per site or even block probes from sites known to be malicious. Counting calls with multiple requested files as one single probe is acceptable, as the API does not reveal which file was (not) found, but just fails with a NotFoundError DOMException. Therefore, the attacker would still need to make multiple calls to probe for multiple files, which is more easily detectable and more easily blocked by user agents.
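The per-call probe counting described above could be sketched as follows. This is purely illustrative of one possible user agent policy, not a prescribed implementation; the class name ProbeLimiter, the per-site budget, and the way a site is keyed are all assumptions:

```javascript
// Sketch only: `ProbeLimiter` and its budget are hypothetical.
class ProbeLimiter {
  constructor(maxProbesPerSite) {
    this.maxProbesPerSite = maxProbesPerSite;
    this.counts = new Map(); // site -> number of probes so far
  }

  // Returns true if a requestFileHandles() call from `site` may proceed.
  // Each call counts as exactly one probe, however many hashes it carries,
  // because a failed call does not reveal which file was missing.
  allow(site) {
    const used = this.counts.get(site) ?? 0;
    if (used >= this.maxProbesPerSite) {
      return false;
    }
    this.counts.set(site, used + 1);
    return true;
  }
}
```

A real user agent would likely also reset or decay the counters over time; that is left out here to keep the sketch short.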

Availability gating

User agents are expected to implement an availability gating mechanism that may suppress the presence of a file in COS even when the file is physically stored there. requestFileHandles() must return a NotFoundError DOMException when the user agent determines that revealing the file's presence would constitute a privacy risk, regardless of whether the file is actually present.

User agents should maintain an allowlist of well-known resources—such as AI model weights published by recognized model hubs—that are unconditionally eligible for cross-origin availability disclosure. For resources not on the allowlist, user agents should only confirm a file's presence if it has met a popularity threshold and been encountered on a minimum number of distinct origins, ensuring that no file unique to a small set of sites can be used as a cross-site identifier. Resources that do not meet the popularity threshold are treated as absent: the user agent returns a NotFoundError DOMException as if the file were not stored in COS at all.
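The gating decision above could look roughly like this. It is a sketch under stated assumptions: the function name mayDiscloseAvailability and its parameters are hypothetical, and the popularity threshold is modeled solely as a minimum count of distinct origins, which a real user agent may define differently:

```javascript
// Sketch only: names, data structures, and the threshold model are assumptions.
// allowlist:   Set of hash values for well-known resources.
// originsSeen: Map from hash value to the Set of origins that used the file.
// minOrigins:  minimum number of distinct origins before disclosure.
function mayDiscloseAvailability(hashValue, { allowlist, originsSeen, minOrigins }) {
  // Allowlisted resources are unconditionally eligible.
  if (allowlist.has(hashValue)) {
    return true;
  }
  // Everything else must have been encountered on enough distinct origins.
  const origins = originsSeen.get(hashValue);
  return origins !== undefined && origins.size >= minOrigins;
}
```

When this returns false, the user agent would reject with a NotFoundError DOMException exactly as if the file were not stored at all.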

Developers must NOT rely on a NotFoundError as definitive proof that a file is absent from COS. A NotFoundError MAY indicate that the user agent has withheld confirmation of the file's presence for privacy reasons.
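In practice, this means treating a NotFoundError as "not available to this site right now" and falling back to the network. A minimal sketch, assuming the proposed requestFileHandles() surface; the helper name getFromCosOrNetwork and the fetchBlob callback are hypothetical:

```javascript
// Sketch only: `getFromCosOrNetwork` is a hypothetical helper.
// `storage` is expected to be `navigator.crossOriginStorage`;
// `fetchBlob` is a caller-supplied async function that downloads the file.
async function getFromCosOrNetwork(storage, hash, fetchBlob) {
  try {
    const [handle] = await storage.requestFileHandles([hash]);
    return await handle.getFile();
  } catch (err) {
    // Only NotFoundError triggers the fallback; other errors are rethrown.
    if (err.name !== 'NotFoundError') {
      throw err;
    }
    // The file may be absent, or the user agent may be withholding
    // confirmation for privacy reasons. Either way, download it.
    return fetchBlob();
  }
}
```

The caller cannot and should not try to distinguish the two NotFoundError cases; the fallback path is the same for both.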

Cross-site leaks

User agents are also expected to use console warnings to discourage developers from storing potentially state-revealing resources in COS. For example, if the user agent detects that a site is trying to store a resource with a hash that is unique or uncommon, it can warn the developer that this might be a privacy risk.

Cache flooding

Sites are prevented from flooding the cache in an attempt to evict other sites' resources. Each site can only store a limited amount of data in COS, and if a site tries to exceed this limit, the user agent can block the attempt and log a warning to the console.

Fingerprinting detection

User agents are also expected to use (on-device) machine learning to identify possible fingerprinting attempts. For example, if a site crafts unique hashes for each user (which hints at fingerprinting), user agents can detect this and block the COS probing attempt. Some user agents have successfully applied this technique to silence notification spam.

The knowledge an attacker can gain about a user depends heavily on the popularity of the resources stored in COS. If a user has a very popular resource stored, such as a common AI model, a large Wasm module, or a popular JavaScript library, the attacker can only learn that the user visited one of the many sites that use this resource, which is not very useful information. If a user has a very uncommon or even unique resource stored, the attacker can learn that the user visited one of the few sites (or the only site) that use this resource, which is more useful information. However, user agents are expected to implement safeguards against such attacks, as described above.

Stakeholder feedback / opposition

  • Web Developers: Expressed support for enabling sharing of large files without redundant downloads and storage, particularly large AI models, large Wasm modules, and highly popular JavaScript libraries.

Acknowledgements

Many thanks for valuable feedback from:

  • Yash Raj Bharti, independent freelancer
  • Joshua Lochner, Hugging Face

Many thanks for valuable inspiration or ideas from:

  • Kenji Baheux, Google Chrome
  • Kevin Moore, Google Chrome

Appendices

Appendix A: Full IDL

interface mixin NavigatorCrossOriginStorage {
  [SameObject, SecureContext] readonly attribute CrossOriginStorageManager crossOriginStorage;
};
Navigator includes NavigatorCrossOriginStorage;

[Exposed=(Window,Worker), SecureContext]
interface CrossOriginStorageManager {
  Promise<sequence<FileSystemFileHandle>> requestFileHandles(
      sequence<CrossOriginStorageRequestFileHandleHash> hashes,
      CrossOriginStorageRequestFileHandleOptions options = {});
};

dictionary CrossOriginStorageRequestFileHandleHash {
  DOMString value;
  DOMString algorithm;
};

dictionary CrossOriginStorageRequestFileHandleOptions {
  boolean create = false;
  (USVString or sequence<USVString>) origins;
};

Appendix B: Blob hash with the Web Crypto API

async function getBlobHash(blob) {
  const hashAlgorithmIdentifier = 'SHA-256';

  // Get the contents of the blob as binary data contained in an ArrayBuffer.
  const arrayBuffer = await blob.arrayBuffer();

  // Hash the arrayBuffer using SHA-256.
  const hashBuffer = await crypto.subtle.digest(
    hashAlgorithmIdentifier,
    arrayBuffer,
  );

  // Convert the ArrayBuffer to a hex string.
  const hashArray = Array.from(new Uint8Array(hashBuffer));
  const hashHex = hashArray
    .map((byte) => byte.toString(16).padStart(2, '0'))
    .join('');

  return {
    algorithm: hashAlgorithmIdentifier,
    value: hashHex,
  };
}

// Example usage (in a module or another context that allows top-level await):
const response = await fetch('https://example.com/ai-model.bin');
const fileBlob = await response.blob();
const hash = await getBlobHash(fileBlob);
console.log('Hash:', hash);

Appendix C: Frequently asked questions (FAQ)

Question: Does this API help with resuming downloads? What if downloading a large file fails before the file ends up in COS?

Answer: Managing downloads is out of scope of this proposal. COS can work with complete files or with sharded files, which the developer stores in COS as separate blobs and assembles after retrieval. This way, downloads can be handled completely out-of-band, and developers can, for example, leverage the Background Fetch API or regular fetch() requests with Range headers to download large files.
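A minimal sketch of the sharding idea, assuming the developer picks a fixed shard size and downloads each shard with its own Range request. The helper names shardRanges and assembleShards are hypothetical, and the shard size is an application choice rather than anything the proposal defines:

```javascript
// Sketch only: both helpers are hypothetical illustrations.

// Computes the Range header value for each shard of a file.
function shardRanges(totalBytes, shardBytes) {
  if (!(shardBytes > 0)) {
    throw new RangeError('shardBytes must be positive');
  }
  const ranges = [];
  for (let start = 0; start < totalBytes; start += shardBytes) {
    // The end of an HTTP byte range is inclusive.
    const end = Math.min(start + shardBytes, totalBytes) - 1;
    ranges.push(`bytes=${start}-${end}`);
  }
  return ranges;
}

// Concatenates shard blobs, in order, into one blob after retrieval.
function assembleShards(shardBlobs, type = 'application/octet-stream') {
  return new Blob(shardBlobs, { type });
}
```

Each shard would be hashed and stored in COS under its own hash, so a failed download only requires re-fetching the missing shards.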

Question: How does this API help with popular JavaScript libraries like jQuery or React?

Answer: Bundlers have historically combined vendor and application code into the same bundles, which fragments the resulting files and causes low cache hit rates. When vendor code is instead bundled separately and completely (for example, all of React, without dead-code elimination), sites that ship the same library build produce the same file and therefore the same hash. This makes such libraries well suited for COS, with high cache hit rates and improved performance across different applications.

Question: What other API is this API shaped after?

Answer: The COS API is shaped after the File System Standard's `FileSystemDirectoryHandle.getFileHandle(name, options)` method, which returns a `FileSystemFileHandle`. Instead of the name parameter in `getFileHandle()`, COS takes a hashes array that fulfills the equivalent function of uniquely identifying a set of files in COS. If options.create is not set or is set to false, the user agent will return handles for the files identified by the hashes value. If and only if options.create is set to true, the user agent will return handles that can be written to. Optionally, when options.create is true, developers can also provide a list of origins to restrict who can later read the resource, or make the resource globally available.

Question: Would the first site that added a file be seen as the authority?

Answer: No, each site has the same powers. If the user stops using the first site that has put a given file into COS, but continues using another site that depends on the same file, the file would stay around. Only once no site depends on the file anymore may the user agent consider it for manual or automatic removal from COS, for example, when under storage pressure or as part of regular storage housekeeping.

Question: Can workers access Cross-Origin Storage?

Answer: Yes, the COS API is available in workers, and the same principles apply. For example, a worker can call `navigator.crossOriginStorage.requestFileHandles()` to request access to files in COS, and if granted access, it can read from or write to those files using the returned `FileSystemFileHandle` objects. This allows workers to also benefit from shared resources in COS, such as large AI models or Wasm modules, without needing to download them separately.
