My first Chrome extension
TL;DR
Extensions are not hard; reverse-engineering the web page you extend is hard. There are some gotchas, though.
I have been developing software for almost 20 years but have never built a real Chrome extension. The underlying technology is the WebExtensions API, which is cross-browser. In theory, the same extension works with minor modifications in Chrome, Firefox, Microsoft Edge, Opera, Vivaldi, and possibly Safari.
Ten years ago, out of curiosity, I created a Hello World extension, but I have not developed a browser extension since. There was no use case where one made sense for me. I associate extensions with React Dev Tools, Wappalyzer, ad blockers, and the like, and none of those matched anything I needed to build. That has changed now.
Use case for a browser extension
I am working on a project that aims to extend an existing CRM web app with AI capabilities. After brainstorming how to integrate the additional functionality, I chose a browser extension. The extension can augment the web app so seamlessly that users do not even notice that the additional functionality comes from someone other than the CRM vendor. At least, that is what I hope to achieve.
Extensions scared me
I do not know why, but I have always avoided browser extensions. Perhaps because I am not familiar with them, or because they seem hacky. They inject code into existing apps, and developers make assumptions that do not hold true, breaking the extension. These thoughts run through my mind.
Lifecycle of an extension
To overcome my fear, I studied the MDN and Chrome documentation about browser extensions. The documentation is excellent and not complicated, especially for web developers familiar with the DOM and other browser APIs.
An extension typically consists of JavaScript files, resource files, and a manifest.json. The manifest is the only mandatory file; it contains metadata such as name and description and specifies which JavaScript files run when. Through its content scripts, an extension has full access to the DOM of the pages it is allowed to run on.
Assumptions for DOM access
The content_scripts key in the manifest defines the JavaScript files to be executed when a page loads. Within it, the matches key defines which pages trigger the execution of a JavaScript file. This is very flexible: you can target any web page or just one specific URL.
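To make the matches key more concrete, here is a rough sketch of how a match pattern such as https://www.example.com/* could be tested against a URL. The function name matchesPattern is hypothetical, and the logic is a simplification I wrote for illustration. The browser's real matcher handles more cases (for example <all_urls>, the file scheme, and ports), so treat this as a mental model rather than the actual algorithm.

```javascript
// Simplified, illustrative matcher for patterns of the form
// <scheme>://<host>/<path>. This is NOT the browser's implementation.
function matchesPattern(pattern, url) {
  const parts = /^(\*|https?):\/\/(\*|(?:\*\.)?[^/*]+)(\/.*)$/.exec(pattern)
  if (!parts) return false // unsupported or malformed pattern
  const [, scheme, host, path] = parts

  let target
  try {
    target = new URL(url)
  } catch {
    return false // not a valid absolute URL
  }

  // Scheme: "*" stands for http or https only
  const targetScheme = target.protocol.slice(0, -1)
  if (!['http', 'https'].includes(targetScheme)) return false
  if (scheme !== '*' && scheme !== targetScheme) return false

  // Host: "*" matches any host, "*.example.com" matches the domain
  // itself and all of its subdomains
  if (host !== '*') {
    if (host.startsWith('*.')) {
      const base = host.slice(2)
      if (target.hostname !== base && !target.hostname.endsWith('.' + base)) return false
    } else if (target.hostname !== host) {
      return false
    }
  }

  // Path: "*" matches any run of characters, everything else is literal
  const escaped = path.replace(/[.+?^${}()|[\]\\]/g, '\\$&').replace(/\*/g, '.*')
  return new RegExp('^' + escaped + '$').test(target.pathname + target.search)
}
```

With the pattern from the manifest example in this post, every page on www.example.com matches, while other subdomains and plain http do not.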
What remains is when the JavaScript file gets executed by the browser. It happens every time the browser loads a page. When you type a URL in the browser's address bar and press Enter, the browser executes the extension's JavaScript file. Reloading the page also executes it.
Does the JavaScript file run before the DOM is available, after, or somewhere in between? By default (run_at set to document_idle), it runs when the browser determines it is best, somewhere between the DOM being complete and shortly after the window's load event. In practice the DOM is ready, although resources like images may still be loading. This helps keep the impact of extensions on page load times low. You can change this with the run_at directive.
Another special feature of extensions is that content scripts execute JavaScript in a sandbox of their own (Chrome calls it an isolated world). They share the DOM with the web page, but they cannot access any global JavaScript variables from the web page or from other extensions.
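As a small sketch of what the timing question means in practice: if a script is injected early (for example with run_at set to document_start), the DOM may not be parsed yet, so code that touches it has to wait. The helper below, whose name whenDomReady is my own invention, runs a callback immediately when the document is already parsed and otherwise waits for DOMContentLoaded. It takes the document as a parameter so it can be tried outside a browser.

```javascript
// Hypothetical helper: run `callback` once the DOM has been parsed.
// `doc` defaults to the page's document inside a content script.
function whenDomReady(callback, doc = document) {
  if (doc.readyState === 'loading') {
    // Still parsing: wait until the parser has finished
    doc.addEventListener('DOMContentLoaded', () => callback(), { once: true })
  } else {
    // "interactive" or "complete": the DOM is already available
    callback()
  }
}
```

With the default timing or with document_end, the DOM is already parsed when the script starts, so a wrapper like this only matters for scripts that are injected earlier.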
This covers most of what you need to know if your only goal is to modify the DOM of a web page.
CORS restriction
Until the late 2010s, you could fetch any URL from a content script. For better security, this has changed: content scripts are now subject to the same CORS policy as the page they run in. There is still a way for an extension to fetch from other origins: you must do it from a background script, which is exempt from CORS for the hosts the extension has been granted permission for, and pass messages between the content and background scripts. This approach is more complex than fetching directly.
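The message passing described above could look roughly like this. chrome.runtime.sendMessage and chrome.runtime.onMessage are the standard extension messaging APIs. The message shape ({ type: 'fetch-json', url }) and the function names are assumptions I made up for this sketch. It also assumes the target host is listed in the extension's host permissions in the manifest. In a real extension the two halves live in separate files; they are shown together only to keep the example short.

```javascript
// background.js (sketch): performs the request on behalf of the content script.
// The message shape { type: 'fetch-json', url } is made up for this example.
function handleMessage(message, sender, sendResponse, fetchImpl = fetch) {
  if (!message || message.type !== 'fetch-json') return false // not our message
  fetchImpl(message.url)
    .then((response) => response.json())
    .then((data) => sendResponse({ ok: true, data }))
    .catch((error) => sendResponse({ ok: false, error: String(error) }))
  return true // keep the message channel open for the asynchronous response
}

// Register the listener only when running inside an extension
if (typeof chrome !== 'undefined' && chrome.runtime && chrome.runtime.onMessage) {
  chrome.runtime.onMessage.addListener((message, sender, sendResponse) =>
    handleMessage(message, sender, sendResponse)
  )
}

// content.js (sketch): asks the background script instead of calling fetch directly
function fetchJsonViaBackground(url, runtime = chrome.runtime) {
  return new Promise((resolve, reject) => {
    runtime.sendMessage({ type: 'fetch-json', url }, (reply) => {
      if (reply && reply.ok) resolve(reply.data)
      else reject(new Error(reply ? reply.error : 'no response from background script'))
    })
  })
}
```

Returning true from the listener is the important detail: without it, the channel closes before the fetch has finished and the content script never receives the data.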
What we have so far:
- The browser executes JavaScript files on every page load.
- The matches key restricts which URLs trigger the execution of JavaScript files.
- The run_at directive specifies when JavaScript files are executed, typically after the DOM is ready.
- JavaScript files listed under content_scripts cannot fetch any URL because of CORS restrictions.
- Use background scripts to bypass the browser's CORS policy.
Examples
Given this manifest, here are examples of how you can program your JavaScript files.
manifest.json:
"content_scripts": [
  {
    "matches": ["https://www.example.com/*"],
    "js": ["content.js"],
    "run_at": "document_end"
  }
],
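content.js:
// You can reliably access and modify the page's DOM
const someElement = document.querySelector('.some-class')

// Hypothetical id, used only to recognize the button this script injected
const BUTTON_ID = 'my-extension-button'

// For SPAs (single page applications) you can observe DOM changes like this
const observer = new MutationObserver((mutations) => {
  // Appending the button is itself a mutation. Without this guard the
  // callback would trigger itself and keep adding buttons.
  if (document.getElementById(BUTTON_ID)) return
  if (mutations.some((mutation) => mutation.addedNodes.length)) {
    const button = document.createElement('button')
    button.id = BUTTON_ID
    // ...
    document.body.appendChild(button)
  }
})
observer.observe(document.body, {
  childList: true,
  subtree: true,
})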
Because the script is injected after the DOM has been parsed, you do not need to wrap your code in document.addEventListener('DOMContentLoaded', () => {...}).
There are many more capabilities of browser extensions, such as executing operations in the background and hooking into browser events like tab opening, which are unavailable to web pages. I do not currently need them, so I will stop here. Readers are encouraged to explore further.
Conclusion
I discovered that browser extensions are not scary. By reading the documentation and experimenting, I learned the basics needed to achieve my goal. The primary challenge in developing a browser extension lies in analyzing and reverse-engineering the page you want to extend.