Why URLs and Origins Matter
A URL (Uniform Resource Locator) is the structured address you give to a browser, an API client, or a server-side HTTP library to identify a resource and how to reach it. Even small differences in a URL can change which server is contacted, which application receives the request, which handler runs, what data is returned, and what security rules apply in the browser.
An origin is a security boundary used primarily by browsers. Many web platform rules (for example, whether JavaScript can read a response and whether storage is shared) are evaluated at the origin level, and related rules such as cookie scoping are derived from the same URL components. Understanding exactly how a URL is broken into parts and how an origin is derived from it helps you debug issues like “request goes to the wrong place,” “cookies not sent,” “CORS blocked,” “redirect loop,” and “cache misses.”
The Main Parts of a URL
Most URLs you use for HTTP services follow this general shape:
scheme://host:port/path?query#fragment
Not every part is always present, but the order and separators are consistent. Let’s define each piece and what it influences.
Scheme
The scheme tells the client what protocol to use and how to interpret the rest of the URL. For web traffic, the most common schemes are http and https.
- http: Unencrypted HTTP over TCP (commonly port 80 by default).
- https: HTTP over TLS (commonly port 443 by default).
The scheme affects security expectations and browser behavior (for example, many APIs require a secure context).
Other schemes exist (for example, ws/wss for WebSocket, ftp, file, mailto), but in a web server context you’ll mostly reason about http and https.
Important practical detail: changing only the scheme from http to https changes the origin, changes default port assumptions, and changes how cookies with the Secure attribute behave.
Host
The host identifies the server you intend to reach. It is typically a domain name (like api.example.com) but can also be an IP address (like 203.0.113.10 or [2001:db8::10] for IPv6).
Hosts are case-insensitive in practice for DNS names, but you should treat them as normalized to lowercase to avoid subtle mismatches in logging, caching, or security checks.
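A minimal sketch of that normalization, assuming Python's standard urllib.parse module; the function name normalized_host is made up for illustration. The hostname attribute of a parsed URL is already lowercased and has the port stripped, so it is a convenient value to log or compare.

```python
from urllib.parse import urlsplit


def normalized_host(url: str) -> str:
    """Return the host of `url` in lowercase, without the port."""
    # SplitResult.hostname lowercases the host and drops any ':port' suffix.
    host = urlsplit(url).hostname
    if host is None:
        raise ValueError(f"no host in URL: {url!r}")
    return host


print(normalized_host("https://API.Example.COM:8443/v1"))  # api.example.com
```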
Hosts can include subdomains, and subdomains often map to different applications or environments:
- www.example.com might serve the marketing site.
- app.example.com might serve the web app.
- api.example.com might serve JSON APIs.
In HTTP/1.1 and HTTP/2, the host is also conveyed in the request (via the Host header in HTTP/1.1, and via :authority in HTTP/2). This is what enables virtual hosting: multiple sites on the same IP address.
Port
The port selects which service on the host you want to talk to. If you omit the port, the client uses the scheme’s default:
- http defaults to port 80
- https defaults to port 443
When you explicitly include a port, it becomes part of the origin and can affect routing and security decisions. For example:
- https://example.com and https://example.com:443 are equivalent, and many systems normalize away the explicit default port.
- https://example.com:8443 is a different origin from https://example.com.
Ports are especially common in development and internal services:
- http://localhost:3000 for a frontend dev server
- http://localhost:8080 for a backend API
Path
The path identifies a resource within the server/application. It begins with / and can contain multiple segments:
/products/123/reviews
From the server’s perspective, the path is commonly used for routing (mapping to a controller/handler) and sometimes for selecting static files. From a reverse proxy’s perspective, the path can be used for path-based routing (for example, send /api to one upstream and / to another).
Paths are generally case-sensitive on the web (even if some backends treat them differently). Treat /Users and /users as potentially different routes.
Two practical details that often cause bugs:
- Trailing slash: /docs and /docs/ are different paths. Many frameworks redirect one to the other, but you should not assume they are equivalent.
- Dot segments: /a/b/../c can be normalized to /a/c. Clients and proxies may or may not normalize, so normalize explicitly before running security checks instead of relying on what an intermediate layer happens to do.
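The two path pitfalls above can be sketched with Python's urllib.parse. This is one possible approach, not the only one: joining a URL with its own path makes urljoin remove dot segments. The helper name normalized_path and the /static/ prefix check are hypothetical.

```python
from urllib.parse import urljoin, urlsplit


def normalized_path(url: str) -> str:
    """Resolve '.' and '..' segments in the path of an absolute URL."""
    path = urlsplit(url).path
    # urljoin applies dot-segment removal while resolving the reference.
    return urlsplit(urljoin(url, path)).path


print(normalized_path("https://example.com/a/b/../c"))  # /a/c

# A prefix check on the raw path is fooled; the normalized path is not.
raw = "https://example.com/static/../admin"
print(urlsplit(raw).path.startswith("/static/"))   # True
print(normalized_path(raw).startswith("/static/"))  # False

# Trailing slashes survive normalization, so these remain different paths.
print(normalized_path("https://example.com/docs") ==
      normalized_path("https://example.com/docs/"))  # False
```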
Query
The query begins after ? and contains additional parameters. It is commonly used for filtering, pagination, tracking, and optional inputs:
?page=2&sort=price&in_stock=true
Query parameters are part of the URL and are sent to the server as part of the request target. They often influence caching and routing logic. Many caches treat different query strings as different cache keys, but some CDNs allow you to ignore certain parameters (for example, ignore utm_* tracking parameters).
Important: parameter order usually carries no meaning for application logic, but many systems treat the query as a raw string. That means ?a=1&b=2 and ?b=2&a=1 may be equivalent for your application but different for caching layers or signature verification schemes.
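One way to make such comparisons order-independent is to canonicalize the query before using it as a cache key or signing input. The sketch below assumes Python's urllib.parse; canonical_query is a hypothetical helper. Sorting also reorders repeated keys, so it only suits APIs where that order does not matter.

```python
from urllib.parse import parse_qsl, urlencode


def canonical_query(query: str) -> str:
    """Sort parameters by key, then value, and re-encode them."""
    pairs = parse_qsl(query, keep_blank_values=True)
    return urlencode(sorted(pairs))


raw_a, raw_b = "a=1&b=2", "b=2&a=1"
print(raw_a == raw_b)                                    # False
print(canonical_query(raw_a) == canonical_query(raw_b))  # True
```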
Fragment
The fragment begins after # and is used by the client (typically the browser) to refer to a portion of a document or a client-side route state:
#section-3
Key rule: for normal HTTP requests, the fragment is not sent to the server. It is processed client-side. This is why you cannot rely on fragments for server-side routing, authentication, or logging. If you see a fragment in the address bar, your server will not see it in the request line.
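To see which part of an address never leaves the client, you can split the fragment off explicitly. This small sketch uses urldefrag from Python's urllib.parse; the example URL is made up.

```python
from urllib.parse import urldefrag

url = "https://example.com/docs?lang=en#section-3"
# urldefrag returns the URL without its fragment, plus the fragment itself.
without_fragment, fragment = urldefrag(url)

print(without_fragment)  # https://example.com/docs?lang=en
print(fragment)          # section-3
```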
Fragments are commonly used for:
- Scrolling to an element with a matching id in HTML documents.
- Single-page applications that use hash-based routing (for example, https://example.com/#/settings).
Origins: The Browser’s Security Grouping
An origin is defined as the tuple:
(scheme, host, port)
Path, query, and fragment are not part of the origin.
Examples:
- https://example.com/account has origin https://example.com:443 (port implied).
- https://example.com:8443/account has origin https://example.com:8443.
- http://example.com has origin http://example.com:80.
- https://api.example.com is a different origin from https://example.com because the host differs.
Why this matters: browsers isolate many capabilities by origin. If you load a page from one origin and try to fetch data from another origin, the browser may restrict access unless the server explicitly allows it. Similarly, cookies and storage are scoped in ways that often align with origin boundaries (with some cookie rules using domain scoping, which is related but not identical).
Same-Origin vs Same-Site (Don’t Confuse Them)
Same-origin means scheme, host, and port all match. Same-site is a related concept used for cookie and request context decisions and is based on registrable domain (for example, app.example.com and api.example.com are often considered same-site but not same-origin). You will frequently see issues where something “works across subdomains” for cookies but still fails for JavaScript access due to same-origin rules.
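The following is a deliberately simplified sketch of the same-site idea, comparing only the registrable domain as described above. It assumes a tiny hard-coded set of public suffixes as a stand-in for the real Public Suffix List, and the names PUBLIC_SUFFIXES, registrable_domain, and same_site are hypothetical. Production code should rely on the full list rather than this set.

```python
from urllib.parse import urlsplit

# Hypothetical, tiny stand-in for the Public Suffix List.
PUBLIC_SUFFIXES = {"com", "org", "co.uk"}


def registrable_domain(host: str) -> str:
    """Return the public suffix plus one label, e.g. 'example.com'."""
    labels = host.lower().split(".")
    # Walk from the longest candidate suffix to the shortest.
    for i in range(len(labels)):
        if ".".join(labels[i:]) in PUBLIC_SUFFIXES:
            if i == 0:
                raise ValueError(f"{host!r} is itself a public suffix")
            return ".".join(labels[i - 1:])
    raise ValueError(f"no known public suffix for {host!r}")


def same_site(url_a: str, url_b: str) -> bool:
    host_a = urlsplit(url_a).hostname or ""
    host_b = urlsplit(url_b).hostname or ""
    return registrable_domain(host_a) == registrable_domain(host_b)


print(same_site("https://app.example.com/", "https://api.example.com/"))  # True
print(same_site("https://example.com/", "https://example.org/"))          # False
```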
Step-by-Step: Parse a URL Like a Debugger
When you’re troubleshooting, don’t eyeball a URL. Parse it systematically and write down each component.
Example 1
https://api.example.com:8443/v1/users/42?expand=teams&limit=10#profile
- Scheme: https
- Host: api.example.com
- Port: 8443 (explicit)
- Path: /v1/users/42
- Query: expand=teams&limit=10
- Fragment: profile
- Origin: https://api.example.com:8443
Debug implications:
- If a browser page from https://app.example.com calls this API, it is cross-origin (host differs and port differs).
- The server will receive the path and query, but not the fragment.
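The same breakdown can be produced mechanically instead of by eye. This sketch uses urlsplit from Python's urllib.parse on the Example 1 URL; the components dictionary is just an illustrative way to lay out the result.

```python
from urllib.parse import urlsplit

url = "https://api.example.com:8443/v1/users/42?expand=teams&limit=10#profile"
parts = urlsplit(url)

components = {
    "scheme": parts.scheme,
    "host": parts.hostname,
    "port": parts.port,  # None when the URL has no explicit port
    "path": parts.path,
    "query": parts.query,
    "fragment": parts.fragment,
}
for name, value in components.items():
    print(f"{name:9}{value}")
```

When the URL omits the port, parts.port is None, which tells you the scheme default applies.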
Example 2 (default port)
http://localhost/test?x=1
- Scheme: http
- Host: localhost
- Port: 80 (implicit default)
- Path: /test
- Query: x=1
- Origin: http://localhost:80
Debug implication: if your dev server is actually listening on 3000, this URL will not reach it. You must use http://localhost:3000/test?x=1.
How Each Component Affects Server Routing and Reverse Proxies
Scheme and reverse proxies
Even if your application only listens on plain HTTP behind a reverse proxy, the external URL may be https. Many frameworks need to know the “original scheme” to generate correct absolute URLs, redirects, and cookie attributes. If the app thinks the scheme is http when the user is actually on https, you may see:
- Redirects to http:// (downgrading security or causing mixed content problems).
- Cookies missing the Secure attribute when they should have it.
Practically, this is why deployments often forward a header indicating the external scheme (commonly X-Forwarded-Proto) and configure the app to trust it only from known proxies.
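A rough sketch of the "trust it only from known proxies" rule, assuming the proxies sit in a private 10.0.0.0/8 range and that headers arrive as a plain dictionary keyed by "X-Forwarded-Proto". Both are assumptions for illustration, as are the names TRUSTED_PROXIES and external_scheme. Real frameworks usually provide their own setting for this.

```python
import ipaddress

# Assumption: reverse proxies connect from this private range.
TRUSTED_PROXIES = [ipaddress.ip_network("10.0.0.0/8")]


def external_scheme(peer_ip: str, headers: dict[str, str],
                    connection_scheme: str) -> str:
    """Return the scheme the user most likely used to reach the site."""
    peer = ipaddress.ip_address(peer_ip)
    from_trusted_proxy = any(peer in net for net in TRUSTED_PROXIES)
    forwarded = headers.get("X-Forwarded-Proto", "").strip().lower()
    if from_trusted_proxy and forwarded in ("http", "https"):
        return forwarded
    # Untrusted peer, or missing/invalid header: use the actual connection.
    return connection_scheme


print(external_scheme("10.1.2.3", {"X-Forwarded-Proto": "https"}, "http"))     # https
print(external_scheme("203.0.113.9", {"X-Forwarded-Proto": "https"}, "http"))  # http
```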
Host and virtual hosting
On a single IP, a reverse proxy can host multiple domains. The host determines which site configuration is used. If the host header is wrong, you can land on the wrong site or get a default “unknown host” response.
Practical debugging checklist when a request hits the wrong app:
- Verify the URL host matches the intended domain.
- Verify the request’s Host header (or :authority) matches.
- Check reverse proxy routing rules that match on host.
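Host-based selection can be pictured as a lookup table with a fallback. The sketch below is a simplified model, not how any particular proxy implements it; the SITES table, its site names, and select_site are hypothetical.

```python
# Hypothetical site table keyed by lowercase host name.
SITES = {
    "www.example.com": "marketing-site",
    "app.example.com": "web-app",
    "api.example.com": "json-api",
}
DEFAULT_SITE = "unknown-host"


def select_site(host_header: str) -> str:
    """Pick a site from the Host header (or :authority) value."""
    host = host_header.strip().lower()
    # Drop an optional ':port' suffix; bracketed IPv6 literals are left as-is.
    if not host.startswith("[") and ":" in host:
        host = host.rsplit(":", 1)[0]
    return SITES.get(host, DEFAULT_SITE)


print(select_site("API.example.com:443"))  # json-api
print(select_site("other.example.net"))    # unknown-host
```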
Port and environment separation
Ports are frequently used to separate environments or services on the same host. If you run multiple services on one machine, the port is the primary selector. In containerized setups, you also have to distinguish between container port and published host port; the URL must use the published port.
Path-based routing
Reverse proxies and API gateways often route by path prefix:
- /api/ goes to an API service
- /static/ goes to a static file server or CDN origin
- / goes to the frontend app
Small path differences can break routing. For example, if the proxy expects /api/ but the client calls /API/, a case-sensitive match may fail and route to the wrong backend.
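A minimal model of prefix routing, assuming a case-sensitive longest-prefix match; real proxies differ in their matching rules. The ROUTES table, the upstream names, and pick_upstream are placeholders.

```python
# Hypothetical prefix table; upstream names are placeholders.
ROUTES = [
    ("/api/", "api-service"),
    ("/static/", "static-files"),
    ("/", "frontend-app"),
]


def pick_upstream(path: str) -> str:
    """Return the upstream whose prefix is the longest case-sensitive match."""
    matches = [(prefix, upstream) for prefix, upstream in ROUTES
               if path.startswith(prefix)]
    if not matches:
        raise ValueError(f"path must start with '/': {path!r}")
    # The longest matching prefix is the most specific route.
    _, upstream = max(matches, key=lambda match: len(match[0]))
    return upstream


print(pick_upstream("/api/users"))  # api-service
print(pick_upstream("/API/users"))  # frontend-app (case mismatch)
print(pick_upstream("/api"))        # frontend-app (no trailing slash)
```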
Query Strings in Practice: Encoding, Repetition, and Safety
Percent-encoding basics
URLs are limited to a subset of ASCII characters in their raw form. When you need to include spaces or reserved characters, they must be encoded. For example, a space in a query value is commonly encoded as %20 (or + in some form-encoding contexts):
?q=hello%20world
Reserved characters like & and = have special meaning in query strings, so if they are part of a value, they must be encoded:
?note=fish%26chips
Repeated keys and arrays
Many APIs accept repeated keys:
?tag=red&tag=blue&tag=green
Others use bracket conventions:
?tag[]=red&tag[]=blue
There is no single universal standard for how servers interpret these. When you design an API, document the expected format and test it with your framework’s parser.
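The encoding rules and the repeated-key convention can both be tried out with Python's urllib.parse. This is a sketch of how one parser behaves; other frameworks may interpret the bracket form differently, which is the point made above.

```python
from urllib.parse import parse_qs, quote, quote_plus, unquote, urlencode

# Spaces: %20 in general URL encoding, + in form-style encoding.
space_percent = quote("hello world")     # hello%20world
space_plus = quote_plus("hello world")   # hello+world

# A reserved character inside a value must be encoded, and decodes back.
encoded_note = quote("fish&chips")       # fish%26chips
decoded_note = unquote(encoded_note)     # fish&chips

# Repeated keys: doseq=True emits one key=value pair per list item.
query = urlencode({"tag": ["red", "blue", "green"]}, doseq=True)
repeated = parse_qs(query)               # {'tag': ['red', 'blue', 'green']}

# parse_qs gives brackets no special meaning: the key is literally 'tag[]'.
bracketed = parse_qs("tag[]=red&tag[]=blue")  # {'tag[]': ['red', 'blue']}

print(space_percent, space_plus, encoded_note, decoded_note)
print(query, repeated, bracketed)
```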
Query strings and sensitive data
Because query strings are part of the URL, they often end up in logs, browser history, bookmarks, monitoring tools, and referrer headers (depending on referrer policy). Avoid putting secrets (API keys, passwords, one-time tokens) in query parameters. Prefer headers or request bodies for sensitive values.
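If secrets do end up in query strings, one mitigation is to redact them before a URL is written to logs. The sketch below assumes Python's urllib.parse; the parameter names in SENSITIVE_KEYS and the helper redact_url are hypothetical and would need to match your own API.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names; adjust to whatever your API treats as secret.
SENSITIVE_KEYS = {"token", "api_key", "password"}


def redact_url(url: str) -> str:
    """Replace the values of sensitive query parameters before logging."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    redacted = [
        (key, "REDACTED" if key.lower() in SENSITIVE_KEYS else value)
        for key, value in pairs
    ]
    return urlunsplit(parts._replace(query=urlencode(redacted)))


print(redact_url("https://api.example.com/v1/items?page=2&token=abc123"))
# https://api.example.com/v1/items?page=2&token=REDACTED
```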
Fragments in Practice: Client-Side Only, But Still Important
Although fragments are not sent to the server, they can still affect user experience and client-side routing. Two common patterns:
- Document anchors: https://example.com/docs#install scrolls to the element with id="install".
- Hash routing: https://example.com/#/settings/profile lets a single HTML page handle multiple “routes” without server involvement.
Debugging tip: if you see a 404 from the server for a single-page app route like /settings/profile, switching to hash routing can avoid server configuration changes, but it changes URL semantics and may affect analytics and SEO. Alternatively, configure the server to serve the SPA entry point for unknown paths.
Common URL Normalization Pitfalls
Default ports and origin comparisons
When comparing origins, normalize default ports. A browser treats https://example.com as the same origin as https://example.com:443. But some application code compares strings and mistakenly treats them as different. Prefer using a URL parser and comparing structured components.
Trailing slashes and redirects
If your server redirects /docs to /docs/, that redirect can change relative URL resolution in the browser. For example, relative links behave differently depending on whether the base path ends with a slash. When you see broken relative asset paths, check whether the page URL ends with / and whether a redirect occurred.
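The effect of the trailing slash on relative resolution can be reproduced with urljoin from Python's urllib.parse, which follows the same resolution rules browsers use for this case. The asset path img/logo.png is made up.

```python
from urllib.parse import urljoin

# Without a trailing slash, 'docs' is treated as a file and is replaced.
without_slash = urljoin("https://example.com/docs", "img/logo.png")
# With a trailing slash, 'docs/' is treated as a directory and is kept.
with_slash = urljoin("https://example.com/docs/", "img/logo.png")

print(without_slash)  # https://example.com/img/logo.png
print(with_slash)     # https://example.com/docs/img/logo.png
```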
Case sensitivity
Hosts are effectively case-insensitive, but paths are generally case-sensitive. A link to /Images/logo.png may work on a case-insensitive filesystem in development but fail in production on a case-sensitive filesystem or router.
Internationalized domain names (IDN)
Some domain names contain non-ASCII characters. Internally, they are represented using Punycode (an ASCII encoding whose labels start with xn--). Most modern clients handle this automatically, but logs and security filters may see the Punycode form. If you allow user-supplied URLs, be careful about look-alike characters and validate/normalize domains before applying allowlists.
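To see both forms of a name side by side, Python's built-in "idna" codec converts between them. Note that this codec implements the older IDNA 2003 rules, so treat it as an illustration and not a complete validation strategy. The host bücher.example and the ALLOWED_HOSTS set are made up.

```python
# Hypothetical allowlist stored in the ASCII (Punycode) form.
ALLOWED_HOSTS = {"xn--bcher-kva.example"}


def to_ascii_host(host: str) -> str:
    """Convert a possibly non-ASCII host name to its ASCII form."""
    return host.lower().encode("idna").decode("ascii")


unicode_host = "bücher.example"
ascii_host = to_ascii_host(unicode_host)

print(ascii_host)                                 # xn--bcher-kva.example
print(ascii_host.encode("ascii").decode("idna"))  # bücher.example
print(ascii_host in ALLOWED_HOSTS)                # True
```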
Hands-On: Determine Whether Two URLs Are Same-Origin
Use this repeatable process whenever you’re unsure whether browser same-origin rules apply.
Step-by-step checklist
- Parse both URLs into scheme, host, and port.
- If a port is missing, substitute the default for the scheme (80 for http, 443 for https).
- Compare scheme, host, and port. All three must match for same-origin.
- Ignore path, query, and fragment for the origin decision.
Practice comparisons
- https://example.com/a vs https://example.com/b: same-origin (path differs only).
- https://example.com vs http://example.com: different origin (scheme differs).
- https://example.com vs https://www.example.com: different origin (host differs).
- https://example.com vs https://example.com:8443: different origin (port differs).
- https://example.com:443 vs https://example.com: same-origin (default port normalization).
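The checklist translates almost line for line into code. This is a sketch that covers only http and https, using Python's urllib.parse; the names DEFAULT_PORTS, origin, and same_origin are illustrative.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> tuple[str, str, int]:
    """Return (scheme, host, port) with the default port filled in."""
    parts = urlsplit(url)
    if parts.scheme not in DEFAULT_PORTS or parts.hostname is None:
        raise ValueError(f"not an absolute http(s) URL: {url!r}")
    port = parts.port if parts.port is not None else DEFAULT_PORTS[parts.scheme]
    # Path, query, and fragment are deliberately ignored.
    return (parts.scheme, parts.hostname, port)


def same_origin(url_a: str, url_b: str) -> bool:
    return origin(url_a) == origin(url_b)


print(same_origin("https://example.com/a", "https://example.com/b"))     # True
print(same_origin("https://example.com", "http://example.com"))          # False
print(same_origin("https://example.com:443", "https://example.com"))     # True
```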
Hands-On: Build Correct URLs in Code (Avoid String Concatenation Bugs)
Many production bugs come from manually concatenating strings to form URLs (double slashes, missing encoding, incorrect query separators). Prefer a URL builder in your language.
Example: JavaScript (URL and URLSearchParams)
const url = new URL('https://api.example.com/v1/search');
// searchParams encodes each value and inserts ? and & as needed
url.searchParams.set('q', 'hello world');
url.searchParams.set('limit', '10');
console.log(url.toString());
// https://api.example.com/v1/search?q=hello+world&limit=10
What this gives you:
- Correct encoding of spaces and reserved characters.
- Correct placement of ? and &.
- A structured way to read and modify components.
Example: Python (urllib.parse)
from urllib.parse import urlparse, urlencode, urlunparse

parsed = urlparse('https://example.com/items')
# urlencode percent-encodes reserved characters inside values
query = urlencode({'page': 2, 'q': 'fish & chips'})
# urlunparse expects (scheme, netloc, path, params, query, fragment)
built = urlunparse((parsed.scheme, parsed.netloc, parsed.path, '', query, ''))
print(built)
# https://example.com/items?page=2&q=fish+%26+chips
Notice how & inside the value becomes %26, preventing it from being misread as a parameter separator.
Hands-On: Understand Relative URLs and Base Paths
Browsers resolve relative links based on the current document URL. This is mostly a client-side concern, but it affects how you structure paths on the server and how redirects behave.
Step-by-step resolution examples
Assume the current page is:
https://example.com/docs/guide/index.html
- Relative link images/logo.png resolves to https://example.com/docs/guide/images/logo.png
- Relative link /images/logo.png resolves to https://example.com/images/logo.png (leading slash means “from the origin root”)
- Relative link ../api resolves to https://example.com/docs/api
Practical debugging: if assets fail to load after you move a page deeper into a path hierarchy, check whether your HTML uses root-relative paths (/assets/app.css) or document-relative paths (assets/app.css).
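You can check resolutions like the three above without a browser. This sketch feeds the same base URL and relative links to urljoin from Python's urllib.parse; the resolved dictionary is only a convenient way to collect the results.

```python
from urllib.parse import urljoin

base = "https://example.com/docs/guide/index.html"
relative_links = ["images/logo.png", "/images/logo.png", "../api"]

# Map each relative link to the absolute URL it resolves to.
resolved = {link: urljoin(base, link) for link in relative_links}

for link, absolute in resolved.items():
    print(f"{link:18} -> {absolute}")
```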
Quick Reference: What Reaches the Server?
When a browser makes an HTTP request, the server-side application typically receives:
- Scheme: not directly in the request line, but it is implied by the connection (or forwarded by a proxy).
- Host: via Host/:authority.
- Port: implied by the connection and sometimes visible via proxy headers.
- Path: yes.
- Query: yes.
- Fragment: no.
This mental model helps you quickly answer questions like “why doesn’t my server see #token=...?” and “why does my app generate the wrong absolute URL behind a proxy?”
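As a rough illustration of that model, the sketch below derives the HTTP/1.1 request line and Host header a client would send for a simple GET. It assumes the URL carries no userinfo, and the function name request_preview and the example callback URL are hypothetical.

```python
from urllib.parse import urlsplit


def request_preview(url: str) -> list[str]:
    """Sketch the HTTP/1.1 request line and Host header for a GET of `url`."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # parts.fragment is never used: the fragment stays on the client.
    # parts.netloc keeps an explicit port, which the Host header also carries.
    return [f"GET {target} HTTP/1.1", f"Host: {parts.netloc}"]


for line in request_preview("https://example.com/callback?state=xyz#token=abc"):
    print(line)
# GET /callback?state=xyz HTTP/1.1
# Host: example.com
```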