.NET URL Redirect Checker
Hi,
Today, we'll explain how to trace the final URL that a link leads to in a browser. Have you ever wondered, "Where does this link actually lead?" I recently ran into this question while working on a security issue. I work for a cyber security company in London, and I had to analyze a URL and determine whether it was malicious. There are plenty of free services for that, such as VirusTotal. But what about redirected URLs? There are many redirect types: 301 redirects, JavaScript redirects, meta refresh redirects, and URL shorteners such as Bitly.
“It’s not a dead-end; it’s a redirection.” — Kristen Wetherell
First, what is URL redirection? Also known as URL forwarding, it is a technique used on the World Wide Web to make a web page accessible through multiple URL addresses. When a web browser tries to open a redirected URL, it is automatically directed to a page with a different URL.
Having a firm grasp on redirects and the many different scenarios in which they are used is essential when it comes to technical SEO.
Use Cases
- Changing Domain: When migrating domains, redirect visitors and link value from the old domain to the new one.
- Merging Domains: Redirect domain variations or acquired competitor domains to a single correct destination.
- Site Restructures: Redirects prevent broken pages and ensure a good user experience when URLs change during a site redesign.
- Avoiding Duplicate Content: Redirects are needed when switching URLs to or from trailing slashes.
- URL Changes: When changing a URL, like replacing an old product with a new one, use a redirect (e.g., URL shortening).
Example Of Redirect Url Site:
Client-Side Redirects
To detect client-side redirects, we need a headless browser. I chose ChromeDriver, so I used the OpenQA.Selenium and OpenQA.Selenium.Chrome libraries in the .NET application. Create a .NET 9.0 console application and install the Selenium library from NuGet.
First, ensure that you have the Chrome web browser installed. Next, download the appropriate “chromedriver.exe” from [Chrome for Testing](https://googlechromelabs.github.io/chrome-for-testing/). Before downloading, check your Chrome version and select the matching chromedriver version. Put “chromedriver.exe” in your application's bin folder.
1-) Everything is ready. Now declare the ChromeDriver options.
var options = new ChromeOptions();//It is used to set the operating mode of the Chrome browser.
options.AddArgument("--headless");//Runs the browser in headless mode (a visual browser window does not open).
options.AddArgument("--disable-gpu");//Disables GPU acceleration (may be necessary in some cases).
//options.AddArgument("--no-sandbox"); //Disables some security measures; It is optional and is recommended to be used only when necessary.
2-) Create a driver and set some timeouts so that we never wait forever.
using (var driver = new ChromeDriver(options))
{
    try
    {
        driver.Manage().Timeouts().PageLoad = TimeSpan.FromSeconds(15); // Max page load time
        driver.Manage().Timeouts().ImplicitWait = TimeSpan.FromSeconds(10); // Max wait for elements to be found
3-) We set the URL to trace and the maximum redirect count. I limited the count to 10 to handle URLs that are deliberately redirected to each other. We create an HttpClient with automatic redirects disabled so that every hop can be inspected, set a 10-second timeout for pages that do not respond, and check the redirect count on each pass through the loop.
string currentUrl = "http://bit.ly/49Qm9JN";
int maxRedirect = 10;   // Upper limit on the number of redirects to follow
int redirectsCount = 0; // Redirects followed so far
using (var httpClient = new HttpClient(new HttpClientHandler { AllowAutoRedirect = false }))
{
    httpClient.Timeout = TimeSpan.FromSeconds(10);
    while (redirectsCount < maxRedirect)
    {
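The redirect cap can be paired with a set of already visited URLs, so that two URLs pointing at each other are caught on the first repeat instead of after ten rounds. The following is only a minimal sketch of that idea: `FollowChain` and the `nextHop` lookup are hypothetical names that stand in for the real HTTP and browser checks, and the redirect table is made-up sample data.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: follows a redirect chain through a lookup function
// instead of real network calls, so the stopping rules are easy to see.
static List<string> FollowChain(string startUrl, Func<string, string?> nextHop, int maxRedirect = 10)
{
    var chain = new List<string> { startUrl };
    var visited = new HashSet<string> { startUrl };
    string currentUrl = startUrl;
    int redirectsCount = 0;

    while (redirectsCount < maxRedirect)
    {
        string? next = nextHop(currentUrl);
        if (string.IsNullOrEmpty(next)) break; // No further redirect: final URL reached
        chain.Add(next);
        if (!visited.Add(next)) break;         // URL already seen: redirect loop detected
        currentUrl = next;
        redirectsCount++;
    }
    return chain;
}

// Simulated redirect table (sample data): a -> b -> a, a deliberate loop.
var table = new Dictionary<string, string>
{
    ["http://a.example/"] = "http://b.example/",
    ["http://b.example/"] = "http://a.example/"
};

List<string> hops = FollowChain(
    "http://a.example/",
    url => table.TryGetValue(url, out var target) ? target : null);

Console.WriteLine(string.Join(" -> ", hops));
```

In the sketch the chain stops as soon as `http://a.example/` shows up a second time, while the `maxRedirect` cap still protects against long chains that never repeat.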
Meta Refresh Redirect
Meta refresh redirects, often used for countdowns (e.g., “Redirecting in 5 seconds…”), are defined in the page’s code and serve as alternatives to server-side redirects. However, they have drawbacks and can confuse users.
1. “string metaRedirectUrl = GetMetaRefreshRedirectUrl(driver.PageSource)”: Passes the HTML page source (“driver.PageSource”) to the helper that looks for a meta refresh redirect. If a redirect URL is found, it is stored in “metaRedirectUrl”.
2. “if (!string.IsNullOrEmpty(metaRedirectUrl))”: Checks whether a meta refresh redirect URL was found (i.e., `metaRedirectUrl` is not null or empty).
3. “redirectResults.Add(new RedirectResult { … })”: If a redirect URL is found, a new `RedirectResult` is created and added to the `redirectResults` collection. The `RedirectResult` class is declared in a later part.
- StatusCode = “206”: In HTTP, 206 normally means Partial Content. Here it is reused as a custom marker for a meta refresh redirect, which has no redirect status code of its own.
- Url = metaRedirectUrl: The URL of the redirect found.
- Type = “Meta Refresh Redirect”: Labels the type of redirect as “Meta Refresh Redirect”.
4. “currentUrl = metaRedirectUrl”: Updates `currentUrl` to the new redirect URL.
5. “redirectsCount++”: Increments the counter (`redirectsCount`) that tracks the number of redirects processed.
6. “continue”: Skips the remaining code in the current loop iteration and starts the next one, so the updated `currentUrl` is processed from the top of the loop.
// Meta Refresh Redirect
string metaRedirectUrl = GetMetaRefreshRedirectUrl(driver.PageSource);
if (!string.IsNullOrEmpty(metaRedirectUrl))
{
    redirectResults.Add(new RedirectResult
    {
        StatusCode = "206",
        Url = metaRedirectUrl,
        Type = "Meta Refresh Redirect"
    });
    currentUrl = metaRedirectUrl;
    redirectsCount++;
    continue;
}
Darling, a rejection is never a failure. It’s simply a redirection towards something better. — Azalea Design Co.
GetMetaRefreshRedirectUrl():
The “GetMetaRefreshRedirectUrl()” function extracts the redirect URL from a meta refresh tag in the HTML page source. It searches for the `<meta http-equiv="refresh">` tag, locates the URL using the `URL=` parameter, and returns the redirect URL. If no valid URL is found, it returns `null`. Alternatively, you can use a Regex pattern like the commented-out one in the code. It is up to you.
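static string? GetMetaRefreshRedirectUrl(string pageSource)
{
    // Alternative approach: match the whole tag with a regular expression.
    //string pattern = @"<meta\s+http-equiv=['""]?refresh['""]?\s+content=['""]?.*?['""]?.*?>";
    //MatchCollection matches = Regex.Matches(pageSource, pattern, RegexOptions.IgnoreCase);

    // Locate the meta refresh tag (only the double-quoted http-equiv="refresh" form is matched).
    int metaIndex = pageSource.IndexOf("http-equiv=\"refresh\"", StringComparison.OrdinalIgnoreCase);
    if (metaIndex >= 0)
    {
        // IndexOf returns -1 when "URL=" is missing, which makes urlStartIndex 3 and fails the check below.
        int urlStartIndex = pageSource.IndexOf("URL=", metaIndex, StringComparison.OrdinalIgnoreCase) + 4;
        if (urlStartIndex > 4)
        {
            // The URL ends at the next quote or at the end of the tag.
            int urlEndIndex = pageSource.IndexOfAny(new[] { '"', '\'', '>' }, urlStartIndex);
            if (urlEndIndex > urlStartIndex)
                return pageSource.Substring(urlStartIndex, urlEndIndex - urlStartIndex).Trim();
        }
    }
    return null; // No meta refresh redirect found
}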
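For readers who prefer the Regex route, here is one possible shape for it. This is a sketch under assumptions, not the code from the repository: `ExtractMetaRefreshUrl` is a hypothetical name, and the patterns are my own. Unlike the `IndexOf` version, it also accepts single-quoted or unquoted `http-equiv` values and a target URL that is itself wrapped in quotes.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical regex-based variant of the meta refresh lookup.
static string? ExtractMetaRefreshUrl(string pageSource)
{
    // Walk over every <meta ...> tag in the page source.
    foreach (Match tag in Regex.Matches(pageSource, @"<meta\b[^>]*>", RegexOptions.IgnoreCase))
    {
        // Skip meta tags that are not refresh tags.
        if (!Regex.IsMatch(tag.Value, @"http-equiv\s*=\s*[""']?refresh", RegexOptions.IgnoreCase))
            continue;

        // content="5; url=https://..." where the target may be wrapped in quotes.
        Match url = Regex.Match(tag.Value, @"url\s*=\s*[""']?([^""'>\s]+)", RegexOptions.IgnoreCase);
        if (url.Success)
            return url.Groups[1].Value;
    }
    return null; // No meta refresh redirect found
}

string html = "<html><head><meta http-equiv='refresh' content=\"5; URL='https://example.com/next'\"></head></html>";
Console.WriteLine(ExtractMetaRefreshUrl(html) ?? "(none)");
```

The trade-off is the usual one: the regular expressions tolerate more markup variations, while the `IndexOf` version is easier to step through in a debugger.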
RedirectResult: This class holds the details of each detected redirect: the URL, the status code, and the redirect type.
public class RedirectResult
{
    public string StatusCode { get; set; } = string.Empty;
    public string Url { get; set; } = string.Empty;
    public string Type { get; set; } = string.Empty;
}
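The snippets in this article add to a `redirectResults` collection without showing where it comes from. A minimal sketch, assuming a plain `List<RedirectResult>` declared before the loop (the loop only needs `Add()`), together with one way to print the trace at the end. The two entries below are illustrative sample values, not real trace output.

```csharp
using System;
using System.Collections.Generic;

// Assumption: a List<RedirectResult> is used as the collection.
var redirectResults = new List<RedirectResult>();

// Illustrative entries only; in the real loop these are added as redirects are detected.
redirectResults.Add(new RedirectResult { StatusCode = "301", Url = "http://short.example/abc", Type = "Initial URL" });
redirectResults.Add(new RedirectResult { StatusCode = "206", Url = "https://landing.example/", Type = "Meta Refresh Redirect" });

// Print the chain in the order the hops were found.
for (int i = 0; i < redirectResults.Count; i++)
{
    RedirectResult hop = redirectResults[i];
    Console.WriteLine($"{i + 1}. [{hop.StatusCode}] {hop.Type}: {hop.Url}");
}

public class RedirectResult
{
    public string StatusCode { get; set; } = string.Empty;
    public string Url { get; set; } = string.Empty;
    public string Type { get; set; } = string.Empty;
}
```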
HTTP Redirect:
This code handles HTTP redirects as follows:
1. It sends an HTTP request with “GetAsync()” to the `currentUrl`.
2. It reads the response status code and extracts the redirect URL from the “Location” header, if present.
3. It adds a “RedirectResult” to `redirectResults` with the status code, the URL, and the type: “Initial URL” for the first request, otherwise “HTTP Redirect” marked as “Permanent” for 301 or “Temporary” for every other status code.
4. If a redirect URL is found, it updates `currentUrl` to the new URL, increments `redirectsCount`, and continues with the next pass through the loop.
// HTTP Redirect
var response = await httpClient.GetAsync(currentUrl);
int statusCode = (int)response.StatusCode;
string? redirectUrl = response.Headers.Location?.ToString();
redirectResults.Add(new RedirectResult
{
    StatusCode = statusCode.ToString(),
    Url = currentUrl,
    Type = redirectsCount == 0 ? "Initial URL" : "HTTP Redirect (" + (statusCode == 301 ? "Permanent" : "Temporary") + ")"
});
if (!string.IsNullOrEmpty(redirectUrl))
{
    currentUrl = redirectUrl;
    redirectsCount++;
    continue;
}
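Two details of HTTP redirects are worth keeping in mind here. The “Location” header may contain a relative URL such as `/login`, which cannot be requested on its own, and 308 is a permanent redirect just like 301. The sketch below shows one way to handle both. `ResolveLocation` and `DescribeStatus` are hypothetical helpers that are not part of the original code, and the labels simply mirror the ones used above.

```csharp
using System;

// Hypothetical helper: makes a Location header value absolute.
// response.Headers.Location is a Uri that may be relative (for example "/login").
static Uri ResolveLocation(Uri currentUri, Uri location) =>
    location.IsAbsoluteUri ? location : new Uri(currentUri, location);

// Hypothetical helper: labels the standard redirect status codes.
static string DescribeStatus(int statusCode) => statusCode switch
{
    301 or 308 => "HTTP Redirect (Permanent)",
    302 or 303 or 307 => "HTTP Redirect (Temporary)",
    _ => "Not an HTTP redirect"
};

var current = new Uri("https://example.com/a/b");
Console.WriteLine(ResolveLocation(current, new Uri("/login", UriKind.Relative)));
Console.WriteLine(ResolveLocation(current, new Uri("https://other.example/x")));
Console.WriteLine(DescribeStatus(308));
```

In the loop, the resolved value could replace `redirectUrl` before it is assigned to `currentUrl`, so the next `GetAsync()` call always receives an absolute URL.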
Embedded Redirect
This code detects and handles redirect URLs embedded in the query string of the current URL. If an embedded redirect is found, it records the redirect details, updates the current URL, and continues with the next pass through the loop. If not, it loads the current URL in the web driver so that the page source can be checked for client-side redirects.
// Embedded Redirect
string embeddedUrl = GetEmbeddedRedirectUrl(currentUrl);
if (!string.IsNullOrEmpty(embeddedUrl))
{
    redirectResults.Add(new RedirectResult
    {
        StatusCode = "200",
        Url = embeddedUrl,
        Type = "Embedded URL Redirect"
    });
    currentUrl = embeddedUrl;
    redirectsCount++;
    continue;
}
driver.Navigate().GoToUrl(currentUrl);
“Don’t fear redirection; fear staying stagnant.”
— Anonymous
GetEmbeddedRedirectUrl():
This method extracts and returns the first valid absolute URL found in the query parameters of a given URL. If no such URL exists, it returns `null`.
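// Requires: using System.Web; (for HttpUtility)
static string? GetEmbeddedRedirectUrl(string url)
{
    if (Uri.TryCreate(url, UriKind.Absolute, out Uri? uri))
    {
        // Parse the query string into key/value pairs (values are URL-decoded).
        var query = HttpUtility.ParseQueryString(uri.Query);
        foreach (var key in query.AllKeys)
        {
            // Return the first parameter value that parses as an absolute URL.
            if (Uri.TryCreate(query[key], UriKind.Absolute, out Uri? embeddedUri))
            {
                return embeddedUri.ToString();
            }
        }
    }
    return null; // No embedded URL found
}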
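One caveat: “absolute URL” is broader than “web address”. A value such as `mailto:someone@example.com` is an absolute URI too, and on Linux or macOS a value like `/home` can be accepted as a `file://` URI. The variant below is a hedged sketch that only accepts `http` and `https` values. `GetEmbeddedHttpUrl` is a hypothetical name and the scheme filter is my own addition, not part of the original tool.

```csharp
using System;
using System.Web;

// Hypothetical variant: returns the first query parameter value that is an http(s) URL.
static string? GetEmbeddedHttpUrl(string url)
{
    if (!Uri.TryCreate(url, UriKind.Absolute, out Uri? uri))
        return null;

    var query = HttpUtility.ParseQueryString(uri.Query);
    foreach (string? key in query.AllKeys)
    {
        string? value = query[key];
        // Accept only absolute http/https targets; skip file paths, mailto links, and so on.
        if (Uri.TryCreate(value, UriKind.Absolute, out Uri? embedded)
            && (embedded.Scheme == Uri.UriSchemeHttp || embedded.Scheme == Uri.UriSchemeHttps))
        {
            return embedded.ToString();
        }
    }
    return null; // No embedded http(s) URL found
}

Console.WriteLine(GetEmbeddedHttpUrl("https://example.com/out?ref=/home&u=https%3A%2F%2Ftarget.example%2Fpage") ?? "(none)");
```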
JavaScript Redirect
This code checks for a JavaScript-based redirect URL in the page source. If found, it logs the redirect, updates the current URL, increments the redirect count, and continues processing. If no redirect is found, it exits the loop.
// JavaScript Redirect
string jsRedirectUrl = GetJavaScriptRedirectUrl(driver.PageSource);
if (!string.IsNullOrEmpty(jsRedirectUrl))
{
    redirectResults.Add(new RedirectResult
    {
        StatusCode = "200",
        Url = jsRedirectUrl,
        Type = "JavaScript Redirect"
    });
    currentUrl = jsRedirectUrl;
    redirectsCount++;
    continue;
}
break;
GetJavaScriptRedirectUrl():
This method searches the page source for a JavaScript redirect (using `window.location`). If found, it extracts and returns the URL from the JavaScript code. If no redirect is found, it returns `null`.
static string? GetJavaScriptRedirectUrl(string pageSource)
{
    // Find the first use of window.location in the page source.
    int jsIndex = pageSource.IndexOf("window.location", StringComparison.OrdinalIgnoreCase);
    if (jsIndex >= 0)
    {
        // Take the text between the first pair of quotes that follows it.
        int urlStartIndex = pageSource.IndexOfAny(new[] { '"', '\'' }, jsIndex) + 1;
        int urlEndIndex = pageSource.IndexOfAny(new[] { '"', '\'' }, urlStartIndex);
        if (urlStartIndex > 0 && urlEndIndex > urlStartIndex)
            return pageSource.Substring(urlStartIndex, urlEndIndex - urlStartIndex).Trim();
    }
    return null; // No JavaScript redirect found
}
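Because the method above takes the first quoted string after `window.location`, it can also pick up a string that is not a redirect target, for example the right-hand side of a comparison. A narrower option is to match only assignments and `replace()`/`assign()` calls. The following is a sketch under that assumption. `ExtractJsRedirectUrl` and its pattern are hypothetical, and a regular expression will still miss redirects that build the URL dynamically.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical regex-based variant of the JavaScript redirect lookup.
static string? ExtractJsRedirectUrl(string pageSource)
{
    // Matches: window.location = "...", window.location.href = '...',
    //          window.location.replace("...") and window.location.assign('...')
    var pattern = new Regex(
        @"window\.location(?:\.href)?\s*=\s*[""']([^""']+)[""']" +
        @"|window\.location\.(?:replace|assign)\(\s*[""']([^""']+)[""']",
        RegexOptions.IgnoreCase);

    Match m = pattern.Match(pageSource);
    if (!m.Success) return null;

    // Group 1 is filled by the assignment form, group 2 by the method-call form.
    return m.Groups[1].Success ? m.Groups[1].Value : m.Groups[2].Value;
}

Console.WriteLine(ExtractJsRedirectUrl("<script>window.location.href = \"https://example.com/landing\";</script>") ?? "(none)");
Console.WriteLine(ExtractJsRedirectUrl("<script>if (window.location.pathname == '/x') { }</script>") ?? "(none)");
```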
Conclusion:
In cybersecurity, analyzing only the URL captured in a web request is not enough. Tracing every redirected URL is essential, because a redirect may lead to a hidden threat. This article walked through code for detecting and handling several kinds of redirects (HTTP, meta refresh, JavaScript, and embedded URLs) using .NET tools such as Selenium WebDriver and HttpClient, combining a real-world cybersecurity use case with working code.
See you in the next article.
If you have read this far, first of all, thank you for your patience and support. I welcome all of you to my blog for more!
Github Source: https://github.com/borakasmer/UrlTraceTool