CSP-CERT® Resources:
Detecting Suspicious and Phishing Websites by Understanding the URL Structure

by CSP-CERT® Blue Team
posted April 2018



In this article, we will examine the structure of a URL (Uniform Resource Locator) and how it can be used to identify suspicious and phishing websites.

One of the most common mistakes people make while surfing the Internet is NOT CHECKING THE ADDRESS BAR of the web browser. The second is not understanding what the URL in the address bar means.

The URL is typically structured in this format:

protocol://hostname/path/resource

The hostname can be broken down further, which gives the format most commonly seen when surfing the Internet:

protocol://subdomain.domain.tld/path/resource

An example of what a typical URL looks like in a browser’s address bar is shown below:



Here we can see that the URL in the browser is:

https://www.cspcert.ph/practices/dfir.html


Let us dissect its URL structure:


For the purposes of this article, let us define each of the elements in the figure (see examples from above):

    Protocol – The mode of communication used by the server and the client (examples: http, https)
    Subdomain – Any name directly under the domain name (written to the left of the SLD and separated from it by a dot, example: www)
    Domain Name – Composed of the SLD and the TLD (example: cspcert.ph)
    SLD – Stands for “Second-Level Domain” and is any name directly under the TLD (written to the left of the TLD and separated from it by a dot, example: cspcert)
    TLD – Stands for “Top-Level Domain” as defined in RFC 1591 (examples: com, net, org, ph)
    Hostname – Can be an IP address (more on this in another article) but is typically composed of the subdomain and the domain name (example: www.cspcert.ph)
    Resource Path – The path to the file being accessed (example: practices)
    Resource – The file being accessed (example: dfir.html)
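
The breakdown above can also be reproduced programmatically. The following is a minimal Python sketch, not a definitive parser: the function name dissect and the dictionary keys are our own, and it naively assumes that the TLD is the last label of the hostname and the SLD is the label before it, which does not hold for every domain.

```python
import posixpath
from urllib.parse import urlsplit

def dissect(url):
    """Split a URL into the elements defined in this article (naive sketch)."""
    parts = urlsplit(url)
    hostname = parts.hostname or ""
    labels = hostname.split(".")
    # Naive assumption: the last label is the TLD, the one before it is the SLD.
    tld = labels[-1]
    sld = labels[-2] if len(labels) >= 2 else ""
    subdomain = ".".join(labels[:-2])
    # Separate the directory part of the path from the file name.
    directory, resource = posixpath.split(parts.path)
    return {
        "protocol": parts.scheme,
        "subdomain": subdomain,
        "sld": sld,
        "tld": tld,
        "domain_name": ".".join(labels[-2:]),
        "hostname": hostname,
        "resource_path": directory.strip("/"),
        "resource": resource,
    }

for name, value in dissect("https://www.cspcert.ph/practices/dfir.html").items():
    print(f"{name}: {value}")
```

For the example URL, this prints https as the protocol, www as the subdomain, cspcert.ph as the domain name, practices as the resource path and dfir.html as the resource.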

It is important to note that when you access a specific organization’s website, the domain name in the URL should match the domain name that the organization actually uses.

A phishing page will show indicators that it is not a legitimate website used by the organization.

An example of a phishing site can be seen below:



Notice that in the above figure the protocol is shown as “Secure” (https) and the subdomain (facebook4777) may look legitimate, but the domain name itself (webnode.com) is not consistent with the organization’s website, and neither is the favicon. HTTPS only means that the connection to the server is encrypted; it says nothing about who operates the site.
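
The same check can be expressed in a few lines of Python. This is a sketch under stated assumptions: the function name is_on_domain is hypothetical, the hostname is assembled from the subdomain and domain name mentioned above (the path of the phishing page is not known, so "/" is used), and facebook.com is assumed to be the domain the user expects.

```python
from urllib.parse import urlsplit

def is_on_domain(url, expected_domain):
    """Return True if the URL's hostname is expected_domain or a subdomain of it."""
    hostname = (urlsplit(url).hostname or "").lower().rstrip(".")
    expected = expected_domain.lower()
    # Compare whole labels from the right, so "notfacebook.com" does not match.
    return hostname == expected or hostname.endswith("." + expected)

# Assumed hostnames for illustration; the expected domain is facebook.com.
print(is_on_domain("https://facebook4777.webnode.com/", "facebook.com"))  # False
print(is_on_domain("https://www.facebook.com/", "facebook.com"))          # True
```

The comparison is made against the end of the hostname because that is where the domain name sits; a familiar brand name appearing anywhere else in the hostname proves nothing.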

See the example below of the legitimate website:



Digital Certificate types will be discussed in another article.

However, many legitimate websites, particularly those of banks and other large organizations, have an EV (Extended Validation) Digital Certificate, although not every legitimate website does. In browsers that display EV information in the address bar, you would see the organization’s name along with the Security Lock icon, as shown in the example below:



Another example of a fairly convincing phishing website can be seen below. The red arrows point to indicators that the site is not legitimate.


Here we see the following indicators:

1. Unlike the legitimate website, the site does not have an EV Digital Certificate.

2. The domain name is actually “growellconsultancy.co.in” instead of “bpiexpressonline.com”.

3. The word “bpiexpressonline” is only a subdomain of “growellconsultancy.co.in”, so the site belongs to whoever controls that domain and not to the owner of the genuine domain name “bpiexpressonline.com”.
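
Note that in this case the domain name has three labels, because registrations under the “in” TLD are commonly made under a suffix such as “co.in”. The Python sketch below illustrates how the domain name could be extracted in such cases. It is only an illustration: the function name registered_domain and the small hand-picked suffix set are our own assumptions (a real tool would consult the full Public Suffix List), the phishing hostname is assembled from the subdomain and domain name listed above, and the “www” subdomain on the genuine site is assumed.

```python
# Illustrative only: a tiny, hand-picked set of multi-label suffixes.
# A real tool would load the full Public Suffix List instead.
MULTI_LABEL_SUFFIXES = {"co.in", "co.uk", "com.ph"}

def registered_domain(hostname):
    """Return the domain name part of a hostname (naive sketch)."""
    labels = hostname.lower().rstrip(".").split(".")
    # If the last two labels form a known suffix, the domain name needs three labels.
    if len(labels) >= 3 and ".".join(labels[-2:]) in MULTI_LABEL_SUFFIXES:
        return ".".join(labels[-3:])
    return ".".join(labels[-2:])

phish = "bpiexpressonline.growellconsultancy.co.in"  # assembled from the indicators above
genuine = "www.bpiexpressonline.com"                 # "www" is an assumption

print(registered_domain(phish))    # growellconsultancy.co.in
print(registered_domain(genuine))  # bpiexpressonline.com
print(registered_domain(phish) == registered_domain(genuine))  # False
```

Comparing the extracted domain names, rather than searching the URL for a familiar word, is what exposes the mismatch.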


Recommendations:

1. Always check the whole URL of the website you are visiting.

2. Verify that the website is legitimate based on:

- EV Digital Certificate (the organization’s name is shown in the address bar)

- The Domain Name is correct and consistent with the organization’s website.

We hope this article helps you. Thanks for your time.