[Python] How to get full url from shortened url


When you scrape a website, you may find that it returns shortened URLs that point to resources on other websites.

Take https://upflix.pl/r/Qb64Ar, for example: the link consists of a domain and a few random characters. A shortened link works by redirecting you to another page, so the status_code that our request returns is 302.
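A shortener does not have to answer with 302 specifically: 301, 303, 307 and 308 are also redirect statuses that come with a Location header. Here is a minimal sketch of a check that covers all of them, using plain integers only. The names REDIRECT_STATUSES and is_redirect are made up for illustration; they are not part of requests.

```python
# Status codes that tell the client to go to the URL in the Location header.
REDIRECT_STATUSES = frozenset({301, 302, 303, 307, 308})


def is_redirect(status_code: int) -> bool:
    """Return True if the status code announces a redirect to another URL."""
    return status_code in REDIRECT_STATUSES
```

Whether you accept only 302 or the whole set depends on the shorteners you expect to meet; the link in this article answers with 302.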

When we need the full URL, we can get it with a few lines of Python code and the requests library.

pip install requests

We will use the head method for this.
It is similar to get, except that the response carries no content, only headers.

response = requests.head(short_url)

After executing the request, we can check the headers that were returned.
They contain information such as:

  • date
  • content type
  • character encoding
  • the full link (the location header)

and much more, as you can see below.
{'Date': 'Thu, 16 Nov 2023 00:43:13 GMT', 'Content-Type': 'text/html; charset=UTF-8', 'Connection': 'keep-alive', 'location': 'https://www.imdb.com/title/tt14060708/', 'vary': 'Origin', 'x-powered-by': 'PHP/7.3.33', 'x-frame-options': 'SAMEORIGIN', 'CF-Cache-Status': 'DYNAMIC', 'Report-To': '{"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v3?s=bgvCMcMQg1ZkjanlgqzemKUHHthalhb%2FAT72Q58O8a22eFmkeb%2FyeeIMfKkGFwt8WmkMB6dv28F1G2CdH134Kilk%2BcdQNweIZ3O%2FN9KlQf1A2VF%2Bm3yYT89rvjU%3D"}],"group":"cf-nel","max_age":604800}', 'NEL': '{"success_fraction":0,"report_to":"cf-nel","max_age":604800}', 'Strict-Transport-Security': 'max-age=15552000; includeSubDomains; preload', 'X-Content-Type-Options': 'nosniff', 'Server': 'cloudflare', 'CF-RAY': '826bb2ce597ebfda-WAW', 'alt-svc': 'h3=":443"; ma=86400'}
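In the output above the header is spelled location in lowercase, but HTTP header names are case-insensitive, and a server may also send a relative path instead of a full URL. The sketch below reads the header from any mapping and resolves it against the short URL with urljoin from the standard library. The helper name extract_location is hypothetical, and the example.com URL in the usage note is a placeholder.

```python
from typing import Mapping, Optional
from urllib.parse import urljoin


def extract_location(base_url: str, headers: Mapping[str, str]) -> Optional[str]:
    """Return the redirect target as an absolute URL, or None if there is none."""
    for name, value in headers.items():
        # Compare names case-insensitively so a plain dict works as well.
        if name.lower() == "location":
            # urljoin keeps absolute URLs as they are and resolves relative ones.
            return urljoin(base_url, value)
    return None
```

For example, extract_location("https://example.com/r/abc", {"Location": "/title/1/"}) gives "https://example.com/title/1/". With requests you can skip the loop, because response.headers is already a case-insensitive dictionary.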

Full code

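from typing import Optional

import requests


def get_full_url(short_url: str) -> Optional[str]:
    # requests.head does not follow redirects by default, so the 302 stays visible.
    response = requests.head(short_url)
    if response.status_code == 302:
        # Header lookup in requests is case-insensitive.
        return response.headers["location"]

    # Not a 302 redirect, so there is no full URL to report.
    return None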
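The function above resolves a single hop. Some short links redirect more than once, so here is a hedged sketch that keeps following the location header until it reaches a response that is not a redirect. The name resolve_final_url, the max_hops limit and the timeout value are assumptions chosen for illustration.

```python
from urllib.parse import urljoin

import requests

# Redirect statuses that carry a Location header.
REDIRECT_STATUSES = {301, 302, 303, 307, 308}


def resolve_final_url(short_url: str, max_hops: int = 10, timeout: float = 5.0) -> str:
    """Follow redirects one HEAD request at a time and return the last URL."""
    url = short_url
    for _ in range(max_hops):
        # allow_redirects=False keeps each hop visible to us.
        response = requests.head(url, allow_redirects=False, timeout=timeout)
        location = response.headers.get("location")
        if response.status_code not in REDIRECT_STATUSES or location is None:
            # No further redirect: this is the final URL.
            return url
        # Resolve relative targets against the URL we just requested.
        url = urljoin(url, location)
    raise RuntimeError(f"gave up after {max_hops} redirects starting at {short_url}")
```

If you do not need to see the individual hops, requests can do the same in one call: requests.head(short_url, allow_redirects=True).url holds the final address.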
