About this Cloud Hub Solution:
Broken links can be detrimental to your website's user experience and search engine visibility. Search engines such as Google may treat broken links as a sign of poor site quality, which can hurt your rankings. To avoid this, it's essential to ensure that your website doesn't contain links to broken content or non-functional pages.
The 404 Watch API is a powerful link checker that can help you identify and fix broken links on your website. With its advanced features, you can:
- Respect nofollow attributes and check external links
- Discard query parameters and hash parameters
- Check images, JS, and CSS files for broken links
- Whitelist and exclude domains from checking
- Trigger a callback URL when the link checking is done, or poll the results
Whitelisting Domains
If you have multiple domains served from a single site, you can add them to the whitelisted_domains_list parameter while creating the link checker job. The checker will then treat them as internal resources.
Excluding domains and URLs from checking
You can exclude specific domains or URLs from link checking by adding them to the excluded_domains_list or excluded_urls_list parameters, respectively, while creating the link checker job.
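As a minimal sketch of how these lists might be assembled into the request body, the hypothetical helper below builds a payload for POST /job. The parameter names come from this page; the assumption that each one accepts a list of strings is based on the sample request and response shown further down, and the example domains are placeholders.

```python
import json


def build_job_payload(url, whitelisted=None, excluded_domains=None, excluded_urls=None):
    """Assemble a JSON body for POST /job (hypothetical helper, not part of the API)."""
    payload = {"url": url}
    # Only include the list parameters that were actually supplied.
    if whitelisted:
        payload["whitelisted_domains_list"] = list(whitelisted)
    if excluded_domains:
        payload["excluded_domains_list"] = list(excluded_domains)
    if excluded_urls:
        payload["excluded_urls_list"] = list(excluded_urls)
    return payload


payload = build_job_payload(
    "https://example.com",
    whitelisted=["assets.example.com"],           # treated as internal
    excluded_domains=["ads.example.net"],         # never checked
    excluded_urls=["https://example.com/logout"], # never checked
)
print(json.dumps(payload, indent=2))
```

How the API matches excluded entries (exact match, prefix, or subdomain) is not stated here, so it is worth verifying against a small test job.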
Callback when finished
Link checking can be a time-consuming process, especially if you have multiple URLs and assets on your site. To address this, the 404 Watch API uses an asynchronous approach.
You can create a link checking job using the POST /job endpoint and receive an ID in response. You can then poll the GET /job/{id} endpoint to monitor the ongoing link checking process or retrieve the results for finished jobs.
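The polling approach could be sketched roughly as follows. The base URL and the "finished" status value are taken from the samples on this page; the function names, the polling interval, and the treatment of every other status as "still running" are assumptions made for illustration.

```python
import json
import time
import urllib.request

API_BASE = "https://api.apilayer.com/404_watch"


def fetch_job(job_id, api_key):
    """GET /job/{id} and return the decoded JSON body."""
    req = urllib.request.Request(
        f"{API_BASE}/job/{job_id}", headers={"apikey": api_key}
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def wait_for_job(job_id, api_key, fetch=fetch_job, interval=10.0,
                 max_polls=60, sleep=time.sleep):
    """Poll the job until its status is "finished" or max_polls is reached.

    Any status other than "finished" is assumed to mean the job is still running.
    """
    for attempt in range(max_polls):
        job = fetch(job_id, api_key)
        if job.get("status") == "finished":
            return job
        if attempt < max_polls - 1:
            sleep(interval)
    raise TimeoutError(f"job {job_id} not finished after {max_polls} polls")

# Usage (performs real HTTP requests):
# job = wait_for_job("c3f7b23e-a239-4af4-b9ec-698a3a6d0a21", "YOUR API KEY")
```

The fetch and sleep arguments are injectable so the loop can be exercised without network access.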
Alternatively, you can provide a callback URL when creating the link checker job via the POST /job endpoint. This will allow the API to call the provided callback URL (via HTTP POST) automatically when the process is completed.
You can also provide a callback_secret parameter when creating the link checker job. Its value is sent with the callback request in the X-Callback-Secret HTTP header, so your endpoint can verify that the call really comes from the API.
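On the receiving side, a callback endpoint might check the X-Callback-Secret header before trusting the request. The sketch below uses only the Python standard library; the handler class, the port, and the response codes are illustrative choices, and the secret is the placeholder value that appears as callback_secret in the sample request on this page. The format of the callback body is not documented here, so the handler only reads and discards it.

```python
import hmac
from http.server import BaseHTTPRequestHandler, HTTPServer

# Must equal the secret supplied when the job was created (placeholder value).
EXPECTED_SECRET = "supersecret_key"


def secret_matches(received, expected):
    """Compare the header value to the expected secret in constant time."""
    if received is None:
        return False
    return hmac.compare_digest(received.encode("utf-8"), expected.encode("utf-8"))


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if not secret_matches(self.headers.get("X-Callback-Secret"), EXPECTED_SECRET):
            self.send_response(403)  # reject callers that do not know the secret
            self.end_headers()
            return
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)  # body format is undocumented here; read and ignore
        self.send_response(204)
        self.end_headers()


def serve(port=8000):
    """Run the callback receiver until interrupted (hypothetical entry point)."""
    HTTPServer(("", port), CallbackHandler).serve_forever()
```

Using hmac.compare_digest instead of == avoids leaking information about the secret through response timing.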
Sample Request for creating a new link checker job
Below is a sample request that can be sent to the POST /job endpoint. It includes various configuration parameters for optimizing the link checking process.
curl --location --request POST 'https://api.apilayer.com/404_watch/job' \
--header 'Content-Type: application/json' \
--header 'apikey: YOUR API KEY' \
--data-raw '{
  "url": "https://p1.rs",
  "levels": 2,
  "fetch_external": false,
  "check_images": false,
  "check_css": true,
  "callback": "https://mydomain.com/callback",
  "callback_secret": "supersecret_key",
  "check_js": true,
  "whitelisted_domains_list": [
    "apilayer.com"
  ]
}'
The call returns a response like the following:
{
  "id": "c3f7b23e-a239-4af4-b9ec-698a3a6d0a21"
}
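For readers calling the API from Python rather than curl, the same request could be built with the standard library roughly as shown below. The endpoint, headers, and the "id" field mirror the curl sample and response above; the function names are hypothetical, and error handling is left out for brevity.

```python
import json
import urllib.request

API_BASE = "https://api.apilayer.com/404_watch"


def build_create_job_request(api_key, options):
    """Build (but do not send) the POST /job request."""
    body = json.dumps(options).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/job",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "apikey": api_key},
    )


def create_job(api_key, options):
    """Send the request and return the job id from the JSON response."""
    req = build_create_job_request(api_key, options)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)["id"]

# Usage (performs a real HTTP request):
# job_id = create_job("YOUR API KEY", {"url": "https://example.com", "check_css": True})
```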
You can use this id to query the results via the GET /job/{id} endpoint.
curl --location --request GET 'https://api.apilayer.com/404_watch/job/c3f7b23e-a239-4af4-b9ec-698a3a6d0a21' \
--header 'apikey: YOUR API KEY'
The response contains comprehensive details about the ongoing process and the results. See below:
{
  "id": "c3f7b23e-a239-4af4-b9ec-698a3a6d0a21",
  "created_at": 1609681539,
  "status": "finished",
  "url": "https://apilayer.com",
  "progress": {
    "discovered": 117,
    "checked": 117,
    "percentage": 100.0
  },
  "status_codes": {
    "503": 4,
    "200": 110
  },
  "content_types": {
    "image/svg+xml": 10,
    "image/png": 25,
    "text/css": 6,
    "text/html": 60,
    "image/jpeg": 5,
    "application/javascript": 8,
    "application/x-javascript": 1
  },
  "options": {
    "callback_secret": null,
    "check_css": true,
    "max_levels": 3,
    "check_js": true,
    "max_links": 1000,
    "excluded_domains_list": [],
    "fetch_nofollow": false,
    "excluded_urls_list": [],
    "fetch_external": true,
    "whitelisted_domains_list": "assets.apilayer.com",
    "omit_query_params": false,
    "callback": null,
    "omit_hash_params": true,
    "check_images": true
  }
}
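A response like the one above can be condensed into a short summary. The sketch below is one possible way to do it: it counts status codes of 400 and above as failures (a choice made for this example, not something the API prescribes) and converts created_at to a UTC datetime, assuming the value is a Unix timestamp in seconds. The sample dictionary is a trimmed copy of the response above.

```python
from datetime import datetime, timezone


def summarize_job(job):
    """Condense a GET /job/{id} response into a few headline numbers."""
    codes = job.get("status_codes", {})
    # JSON object keys are strings, so convert before comparing.
    failing = {code: count for code, count in codes.items() if int(code) >= 400}
    created = datetime.fromtimestamp(job["created_at"], tz=timezone.utc)
    return {
        "finished": job.get("status") == "finished",
        "created_at_utc": created.isoformat(),
        "checked": job["progress"]["checked"],
        "failing_by_status": failing,
        "failing_total": sum(failing.values()),
    }


sample_job = {
    "created_at": 1609681539,
    "status": "finished",
    "progress": {"discovered": 117, "checked": 117, "percentage": 100.0},
    "status_codes": {"503": 4, "200": 110},
}
summary = summarize_job(sample_job)
print(summary)
```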
Date fields in the response above (such as created_at) are Unix timestamps.
Getting the details for each link that is checked
To retrieve all the links that were discovered and checked, use the GET /job/{id}/links endpoint. See the following example.
curl --location --request GET 'https://api.apilayer.com/404_watch/job/c3f7b23e-a239-4af4-b9ec-698a3a6d0a21/links' \
--header 'apikey: YOUR API KEY'
The response lists the links along with the content type and HTTP status code of each one, so you can filter the results however you need.
{
  "job_id": "d0de484e-c18f-4ee8-b84e-4ba63907e283",
  "status": "finished",
  "created_at": 1609681539,
  "links": [
    {
      "url": "https://apilayer.com",
      "content_type": "text/html",
      "is_timeout": false,
      "http_status": 200,
      "fetched_at": 1609681563
    },
    {
      "url": "https://assets.apilayer.com/apis/image_similarity.png",
      "content_type": "image/png",
      "is_timeout": false,
      "http_status": 200,
      "fetched_at": 1609681578
    },
    {
      "url": "https://apilayer.com/marketplace/description/textgears-api",
      "content_type": "text/html",
      "is_timeout": false,
      "http_status": 200,
      "fetched_at": 1609681592
    },
    {
      "url": "https://apilayer.com/marketplace/category/text-processing-apis",
      "content_type": "text/html",
      "is_timeout": false,
      "http_status": 200,
      "fetched_at": 1609746185
    },
    {
      "url": "https://js.hs-scripts.com/7564526.js",
      "content_type": "application/javascript",
      "is_timeout": false,
      "http_status": 200,
      "fetched_at": 1609746223
    },
    {
      "url": "https://apilayer.com/marketplace/tag/spelling",
      "content_type": "text/html",
      "is_timeout": false,
      "http_status": 200,
      "fetched_at": 1609746233
    },
    {
      "url": "https://apilayer.com/marketplace/tag/text-tools",
      "content_type": "text/html",
      "is_timeout": false,
      "http_status": 200,
      "fetched_at": 1609746256
    },
    {
      "url": "https://textgears.com/assets/img/logos/apple/120.png",
      "content_type": "image/png",
      "is_timeout": false,
      "http_status": 200,
      "fetched_at": 1609746270
    },
    {
      "url": "https://apilayer.com/assets/css/documentation.css?6",
      "content_type": "text/css",
      "is_timeout": false,
      "http_status": 200,
      "fetched_at": 1609746374
    },
    {
      "url": "https://apilayer.com/assets/js/marketplace/marketplace.js?52",
      "content_type": "application/javascript",
      "is_timeout": false,
      "http_status": 200,
      "fetched_at": 1609746577
    }
  ],
  "query": {
    "limit": 10,
    "offset": 0,
    "page": 0,
    "total_count": 117
  }
}
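As one example of filtering, the sketch below picks out links that timed out or returned an HTTP status of 400 or above, and checks whether the "query" block suggests more results remain. Field names are taken from the response above; the definition of "broken" and the reading of limit, offset, and total_count as paging information are assumptions, since this page does not document how to request further pages.

```python
def broken_links(links_response):
    """Return entries that timed out or came back with an HTTP status >= 400."""
    return [
        link
        for link in links_response.get("links", [])
        if link.get("is_timeout") or (link.get("http_status") or 0) >= 400
    ]


def has_more(links_response):
    """Guess whether more links exist beyond this page (assumed paging semantics)."""
    query = links_response.get("query", {})
    seen = query.get("offset", 0) + len(links_response.get("links", []))
    return seen < query.get("total_count", 0)


# Small made-up response in the same shape as the sample above.
example = {
    "links": [
        {"url": "https://example.com/", "is_timeout": False, "http_status": 200},
        {"url": "https://example.com/old", "is_timeout": False, "http_status": 404},
        {"url": "https://example.com/slow", "is_timeout": True, "http_status": None},
    ],
    "query": {"limit": 3, "offset": 0, "page": 0, "total_count": 5},
}
for link in broken_links(example):
    print(link["url"], link["http_status"], link["is_timeout"])
```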