URLs Gone Bad: Fixing Broken Links

Link rot is the tendency of URLs to fail over time because the page has been deleted or moved. Because of the lag between writing and publication, link rot can be a problem even before content is posted, so every URL should be checked before publication.

URL-Checking Software

A number of tools exist that check and verify URLs. Well-written URL-checking software can report dead links and some redirected (forwarded) links. Even if a URL appears to work and is not obviously broken, however, the content of the page may have changed, and it may no longer be what the author intended to cite. Because of this issue and others explained below, it’s not possible to detect all bad links automatically. Therefore, a copy editor should check all URLs manually at least once during the publication process, verifying that the content of the page matches the citation context.

The Link Works, but Is the Content Right?

The content of a web page can change at any time. For example, many government agencies report statistics regularly, archiving or deleting the old report once the new one becomes available. Here’s a reference from the Administration on Aging:

Department of Health and Human Services. A profile of older Americans: 2007. http://www.aoa.gov/AoARoot/Aging_Statistics/Profile/index.aspx. Accessed April 7, 2012.

This URL is valid, but the title on the page is now “A Profile of Older Americans: 2011.” Farther down the page are links to previous profiles, including the 2007 version cited by the author, which now has a new URL:

http://www.aoa.gov/AoAroot/Aging_Statistics/Profile/2007/index.aspx

In this case, no change is really needed; the URL is still valid, and the report the author referenced is available from this same page. If there’s any doubt about whether the content at the URL is what the author meant to cite, ask the author to review and verify any changed URLs (assuming time permits).

Broken Link? Google It

Here’s a sentence from an author-submitted manuscript that includes a broken URL:

To search for gene network pathways, we searched BioCarta, KEGG, and Reactome pathways and available software programs (https://www.affymetrix.com/products/software/compatible/pathway.affx).

A good tool to investigate and repair a broken link is Google. Search the title or a detailed description, in this case “Affymetrix compatible gene network pathways software.” The first result of this search,

http://www.affymetrix.com/partners_programs/genechip_compatible/genechip_compatible.affx

looks promising, and, if you click on the Pathway tab on this page, the heading is “Pathway/Network Analysis,” indicating that this new URL fits the citation context well.

It’s Really Broken: Keep the URL but Kill the Hyperlink

If a title or detailed description search fails to find a match for a source with a broken URL, ask the author to provide another source or another way to access the same source. If there’s not time to consult the author or if the author replies that there is no other way to access the reference, leave the URL in place to indicate how the author accessed the source, but remove the hyperlink.

Here’s an example from another author-submitted reference list, with the URL in place but the hyperlink removed:

Kittler H. Dermatoscopy: introduction of a new algorithmic method based on pattern analysis for diagnosis of pigmented skin lesions. Dermatopathol Practical Conceptual. 2007;13(1). http://www.derm101.com/dynaweb/resources/milestones/49057/@Generic__BookView/49067;cs=chapview.wv;ts=chaptoc.tv;chap=dpc1301a03. Accessed February 15, 2010.

A detailed title search doesn’t help in this case. There are no obvious errors in the URL. Also, entering the author’s name in the site search box provided at the home page, http://www.derm101.com/, produces this result: “The term ‘kittler’ was not found.”

Typos in URLs

Copy editors must also be alert for typos, especially errors in punctuation. These are common, and some of them, like the first example below, can be spotted and repaired before checking the reference:

Bad URL: http://www.hospitalcompare.hhs.gov/Hospital/Search/Welcomeasp

Good URL: http://www.hospitalcompare.hhs.gov/Hospital/Search/Welcome.asp

Bad URL: http://www.plosmedicine.org/article/info:doi/10.1371/journal-pmed.0050120

Good URL: http://www.plosmedicine.org/article/info:doi/10.1371/journal.pmed.0050120

Bad URL: https://www.cdc.gov/nchs/nhanes.htm

Good URL: http://www.cdc.gov/nchs/nhanes.htm
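The first kind of typo above, a missing period before a file extension, follows a pattern regular enough to flag automatically. The Python sketch below is only a heuristic: the list of extensions and the function name are assumptions made for this example, and a path that happens to end in those letters would be flagged wrongly, so treat the output as a suggestion to test, not a correction to apply.

```python
from urllib.parse import urlsplit

# Hypothetical list for this sketch; extend it to suit your references.
KNOWN_EXTENSIONS = ("aspx", "asp", "html", "htm", "php", "pdf")


def missing_dot_suggestion(url):
    """Return a candidate URL with a dot restored before the extension.

    Returns None when the last path segment already contains a dot or
    does not end in one of the known extensions.
    """
    parts = urlsplit(url)
    last_segment = parts.path.rsplit("/", 1)[-1]
    if "." in last_segment:
        return None
    for ext in KNOWN_EXTENSIONS:
        if last_segment.lower().endswith(ext) and len(last_segment) > len(ext):
            fixed = last_segment[: -len(ext)] + "." + last_segment[-len(ext):]
            prefix = parts.path[: len(parts.path) - len(last_segment)]
            return parts._replace(path=prefix + fixed).geturl()
    return None
```

The other 2 typos shown above (a hyphen in place of a period inside a DOI, and the wrong scheme) do not follow so simple a pattern and still have to be caught by eye.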

Redirected URLs

If the landing page has a URL different from the one you tested, the URL has been redirected, or forwarded. Redirection is used most often to allow URLs to be updated without breaking the old ones: the cited URL jumps automatically to the updated URL. To identify a redirected URL, compare the cited URL with the one shown in the browser address bar after the page loads. Here is an example:

Cited URL: http://wwwn.cdc.gov/travel/yellowbookch4-hepb.aspx

Redirected to: http://wwwnc.cdc.gov/travel/yellowbook/2012/chapter-3-infectious-diseases-related-to-travel/hepatitis-b.htm

The Rule: When URLs are redirected, it’s preferable to cite the destination URL, both because redirection is sometimes temporary and because redirection sometimes fails.

The Exception to the Rule: Another type of redirection is the use of vanity URLs to promote a product or a brand. For example, the vanity URL http://jama.com redirects to http://jama.jamanetwork.com/journal.aspx. Unlike updated URL redirects, vanity URLs should be left intact.
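The rule and its exception amount to a short decision, sketched here in Python. The function name and the idea of keeping a hand-maintained set of known vanity URLs are assumptions made for this example; software cannot reliably tell a vanity URL from an outdated one, so the set has to come from the copy editor.

```python
def url_to_cite(cited_url, landing_url, vanity_urls=frozenset()):
    """Choose which URL belongs in the reference.

    cited_url is what the author supplied, landing_url is what the
    browser shows after loading it, and vanity_urls is a set the
    copy editor maintains by hand.
    """
    def tidy(url):
        # Ignore stray whitespace and a trailing slash when comparing.
        return url.strip().rstrip("/")

    known_vanity = {tidy(url) for url in vanity_urls}
    if tidy(cited_url) in known_vanity:
        return cited_url  # The exception: leave vanity URLs intact.
    return landing_url  # The rule: cite the destination URL.
```

When the cited URL was not redirected, the landing URL is the same as the cited one, so the function returns it unchanged.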

More Information

See §3.15, Electronic References in the AMA Manual of Style (pp 64-67 in print) for more information about the correct format for URLs in electronic references.—Paul Frank