The WARC format has been published as an international standard, ISO 28500:2009. For years, heritage organizations around the world have been looking for a standard container format that can carry many kinds of data objects for management, exchange, and storage. The ARC format has been around since 1996 and has been used to store files harvested from the internet, but it was never standardized. WARC is an extension of the ARC format and supports more kinds of data objects than ARC ever could.

The WARC format provides several advantages over ARC. These include:

  1. Recording of HTTP request headers.
  2. Recording of arbitrary metadata.
  3. Assignment of a unique identifier to each contained file.
  4. Management of duplicated and migrated records, as well as segmentation of records.
  5. Storage of every kind of digital content, whether retrieved using HTTP or any other protocol.
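
To make the list above more concrete, here is a minimal Python sketch of how a single WARC record might be serialized. It is an illustration rather than a reference implementation: the function name `build_warc_record` and the `example.org` URL are hypothetical, and the header names (`WARC-Type`, `WARC-Record-ID`, `WARC-Date`, `WARC-Target-URI`) are taken from the WARC 1.0 specification, not from this article. Production tools should rely on an established WARC library instead.

```python
import uuid
from datetime import datetime, timezone


def build_warc_record(record_type, payload, content_type, target_uri=None):
    """Serialize one WARC/1.0-style record as bytes (illustrative sketch only)."""
    # Each record gets its own identifier (advantage 3 in the list above).
    record_id = f"<urn:uuid:{uuid.uuid4()}>"
    headers = [
        ("WARC-Type", record_type),
        ("WARC-Record-ID", record_id),
        ("WARC-Date", datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")),
        ("Content-Type", content_type),
        ("Content-Length", str(len(payload))),
    ]
    if target_uri is not None:
        headers.append(("WARC-Target-URI", target_uri))
    head = "WARC/1.0\r\n" + "".join(f"{name}: {value}\r\n" for name, value in headers)
    # Layout: header block, blank line, content block, then two CRLFs.
    return head.encode("utf-8") + b"\r\n" + payload + b"\r\n\r\n"


# An HTTP request captured during a crawl (advantage 1).
http_request = b"GET / HTTP/1.1\r\nHost: example.org\r\n\r\n"
request_record = build_warc_record(
    "request",
    http_request,
    "application/http; msgtype=request",
    target_uri="http://example.org/",
)

# Arbitrary metadata stored as its own record (advantage 2).
metadata_record = build_warc_record(
    "metadata",
    b"operator: example\r\n",
    "application/warc-fields",
)

# A WARC file is simply a sequence of such records written one after another.
warc_bytes = request_record + metadata_record
```

Because every record carries its own type, identifier, and length, a reader can skip from record to record without understanding the payload, which is what lets one container hold content from any protocol.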

The decision to extend the existing ARC format grew out of the discussions and experience of the International Internet Preservation Consortium (IIPC). The principal aim of this organization is to acquire, preserve, and keep accessible knowledge and information of all types from the internet for future generations. The IIPC set up a Standards Working Group to develop a document for approval by the International Organization for Standardization.

Over four years, this working group collaborated with experts from the IIPC to refine the initial draft of the document. The Bibliothèque nationale de France acted as convener throughout the process. The group formed by the IIPC is expected to keep revising the document and issuing improved versions. Some applications are already compliant with WARC, including the Heritrix web crawler, the tools for data management and exchange created by ARC, NutchWAX, the Wayback Machine, and various other search tools. Adopting the WARC format is expected to improve the efficiency of these applications.

In the future, the WARC format is expected to keep evolving, thanks to its robust archiving capabilities. Over time, its international recognition and its applicability to almost all kinds of digital objects should provide strong incentives for its use, both within and outside the web-archiving community.