The principle of redundancy in information technology (IT) serves to increase the reliability and availability of systems. Redundancy can be intentional or unintentional. As a security concept, it refers to the intentional provision of additional resources or components that keep a system functioning when a component fails.
The opposite of a redundant design is a Single Point of Failure (SPoF): the failure of a single component (for example, a server) brings down the entire system because no backup exists. Unintentional redundancy typically arises from unnecessary duplicate data, which occupies storage space, complicates data maintenance, and can lead to inconsistencies.
The redundancy principle aims to increase the availability, reliability, and fault tolerance of IT systems. By implementing redundant systems, companies can keep their services available even when hardware or software fails. Redundancy thus helps avoid downtime and maintain continuous business operations, and it should therefore be part of a Business Continuity Plan (BCP).
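The effect of redundancy on availability can be estimated with a simple calculation. The following minimal sketch assumes that the redundant components fail independently of one another, which real systems only approximate; the function names and the 99% figure used below are illustrative and not taken from any standard.

```python
HOURS_PER_YEAR = 24 * 365


def parallel_availability(component_availability: float, copies: int) -> float:
    """Availability of `copies` redundant components when one working copy suffices.

    Assumption: failures are independent of one another.
    """
    if not 0.0 <= component_availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    # The system is down only if every copy is down at the same time.
    return 1.0 - (1.0 - component_availability) ** copies


def expected_downtime_hours(availability: float) -> float:
    """Expected downtime per year, in hours, for a given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR
```

Under these assumptions, a single server with 99% availability is expected to be down for roughly 87.6 hours per year, while two such servers in parallel reach about 99.99% availability, or roughly 0.9 hours of downtime per year. Shared causes of failure, such as a common power supply, reduce this gain in practice.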
Redundant data are copies of data records that are mirrored or distributed across different locations and servers. This practice increases the availability and security of data. Technologies such as regular backups, virtualization, and mirroring help ensure that data is not lost in the event of hardware failures or other damage. Additionally, redundant data structures allow faster access to information over greater distances, because requests can be served from a nearby copy, and they support strategies such as backup and disaster recovery.
Functional redundancy means that multiple systems or components fulfill the same function to increase availability. An example is server redundancy: multiple servers are operated in parallel so that if one server fails, the remaining servers take over its tasks. Redundant servers are operated in either active or passive mode (often called active-active and active-passive): in active mode, all servers share the load, while in passive mode, a standby server is activated only when another fails. This significantly increases fault tolerance and reduces the risk of downtime.
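The difference between the two modes can be sketched in a few lines. This is a simplified illustration, not a production load balancer: the `Server` class, its `healthy` flag, and both routing functions are hypothetical names introduced here, and a real system would determine health through monitoring rather than a manually set flag.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Server:
    name: str
    healthy: bool = True


def route_active_active(servers: List[Server], request_number: int) -> Server:
    """Active mode: spread requests round-robin across all healthy servers."""
    healthy = [s for s in servers if s.healthy]
    if not healthy:
        raise RuntimeError("no healthy server available")
    return healthy[request_number % len(healthy)]


def route_active_passive(servers: List[Server]) -> Server:
    """Passive mode: use the first healthy server in priority order.

    The list is ordered primary first; a standby is chosen only
    when every server listed before it has failed.
    """
    for server in servers:
        if server.healthy:
            return server
    raise RuntimeError("no healthy server available")
```

With two healthy servers, the active variant alternates between them, whereas the passive variant always returns the primary. Once the primary is marked unhealthy, both variants route all requests to the remaining server.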
Network redundancy provides multiple network connections and paths so that data transmission continues even if a network segment fails.
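Whether a network actually offers redundant paths can be checked by removing each link in turn and testing whether all nodes remain reachable. The sketch below does this for a small, hypothetical topology given as a list of links; the node names and both functions are assumptions made for illustration, and the brute-force approach is intended for clarity rather than for large networks.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple

Link = Tuple[str, str]


def is_connected(nodes: Set[str], links: Iterable[Link]) -> bool:
    """Return True if every node can be reached from every other node."""
    if not nodes:
        return True
    neighbours: Dict[str, Set[str]] = {node: set() for node in nodes}
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)
    # Breadth-first search from an arbitrary start node.
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in neighbours[current]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen == nodes


def critical_links(links: List[Link]) -> List[Link]:
    """Links whose failure alone would split the network."""
    nodes = {node for link in links for node in link}
    return [
        link
        for i, link in enumerate(links)
        if not is_connected(nodes, links[:i] + links[i + 1:])
    ]
```

For a ring of three switches, no link is critical, because traffic can always take the other direction. If a fourth node is attached to the ring by a single link, that link is reported as critical, since its failure would isolate the node.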
Geographic redundancy involves distributing data and services across multiple geographically separated locations. This distribution protects against large-scale failures caused by natural disasters, regional power outages, or other serious incidents. Spatial separation minimizes the risk that a single event affects the entire operation, so that no single site becomes a Single Point of Failure (SPoF). Data centers apply the principle of geographic redundancy by, among other things, operating multiple sites, which are often located in different countries or even on different continents.
Redundancy is also critical within the data center itself, where it refers to the duplication of technical components and the layout of the infrastructure. It can include several measures:
These measures help ensure that a data center’s services remain available even when individual components fail. The combination of geographic redundancy and local infrastructure redundancy thus increases security in the data center.