What is a Data Center?
A data center (referred to as a DC from here on) is a facility that centralizes an organization's shared IT operations and equipment. Its purpose is to store, process, and disseminate data and applications. Services that organizations commonly rely on a DC to provide include:
- Storage and management of data
- Backup and recovery of data
- Hosting productivity applications (e.g. e-mails)
- Processing e-commerce transactions in high volumes
- Powering online gaming servers and communities
- Running extremely heavy workloads such as big data processing, machine learning training, and other AI tasks
Data Center vs. Cloud vs. Server Farm
Simply put, a data center is on-premises infrastructure: a company stores its data on hardware it owns. A cloud, on the other hand, is off-premises and keeps your data on a provider's infrastructure. Oversimplifying again, a server farm is like the little sister of a DC: nothing more than a collection of servers. It can be anything from a bitcoin mining site to a small render farm run by a well-funded freelance 3D artist.
What are the core components of data centers?
A data center is made up of three main components: compute, storage, and network. However, these are only part of a modern DC; beyond the primary components, support infrastructure is also essential.
Servers are the heart of a data center; all the computation for its different tasks relies on them. A server can be physical, virtualized, distributed across containers, or a remote node. The design of a server should meet the performance expectations of the tasks it is supposed to execute. For example, a server that runs deep learning training needs maximum tensor performance, so modern GPUs with dedicated tensor acceleration hardware are best suited to the massive amount of matrix multiplication involved. On the other hand, a server that runs scientific simulations is usually better served by modern CPUs with strong AVX-512 performance, which speeds up the floating-point vector math such workloads often depend on.
A data center hosts a massive amount of sensitive information, either for its own use or for the use of its customers. Reliability, speed, and volume are the three main factors to consider when designing the storage of a data center. One DC owner might prioritize one factor over another depending on its needs.
However, this is not a black-and-white choice, and the options need to be balanced against each other. Software-defined storage (SDS), like other software-defined solutions, can help flex the storage of a data center to meet different expectations, compared with older technologies such as storage-area networks (SAN) and network-attached storage (NAS). Software-defined storage is decoupled from the underlying physical hardware, making it as scalable as its software architecture allows. As an example, an SDS can be built on container platforms such as Docker, a flexible platform that is compatible with orchestration platforms such as Kubernetes, which is another service provided by GreenWeb+.
Finally, the servers of the data center need to connect to each other, and the DC as a whole needs to connect to the outside world. This is where networking comes in. Network equipment includes routers, firewalls, switches, and cabling. A properly designed and structured network is expected to handle high volumes of traffic without failing and without performance hits. One typical topology used in DC networks is the three-tier topology. The access layer is where the servers connect, the core layer is the backbone that ties the whole network together, and the aggregation layer in the middle connects the access layer to the core layer. Finally, switches at the edge connect the data center to the internet.
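To make the three-tier idea more concrete, here is a minimal Python sketch. It models a small, purely hypothetical topology as an adjacency list (all device names such as "access-1" or "core-1" are invented for illustration) and uses a breadth-first search to show which layers traffic crosses between two servers. It is only an illustration of the layering, not a model of any real network.

```python
from collections import deque

# Hypothetical three-tier topology; every name here is illustrative only.
LINKS = {
    "core-1": ["agg-1", "agg-2", "edge-1"],
    "agg-1": ["core-1", "access-1", "access-2"],
    "agg-2": ["core-1", "access-3"],
    "access-1": ["agg-1", "server-a"],
    "access-2": ["agg-1", "server-b"],
    "access-3": ["agg-2", "server-c"],
    "edge-1": ["core-1", "internet"],
    "server-a": ["access-1"],
    "server-b": ["access-2"],
    "server-c": ["access-3"],
    "internet": ["edge-1"],
}


def shortest_path(links, src, dst):
    """Breadth-first search returning the hop-by-hop path from src to dst."""
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == dst:
            return path
        for neighbor in links.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None  # no route between the two nodes


if __name__ == "__main__":
    # Servers under the same aggregation switch never touch the core.
    print(shortest_path(LINKS, "server-a", "server-b"))
    # Servers under different aggregation switches go through the core.
    print(shortest_path(LINKS, "server-a", "server-c"))
```

In this sketch, traffic between two servers that share an aggregation switch stays below the core, while traffic to a server in another part of the DC, or to the internet, has to cross the core layer.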
Hyperscale network security and software-defined networking can bring the scalability and agility of cloud networks to on-premises networks, letting them scale appropriately as demand increases. Container orchestration can even be added on top, just as it can for the DC's storage.
Protecting data centers from all types of vulnerabilities is critical, because important assets and data are stored and processed in them. They therefore need a reliable support infrastructure made up of power safety components and ambient control systems. A power safety infrastructure is made up of high-capacity power subsystems, uninterruptible power supplies (UPS) that protect equipment against irregular voltages, and backup generators for power outages. The ambient control system must include ventilation and cooling systems, as well as protections such as passive and active fire protection and suppression systems and building security systems.
Industry standards for the support infrastructure of data centers are published by organizations such as the TIA (Telecommunications Industry Association) and the Uptime Institute. These standards help data center owners design, construct, and maintain their facilities.
The building security systems discussed above are not enough on their own to protect a data center facility. A DC network requires a zero-trust analysis incorporated into its design. Firewalls and web application firewalls (WAF), data access controls, Intrusion Prevention Systems (IPS), and Web Application & API Protection (WAAP) systems are important parts of the security setup. They need to be specified properly to ensure their scale meets the demands of the data center.
If your data center is going to use a third-party storage provider (such as a cloud services provider), it is important to understand the security measures of that third party. You have to invest as much as needed to achieve the highest possible level of security, because the information needs to be kept safe.
How do data centers work?
A data center contains multiple physical or virtual servers that are connected to each other internally, externally, or both, through networking and communication equipment. They communicate with each other to access, transfer, and store digital information. Each server is equipped with a processor, storage space, and memory, much like a personal computer but at a far larger scale. A data center uses software to cluster the servers and, in some cases, to distribute the workload across multiple servers.
Basically, a data center is meant to run applications that are too heavy for a personal computer or even a single server. Oversimplifying, it is one extremely big, powerful computer. Although data centers are used to run specific applications, such as big data processing, machine learning training, scientific simulations, and hosting large-scale e-commerce websites and online games, they also have to run their own services. DC services are typically deployed to protect the performance and integrity of the data center's core components. We can usually put these services into two general categories.
Network security system
DC network security covers the support systems that keep data center operations, applications, and data safe from threats. These systems are both physical (such as hardware firewalls and physical gates at the data center's location) and digital (such as data encryption software and software firewalls).
A firewall is a filtering device that separates LAN segments, giving each segment a different security level and establishing a security perimeter that controls the traffic flow between segments. Firewalls are mostly placed at the internet edge of the network (where the local network meets the global internet) to act as a gate. That is because the internal network is mostly secure, while the internet is an unsecured area; this placement lets the firewall meet its main goal, which is separating the secure and insecure parts of the network. A slow firewall can bottleneck the network's whole internet connection, so firewalls are expected to have high-performance capabilities.
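The filtering behaviour described above can be sketched in a few lines of Python. This is a simplified illustration, not how any particular firewall product works: the `Rule` class, the `decide` function, the first-match-wins policy, the default-deny fallback, and all addresses and ports are assumptions made for the example. Only the standard-library `ipaddress` and `dataclasses` modules are used.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class Rule:
    action: str           # "allow" or "deny"
    src: str              # source segment in CIDR notation
    dst: str              # destination segment in CIDR notation
    port: Optional[int]   # None matches any destination port


# Hypothetical segments: 10.0.1.0/24 is a web segment, 10.0.2.0/24 a database segment.
RULES = [
    Rule("allow", "0.0.0.0/0", "10.0.1.0/24", 443),      # internet -> web servers (HTTPS)
    Rule("allow", "10.0.1.0/24", "10.0.2.0/24", 5432),   # web servers -> database
]


def decide(rules, src_ip, dst_ip, port):
    """Return the action of the first matching rule, or "deny" if none match."""
    src, dst = ip_address(src_ip), ip_address(dst_ip)
    for rule in rules:
        if (src in ip_network(rule.src)
                and dst in ip_network(rule.dst)
                and rule.port in (None, port)):
            return rule.action
    return "deny"  # assumed default: anything not explicitly allowed is blocked


if __name__ == "__main__":
    print(decide(RULES, "203.0.113.7", "10.0.1.10", 443))   # internet to web segment
    print(decide(RULES, "203.0.113.7", "10.0.2.5", 5432))   # internet to database segment
```

The point of the sketch is the perimeter idea: outside traffic can reach the less sensitive segment on one port, while the more sensitive segment only accepts traffic from a trusted internal segment.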
An Intrusion Detection System, or IDS for short, is a real-time system that detects intruders and suspicious activity on the network and reports them to a monitoring system. It is often also expected to block and mitigate intrusions in progress and to harden the system against similar future attacks, although strictly speaking that active role belongs to an Intrusion Prevention System (IPS). An IDS has two main components: sensors and IDS management. Sensors are the software agents that analyze the traffic on the network and the utilization of the data center's resources. The IDS management system administers and configures the sensors, and it also collects and logs all the alarm information the sensors generate. Basically, sensors are like surveillance cameras, and IDS management is like the monitoring room and control center.
Security, like other topics in this article, deserves a dedicated article to discuss its different aspects. It starts with the management of the company or organization that owns the data center, continues with physical protections, and ends with cybersecurity, implemented both physically in hardware and virtually in software.
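The split between sensors and IDS management can be illustrated with a small Python sketch. Everything in it is hypothetical: the `Sensor` and `Manager` classes, the event format, and the detection rule (flagging a source after a number of failed logins) were chosen only to keep the example short. Real IDS products use far richer signatures and anomaly models.

```python
from collections import Counter


class Sensor:
    """Analyzes events and raises alarms; here it only counts failed logins."""

    def __init__(self, name, threshold):
        self.name = name
        self.threshold = threshold

    def analyze(self, events):
        """Return one alarm per source whose failed-login count reaches the threshold."""
        failures = Counter(e["src"] for e in events if e["type"] == "login_failed")
        return [
            {"sensor": self.name, "src": src, "count": count}
            for src, count in sorted(failures.items())
            if count >= self.threshold
        ]


class Manager:
    """Configures the sensors and collects their alarms into one log."""

    def __init__(self, sensors):
        self.sensors = sensors
        self.alarm_log = []

    def configure(self, threshold):
        """Push a new detection threshold to every sensor."""
        for sensor in self.sensors:
            sensor.threshold = threshold

    def collect(self, events_by_sensor):
        """Run each sensor on its own events and append the alarms to the log."""
        for sensor in self.sensors:
            events = events_by_sensor.get(sensor.name, [])
            self.alarm_log.extend(sensor.analyze(events))
        return self.alarm_log


if __name__ == "__main__":
    manager = Manager([Sensor("net-sensor", threshold=3)])
    events = {
        "net-sensor": [{"src": "198.51.100.9", "type": "login_failed"}] * 3
        + [{"src": "10.0.0.5", "type": "login_failed"}]
    }
    print(manager.collect(events))
```

The sensors do the watching, while the manager owns the configuration and the alarm log, which mirrors the surveillance tool and control room comparison above.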
Application delivery assurance system
An application delivery assurance system (ADAS) is a network appliance used to ensure that the demands of the data center's users are met. It includes various mechanisms that provide resiliency and availability, mainly through automatic failover and load balancing.
Data center consolidation refers to strategies and technologies that optimize the efficiency of the IT architectures involved in the DC. Consolidation can be done either physically, by merging multiple DCs together, or by making a single data center run more effectively and use fewer resources.
Different reasons can drive DC consolidation, such as finite data storage resources or legacy systems with room for improvement, among many others.
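As a rough illustration of how load balancing and automatic failover fit together, the following Python sketch rotates requests across a set of backends and skips any backend that has been marked unhealthy. The `LoadBalancer` class, the round-robin policy, and the backend names are assumptions made for the example; real appliances combine health probes with more sophisticated balancing policies.

```python
from itertools import cycle


class LoadBalancer:
    """Round-robin balancer that fails over past backends marked as down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._ring = cycle(self.backends)  # endless rotation over all backends

    def mark_down(self, backend):
        """Record a failed health check so the backend stops receiving traffic."""
        self.healthy.discard(backend)

    def mark_up(self, backend):
        """Return a recovered backend to the rotation."""
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        """Return the next healthy backend, trying each backend at most once."""
        for _ in range(len(self.backends)):
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends available")


if __name__ == "__main__":
    lb = LoadBalancer(["srv-a", "srv-b", "srv-c"])  # hypothetical server names
    print([lb.pick() for _ in range(4)])
    lb.mark_down("srv-b")                           # simulate a failed server
    print([lb.pick() for _ in range(3)])
```

While all backends are healthy, requests are spread evenly; once one is marked down, its share is silently absorbed by the others, which is the failover behaviour the user never has to notice.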
Why are data centers important?
Data centers are the infrastructure behind almost all of the important digital services of the 21st century. They support nearly every computation, data storage, network, and business application for the enterprise. For businesses that run on computers, the data center is so important that it effectively is the business.
Use cases of data centers
One of the most important parts of an enterprise is its data center. Data centers support business applications and provide all kinds of digital services, including but not limited to:
Many data centers are designed specifically for data storage. Cloud storage services such as Google Drive, Apple's iCloud, Dropbox, Microsoft OneDrive, and many more are examples of data centers being used for data storage and management. They are used for file sharing, for freeing up space on local storage, and for backup and recovery.
Applications such as email services (Gmail, Outlook, etc.), social networks (Instagram, YouTube, etc.), messengers (WhatsApp, Signal, etc.), and many other categories of productivity applications run their host application in a data center. The massive amount of data transferred every moment, and the massive amount of online processing most of that data needs, demands a high-end data center so that the end user's demand is always met. DCs are therefore usually the best solution for them.
A small or medium-sized e-commerce business may not need a data center, and a cloud server is probably more than enough for it. But at larger scales, using a data center is not just preferable but a necessity. Customers might regret using the application if it is slow, buggy, or, even worse, unstable. So a performant, trustworthy data center with minimal downtime is a must for large-scale e-commerce businesses such as Amazon.
Some online games can use their players' computers as distributed host servers instead of a data center. But this is only possible when the developer is sure that most players' computers can handle the hosting. Lightweight games may be able to do this, but more demanding games generally cannot.
Demanding games need to be optimized as much as possible, performance-wise, to make sure the player experiences smooth gameplay, so adding the processing power required for the host application on top is the wrong decision. Such games can also have a massive number of players online simultaneously. In addition, even a small latency difference can make the gameplay unfair and drive players to leave the game for good. So having a high-performance data center that is distributed around the world, to keep ping as uniform as possible across regions, is exceptionally important for a developer who wants players to keep playing.
Big data and AI
Big data is data with three characteristics: high variety, high volume, and high velocity (also known as the 3 Vs). Its uses vary depending on whether it is used by a business, an organization, or a government. No matter the use case, these datasets are generally too heavy to be handled by anything other than a data center. Businesses usually need to analyze this data to better understand their consumers, their needs, their feedback, and how they use the products. Organizations, meanwhile, might use it for forecasting, for finding specific information buried in a massive amount of seemingly random data, for medical purposes, and more.
Artificial intelligence, specifically machine learning, can benefit a lot from big data analysis in data centers, so there is a strong relationship between the two. Machine learning models can be fed with big data to learn the tasks they are supposed to perform. In the opposite direction, big data analysis can be accelerated using artificial intelligence.
What are the types of data centers?
Data center designs are like fingerprints: no two are alike, either in design or in the applications and data they support with their networking, computing, and storage infrastructure. In this article, however, we will look at five of the most common types of data centers.
An enterprise data center is a private data center facility that supports a single organization. This type of DC is best suited for companies that either have unique demands for their DC or do enough business to take advantage of economies of scale. The most important benefit of an enterprise DC is that it is fully customizable to meet the needs and demands of the company that owns it. The white space of these DCs (the area that houses the IT equipment and infrastructure) is typically managed by the in-house IT department, and only the grey space (the area that houses the back-end DC components and equipment) can be outsourced.
Colocation data centers are also known as multi-tenant data centers. They serve businesses that want to host their servers offsite. Companies that offer colocation DCs also provide the components needed to host the servers: cooling, power, networking, and security.
This kind of DC is mostly suitable for businesses that cannot run an enterprise data center of their own. The reason is either limited space or limited staff (not having a big enough IT department to dedicate a team to managing the data center). Colocation DCs allow such businesses to redirect financial and personnel resources to other priorities. They are especially useful for businesses that need their data center to be distributed across the globe, such as online game developers.
Hyperscale data centers, as the name suggests, are designed to support very large-scale IT infrastructures. They are a rarer kind of data center, with only around 700 of them in the world. Although they are limited in number, their power is comparable to that of a much larger group of non-hyperscale DCs working together. A typical hyperscale data center has at least 5,000 servers and about 1,000 m² of floor space. Like enterprise data centers, they are owned and operated by the company that uses them.
Edge & Micro
These types of data centers are small and located close to the users they serve. They handle real-time data processing and analysis and execute the required actions. This design makes low-latency communication with smart devices and IoT possible.
Modular or Container
A modular data center is a module or a shipping container packaged with plug-and-play, ready-made DC components. Such a package must include servers, storage, networking equipment, power equipment, security equipment, environmental control systems, and more. Modular DCs are usually used at construction sites or in disaster areas, but they are also used on permanent sites, where they allow organizations to scale an existing data center quickly to meet increasing demand.
While technology is growing faster than ever, a large IT-related business may soon demand much more than a typical cloud can provide. That is why we at GreenWeb+ decided to provide solutions for those businesses and make implementing their data center easier. We provide various types of services for you and your business, making it possible even for companies with small IT teams to have their own on-premises data center.