Scaling from an IT resource perspective is the ability of the IT resource to handle increased or decreased usage demands.

The following are types of scaling:

Horizontal Scaling – scaling out and scaling in 

Vertical Scaling – scaling up and scaling down

The next two sections briefly describe each.


Horizontal Scaling

Horizontal scaling is the allocating or releasing of IT resources that are of the same type (Figure 1.1). Allocating resources horizontally is referred to as scaling out, and releasing them is referred to as scaling in. Horizontal scaling is a common form of scaling within cloud environments: virtual machines (VMs) are created as demand for resources rises and, if the demand for the server resources drops, the VMs are decommissioned or turned off.

Figure 1.1

An IT resource (Virtual Server A) is scaled out by adding more of the same IT resources (Virtual Servers B and C).
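
The scale-out and scale-in decision described above can be sketched as a small calculation. This is only an illustration under stated assumptions: the function names, the per-server capacity figure, and the minimum and maximum limits are hypothetical and do not come from any particular cloud platform.

```python
import math


def desired_instance_count(total_load, capacity_per_instance,
                           min_instances=1, max_instances=10):
    """Return how many identical servers are needed for the given load.

    All parameters are illustrative assumptions: load and capacity are
    expressed in the same arbitrary unit (for example, requests per second).
    """
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    needed = math.ceil(total_load / capacity_per_instance)
    # Clamp the result so the pool never drops below or exceeds its limits.
    return max(min_instances, min(max_instances, needed))


def scaling_action(current_instances, desired_instances):
    """Describe the horizontal scaling step between two pool sizes."""
    delta = desired_instances - current_instances
    if delta > 0:
        return f"scale out: add {delta} instance(s)"
    if delta < 0:
        return f"scale in: release {-delta} instance(s)"
    return "no change"
```

For example, if Virtual Server A is assumed to handle 100 units of load and demand rises to 250 units, desired_instance_count(250, 100) yields 3, so two more servers of the same type are added, which mirrors Virtual Servers B and C in Figure 1.1. When demand falls back to 30 units, the same calculation yields 1 and the extra servers are released.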


Vertical Scaling

When an existing IT resource is replaced by another with higher or lower capacity, vertical scaling is considered to have occurred. Specifically, the replacing of an IT resource with another that has a higher capacity is referred to as scaling up, and the replacing of an IT resource with another that has a lower capacity is referred to as scaling down. This includes increasing or decreasing the allocation of CPU, RAM, storage, and compute-network resources.

Vertical scaling is less common in cloud environments due to the downtime required while the replacement is taking place. 
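
Vertical scaling can be sketched as choosing a replacement server from an ordered list of sizes. The size names and capacities below are hypothetical values chosen for illustration, not the catalog of any real provider, and the functions are a minimal sketch rather than a definitive implementation.

```python
from typing import NamedTuple


class ServerSize(NamedTuple):
    name: str
    vcpus: int
    ram_gb: int


# Hypothetical size catalog, ordered from lowest to highest capacity.
SIZES = [
    ServerSize("small", 2, 4),
    ServerSize("medium", 4, 16),
    ServerSize("large", 16, 64),
]


def pick_size(required_vcpus, required_ram_gb, sizes=SIZES):
    """Return the smallest size that satisfies both requirements."""
    for size in sizes:
        if size.vcpus >= required_vcpus and size.ram_gb >= required_ram_gb:
            return size
    # Vertical scaling stops at the largest server that exists.
    raise ValueError("demand exceeds the largest available server size")


def vertical_action(current, target, sizes=SIZES):
    """Describe the replacement step between two sizes in the catalog."""
    current_rank = sizes.index(current)
    target_rank = sizes.index(target)
    if target_rank > current_rank:
        return f"scale up: replace {current.name} with {target.name}"
    if target_rank < current_rank:
        return f"scale down: replace {current.name} with {target.name}"
    return "no change"
```

Under these assumed sizes, a server that needs 3 vCPUs and 8 GB of RAM is scaled up from "small" to "medium". A request for 32 vCPUs raises an error because no larger size exists in the catalog, which is the hard ceiling that horizontal scaling avoids.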


The table below provides a brief comparison of horizontal and vertical scaling.

Horizontal Scaling                                      | Vertical Scaling
--------------------------------------------------------+------------------------------------------
less expensive (through commodity hardware components)  | more expensive (specialized servers)
IT resources instantly available                        | IT resources normally instantly available
resource replication and automated scaling              | additional setup is normally needed
additional IT resources needed                          | no additional IT resources needed
not limited by hardware capacity                        | limited by maximum hardware capacity

Table 1.2 A comparison of horizontal and vertical scaling.


Cloud Services

Cloud services can be delivered and made redundant through scaling (e.g., elastic load balancers or auto-scalers).

A cloud service is defined as any IT resource that is made remotely accessible via a cloud. Unlike other IT domains that fall under the service technology umbrella, such as service-oriented architecture, the term “service” within the context of cloud computing is quite broad. A cloud service can exist as a simple Web-based software program with a technical interface invoked via the use of a messaging protocol, or as a remote access point for administrative tools or larger environments and other IT resources.

In the figure below, the internal circle represents a cloud service that is a simple Web-based software program. When the cloud service instead acts as a remote access point, a different IT resource symbol may be used, depending on the nature of the access that the cloud service provides.

Figure 1.3

A cloud service with a published technical interface is being accessed by a consumer outside of the cloud (left). A cloud service that exists as a virtual server is also being accessed from outside of the cloud’s boundary (right). The cloud service on the left is likely being invoked by a consumer program that was designed to access the cloud service’s published technical interface. The cloud service on the right may be accessed by a human user who has remotely logged on to the virtual server.
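
The left-hand case, a consumer program invoking a cloud service through its published technical interface, can be illustrated with a minimal sketch. It assumes HTTP as the messaging protocol and runs the service on the local machine purely for demonstration; the /status path, the JSON fields, and the names StatusHandler and call_service are hypothetical.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class StatusHandler(BaseHTTPRequestHandler):
    """A tiny Web-based service with one published operation: GET /status."""

    def do_GET(self):
        if self.path == "/status":
            body = json.dumps({"service": "demo", "status": "ok"}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        # Suppress the default per-request logging to keep the demo quiet.
        pass


def call_service():
    """Start the service on a free local port and invoke it as a consumer."""
    server = HTTPServer(("127.0.0.1", 0), StatusHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # The consumer program only needs the published interface.
        connection = HTTPConnection("127.0.0.1", port, timeout=5)
        connection.request("GET", "/status")
        response = connection.getresponse()
        payload = json.loads(response.read().decode("utf-8"))
        connection.close()
        return payload
    finally:
        server.shutdown()
        server.server_close()
```

In a real deployment the service would run inside the cloud and the consumer outside of its boundary; the consumer code would stay the same apart from the host name and port.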


The driving motivation behind cloud computing is to provide IT resources as cloud services that encapsulate other IT resources, while offering functions for clients to use and leverage remotely. The acronym XaaS stands for ‘anything as a service’. Networks, software programs, server instances, agents, or any other IT-related resource can be defined, to some extent, as a ‘service’ within a cloud network, with some services offered as privately accessible resources and others made publicly available.