In this chapter you will learn how to scale your configuration management system.
Objectives: In this chapter you will learn how to:
Organize your code for large infrastructure;
Apply all or part of your configuration management to a group of nodes;
ansible, config management, scale
Knowledge: Complexity:
Reading time: 30 minutes
We have seen in the previous chapters how to organize our code in the form of roles but also how to use some roles for the management of updates (patch management) or the deployment of code.
What about configuration management? How to manage the configuration of tens, hundreds, or even thousands of virtual machines with Ansible?
The advent of the cloud has changed the traditional methods a bit. The VM is configured at deployment. If its configuration is no longer compliant, it is destroyed and replaced by a new one.
The organization of the configuration management system presented in this chapter will respond to these two ways of consuming IT: "one-shot" use or regular "re-configuration" of a pool of servers.
However, be careful: using Ansible to ensure a pool of servers compliance requires changing work habits. It is no longer possible to manually modify the configuration of a service manager without seeing these modifications overwritten the next time Ansible is run.
Note
What we are going to set up below is not Ansible's favorite terrain. Technologies like Puppet or Salt will do much better. Let's remember that Ansible is a Swiss army knife of automation and is agentless, which explains the differences in performance.
The first thing we have to discuss is the separation between data and Ansible code.
As the code gets larger and more complex, it will be more and more complicated to modify the variables it contains.
To ensure the maintenance of your site, the most important thing is correctly separating the variables from the Ansible code.
We haven't discussed it here yet, but you should know that Ansible can automatically load the variables it finds in specific folders depending on the inventory name of the managed node, or its member groups.
The Ansible documentation suggests that we organize our code as below:
inventories/
production/
hosts # inventory file for production servers
group_vars/
group1.yml # here we assign variables to particular groups
group2.yml
host_vars/
hostname1.yml # here we assign variables to particular systems
hostname2.yml
If the targeted node is hostname1 of group1, the variables contained in the hostname1.yml and group1.yml files will be automatically loaded. It's a nice way to store all the data for all your roles in the same place.
In this way, the inventory file of your server becomes its identity card. It contains all the variables that differ from the default variables for your server.
From the point of view of centralization of variables, it becomes essential to organize the naming of its variables in the roles by prefixing them, for example, with the name of the role. It is also recommended to use flat variable names rather than dictionaries.
For example, if you want to make the PermitRootLogin value in the sshd_config file a variable, a good variable name could be sshd_config_permitrootlogin (instead of sshd.config.permitrootlogin which could also be a good variable name).
Let's focus on a proposal for the organization of files and directories necessary for the proper functioning of a CMS (Content Management System).
Our starting point will be the site.yml file. This file is a bit like the orchestra conductor of the CMS since it will only include the necessary roles for the target nodes if needed:
Of course, those roles must be created under the roles directory at the same level as the site.yml file.
I like to manage my global vars inside a vars/global_vars.yml, even if I could store them inside a file located at inventories/production/group_vars/all.yml
By following the advice given in the Ansible documentation, you will quickly obtain a:
easily maintainable source code even if it contains a large number of roles
a relatively fast, repeatable compliance system that you can apply partially or completely
can be adapted on a case-by-case basis and by servers
the specifics of your information system are separated from the code, easily audit-able, and centralized in the inventory files of your configuration management.