
How To Understand the Chef Configuration Environment on a VPS

By:
Justin Ellingwood

Introduction

Configuration management tools provide an avenue for deploying consistent, predictable code and configurations to a variety of client computers from a centralized management server. Chef is one of the most popular configuration management tools. It uses Ruby and handles configuration by packing details into what it calls recipes.

Chef provides a way to quickly deploy entire environments instead of only single applications. In any situation where you would install a piece of software and then modify its configuration files, Chef can be used to automate this process.

In this guide, we will provide a general overview of how Chef organizes its files and what tools and systems it uses to accomplish its objectives.

If you would like to follow along, there is a tutorial on how to install Chef 12 on Ubuntu 14.04 here.

Chef Terminology

It is important to understand the different components that make up Chef.

Chef Operating Infrastructure

We will start by discussing the different components that make up the high-level deployment strategy.

The Chef system is defined by the roles that each machine or resource plays in the deployment process:

Chef Server: This is the central location that stores configuration recipes, cookbooks, and node and workstation definitions. It is the central machine that every other machine in the organization will use for deployment configuration.
Chef Nodes: Chef nodes are the deployment targets that are configured by Chef. Each node represents a separate, contained machine environment that can be on physical hardware or virtualized. Each of these operating system environments contains a Chef client application that can communicate with the Chef server.

Chef Workstations: Chef workstations are where Chef configuration details are created or edited. The configuration files are then pushed to the Chef server, where they will be available to deploy to any nodes.

The configuration of these different components allows you to have multiple workstations and nodes. Nodes can be configured as soon as they are online and connected to the server.

While the above outline gives the impression that these are separate entities, it is possible for one machine to fulfill two or all of these roles. There is also a tool called chef-solo, which allows you to forgo the use of a server and configure the computer on which it is installed.

Server Details

The server is the central control point that is accessed by all of the other Chef machines, whether as a client or a manager. It acts as a large repository, or database, of all of the configuration details.

It handles connections and permissions from nodes and workstations and organizes data so that it can easily be pulled by clients. The server can also include a web interface in order to manage or configure some details.

Node Details

As mentioned above, a node can be a physical or virtual machine. Its only requirements are that it has access to the network and can communicate with the chef server. The user running the chef software also needs to be able to install software and make system changes.

Each node communicates with the central server using an application called chef-client. This handles pulling data off of the server and executing the configuration steps necessary to get the node into its final state. The chef-client program and the chef server communicate through the use of RSA key-based authentication.
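To make the idea of key-based authentication more concrete, here is a minimal Ruby sketch of signing a message with an RSA private key and checking it with the public key. It is not the actual protocol that chef-client uses. The helper names `sign_request` and `valid_signature?`, the key size, and the request text are all assumptions made for illustration:

```ruby
require 'openssl'

# Sign a request body with the node's private key (hypothetical helper).
def sign_request(private_key, body)
  private_key.sign(OpenSSL::Digest.new('SHA256'), body)
end

# Check a signature using only the public half of the key, as a server would.
def valid_signature?(public_key, body, signature)
  public_key.verify(OpenSSL::Digest.new('SHA256'), signature, body)
end

node_key  = OpenSSL::PKey::RSA.new(2048) # stand-in for the node's key pair
body      = 'GET /nodes/web01'           # hypothetical request text
signature = sign_request(node_key, body)

puts valid_signature?(node_key.public_key, body, signature)
puts valid_signature?(node_key.public_key, body + ' tampered', signature)
```

The first check succeeds and the second fails, because the signature no longer matches the altered text. The general point is that the server only needs the public key to confirm that a request came from the holder of the private key.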

Chef-client uses a tool called ohai to collect information about the node, such as its hostname, platform, network interfaces, and CPU details. These values are used to set up certain configuration details and to populate variables contained within the configuration files.
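As a rough illustration of the kind of data involved, the following sketch gathers a few facts about the local machine using only Ruby's standard library. It does not call ohai. The `collect_facts` helper and the key names in the hash are hypothetical choices for this example:

```ruby
require 'socket'
require 'etc'
require 'rbconfig'

# Gather a few facts about the local machine into a nested hash.
# This only mimics the general shape of node data; it is not ohai's API.
def collect_facts
  {
    'hostname' => Socket.gethostname,
    'os'       => RbConfig::CONFIG['host_os'],
    'cpu'      => { 'total' => Etc.nprocessors }
  }
end

facts = collect_facts
puts "#{facts['hostname']} runs #{facts['os']} with #{facts['cpu']['total']} CPUs"
```

Configuration code can then branch on values like these, for example by choosing a package name based on the operating system.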

Workstation Details

A workstation has the tools necessary to create and modify configuration details for any of the available nodes and can communicate with the chef server to make these available.

An important tool for managing Chef on a workstation is called knife. Knife acts as a gateway through which you can configure anything that is stored on the server. It can manage nodes and configurations and can generally be used to access the server in a "chef-specific" way. While it would be possible to log into the server with SSH and manually change the data that it handles, doing so bypasses the processes that Chef implements.

Configurations and definitions that are created and modified on a workstation are committed to version control and then pushed to the server. The repository is called the chef-repo. It holds all of the data needed for the configuration of chef.

Chef Repo File Structure

Chef handles its configuration and dependency information on a workstation within a specified directory structure. It is important to understand this hierarchy in order to effectively create recipes and push changes.

As we mentioned above, the server configuration files should be kept in version control in a repository referred to as the "chef-repo". This is just a normal directory that contains the Chef files.

In this directory, we can find a structure that looks like this:

certificates/: Contains the SSL certificates that can be associated with clients for authentication.
chefignore: Lists the files and directories within the structure that should not be included in the push to the server.
config/: Contains one of the two repository configuration files
- rake.rb: Defines some variable declarations for creating SSL certificates and some general options.
cookbooks/: Contains the cookbooks that configure the infrastructure for your organization.
data_bags/: Contains various data bags for your configuration.

Data bags are sub-directories of JSON-formatted files that hold configuration data shared across the organization, such as user accounts or credentials. Nodes retrieve them from the Chef server after authenticating, and sensitive items can be encrypted so that only nodes holding the matching secret can read them.

environments/: Contains the files that define the environments that nodes can be deployed into.

Every environment that diverges from the default environment must be defined in this directory.

Rakefile: This file defines the tasks that chef can perform in its configurations.
roles/: Contains files that define the roles that can be assigned to nodes.
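Of the directories above, data_bags/ is the one whose contents are plain JSON. Each item is a JSON document with an "id" key that names it. The sketch below parses such an item with Ruby's standard library; the item contents and the `load_data_bag_item` helper are hypothetical, and inside a real recipe you would normally read items through Chef rather than parsing files yourself:

```ruby
require 'json'

# Hypothetical contents of a file such as data_bags/users/deploy.json
DEPLOY_ITEM_JSON = <<~JSON
  {
    "id": "deploy",
    "shell": "/bin/bash",
    "groups": ["www-data", "sudo"]
  }
JSON

# Parse an item and check for the "id" key that names it.
def load_data_bag_item(json_text)
  item = JSON.parse(json_text)
  raise ArgumentError, 'data bag item needs an "id"' unless item.key?('id')
  item
end

item = load_data_bag_item(DEPLOY_ITEM_JSON)
puts "#{item['id']} uses #{item['shell']}"
```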

Chef Cookbook File Structure

Within the cookbooks directory in the chef-repo, sub-directories define specific cookbooks for applications. Within each separate application configuration directory is a structure that defines how this service should be installed and what changes must be made to make it work correctly.

Each cookbook directory contains the following files and sub-directories:

The metadata.rb or metadata.json file contains metadata about the cookbook. This includes basic information like the name of the cookbook and the version, but it is also the place where dependency information is stored. If this cookbook depends on other cookbooks, it lists them in this file, and Chef makes sure those cookbooks are available on the node before the current cookbook is used.
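Because metadata.rb is ordinary Ruby, its settings read like simple method calls. The sketch below shows a hypothetical metadata file for a cookbook named "webserver" and a tiny reader for three of its settings (`name`, `version`, and `depends`). The `MetadataSketch` class is an assumption written for this example and is not how Chef itself loads metadata:

```ruby
# Hypothetical metadata.rb contents for a cookbook named "webserver".
METADATA_SOURCE = <<~RUBY
  name    'webserver'
  version '0.1.0'
  depends 'apt'
  depends 'firewall', '>= 2.0'
RUBY

# Minimal reader for the settings above; not Chef's own loader.
class MetadataSketch
  attr_reader :dependencies

  def initialize
    @dependencies = {}
  end

  # Called with a value, each setter stores it; called bare, it returns it.
  def name(value = nil)
    @name = value if value
    @name
  end

  def version(value = nil)
    @version = value if value
    @version
  end

  # Record a dependency with an optional version constraint.
  def depends(cookbook, constraint = '>= 0.0.0')
    @dependencies[cookbook] = constraint
  end

  # Evaluate the metadata text so each line calls a method above.
  def self.parse(source)
    new.tap { |metadata| metadata.instance_eval(source) }
  end
end

metadata = MetadataSketch.parse(METADATA_SOURCE)
puts "#{metadata.name} #{metadata.version} depends on #{metadata.dependencies.keys.join(', ')}"
```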

The attributes directory contains attribute definitions that can be used to override or define settings for the nodes that will have this service.
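An attribute file typically contains lines such as `default['webserver']['port'] = 80`. The following sketch illustrates the general idea of a default value being replaced by a more specific one. It models only two levels, while Chef supports more precedence levels than this, and the `deep_merge` helper and the attribute names are hypothetical:

```ruby
# Recursively merge two nested hashes; values in `higher` win.
def deep_merge(lower, higher)
  lower.merge(higher) do |_key, low, high|
    low.is_a?(Hash) && high.is_a?(Hash) ? deep_merge(low, high) : high
  end
end

# Hypothetical cookbook defaults and a more specific override.
defaults  = { 'webserver' => { 'port' => 80, 'docroot' => '/var/www' } }
overrides = { 'webserver' => { 'port' => 8080 } }

effective = deep_merge(defaults, overrides)
puts effective['webserver']['port']
puts effective['webserver']['docroot']
```

Here the port comes from the override while the document root falls back to the default, which is the behavior you generally want when a single node needs to differ from the rest in one setting.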

The definitions directory contains files that declare reusable resource definitions. These let you group several related resources together under one name.

The files directory contains static files that Chef can distribute to the node on which this cookbook is deployed.

The recipes directory contains the "recipes" that define how the service should be configured. Recipes are generally small files that configure specific aspects of the larger system. If a cookbook is used to install and configure a web server, a recipe may enable a module or set up a sane firewall default.

The templates directory is used to provide more complex configuration management. You can provide entire configuration files that contain embedded Ruby (ERB) expressions. The values that these expressions print can be defined in other files, such as attribute files or recipes.
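Templates use ERB, which ships with Ruby, so the mechanism can be sketched without Chef at all. In the example below, the template text, the variable names, and the `render_template` helper are all hypothetical. It exposes values to the template as instance variables such as `@port`, which is also how variables appear inside a Chef template:

```ruby
require 'erb'

# Hypothetical template, as it might appear under a cookbook's templates directory.
SERVER_TEMPLATE = <<~TEMPLATE
  server {
    listen <%= @port %>;
    server_name <%= @server_name %>;
    root <%= @docroot %>;
  }
TEMPLATE

# Render a template against a hash of values exposed as instance variables.
def render_template(source, variables)
  context = Object.new
  variables.each { |key, value| context.instance_variable_set("@#{key}", value) }
  ERB.new(source).result(context.instance_eval { binding })
end

puts render_template(SERVER_TEMPLATE,
                     port: 80,
                     server_name: 'example.com',
                     docroot: '/var/www/example')
```

Each `<%= ... %>` expression is replaced with the matching value, so one template can produce a different configuration file for every node.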

Conclusion

While this guide may not help you get started writing your own Chef configurations, it should give you a good overview of the individual components in a complex deployment environment. Once you understand how the node, server, and workstation interact, and can find your way around the chef-repo, you can begin to understand how some of the available cookbooks operate.

In the next article, we will discuss how to set up Chef 12 on Ubuntu 14.04 servers. Later on, we will also demonstrate how to create some of your own cookbooks and configure an environment that can be deployed to other machines within your network.
