Chapter 28. Advanced Topics

Table of Contents

28.1. Datastores
28.2. Locks
28.3. Installing ConfD on a target system
28.4. Configuring ConfD
28.5. Starting ConfD
28.6. ConfD IPC
28.7. Restart strategies
28.8. Security issues
28.9. Running ConfD as a non privileged user
28.10. Storing encrypted values in ConfD
28.11. Disaster management
28.12. Troubleshooting
28.13. Tuning the size of confd_hkeypath_t
28.14. Error Message Customization
28.15. Using a different version of OpenSSL
28.16. Using shared memory for schema information
28.17. Running application code inside ConfD

28.1. Datastores

A datastore is a complete set of configuration parameters for the device stored and manipulated as a single entity.

A datastore can be locked in its entirety with a global write lock.

ConfD supports three different named configuration datastores - running, startup, and candidate. The respective datastores support a set of capabilities as explained below:

running

The running datastore contains the complete configuration currently active on the device. Running can be configured to support the read-write or the writable-through-candidate mode. Writable-through-candidate means that running can only be modified by making changes to the candidate datastore (see below), and committing these changes from the candidate to running.

startup

The startup datastore is a persistent datastore which the device reads every time it reboots.

If running is read-write and the device has a startup datastore, a manager can try changes by writing them to running. If things look good, the changes can be made persistent by copying them to startup. This ensures that the device uses the same configuration after reboot.

candidate

The candidate datastore is used to hold configuration data that can be manipulated without impacting the current configuration. The candidate configuration is a full configuration data set that serves as a workspace for creating and manipulating configuration data. Additions, deletions, and changes may be made to this data to construct the desired configuration.

The candidate datastore can be committed, which means that the device's running configuration is replaced with the contents of the candidate datastore.

The candidate can be used in two different modes, with different characteristics:

  • It can be modified without first taking a lock on the datastore. If it is modified outside a lock, it is marked as being dirty. When the candidate is dirty it means that it is (potentially) different from the running configuration. When it is dirty, a lock cannot be taken. It leaves the dirty state by being committed to running, or by discarding all changes (which effectively resets it to the contents of running).

  • If the candidate is not dirty, and a lock is taken, no one but the owner of the lock can modify the database. If changes are made to the candidate while it is locked, and the owner unlocks it (or closes the CLI, Web UI or NETCONF session), all changes are discarded, and the datastore is unlocked.

The candidate can be committed to running with a specified timeout. In this case, running is set to the contents of the candidate. If a second commit, called a confirming-commit, is given within the timeout, the changes are made permanent. If no confirming-commit is given within the timeout period, running is reverted to the state it had before the first commit.
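Over NETCONF, a confirmed commit of the candidate can be requested like this (a sketch assuming the server advertises the standard :candidate and :confirmed-commit capabilities; the timeout is given in seconds):

```xml
<rpc message-id="101" xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
  <commit>
    <confirmed/>
    <confirm-timeout>120</confirm-timeout>
  </commit>
</rpc>
```

If a plain <commit/> (the confirming-commit) is not received within the timeout, running is reverted to its previous state.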

A project using ConfD must choose a valid combination of datastores to support. Which combination to choose depends on the system resources available on the device, and which characteristics the end-product should have.

The following is a list of valid combinations:

running in read-write mode, no startup, no candidate
  • A single, non-volatile datastore is used.

  • Once changes are written to the datastore, they are persistent, and cannot automatically be rolled back.

  • The application needs to react to changes to the database. If CDB is used, this means that the application must use the subscription mechanism.

running in read-write mode and startup
  • startup is stored in non-volatile memory, and running in read-write RAM.

  • The application needs to be written in such a way that it reacts to changes to the database. If CDB is used, this means that the application must use the subscription mechanism.

running in read-write mode and candidate
  • Both running and candidate are stored in non-volatile memory.

  • NOTE: This combination is NOT RECOMMENDED. When a manager reconfigures a node that has the candidate and also read-write running, the manager can never know that running is up to date with the candidate and must thus always (logically) copy running to the candidate prior to modifying the candidate. This introduces unnecessary overhead, and makes automation more complicated.

  • The application needs to react to changes to the database. If CDB is used, this means that the application must use the subscription mechanism.

  • In this mode, running can be modified without going through the candidate. This means that a client that wishes to work with the candidate may need to copy running into the candidate, to ensure that no changes to running are lost when the candidate is committed.

running in writable-through-candidate mode and candidate
  • Both running and candidate are stored in non-volatile memory, but the candidate can efficiently be implemented as a diff against running.

  • The application needs to react to changes to the database. If CDB is used, this means that the application must use the subscription mechanism.

  • In this mode, all changes always go through the candidate, so a client never has to copy running to the candidate in order not to lose any data.

ConfD ensures that running and startup are always consistent, in the sense that the validation constraints defined in the data model hold. The candidate is allowed to be temporarily inconsistent, but if it is committed to running, it must be valid.

ConfD by default implements the datastores chosen in CDB. However, ConfD can also be configured to use an external database. If an external database is used, this database must implement the running and startup datastores if applicable. If the candidate is used, it may be implemented with CDB or as an external database.

28.2. Locks

This section will explain the different locks that exist in ConfD and how they interact. It is important to understand the architecture of ConfD with its management backplane, and the transaction state machine as described in Section 7.5, “User sessions and ConfD Transactions” to be able to understand how the different locks fit into the picture.

28.2.1. Global locks

The ConfD management backplane keeps a lock for each datastore: running, startup and candidate. These locks are usually referred to as the global locks and they provide a mechanism to grant exclusive access to the datastore the lock guards.

The global locks are the only locks that can explicitly be taken through a northbound agent, for example by the NETCONF <lock> operation, or by calling maapi_lock().

A global lock can be taken for the whole datastore, or it can be a partial lock (for a subset of the datamodel). Partial locks are exposed through NETCONF and MAAPI.

An agent can request a global lock to ensure that it has exclusive write-access to a datastore. When a global lock is held by an agent it is not possible for anyone else to write to the datastore the lock guards - this is enforced by the transaction engine. A global lock on a datastore is granted to an agent if there are no other holders of it (including partial locks), and if all dataproviders approve the lock request. Each dataprovider (CDB and/or external dataproviders) will have its lock() callback invoked to get a chance to refuse or accept the lock. The output of confd --status includes locking status: for each user session, any locks held per datastore are listed.

28.2.2. Transaction locks

A northbound agent starts a user session towards ConfD's management backplane. Each user session can then start multiple transactions. A transaction is either read/write or read-only and is always started against a specific datastore.

The transaction engine has its internal locks, one for every datastore. These transaction locks exist to serialize configuration updates towards the datastore and are separate from the global locks.

When a northbound agent wants to update a datastore with a new configuration, it implicitly grabs and releases the transaction lock corresponding to the datastore it is trying to modify. The transaction engine takes care of managing the locks as it moves through the transaction state machine, and there is no API that exposes the transaction locks to the northbound agents.

When the transaction engine wants to take a lock for a transaction (for example when entering the validate state) it first checks that no other transaction has the lock. Then it checks that no user session has a global lock on that datastore. Finally each dataprovider is invoked by its trans_lock() callback.

28.2.3. Northbound agents and global locks

In contrast to the implicit transactional locks, some northbound agents expose explicit access to the global locks. This is done a bit differently by each agent.

The management API exposes the global locks by providing the maapi_lock() and maapi_unlock() functions (and the corresponding maapi_lock_partial() and maapi_unlock_partial() for partial locking). Once a user session has been established (or attached to), these functions can be called.

In the CLI the global locks are taken when entering different configure modes as follows:

configure exclusive

When the candidate datastore is enabled both the running and candidate global locks will be taken.

configure exclusive

When the candidate datastore is disabled and the startup datastore is enabled both running (if enabled) and startup global locks are taken.

configure private | shared

Does not grab any locks.

The global locks are then kept by the CLI until either the configure mode is exited, or in the case of commit confirmed <timeout> the lock is released when it returns.

The Web UI behaves in the same way as the CLI (it presents three edit tabs called "Edit private", "Edit exclusive", and "Edit shared", which correspond to the CLI modes described above).

The NETCONF agent translates the <lock> operation into a request for the global lock for the requested datastore. Partial locks are also exposed through the partial-lock rpc.
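For reference, the <lock> operation as sent by a NETCONF manager is a standard RFC 6241 request (the target can be <running/>, <startup/>, or <candidate/>):

```xml
<rpc message-id="102" xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
  <lock>
    <target><candidate/></target>
  </lock>
</rpc>
```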

28.2.4. External data providers

Implementing the lock() and unlock() callbacks is not required of an external dataprovider. ConfD will never try to initiate the trans_lock() state transition (see the transaction state diagram in Section 7.5, “User sessions and ConfD Transactions”) towards a data provider while a global lock is taken - so the only reason for a dataprovider to implement the locking callbacks is if someone other than ConfD can write to (or lock, for example to take a backup of) the dataprovider's database.

28.2.5. CDB

CDB ignores the lock() and unlock() callbacks (since the data-provider interface is the only write interface towards it).

CDB has its own internal locks on the database. The running and startup datastores each have a single write lock and multiple read locks. It is not possible to grab the write-lock on a datastore while there are active read-locks on it. The locks in CDB exist to make sure that a reader always gets a consistent view of the data (in particular it becomes very confusing if another user is able to delete configuration nodes in between calls to get_next() on YANG list entries).

During a transaction, trans_lock() takes a CDB read-lock towards the transaction's datastore, and write_start() tries to release the read-lock and grab the write-lock instead.

A CDB external client (usually referred to as an MO, managed object) implicitly takes a CDB read-lock between cdb_start_session() and cdb_end_session() on the specified datastore (running or startup). This means that while an MO is reading, a transaction can not pass through write_start() (and conversely a CDB reader can not start while a transaction is in between write_start() and commit() or abort()).

The Operational store in CDB does not have any locks. ConfD's transaction engine can only read from it, and the MO writes are atomic per write operation.

28.2.6. Lock impact on user sessions

When a session tries to modify a datastore that is locked in some way, the operation will fail. For example, the CLI might print:

admin@host% commit
Aborted: the configuration database is locked
[error][2009-06-11 16:27:21]
        

Since some of the locks are short lived (such as a CDB read lock), ConfD can be configured to retry the failing operation for a short period of time. If the datastore is still locked after this time, the operation fails.

To configure this, set /confdConfig/commitRetryTimeout in confd.conf.
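For example (a sketch; the value is an xs:duration, and PT5S is an assumed example value, not the default):

```xml
<confdConfig xmlns="http://tail-f.com/ns/confd_cfg/1.0">
  <!-- retry an operation that hits a short-lived lock for up to 5 seconds -->
  <commitRetryTimeout>PT5S</commitRetryTimeout>
</confdConfig>
```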

28.3. Installing ConfD on a target system

The ConfD installation package contains both binaries for the target system and a development environment including documentation. Many of these files are not needed on a target, and can be excluded. Additional files can be removed depending on the feature configuration on the target.

In the following description, $CONFD_DIR refers to the directory where ConfD has been installed.

A minimal example set of files on a target system can be:

$CONFD_DIR/bin/confd
$CONFD_DIR/bin/confd_cli
$CONFD_DIR/etc/confd/*
$CONFD_DIR/lib/confd/bin/confd.boot
$CONFD_DIR/lib/confd/lib/core/*
$CONFD_DIR/lib/confd/lib/cli/*
      

$CONFD_DIR/bin/confd_cli is the command line interface (CLI) agent program and can be removed together with $CONFD_DIR/lib/confd/lib/cli/ if the CLI is not used.

$CONFD_DIR/etc/confd/ contains configuration files.

Support for Symmetric Multiprocessing (SMP) introduces some overhead both in CPU and memory usage, and in order to give optimal performance in all scenarios, the installation includes two separate executables for the ConfD daemon, $CONFD_DIR/lib/confd/erts/bin/confd (no SMP support) and $CONFD_DIR/lib/confd/erts/bin/confd.smp (with SMP support). If ConfD will always be run either with or without SMP support, one of these executables can be removed. See also the --smp option in the confd(1) manual page.

If $CONFD_DIR/lib/confd/erts/bin/confd is removed, ConfD will always run with SMP support, although with a single thread on a single-processor system or if it is started with --smp 1. If $CONFD_DIR/lib/confd/erts/bin/confd.smp is removed, ConfD will never run with SMP support, and the --smp option has no effect other than refusing to start the daemon if the argument is bigger than 1.

Files associated with certain features can be removed if the system is set up not to use them:

The CLI agent
$CONFD_DIR/bin/confd_cli
$CONFD_DIR/lib/confd/lib/cli/
The NETCONF server
$CONFD_DIR/lib/confd/lib/netconf/
The Web UI and REST server
$CONFD_DIR/lib/confd/lib/webui/
The Web UI frontend
$CONFD_DIR/var/confd/webui/
The SNMP agent and gateway
$CONFD_DIR/bin/smidump
$CONFD_DIR/lib/confd/lib/snmp/

smidump is only used for producing YANG files - it is not used by ConfD itself, and therefore not likely to be needed on the target.

The integrated SSH server

The integrated SSH server is not needed if OpenSSH is used to terminate SSH for NETCONF and the CLI:

$CONFD_DIR/lib/confd/lib/core/ssh*
The confdc compiler

The compiler can be removed unless we plan to compile YANG files on the target.

$CONFD_DIR/bin/confdc
$CONFD_DIR/lib/confd/lib/confdc
The AAA bridge

See the documentation on AAA - basically this is a pre-compiled example program which probably won't be used on the target:

$CONFD_DIR/lib/confd/lib/core/capi/priv/confd_aaa_bridge

28.4. Configuring ConfD

When ConfD is started, it reads its configuration file and starts all subsystems configured to start (such as NETCONF, CLI etc.). If a configuration parameter is changed, ConfD can be reloaded by issuing:

$ confd --reload

This command also tells ConfD to close and reopen all log files, which makes it suitable to use from a system like logrotate.

There is also another way, whereby the ConfD configuration parameters that can be changed at runtime are loaded from an external namespace, thus allowing the user to store ConfD's configuration in ConfD itself (specifically in CDB). This is described further down.

28.4.1. Using the configuration file

On a typical system, the configuration data resides in ConfD's database CDB. Some of the parameters in the configuration are intended for the target OS environment, such as the IP address of the management interface. The OS reads this information from its own configuration files, such as /etc/conf.d/net. This means that the application typically reads this data from CDB, and generates configuration files needed by the system before starting them. If a manager changes one of these parameters, the application subscribes to changes in CDB, regenerates the files, and restarts the system daemons. This mechanism can also be used for the configuration of ConfD itself. The application must subscribe to changes to any parameter affecting ConfD (such as management IP address), update the ConfD configuration file confd.conf, and then instruct ConfD to reload it.

ConfD comes bundled with a small example tool which can be used to patch confd.conf files: $CONFD_DIR/src/confd/tools/xmlset.c. This tool uses the light-weight Expat XML Parser (http://expat.sourceforge.net/).

This example changes confd.conf to disable the Web UI:

$ xmlset C false confdConfig webui enabled < confd.conf

This example changes confd.conf to remove the encryptedStrings container:

$ xmlset R confdConfig encryptedStrings < confd.conf

28.4.2. Storing ConfD configuration parameters in CDB

In the ConfD distribution, the $CONFD_DIR/src/confd/dyncfg directory includes the confd_dyncfg.yang YANG module. The module defines the namespace http://tail-f.com/ns/confd_dyncfg/1.0, which contains all the ConfD configuration parameters that can be modified at runtime. That is, it is a subset of the namespace that defines the ConfD configuration file (confd.conf).

To enable the feature of storing ConfD's configuration in CDB, the setting /confdConfig/runtimeReconfiguration has to be set to namespace in the configuration file. This instructs ConfD to read all its "static" configuration from the configuration file, and then load the rest of the configuration from the confd_dyncfg namespace (which must be served by CDB). A requirement is that confd_dyncfg.fxs is in ConfD's loadPath. It is also advisable to have a suitable _init.xml file in ConfD's CDB directory.
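A minimal confd.conf fragment enabling the feature might look like this (a sketch; the surrounding confdConfig element is abbreviated):

```xml
<confdConfig xmlns="http://tail-f.com/ns/confd_cfg/1.0">
  <!-- load runtime-changeable parameters from the confd_dyncfg namespace -->
  <runtimeReconfiguration>namespace</runtimeReconfiguration>
</confdConfig>
```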

The best way to understand how to use this feature is the example confdconf/dyncfg in the bundled example collection.

In most cases the interesting use of this feature is to be able to expose a particular aspect of ConfD's configuration to the end-user and hide the rest. This can be achieved by combining the --export none flag when compiling the confd_dyncfg.yang module with the symlink feature (exactly how these work is explained in Section 10.7, “Hidden Data”). The snmpa/6-dyncfg example in the example collection shows how to expose a small subset of the SNMP agent configuration (as well as some minor aspects of the CLI parameters) in a private namespace.

For example, if we want to expose the listen port of ConfD's built-in SNMP agent as the end-user configurable leaf /sys/snmp-port, we could write a YANG model like this:

container sys {
  tailf:symlink snmp-port {
    tailf:path "/dyncfg:confdConfig/dyncfg:snmpAgent/dyncfg:port";
  }
}

When a transaction containing changes to /confdConfig is committed ConfD will pick up the changes made and act accordingly. Thus there is no longer a need for confd --reload except for closing/re-opening of log-files (as described above) or to update the fxs files for sub-agents.

When /confdConfig/runtimeReconfiguration is set to namespace, any settings in confd.conf for the parameters that exist in the confd_dyncfg namespace are ignored, with one exception: the configuration under /confdConfig/logs. This configuration is needed before CDB has started, and ConfD will therefore initially use the settings from confd.conf, with the CDB settings taking precedence once CDB has started (i.e. when the transition to phase1 is completed).

28.5. Starting ConfD

By default, ConfD starts in the background without an associated terminal. If it is started as confd --foreground, it starts in the foreground attached to the current terminal. This feature can be used to start ConfD from a process manager. In order to properly stop ConfD in the foreground case, close ConfD's standard input, or use confd --stop as usual. When ConfD is started in the foreground, the commands confd --wait-phase0 and confd --wait-started can be used to synchronize the startup sequence. See below for more details.

If startup or candidate with confirming-commit is used, the system might need to use a configuration which is different from the previous running when it reboots. An example of this is if startup is used, and a manager writes a configuration into running which renders the device unstable, and it is rebooted. It might be that the management IP address used by the OS is not the one that should be used (if it was changed before reboot). We'd like to be able to change this address in the OS configuration files before bringing up the interface. But we don't know the address until ConfD has been started, and ConfD itself needs to listen to this address. To solve this dilemma, ConfD's startup sequence can be split into several phases. The first phase brings up the ConfD daemon, but no subsystems that listen to the management IP address (such as NETCONF and CLI). This phase must be started after the loopback interface has been brought up, since the loopback interface is used to communicate between the application and ConfD.

It is also necessary to use the start phases when CDB is used and semantic validation via external callbacks has been implemented. CDB will validate the new configuration when ConfD is started without an existing database, as well as when a schema upgrade has caused configuration changes. This validation is done on the transition to phase1, which means that validation callbacks must be registered before this.

Note

If an application has both validation callbacks and other callbacks (e.g. data provider), and uses the same daemon structure and control socket through all the phases, it must register all the callbacks in phase0. This is because the confd_register_done() function (see confd_lib_dp(3)) must be called after all registrations are done, and no callbacks will be invoked before this function has been called. The tables below reflect this requirement, but it is also possible to register all callbacks in phase0, which may simplify the startup sequence (however CDB subscribers can not be added until phase1).

The sequence to start up the system should be like this:

  1. bring up the loopback interface

  2. confd --start-phase0

  3. start applications that implement validation callbacks

  4. confd --start-phase1

  5. start remaining applications, read from CDB

  6. potentially update confd.conf and do confd --reload

  7. bring up the management interface

  8. confd --start-phase2

Note that if ConfD is started without any parameters, it will bring up the entire system at once.

This table summarizes the different start-phases and what they do.

Table 28.1. ConfD Start Phases

Command line / When the command returns ConfD has / After which the application can/should
confd --start-phase0
  • If upgrading or initializing CDB, created an initial transaction.

  • If upgrading or initializing, the application can modify the initial transaction

  • Register validation callbacks

  • Possibly register external data-providers, transformations, etc (see Note above)

  • Setup notification sockets

  • Connect to HA

confd --start-phase1
  • If upgrading or initializing CDB, committed initial transaction

  • Make HA state transitions

  • Register remaining external data-providers, transformation callbacks, etc (see Note above)

  • Add CDB subscribers

confd --start-phase2
  • Bound and started listening to NETCONF, CLI, Web UI, and SNMP addresses / ports

  • Allowed initiation of MAAPI user sessions


This table summarizes the different start-phases when ConfD is started in the foreground.

Table 28.2. ConfD Start Phases, running in foreground

Command line / When the command returns ConfD has / After which the application can/should
confd --foreground --start-phase0 / This command never returns.
confd --wait-phase0
  • If upgrading or initializing CDB, created an initial transaction.

  • If upgrading or initializing, the application can modify the initial transaction

  • Register validation callbacks

  • Possibly register external data-providers, transformations, etc (see Note above)

  • Setup notification sockets

confd --start-phase1
  • If upgrading or initializing CDB, committed initial transaction

  • Connect to HA

  • Register remaining external data-providers, transformation callbacks, etc (see Note above)

  • Add CDB subscribers

confd --start-phase2
  • Bound and started listening to NETCONF, CLI, Web UI, and SNMP addresses / ports

  • Allowed initiation of MAAPI user sessions


28.6. ConfD IPC

Client libraries connect to ConfD using TCP. We tell ConfD which address to use for these connections through the /confdConfig/confdIpcAddress/ip (default value 127.0.0.1) and /confdConfig/confdIpcAddress/port (default value 4565) elements in confd.conf. It is possible to change these values, but it requires a number of steps to also configure the clients. There are also security implications; see the section Security issues below.

Some clients read the environment variables CONFD_IPC_ADDR and CONFD_IPC_PORT to determine if something other than the default is to be used, others might need to be recompiled. This is a list of clients which communicate with ConfD, and what needs to be done when confdIpcAddress is changed.

Client / Changes required
Remote commands via the confd command Remote commands, such as confd --reload, check the environment variables CONFD_IPC_ADDR and CONFD_IPC_PORT.
CDB and MAAPI clients The address supplied to cdb_connect() and maapi_connect() must be changed.
Data provider API clients The address supplied to confd_connect() must be changed.
confd_cli

The Command Line Interface (CLI) client, confd_cli, checks the environment variables CONFD_IPC_ADDR and CONFD_IPC_PORT. Alternatively the port can be provided on the command line (using the -P option).

NOTE: confd_cli is provided as source, in $CONFD_DIR/src/confd/cli, so it is also possible to re-compile it using the new address as default.

Notification API clients The new address must be supplied to confd_notifications_connect()
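As a sketch of the pattern such clients follow, the environment variables can be consulted like this before connecting (get_ipc_dest() is a hypothetical helper, not part of the ConfD API):

```c
#include <stdlib.h>

/* Resolve the ConfD IPC destination: honor CONFD_IPC_ADDR and
 * CONFD_IPC_PORT if set, otherwise fall back to the defaults. */
void get_ipc_dest(const char **addr, int *port)
{
    const char *a = getenv("CONFD_IPC_ADDR");
    const char *p = getenv("CONFD_IPC_PORT");

    *addr = (a != NULL && *a != '\0') ? a : "127.0.0.1"; /* default address */
    *port = (p != NULL && *p != '\0') ? atoi(p) : 4565;  /* default port */
}
```

The resolved address and port would then be used when setting up the socket passed to e.g. cdb_connect() or maapi_connect().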

To run more than one instance of ConfD on the same host (which can be useful in development scenarios) each instance needs its own IPC port. For each instance set /confdConfig/confdIpcAddress/port in confd.conf to something different.

Two more sets of ports may have to be modified: NETCONF and CLI over SSH. The NETCONF SSH and TCP ports that ConfD listens to by default are 2022 and 2023, respectively. Modify /confdConfig/netconf/transport/ssh and /confdConfig/netconf/transport/tcp, either by disabling them or by changing the ports they listen to. The CLI over SSH listens to 2024 by default; modify /confdConfig/cli/ssh either by disabling it or by changing the default port.
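For example, a second ConfD instance on the same host could use a confd.conf fragment like this (a sketch; the port numbers are arbitrary assumptions):

```xml
<confdConfig xmlns="http://tail-f.com/ns/confd_cfg/1.0">
  <confdIpcAddress>
    <port>4566</port>            <!-- IPC, default 4565 -->
  </confdIpcAddress>
  <netconf>
    <transport>
      <ssh>
        <port>3022</port>        <!-- default 2022 -->
      </ssh>
      <tcp>
        <enabled>false</enabled> <!-- or a non-default port -->
      </tcp>
    </transport>
  </netconf>
  <cli>
    <ssh>
      <port>3024</port>          <!-- default 2024 -->
    </ssh>
  </cli>
</confdConfig>
```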

28.6.1. Using a different IPC mechanism

We can set up ConfD to use a different IPC mechanism than TCP for the client library connections, as well as for the communication between ConfD nodes in a HA cluster. This can be useful e.g. in a chassis system where ConfD runs on a management blade, while the managed objects run on data processing blades that may not have a TCP/IP implementation.

There are several requirements that must be fulfilled by such an IPC mechanism:

  • It must adhere to the standard socket API, with SOCK_STREAM semantics. I.e. it must provide an ordered, reliable byte stream, with connection management via the connect(), bind(), listen(), and accept() primitives.

  • It must support non-blocking operations (requested via fcntl(O_NONBLOCK)), for accept() as well as for read and write operations. Ideally non-blocking connect() should also be supported, but this is not currently used by ConfD in this case.

  • It must support the use of poll() for I/O multiplexing.

For ConfD to be able to use this mechanism without knowledge of address format etc, we must provide C code in the form of a shared object, which is dynamically loaded by ConfD. The interface between ConfD and the shared object code is defined in the ipc_drv.h file in the $CONFD_DIR/src/confd/ipc_drv directory in the release. The shared object must be named ipc_drv_ops.so and installed in the $CONFD_DIR/lib/confd/lib/core/confd/priv directory of the ConfD installation, see the sample Makefile in the ipc_drv directory. The interface is implemented via the confd_ext_ipc_init() function. This function must be provided by the shared object, and it must return a pointer to a callback structure defined in the shared object:

struct confd_ext_ipc_cbs {
    int (*getaddrinfo)(char *address,
                       int *family, int *type, int *protocol,
                       struct sockaddr **addr, socklen_t *addrlen,
                       char **errstr);
    int (*socket)(int family, int type, int protocol, char **errstr);
    int (*getpeeraddr)(int fd, char **address, char **errstr);  /* optional */
    int (*connect)(char *address, char **errstr);
    int (*bind)(char *address, char **errstr);
    void (*unbind)(int fd);                                     /* optional */
};

The structure must provide (i.e. have non-NULL function pointers for) either both of the getaddrinfo() and socket() callbacks, or both of the connect() and bind() callbacks - it may of course provide all of them. The getpeeraddr() and unbind() callbacks are optional. If both getaddrinfo() and socket() are provided, the shared object can also be used by applications using the C APIs to connect to ConfD (see e.g. the confd_cmd.c source code in the $CONFD_DIR/src/confd/tools directory).

All the callbacks except unbind() can report an error by returning -1, and in this case optionally provide an error message via the errstr parameter. If an error message is provided, errstr must point to dynamically allocated memory - ConfD will free it through a call to free(3) after reporting the error.

getaddrinfo()

This callback should parse the given text-format address (see below). If the parsing is successful, the callback should return 0 and provide data that can be used for the socket() callback and for the standard bind(2) and/or connect(2) system calls via the family, type, protocol, addr, and addrlen parameters. The structure pointed to by addr must be dynamically allocated - ConfD will free it after use through a call to free(3).

socket()

This callback should create a socket, and if successful return the socket file descriptor.

getpeeraddr()

This optional callback should create a text representation of the address of the remote host/node connected via the socket fd, and if successful return 0 and provide the text-format address via the address parameter. The main purpose of the callback is to make it possible to use the maapi_disconnect_remote() function (see the confd_lib_maapi(3) manual page), but the provided address will also be used in e.g. HA status and notifications, and will be included in ConfD debug dumps.

connect()

This callback should create a socket, connect it to the given address (see below), and if successful return the socket file descriptor.

bind()

This callback should create a socket, bind it to the given address (see below), and if successful return the socket file descriptor.

unbind()

This is an optional callback that can be used if we need to do any special cleanup when a bound socket is closed. In this case the callback must also close the file descriptor - otherwise the function pointer can be set to NULL, and ConfD will close the file descriptor.

Two examples using this interface are provided in the $CONFD_DIR/src/confd/ipc_drv directory. One of them (ipc_drv_unix.c) uses AF_UNIX sockets, and implements only the connect(), bind(), and unbind() callbacks. The other (ipc_drv_etcp.c) actually uses standard AF_INET/AF_INET6 TCP sockets just like the "normal" ConfD IPC - this can be meaningful if we need to set some non-standard socket options such as Linux SO_VRF for all IPC sockets. This example implements the getaddrinfo(), socket(), and getpeeraddr() callbacks.

An older version of this interface (also defined in ipc_drv.h) used a confd_ipc_init() function and a struct confd_ipc_cbs callback structure. This interface is deprecated, but will continue to be supported. The main differences are that the old interface lacks the getaddrinfo(), socket(), and getpeeraddr() callbacks, and that any error message would be provided via a static errstr structure element.

To enable the use of this alternate IPC mechanism for the client library connections, we need to set /confdConfig/confdExternalIpc/enabled to "true" in confd.conf. This causes any settings for /confdConfig/confdIpcAddress/ip and /confdConfig/confdIpcAddress/port to be ignored, and we can instead specify the address to use in /confdConfig/confdExternalIpc/address. The address is given in text form, and ConfD passes it to the getaddrinfo(), bind(), and/or connect() callbacks without any interpretation.
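For example, with a driver that interprets the address as an AF_UNIX socket path (such as the ipc_drv_unix.c example), the relevant confd.conf fragment could look as follows - the socket path shown is arbitrary, since ConfD passes the address string to the callbacks without any interpretation:

```xml
<confdConfig xmlns="http://tail-f.com/ns/confd_cfg/1.0">
  ...
  <confdExternalIpc>
    <enabled>true</enabled>
    <address>/var/run/confd/ipc.sock</address>
  </confdExternalIpc>
  ...
</confdConfig>
```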

If we want to use the alternate IPC for the inter-node HA communication, we can in the same way set /confdConfig/ha/externalIpc/enabled and /confdConfig/ha/externalIpc/address in confd.conf. Additionally the HA API uses a struct that holds a node address:

struct confd_ha_node {
    confd_value_t nodeid;
    int af;               /* AF_INET | AF_INET6 | AF_UNSPEC */
    union {               /* address of remote node */
        struct in_addr ip4;
        struct in6_addr ip6;
        char *str;
    } addr;
    char buf[128];        /* when confd_read_notification() and        */
                          /* confd_ha_status() populate these structs, */
                          /* if type of nodeid is C_BUF, the pointer   */
                          /* will be set to point into this buffer     */
    char addr_buf[128];   /* similar to the above, but for the address */
                          /* of remote node when using external IPC    */
                          /* (from getpeeraddr() callback for slaves)  */
};

When this struct is used to specify the address of the master in the confd_ha_beslave() call, the af element should be set to AF_UNSPEC, and the str element of the addr union should point to the text form of the master node's address. When the struct is used to deliver information from ConfD, in the HA event notifications and the result of a confd_ha_status() call, af will also be set to AF_UNSPEC, but str will be NULL for slave nodes unless a peer address has been provided via the getpeeraddr() callback.

The client changes we need to do are analogous to those listed in the table above for the case of using a different IP address and/or port for TCP - the differences are:

  • Instead of CONFD_IPC_ADDR and CONFD_IPC_PORT, the environment variable CONFD_IPC_EXTADDR is used to specify the address. This should be in the same form as used in confd.conf, and if the variable is set it causes any CONFD_IPC_ADDR and CONFD_IPC_PORT settings to be ignored.

  • The confd_cli program also needs to be told where to find the shared object that it should use for the connect() operation. This is done via the CONFD_IPC_EXTSOPATH environment variable, i.e. it typically needs to be set to $CONFD_DIR/lib/confd/lib/core/confd/priv/ipc_drv_ops.so.

  • If the getaddrinfo() and socket() callbacks are provided by the shared object, the confd_cmd, confd_load, and maapi commands included in the release can also use it when the CONFD_IPC_EXTSOPATH environment variable is set. Otherwise these programs will assume that any setting of the environment variable CONFD_IPC_EXTADDR is the pathname of an AF_UNIX socket.

As noted above, confd_cli is provided as source, so we can alternatively modify it to support the alternate IPC mechanism "natively". This is also the case for confd_cmd, confd_load, and maapi.

Note

If we rebuild confd_cli or the other commands from source, but want to keep the support for alternate IPC via the environment variables and shared object, the preprocessor macro EXTERNAL_IPC must be defined. This can be done by un-commenting the #define in the source, or by using a -D option to the compiler.

28.6.2. Restricting access to the IPC port

By default, the clients connecting to the ConfD IPC port are considered trusted, i.e. there is no authentication required, and we rely on the use of 127.0.0.1 for /confdConfig/confdIpcAddress/ip to prevent remote access. In case this is not sufficient, it is possible to restrict the access to the IPC port by configuring an access check.

The access check is enabled by setting the confd.conf element /confdConfig/confdIpcAccessCheck/enabled to "true", and specifying a filename for /confdConfig/confdIpcAccessCheck/filename. The file should contain a shared secret, i.e. a random character string. Clients connecting to the IPC port will then be required to prove that they have knowledge of the secret through a challenge handshake, before they are allowed access to the ConfD functions provided via the IPC port.

Note

Obviously the access permissions on this file must be restricted via OS file permissions, such that it can only be read by the ConfD daemon and client processes that are allowed to connect to the IPC port. E.g. if both the ConfD daemon and the clients run as root, the file can be owned by root and have only "read by owner" permission (i.e. mode 0400). Another possibility is to have a group that only the ConfD daemon and the clients belong to, set the group ID of the file to that group, and have only "read by group" permission (i.e. mode 040).

To provide the secret to the client libraries, and inform them that they need to use the access check handshake, we have to set the environment variable CONFD_IPC_ACCESS_FILE to the full pathname of the file containing the secret. This is sufficient for all the clients mentioned above, i.e. there is no need to change application code to support or enable this check.

Note

The access check must be either enabled or disabled for both the ConfD daemon and the clients. E.g. if /confdConfig/confdIpcAccessCheck/enabled in confd.conf is not set to "true", but clients are started with the environment variable CONFD_IPC_ACCESS_FILE pointing to a file with a secret, the client connections will fail.

28.7. Restart strategies

If the ConfD daemon is shut down, all applications connected to it must enter an indefinite reconnect loop. If ConfD has been configured to use a startup datastore, all applications keeping configuration data in their run-time state must re-read that data from CDB when the daemon comes back.

If ConfD has been set up without a startup datastore, such applications can simply proceed with their processing when the daemon comes back, without re-reading the configuration data from CDB.

The ConfD daemon must be restarted if .fxs files in a running system are to be changed. It is not enough to issue a:

$ confd --reload

Before we restart the daemon we need to stop all applications relying on the .fxs files that are updated. Once the daemon is up and running again, the stopped applications can be restarted.

Applications which do not rely on the updated .fxs files can safely be kept running. However, be sure to follow the startup datastore reconnect strategy above.
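The reconnect loop itself can be sketched as below - try_connect is a stand-in for the application's real connection setup (e.g. cdb_connect() plus re-reading the configuration from CDB when a startup datastore is used), and the retry interval is an arbitrary choice:

```c
#include <unistd.h>

/* Indefinite reconnect loop: keep trying until the ConfD daemon is
   back. try_connect() should return 0 on success, -1 on failure.
   The return value (number of attempts) is only for illustration. */
static int reconnect_loop(int (*try_connect)(void), unsigned interval)
{
    int attempts = 0;

    for (;;) {
        attempts++;
        if (try_connect() == 0)
            return attempts;  /* daemon is back - resume normal work */
        sleep(interval);      /* daemon still down - retry forever */
    }
}

/* Example stub: the "daemon" comes back on the third attempt */
static int remaining_failures = 2;
static int try_connect_stub(void)
{
    return remaining_failures-- > 0 ? -1 : 0;
}
```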

28.8. Security issues

ConfD requires some privileges to perform certain tasks. The following tasks may, depending on the target system, require root privileges.

  • Binding to privileged ports. The confd.conf configuration file specifies which port numbers ConfD should bind(2) to. If any of these port numbers are lower than 1024, ConfD usually requires root privileges unless the target operating system allows ConfD to bind to these ports as a non-root user.

  • If PAM is to be used for authentication, the program installed as $CONFD_DIR/lib/confd/lib/core/pam/priv/epam acts as a PAM client. Depending on the local PAM configuration, this program may require root privileges. If PAM is configured to read the local passwd file, the program must either run as root, or be setuid root. If the local PAM configuration instructs ConfD to run for example pam_radius_auth, root privileges are possibly not required depending on the local PAM installation.

  • If the CLI is used and we want to create CLI commands that run executables, we may want to modify the permissions of the $CONFD_DIR/lib/confd/lib/core/confd/priv/cmdptywrapper program.

    To be able to run an executable as root or a specific user, we need to make cmdptywrapper setuid root, i.e.:

    1. # chown root cmdptywrapper

    2. # chmod u+s cmdptywrapper

    Failing that, all programs will be executed as the user running the confd daemon. Consequently, if that user is root we do not have to perform the chmod operations above.

    The same applies for executables run via actions, but then we may want to modify the permissions of the $CONFD_DIR/lib/confd/lib/core/confd/priv/cmdwrapper program instead:

    1. # chown root cmdwrapper

    2. # chmod u+s cmdwrapper

ConfD can be instructed to terminate NETCONF over clear text TCP. This is useful for debugging since the NETCONF traffic can then be easily captured and analyzed. It is also useful if we want to provide some local proprietary transport mechanism which is not SSH. Clear text TCP termination is not authenticated, the clear text client simply tells ConfD which user the session should run as. The idea is that authentication is already done by some external entity, such as an SSH server. If clear text TCP is enabled, it is very important that ConfD binds to localhost (127.0.0.1) for these connections.

Client libraries connect to ConfD. For example the CDB API is TCP based and a CDB client connects to ConfD. We instruct ConfD which address to use for these connections through the confd.conf parameters /confdConfig/confdIpcAddress/ip (default address 127.0.0.1) and /confdConfig/confdIpcAddress/port (default port 4565).

ConfD multiplexes different kinds of connections on the same socket (IP and port combination). The following programs connect on the socket:

  • Remote commands, such as e.g. confd --reload

  • CDB clients.

  • External database API clients.

  • MAAPI, The Management Agent API clients.

  • The confd_cli program

All of the above are considered trusted. MAAPI clients and confd_cli are expected to authenticate the user before connecting to ConfD, whereas CDB clients and external database API clients are considered trusted and do not have to authenticate.

Thus, since the confdIpcAddress socket allows full unauthenticated access to the system, it is important to ensure that the socket is not accessible from untrusted networks. However it is also possible to restrict access to this socket by means of an access check, see Section 28.6.2, “Restricting access to the IPC port” above.

28.9. Running ConfD as a non privileged user

A common misfeature found on UN*X operating systems is the restriction that only root can bind to ports below 1024. Many a dollar has been wasted on workarounds and often the results are security holes.

Both FreeBSD and Solaris have elegant configuration options to turn this feature off. On FreeBSD:

$ sysctl net.inet.ip.portrange.reservedhigh=0
      

The above is best added to your /etc/sysctl.conf

On Solaris this is similarly configurable. Assuming we want to run ConfD under the non-root user "confd", we can grant that user the specific right to bind privileged ports below 1024 (and only that) using:

$ /usr/sbin/usermod -K defaultpriv=basic,net_privaddr confd
      

And check that we get what we want through:

$ grep confd /etc/user_attr
confd::::type=normal;defaultpriv=basic,net_privaddr
      

Linux doesn't have anything like the above, but there are a couple of options. The best is to use an auxiliary program such as authbind (http://packages.debian.org/stable/authbind) or privbind (http://sourceforge.net/projects/privbind/).

These programs are run by root. To start confd under e.g. privbind we can do:

privbind -u confd /opt/confd/confd-2.7/bin/confd \
    -c /etc/confd.conf
      

The above command starts confd as the user confd, allowing it to bind to ports below 1024.

28.10. Storing encrypted values in ConfD

Using the tailf:des3-cbc-encrypted-string or the tailf:aes-cfb-128-encrypted-string built-in types it is possible to store encrypted values in ConfD (see confd_types(3)). The keys used to encrypt these values are stored in confd.conf. Whenever an encrypted leaf is read using the CDB API or MAAPI it is possible to decrypt the returned string using the confd_decrypt() function. When the keys in confd.conf are changed, the encrypted values will not be decryptable any longer, so care must be taken to re-install the values using the new keys. This section will provide an example on how to do this.

The encrypted values can only be decrypted using confd_decrypt(), which only works when ConfD is running with the correct keys, so the procedure to update the encrypted values is:

  1. Read all the encrypted values and decrypt them

  2. Stop the ConfD daemon

  3. Restart it with the new encryption keys

  4. Write back the values in clear-text, which will cause ConfD to encrypt them again

A very simple YANG model to store encrypted strings could be:

module enctest {
    namespace "http://www.example.com/ns/enctest";
    prefix e;
    import tailf-common {
        prefix tailf;
    }

    container strs {
        list str {
            key nr;
            max-elements 64;
            leaf nr {
                type int32;
            }
            leaf secret {
                type tailf:aes-cfb-128-encrypted-string;
                mandatory true;
            }
        }
    }
}

Then we could write a function which reads all the encrypted leafs and saves the clear-text equivalents. Such a function (without error checking) could look like this:

static void install_keys(struct sockaddr_in *addr)
{
    struct confd_daemon_ctx *dctx;
    int ctlsock = socket(PF_INET, SOCK_STREAM, 0);
    dctx = confd_init_daemon(progname);
    confd_connect(dctx, ctlsock, CONTROL_SOCKET, (struct sockaddr*)addr, sizeof (*addr));
    confd_install_crypto_keys(dctx);
    close(ctlsock);
    confd_release_daemon(dctx);
}

static void get_clear_text(struct sockaddr_in *addr, FILE *f)
{
    int rsock = socket(PF_INET, SOCK_STREAM, 0);
    int i, n;

    install_keys(addr);

    cdb_connect(rsock, CDB_READ_SOCKET, (struct sockaddr*)addr, sizeof(*addr));
    cdb_start_session(rsock, CDB_RUNNING);
    cdb_set_namespace(rsock, smp__ns);
    n = cdb_num_instances(rsock, "/strs/str");
    for(i=0; i<n; i++) {
        int nr;
        char cstr[BUFSIZ], dstr[BUFSIZ];

        cdb_get_str(rsock, cstr, sizeof(cstr), "/strs/str[%d]/secret", i);
        cdb_get_int32(rsock, &nr, "/strs/str[%d]/nr", i);
        memset(dstr, 0, sizeof(dstr));
        confd_decrypt(cstr, strlen(cstr), dstr);
        fprintf(f, "/strs/str{%d}/secret=$0$%s\n", nr, dstr);
    }
    cdb_end_session(rsock);
    cdb_close(rsock);
}

Note the prefixing of the clear-text output with $0$ - this is what indicates to the ConfD daemon that the strings are in clear text, causing it to encrypt them when we install them again.

Now the opposite function, reading lines of the form "keypath=value" and using the maapi_set_elem2() function to write them back to the ConfD daemon.

static void set_values(struct sockaddr_in *addr, FILE *f)
{
    int msock = socket(PF_INET, SOCK_STREAM, 0);
    int th;
    struct confd_ip ip;
    const char *groups[] = { "admin" };

    maapi_connect(msock, (struct sockaddr*)addr, sizeof(*addr));
    ip.af = AF_INET;
    inet_aton("127.0.0.1", &ip.ip.v4);
    maapi_start_user_session(msock, "admin", progname,
                             groups, sizeof(groups) / sizeof(*groups),
                             &ip, CONFD_PROTO_TCP);

    th = maapi_start_trans(msock, CONFD_RUNNING, CONFD_READ_WRITE);
    maapi_set_namespace(msock, th, smp__ns);
    for (;;) {
        char *key, *val, line[BUFSIZ];
        if (fgets(line, sizeof(line), f) == NULL) {
            break;
        }
        key = line;
        val = strchr(key, (int)'=');
        *val++ = 0; /* NUL terminate the key, make val point to value */
        val[strcspn(val, "\n")] = 0; /* strip the trailing newline */
        maapi_set_elem2(msock, th, val, key);
    }
    maapi_apply_trans(msock, th, 0);
    maapi_end_user_session(msock);
    close(msock);
}

Putting it together with this main() function makes a useful utility program for the task at hand.

int main(int argc, char **argv)
{
    char *confd_addr = "127.0.0.1";
    int confd_port = CONFD_PORT;
    struct sockaddr_in addr;
    int c, mode = 0;            /* 1 = get, 2 = set */

    /* Parse command line */
    while ((c = getopt(argc, argv, "gs")) != EOF) {
        switch (c) {
        case 'g':
            mode = 1;
            break;
        case 's':
            mode = 2;
            break;
        default:
            printf("huh?\n");
            exit(1);
        }
    }
    if (!mode) {
        fprintf(stderr, "%s: must provide either -s or -g\n", argv[0]);
        exit(1);
    }
    /* Initialize address to confd daemon */
    {
        struct in_addr in;
        inet_aton(confd_addr, &in);
        addr.sin_addr.s_addr = in.s_addr;
        addr.sin_family = AF_INET;
        addr.sin_port = htons(confd_port);
    }
    confd_init(argv[0], stderr, CONFD_TRACE);

    switch (mode) {
    case 1: get_clear_text(&addr, stdout); break;
    case 2: set_values(&addr, stdin); break;
    }
    exit(0);
}

Using this utility, called crypto_keys, installing new encryption keys could be done using a shell script like this:

# First save clear text version of the keys in a temporary file
crypto_keys -g > TOP_SECRET

# Now stop the daemon
confd --stop

# Install the new AES encryption key (provided to this script in $1)
mv confd.conf confd.conf.old
xmlset C "$1" confdConfig encryptedStrings AESCFB128 key < \
    confd.conf.old  > confd.conf
rm -f confd.conf.old

# Bring the daemon up to start-phase 1
confd -c confd.conf --start-phase0
confd --start-phase1

# Now write back the keys, and remove the temporary file
crypto_keys -s < TOP_SECRET
rm -f TOP_SECRET

# We are done
confd --start-phase2

In this example we are only using AES encryption, and only modifying the key, not the initial vector - but it is easy to extend it to use the 3DES keys as well. The xmlset utility (provided as example source in the $CONFD_DIR/src/confd/tools directory of the ConfD distribution) is used to modify the key in confd.conf. Writing back the encrypted leafs in start phase 1 ensures that no external method (e.g. a NETCONF request) modifies the data before it is re-installed with the new encryption keys.

28.11. Disaster management

This section describes a number of disaster scenarios and recommends various actions to take in the different disaster variants.

28.11.1. ConfD fails to start

CDB keeps its data in two files A.cdb and C.cdb. If ConfD is stopped, these two files can simply be copied, and the copy is then a full backup of CDB. If ConfD is running, we cannot copy the files, but need to use confd --cdb-backup file to copy the two CDB files into a backup file (in gzipped tar format).

Furthermore, if neither A.cdb nor C.cdb exists in the configured CDB directory, CDB will attempt to initialize from all files in the CDB directory with the suffix ".xml".

Thus, there exist two different ways to reinitialize CDB from a previous known good state: either from .xml files or from a CDB backup. The .xml files would typically be used to reinstall "factory defaults", whereas a CDB backup could be used in more complex scenarios.

When ConfD starts and fails to initialize, the following exit codes can occur:

  • Exit codes 1 and 19 mean that an internal error has occurred. A text message should be in the logs, or if the error occurred at startup before logging had been activated, on standard error (standard output if ConfD was started with --foreground). Generally the message will only be meaningful to the ConfD developers, and an internal error should always be reported to Tail-f support.

  • Exit codes 2 and 3 are only used for the confd "control commands" (see the section COMMUNICATING WITH CONFD in the confd(1) manual page), and mean that the command failed due to timeout. Code 2 is used when the initial connect to ConfD didn't succeed within 5 seconds (or the TryTime if given), while code 3 means that the ConfD daemon did not complete the command within the time given by the --timeout option.

  • Exit code 10 means that one of the init files in the CDB directory was faulty in some way. Further information in the log.

  • Exit code 11 means that the CDB configuration was changed in an unsupported way. This will only happen when an existing database is detected, which was created with another configuration than the current in confd.conf.

  • Exit code 12 means that the C.cdb file is in an old and unsupported format (this can only happen if the CDB database was created with a ConfD version older than 1.3, from which upgrading isn't supported).

  • Exit code 13 means that the schema change caused an upgrade, but for some reason the upgrade failed. Details are in the log. The way to recover from this situation is either to correct the problem or to re-install the old schema (fxs) files.

  • Exit code 14 means that the schema change caused an upgrade, but for some reason the upgrade failed, corrupting the database in the process. This is rare and usually caused by a bug. To recover, either start from an empty database with the new schema, or re-install the old schema files and apply a backup.

  • Exit code 15 means that A.cdb or C.cdb is corrupt in a non-recoverable way. Remove the files and re-start using a backup or init files.

  • Exit code 16 means that CDB ran into an unrecoverable file-error while booting (such as running out of space on the device while writing the initial schema file).

  • Exit code 20 means that ConfD failed to bind a socket. By default this means that ConfD refuses to start. It is however possible to force ConfD to ignore this fatal error by enabling the parameter /confdConfig/ignoreBindErrors - a warning is then issued instead, and the failing northbound agent is disabled. The agent may be enabled again by re-configuring it to use another port and restarting ConfD.

  • Exit code 21 means that some ConfD configuration file is faulty. More information in the logs.

  • Exit code 22 indicates a ConfD installation related problem, e.g. that the user does not have read access to some library files, or that some file is missing.

If the ConfD daemon starts normally, the exit code is 0.
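For a supervisor or wrapper program written in C, the list above can be condensed into a lookup function; a sketch (the strings are summaries of the descriptions above, not messages emitted by ConfD itself):

```c
#include <stddef.h>

/* Map a ConfD startup exit code to a short description, summarizing
   the list above. Unknown codes yield NULL. */
static const char *confd_exit_reason(int code)
{
    switch (code) {
    case 0:  return "started normally";
    case 1: case 19:
             return "internal error - report to support";
    case 2:  return "control command: initial connect timed out";
    case 3:  return "control command: daemon did not complete in time";
    case 10: return "faulty init file in the CDB directory";
    case 11: return "unsupported change of CDB configuration";
    case 12: return "C.cdb in old, unsupported format";
    case 13: return "schema upgrade failed";
    case 14: return "schema upgrade failed, database corrupted";
    case 15: return "A.cdb or C.cdb corrupt - restore backup/init files";
    case 16: return "unrecoverable file error while booting";
    case 20: return "failed to bind a socket";
    case 21: return "faulty configuration file";
    case 22: return "installation problem";
    default: return NULL;
    }
}
```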

If CDB is reinitialized to factory defaults, it may not be possible to reach the machine over the network. The only way to reconfigure the machine is through a CLI login over the serial console.

If the AAA database is broken, ConfD will start but with no authorization rules loaded. This means that all write access to the configuration is denied. The ConfD CLI can be started with the flag confd_cli --noaaa, which will allow full unauthorized access to the configuration. Usage of the ConfD CLI with this flag can for example be restricted to some special UNIX user which can only log in over the serial port. Thus --noaaa provides a way to reconfigure the box although the AAA database is broken.

28.11.2. ConfD failure after startup

ConfD attempts to handle all runtime problems without terminating, e.g. by restarting specific components. However there are some cases where this is not possible, described below. When ConfD is started the default way, i.e. as a daemon, the exit codes will of course not be available, but see the --foreground option in the confd(1) manual page.

  • Out of memory: If ConfD is unable to allocate memory, it will exit by calling abort(3). This will generate an exit code as for reception of the SIGABRT signal - e.g. if ConfD is started from a shell script, it will see 134 as exit code (128 + the signal number).

  • Out of file descriptors for accept(2): If ConfD fails to accept a TCP connection due to lack of file descriptors, it will log this and then exit with code 25. To avoid this problem, make sure that the process and system-wide file descriptor limits are set high enough, and if needed configure session limits in confd.conf.
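The 128 + signal number convention mentioned above is the shell's encoding of a signal death; a process supervising ConfD directly sees the raw wait(2) status instead and can decode it as in this sketch (run_child and the demo children are illustrations, not part of any ConfD API):

```c
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Decode a wait(2) status into the shell-style exit code: the code
   itself for a normal exit, or 128 + signal number when the child
   was killed by a signal (e.g. 128 + SIGABRT = 134 after abort(3)). */
static int shell_exit_code(int status)
{
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);
    return -1;
}

/* Run a function in a child process and return the shell-style code */
static int run_child(void (*fn)(void))
{
    int status;
    pid_t pid = fork();

    if (pid == 0) {
        fn();
        _exit(0);
    }
    if (pid < 0 || waitpid(pid, &status, 0) < 0)
        return -1;
    return shell_exit_code(status);
}

/* Demo children: abort(3) as in the out-of-memory case, and a plain
   exit as in the file descriptor case (code 25) */
static void child_abort(void) { abort(); }
static void child_exit25(void) { _exit(25); }
```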

28.11.3. Transaction commit failure

When the system is updated, ConfD executes a two-phase commit protocol towards the different participating databases, including CDB. If a participant fails in the commit() phase even though it succeeded in the prepare phase, the configuration is possibly in an inconsistent state.

When ConfD considers the configuration to be in an inconsistent state, operations will continue. It is still possible to use NETCONF, the CLI and all other northbound management agents. The CLI has a different prompt which reflects that the system is considered to be in an inconsistent state, and the Web UI also shows this:

  -- WARNING ------------------------------------------------------
  Running db may be inconsistent. Enter private configuration mode and
  install a rollback configuration or load a saved configuration.
  ------------------------------------------------------------------
        

It is slightly more involved using the NETCONF agent. The NETCONF transaction which resulted in a failed commit will fail, but following that the only way to see that the system is considered to be in an inconsistent state is by reading the data defined by tailf-netconf-monitoring.

The MAAPI API has two interface functions which can be used to set and retrieve the consistency status. This API can thus be used to manually reset the consistency state. Apart from this, the only way to reset the state to a consistent state is by reloading the entire configuration.

28.12. Troubleshooting

This section discusses problems that new users have seen when they started to use ConfD. Please do not hesitate to contact our support team (see below) if you are having trouble, regardless of whether your problem is listed here or not.

28.12.1. Installation Problems

Error messages during installation

The installation program gives a lot of error messages, the first few like the ones below. The resulting installation is obviously incomplete.

tar: Skipping to next header
gzip: stdin: invalid compressed data--format violated

Cause: This happens if the installation program has been damaged, most likely because it has been downloaded in 'ascii' mode.

Resolution: Remove the installation directory. Download a new copy of ConfD from our servers. Make sure you use binary transfer mode every step of the way.

28.12.2. Problems Starting ConfD

ConfD terminating with GLIBC error

ConfD terminates immediately with a message similar to the one below.

Internal error: Open failed: /lib/tls/libc.so.6: version
`GLIBC_2.3.4' not found (required by
.../lib/confd/lib/core/util/priv/syst_drv.so)

Cause: This happens if you are running on a very old Linux version, where the GNU libc (GLIBC) version is older than 2.3.4 (released in 2004).

Resolution: Use a newer Linux system, or upgrade the GLIBC installation.

ConfD terminating with libcrypto error

  • ConfD terminates immediately with a message similar to this:

    Bad configuration: .../confd.conf:0: cannot dynamically link with
    libcrypto shared library

    Cause: This normally happens due to the OpenSSL package being of the wrong version or not installed in the operating system.

    Resolution: One of

    1. Install the OpenSSL package with the correct version. This is 1.0.0 for Linux releases of ConfD, 0.9.8 or 0.9.7 for some other operating systems. To find out the version to install, run:

      $ ldd $CONFD_DIR/lib/confd/lib/core/crypto/priv/lib/crypto.so

      Note: only the libcrypto shared library (libcrypto.so.N.N.N) is actually required by ConfD.

    2. Provided that a different version of OpenSSL, 0.9.7 or greater, is installed: Rebuild the ConfD components that depend on libcrypto to use this version, as described in Section 28.15, “Using a different version of OpenSSL”.

  • ConfD terminates immediately, or when the Web UI is enabled, with a message similar to:

    Bad configuration: .../confd.conf:0: libcrypto shared library mismatch
    (DES_INT) - crypto.so and libconfd must be rebuilt

    or:

    Bad configuration: .../confd.conf:0: libcrypto shared library mismatch
    (RC4_CHAR) - crypto.so must be rebuilt for support of default setting
    for /confdConfig/webui/transport/ssl/ciphers

    Cause: This happens if the OpenSSL package is of the correct version, but has been built with a configuration parameter that makes the interface incompatible with the build that is expected by ConfD.

    Resolution: Applying resolution 2 above is always sufficient. Applying resolution 1 is also a possibility, but requires that the OpenSSL package is built with the expected configuration parameters. Contact Tail-f support if this method is desired but unsuccessful in solving the problem. In case only the second message (with RC4_CHAR) occurs, yet another way to resolve the issue is to configure a cipher list for /confdConfig/webui/transport/ssl/ciphers in confd.conf (or confd_dyncfg) that does not include any RC4-based ciphers - see confd.conf(5).

28.12.3. Problems Running Examples

Some examples are dependent on features that might only be available on Linux. Before such examples can run, they would have to be ported.

The 'netconf-console' program fails

Sending NETCONF commands and queries with 'netconf-console' fails, while it works using 'netconf-console-tcp'. The error message is below.

You must install the python ssh implementation paramiko in order to use ssh.

Cause: The netconf-console command is implemented in the Python programming language and depends on the Python SSH implementation Paramiko. Since you are seeing this message, your operating system doesn't have the Python module Paramiko installed. The Paramiko package, in turn, depends on a Python crypto library (pycrypto).

Resolution: Install Paramiko (and pycrypto, if necessary) using the standard installation mechanisms for your OS. An alternative approach is to go to the project home pages to fetch, build and install the missing packages.

These packages come with simple installation instructions, but you will need root privileges to install them. When properly installed, you should be able to import the paramiko module without error messages:

$ python
...
>>> import paramiko
>>>

Exit the Python interpreter with Ctrl+D.

A workaround is to use 'netconf-console-tcp'. It uses TCP instead of SSH and doesn't require Paramiko or Pycrypto. Note that TCP traffic is not encrypted.

28.12.4. General Troubleshooting Strategies

If you have trouble starting or running ConfD, the examples or the clients you write, here are some troubleshooting tips.

Transcript

When contacting support, it often helps the support engineer to understand what you are trying to achieve if you copy-paste the commands, responses and shell scripts that you used to trigger the problem.

Verbose flag

When ConfD is started, give it the --verbose (abbreviated -v) and --foreground flags. This will prevent ConfD from starting as a daemon and cause some messages to be printed on stdout.

$ confd --verbose --foreground ...
              

Log files

To find out what ConfD is or was doing, browsing ConfD's log files is often helpful. In the examples, they are called 'devel.log', 'confd.log' and 'audit.log'. If you are working with your own system, make sure the log files are enabled in 'confd.conf'. They are already enabled in all the examples.

Status

ConfD will give you a comprehensive status report if you call

$ confd --status
              

ConfD status information is also available as operational data under /confd-state when the tailf-confd-monitoring.fxs and tailf-common-monitoring.fxs data model files are present in ConfD's loadPath. These files are stored in $CONFD_DIR/etc/confd in the ConfD release, and the functionality is thus enabled by default. See the corresponding YANG modules tailf-confd-monitoring.yang and tailf-common-monitoring.yang in the $CONFD_DIR/src/confd/yang directory of the ConfD release for documentation of the provided data. To allow programmatic access to this data via MAAPI without exposing it to end users, the modules can be recompiled with the --export none option to confdc (see confdc(1)).

Note

When recompiling these modules, it is critical that the annotation module tailf-confd-monitoring-ann.yang is used, see $CONFD_DIR/src/confd/yang/Makefile.

Check data provider

If you are implementing a data provider (for operational or configuration data), you can verify that it works for all possible data items using

$ confd --check-callbacks

Debug dump

If you suspect you have experienced a bug in ConfD, or ConfD told you so, you can give Support a debug dump to help us diagnose the problem. It contains a lot of status information (including a full confd --status report) and some internal state information. This information is only readable and comprehensible to the ConfD development team, so send the dump to your support contact. A debug dump is created using

$ confd --debug-dump mydump1
              

Just as in CSI on TV, it's important that the information is collected as soon as possible after the event. Many interesting traces will wash away with time, or stay undetected if there are lots of irrelevant facts in the dump.

Debug error log

Another thing you can do if you suspect you have experienced a bug in ConfD, is to enable the error log. The logged information is only readable and comprehensible to the ConfD development team, so send the log to your support contact.

By default, the error log is disabled. To enable it, add this chunk of XML between <logs> and </logs> in your confd.conf file:

<errorLog>
  <enabled>true</enabled>
  <filename>./error.log</filename>
</errorLog>
                

This will actually create a number of files called ./error.log*. Please send them all to us.

System dump

If ConfD aborts due to failure to allocate memory (see Section 28.11, “Disaster management”), and you believe that this is due to a memory leak in ConfD, creating one or more debug dumps as described above (before ConfD aborts) will produce the most useful information for Support. If this is not possible, you can make ConfD produce a system dump just before aborting. To do this, set the environment variable $CONFD_DUMP to a file name for the dump before starting ConfD. The dumped information is only comprehensible to the ConfD development team, so send the dump to your support contact.

System call trace

To catch certain types of problems, especially those relating to system start and configuration, the operating system's system call trace can be invaluable. This tool is called strace, ktrace or truss, depending on the OS. Please send the result to your support contact for a diagnosis. Instructions for running it are given below.

Linux:

$ strace -f -o mylog1.strace -s 1024 confd ...
              

BSD:

$ ktrace -ad -f mylog1.ktrace confd ...
$ kdump -f mylog1.ktrace > mylog1.kdump
              

Solaris:

$ truss -f -o mylog1.truss confd ...
              
Application debugging

The primary tool for debugging the interaction between applications and ConfD is to give the debug level debug to confd_init() as CONFD_TRACE, see the confd_lib_lib(3) manual page. If more in-depth debugging using e.g. gdb is needed, it may be useful to rebuild the libconfd library from source with debugging symbols. This can be done by using the libconfd source package confd-<vsn>.libconfd.tar.gz that is delivered with the ConfD release. The package includes a README file that describes how to do the build - note in particular the "Application debugging" section.

When debugging application memory leaks with a tool like valgrind, it is often necessary to rebuild libconfd from source, since the default build uses a "pool allocator" that makes the stack trace information for memory leaks from valgrind completely misleading for allocations from libconfd. The details of how to do a build that disables the pool allocator are described in the "Application debugging" section of the README in the libconfd source package.

28.13. Tuning the size of confd_hkeypath_t

The ConfD C API library libconfd uses a C struct for passing keypaths to callback functions:

typedef struct confd_hkeypath {
    int len;
    confd_value_t v[MAXDEPTH][MAXKEYLEN];
} confd_hkeypath_t;

See the section called “XML PATHS” in the confd_types(3) manual page for a discussion of how this struct is used. The values used for MAXDEPTH and MAXKEYLEN are 20 and 9, respectively, which should be big enough even for very large and complex data models. However, this comes at a cost in memory (mainly stack) usage - the size of a confd_hkeypath_t is approximately 5.5 kB. Also, in some rare cases, we may have a data model where one or both of these values are not large enough.

It is possible to use other values for MAXDEPTH and MAXKEYLEN, but this requires both that libconfd is rebuilt from source with the new values, and that all applications that use libconfd are also compiled with the new values. It is of course possible to just edit confd_lib.h with the new values, but the #define statements for these in confd_lib.h are guarded with #ifndef directives, which means that they can alternatively be overridden without changing confd_lib.h.

Overriding can be done either via -D options on the compiler command line, or via #define statements before the #include for confd_lib.h. For building libconfd itself without source changes, only the -D option method is possible, though. The build procedure supports an EXTRA_CFLAGS make variable that can be used for this purpose, see the README file included in the libconfd source package. E.g. we can do the libconfd build with:

$ make EXTRA_CFLAGS="-DMAXDEPTH=10 -DMAXKEYLEN=5"

The -D option method can of course be used when building applications too, but it is probably less error-prone to use the #define method. E.g. if we make sure that none of the application C or C++ files include confd_lib.h (or confd.h) directly, but instead include say app.h, we can have this in app.h:

#define MAXDEPTH 10
#define MAXKEYLEN 5
#include <confd_lib.h>
      

Whenever an application connects to ConfD via one of the API functions (i.e. confd_connect(), cdb_connect(), etc), a check is made that the MAXDEPTH and MAXKEYLEN values used for building the library are large enough for the data models loaded into ConfD. If they are not, the connection will fail with confd_errno set to CONFD_ERR_PROTOUSAGE and confd_lasterr() giving a message with the required minimum values. Whether the connection succeeds or not, the library will also set the global variables confd_maxdepth and confd_maxkeylen to the minimum values required by ConfD. Thus the values can be found by simply printing these variables in any application that connects to ConfD.

28.14. Error Message Customization

The ConfD release includes an XML document, $CONFD_DIR/src/confd/errors/errcode.xml, that specifies all the customizable errors that may be reported in the different northbound interfaces. The errors are classified with a type and a code, and for each error a parameterized format string for the default error message is given.

The purpose of this file is both to serve as a reference list of the possible errors, which could e.g. be processed programmatically when generating end-user documentation, and to provide the basis for error message customization.

All the error messages specified in the file can be customized by means of application callbacks. An application can register a callback for one or more of the error types, and whenever an error is to be reported in a northbound interface, the callback will first be invoked and given the opportunity to return a message that is different from the default.

The callback will receive user session information, the error type and code, the default error message, and the parameters used to create the default message. For errors of type "validation", the callback also has access to the contents of the transaction that failed validation. See the section called “ERROR FORMATTING CALLBACK” in the confd_lib_dp(3) manual page for the details of the callback registration and invocation.

28.15. Using a different version of OpenSSL

ConfD depends on the OpenSSL libcrypto shared library for a number of cryptographic functions. (The libssl library is not used by ConfD.) Currently most ConfD releases, in particular all releases for Linux systems, are built with OpenSSL version 1.0.0, and thus require that the libcrypto library from this version is present when ConfD is run. Some releases for other systems require libcrypto from OpenSSL version 0.9.8 or 0.9.7. It is also possible that a given version, even though it is the one that ConfD requires, has been built with configuration parameters that make the interface incompatible with the build that is expected by ConfD.

However the libcrypto dependency is limited to two components in the ConfD release, the libconfd library used by applications, and a shared object called crypto.so, that is used by the ConfD daemon as an interface to libcrypto. Both these components are included in source form in the confd-<vsn>.libconfd.tar.gz tar archive that is provided with each ConfD release.

To use a different OpenSSL version than the one the ConfD release is built with, e.g. due to a Linux development or target environment having OpenSSL version 0.9.8 installed for other purposes, it is sufficient to use the provided sources to rebuild these two components with the desired OpenSSL version, and replace them in the ConfD release. The toplevel README file included in the tar archive has instructions on how to do the build of both libconfd and crypto.so.

While libconfd can be located wherever it is convenient for application use, crypto.so must be placed in the $CONFD_DIR/lib/confd/lib/core/crypto/priv/lib directory in the ConfD installation. The Makefiles in the tar archive have install targets for libconfd and crypto.so that will do a copy to the appropriate place in the ConfD installation if CONFD_DIR is set to the installation directory.

28.16. Using shared memory for schema information

It is possible to use shared memory to make schema information (see the section called “USING SCHEMA INFORMATION” in confd_types(3)) available to multiple processes on a given host, without requiring each of them to load the information directly from ConfD by calling one of the schema-loading functions (confd_load_schemas() etc, see the confd_lib_lib(3) and confd_lib_maapi(3) manual pages). This can be a very significant performance improvement for system startup, where multiple application processes will otherwise load schema information more or less simultaneously, and can also reduce RAM usage.

The mechanism uses a shared memory mapping created by mmap(2), backed by a file. One process needs to call first confd_mmap_schemas_setup(), and then one of schema-loading functions, to populate the shared memory segment. Once this has been done, any process (including the one doing the initial load) can call confd_mmap_schemas() to map the shared memory segment into its address space and make the information available to the libconfd library and for direct access by the application. See the confd_lib_lib(3) manual page for the specification of these functions.

The mechanism can be used in different ways, but assuming that persistent storage for the backing file is available, the optimal approach is to do the load and file creation step only on first system start and when a data model upgrade is done. Then it is sufficient to call confd_mmap_schemas() on all other occasions. If persistent storage is not available, a RAM-based file system such as Linux "tmpfs" can be used for the backing file, in which case the load and file creation step needs to be done on each boot (and on data model upgrade). It is also possible to request that ConfD creates and maintains the backing file, see /confdConfig/enableSharedMemorySchema in confd.conf(5) and maapi_get_schema_file_path() in confd_lib_maapi(3).

Since the schema information includes absolute pointers (e.g. the parent, children, and next pointers in a struct confd_cs_node), it is necessary to map the shared memory at the same virtual address in all processes. The addr argument to confd_mmap_schemas_setup() is passed to mmap(2), and the address returned by mmap(2) is used for the mapping. The address is also recorded in the shared memory segment to make it available for confd_mmap_schemas(). The value of the size argument is also passed in the initial mmap(2) invocation, unless it is smaller than the first allocation done (e.g. if it is 0). In any case, unless the CONFD_MMAP_SCHEMAS_KEEP_SIZE flag is passed to confd_mmap_schemas_setup(), the loading will extend the mapped segment as needed, and the final size will only be as large as needed for the data, even if a larger value was passed as size.

Ideally we would give NULL for the addr argument and an approximate size for size, letting the kernel choose a suitable address and letting the load step adjust the final size based on the amount of data loaded. Unfortunately this often results in an address that is not honored on the subsequent mmap(2) call done by confd_mmap_schemas(), which thus fails. The possible choices of addr and/or size to get the desired result are OS- and OS-version-dependent, but on Linux it generally works to use an addr argument that is at an offset from the top of the heap that is larger than expected heap usage, and give size as 0, as shown in the sample code below using a 256 MB offset. (It is not a fatal error if heap usage later exceeds this offset, as malloc(3) etc will skip over the mapped area, but it may have some performance impact.)

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <unistd.h>
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>

#include <confd_lib.h>

#define MB (1024 * 1024)
#define SCHEMA_FILE "/etc/schemas"

#define OK(E) do {                                                      \
        int _ret = (E);                                                 \
        if (_ret != CONFD_OK) {                                         \
            confd_fatal(                                                \
                "%s returned %d, confd_errno=%d, confd_lasterr()='%s'\n", \
                #E, _ret, confd_errno, confd_lasterr());                \
        }                                                               \
    } while (0)

static void *get_shm_addr(size_t offset)
{
    size_t pagesize;
    char *addr;

    pagesize = (size_t)sysconf(_SC_PAGESIZE);
    addr = malloc(1);
    free(addr);
    addr += offset;
    /* return pagesize-aligned address */
    return addr - ((uintptr_t)addr % pagesize);
}

int main(int argc, char **argv)
{
    struct sockaddr_in addr;
    void *shm_addr;

    addr.sin_addr.s_addr = inet_addr("127.0.0.1");
    addr.sin_family = AF_INET;
    addr.sin_port = htons(CONFD_PORT);

    confd_init(argv[0], stderr, CONFD_TRACE);
    shm_addr = get_shm_addr(256 * MB);
    OK(confd_mmap_schemas_setup(shm_addr, 0, SCHEMA_FILE ".tmp", 0));
    OK(confd_load_schemas((struct sockaddr *)&addr,
                          sizeof(struct sockaddr_in)));
    if (rename(SCHEMA_FILE ".tmp", SCHEMA_FILE) != 0)
        confd_fatal("Failed to rename\n");
    return 0;
}

This code uses a temporary file that is renamed after the load is complete. This is not necessary, but ensures that the SCHEMA_FILE always represents complete schema info if it exists. It can also serve as a simple synchronization mechanism to let other processes know when they can do their confd_mmap_schemas() call.

On Solaris (at least Solaris 10), the address passed to mmap(2) is effectively ignored, and the returned address depends strictly on the size of the mapping. Thus there is no point passing anything other than NULL for the addr to confd_mmap_schemas_setup(), but instead the size must be big enough for the loaded schema info, and the CONFD_MMAP_SCHEMAS_KEEP_SIZE flag must be used.

In a multi-node system, with application processes connecting to ConfD across a network, shared memory can of course not be used between the nodes. The most straightforward way to handle this is to do the initial load and file creation step on each node. If the nodes have the same HW architecture and OS, a possible alternative could be to copy the backing store file from one node to the others using some file transfer mechanism.

28.17. Running application code inside ConfD

28.17.1. The econfd API

The Erlang API to ConfD is implemented as an Erlang/OTP application called econfd. This application comes in two flavours. One is built into ConfD, in order to support applications running in the same Erlang VM as ConfD. The other is a separate library which is included in source form in the ConfD release, in the $CONFD_DIR/erlang directory. Building econfd as described in the $CONFD_DIR/erlang/econfd/README file will compile the Erlang code and generate the documentation.

This API can be used by applications written in Erlang in much the same way as the C and Java APIs are used, i.e. code running in an Erlang VM can use the econfd API functions to make socket connections to ConfD for data provider, MAAPI, CDB, etc access. However the API is also available internally in ConfD, which makes it possible to run Erlang application code inside the ConfD daemon, without the overhead imposed by the socket communication.

There is little or no support for testing and debugging Erlang code executing internally in ConfD, since ConfD provides a very limited runtime environment for Erlang in order to minimize disk and memory footprints. Thus the recommended method is to develop Erlang code targeted for this by using econfd in a separate Erlang VM, where an interactive Erlang shell and all the other development support included in the standard Erlang/OTP releases are available. When development and testing is completed, the code can be deployed to run internally in ConfD without changes.

For information about the Erlang programming language and development tools, please refer to www.erlang.org and the available books about Erlang (some are referenced on the web site).

28.17.2. Running inside ConfD

All application code SHOULD use the prefix "ec_" for module names, application names, registered processes (if any), and named ets tables (if any), to avoid conflict with existing or future names used by ConfD itself.

The Erlang code is packaged into applications which are automatically started and stopped by ConfD if they are located at the proper place. ConfD will search the load path as defined by /confdConfig/loadPath for directories called erlang-lib. The structure of such a directory is the same as a standard lib directory in Erlang. The directory may contain multiple Erlang applications. Each one must have a valid .app file. See the Erlang documentation of application and app for more info.

The following config settings in the .app file are explicitly treated by ConfD:

applications

A list of applications which need to be started before this application can be started. This info is used to compute a valid start order.

included_applications

A list of applications which are started on behalf of this application. This info is used to compute a valid start order.

env

A property list, containing [{Key,Val}] tuples. Besides other keys used by the application itself, a few predefined keys are used by ConfD. The key confd_start_phase is used by ConfD to determine which start phase the application is to be started in. Valid values are phase0, phase1 and phase2. Default is phase1. The key confd_restart_type is used by ConfD to determine what impact a restart of the application will have. This is the same as the restart_type() type in application. Valid values are permanent, transient and temporary. Default is permanent.
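Putting these settings together, a .app file could look like the following sketch. All names here (ec_myapp and its modules) are hypothetical; only the applications, included_applications and env keys are treated specially by ConfD:

```erlang
%% ec_myapp.app - hypothetical application resource file,
%% placed in <loadPath>/erlang-lib/ec_myapp/ebin/
{application, ec_myapp,
 [{description, "Example application running inside ConfD"},
  {vsn, "1.0"},
  {modules, [ec_myapp, ec_myapp_server]},
  {registered, [ec_myapp_server]},
  {applications, [kernel, stdlib]},         % started before this app
  {mod, {ec_myapp, []}},                    % application callback module
  {env, [{confd_start_phase, phase1},       % phase0 | phase1 | phase2
         {confd_restart_type, permanent}]}  % permanent | transient | temporary
 ]}.
```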

When the application is started, one of its processes should make initial connections to the ConfD subsystems, register callbacks etc. This is typically done in the init/1 function of a gen_server or similar. While the internal connections are made using the exact same API functions (e.g. econfd_maapi:connect/2) as for an application running in an external Erlang VM, any Address and Port arguments are ignored, and instead standard Erlang inter-process communication is used. The internal_econfd/embedded_applications/transform example in the bundled collection shows a transform written in Erlang and executing internally in ConfD.

An alternate way (the old way) of running custom code in the Erlang VM of ConfD is to load single Erlang modules (as opposed to use proper applications). When ConfD starts, specifically when phase0 is reached, ConfD will search the load path as defined by /confdConfig/loadPath for compiled Erlang modules, i.e. *.beam files. The modules that are found will be loaded, unless the module name conflicts with an existing ConfD module. If there is a module name conflict, ConfD will terminate with an error message and exit code 21. The -on_load() directive can be used to spawn a process that makes initial connections to the ConfD subsystems, registers callbacks, sets up supervision if desired, etc. The internal_econfd/single_modules/transform example in the bundled collection shows a transform written in Erlang and executing internally in ConfD.

The --printlog option to confd, which prints the contents of the ConfD errorLog, is normally only useful for Tail-f support and developers, but it may also be relevant for debugging problems with application code running inside ConfD. The errorLog collects the events sent to the OTP error_logger, e.g. crash reports as well as info generated by calls to functions in the error_logger(3) module. Another possibility for primitive debugging is to run confd with the --foreground option, where calls to io:format/2 etc will print to standard output. Printouts may also be directed to the developer log by using econfd:log/3.

While Erlang application code running in an external Erlang VM can use basically any version of Erlang/OTP, this is not the case for code running inside ConfD, since the Erlang VM is evolving and provides limited backward/forward compatibility. To avoid incompatibility issues when loading the beam files, the version of the Erlang compiler erlc that matches the ConfD distribution should be used.

ConfD provides the VM, erlc and the kernel, stdlib, and crypto OTP applications.

Note

Obviously application code running internally in the ConfD daemon can have an impact on the execution of the standard ConfD code. Thus it is critically important that the application code is thoroughly tested and verified before being deployed for production in a system using ConfD.

28.17.3. User-defined types

We can implement user-defined types with Erlang code in a manner similar to what is described for C in the section called “USER-DEFINED TYPES” in confd_types(3). In the econfd API, we populate a #confd_type_cbs{} record and register it using econfd_schema:register_type_cbs/1. For an application running inside ConfD, this registration will have the same effect as using a shared object in the C API, i.e. the callback functions will be used internally by ConfD for doing string <-> value translation and syntax validation.

Callbacks for user-defined types may in general be required to be registered very early in the ConfD startup; in particular, default values specified in the YANG data model will be translated from string form to internal representation when the corresponding .fxs file is loaded. A really early start of the application is achieved by using early_phase0 as confd_start_phase in the application's .app file. An application started in this early phase should not e.g. register normal data provider callbacks, since ConfD is not prepared to handle such registrations at this early point in the startup. The internal_econfd/embedded_applications/user_type example shows how the callbacks can be implemented in Erlang.

An alternate way (the old way) of defining ConfD user-defined-types in Erlang is to load a single module (as opposed to use a proper application). By giving a module implementing such callbacks a name starting with "ec_user_type" (i.e. file name ec_user_type*.beam), we can tell ConfD that it should be loaded early enough for default value translation. The internal_econfd/single_modules/user_type example shows how the callbacks can be implemented in Erlang. It uses this naming convention to be able to handle the translation of a default value specified in the data model.