A datastore is a complete set of configuration parameters for the device stored and manipulated as a single entity.
A datastore can be locked in its entirety with a global write lock.
ConfD supports three different named configuration datastores - running, startup, and candidate. The respective datastores support a set of capabilities as explained below:
The running datastore contains the complete configuration currently active on the device. Running can be configured to support the read-write or the writable-through-candidate mode. Writable-through-candidate means that running can only be modified by making changes to the candidate datastore (see below), and then committing the candidate to running.
The startup datastore is a persistent datastore which the device reads every time it reboots.
If running is read-write and the device has a startup datastore, a manager can try changes by writing them to running. If things look good, the changes can be made persistent by copying them to startup. This ensures that the device uses the same configuration after reboot.
The candidate datastore is used to hold configuration data that can be manipulated without impacting the current configuration. The candidate configuration is a full configuration data set that serves as a workspace for creating and manipulating configuration data. Additions, deletions, and changes may be made to this data to construct the desired configuration.
The candidate datastore can be committed, which means that the device's running configuration is replaced with the contents of the candidate datastore.
The candidate can be used in two different modes, with different characteristics:
It can be modified without first taking a lock on the datastore. If it is modified outside a lock, it is marked as being dirty. When the candidate is dirty it means that it is (potentially) different from the running configuration. When it is dirty, a lock cannot be taken. It leaves the dirty state by being committed to running, or by discarding all changes (which effectively resets it to the contents of running).
If the candidate is not dirty, and a lock is taken, no one but the owner of the lock can modify the database. If changes are made to the candidate while it is locked, and the owner unlocks it (or closes the CLI, Web UI or NETCONF session), all changes are discarded, and the datastore is unlocked.
The candidate can be committed to running with a specified timeout. In this case, running is set to the contents of the candidate. If a second commit, called a confirming-commit, is given within the timeout, the changes are made permanent. If no confirming-commit is given within the timeout period, running is reverted to the state it had before the first commit.
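As an illustration of the confirmed-commit mechanism, a CLI session might look like the following (shown here for the J-style CLI; the exact command syntax may differ between CLI styles, and the timeout value is just an example). The first command commits the candidate to running with a 600-second timeout; the second, given within that period, is the confirming commit that makes the changes permanent:

```
admin@host% commit confirmed 600
admin@host% commit
```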
A project using ConfD must choose a valid combination of datastores to support. Which combination to choose depends on the system resources available on the device, and which characteristics the end-product should have.
The following is a list of valid combinations:

running only: A single, non-volatile datastore is used. Once changes are written to the datastore, they are persistent, and cannot automatically be rolled back. The application needs to react to changes to the database. If CDB is used, this means that the application must use the subscription mechanism.

startup + running: startup is stored in non-volatile memory, and running in read-write RAM. The application needs to react to changes to the database. If CDB is used, this means that the application must use the subscription mechanism.

running + candidate, with read-write running: Both running and candidate are stored in non-volatile memory. NOTE: This combination is NOT RECOMMENDED. When a manager reconfigures a node that has the candidate and also read-write running, the manager can never know that running is up to date with the candidate, and must thus always (logically) copy running to the candidate prior to modifying the candidate. This introduces unnecessary overhead, and makes automation more complicated. The application needs to react to changes to the database. If CDB is used, this means that the application must use the subscription mechanism. In this mode, running can be modified without going through the candidate. This means that a client that wishes to work with the candidate may need to copy running into the candidate, to ensure that no changes to running are lost when the candidate is committed.

running + candidate, with writable-through-candidate running: Both running and candidate are stored in non-volatile memory, but the candidate can efficiently be implemented as a diff against running. The application needs to react to changes to the database. If CDB is used, this means that the application must use the subscription mechanism. In this mode, all changes always go through the candidate, so a client never has to copy running to the candidate in order not to lose any data.
ConfD ensures that running and startup are always consistent, in the sense that the validation constraints defined in the data model hold. The candidate is allowed to be temporarily inconsistent, but if it is committed to running, it must be valid.
ConfD by default implements the chosen datastores in CDB. However, ConfD can also be configured to use an external database. If an external database is used, it must implement the running and startup datastores as applicable. If the candidate is used, it may be implemented with CDB or as an external database.
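As a sketch, choosing the writable-through-candidate combination could look like this in confd.conf (the element names follow the confd.conf(5) manual page conventions; verify them against your ConfD version before use):

```xml
<datastores>
  <!-- no separate startup datastore in this combination -->
  <startup><enabled>false</enabled></startup>
  <candidate>
    <enabled>true</enabled>
    <!-- candidate kept by ConfD itself (in CDB) -->
    <implementation>confd</implementation>
  </candidate>
  <running>
    <access>writable-through-candidate</access>
  </running>
</datastores>
```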
This section will explain the different locks that exist in ConfD and how they interact. It is important to understand the architecture of ConfD with its management backplane, and the transaction state machine as described in Section 7.5, “User sessions and ConfD Transactions” to be able to understand how the different locks fit into the picture.
The ConfD management backplane keeps a lock for each datastore: running, startup and candidate. These locks are usually referred to as the global locks and they provide a mechanism to grant exclusive access to the datastore the lock guards.
The global locks are the only locks that can explicitly be taken through a northbound agent, for example by the NETCONF <lock> operation, or by calling maapi_lock().
A global lock can be taken for the whole datastore, or it can be a partial lock (for a subset of the datamodel). Partial locks are exposed through NETCONF and MAAPI.
An agent can request a global lock to ensure that it has exclusive write-access to a datastore. When a global lock is held by an agent, it is not possible for anyone else to write to the datastore the lock guards - this is enforced by the transaction engine. A global lock on a datastore is granted to an agent if there are no other holders of it (including partial locks), and if all data providers approve the lock request. Each data provider (CDB and/or external data providers) will have its lock() callback invoked to get a chance to refuse or accept the lock. The output of confd --status includes locking status: for each user session, the locks (if any) held per datastore are listed.
A northbound agent starts a user session towards ConfD's management backplane. Each user session can then start multiple transactions. A transaction is either read/write or read-only and is always started against a specific datastore.
The transaction engine has its internal locks, one for every datastore. These transaction locks exist to serialize configuration updates towards the datastore and are separate from the global locks.
When a northbound agent wants to update a datastore with a new configuration, it implicitly grabs and releases the transactional lock corresponding to the datastore it is trying to modify. The transaction engine takes care of managing the locks as it moves through the transaction state machine, and there is no API that exposes the transactional locks to the northbound agents.
When the transaction engine wants to take a lock for a transaction (for example when entering the validate state), it first checks that no other transaction has the lock. Then it checks that no user session has a global lock on that datastore. Finally, each data provider is invoked via its trans_lock() callback.
In contrast to the implicit transactional locks, some northbound agents expose explicit access to the global locks. This is done a bit differently by each agent.
The management API exposes the global locks by providing the maapi_lock() and maapi_unlock() functions (and the corresponding maapi_lock_partial() and maapi_unlock_partial() for partial locking). Once a user session is established (or attached to), these functions can be called.
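For example, taking and releasing the global candidate lock over MAAPI might look like the following C-style sketch. It assumes the ConfD C SDK headers and a running ConfD daemon, so it cannot be compiled or run standalone, and the user session setup is elided - consult the confd_lib_maapi(3) manual page for the exact signatures:

```c
#include <confd_lib.h>
#include <confd_maapi.h>

/* sketch: error handling mostly omitted */
struct sockaddr_in addr;
int sock = socket(PF_INET, SOCK_STREAM, 0);

addr.sin_family = AF_INET;
addr.sin_addr.s_addr = inet_addr("127.0.0.1");
addr.sin_port = htons(CONFD_PORT);      /* default IPC port, 4565 */
if (maapi_connect(sock, (struct sockaddr *)&addr, sizeof(addr)) != CONFD_OK)
    confd_fatal("failed to connect to ConfD\n");

/* a user session must be established (or attached to) before locking */
/* ... maapi_start_user_session(...) ... */

maapi_lock(sock, CONFD_CANDIDATE);      /* global lock: exclusive write access */
/* ... modify the candidate ... */
maapi_unlock(sock, CONFD_CANDIDATE);
```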
In the CLI, the global locks are taken when entering the different configure modes as follows:
In exclusive mode, when the candidate datastore is enabled, both the running and candidate global locks are taken; when the candidate datastore is disabled and the startup datastore is enabled, both the running (if enabled) and startup global locks are taken.
The other configure modes do not grab any locks.
The global locks are then kept by the CLI until either the configure mode is exited, or, in the case of commit confirmed <timeout>, until the command returns and the lock is released.
The Web UI behaves in the same way as the CLI (it presents three edit tabs called "Edit private", "Edit exclusive", and "Edit shared", which correspond to the CLI modes described above).
The NETCONF agent translates the <lock> operation into a request for the global lock on the requested datastore. Partial locks are also exposed through the partial-lock rpc.
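For instance, a NETCONF client requests the global lock on running with the standard <lock> operation (example RPC as defined by the NETCONF protocol; the message-id is arbitrary):

```xml
<rpc message-id="101" xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
  <lock>
    <target>
      <running/>
    </target>
  </lock>
</rpc>
```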
Implementing the lock() and unlock() callbacks is not required of an external data provider. ConfD will never try to initiate the trans_lock() state transition (see the transaction state diagram in Section 7.5, “User sessions and ConfD Transactions”) towards a data provider while a global lock is taken - so the reason for a data provider to implement the locking callbacks is if someone else can write to the data provider's database, or lock it, for example to take a backup.
CDB ignores the lock() and unlock() callbacks (since the data-provider interface is the only write interface towards it).
CDB has its own internal locks on the database. The running and startup datastores each have a single write lock and multiple read locks. It is not possible to grab the write-lock on a datastore while there are active read-locks on it. The locks in CDB exist to make sure that a reader always gets a consistent view of the data (in particular, it becomes very confusing if another user is able to delete configuration nodes in between calls to get_next() on YANG list entries).
During a transaction, trans_lock() takes a CDB read-lock on the transaction's datastore, and write_start() tries to release the read-lock and grab the write-lock instead.
A CDB external client (usually referred to as an MO, managed object) implicitly takes a CDB read-lock between cdb_start_session() and cdb_end_session() on the specified datastore (running or startup). This means that while an MO is reading, a transaction cannot pass through write_start() (and conversely, a CDB reader cannot start while a transaction is between write_start() and commit() or abort()).
The Operational store in CDB does not have any locks. ConfD's transaction engine can only read from it, and the MO writes are atomic per write operation.
When a session tries to modify a data store that is locked in some way, it will fail. For example, the CLI might print:
admin@host% commit
Aborted: the configuration database is locked
[error][2009-06-11 16:27:21]
Since some of the locks are short-lived (such as a CDB read-lock), ConfD can be configured to retry the failing operation for a short period of time. If the data store is still locked after this time, the operation fails.
To configure this, set /confdConfig/commitRetryTimeout in confd.conf.
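As a sketch, such a retry timeout could be configured like this in confd.conf (the namespace and the duration syntax shown here are assumptions - check the confd.conf(5) manual page for the exact element type):

```xml
<confdConfig xmlns="http://tail-f.com/ns/confd_cfg/1.0">
  <!-- retry operations that fail due to a short-lived lock -->
  <commitRetryTimeout>PT5S</commitRetryTimeout>
</confdConfig>
```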
The ConfD installation package contains both binaries for the target system and a development environment including documentation. Many of these files are not needed on a target, and can be excluded. Additional files can be removed depending on the feature configuration on the target.
In the following description, $CONFD_DIR
refers to the
directory where ConfD has been installed.
A minimal example set of files on a target system can be:
$CONFD_DIR/bin/confd
$CONFD_DIR/bin/confd_cli
$CONFD_DIR/etc/confd/*
$CONFD_DIR/lib/confd/bin/confd.boot
$CONFD_DIR/lib/confd/lib/core/*
$CONFD_DIR/lib/confd/lib/cli/*
$CONFD_DIR/bin/confd_cli is the command line interface (CLI) agent program and can be removed, together with $CONFD_DIR/lib/confd/lib/cli/, if the CLI is not used.
$CONFD_DIR/etc/confd/ contains configuration files.
Support for Symmetric Multiprocessing (SMP) introduces some overhead both in CPU and memory usage, and in order to give optimal performance in all scenarios, the installation includes two separate executables for the ConfD daemon: $CONFD_DIR/lib/confd/erts/bin/confd (no SMP support) and $CONFD_DIR/lib/confd/erts/bin/confd.smp (with SMP support). If ConfD will always be run either with or without SMP support, one of these executables can be removed. See also the --smp option in the confd(1) manual page.
If $CONFD_DIR/lib/confd/erts/bin/confd is removed, ConfD will always run with SMP support, although with a single thread on a single-processor system or if it is started with --smp 1. If $CONFD_DIR/lib/confd/erts/bin/confd.smp is removed, ConfD will never run with SMP support, and the --smp option has no effect other than refusing to start the daemon if the argument is bigger than 1.
Files associated with certain features can be removed if the system is set up not to use them:
CLI:
$CONFD_DIR/bin/confd_cli
$CONFD_DIR/lib/confd/lib/cli/
NETCONF:
$CONFD_DIR/lib/confd/lib/netconf/
Web UI:
$CONFD_DIR/lib/confd/lib/webui/
$CONFD_DIR/var/confd/webui/
SNMP agent:
$CONFD_DIR/bin/smidump
$CONFD_DIR/lib/confd/lib/snmp/
smidump is only used for producing YANG files - it is not used by ConfD itself, and therefore not likely to be needed on the target.
The integrated SSH server is not needed if OpenSSH is used to terminate SSH for NETCONF and the CLI:
$CONFD_DIR/lib/confd/lib/core/ssh*
The compiler can be removed unless we plan to compile YANG files on the target:
$CONFD_DIR/bin/confdc $CONFD_DIR/lib/confd/lib/confdc
See the documentation on AAA - basically this is a pre-compiled example program which probably won't be used on the target:
$CONFD_DIR/lib/confd/lib/core/capi/priv/confd_aaa_bridge
When ConfD is started, it reads its configuration file and starts all subsystems configured to start (such as NETCONF, CLI etc.). If a configuration parameter is changed, ConfD can be reloaded by issuing:
$ confd --reload
This command also tells ConfD to close and reopen all log files, which makes it suitable to use from a system like logrotate.
There is also another way: the ConfD configuration parameters that can be changed at runtime can be loaded from an external namespace, allowing the user to store ConfD's configuration in ConfD itself (specifically, in CDB). This is described further down.
On a typical system, the configuration data resides in ConfD's database CDB. Some of the parameters in the configuration are intended for the target OS environment, such as the IP address of the management interface. The OS reads this information from its own configuration files, such as /etc/conf.d/net. This means that the application typically reads this data from CDB, and generates the configuration files needed by the system before starting the system daemons. The application also subscribes to changes in CDB; if a manager changes one of these parameters, the application regenerates the files and restarts the affected daemons. This mechanism can also be used for the configuration of ConfD itself. The application must subscribe to changes to any parameter affecting ConfD (such as the management IP address), update the ConfD configuration file confd.conf, and then instruct ConfD to reload it.
ConfD comes bundled with a small example tool which can be used to patch confd.conf files: $CONFD_DIR/src/confd/tools/xmlset.c. This tool uses the light-weight Expat XML parser (http://expat.sourceforge.net/).
This example changes confd.conf to disable the Web UI:
$ xmlset C false confdConfig webui enabled < confd.conf
This example changes confd.conf to remove the encryptedStrings container:
$ xmlset R confdConfig encryptedStrings < confd.conf
In the ConfD distribution, the confd_dyncfg.yang YANG module is included in the $CONFD_DIR/src/confd/dyncfg directory. The module defines the namespace http://tail-f.com/ns/confd_dyncfg/1.0, which contains all the ConfD configuration parameters that can be modified at runtime, i.e. a subset of the namespace that defines the ConfD configuration file (confd.conf).
To enable the feature of storing ConfD's configuration in CDB, the setting /confdConfig/runtimeReconfiguration has to be set to namespace in the configuration file. This instructs ConfD to read all its "static" configuration from the configuration file, and then load the rest of the configuration from the confd_dyncfg namespace (which must be served by CDB). A requirement is that confd_dyncfg.fxs is in ConfD's loadPath. It is also advisable to have a suitable _init.xml file in ConfD's CDB directory.
The best way to understand how to use this feature is to study the example confdconf/dyncfg in the bundled example collection.
In most cases, the interesting use of this feature is to expose a particular aspect of ConfD's configuration to the end-user and hide the rest. This can be achieved by combining the use of the --export none flag when compiling the confd_dyncfg.yang module with the use of the symlink feature (exactly how they work is explained in Section 10.7, “Hidden Data”).
The snmpa/6-dyncfg example in the example collection shows how to expose a small subset of the SNMP agent configuration (as well as some minor aspects of the CLI parameters) in a private namespace.
For example, if we want to expose the listen port of ConfD's built-in SNMP agent as the end-user configurable leaf /sys/snmp-port, we could write a YANG model like this:
container sys {
  tailf:symlink snmp-port {
    tailf:path "/dyncfg:confdConfig/dyncfg:snmpAgent/dyncfg:port";
  }
}
When a transaction containing changes to /confdConfig is committed, ConfD will pick up the changes made and act accordingly. Thus there is no longer a need for confd --reload, except for the closing/re-opening of log files (as described above) or to update the fxs files for sub-agents.
When /confdConfig/runtimeReconfiguration is set to namespace, any settings in confd.conf for the parameters that exist in the confd_dyncfg namespace are ignored, with one exception: the configuration under /confdConfig/logs. This configuration is needed before CDB has started, and ConfD will therefore initially use the settings from confd.conf, with the CDB settings taking precedence once CDB has started (i.e. when the transition to phase1 is completed).
By default, ConfD starts in the background without an associated terminal. If it is started as confd --foreground, it starts in the foreground attached to the current terminal. This feature can be used to start ConfD from a process manager. In order to properly stop ConfD in the foreground case, close ConfD's standard input, or use confd --stop as usual. When ConfD is started in the foreground, the commands confd --wait-phase0 and confd --wait-started can be used to synchronize the startup sequence. See below for more details.
If startup or candidate with confirming-commit is used, the system might need to use a configuration which is different from the previous running when it reboots. An example of this is if startup is used, and a manager writes a configuration into running which renders the device unstable, and it is rebooted. It might be that the management IP address used by the OS is not the one that should be used (if it was changed before reboot). We'd like to be able to change this address in the OS configuration files before bringing up the interface. But we don't know the address until ConfD has been started, and ConfD itself needs to listen to this address. To solve this dilemma, ConfD's startup sequence can be split into several phases. The first phase brings up the ConfD daemon, but no subsystems that listen to the management IP address (such as NETCONF and CLI). This phase must be started after the loopback interface has been brought up, since the loopback interface is used to communicate between the application and ConfD.
It is also necessary to use the start phases when CDB is used and semantic validation via external callbacks has been implemented. CDB will validate the new configuration when ConfD is started without an existing database, as well as when a schema upgrade has caused configuration changes. This validation is done on the transition to phase1, which means that validation callbacks must be registered before this.
If an application has both validation callbacks and other callbacks (e.g. data provider), and uses the same daemon structure and control socket through all the phases, it must register all the callbacks in phase0. This is because the confd_register_done() function (see confd_lib_dp(3)) must be called after all registrations are done, and no callbacks will be invoked before this function has been called. The tables below reflect this requirement, but it is also possible to register all callbacks in phase0, which may simplify the startup sequence (however, CDB subscribers cannot be added until phase1).
The sequence to start up the system should be like this:
bring up the loopback interface
confd --start-phase0
start applications that implement validation callbacks
confd --start-phase1
start remaining applications, read from CDB
potentially update confd.conf and do confd --reload
bring up the management interface
confd --start-phase2
Note that if ConfD is started without any parameters, it will bring up the entire system at once.
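The sequence above could be sketched as a boot script along these lines (the interface names and the application start commands are placeholders for whatever the target system uses):

```
ifconfig lo 127.0.0.1 up             # loopback first: apps talk to ConfD over it
confd --start-phase0                 # ConfD daemon up, northbound agents not started
/etc/init.d/validation-apps start    # placeholder: register validation callbacks
confd --start-phase1                 # CDB available; configuration validated
/etc/init.d/managed-objects start    # placeholder: applications read config from CDB
                                     # ...potentially update confd.conf, confd --reload...
ifconfig eth0 up                     # management interface, address taken from CDB
confd --start-phase2                 # NETCONF, CLI, etc. start listening
```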
This table summarizes the different start-phases and what they do.
Table 28.1. ConfD Start Phases
Command line | When command returns ConfD has | After which application can/should
confd --start-phase0 | |
confd --start-phase1 | |
confd --start-phase2 | |
This table summarizes the different start-phases when ConfD is started in the foreground.
Table 28.2. ConfD Start Phases, running in foreground
Command line | When command returns ConfD has | After which application can/should
confd --foreground --start-phase0 | This command never returns. |
confd --wait-phase0 | |
confd --start-phase1 | |
confd --start-phase2 | |
Client libraries connect to ConfD using TCP. We tell ConfD which address to use for these connections through the /confdConfig/confdIpcAddress/ip (default value 127.0.0.1) and /confdConfig/confdIpcAddress/port (default value 4565) elements in confd.conf. It is possible to change these values, but it requires a number of steps to also configure the clients. There are also security implications; see the section Security issues below.
Some clients read the environment variables CONFD_IPC_ADDR and CONFD_IPC_PORT to determine if something other than the default is to be used; others might need to be recompiled. This is a list of clients which communicate with ConfD, and what needs to be done when confdIpcAddress is changed.
Client | Changes required
Remote commands via the confd command | Remote commands, such as confd --reload, check the environment variables CONFD_IPC_ADDR and CONFD_IPC_PORT.
CDB and MAAPI clients | The address supplied to cdb_connect() and maapi_connect() must be changed.
Data provider API clients | The address supplied to confd_connect() must be changed.
confd_cli | The Command Line Interface (CLI) client, confd_cli, checks the environment variables CONFD_IPC_ADDR and CONFD_IPC_PORT. NOTE: confd_cli is provided as source, in $CONFD_DIR/src/confd/cli, so it is also possible to re-compile it using the new address as default.
Notification API clients | The new address must be supplied to confd_notifications_connect().
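The address resolution performed by clients that honor these variables can be sketched like this (a self-contained illustration of the lookup order with hypothetical function names, not code from the ConfD libraries):

```c
#include <stdlib.h>

/* Return the IPC address a client should connect to: the
   CONFD_IPC_ADDR environment variable if set, else the default. */
const char *resolve_ipc_addr(void)
{
    const char *a = getenv("CONFD_IPC_ADDR");
    return a != NULL ? a : "127.0.0.1";
}

/* Likewise for the port: CONFD_IPC_PORT if set, else 4565. */
int resolve_ipc_port(void)
{
    const char *p = getenv("CONFD_IPC_PORT");
    return p != NULL ? atoi(p) : 4565;
}
```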
To run more than one instance of ConfD on the same host (which can be useful in development scenarios), each instance needs its own IPC port. For each instance, set /confdConfig/confdIpcAddress/port in confd.conf to something different.
Two more sets of ports will have to be modified: NETCONF and CLI over SSH. The NETCONF SSH and TCP ports that ConfD listens on by default are 2022 and 2023 respectively. Modify /confdConfig/netconf/transport/ssh and /confdConfig/netconf/transport/tcp, either by disabling them or by changing the ports they listen on. The CLI over SSH by default listens on 2024; modify /confdConfig/cli/ssh, either by disabling it or by changing the default port.
We can set up ConfD to use a different IPC mechanism than TCP for the client library connections, as well as for the communication between ConfD nodes in a HA cluster. This can be useful e.g. in a chassis system where ConfD runs on a management blade, while the managed objects run on data processing blades that may not have a TCP/IP implementation.
There are several requirements that must be fulfilled by such an IPC mechanism:
It must adhere to the standard socket API, with SOCK_STREAM semantics, i.e. it must provide an ordered, reliable byte stream, with connection management via the connect(), bind(), listen(), and accept() primitives.
It must support non-blocking operations (requested via fcntl(O_NONBLOCK)), for accept() as well as for read and write operations. Ideally, non-blocking connect() should also be supported, but this is not currently used by ConfD.
It must support the use of poll() for I/O multiplexing.
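These requirements can be checked mechanically for a given address family. The sketch below (standard POSIX calls only, not ConfD code) verifies SOCK_STREAM support, non-blocking mode via fcntl(), and that the descriptor is usable with poll():

```c
#include <sys/socket.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Return 1 if 'family' provides the socket semantics ConfD's
   IPC requires: SOCK_STREAM, O_NONBLOCK via fcntl(), and poll(). */
int ipc_semantics_ok(int family)
{
    int flags, ok;
    struct pollfd pfd;
    int s = socket(family, SOCK_STREAM, 0);

    if (s < 0)
        return 0;
    /* non-blocking operation must be requestable */
    flags = fcntl(s, F_GETFL, 0);
    ok = (flags >= 0 && fcntl(s, F_SETFL, flags | O_NONBLOCK) == 0);
    /* the descriptor must be usable with poll() for I/O multiplexing */
    pfd.fd = s;
    pfd.events = POLLIN | POLLOUT;
    ok = ok && (poll(&pfd, 1, 0) >= 0);
    close(s);
    return ok;
}
```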
For ConfD to be able to use this mechanism without knowledge of address format etc., we must provide C code in the form of a shared object, which is dynamically loaded by ConfD. The interface between ConfD and the shared object code is defined in the ipc_drv.h file in the $CONFD_DIR/src/confd/ipc_drv directory in the release. The shared object must be named ipc_drv_ops.so and installed in the $CONFD_DIR/lib/confd/lib/core/confd/priv directory of the ConfD installation; see the sample Makefile in the ipc_drv directory. The interface is implemented via the confd_ext_ipc_init() function. This function must be provided by the shared object, and it must return a pointer to a callback structure defined in the shared object:
struct confd_ext_ipc_cbs {
    int (*getaddrinfo)(char *address,
                       int *family, int *type, int *protocol,
                       struct sockaddr **addr, socklen_t *addrlen,
                       char **errstr);
    int (*socket)(int family, int type, int protocol, char **errstr);
    int (*getpeeraddr)(int fd, char **address, char **errstr); /* optional */
    int (*connect)(char *address, char **errstr);
    int (*bind)(char *address, char **errstr);
    void (*unbind)(int fd);                                    /* optional */
};
The structure must provide (i.e. have non-NULL function pointers for) either both of the getaddrinfo() and socket() callbacks, or both of the connect() and bind() callbacks - it may of course provide all of them. The getpeeraddr() and unbind() callbacks are optional.
If both getaddrinfo() and socket() are provided, the shared object can also be used by applications using the C APIs to connect to ConfD (see e.g. the confd_cmd.c source code in the $CONFD_DIR/src/confd/tools directory).
All the callbacks except unbind() can report an error by returning -1, and in this case optionally provide an error message via the errstr parameter. If an error message is provided, errstr must point to dynamically allocated memory - ConfD will free it through a call to free(3) after reporting the error.
getaddrinfo(): This callback should parse the given text-format address (see below). If the parsing is successful, the callback should return 0 and provide data that can be used for the socket() callback and for the standard bind(2) and/or connect(2) system calls via the family, type, protocol, addr, and addrlen parameters. The structure pointed to by addr must be dynamically allocated - ConfD will free it after use through a call to free(3).
socket(): This callback should create a socket, and if successful return the socket file descriptor.
getpeeraddr(): This optional callback should create a text representation of the address of the remote host/node connected via the socket fd, and if successful return 0 and provide the text-format address via the address parameter. The main purpose of the callback is to make it possible to use the maapi_disconnect_remote() function (see the confd_lib_maapi(3) manual page), but the provided address will also be used in e.g. HA status and notifications, and will be included in ConfD debug dumps.
connect(): This callback should create a socket, connect it to the given address (see below), and if successful return the socket file descriptor.
bind(): This callback should create a socket, bind it to the given address (see below), and if successful return the socket file descriptor.
unbind(): This is an optional callback that can be used if we need to do any special cleanup when a bound socket is closed. In this case the callback must also close the file descriptor - otherwise the function pointer can be set to NULL, and ConfD will close the file descriptor.
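Pulling this together, a minimal driver implementing only the connect() and bind() callbacks over AF_UNIX sockets (in the style of the bundled ipc_drv_unix.c example) might look like the sketch below. The callback structure is redeclared here only to keep the example self-contained - a real driver should include ipc_drv.h - and the listen() call in the bind callback is an assumption about what ConfD expects from the returned descriptor:

```c
#include <sys/socket.h>
#include <sys/un.h>
#include <string.h>
#include <unistd.h>

/* Redeclared for a self-contained example; a real driver includes ipc_drv.h. */
struct confd_ext_ipc_cbs {
    int (*getaddrinfo)(char *address, int *family, int *type, int *protocol,
                       struct sockaddr **addr, socklen_t *addrlen, char **errstr);
    int (*socket)(int family, int type, int protocol, char **errstr);
    int (*getpeeraddr)(int fd, char **address, char **errstr); /* optional */
    int (*connect)(char *address, char **errstr);
    int (*bind)(char *address, char **errstr);
    void (*unbind)(int fd);                                    /* optional */
};

static void fill_sun(struct sockaddr_un *sa, const char *address)
{
    memset(sa, 0, sizeof(*sa));
    sa->sun_family = AF_UNIX;
    strncpy(sa->sun_path, address, sizeof(sa->sun_path) - 1);
}

static int drv_connect(char *address, char **errstr)
{
    struct sockaddr_un sa;
    int s = socket(AF_UNIX, SOCK_STREAM, 0);

    fill_sun(&sa, address);
    if (s < 0 || connect(s, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
        if (s >= 0) close(s);
        *errstr = strdup("connect failed"); /* ConfD frees this via free(3) */
        return -1;
    }
    return s;
}

static int drv_bind(char *address, char **errstr)
{
    struct sockaddr_un sa;
    int s = socket(AF_UNIX, SOCK_STREAM, 0);

    fill_sun(&sa, address);
    unlink(address); /* remove a stale socket file from a previous run */
    if (s < 0 || bind(s, (struct sockaddr *)&sa, sizeof(sa)) < 0 ||
        listen(s, 5) < 0) { /* assumption: make the fd ready for accept() */
        if (s >= 0) close(s);
        *errstr = strdup("bind failed");
        return -1;
    }
    return s;
}

static struct confd_ext_ipc_cbs ipc_cbs = {
    .connect = drv_connect,
    .bind = drv_bind,
};

/* Entry point resolved by ConfD when it loads ipc_drv_ops.so */
struct confd_ext_ipc_cbs *confd_ext_ipc_init(void)
{
    return &ipc_cbs;
}
```

Built as a shared object (e.g. cc -shared -fPIC -o ipc_drv_ops.so ...) and installed in the priv directory, it would then be loaded as described above.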
Two examples using this interface are provided in the $CONFD_DIR/src/confd/ipc_drv directory. One of them (ipc_drv_unix.c) uses AF_UNIX sockets, and implements only the connect(), bind(), and unbind() callbacks. The other (ipc_drv_etcp.c) actually uses standard AF_INET/AF_INET6 TCP sockets just like the "normal" ConfD IPC - this can be meaningful if we need to set some non-standard socket options, such as Linux SO_VRF, for all IPC sockets. This example implements the getaddrinfo(), socket(), and getpeeraddr() callbacks.
An older version of this interface (also defined in ipc_drv.h) used a confd_ipc_init() function and a struct confd_ipc_cbs callback structure. This interface is deprecated, but will continue to be supported. The main differences are that the old interface lacks the getaddrinfo(), socket(), and getpeeraddr() callbacks, and that any error message would be provided via a static errstr structure element.
To enable the use of this alternate IPC mechanism for the client library connections, we need to set /confdConfig/confdExternalIpc/enabled to "true" in confd.conf. This causes any settings for /confdConfig/confdIpcAddress/ip and /confdConfig/confdIpcAddress/port to be ignored, and we can instead specify the address to use in /confdConfig/confdExternalIpc/address. The address is given in text form, and ConfD passes it to the getaddrinfo(), bind(), and/or connect() callbacks without any interpretation.
If we want to use the alternate IPC for the inter-node HA communication, we can in the same way set /confdConfig/ha/externalIpc/enabled and /confdConfig/ha/externalIpc/address in confd.conf. Additionally, the HA API uses a struct that holds a node address:
struct confd_ha_node {
    confd_value_t nodeid;
    int af;              /* AF_INET | AF_INET6 | AF_UNSPEC */
    union {              /* address of remote node */
        struct in_addr ip4;
        struct in6_addr ip6;
        char *str;
    } addr;
    char buf[128];       /* when confd_read_notification() and        */
                         /* confd_ha_status() populate these structs, */
                         /* if type of nodeid is C_BUF, the pointer   */
                         /* will be set to point into this buffer     */
    char addr_buf[128];  /* similar to the above, but for the address */
                         /* of remote node when using external IPC    */
                         /* (from getpeeraddr() callback for slaves)  */
};
When this struct is used to specify the address of the master in the confd_ha_beslave() call, the af element should be set to AF_UNSPEC, and the str element of the addr union should point to the text form of the master node's address. When the struct is used to deliver information from ConfD, in the HA event notifications and the result of a confd_ha_status() call, af will also be set to AF_UNSPEC, but str will be NULL for slave nodes unless a peer address has been provided via the getpeeraddr() callback.
The client changes we need to do are analogous to those listed in the table above for the case of using a different IP address and/or port for TCP - the differences are:
Instead of CONFD_IPC_ADDR and CONFD_IPC_PORT, the environment variable CONFD_IPC_EXTADDR is used to specify the address. This should be in the same form as used in confd.conf, and if the variable is set it causes any CONFD_IPC_ADDR and CONFD_IPC_PORT settings to be ignored.
The confd_cli program also needs to be told where to find the shared object that it should use for the connect() operation. This is done via the CONFD_IPC_EXTSOPATH environment variable, i.e. it typically needs to be set to $CONFD_DIR/lib/confd/lib/core/confd/priv/ipc_drv_ops.so.
Provided that the getaddrinfo() and socket() callbacks are provided by the shared object, the confd_cmd, confd_load, and maapi commands included in the release can also use the shared object if the CONFD_IPC_EXTSOPATH environment variable is set. Otherwise these programs will assume that any setting of the environment variable CONFD_IPC_EXTADDR is the pathname of an AF_UNIX socket.
As noted above, confd_cli is provided as source, so we can alternatively modify it to support the alternate IPC mechanism "natively". This is also the case for confd_cmd, confd_load, and maapi.
If we rebuild confd_cli or the other commands from source, but want to keep the support for alternate IPC via the environment variables and shared object, the preprocessor macro EXTERNAL_IPC must be defined. This can be done by un-commenting the #define in the source, or by using a -D option to the compiler.
By default, the clients connecting to the ConfD IPC port are considered trusted, i.e. there is no authentication required, and we rely on the use of 127.0.0.1 for /confdConfig/confdIpcAddress/ip to prevent remote access. In case this is not sufficient, it is possible to restrict the access to the IPC port by configuring an access check.
The access check is enabled by setting the confd.conf element /confdConfig/confdIpcAccessCheck/enabled to "true", and specifying a filename for /confdConfig/confdIpcAccessCheck/filename. The file should contain a shared secret, i.e. a random character string. Clients connecting to the IPC port will then be required to prove that they have knowledge of the secret through a challenge handshake, before they are allowed access to the ConfD functions provided via the IPC port.
Obviously the access permissions on this file must be restricted via OS file permissions, such that it can only be read by the ConfD daemon and client processes that are allowed to connect to the IPC port. E.g. if both the ConfD daemon and the clients run as root, the file can be owned by root and have only "read by owner" permission (i.e. mode 0400). Another possibility is to have a group that only the ConfD daemon and the clients belong to, set the group ID of the file to that group, and have only "read by group" permission (i.e. mode 040).
To provide the secret to the client libraries, and inform them that they need to use the access check handshake, we have to set the environment variable CONFD_IPC_ACCESS_FILE to the full pathname of the file containing the secret. This is sufficient for all the clients mentioned above, i.e. there is no need to change application code to support or enable this check.
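A sketch of setting this up, assuming we generate the secret ourselves. The file location here is a temporary one for illustration; in production you would use a stable path with equally strict permissions:

```shell
umask 077                          # new files readable by owner only
SECRET_FILE=$(mktemp /tmp/confd_ipc_secret.XXXXXX)

# A random character string works as the shared secret:
head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n' > "$SECRET_FILE"
chmod 0400 "$SECRET_FILE"          # read by owner only

# Clients pick up the secret and enable the handshake via this variable:
export CONFD_IPC_ACCESS_FILE=$SECRET_FILE
```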
The access check must be either enabled or disabled for both the ConfD daemon and the clients. E.g. if /confdConfig/confdIpcAccessCheck/enabled in confd.conf is not set to "true", but clients are started with the environment variable CONFD_IPC_ACCESS_FILE pointing to a file with a secret, the client connections will fail.
If the ConfD daemon is shut down, all applications connected to it must enter an indefinite reconnect loop. If ConfD has been configured to use a startup datastore, all applications keeping configuration data in their run-time state must re-read the configuration data from CDB when the daemon comes back. If ConfD has been set up without a startup datastore, such applications can simply proceed with their processing when the daemon comes back, without re-reading the configuration data from CDB.
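The reconnect strategy for a wrapper script could be sketched as follows; reload_config_from_cdb is a placeholder for whatever re-read logic the application needs (only required when a startup datastore is used):

```shell
# Placeholder for the application's own re-read logic:
reload_config_from_cdb() { echo "re-reading configuration from CDB"; }

# Poll until the daemon answers again, giving up after $1 attempts:
wait_for_confd() {
    tries=0
    while ! confd --status >/dev/null 2>&1; do
        tries=$((tries + 1))
        [ "$tries" -ge "${1:-60}" ] && return 1
        sleep 1
    done
    return 0
}

# Usage: wait_for_confd 60 && reload_config_from_cdb
```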
The ConfD daemon must be restarted if .fxs files in a running system are to be changed. It is not enough to issue a:
$ confd --reload
Before we restart the daemon we need to stop all applications relying on the .fxs files that are updated. Once the daemon is up and running again, the stopped applications can be restarted.
Applications which do not rely on the updated .fxs files can safely be kept running. However, be sure to follow the startup datastore reconnect strategy above.
ConfD requires some privileges to perform certain tasks. The following tasks may, depending on the target system, require root privileges.
Binding to privileged ports. The confd.conf configuration file specifies which port numbers ConfD should bind(2) to. If any of these port numbers are lower than 1024, ConfD usually requires root privileges unless the target operating system allows ConfD to bind to these ports as a non-root user.
If PAM is to be used for authentication, the program installed as $CONFD_DIR/lib/confd/lib/core/pam/priv/epam acts as a PAM client. Depending on the local PAM configuration, this program may require root privileges. If PAM is configured to read the local passwd file, the program must either run as root, or be setuid root. If the local PAM configuration instructs ConfD to run for example pam_radius_auth, root privileges are possibly not required, depending on the local PAM installation.
If the CLI is used and we want to create CLI commands that run executables, we may want to modify the permissions of the $CONFD_DIR/lib/confd/lib/core/confd/priv/cmdptywrapper program.
To be able to run an executable as root or a specific user, we need to make cmdptywrapper setuid root, i.e.:
# chown root cmdptywrapper
# chmod u+s cmdptywrapper
Failing that, all programs will be executed as the user running the confd daemon. Consequently, if that user is root we do not have to perform the chmod operations above.
The same applies for executables run via actions, but then we may want to modify the permissions of the $CONFD_DIR/lib/confd/lib/core/confd/priv/cmdwrapper program instead:
# chown root cmdwrapper
# chmod u+s cmdwrapper
ConfD can be instructed to terminate NETCONF over clear text TCP. This is useful for debugging since the NETCONF traffic can then be easily captured and analyzed. It is also useful if we want to provide some local proprietary transport mechanism which is not SSH. Clear text TCP termination is not authenticated, the clear text client simply tells ConfD which user the session should run as. The idea is that authentication is already done by some external entity, such as an SSH server. If clear text TCP is enabled, it is very important that ConfD binds to localhost (127.0.0.1) for these connections.
Client libraries connect to ConfD. For example the CDB API is TCP based and a CDB client connects to ConfD. We instruct ConfD which address to use for these connections through the confd.conf parameters /confdConfig/confdIpcAddress/ip (default address 127.0.0.1) and /confdConfig/confdIpcAddress/port (default port 4565).
ConfD multiplexes different kinds of connections on the same socket (IP and port combination). The following programs connect on the socket:
Remote commands, such as confd --reload
CDB clients.
External database API clients.
MAAPI, The Management Agent API clients.
The confd_cli program
All of the above are considered trusted. MAAPI clients and confd_cli are expected to authenticate the user before connecting to ConfD, whereas CDB clients and external database API clients are considered trusted and do not have to authenticate.
Thus, since the confdIpcAddress socket allows full unauthenticated access to the system, it is important to ensure that the socket is not accessible from untrusted networks. However it is also possible to restrict access to this socket by means of an access check, see Section 28.6.2, “Restricting access to the IPC port” above.
A common misfeature found on UN*X operating systems is the restriction that only root can bind to ports below 1024. Many a dollar has been wasted on workarounds and often the results are security holes.
Both FreeBSD and Solaris have elegant configuration options to turn this feature off. On FreeBSD:
$ sysctl net.inet.ip.portrange.reservedhigh=0
The above is best added to your /etc/sysctl.conf
Similarly on Solaris, assuming we want to run ConfD as the non-root user "confd", we can grant that user the specific right to bind privileged ports below 1024 (and only that) using:
$ /usr/sbin/usermod -K defaultpriv=basic,net_privaddr confd
And check that we get what we want through:
$ grep confd /etc/user_attr
confd::::type=normal;defaultpriv=basic,net_privaddr
Linux doesn't have anything like the above. There are a couple of options on Linux. The best is to use an auxiliary program like authbind (http://packages.debian.org/stable/authbind) or privbind (http://sourceforge.net/projects/privbind/).
These programs are run by root. To start confd under e.g. privbind we can do:
privbind -u confd /opt/confd/confd-2.7/bin/confd \
    -c /etc/confd.conf
The above command starts confd as the user "confd", while allowing it to bind to ports below 1024.
Using the tailf:des3-cbc-encrypted-string or the tailf:aes-cfb-128-encrypted-string built-in types it is possible to store encrypted values in ConfD (see confd_types(3)). The keys used to encrypt these values are stored in confd.conf. Whenever an encrypted leaf is read using the CDB API or MAAPI it is possible to decrypt the returned string using the confd_decrypt() function. When the keys in confd.conf are changed, the encrypted values will no longer be decryptable, so care must be taken to re-install the values using the new keys. This section provides an example of how to do this.
The encrypted values can only be decrypted using confd_decrypt(), which only works when ConfD is running with the correct keys, so the procedure to update the encrypted values is:
Read all the encrypted values and decrypt them
Stop the ConfD daemon
Restart it with the new encryption keys
Write back the values in clear-text, which will cause ConfD to encrypt them again
A very simple YANG model to store encrypted strings could be:
module enctest {
  namespace "http://www.example.com/ns/enctest";
  prefix e;

  import tailf-common {
    prefix tailf;
  }

  container strs {
    list str {
      key nr;
      max-elements 64;
      leaf nr {
        type int32;
      }
      leaf secret {
        type tailf:aes-cfb-128-encrypted-string;
        mandatory true;
      }
    }
  }
}
Then we could write a function which would read all the encrypted leafs and save the clear-text equivalent. Such a function (without error checking) could look like this:
static void install_keys(struct sockaddr_in *addr)
{
    struct confd_daemon_ctx *dctx;
    int ctlsock = socket(PF_INET, SOCK_STREAM, 0);

    dctx = confd_init_daemon(progname);
    confd_connect(dctx, ctlsock, CONTROL_SOCKET,
                  (struct sockaddr*)addr, sizeof(*addr));
    confd_install_crypto_keys(dctx);
    close(ctlsock);
    confd_release_daemon(dctx);
}

static void get_clear_text(struct sockaddr_in *addr, FILE *f)
{
    int rsock = socket(PF_INET, SOCK_STREAM, 0);
    int i, n;

    install_keys(addr);
    cdb_connect(rsock, CDB_READ_SOCKET, (struct sockaddr*)addr, sizeof(*addr));
    cdb_start_session(rsock, CDB_RUNNING);
    cdb_set_namespace(rsock, smp__ns);
    n = cdb_num_instances(rsock, "/strs/str");
    for (i = 0; i < n; i++) {
        int nr;
        char cstr[BUFSIZ], dstr[BUFSIZ];

        cdb_get_str(rsock, cstr, sizeof(cstr), "/strs/str[%d]/secret", i);
        cdb_get_int32(rsock, &nr, "/strs/str[%d]/nr", i);
        memset(dstr, 0, sizeof(dstr));
        confd_decrypt(cstr, strlen(cstr), dstr);
        fprintf(f, "/strs/str{%d}/secret=$0$%s\n", nr, dstr);
    }
    cdb_end_session(rsock);
    cdb_close(rsock);
}
Note the prefixing of the clear-text output with $0$ - this is what indicates to the ConfD daemon that the strings are in clear text, causing it to encrypt them when we install them again.
Now the opposite function, reading lines of the form "keypath=value" and using the maapi_set_elem2() function to write them back to the ConfD daemon.
static void set_values(struct sockaddr_in *addr, FILE *f)
{
    int msock = socket(PF_INET, SOCK_STREAM, 0);
    int th;
    struct confd_ip ip;
    const char *groups[] = { "admin" };

    maapi_connect(msock, (struct sockaddr*)addr, sizeof(*addr));
    ip.af = AF_INET;
    inet_aton("127.0.0.1", &ip.ip.v4);
    maapi_start_user_session(msock, "admin", progname, groups,
                             sizeof(groups) / sizeof(*groups),
                             &ip, CONFD_PROTO_TCP);
    th = maapi_start_trans(msock, CONFD_RUNNING, CONFD_READ_WRITE);
    maapi_set_namespace(msock, th, smp__ns);
    for (;;) {
        char *key, *val, line[BUFSIZ];

        if (fgets(line, sizeof(line), f) == NULL)
            break;
        key = line;
        val = strchr(key, (int)'=');
        *val++ = 0;     /* NUL terminate the key, make val point to value */
        maapi_set_elem2(msock, th, val, key);
    }
    maapi_apply_trans(msock, th, 0);
    maapi_end_user_session(msock);
    close(msock);
}
Putting it together with this main() function makes a useful utility program for the task at hand.
int main(int argc, char **argv)
{
    char *confd_addr = "127.0.0.1";
    int confd_port = CONFD_PORT;
    struct sockaddr_in addr;
    int c, mode = 0;            /* 1 = get, 2 = set */

    /* Parse command line */
    while ((c = getopt(argc, argv, "gs")) != EOF) {
        switch (c) {
        case 'g': mode = 1; break;
        case 's': mode = 2; break;
        default:  printf("huh?\n"); exit(1);
        }
    }
    if (!mode) {
        fprintf(stderr, "%s: must provide either -s or -g\n", argv[0]);
        exit(1);
    }

    /* Initialize address to confd daemon */
    {
        struct in_addr in;
        inet_aton(confd_addr, &in);
        addr.sin_addr.s_addr = in.s_addr;
        addr.sin_family = AF_INET;
        addr.sin_port = htons(confd_port);
    }

    confd_init(argv[0], stderr, dbg);
    switch (mode) {
    case 1: get_clear_text(&addr, stdout); break;
    case 2: set_values(&addr, stdin);      break;
    }
    exit(0);
}
Using this utility, called crypto_keys, installing new encryption keys could be done using a shell script like this.
# First save clear text version of the keys in a temporary file
crypto_keys -g > TOP_SECRET

# Now stop the daemon
confd --stop

# Install the new AES encryption key (provided to this script in $1)
mv confd.conf confd.conf.old
xmlset C "$1" confdConfig encryptedStrings AESCFB128 key < \
    confd.conf.old > confd.conf
rm -f confd.conf.old

# Bring the daemon up to start-phase 1
confd -c confd.conf --start-phase0
confd --start-phase1

# Now write back the keys, and remove the temporary file
crypto_keys -s < TOP_SECRET
rm -f TOP_SECRET

# We are done
confd --start-phase2
In this example we are only using AES encryption, and only modifying the key, not the initial vector - but it is easy to extend to use the 3DES keys as well. The xmlset utility (provided as example source in $CONFD_DIR/src/confd/tools in the ConfD distribution) is used to modify the key in confd.conf. Writing back the encrypted leaf in start phase 1 ensures that no external method (e.g. a NETCONF request) modifies the data before it is re-installed with the new encryption keys.
This section describes a number of disaster scenarios and recommends various actions to take in the different disaster variants.
CDB keeps its data in two files, A.cdb and C.cdb. If ConfD is stopped, these two files can simply be copied, and the copy is then a full backup of CDB. If ConfD is running, we cannot copy the files, but need to use confd --cdb-backup file to copy the two CDB files into a backup file (in gzipped tar format).
Furthermore, if neither A.cdb nor C.cdb exists in the configured CDB directory, CDB will attempt to initialize from all files in the CDB directory with the suffix ".xml".
Thus, there exist two different ways to reinitialize CDB from a previous known good state: either from .xml files or from a CDB backup. The .xml files would typically be used to reinstall "factory defaults" whereas a CDB backup could be used in more complex scenarios.
When ConfD starts and fails to initialize, the following exit codes can occur:
Exit codes 1 and 19 mean that an internal error has occurred. A text message should be in the logs, or if the error occurred at startup before logging had been activated, on standard error (standard output if ConfD was started with --foreground). Generally the message will only be meaningful to the ConfD developers, and an internal error should always be reported to Tail-f support.
Exit codes 2 and 3 are only used for the confd "control commands" (see the section COMMUNICATING WITH CONFD in the confd(1) manual page), and mean that the command failed due to timeout. Code 2 is used when the initial connect to ConfD didn't succeed within 5 seconds (or the TryTime if given), while code 3 means that the ConfD daemon did not complete the command within the time given by the --timeout option.
Exit code 10 means that one of the init files in the CDB directory was faulty in some way. Further information in the log.
Exit code 11 means that the CDB configuration was changed in an unsupported way. This will only happen when an existing database is detected, which was created with another configuration than the current in confd.conf.
Exit code 12 means that the C.cdb file is in an old and unsupported format (this can only happen if the CDB database was created with a ConfD version older than 1.3, from which upgrading isn't supported).
Exit code 13 means that the schema change caused an upgrade, but for some reason the upgrade failed. Details are in the log. The way to recover from this situation is either to correct the problem or to re-install the old schema (fxs) files.
Exit code 14 means that the schema change caused an upgrade, but for some reason the upgrade failed, corrupting the database in the process. This is rare and usually caused by a bug. To recover, either start from an empty database with the new schema, or re-install the old schema files and apply a backup.
Exit code 15 means that A.cdb or C.cdb is corrupt in a non-recoverable way. Remove the files and re-start using a backup or init files.
Exit code 16 means that CDB ran into an unrecoverable file-error while booting (such as running out of space on the device while writing the initial schema file).
Exit code 20 means that ConfD failed to bind a socket. By default this means that ConfD refuses to start. It is however possible to force ConfD to ignore this fatal error by enabling the parameter /confdConfig/ignoreBindErrors; a warning is then issued instead and the failing northbound agent is disabled. The agent may be enabled again by re-configuring it to use another port and restarting ConfD.
Exit code 21 means that some ConfD configuration file is faulty. More information in the logs.
Exit code 22 indicates a ConfD installation related problem, e.g. that the user does not have read access to some library files, or that some file is missing.
If the ConfD daemon starts normally, the exit code is 0.
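A startup wrapper can branch on these exit codes. A minimal sketch, where the messages simply summarize the cases above:

```shell
explain_confd_exit() {
    case $1 in
        0)     echo "normal exit" ;;
        2|3)   echo "control command timed out" ;;
        1|19)  echo "internal error - report to Tail-f support" ;;
        13|14) echo "schema upgrade failed - fix the problem or re-install the old fxs files" ;;
        15)    echo "A.cdb/C.cdb corrupt - remove the files and restore from backup or init files" ;;
        20)    echo "failed to bind a socket - check port configuration" ;;
        *)     echo "startup failed with code $1 - see the logs" ;;
    esac
}

# Usage: confd --foreground -c /etc/confd.conf; explain_confd_exit $?
```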
If CDB is reinitialized to factory defaults, it may not be possible to reach the machine over the network. The only way to reconfigure the machine is through a CLI login over the serial console.
If the AAA database is broken, ConfD will start but with no authorization rules loaded. This means that all write access to the configuration is denied. The ConfD CLI can be started with the flag confd_cli --noaaa, which will allow full unauthorized access to the configuration. Usage of the ConfD CLI with this flag can be enabled for some special UNIX user which can only log in over the serial port. Thus --noaaa provides a way to reconfigure the box although the AAA database is broken.
ConfD attempts to handle all runtime problems without terminating, e.g. by restarting specific components. However there are some cases where this is not possible, described below. When ConfD is started the default way, i.e. as a daemon, the exit codes will of course not be available, but see the --foreground option in the confd(1) manual page.
Out of memory: If ConfD is unable to allocate memory, it will exit by calling abort(3). This will generate an exit code as for reception of the SIGABRT signal - e.g. if ConfD is started from a shell script, it will see 134 as exit code (128 + the signal number).
Out of file descriptors for accept(2): If ConfD fails to accept a TCP connection due to lack of file descriptors, it will log this and then exit with code 25. To avoid this problem, make sure that the process and system-wide file descriptor limits are set high enough, and if needed configure session limits in confd.conf.
When the system is updated, ConfD executes a two-phase commit protocol towards the different participating databases, including CDB. If a participant fails in the commit() phase although it succeeded in the prepare phase, the configuration is possibly in an inconsistent state.
When ConfD considers the configuration to be in an inconsistent state, operations will continue. It is still possible to use NETCONF, the CLI and all other northbound management agents. The CLI has a different prompt which reflects that the system is considered to be in an inconsistent state, and the Web UI also shows this:
-- WARNING ------------------------------------------------------
Running db may be inconsistent. Enter private configuration mode and
install a rollback configuration or load a saved configuration.
------------------------------------------------------------------
It is slightly more involved using the NETCONF agent. The NETCONF transaction which resulted in a failed commit will fail, but following that the only way to see that the system is considered to be in an inconsistent state is by reading the data defined by tailf-netconf-monitoring.
The MAAPI API has two interface functions which can be used to set and retrieve the consistency status. This API can thus be used to manually reset the consistency state. Apart from this, the only way to reset the state to a consistent state is by reloading the entire configuration.
This section discusses problems that new users have seen when they started to use ConfD. Please do not hesitate to contact our support team (see below) if you are having trouble, regardless of whether your problem is listed here or not.
The installation program gives a lot of error messages, the first few like the ones below. The resulting installation is obviously incomplete.
tar: Skipping to next header
gzip: stdin: invalid compressed data--format violated
Cause: This happens if the installation program has been damaged, most likely because it has been downloaded in 'ascii' mode.
Resolution: Remove the installation directory. Download a new copy of ConfD from our servers. Make sure you use binary transfer mode every step of the way.
ConfD terminates immediately with a message similar to the one below.
Internal error: Open failed: /lib/tls/libc.so.6: version `GLIBC_2.3.4' not found (required by .../lib/confd/lib/core/util/priv/syst_drv.so)
Cause: This happens if you are running on a very old Linux version. The GNU libc (GLIBC) version is older than 2.3.4, which was released in 2004.
Resolution: Use a newer Linux system, or upgrade the GLIBC installation.
ConfD terminates immediately with a message similar to this:
Bad configuration: .../confd.conf:0: cannot dynamically link with libcrypto shared library
Cause: This normally happens due to the OpenSSL package being of the wrong version or not installed in the operating system.
Resolution: One of
Install the OpenSSL package with the correct version. This is 1.0.0 for Linux releases of ConfD, 0.9.8 or 0.9.7 for some other operating systems. To find out the version to install, run:
$ ldd $CONFD_DIR/lib/confd/lib/core/crypto/priv/lib/crypto.so
Note: only the libcrypto shared library (libcrypto.so.N.N.N) is actually required by ConfD.
Provided that a different version of OpenSSL, 0.9.7 or greater, is installed: Rebuild the ConfD components that depend on libcrypto to use this version, as described in Section 28.15, “Using a different version of OpenSSL”.
ConfD terminates immediately, or when the Web UI is enabled, with a message similar to:
Bad configuration: .../confd.conf:0: libcrypto shared library mismatch (DES_INT) - crypto.so and libconfd must be rebuilt
or:
Bad configuration: .../confd.conf:0: libcrypto shared library mismatch (RC4_CHAR) - crypto.so must be rebuilt for support of default setting for /confdConfig/webui/transport/ssl/ciphers
Cause: This happens if the OpenSSL package is of the correct version, but has been built with a configuration parameter that makes the interface incompatible with the build that is expected by ConfD.
Resolution: Applying resolution 2 above is always sufficient. Applying resolution 1 is also a possibility, but requires that the OpenSSL package is built with the expected configuration parameters. Contact Tail-f support if this method is desired but unsuccessful in solving the problem. In case only the second message (with RC4_CHAR) occurs, yet another way to resolve the issue is to configure a cipher list for /confdConfig/webui/transport/ssl/ciphers in confd.conf (or confd_dyncfg) that does not include any RC4-based ciphers - see confd.conf(5).
Some examples are dependent on features that might only be available on Linux. Before such examples can run, they would have to be ported.
Sending NETCONF commands and queries with 'netconf-console' fails, while it works using 'netconf-console-tcp'. The error message is below.
You must install the python ssh implementation paramiko in order to use ssh.
Cause: The netconf-console command is implemented in the Python programming language. It depends on the Python SSH implementation Paramiko. Since you are seeing this message, your operating system doesn't have the Python module Paramiko installed. The Paramiko package, in turn, depends on a Python crypto library (pycrypto).
Resolution: Install Paramiko (and pycrypto, if necessary) using the standard installation mechanisms for your OS. An alternative approach is to go to the project home pages to fetch, build and install the missing packages.
These packages come with simple installation instructions. You will need root privileges to install these packages, however. When properly installed, you should be able to import the paramiko module without error messages:
$ python
...
>>> import paramiko
>>>
Exit the Python interpreter with Ctrl+D.
A workaround is to use 'netconf-console-tcp'. It uses TCP instead of SSH and doesn't require Paramiko or Pycrypto. Note that TCP traffic is not encrypted.
If you have trouble starting or running ConfD, the examples or the clients you write, here are some troubleshooting tips.
When contacting support, it often helps the support engineer to understand what you are trying to achieve if you copy-paste the commands, responses and shell scripts that you used to trigger the problem.
When ConfD is started, give the --verbose (abbreviated -v) and --foreground flags. This will prevent ConfD from starting as a daemon and cause some messages to be printed on stdout.
$ confd --verbose --foreground ...
To find out what ConfD is/was doing, browsing ConfD's log files is often helpful. In the examples, they are called 'devel.log', 'confd.log' and 'audit.log'. If you are working with your own system, make sure the log files are enabled in 'confd.conf'. They are already enabled in all the examples.
ConfD will give you a comprehensive status report if you call
$ confd --status
ConfD status information is also available as operational data under /confd-state when the tailf-confd-monitoring.fxs and tailf-common-monitoring.fxs data model files are present in ConfD's loadPath. These files are stored in $CONFD_DIR/etc/confd in the ConfD release, and the functionality is thus enabled by default.
See the corresponding YANG modules tailf-confd-monitoring.yang and tailf-common-monitoring.yang in the $CONFD_DIR/src/confd/yang directory of the ConfD release for documentation of the provided data. To allow programmatic access to this data via MAAPI without exposing it to end users, the modules can be recompiled with the --export none option to confdc (see confdc(1)).
When recompiling these modules, it is critical that the annotation module tailf-confd-monitoring-ann.yang is used, see $CONFD_DIR/src/confd/yang/Makefile.
If you are implementing a data provider (for operational or configuration data), you can verify that it works for all possible data items using
$ confd --check-callbacks
If you suspect you have experienced a bug in ConfD, or ConfD told you so, you can give Support a debug dump to help us diagnose the problem. It contains a lot of status information (including a full confd --status report) and some internal state information. This information is only readable and comprehensible to the ConfD development team, so send the dump to your support contact. A debug dump is created using
$ confd --debug-dump mydump1
Just as in CSI on TV, it's important that the information is collected as soon as possible after the event. Many interesting traces will wash away with time, or stay undetected if there are lots of irrelevant facts in the dump.
Another thing you can do if you suspect you have experienced a bug in ConfD, is to enable the error log. The logged information is only readable and comprehensible to the ConfD development team, so send the log to your support contact.
By default, the error log is disabled. To enable it, add this chunk of XML between <logs> and </logs> in your confd.conf file:
<errorLog>
  <enabled>true</enabled>
  <filename>./error.log</filename>
</errorLog>
This will actually create a number of files called ./error.log*. Please send them all to us.
If ConfD aborts due to failure to allocate memory (see Section 28.11, “Disaster management”), and you believe that this is due to a memory leak in ConfD, creating one or more debug dumps as described above (before ConfD aborts) will produce the most useful information for Support. If this is not possible, you can make ConfD produce a system dump just before aborting. To do this, set the environment variable $CONFD_DUMP to a file name for the dump before starting ConfD. The dumped information is only comprehensible to the ConfD development team, so send the dump to your support contact.
To catch certain types of problems, especially relating to system start and configuration, the operating system's system call trace can be invaluable. The tool is called strace, ktrace, or truss depending on the OS. Please send the result to your support contact for a diagnosis. Instructions for running it follow below.
Linux:
$ strace -f -o mylog1.strace -s 1024 confd ...
BSD:
$ ktrace -ad -f mylog1.ktrace confd ...
$ kdump -f mylog1.ktrace > mylog1.kdump
Solaris:
$ truss -f -o mylog1.truss confd ...
The primary tool for debugging the interaction between applications and ConfD is to pass the debug level CONFD_TRACE as the debug argument to confd_init(), see the confd_lib_lib(3) manual page. If more in-depth debugging using e.g. gdb is needed, it may be useful to rebuild the libconfd library from source with debugging symbols. This can be done by using the libconfd source package confd-<vsn>.libconfd.tar.gz that is delivered with the ConfD release. The package includes a README file that describes how to do the build - note in particular the "Application debugging" section.
When debugging application memory leaks with a tool like valgrind, it is often necessary to rebuild libconfd from source, since the default build uses a "pool allocator" that makes the stack trace information for memory leaks from valgrind completely misleading for allocations from libconfd. The details of how to do a build that disables the pool allocator are described in the "Application debugging" section of the README in the libconfd source package.
The ConfD C API library libconfd uses a C struct for passing keypaths to callback functions:
typedef struct confd_hkeypath {
    int len;
    confd_value_t v[MAXDEPTH][MAXKEYLEN];
} confd_hkeypath_t;
See the section called “XML PATHS” in the confd_types(3) manual page for a discussion of how this struct is used. The values used for MAXDEPTH and MAXKEYLEN are 20 and 9, respectively, which should be big enough even for very large and complex data models. However this comes at a cost in memory (mainly stack) usage - the size of a confd_hkeypath_t is approximately 5.5 kB. Also, in some rare cases, we may have a data model where one or both of these values are not large enough.
It is possible to use other values for MAXDEPTH and MAXKEYLEN, but this requires both that libconfd is rebuilt from source with the new values, and that all applications that use libconfd are also compiled with the new values. It is of course possible to just edit confd_lib.h with the new values, but the #define statements for these in confd_lib.h are guarded with #ifndef directives, which means that they can alternatively be overridden without changing confd_lib.h. Overriding can be done either via -D options on the compiler command line, or via #define statements before the #include for confd_lib.h.
For building libconfd itself without source changes, only the -D option method is possible, though. The build procedure supports an EXTRA_CFLAGS make variable that can be used for this purpose; see the README file included in the libconfd source package. E.g. we can do the libconfd build with:

$ make EXTRA_CFLAGS="-DMAXDEPTH=10 -DMAXKEYLEN=5"
The -D option method can of course be used when building applications too, but it is probably less error-prone to use the #define method. E.g. if we make sure that none of the application C or C++ files include confd_lib.h (or confd.h) directly, but instead include say app.h, we can have this in app.h:

#define MAXDEPTH 10
#define MAXKEYLEN 5
#include <confd_lib.h>
Whenever an application connects to ConfD via one of the API functions (i.e. confd_connect(), cdb_connect(), etc), a check is made that the MAXDEPTH and MAXKEYLEN values used for building the library are large enough for the data models loaded into ConfD. If they are not, the connection will fail with confd_errno set to CONFD_ERR_PROTOUSAGE and confd_lasterr() giving a message with the required minimum values. Whether the connection succeeds or not, the library will also set the global variables confd_maxdepth and confd_maxkeylen to the minimum values required by ConfD. Thus the values can be found by simply printing these variables in any application that connects to ConfD.
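A minimal sketch of such a program, assuming a local ConfD listening on the default port; the cdb_connect() call is made purely to trigger the check, and the globals are printed regardless of whether the connection succeeds. This is not testable without a running ConfD daemon, so treat it as an illustration only.

```c
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

#include <confd_lib.h>
#include <confd_cdb.h>

int main(int argc, char **argv)
{
    struct sockaddr_in addr;
    int sock;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = inet_addr("127.0.0.1");
    addr.sin_port = htons(CONFD_PORT);

    confd_init(argv[0], stderr, CONFD_SILENT);
    sock = socket(PF_INET, SOCK_STREAM, 0);
    /* Success or failure of the connect does not matter here - the
       library sets the globals either way once it has talked to ConfD. */
    (void)cdb_connect(sock, CDB_READ_SOCKET, (struct sockaddr *)&addr,
                      sizeof(struct sockaddr_in));
    printf("required MAXDEPTH=%d MAXKEYLEN=%d\n",
           confd_maxdepth, confd_maxkeylen);
    return 0;
}
```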
The ConfD release includes an XML document, $CONFD_DIR/src/confd/errors/errcode.xml, that specifies all the customizable errors that may be reported in the different northbound interfaces. The errors are classified with a type and a code, and for each error a parameterized format string for the default error message is given.
The purpose of this file is both to serve as a reference list of the possible errors, which could e.g. be processed programmatically when generating end-user documentation, and to provide the basis for error message customization.
All the error messages specified in the file can be customized by means of application callbacks. An application can register a callback for one or more of the error types, and whenever an error is to be reported in a northbound interface, the callback will first be invoked and given the opportunity to return a message that is different from the default.
The callback will receive user session information, the error type and code, the default error message, and the parameters used to create the default message. For errors of type "validation", the callback also has access to the contents of the transaction that failed validation. See the section called “ERROR FORMATTING CALLBACK” in the confd_lib_dp(3) manual page for the details of the callback registration and invocation.
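As a sketch of what such a registration can look like, assuming a daemon context dx that has already been set up as for other data provider callbacks. The struct fields and function names below follow the “ERROR FORMATTING CALLBACK” section of confd_lib_dp(3) as the author understands it; verify them against your ConfD version.

```c
#include <stdio.h>
#include <string.h>

#include <confd_lib.h>
#include <confd_dp.h>

/* Replace the default message for selected error types; if the
   callback does not call confd_error_seterr(), the default message
   is used unchanged. */
static void format_error(struct confd_user_info *uinfo,
                         struct confd_errinfo *errinfo,
                         char *default_msg)
{
    confd_error_seterr(uinfo, "operator-friendly message instead of: %s",
                       default_msg);
}

/* to be called after the daemon context dx has been created */
static int register_error_cb(struct confd_daemon_ctx *dx)
{
    struct confd_error_cb ecb;

    memset(&ecb, 0, sizeof(ecb));
    ecb.error_types = CONFD_ERRTYPE_BAD_VALUE | CONFD_ERRTYPE_MISC;
    ecb.format_error = format_error;
    return confd_register_error_cb(dx, &ecb);
}
```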
ConfD depends on the OpenSSL libcrypto shared library for a number of cryptographic functions. (The libssl library is not used by ConfD.) Currently, most ConfD releases, in particular all releases for Linux systems, are built with OpenSSL version 1.0.0, and thus require that the libcrypto library from this version is present when ConfD is run. Some releases for other systems require libcrypto from OpenSSL version 0.9.8 or 0.9.7. It is also possible that a given version, even though it is the one that ConfD requires, has been built with configuration parameters that make the interface incompatible with the build that is expected by ConfD.
However, the libcrypto dependency is limited to two components in the ConfD release: the libconfd library used by applications, and a shared object called crypto.so that is used by the ConfD daemon as an interface to libcrypto. Both these components are included in source form in the confd-<vsn>.libconfd.tar.gz tar archive that is provided with each ConfD release.
To use a different OpenSSL version than the one the ConfD release is built with, e.g. due to a Linux development or target environment having OpenSSL version 0.9.8 installed for other purposes, it is sufficient to use the provided sources to rebuild these two components with the desired OpenSSL version, and replace them in the ConfD release. The toplevel README file included in the tar archive has instructions on how to do the build of both libconfd and crypto.so.
While libconfd can be located wherever it is convenient for application use, crypto.so must be placed in the $CONFD_DIR/lib/confd/lib/core/crypto/priv/lib directory in the ConfD installation. The Makefiles in the tar archive have install targets for libconfd and crypto.so that will do a copy to the appropriate place in the ConfD installation if CONFD_DIR is set to the installation directory.
It is possible to use shared memory to make schema information (see the section called “USING SCHEMA INFORMATION” in confd_types(3)) available to multiple processes on a given host, without requiring each of them to load the information directly from ConfD by calling one of the schema-loading functions (confd_load_schemas() etc, see the confd_lib_lib(3) and confd_lib_maapi(3) manual pages). This can be a very significant performance improvement for system startup, where multiple application processes will otherwise load schema information more or less simultaneously, and can also reduce RAM usage.
The mechanism uses a shared memory mapping created by mmap(2), backed by a file. One process needs to first call confd_mmap_schemas_setup(), and then one of the schema-loading functions, to populate the shared memory segment. Once this has been done, any process (including the one doing the initial load) can call confd_mmap_schemas() to map the shared memory segment into its address space and make the information available to the libconfd library and for direct access by the application. See the confd_lib_lib(3) manual page for the specification of these functions.
The mechanism can be used in different ways, but assuming that persistent storage for the backing file is available, the optimal approach is to do the load and file creation step only on first system start and when a data model upgrade is done. Then it is sufficient to call confd_mmap_schemas() on all other occasions. If persistent storage is not available, a RAM-based file system such as Linux "tmpfs" can be used for the backing file, in which case the load and file creation step needs to be done on each boot (and on data model upgrade). It is also possible to request that ConfD creates and maintains the backing file; see /confdConfig/enableSharedMemorySchema in confd.conf(5) and maapi_get_schema_file_path() in confd_lib_maapi(3).
Since the schema information includes absolute pointers (e.g. the parent, children, and next pointers in a struct confd_cs_node), it is necessary to map the shared memory at the same virtual address in all processes. The addr argument to confd_mmap_schemas_setup() is passed to mmap(2), and the address returned by mmap(2) is used for the mapping. The address is also recorded in the shared memory segment to make it available for confd_mmap_schemas(). The value of the size argument is also passed in the initial mmap(2) invocation, unless it is smaller than the first allocation done (e.g. if it is 0). In any case, unless the CONFD_MMAP_SCHEMAS_KEEP_SIZE flag is passed to confd_mmap_schemas_setup(), the loading will extend the mapped segment as needed, and the final size will only be as large as needed for the data, even if a larger value was passed as size.
Ideally we would give NULL for the addr argument and an approximate size for size, letting the kernel choose a suitable address and letting the load step adjust the final size based on the amount of data loaded. Unfortunately this often results in an address that is not honored on the subsequent mmap(2) call done by confd_mmap_schemas(), which thus fails. The possible choices of addr and/or size to get the desired result are OS- and OS-version-dependent, but on Linux it generally works to use an addr argument that is at an offset from the top of the heap that is larger than the expected heap usage, and to give size as 0, as shown in the sample code below using a 256 MB offset. (It is not a fatal error if heap usage later exceeds this offset, as malloc(3) etc will skip over the mapped area, but it may have some performance impact.)
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <unistd.h>
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>

#include <confd_lib.h>

#define MB (1024 * 1024)
#define SCHEMA_FILE "/etc/schemas"

#define OK(E) do {                                                      \
        int _ret = (E);                                                 \
        if (_ret != CONFD_OK) {                                         \
            confd_fatal(                                                \
                "%s returned %d, confd_errno=%d, confd_lasterr()='%s'\n", \
                #E, _ret, confd_errno, confd_lasterr());                \
        }                                                               \
    } while (0)

static void *get_shm_addr(size_t offset)
{
    size_t pagesize;
    char *addr;

    pagesize = (size_t)sysconf(_SC_PAGESIZE);
    addr = malloc(1);
    free(addr);
    addr += offset;
    /* return pagesize-aligned address */
    return addr - ((uintptr_t)addr % pagesize);
}

int main(int argc, char **argv)
{
    struct sockaddr_in addr;
    void *shm_addr;

    addr.sin_addr.s_addr = inet_addr("127.0.0.1");
    addr.sin_family = AF_INET;
    addr.sin_port = htons(CONFD_PORT);

    confd_init(argv[0], stderr, CONFD_TRACE);
    shm_addr = get_shm_addr(256 * MB);
    OK(confd_mmap_schemas_setup(shm_addr, 0, SCHEMA_FILE ".tmp", 0));
    OK(confd_load_schemas((struct sockaddr *)&addr,
                          sizeof(struct sockaddr_in)));
    if (rename(SCHEMA_FILE ".tmp", SCHEMA_FILE) != 0)
        confd_fatal("Failed to rename\n");
    return 0;
}
This code uses a temporary file that is renamed after the load is complete. This is not necessary, but ensures that the SCHEMA_FILE always represents complete schema info if it exists. It can also serve as a simple synchronization mechanism to let other processes know when they can do their confd_mmap_schemas() call.
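The reader side can then be sketched as follows - a hypothetical process that waits for the renamed file to appear and then maps it. The polling loop and the reuse of the /etc/schemas file name are illustrative assumptions, not a prescribed pattern.

```c
#include <stdio.h>
#include <unistd.h>

#include <confd_lib.h>

#define SCHEMA_FILE "/etc/schemas"

int main(int argc, char **argv)
{
    confd_init(argv[0], stderr, CONFD_TRACE);
    /* wait until the loader has renamed the completed file into place */
    while (access(SCHEMA_FILE, R_OK) != 0)
        sleep(1);
    if (confd_mmap_schemas(SCHEMA_FILE) != CONFD_OK)
        confd_fatal("Failed to map schema file: %s\n", confd_lasterr());
    /* schema information is now available to libconfd and the
       application, e.g. via the confd_cs_node trees */
    return 0;
}
```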
On Solaris (at least Solaris 10), the address passed to mmap(2) is effectively ignored, and the returned address depends strictly on the size of the mapping. Thus there is no point passing anything other than NULL for the addr to confd_mmap_schemas_setup(); instead the size must be big enough for the loaded schema info, and the CONFD_MMAP_SCHEMAS_KEEP_SIZE flag must be used.
In a multi-node system, with application processes connecting to ConfD across a network, shared memory can of course not be used between the nodes. The most straightforward way to handle this is to do the initial load and file creation step on each node. If the nodes have the same HW architecture and OS, a possible alternative could be to copy the backing store file from one node to the others using some file transfer mechanism.
The Erlang API to ConfD is implemented as an Erlang/OTP application called econfd. This application comes in two flavours. One is built into ConfD, in order to support applications running in the same Erlang VM as ConfD. The other is a separate library which is included in source form in the ConfD release, in the $CONFD_DIR/erlang directory. Building econfd as described in the $CONFD_DIR/erlang/econfd/README file will compile the Erlang code and generate the documentation.
This API can be used by applications written in Erlang in much the same way as the C and Java APIs are used, i.e. code running in an Erlang VM can use the econfd API functions to make socket connections to ConfD for data provider, MAAPI, CDB, etc access. However, the API is also available internally in ConfD, which makes it possible to run Erlang application code inside the ConfD daemon, without the overhead imposed by the socket communication.
There is little or no support for testing and debugging Erlang code executing internally in ConfD, since ConfD provides a very limited runtime environment for Erlang in order to minimize disk and memory footprints. Thus the recommended method is to develop Erlang code targeted for this by using econfd in a separate Erlang VM, where an interactive Erlang shell and all the other development support included in the standard Erlang/OTP releases are available. When development and testing is completed, the code can be deployed to run internally in ConfD without changes.
For information about the Erlang programming language and development tools, please refer to www.erlang.org and the available books about Erlang (some are referenced on the web site).
All application code SHOULD use the prefix "ec_" for module names, application names, registered processes (if any), and named ets tables (if any), to avoid conflict with existing or future names used by ConfD itself.
The Erlang code is packaged into applications which are automatically started and stopped by ConfD if they are located at the proper place. ConfD will search the load path as defined by /confdConfig/loadPath for directories called erlang-lib. The structure of such a directory is the same as a standard lib directory in Erlang. The directory may contain multiple Erlang applications. Each one must have a valid .app file. See the Erlang documentation of application and app for more info.
The following config settings in the .app file are explicitly treated by ConfD:

applications - A list of applications which need to be started before this application can be started. This info is used to compute a valid start order.

included_applications - A list of applications which are started on behalf of this application. This info is used to compute a valid start order.
env - A property list, containing [{Key,Val}] tuples. Besides other keys used by the application itself, a few predefined keys are used by ConfD. The key confd_start_phase is used by ConfD to determine which start phase the application is to be started in. Valid values are phase0, phase1 and phase2. Default is phase1. The key confd_restart_type is used by ConfD to determine which impact a restart of the application will have. This is the same as the restart_type() type in application. Valid values are permanent, transient and temporary. Default is permanent.
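As an illustration, a minimal .app file for a hypothetical application ec_myapp using these keys could look as follows (the application and module names are made up for the example, following the recommended "ec_" prefix):

```erlang
%% File: erlang-lib/ec_myapp/ebin/ec_myapp.app (hypothetical example)
{application, ec_myapp,
 [{description, "Example application running inside ConfD"},
  {vsn, "1.0"},
  {modules, [ec_myapp, ec_myapp_server]},
  {registered, [ec_myapp_server]},
  {applications, [kernel, stdlib]},     % must be started before ec_myapp
  {mod, {ec_myapp, []}},
  {env, [{confd_start_phase, phase1},   % phase0 | phase1 | phase2
         {confd_restart_type, permanent}]}]}.
```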
When the application is started, one of its processes should make initial connections to the ConfD subsystems, register callbacks etc. This is typically done in the init/1 function of a gen_server or similar. While the internal connections are made using the exact same API functions (e.g. econfd_maapi:connect/2) as for an application running in an external Erlang VM, any Address and Port arguments are ignored, and standard Erlang inter-process communication is used instead. The internal_econfd/embedded_applications/transform example in the bundled collection shows a transform written in Erlang and executing internally in ConfD.
An alternate way (the old way) of running custom code in the Erlang VM of ConfD is to load single Erlang modules (as opposed to using proper applications). When ConfD starts, specifically when phase0 is reached, ConfD will search the load path as defined by /confdConfig/loadPath for compiled Erlang modules, i.e. *.beam files. The modules that are found will be loaded, unless the module name conflicts with an existing ConfD module. If there is a module name conflict, ConfD will terminate with an error message and exit code 21.
The -on_load() directive can be used to spawn a process that makes initial connections to the ConfD subsystems, registers callbacks, sets up supervision if desired, etc. The internal_econfd/single_modules/transform example in the bundled collection shows a transform written in Erlang and executing internally in ConfD.
The --printlog option to confd, which prints the contents of the ConfD errorLog, is normally only useful for Tail-f support and developers, but it may also be relevant for debugging problems with application code running inside ConfD. The errorLog collects the events sent to the OTP error_logger, e.g. crash reports as well as info generated by calls to functions in the error_logger(3) module. Another possibility for primitive debugging is to run confd with the --foreground option, where calls to io:format/2 etc will print to standard output. Printouts may also be directed to the developer log by using econfd:log/3.
While Erlang application code running in an external Erlang VM can use basically any version of Erlang/OTP, this is not the case for code running inside ConfD, since the Erlang VM is evolving and provides limited backward/forward compatibility. To avoid incompatibility issues when loading the beam files, the erlc compiler from the same version as the ConfD distribution should be used. ConfD provides the VM, erlc, and the kernel, stdlib, and crypto OTP applications.
Obviously application code running internally in the ConfD daemon can have an impact on the execution of the standard ConfD code. Thus it is critically important that the application code is thoroughly tested and verified before being deployed for production in a system using ConfD.
We can implement user-defined types with Erlang code in a manner similar to what is described for C in the section called “USER-DEFINED TYPES” in confd_types(3). In the econfd API, we populate a #confd_type_cbs{} record and register it using econfd_schema:register_type_cbs/1. For an application running inside ConfD, this registration will have the same effect as using a shared object in the C API, i.e. the callback functions will be used internally by ConfD for doing string <-> value translation and syntax validation.
Callbacks for user-defined types may in general be required to be registered very early in the ConfD startup; in particular, default values specified in the YANG data model will be translated from string form to internal representation when the corresponding .fxs file is loaded. A really early start of the application is achieved by using early_phase0 as confd_start_phase in the application .app file. An application started in this early phase should not do e.g. registration of normal data provider callbacks, since ConfD is not prepared to handle such registrations at this early point in the startup. The internal_econfd/embedded_applications/user_type example shows how the callbacks can be implemented in Erlang.
An alternate way (the old way) of defining ConfD user-defined types in Erlang is to load a single module (as opposed to using a proper application). By giving a module implementing such callbacks a name starting with "ec_user_type" (i.e. file name ec_user_type*.beam), we can tell ConfD that it should be loaded early enough for default value translation. The internal_econfd/single_modules/user_type example shows how the callbacks can be implemented in Erlang. It uses this naming convention to be able to handle the translation of a default value specified in the data model.