Upgrading Accumulo
Changes to the upgrade process
In upgrade notes for prior releases we advised users to ensure that no FATE transactions
exist (see Upgrading from 1.10 or 2.0 to 2.1 below) because the internal serialization of
FATE transactions is not guaranteed to be compatible between versions. Accumulo never
provided any tooling around the upgrade process and left it up to the user, which can cause
problems if FATE transactions from the older version exist and the user has already deployed
the new version of the software. The user would then have to reinstall the old version of
the software to remove the FATE transactions.
Starting with Accumulo 4.0, we are modifying the upgrade process to make it easier for users to validate that their instance is ready for upgrade. In earlier versions, the upgrade process would start when the user started the instance with the new version of the software. The process ran entirely in the Manager, and it was the user's responsibility to read the upgrade notes and perform any necessary pre-upgrade steps.
The new upgrade process introduces a new accumulo upgrade command, which is used after
shutting down the instance with the old version of software and before starting the instance
with the new version of software. The upgrade command has two options, --prepare and --start.
The --prepare option is designed to be executed by the user after shutting down the instance.
This option checks that the Manager is down, validates that there are no existing FATE
transactions, removes all of the server locks in ZooKeeper, and places a node in ZooKeeper
that prevents server processes from being started again. If there are FATE transactions, the
command fails, giving the user the opportunity to clean them up. The --prepare option can
then be run again.
The --start option is designed to be executed by the user before starting the instance with
the newer version of software. It performs any necessary pre-upgrade validation, makes any
changes that are necessary for the new version of software to start, seeds an upgrade
progress tracker node in ZooKeeper, and finally removes the node in ZooKeeper created by the
--prepare step so that the server processes can be started. If the user did not run the
--prepare step with the older version of software, the --start option will fail unless the
--force option is used. Running --start with the --force option performs the same checks that
--prepare executes, but if FATE transactions are found, the user will need to remove them
using the older version of software. Once the --start option completes, the user can start
the server processes to complete the upgrade process. The user should check the Manager log
to observe the progress of the upgrade.
The accumulo upgrade --prepare command will be included with the 2.1.4 and 3.1.0 releases
to assist users in upgrading to the 4.0 release.
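The ordering described above can be sketched as a small shell wrapper. This is a minimal sketch rather than an official script: OLD_HOME and NEW_HOME are hypothetical install locations, step and upgrade_sequence are illustrative names, and it assumes the instance is stopped and started with accumulo-cluster. By default it only prints the commands it would run.

```shell
# Hypothetical install locations for the old and new versions (assumptions).
OLD_HOME="${OLD_HOME:-/opt/accumulo-old}"
NEW_HOME="${NEW_HOME:-/opt/accumulo-new}"

# Print a command; run it only when RUN=1 is set in the environment.
step() {
  echo "+ $*"
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  fi
}

# Old version: stop, then --prepare. New version: --start, then start servers.
# The && chain stops at the first failing step, for example when --prepare
# finds FATE transactions that must be cleaned up before it is run again.
upgrade_sequence() {
  step "$OLD_HOME/bin/accumulo-cluster" stop &&
    step "$OLD_HOME/bin/accumulo" upgrade --prepare &&
    step "$NEW_HOME/bin/accumulo" upgrade --start &&
    step "$NEW_HOME/bin/accumulo-cluster" start
}
```

Calling upgrade_sequence prints the four commands in order; setting RUN=1 before calling it executes them. Printing first makes it easy to review the paths before anything touches ZooKeeper.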
Upgrading from 1.10 or 2.0 to 2.1
Please read these directions in their entirety before beginning. Please contact us with any questions you have about this process.
IMPORTANT! Before running any Accumulo 2.1 upgrade utilities or services, you will need to upgrade to Java 11, Hadoop 3, and at least ZooKeeper 3.5 (ZooKeeper 3.8, which was current at the time of this writing, or later is recommended).
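The Java prerequisite can be checked from the shell before touching the cluster. The following is a minimal sketch under stated assumptions: java_major and check_java are hypothetical helper names, the version is taken from the quoted string on the first line of java -version output, and any major version of 11 or higher is treated as acceptable.

```shell
# Reduce a Java version string to its major version.
# Legacy "1.x" strings such as 1.8.0_292 map to x; newer strings such as
# 11.0.2 or 17-ea map to their leading number.
java_major() {
  case "$1" in
    1.*) echo "$1" | cut -d. -f2 ;;
    *)   echo "$1" | cut -d. -f1 | cut -d- -f1 ;;
  esac
}

# Succeed only if the java found on PATH reports major version 11 or newer.
check_java() {
  # "java -version" writes to stderr; the version is the quoted string
  # on the first line (assumption about the output layout).
  version="$(java -version 2>&1 | head -n 1 | cut -d'"' -f2)"
  major="$(java_major "$version")"
  [ "${major:-0}" -ge 11 ] 2>/dev/null
}
```

A typical use is check_java || echo "Java 11 is required" on each host that will run Accumulo 2.1 processes.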
The basic upgrade sequence is:
- upgrade to at least Accumulo 1.10 first (if necessary)
- stop Accumulo 1.10 or 2.0
- prepare your installation of Accumulo 2.1 through whatever means you obtain the binaries and configure it in your environment
- start ZooKeeper and HDFS
- (optional - but recommended) create a ZooKeeper snapshot
- (optional - but recommended) validate the ZooKeeper ACLs. See ZooKeeper ACLs
- (required if not using the provided scripts to start 2.1) run the RenameMasterDirInZK utility
- (optional) run the pre-upgrade utility to convert the configuration in ZooKeeper
- start Accumulo 2.1 for the first time to complete the upgrade
IMPORTANT! Before starting any upgrade process, you need to make sure there are no outstanding
FATE transactions. This includes transactions that have completed with SUCCESS or FAILED but
have not been removed by the automatic clean-up process. This is required because the internal
serialization of FATE transactions is not guaranteed to be compatible between versions, so ANY
FATE transaction that is present will fail the upgrade. Procedures to manage FATE transactions,
including commands to fail and delete transactions, are included in the FATE Administration
documentation.
Two significant changes occurred in 2.1 that are particularly important to note for upgrades:
- properties and services that referenced master are renamed manager, and
- the internal property storage format in ZooKeeper has changed: instead of each table, namespace, and the system configuration using separate ZooKeeper nodes for each of their properties, they each now use only a single ZooKeeper node for all of their respective properties.
Details on renaming the properties and the ZooKeeper property conversion are provided in the following sections. Additional information on configuring 2.1 is available here.
Create ZooKeeper snapshot (optional - but recommended)
Before upgrading to 2.1, it is suggested that you create a snapshot of the current ZooKeeper contents as a backup in case issues occur and you need to roll back. Once an upgrade has been completed, the only way to roll back to a previous Accumulo version is to restore from a snapshot of ZooKeeper.
$ACCUMULO_HOME/bin/accumulo dump-zoo --xml --root /accumulo | tee PATH_TO_SNAPSHOT
If you need to restore from the ZooKeeper snapshot see these instructions.
Rename master Properties, Config Files, and Script References
It is strongly recommended as a part of the upgrade to rename any properties in
accumulo.properties (or properties specified on the command line) starting with master. to
use the equivalent property starting with manager. instead, as the old properties will not be
available in subsequent major releases. This version may log or display warnings if older
properties are observed.
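As a rough illustration of the rename, the prefix can be rewritten with sed. This is a minimal sketch, assuming each equivalent property differs from the old one only in its prefix; rename_master_props is a hypothetical helper name, and the result should be reviewed against the 2.1 property documentation before it replaces the live file.

```shell
# Rewrite property keys that start with "master." to start with "manager."
# Reads properties on stdin and writes the result to stdout. Comment lines
# and keys that merely contain "master." elsewhere are left untouched.
rename_master_props() {
  sed -E 's/^([[:space:]]*)master\./\1manager./'
}
```

For example, rename_master_props < accumulo.properties > accumulo.properties.new writes a renamed copy that can be compared with the original using diff before it is put in place.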
Any reference to master in other scripts (e.g., invoking accumulo-service master from an init
script) should be renamed to manager (for example, accumulo-service manager).
If the manager is not started using the provided accumulo-cluster or accumulo-service scripts,
then a one-time upgrade step will need to be performed. Run the RenameMasterDirInZK utility
after installing 2.1 but before starting it.
${ACCUMULO_HOME}/bin/accumulo org.apache.accumulo.manager.upgrade.RenameMasterDirInZK
Pre-Upgrade the property storage in ZooKeeper (optional)
As mentioned above, the property storage in ZooKeeper has changed from many nodes per table, namespace, and the system configuration, to just a single node for each of those. The conversion to the new format happens automatically when the Accumulo 2.1 manager is started for the first time. However, you can optionally convert the properties beforehand using a command line utility, which may provide more flexibility in handling issues if they were to occur. With ZooKeeper running, the command to convert the properties is:
$ACCUMULO_HOME/bin/accumulo config-upgrade
The utility will print messages about its progress as it converts them.
2022-11-03T14:35:44,596 [conf.SiteConfiguration] INFO : Found Accumulo configuration on classpath at /opt/fluo-uno/install/accumulo-3.0.0-SNAPSHOT/conf/accumulo.properties
2022-11-03T14:35:45,511 [util.ConfigPropertyUpgrader] INFO : Upgrade system config properties for a1518a8b-f007-41ee-af2c-5cc760abe7fd
2022-11-03T14:35:45,675 [util.ConfigTransformer] INFO : property transform for SystemPropKey{InstanceId=a1518a8b-f007-41ee-af2c-5cc760abe7fd'} took 29ms ms, delete count: 1, error count: 0
2022-11-03T14:35:45,683 [util.ConfigPropertyUpgrader] INFO : Upgrading namespace +accumulo base path: /accumulo/a1518a8b-f007-41ee-af2c-5cc760abe7fd/namespaces/+accumulo/conf
...
2022-11-03T14:35:45,737 [util.ConfigPropertyUpgrader] INFO : Upgrading table !0 base path: /accumulo/a1518a8b-f007-41ee-af2c-5cc760abe7fd/tables/!0/conf
2022-11-03T14:35:45,813 [util.ConfigTransformer] INFO : property transform for TablePropKey{TableId=!0'} took 72ms ms, delete count: 26, error count: 0
...
If the upgrade utility is not used, similar messages will print to the server logs when 2.1 starts.
When the property conversion is complete, you can verify the configuration using the zoo-info-viewer utility (new in 2.1):
$ACCUMULO_HOME/bin/accumulo zoo-info-viewer --print-props
Create new cluster configuration file
The accumulo-cluster script now uses a single file that defines the location of the managers,
tservers, etc. You can create this file using the command accumulo-cluster create-config. You
will then need to transfer the contents of the current individual files to this new
consolidated file.
Encrypted Instances
Warning: Upgrading a previously encrypted instance with the experimental encryption properties is not supported as the implementation and properties have changed. You may be able to disable encryption and compact your files without encryption, in order to upgrade. Encryption remains an experimental feature, and may change between versions. It should be used with care. If you need help, consider reaching out to our mailing list.
Upgrading from 1.8/9/10 to 2.0
Follow the steps below to upgrade your Accumulo instance and client to 2.0.
Upgrade Accumulo instance
IMPORTANT! Before upgrading to Accumulo 2.0, you will need to upgrade to Java 8 and Hadoop 3.x.
Upgrading to Accumulo 2.0 is done by stopping Accumulo 1.8/9 and starting Accumulo 2.0.
Before stopping Accumulo 1.8/9, install Accumulo 2.0 and configure it by following the 2.0 quick start.
There are several changes to scripts and configuration in 2.0 so be careful when using configuration or automated setup designed for 1.8/9. Below are some changes in 2.0 that you should be aware of:
- accumulo.properties has replaced accumulo-site.xml. You can either convert accumulo-site.xml by hand from XML to properties or use the following Accumulo command:
  accumulo convert-config -x old/accumulo-site.xml -p new/accumulo.properties
  - The following server properties were deprecated for 2.0:
- accumulo-client.properties has replaced client.conf. The client properties in the new file are different, so take care when customizing.
- The accumulo-cluster script has replaced the start-all.sh & stop-all.sh scripts.
  - Default host files (e.g., masters, monitor, gc) are no longer in the conf/ directory of the tarball but can be created using accumulo-cluster create-config
  - Tablet server hosts must be listed in a tservers file instead of a slaves file. To minimize confusion, Accumulo will not start if the old slaves file is present.
- The accumulo-service script can be used to start/stop Accumulo services (e.g., master, tablet server, monitor) on a single node.
  - Can be used even if Accumulo was started using the accumulo-cluster script.
- accumulo-env.sh constructs environment variables (such as JAVA_OPTS and CLASSPATH) used when running Accumulo processes
  - This file was used in Accumulo 1.x but has changed significantly for 2.0
  - Environment variables (such as $cmd, $bin, $conf) are set before accumulo-env.sh is loaded and can be used to customize the environment.
  - The JAVA_OPTS variable is constructed in accumulo-env.sh to pass command line arguments to the java command that starts Accumulo processes (i.e., java $JAVA_OPTS main.class.for.$cmd).
  - The CLASSPATH variable sets the Java classpath used when running Accumulo processes. It can be modified to upgrade dependencies or use vendor-specific distributions of Hadoop.
- Logging is configured in accumulo-env.sh for Accumulo processes. The following log4j configuration files in the conf/ directory will be used if accumulo-env.sh is not modified. These files can be modified to turn on/off logging for Accumulo processes:
  - log4j-service.properties for all Accumulo services (except monitor)
  - log4j-monitor.properties for Accumulo monitor
  - log4j.properties for Accumulo clients and commands
- MapReduce jobs that read/write from Accumulo must configure their dependencies differently.
- Run the command accumulo shell to access the shell using configuration in conf/accumulo-client.properties
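Because Accumulo 2.0 will not start while an old slaves file is present, it can help to check the configuration directory before the first start. The following is a minimal sketch; check_host_files is a hypothetical helper name and the directory to inspect is passed as an argument.

```shell
# Report leftover 1.x host files in an Accumulo conf directory.
# Returns 0 when a tservers file exists and no slaves file remains,
# and 1 otherwise, with a short explanation on stderr.
check_host_files() {
  conf_dir="$1"
  if [ -e "$conf_dir/slaves" ]; then
    echo "found $conf_dir/slaves: move its hosts to a tservers file and remove it" >&2
    return 1
  fi
  if [ ! -e "$conf_dir/tservers" ]; then
    echo "no $conf_dir/tservers file: create one (see accumulo-cluster create-config)" >&2
    return 1
  fi
  return 0
}
```

For example, check_host_files /path/to/accumulo-2.0/conf can be run as a guard before accumulo-cluster start; the path shown is a placeholder.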
When your Accumulo 2.0 installation is properly configured, stop Accumulo 1.8/9 and start Accumulo 2.0:
./accumulo-1.9.3/bin/stop-all.sh
./accumulo-2.0.1/bin/accumulo-cluster start
It is recommended that users test this upgrade on development or test clusters before attempting it on production clusters.
Upgrade Accumulo clients
There are several client API changes in 2.0. In most cases, a new API was introduced and the old API was only deprecated. While it is recommended that users start using the new API, the old API will continue to be supported through 2.x.
Below is a list of client API changes that users are required to make for 2.0:
- Update your pom.xml to use Accumulo 2.0. Also, update any Hadoop & ZooKeeper dependencies in your pom.xml to match the versions running on your cluster.
  <dependency>
    <groupId>org.apache.accumulo</groupId>
    <artifactId>accumulo-core</artifactId>
    <version>2.0.1</version>
  </dependency>
- ClientConfiguration objects can no longer be created using new ClientConfiguration().
  - Use ClientConfiguration.create() instead
- Some API deprecated in 1.x releases was dropped
- Aggregators have been removed
Below is a list of recommended client API changes:
- The API for creating Accumulo clients has changed in 2.0.
  - The old API using ZooKeeperInstance, Connector, Instance, and ClientConfiguration has been deprecated.
  - Connector objects can be created from an AccumuloClient object using Connector.from()
- Accumulo’s MapReduce API has changed in 2.0.
  - A new API has been introduced in the org.apache.accumulo.hadoop package of the accumulo-hadoop-mapreduce jar.
  - The old API in the org.apache.accumulo.core.client package of the accumulo-core jar has been deprecated and will eventually be removed.
  - For both the old and new API, you must configure dependencies differently when creating your MapReduce job.
Upgrading from 1.7 to 1.8
Upgrades from 1.7 to 1.8 are possible with little effort as no changes were made at the data layer and RPC changes were made in a backwards-compatible way. The recommended way is to stop Accumulo 1.7, perform the Accumulo upgrade to 1.8, and then start 1.8. Like previous versions, after 1.8 is started on a 1.7 instance, a one-time upgrade will happen by the Master which will prevent a downgrade back to 1.7. Upgrades are still one way. Upgrades from versions prior to 1.7 to 1.8 should follow the below path to 1.7 and then perform the upgrade to 1.8 – direct upgrades to 1.8 for versions other than 1.7 are untested.
Existing configuration files from 1.7 should be compared against the examples provided in 1.8. The 1.7 configuration files should all function with 1.8 code, but you will likely want to include changes found in the 1.8.0 release notes and these release notes for 1.8.1.
For upgrades from prior to 1.7, follow the upgrade instructions to 1.7 first.
Upgrading from 1.7.x to 1.7.y
The recommended way to upgrade from a prior 1.7.x release is to stop Accumulo, upgrade to 1.7.y and then start 1.7.y.
When upgrading, there is a known issue if the upgrade fails due to outstanding FATE operations; see ACCUMULO-4496. The workaround if this situation is encountered:
- Start tservers
- Start shell
- Run fate print to list all
- If completed, just delete with fate delete
- Start masters once there are no more fate operations
If any of the FATE operations are not complete, you should roll back the upgrade and troubleshoot completing them with your prior version. When performing an upgrade between major versions, the upgrade is one-way; therefore, it is important that you do not have any outstanding FATE operations before starting the upgrade.
Upgrading from 1.6 to 1.7
Upgrades from 1.6 to 1.7 are possible with little effort as no changes were made at the data layer and RPC changes were made in a backwards-compatible way. The recommended way is to stop Accumulo 1.6, perform the Accumulo upgrade to 1.7, and then start 1.7. Like previous versions, after 1.7.0 is started on a 1.6 instance, a one-time upgrade will happen by the Master which will prevent a downgrade back to 1.6. Upgrades are still one way. Upgrades from versions prior to 1.6 to 1.7 should follow the below path to 1.6 and then perform the upgrade to 1.7 – direct upgrades to 1.7 for versions other than 1.6 are untested.
After upgrading to 1.7.0, users will notice the addition of a replication table in the accumulo namespace. This table is created and put offline to avoid any additional maintenance if the data-center replication feature is not in use.
Existing configuration files from 1.6 should be compared against the examples provided in 1.7. The 1.6 configuration files should all function with 1.7 code, but you will likely want to include a new file (hadoop-metrics2-accumulo.properties) to enable the new metrics subsystem. Read the section on Hadoop Metrics2 in the Administration chapter of the Accumulo User Manual.
For each of the other new features, new configuration properties exist to support the feature. Refer to the added sections in the User Manual for the feature for information on how to properly configure and use the new functionality.
Upgrading from 1.5 to 1.6
This happens automatically the first time Accumulo 1.6 is started.
If your instance previously upgraded from 1.4 to 1.5, you must verify that your 1.5 instance has no outstanding local write ahead logs. You can do this by ensuring either:
- All of your tables are online and the Monitor shows all tablets hosted
- The directory for write ahead logs (logger.dir.walog) from 1.4 has no files remaining on any tablet server / logger hosts
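The second check can be scripted on each tablet server or logger host. The following is a minimal sketch; walog_dir_is_empty is a hypothetical helper name, and the directory to pass is whatever logger.dir.walog pointed to on that host.

```shell
# Succeed only if the given directory contains no regular files.
# A missing directory is treated as "no files remaining".
walog_dir_is_empty() {
  dir="$1"
  [ -d "$dir" ] || return 0
  [ -z "$(find "$dir" -type f | head -n 1)" ]
}
```

For example, walog_dir_is_empty /path/to/walogs || echo "local write ahead logs remain" flags a host that still has files to recover; the path shown is a placeholder.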
To upgrade from 1.5 to 1.6 you must:
- Verify that there are no outstanding FATE operations
  - Under 1.5 you can list what’s in FATE by running
    $ACCUMULO_HOME/bin/accumulo org.apache.accumulo.server.fate.Admin print
  - Note that operations in any state will prevent an upgrade. It is safe to delete operations with status SUCCESSFUL. For others, you should restart your 1.5 cluster and allow them to finish.
- Stop the 1.5 instance.
- Configure 1.6 to use the hdfs directory and zookeepers that 1.5 was using.
- Copy other 1.5 configuration options as needed.
- Start Accumulo 1.6.
The upgrade process must make changes to Accumulo’s internal state in both ZooKeeper and the table metadata. This process may take some time if Tablet Servers have to go through recovery. During this time, the Monitor will claim that the Master is down and some services may send the Monitor log messages about failure to communicate with each other. These messages are safe to ignore. If you need detail on the upgrade’s progress you should view the local logs on the Tablet Servers and active Master.
Upgrading from 1.4 to 1.6
To upgrade from 1.4 to 1.6 you must perform a manual initial step.
Prior to upgrading you must:
- Verify that there are no outstanding FATE operations
  - Under 1.4 you can list what’s in FATE by running
    $ACCUMULO_HOME/bin/accumulo org.apache.accumulo.server.fate.Admin print
  - Note that operations in any state will prevent an upgrade. It is safe to delete operations with status SUCCESSFUL. For others, you should restart your 1.4 cluster and allow them to finish.
- Stop the 1.4 instance.
- Configure 1.6 to use the hdfs directory, walog directories, and zookeepers that 1.4 was using.
- Copy other 1.4 configuration options as needed.
Prior to starting the 1.6 instance you will need to run the LocalWALRecovery tool on each node that previously ran an instance of the Logger role.
$ACCUMULO_HOME/bin/accumulo org.apache.accumulo.tserver.log.LocalWALRecovery
The recovery tool will rewrite the 1.4 write ahead logs into a format that 1.6 can read. After this step has completed on all nodes, start the 1.6 cluster to continue the upgrade.
The upgrade process must make changes to Accumulo’s internal state in both ZooKeeper and the table metadata. This process may take some time if Tablet Servers have to go through recovery. During this time, the Monitor will claim that the Master is down and some services may send the Monitor log messages about failure to communicate with each other. While the upgrade is in progress, the Garbage Collector may complain about invalid paths. The Master may also complain about failure to create the trace table because it already exists. These messages are safe to ignore. If other error messages occur, you should seek out support before continuing to use Accumulo. If you need detail on the upgrade’s progress you should view the local logs on the Tablet Servers and active Master.
Note that the LocalWALRecovery tool does not delete the local files. Once you confirm that 1.6 is successfully running, you should delete these files on the local filesystem.
Upgrading from 1.4 to 1.5
This happens automatically the first time Accumulo 1.5 is started.
- Stop the 1.4 instance.
- Configure 1.5 to use the hdfs directory, walog directories, and zookeepers that 1.4 was using.
- Copy other 1.4 configuration options as needed.
- Start Accumulo 1.5.