Managing Journal Files

The journal files for the transaction log are designed so that AMPS can archive, compress, and remove them while AMPS is running. AMPS actions provide integrated administration for journal files, as described in Automating AMPS with Actions.

Archiving a file copies the file to an archival directory, typically located on higher-capacity but higher-latency storage. Compressing a file compresses the file in place. Archived and compressed journal files are still accessible to clients for replay and for AMPS to use in rebuilding any SOW files that are damaged or removed.

When defining a policy for archiving, compressing, or removing files, keep in mind how long clients will need to replay data. Once journal files have been deleted, the messages in those files are no longer available for clients to replay or for AMPS to use in recreating a SOW file. If journal files are removed while a SOW file is retained, the SOW may contain data that is no longer in the transaction log.

While AMPS is running, the amps-action-do-remove-journal action is the only way to safely remove a journal file. This action correctly updates the internal AMPS data structures that refer to the journal file.

Likewise, the amps-action-do-archive-journal action is the only way to safely move a journal file to the archive directory while AMPS is running, and the amps-action-do-compress-journal action is the only way to safely compress journals while AMPS is running.

To determine how best to manage your journal files, consider your application's access pattern to the recorded messages. Most applications have a period of time (often a day or a week) where historical data is in heavy use, and a period of time (often a week or a month) where data is infrequently used.

One common strategy is to create the journal files on high-throughput storage. The files are archived to slower, higher-capacity storage after a short period of time, compressed, and then removed after a longer period of time. This strategy preserves space on high-throughput storage, while still allowing the journals to be used.

For example, if your applications frequently replay data for the last day, occasionally replay data older than the last week, and never request data older than one month, a management strategy that meets these needs would be to archive files after one day, compress them after a week, and remove them after one month. Archival, compression, and removal should be done using AMPS actions.
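As a planning aid, the example policy can be written down as a small age-to-stage mapping. The sketch below is illustrative only: the function name, the constants, and the choice of 30 days for "one month" are assumptions, and it does not touch any files. The actual archival, compression, and removal should still be performed by AMPS actions.

```python
from datetime import timedelta

# Thresholds from the example policy. These values are assumptions
# for illustration; "one month" is approximated as 30 days.
ARCHIVE_AFTER = timedelta(days=1)
COMPRESS_AFTER = timedelta(days=7)
REMOVE_AFTER = timedelta(days=30)


def stage_for_age(age: timedelta) -> str:
    """Return the furthest stage a journal file of this age should have reached."""
    if age >= REMOVE_AFTER:
        return "remove"
    if age >= COMPRESS_AFTER:
        return "compress"
    if age >= ARCHIVE_AFTER:
        return "archive"
    return "keep"


if __name__ == "__main__":
    for days in (0, 2, 10, 45):
        print(days, "days ->", stage_for_age(timedelta(days=days)))
```

Writing the thresholds down this way makes it easy to check that each one is at least as long as the replay window your clients depend on.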

If you remove journal files when AMPS is shut down, keep in mind that the removal of journal files must be sequential and cannot leave gaps in the remaining files. For example, suppose there are three journal files: 001, 002, and 003. If only 002 is removed, the next AMPS restart could overwrite journal file 003, causing an unrecoverable problem.
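The no-gaps rule can be checked before any file is deleted. The following is a minimal sketch, assuming journal files are identified by consecutive integer sequence numbers as in the 001, 002, 003 example. The function name is hypothetical, and the sketch does not parse real journal file names or delete anything.

```python
def leaves_gap(existing: list[int], to_remove: set[int]) -> bool:
    """Return True if removing `to_remove` leaves a gap in the remaining files.

    `existing` holds the sequence numbers of the journal files on disk,
    which are assumed to be contiguous before the removal.
    """
    remaining = sorted(set(existing) - to_remove)
    # A gap exists when two neighbouring survivors are not consecutive.
    return any(b - a != 1 for a, b in zip(remaining, remaining[1:]))


if __name__ == "__main__":
    journals = [1, 2, 3]
    print(leaves_gap(journals, {2}))     # removing only 002 leaves 001 and 003
    print(leaves_gap(journals, {1}))     # removing the oldest file first
    print(leaves_gap(journals, {1, 2}))  # removing the two oldest files
```

Removing the oldest files first, in order, never trips this check.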

When using AMPS actions to manage journal files, AMPS ensures that all replays from a journal file are complete, all queue messages in that journal file have been delivered (and acknowledged, if required), and all messages from a journal file have been successfully replicated before removing the file.

Reference to File Types

AMPS creates the following types of files as part of creating and managing the transaction log. Notice that this includes both files that contain messages (journal files) and a set of files created by AMPS to improve efficiency when the instance is restarting and recovering the state of the transaction log.

The files for a specific instance are prefixed with the instance Name. An AMPS instance will only create files that are prefixed with the Name of the instance, and on startup will only recover files that are prefixed with the Name of the instance.

| Extension | File Type | Description |
|---|---|---|
| .journal | Journal file | These files contain the messages that comprise the transaction log. AMPS always writes new messages to an uncompressed journal file. |
| .journal.gz | Compressed journal file | These files contain messages that comprise the transaction log. These files have been compressed by AMPS as a result of the amps-action-do-compress-journal action. Other than being compressed, they are treated identically to uncompressed journal files. |
| .index.gz | Journal index file | These files are used during recovery to help AMPS quickly rebuild its references to the content of the transaction log without having to completely reprocess each file. Each index file contains index information for the corresponding journal file. These files do not contain messages. |
| .topic.index | Topic index file | These files are used during replay to help AMPS quickly locate messages for a given topic. Creating a topic index is optional, and is enabled through instance level Tuning configuration. These files do not contain messages. |
| .clients.ack | Clients acknowledgment cache | Used during recovery to help AMPS quickly identify the last message persisted from each publisher without having to reprocess each journal file. These files do not contain messages. |
| .queues.ack | Queue acknowledgment cache | For each queue, this file stores the point in the transaction log for which that queue has been completely processed (that is, all messages prior to that point in the transaction log have been acknowledged or expired). On recovery, AMPS can begin restoring the state of the queue from that point rather than reprocessing the entire transaction log. These files do not contain messages. |
| .queue.cache | Queue metadata cache | For a queue that specifies that metadata be cached in a file, this is the file that contains the cache. If this file is removed, the queue state will be restored from the transaction log (using the recovery point stored in the queues.ack file). These files do not contain messages or message headers. They contain delivery state for messages in the queue and the location of those messages in the transaction log. |
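When auditing a transaction log directory, it can help to separate the files that hold messages from the files that hold only indexes and caches. The sketch below encodes the extensions and file types listed in this reference. The function name, the instance name `demo`, and the sample file names are hypothetical and chosen only for illustration, and the prefix test is a simple string match rather than AMPS's own recovery logic.

```python
# Extension -> (file type, whether the file contains messages),
# taken from the file type reference in this section.
FILE_TYPES = {
    ".journal": ("Journal file", True),
    ".journal.gz": ("Compressed journal file", True),
    ".index.gz": ("Journal index file", False),
    ".topic.index": ("Topic index file", False),
    ".clients.ack": ("Clients acknowledgment cache", False),
    ".queues.ack": ("Queue acknowledgment cache", False),
    ".queue.cache": ("Queue metadata cache", False),
}


def describe(filename: str, instance_name: str):
    """Return (file type, contains_messages), or None if the file is not
    prefixed with the instance name or has an unrecognized extension."""
    if not filename.startswith(instance_name):
        return None
    for extension, info in FILE_TYPES.items():
        if filename.endswith(extension):
            return info
    return None


if __name__ == "__main__":
    # Hypothetical file names for an instance assumed to be named "demo".
    for name in ("demo.0001.journal", "demo.0001.journal.gz",
                 "demo.0001.index.gz", "other.0001.journal"):
        print(name, "->", describe(name, "demo"))
```

Only the first two entries in the mapping hold messages, so those are the files whose retention determines how far back clients can replay.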


Copyright 2013-2024 60East Technologies, Inc.