/*
# Licensed Materials - Property of IBM
# Copyright IBM Corp. 2016
*/
/**
* At least once processing using consistent regions.
*
* <H2>Consistent regions</H2>
*
* <H3>Overview</H3>
*
* Because of business requirements, some applications require that all tuples
* in an application are processed at least once. You can use a consistent
* region in your streams processing applications to avoid data loss due to
* software or hardware failure and meet your requirements for at-least-once
* processing.
* <P>
* A consistent region is a subgraph where the state of the region (the operators
* and transformations within it) becomes
* consistent by processing all the tuples within defined
* points on a stream. This enables tuples within the subgraph to be processed
* at least once. The consistent region is periodically drained of its current
* tuples: all tuples in the consistent region are processed through to the end
* of the subgraph. The in-memory state of each operator in the region is then
* automatically serialized and stored in a checkpoint.
* </P>
* <P>
* If any element in a consistent region fails at run time, IBM Streams detects
* the failure and triggers the restart of the element and the reset of the
* region. In-memory state of the region is automatically reloaded to
* a consistent point.
* </P>
* <P>
* The capability to drain the subgraph, which is coupled with start operators
* that can replay their output streams, enables a consistent region to achieve
* at-least-once processing.
* </P>
* <P>
* A stream processing application can be defined with zero, one, or more
* consistent regions. You can define the start of a consistent region with
* {@link com.ibm.streamsx.topology.TStream#setConsistent(ConsistentRegionConfig) setConsistent()}.
* IBM Streams then determines
* the scope of the consistent region automatically, but you can reduce the
* scope of the region with {@link com.ibm.streamsx.topology.TStream#autonomous()}.
* </P>
* <P>
* When a subgraph is a consistent region, IBM Streams enables the
* operators in that region to drain and reset. When a region is draining, it
* establishes logical divisions in the output streams of each operator in the
* region. A drain is successful when all operators in the region establish a
* logical division in their output streams, and when all tuples before the
* logical division are processed in their input streams. If a drain is
* successful, it means that all operators in the region consumed all input
* streams up until the established logical division. While the region is
* draining or resetting, operators in the region that completed their draining
* or resetting cannot submit new tuples. This behavior means that the tuple
* flow within the subgraph briefly stops while the region is draining or resetting.
* </P>
*
* <H3>Start of a consistent region</H3>
* A consistent region is started by marking a source stream as consistent using
* {@link com.ibm.streamsx.topology.TStream#setConsistent(ConsistentRegionConfig)}.
* The source operator for the {@code TStream} must support logic that,
* after a failure in the region, can replay tuples since the last checkpoint.
* Typically the stream is a source stream that produces tuples from an external system
* like Apache Kafka.
* <P>
* The logic that produces a replayable stream must be implemented
* using the IBM Streams Java or C++ primitive operator APIs.
* <BR>
* A functional source
* (e.g. {@link com.ibm.streamsx.topology.Topology#source(com.ibm.streamsx.topology.function.Supplier)})
* cannot be used as the start of a consistent region.
* The stream must therefore come from an invocation of an SPL operator through
* {@link com.ibm.streamsx.topology.spl.SPL} or
* {@link com.ibm.streamsx.topology.spl.JavaPrimitive}.
* Such an invocation may be wrapped by a Java method that provides
* a simplified version of the invocation for application developers.
* </P>
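* <P>
* As a minimal sketch of the above, the following starts a periodic consistent region
* from an SPL source operator. The operator kind {@code com.example::ReplayableSource},
* its {@code topic} parameter and the schema are hypothetical placeholders; a real
* application would invoke an operator that supports replay. The period is assumed to
* use the default time unit of seconds.
* </P>
* <pre>{@code
* Topology topology = new Topology("AtLeastOnce");
*
* // Hypothetical replayable SPL source operator and its parameters.
* Map<String, Object> params = new HashMap<>();
* params.put("topic", "events");
* StreamSchema schema = Type.Factory.getStreamSchema("tuple<rstring message>");
* SPLStream source = SPL.invokeSource(topology, "com.example::ReplayableSource", params, schema);
*
* // Mark the source stream as the start of a consistent region that is
* // drained and checkpointed periodically (here every 30 seconds).
* source.setConsistent(ConsistentRegionConfig.periodic(30));
* }</pre>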
*
* <H3>Drain-checkpoint cycle</H3>
* When a region is triggered it is:
* <OL>
* <LI> <em>drained</em> of any tuple-related processing for
* all tuples seen on each stream in the region.</LI>
* <LI> <em>checkpointed</em> to reflect the state of each tuple processor (operator)
* after it has processed all tuples on its input streams (it has been <em>drained</em>).</LI>
* </OL>
* This drain-checkpoint cycle results in a region where all the operators are
* consistent with having processed all tuples seen on their input streams, and the
* region as a whole is consistent with having processed all tuples the source operator
* has submitted.
* <P>
* After any failure in the region, all operators are reset to a previous consistent point
* and then tuple processing resumes with the source operator replaying tuples since
* the last consistent point.
* </P>
* <H3>At least once/exactly once processing</H3>
* From the point of view of an operator in the region, tuple processing is effectively exactly
* once even though tuples are replayed after a failure. This is because the consistent
* region protocol resets each operator's state to a point before the tuples were seen for
* the first time. Each operator forgets that it has seen the replayed tuples, and thus it sees them
* effectively exactly once from an operator state point of view.
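* <P>
* The effect can be illustrated with a toy model in plain Java. This is not the IBM Streams
* API, and every name in it is made up for the sketch: operator state is a simple counter,
* and a checkpoint records the counter together with the source position.
* </P>
* <pre>{@code
* List<String> source = Arrays.asList("a", "b", "c", "d");
* int count = 0;            // operator state
* int checkpointCount = 0;  // state saved at the last consistent point
* int checkpointOffset = 0; // source position at the last consistent point
*
* // Process "a" and "b", then drain and checkpoint.
* count += 2;
* checkpointCount = count;
* checkpointOffset = 2;
*
* // Process "c", then a failure occurs before the next checkpoint.
* count += 1;
*
* // Reset: restore the state and replay from the checkpointed offset.
* count = checkpointCount;
* for (String tuple : source.subList(checkpointOffset, source.size())) {
*     count++; // "c" is seen again, but its earlier contribution was discarded
* }
* // count is now 4: each tuple is reflected exactly once in the state
* }</pre>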
* <P>
* At least once processing is only seen when the operator modifies external state that cannot
* be undone during a reset. For example, sending an SMS text cannot be undone, so an IBM Streams
* application using a consistent region for monitoring and sending text alerts could result
* in more than one text message indicating an issue.
* <BR>
* With coordination between the operator and the external system, exactly once processing
* is possible, depending on the capabilities of the external system. For example, exactly
* once can be achieved with databases and file systems.
* </P>
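* <P>
* One possible form of such coordination is an idempotent write, sketched here in plain
* Java. The sketch assumes that each tuple carries a unique id and uses a {@code Map} as a
* stand-in for an external system that supports keyed writes; neither assumption comes from
* the IBM Streams API.
* </P>
* <pre>{@code
* // Stand-in for an external system that supports keyed (idempotent) writes.
* Map<Long, String> externalStore = new HashMap<>();
*
* // A replayed tuple overwrites its earlier write instead of adding a new entry.
* externalStore.put(1L, "alert: disk full");
* externalStore.put(2L, "alert: high load");
* externalStore.put(2L, "alert: high load"); // replayed after a reset
*
* // externalStore.size() is 2: the replay did not create a duplicate entry
* }</pre>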
* <H3>Functional logic in consistent regions</H3>
*
* The checkpointing of functional logic is identical for
* consistent regions and
* {@link com.ibm.streamsx.topology.Topology#checkpointPeriod(long, java.util.concurrent.TimeUnit) autonomous checkpointing}.
*
* <H4>Stateless functions</H4>
* Stateless functions can be used in a consistent region, e.g. a stateless filter
* like {@code t = t.filter(s -> !s.isEmpty())} on a {@code TStream<String>}.
* During a drain-checkpoint cycle no processing occurs related to the stateless function.
*
* <H4>Stateful functions</H4>
* Stateful functions can be used in a consistent region. During a drain-checkpoint cycle
* the function instance will be serialized as the checkpointed state.
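* <P>
* For illustration, a stateful function might look like the following sketch. The class
* name {@code RunningCount} and the stream {@code lines} are assumptions for the example.
* The {@code count} field is part of the serialized function instance, so it is saved at
* each checkpoint and restored on reset.
* </P>
* <pre>{@code
* class RunningCount implements Function<String, Long> {
*     private static final long serialVersionUID = 1L;
*     private long count; // checkpointed as part of the function instance
*
*     public Long apply(String line) {
*         return ++count;
*     }
* }
*
* // lines is assumed to be a TStream<String> inside the consistent region
* TStream<Long> counts = lines.transform(new RunningCount());
* }</pre>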
*
* <H3>Autonomous regions</H3>
* By default, processing occurs in an autonomous region where
* operator checkpointing and recovery from failure are
* independent of other operators. A consistent region can be
* ended by starting an autonomous region using
* {@link com.ibm.streamsx.topology.TStream#autonomous()}.
* Any processing against the stream returned by {@code autonomous()}
* is outside of the consistent region.
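* <P>
* A short sketch, where {@code region} is assumed to be a {@code TStream<String>} inside
* a consistent region:
* </P>
* <pre>{@code
* // Still inside the consistent region.
* TStream<String> filtered = region.filter(s -> !s.isEmpty());
*
* // Processing downstream of autonomous() is outside the consistent region.
* TStream<String> outside = filtered.autonomous();
* outside.print();
* }</pre>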
*/
package com.ibm.streamsx.topology.consistent;