Advanced Features
Scheduling
By default, the connector runs continuously, following streaming principles. However, you can control how often data extraction occurs by setting the exec-period configuration for each ODP source. This defines how frequently data will be extracted within a given time interval.
Alternatively, you can use the cron.expression and cron.security.time.sec settings to create a cron-like schedule for extraction, which is helpful if the source system has specific timeframes for when data extraction should occur.
Partitioning
Each extracted record is associated with a partition based on the ODP's context name and its identifier (name). Together, these act as the primary key for each ODP source, as defined by SAP®.
Advanced OData Querying
Field Projections
Each ODP source provides a set of fields available for extraction. The connector allows you to select a subset of these fields to be sent to Kafka. You can change the field selection at any time without losing delta data, as the source system stores all changes for every field, regardless of your configuration.
Selections
Full Extraction Selection Range
When using full initialization or full extraction mode, you can apply filters, known as selection ranges, to narrow down the data extracted. A selection range includes:
- A field name from the list of available fields
- A sign to indicate whether to include or exclude the field’s values
- An option defining the comparison operation (e.g., equals, not equals, between)
- A low value for comparisons like equals or the lower bound for intervals (e.g., between)
- A high value for the upper bound of interval-based selections
Multiple selection ranges can be applied, and they are combined using logical AND.
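The way a selection range filters data can be illustrated with a small Python model. This is only a conceptual sketch, not the connector's implementation or configuration syntax: the class SelectionRange, the function apply_ranges, the sign codes "I"/"E", the option codes "EQ"/"NE"/"BT", and the sample records are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class SelectionRange:
    """Hypothetical model of one selection range."""
    field: str                  # field name from the source's available fields
    sign: str                   # "I" = include matching values, "E" = exclude them
    option: str                 # "EQ", "NE" or "BT" (equals, not equals, between)
    low: Any                    # comparison value, or lower bound for "BT"
    high: Optional[Any] = None  # upper bound, only used for "BT"

    def matches(self, record: dict) -> bool:
        value = record[self.field]
        if self.option == "EQ":
            hit = value == self.low
        elif self.option == "NE":
            hit = value != self.low
        elif self.option == "BT":
            hit = self.low <= value <= self.high
        else:
            raise ValueError(f"unsupported option: {self.option}")
        # An including range keeps hits; an excluding range drops them.
        return hit if self.sign == "I" else not hit


def apply_ranges(records, ranges):
    """Keep only records that satisfy every range (logical AND)."""
    return [r for r in records if all(s.matches(r) for s in ranges)]


# Made-up sample data: keep carrier "LH", but exclude prices from 1000 to 2000.
records = [
    {"CARRID": "AA", "PRICE": 400},
    {"CARRID": "LH", "PRICE": 650},
    {"CARRID": "LH", "PRICE": 1200},
]
ranges = [
    SelectionRange("CARRID", "I", "EQ", "LH"),
    SelectionRange("PRICE", "E", "BT", 1000, 2000),
]
filtered = apply_ranges(records, ranges)
```

In this sketch, only the second record passes both ranges. In the real connector the filtering is expected to happen in the source system, so the model only demonstrates the semantics of sign, option, low, and high.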
Delta Selection Range
Delta selection ranges work similarly to full extraction selection ranges but are only applied during delta initialization and subsequent delta requests. If you don’t need different filters for the initial load and subsequent updates, using delta initialization and delta selection ranges is recommended.
Parallelism
Each ODP source is assigned to a single worker task, limiting parallelism due to how the Kafka Connect API operates. Although the connector can handle data loads in packages across multiple tasks, task assignments are locked when the poll() method is called. If the number of ODP sources exceeds the available tasks, one task may handle multiple ODPs.
Scaling by adding more tasks only makes sense when you have more ODP sources than available tasks. Adding more topic partitions for scaling isn’t an option because a single partition ensures that data is processed in the correct order.
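The relationship between ODP sources and tasks can be sketched as follows. The round-robin strategy and the function name assign_sources are assumptions made for illustration; the connector's actual distribution strategy is not described here.

```python
def assign_sources(sources, max_tasks):
    """Distribute ODP sources over at most max_tasks tasks.

    Each source lands in exactly one task; a task may hold several sources.
    Round-robin is an assumed strategy, chosen only for illustration.
    """
    if max_tasks < 1:
        raise ValueError("max_tasks must be at least 1")
    # Never create more tasks than there are sources: extra tasks would be idle.
    task_count = min(len(sources), max_tasks)
    tasks = [[] for _ in range(task_count)]
    for index, source in enumerate(sources):
        tasks[index % task_count].append(source)
    return tasks


# Three sources, two tasks: one task handles two ODPs.
shared = assign_sources(["ODP_A", "ODP_B", "ODP_C"], max_tasks=2)

# Two sources, four tasks: only two tasks are used.
spare = assign_sources(["ODP_A", "ODP_B"], max_tasks=4)
```

The second call shows why raising the task count beyond the number of ODP sources brings no benefit: the additional tasks have nothing to work on.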
Realtime Enabled Sources
The ODP Source Connector supports real-time enabled sources.
- Initialization Process: The initialization process is the same for both real-time and non-real-time sources.
- Real-Time Data Flow: Once initialized, the connector creates a real-time request in the ODQ. This request stays open even if the connector goes down, allowing the ODP daemon to continue pushing data into the ODQ. This reduces the latency of data retrieval, as the data is already in the ODQ before the connector starts the extraction.
- Failure Recovery: If the connector needs to recover from failure, it will close any open real-time requests and create a new one, ensuring the process continues smoothly.
- Configuration Options:
  - realtime.packages: Automatically close real-time requests after a specific number of data packages are extracted.
  - realtime.timelimit: Automatically close real-time requests after a time limit is reached.
  - realtime.enable: Disable real-time requests for real-time enabled sources if needed.
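The two closing conditions can be pictured with a short Python sketch. It is a conceptual model only: the names RealtimeLimits and should_close are hypothetical, and treating the time limit as seconds is an assumption, since the unit of realtime.timelimit is not stated here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RealtimeLimits:
    """Hypothetical stand-in for realtime.packages and realtime.timelimit."""
    max_packages: Optional[int] = None      # None = no package limit
    time_limit_sec: Optional[float] = None  # None = no time limit (seconds assumed)


def should_close(limits, packages_extracted, elapsed_sec):
    """Return True once either configured limit has been reached."""
    if limits.max_packages is not None and packages_extracted >= limits.max_packages:
        return True
    if limits.time_limit_sec is not None and elapsed_sec >= limits.time_limit_sec:
        return True
    return False


limits = RealtimeLimits(max_packages=100, time_limit_sec=3600)
close_by_packages = should_close(limits, packages_extracted=100, elapsed_sec=12.0)
close_by_time = should_close(limits, packages_extracted=5, elapsed_sec=3600.0)
keep_open = should_close(limits, packages_extracted=5, elapsed_sec=12.0)
```

With no limits set, the request in this model is never closed automatically, which matches the description of a real-time request that stays open.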
Plugin Discovery
Plugin Discovery is the strategy the Connect worker uses to find plugin classes and make them available for configuration and execution. This process is controlled by the worker's plugin.discovery setting.
By default, the plugin.discovery setting is HYBRID_WARN, which is compatible with all plugins and logs a warning if it encounters any plugins incompatible with the SERVICE_LOAD mode. The SERVICE_LOAD option, which uses the faster ServiceLoader mechanism, may improve performance during worker startup, but will not load incompatible plugins. See Connect Worker Configuration for all plugin.discovery values.
The ODP Source Connector supports the ServiceLoader mechanism.
For more information about Plugin Discovery and the Connect worker configuration, refer to Kafka Connect Plugin Discovery and to KIP-898: Modernize Connect plugin discovery.