Instance Type Configuration and Data Source Previews

Default Stream Cluster Configuration now includes On-demand instances for Driver nodes

Running the Spark driver node on an on-demand instance can make stream processing more reliable, because a spot termination of the driver would otherwise take down the entire cluster. By mixing instance types, you keep the driver stable while still taking advantage of cheaper spot instances for additional processing power.

The new first_on_demand parameter for DatabricksClusterConfig and EMRClusterConfig lets you configure a mix of on-demand and spot instances in a single cluster. When it is set, the first N nodes of the cluster, where N is the value of first_on_demand, use on-demand instances. The remaining nodes use the availability type specified by instance_availability.

If first_on_demand is not specified, Tecton defaults to first_on_demand=1 for StreamFeatureView and StreamWindowAggregateFeatureView.
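
The allocation rule and the stream default described above can be sketched in plain Python. This is only an illustrative model, not Tecton SDK code: the helper name plan_node_availability and its arguments are hypothetical, the driver is assumed to be the first node in the cluster, and the fallback of 0 for non-stream feature views is an assumption that the text above does not state.

```python
from typing import List, Optional


# Hypothetical helper, not part of the Tecton SDK. It only models the rule
# described above so the resulting node mix is easy to inspect.
def plan_node_availability(
    num_nodes: int,
    instance_availability: str,
    first_on_demand: Optional[int] = None,
    is_stream_feature_view: bool = False,
) -> List[str]:
    """Return the availability type of each node, driver first (assumed order)."""
    if first_on_demand is None:
        # Default of 1 for stream feature views comes from the text above.
        # The value 0 for everything else is an assumption made for this sketch.
        first_on_demand = 1 if is_stream_feature_view else 0
    if first_on_demand < 0:
        raise ValueError("first_on_demand must be non-negative")
    # A cluster cannot have more on-demand nodes than it has nodes.
    on_demand_count = min(first_on_demand, num_nodes)
    remaining = num_nodes - on_demand_count
    return ["on_demand"] * on_demand_count + [instance_availability] * remaining


# One driver plus four workers on a stream feature view, nothing set explicitly.
plan = plan_node_availability(5, "spot", is_stream_feature_view=True)
print(plan)  # ['on_demand', 'spot', 'spot', 'spot', 'spot']
```

With the default in place, only the first node is on-demand and the other four follow instance_availability. Raising first_on_demand trades cost for stability by moving more nodes onto on-demand instances.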

Spot with fallback instance availability for Databricks

Materialization jobs on Databricks can now be configured to use the spot with fallback availability option.

DatabricksClusterConfig.instance_availability now supports the spot_with_fallback option. See the Databricks documentation for more details.

Raw Data Source Preview

You can now use the Tecton SDK to view Data Source inputs before the translator function is applied. Viewing a sample of this raw data can help debug translator and data source issues.

To do so, set apply_translator=False when using the StreamDataSource.start_stream_preview() or BatchDataSource.get_dataframe() methods.
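
To see why a raw preview helps with debugging, here is a small stand-alone sketch of the idea in plain Python. It does not call the Tecton SDK: preview_rows, buggy_translator, and the sample row are all hypothetical, and the default of apply_translator=True is inferred from the instruction above rather than stated by it.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


# Hypothetical stand-in for a data source preview; this is not the Tecton SDK.
def preview_rows(
    raw_rows: List[Row],
    translator: Callable[[Row], Row],
    apply_translator: bool = True,
) -> List[Row]:
    """Return the rows as a preview would show them."""
    if not apply_translator:
        # Raw view: the input exactly as it arrives, before translation.
        return list(raw_rows)
    return [translator(row) for row in raw_rows]


def buggy_translator(row: Row) -> Row:
    # Bug for illustration: reads "amt", but the raw field is named "amount".
    return {"user_id": row.get("user"), "amount": row.get("amt")}


raw = [{"user": "u1", "amount": 12.5}]

translated_view = preview_rows(raw, buggy_translator)
raw_view = preview_rows(raw, buggy_translator, apply_translator=False)

print(translated_view)  # [{'user_id': 'u1', 'amount': None}]
print(raw_view)         # [{'user': 'u1', 'amount': 12.5}]
```

The translated view only shows a missing amount. The raw view shows that the value is present under a different field name, which points to the translator rather than the data source as the cause.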
