Blog

Dec 2, 2025

Alloy: A New Architecture for Declarative Data Engineering

The Alloy Architecture introduces a structured, five-layer refinement model that eliminates hidden pipeline complexity. By replacing ad-hoc transformation logic with a consistent, predictable flow, Alloy brings clarity, performance, and governance to modern data engineering.

Vadim Orlov 11/26/24

Refresh Strategies in DataForge

Discover the power of DataForge Cloud's refresh patterns to streamline your data pipelines. In this video, you'll learn about six key refresh methods: full refresh for initial dataset ingestion, append-only for incremental data updates, and advanced options like timestamp, sequence, and custom patterns for handling time-series data or unique scenarios. Watch as we demonstrate configurations, simulate dataset changes, and explore features like watermarks for tracking updates, historical data preservation, and atomic processing. Whether managing small datasets or complex time-series data, DataForge Cloud empowers you to optimize data transformations with precision and flexibility.
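As a rough illustration of the timestamp pattern and its watermark, here is a minimal Python sketch. It is not DataForge Cloud's implementation or API: the function name `timestamp_refresh`, the `updated_at` column, and the in-memory list standing in for the stored dataset are all assumptions made for the example.

```python
from datetime import datetime

def timestamp_refresh(stored, incoming, watermark, ts_key="updated_at"):
    """Append only rows newer than the watermark (hypothetical helper).

    Returns the updated dataset and the new watermark. Existing rows are
    left untouched, so history is preserved.
    """
    fresh = [
        row for row in incoming
        if watermark is None or row[ts_key] > watermark
    ]
    new_stored = stored + fresh
    # The watermark only advances when new rows actually arrived.
    new_watermark = max((row[ts_key] for row in fresh), default=watermark)
    return new_stored, new_watermark

# First load: no watermark yet, so every row is ingested.
batch_1 = [
    {"id": 1, "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "updated_at": datetime(2024, 1, 2)},
]
stored, watermark = timestamp_refresh([], batch_1, None)

# Second load: the source re-sends old rows plus one new row.
batch_2 = batch_1 + [{"id": 3, "updated_at": datetime(2024, 1, 3)}]
stored, watermark = timestamp_refresh(stored, batch_2, watermark)
```

A full refresh would simply replace `stored` with the incoming batch; the watermark is what lets the incremental patterns skip rows that were already processed.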

Vadim Orlov 11/5/24

Data Transformation at Scale: Rule Templates & Cloning

Vadim Orlov, CTO of DataForge, tackles common data transformation challenges like repetitive coding and platform complexity in this video. He introduces DataForge Cloud’s rule templates and cloning features to streamline data management through a DRY (Don’t Repeat Yourself) approach.

Vadim walks through setting up data connections, creating reusable rule templates across datasets, and calculating metrics like sale prices and totals. He then demonstrates configuring an output table for reporting and, when the company adds a subsidiary, shows how the cloning feature replicates configurations for new platforms effortlessly.

This demonstration reveals how DataForge Cloud’s tools save time and centralize code management, enabling efficient, scalable, and reusable data engineering without constant rewrites.

Vadim Orlov 10/17/24

Mastering Schema Evolution & Type Safety with DataForge

Schema changes are a common cause of pipeline failures. DataForge addresses this by focusing on type safety and schema evolution.

Type safety ensures reliable transformations through compile-time validation, preventing unexpected errors. Schema evolution automates handling of changes like new columns, data type updates, and nested structures.

With DataForge’s configurable strategies, such as upcasting and cloning, pipelines adapt smoothly to schema changes, reducing manual effort and improving reliability.
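To make the idea of upcasting concrete, the following Python sketch merges an incoming schema into an existing one. It is an illustration under stated assumptions, not DataForge's actual strategy: the `WIDENING_ORDER` list, the type names, and the `evolve_schema` function are hypothetical.

```python
# Hypothetical widening order: a type may be upcast to any type to its right.
WIDENING_ORDER = ["int", "float", "string"]

def evolve_schema(current, incoming):
    """Return a schema that accepts both the current and the incoming data.

    New columns are added; a changed type is widened (upcast) and never
    narrowed. Types outside WIDENING_ORDER raise ValueError in this sketch.
    """
    evolved = dict(current)
    for column, new_type in incoming.items():
        old_type = evolved.get(column)
        if old_type is None:
            evolved[column] = new_type  # new column appears in the source
        elif WIDENING_ORDER.index(new_type) > WIDENING_ORDER.index(old_type):
            evolved[column] = new_type  # upcast, e.g. int -> float
    return evolved

schema = evolve_schema(
    {"id": "int", "price": "int"},
    {"price": "float", "note": "string"},
)
```

Here `price` is widened from `int` to `float` and `note` is added, while `id` is kept even though the incoming batch does not mention it.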

Vadim Orlov 9/24/24

Sub-Sources: Simplifying Complex Data Structures with DataForge

In DataForge Cloud 8.1, we introduced Sub-Sources, simplifying the handling of nested complex arrays (NCAs) like ARRAY<STRUCT<..>>. This feature allows you to use standard SQL syntax on NCAs without needing to normalize or modify the underlying data. Sub-Sources act as "virtual" tables, enabling easy transformations while preserving the original structure. This innovation saves time and effort for data engineers working with complex, semi-structured data.
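The "virtual table" idea can be pictured with a short Python sketch. It does not show how Sub-Sources are implemented or configured; the `sub_source` function and the sample `orders` data are assumptions used only to show a row-per-element view over an array of structs that leaves the source rows unchanged.

```python
def sub_source(rows, array_field, parent_key):
    """Yield one flat row per array element, tagged with the parent key.

    The input rows are only read, never modified, so the nested structure
    of the source is preserved (hypothetical helper for illustration).
    """
    for row in rows:
        for element in row.get(array_field) or []:
            yield {parent_key: row[parent_key], **element}

# Each order carries an ARRAY<STRUCT<sku, qty>>-style list of items.
orders = [
    {"order_id": 1, "items": [{"sku": "a", "qty": 2}, {"sku": "b", "qty": 1}]},
    {"order_id": 2, "items": []},
]

item_rows = list(sub_source(orders, "items", "order_id"))
total_qty = sum(row["qty"] for row in item_rows)
```

Once the elements are exposed as rows, ordinary row-level logic such as the sum above applies to them directly.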

Vadim Orlov 9/18/24

DataForge vs. Databricks Delta Live Tables for Change Data Capture

Check out our latest video where Vadim Orlov, CTO of DataForge, compares automating Change Data Capture (CDC) in DataForge Cloud versus Databricks Delta Live Tables. Discover how DataForge simplifies CDC processes, saving time and effort with automation, and watch a live demo showcasing its efficiency in real-world use cases.
