ETL - Extract Trino Load - A Case for Trino as a Batch Processing Engine

База данных в действии
Views: 78
Upload date: 08.12.2023 02:25
Duration: 01:06:26
Category: Entertainment

Description

Trino is a relatively new name in the open source space; the project was formerly known as PrestoSQL. Trino is best known for fast ad hoc and exploratory workloads on data lakes and heterogeneous data sources. When you want to give your data scientists the ability to query across your data landscape by joining live operational data with historical data, Trino is the state of the art. Trino and Presto were initially built to replace Hive workloads at Facebook and handled massive petabyte-scale batch workloads. Yet, across the board, Trino was not widely adopted as a batch ETL engine for these workloads. As it turns out, one of the design choices that drives Trino's speed was forgoing failure recovery measures to buy faster queries. In practice, many users want the system running the query to handle recovery from failures itself. The Trino community has rallied around supporting native, granular failure recovery to improve resiliency in the event of a failure. This brings Trino to a new frontier by supporting both exploratory queries and long-running workloads with failure recovery, so that engineers and analysts do not have to switch between systems to run their queries.
