nativeNDP: processing big data analytics on native storage nodes

  • Data analytics tasks on large datasets are computationally intensive and often demand the compute power of cluster environments. Yet data cleansing, preparation, dataset characterization, and the computation of statistics or metrics are frequent steps. They are mostly performed ad hoc, in an explorative manner, and mandate low response times. However, such steps are I/O-intensive and typically very slow due to low data locality and inadequate interfaces and abstractions along the stack. These shortcomings typically result in prohibitively expensive scans of the full dataset and transformations at interface boundaries. In this paper, we examine R as an analytical tool, managing large persistent datasets in Ceph, a widespread cluster file system. We propose nativeNDP, a framework for Near Data Processing that pushes down primitive R tasks and executes them in situ, directly within the storage device of a cluster node. Across a range of data sizes, we show that nativeNDP is more than an order of magnitude faster than other pushdown alternatives.

Metadata
Author of HS Reutlingen:Vinçon, Tobias; Riegger, Christian; Petrov, Ilia
DOI:https://doi.org/10.1007/978-3-030-28730-6_9
ISBN:978-3-030-28730-6
Published in:Advances in databases and information systems : 23rd European Conference, ADBIS 2019, Bled, Slovenia, September 8–11, 2019, proceedings. - (Lecture notes in computer science ; 11695)
Publisher:Springer
Place of publication:Cham
Editor:Tatjana Welzer
Document Type:Conference proceeding
Language:English
Publication year:2019
Tag:cluster; in-storage processing; native storage; near-data processing
Page Number:12
First Page:139
Last Page:150
PPN:View in the Hochschule Reutlingen catalogue
DDC classes:004 Computer science
Open access?:No
Licence (German):In Copyright - Protected by copyright