Read a Feather file (an Arrow IPC file) - read_feather • Arrow R Package

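Feather provides binary columnar serialization for data frames. It is designed to make reading and writing data frames efficient, and to make sharing data across data analysis languages easy. read_feather() can read both Feather Version 1 (V1) and Version 2 (V2). V1 is the legacy format available since 2016; V2 is the Apache Arrow IPC file format. read_ipc_file() is an alias of read_feather().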

Usage

read_feather(file, col_select = NULL, as_data_frame = TRUE, mmap = TRUE)

read_ipc_file(file, col_select = NULL, as_data_frame = TRUE, mmap = TRUE)

Arguments

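file: A character file name or URI, a connection, a raw vector, an Arrow input stream, or a FileSystem with path (SubTreeFileSystem). If a file name or URI is given, an Arrow InputStream is opened and then closed when reading finishes. If an input stream is provided, it is left open.
col_select: A character vector of column names to keep, as in the "select" argument to data.table::fread(), or a tidy selection specification of columns, as used in dplyr::select().
as_data_frame: Should the function return a tibble (the default) or an Arrow Table?
mmap: Logical: whether to memory-map the file (default TRUE).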
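
As a rough illustration of how these arguments combine, the sketch below writes a temporary file and reads it back twice: once selecting columns by name, and once requesting an Arrow Table without memory mapping. The variable names (by_name, tbl) are only for illustration, and the temporary file stands in for real data.

```r
library(arrow)

# Write a small example file to read back (temporary path, illustration only)
tf <- tempfile(fileext = ".arrow")
write_feather(mtcars, tf)

# col_select as a character vector of column names to keep
by_name <- read_feather(tf, col_select = c("mpg", "cyl"))
names(by_name)

# as_data_frame = FALSE returns an Arrow Table rather than a tibble;
# mmap = FALSE reads the file without memory-mapping it
tbl <- read_feather(tf, as_data_frame = FALSE, mmap = FALSE)
class(tbl)

unlink(tf)
```

Returning an Arrow Table can be useful when the data will be processed further with Arrow before being converted to an R data frame.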

Value

A tibble if as_data_frame is TRUE (the default), or an Arrow Table otherwise.

See also

FeatherReader and RecordBatchReader for lower-level access to reading Arrow IPC data.
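
For readers curious what the lower-level route might look like, here is a minimal sketch using FeatherReader directly. It assumes the reader is created from an Arrow file object (here ReadableFile) and that it exposes version, column_names, and a Read() method taking column names. Treat it as an outline to check against the FeatherReader documentation rather than a definitive recipe.

```r
library(arrow)

# Temporary file for illustration only
tf <- tempfile(fileext = ".arrow")
write_feather(mtcars, tf)

# Open the file as an Arrow random-access file and wrap it in a FeatherReader
f <- ReadableFile$create(tf)
reader <- FeatherReader$create(f)

# Inspect the file before reading any data
reader$version        # Feather format version of the file
reader$column_names   # column names stored in the file

# Read only the selected columns; the result is an Arrow Table
tbl <- reader$Read(c("mpg", "cyl"))

# The file object was opened manually, so it is closed manually too
f$close()
unlink(tf)
```

Unlike read_feather(), this route leaves opening and closing the file to the caller.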

Examples

# We recommend the ".arrow" extension for Arrow IPC files (Feather V2).
tf <- tempfile(fileext = ".arrow")
on.exit(unlink(tf))
write_feather(mtcars, tf)
df <- read_feather(tf)
dim(df)
#> [1] 32 11
# Can select columns
df <- read_feather(tf, col_select = starts_with("d"))