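Implementation Status
The following tables summarize the features available in the various official Arrow libraries. All libraries currently follow version 1.0.0 of the Arrow format, or a later minor version that is compatible with 1.0.0. For details about versioning, see Format Versioning and Stability. Unless otherwise stated, the Python, R, Ruby and C/GLib libraries follow the C++ Arrow library.
Data Types
Data type (primitive) |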
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
nanoarrow |
---|---|---|---|---|---|---|---|---|---|
Null |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Boolean |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Int8/16/32/64 |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
UInt8/16/32/64 |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Float16 |
✓ |
✓ (1) |
✓ |
✓ |
✓ (2) |
✓ |
✓ |
✓ |
|
Float32/64 |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Decimal32 |
✓ |
✓ |
✓ |
||||||
Decimal64 |
✓ |
✓ |
✓ |
||||||
Decimal128 |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Decimal256 |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Date32/64 |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Time32/64 |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Timestamp |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Duration |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Interval |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
||
Fixed Size Binary |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Binary |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Large Binary |
✓ |
✓ |
✓ |
✓ |
(4) |
✓ |
✓ |
✓ |
|
Utf8 |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Large Utf8 |
✓ |
✓ |
✓ |
✓ |
(4) |
✓ |
✓ |
✓ |
|
Binary View |
✓ |
✓ |
✓ |
✓ |
|||||
Large Binary View |
✓ |
✓ |
|||||||
Utf8 View |
✓ |
✓ |
✓ |
✓ |
|||||
Large Utf8 View |
✓ |
✓ |
Data type (nested) |
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
nanoarrow |
---|---|---|---|---|---|---|---|---|---|
Fixed Size List |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
List |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Large List |
✓ |
✓ |
✓ |
(4) |
✓ |
✓ |
✓ |
||
List View |
✓ |
✓ |
✓ |
||||||
Large List View |
✓ |
✓ |
|||||||
Struct |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Map |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Dense Union |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Sparse Union |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Data type (special) |
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
nanoarrow |
---|---|---|---|---|---|---|---|---|---|
Dictionary |
✓ |
✓ (3) |
✓ |
✓ |
✓ |
✓ (3) |
✓ |
✓ |
|
Extension |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
Run-End Encoded |
✓ |
✓ |
Canonical Extension types |
C++ |
Java |
Go |
JavaScript |
C# |
Rust |
Julia |
Swift |
---|---|---|---|---|---|---|---|---|
Fixed shape tensor |
✓ |
|||||||
Variable shape tensor |
||||||||
JSON |
✓ |
✓ |
||||||
Opaque |
✓ |
✓ |
✓ |
|||||
UUID |
✓ |
✓ |
||||||
8-bit Boolean |
✓ |
✓ |
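Notes
(1) Casting to and from Float16 is not supported in Java.
(2) Float16 support in C# is only available when targeting .NET 6+.
(3) Nested dictionaries are not supported.
(4) The C# large array types exist to help with interoperability with other libraries, but they do not support buffers larger than 2 GiB, and an exception is raised when trying to import an array that is too large.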
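Note (4) can be made concrete with a small sketch. Large Binary, Large Utf8 and Large List use 64-bit offsets, so the data they describe can exceed what a 32-bit signed length can address. The snippet below only illustrates that size check and is not the C# library's code: the helper name `fits_in_2gib` and the exact cap of `2**31 - 1` bytes are assumptions made for the example.

```python
# Hypothetical helper: decide whether a "large" (64-bit offset) array could be
# imported by a library whose buffers are capped at a 32-bit signed length.
MAX_BUFFER_BYTES = 2**31 - 1  # assumed cap, roughly 2 GiB


def fits_in_2gib(offsets):
    """Return True if the data buffer described by `offsets` is within the cap.

    `offsets` follows the Arrow variable-size layout: a non-decreasing list of
    length n + 1 whose last entry is the total number of data bytes.
    """
    if not offsets:
        return True  # treat a missing offsets buffer as an empty array
    return offsets[-1] <= MAX_BUFFER_BYTES


small = [0, 3, 3, 10]         # three values, 10 bytes of data in total
huge = [0, 2**31, 2**31 + 5]  # total data size is past the assumed cap

print(fits_in_2gib(small))  # True
print(fits_in_2gib(huge))   # False
```

A real importer would raise an exception instead of returning a flag; the boolean keeps the sketch easy to test.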
See also
The Arrow columnar format and canonical extension types specifications.
IPC Format
IPC Feature |
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
nanoarrow |
---|---|---|---|---|---|---|---|---|---|
Arrow stream format |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ (4) |
Arrow file format |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|
Record batches |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
Dictionaries |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
||
Replacement dictionaries |
✓ |
✓ |
✓ |
✓ |
|||||
Delta dictionaries |
✓ (1) |
✓ (1) |
✓ |
✓ |
✓ |
||||
Tensors |
✓ |
||||||||
Sparse tensors |
✓ |
||||||||
Buffer compression |
✓ |
✓ (3) |
✓ |
✓ |
✓ |
✓ |
|||
Endianness conversion |
✓ (2) |
✓ (2) |
✓ (2) |
||||||
Custom schema metadata |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
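Notes
(1) Delta dictionaries are not supported on nested dictionaries.
(2) Data with non-native endianness can be byte-swapped automatically when reading.
(3) The LZ4 codec is currently quite inefficient. ARROW-11901 tracks performance improvements.
(4) The nanoarrow IPC implementation only supports reading IPC streams.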
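Note (2) refers to reversing the byte order of each fixed-width value when a stream was written on a machine with the opposite endianness. The following is a minimal sketch of that idea using only the Python standard library. It is not the Arrow readers' implementation, and the function name `read_int32_buffer` is hypothetical.

```python
import sys
from array import array


def read_int32_buffer(raw, source_is_little_endian):
    """Decode a buffer of 32-bit signed integers, swapping bytes if needed."""
    values = array("i")
    if values.itemsize != 4:
        raise RuntimeError("this sketch assumes a 4-byte C int")
    values.frombytes(raw)  # interprets the bytes in native byte order
    native_is_little = sys.byteorder == "little"
    if source_is_little_endian != native_is_little:
        values.byteswap()  # reverse the bytes of every element in place
    return values.tolist()


# Two int32 values serialized in big-endian order.
big_endian_bytes = (1).to_bytes(4, "big") + (258).to_bytes(4, "big")
print(read_int32_buffer(big_endian_bytes, source_is_little_endian=False))  # [1, 258]
```

The swap is applied per value, not per buffer, which is why the element width has to be known before the bytes can be reinterpreted.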
Flight RPC
Flight RPC Transport |
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
---|---|---|---|---|---|---|---|---|
gRPC transport (grpc:, grpc+tcp:) |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
gRPC domain socket transport (grpc+unix:) |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
gRPC + TLS transport (grpc+tls:) |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
UCX transport (ucx:) (1) |
✓ |
Supported features in the gRPC transport:
Flight RPC Feature |
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
---|---|---|---|---|---|---|---|---|
All RPC methods |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
Authentication handlers |
✓ |
✓ |
✓ |
✓ (2) |
✓ |
|||
Call timeouts |
✓ |
✓ |
✓ |
✓ |
||||
Call cancellation |
✓ |
✓ |
✓ |
✓ |
||||
Concurrent client calls (3) |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
Custom middleware |
✓ |
✓ |
✓ |
✓ |
||||
RPC error codes |
✓ |
✓ |
✓ |
✓ |
✓ |
Supported features in the UCX transport:
Flight RPC Feature |
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
---|---|---|---|---|---|---|---|---|
All RPC methods |
✓ (4) |
|||||||
Authentication handlers |
||||||||
Call timeouts |
||||||||
Call cancellation |
||||||||
Concurrent client calls |
✓ (5) |
|||||||
Custom middleware |
||||||||
RPC error codes |
✓ |
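Notes
(1) The Flight UCX transport was deprecated in the 19.0.0 release.
(2) Supported through AspNetCore authentication handlers.
(3) Whether a single client can support multiple concurrent calls.
(4) Only DoExchange, DoGet, DoPut, and GetFlightInfo are supported.
(5) Each concurrent call is a separate connection to the server (unlike gRPC, where concurrent calls are multiplexed over a single connection). This generally gives better throughput but consumes more resources on both the server and the client.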
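Flight SQL
Note
Flight SQL is still experimental.
The feature support refers to the client/server libraries only; databases that implement the Flight SQL protocol will in turn support or not support individual features.
Feature |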
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
---|---|---|---|---|---|---|---|---|
BeginSavepoint |
✓ |
✓ |
||||||
BeginTransaction |
✓ |
✓ |
||||||
CancelQuery |
✓ |
✓ |
||||||
ClosePreparedStatement |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
CreatePreparedStatement |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
CreatePreparedSubstraitPlan |
✓ |
✓ |
||||||
EndSavepoint |
✓ |
✓ |
||||||
EndTransaction |
✓ |
✓ |
||||||
GetCatalogs |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetCrossReference |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetDbSchemas |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetExportedKeys |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetImportedKeys |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetPrimaryKeys |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetSqlInfo |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetTables |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetTableTypes |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
GetXdbcTypeInfo |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
PreparedStatementQuery |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
PreparedStatementUpdate |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
StatementSubstraitPlan |
✓ |
✓ |
||||||
StatementQuery |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
StatementUpdate |
✓ |
✓ |
✓ |
✓ |
✓ |
C Data Interface
Feature |
C++ |
Python |
R |
Rust |
Go |
Java |
C/GLib |
Ruby |
Julia |
C# |
Swift |
nanoarrow |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Schema export |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
||
Array export |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
||
Schema import |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
||
Array import |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
See also
The C Data Interface specification.
C Stream Interface
Feature |
C++ |
Python |
R |
Rust |
Go |
Java |
C/GLib |
Ruby |
Julia |
C# |
Swift |
nanoarrow |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Stream export |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
|||
Stream import |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
✓ |
See also
The C Stream Interface specification.
Third-Party Data Formats
Format |
C++ |
Java |
Go |
JS |
C# |
Rust |
Julia |
Swift |
---|---|---|---|---|---|---|---|---|
Avro |
R |
R |
||||||
CSV |
R/W |
R (2) |
R/W |
R/W |
R/W |
|||
ORC |
R/W |
R (1) |
||||||
Parquet |
R/W |
R (2) |
R/W |
R/W |
Notes
R = Read supported
W = Write supported
(1) Through JNI bindings. (Provided by org.apache.arrow.orc:arrow-orc)
(2) Through JNI bindings to Arrow C++ Datasets. (Provided by org.apache.arrow:arrow-dataset)