Commit | Line | Data |
---|---|---|
1d09f67e TL |
1 | .. Licensed to the Apache Software Foundation (ASF) under one |
2 | .. or more contributor license agreements. See the NOTICE file | |
3 | .. distributed with this work for additional information | |
4 | .. regarding copyright ownership. The ASF licenses this file | |
5 | .. to you under the Apache License, Version 2.0 (the | |
6 | .. "License"); you may not use this file except in compliance | |
7 | .. with the License. You may obtain a copy of the License at | |
8 | ||
9 | .. http://www.apache.org/licenses/LICENSE-2.0 | |
10 | ||
11 | .. Unless required by applicable law or agreed to in writing, | |
12 | .. software distributed under the License is distributed on an | |
13 | .. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | |
14 | .. KIND, either express or implied. See the License for the | |
15 | .. specific language governing permissions and limitations | |
16 | .. under the License. | |
17 | ||
18 | ================ | |
19 | VectorSchemaRoot | |
20 | ================ | |
21 | A :class:`VectorSchemaRoot` is a container that holds batches of data; batches flow through a :class:`VectorSchemaRoot` | |
22 | as part of a pipeline. Note that this differs from other implementations (e.g., in C++ and Python, | |
23 | a :class:`RecordBatch` is a collection of equal-length vector instances and is created anew for each batch). | |
24 | ||
25 | The recommended usage for :class:`VectorSchemaRoot` is to create a single :class:`VectorSchemaRoot` | |
26 | based on the known schema and populate data into that same instance over and over in a stream | |
27 | of batches, rather than creating a new :class:`VectorSchemaRoot` instance each time | |
28 | (see `Flight <https://github.com/apache/arrow/tree/master/java/flight/src/main/java/org/apache/arrow/flight>`_ or | |
29 | ``ArrowFileWriter`` for a better understanding). Thus, at any one point a VectorSchemaRoot may have data or | |
30 | may have no data (say, it was transferred downstream or has not yet been populated). | |
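
The reuse pattern described above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the class name ``ReuseRootSketch``, the method ``fillBatches``, and the single ``id`` column are hypothetical, and the sketch assumes the Arrow Java vector module and a memory implementation are on the classpath. One root is created from the schema, then allocated, filled, handed off, and cleared once per batch.

```java
import java.util.Collections;

import org.apache.arrow.memory.BufferAllocator;
import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.IntVector;
import org.apache.arrow.vector.VectorSchemaRoot;
import org.apache.arrow.vector.types.pojo.ArrowType;
import org.apache.arrow.vector.types.pojo.Field;
import org.apache.arrow.vector.types.pojo.Schema;

public class ReuseRootSketch {

    // Fills `batches` batches of `rowsPerBatch` rows into ONE root and
    // returns the total number of rows that passed through it.
    static long fillBatches(int batches, int rowsPerBatch) {
        // Hypothetical single-column schema used only for this sketch.
        Schema schema = new Schema(Collections.singletonList(
            Field.nullable("id", new ArrowType.Int(32, true))));

        long totalRows = 0;
        try (BufferAllocator allocator = new RootAllocator(Long.MAX_VALUE);
             VectorSchemaRoot root = VectorSchemaRoot.create(schema, allocator)) {
            IntVector id = (IntVector) root.getVector("id");

            for (int b = 0; b < batches; b++) {
                // Allocate fresh buffers for this batch in the same root.
                root.allocateNew();
                for (int i = 0; i < rowsPerBatch; i++) {
                    id.setSafe(i, b * rowsPerBatch + i);
                }
                root.setRowCount(rowsPerBatch);

                // A real pipeline would hand the batch downstream here,
                // for example through a writer; this sketch only counts rows.
                totalRows += root.getRowCount();

                // Release the batch; the root now holds no data until refilled.
                root.clear();
            }
        }
        return totalRows;
    }
}
```

Between ``clear()`` and the next ``allocateNew()`` the root holds no data, which is the "may have no data" state mentioned above.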
31 | ||
32 | ||
33 | Here is an example of building a :class:`VectorSchemaRoot`: | |
34 | ||
35 | .. code-block:: Java | |
36 | ||
37 | BitVector bitVector = new BitVector("boolean", allocator); | |
38 | VarCharVector varCharVector = new VarCharVector("varchar", allocator); | |
39 | bitVector.allocateNew(); | |
40 | varCharVector.allocateNew(); | |
41 | for (int i = 0; i < 10; i++) { | |
42 | bitVector.setSafe(i, i % 2 == 0 ? 0 : 1); | |
43 | varCharVector.setSafe(i, ("test" + i).getBytes(StandardCharsets.UTF_8)); | |
44 | } | |
45 | bitVector.setValueCount(10); | |
46 | varCharVector.setValueCount(10); | |
47 | ||
48 | List<Field> fields = Arrays.asList(bitVector.getField(), varCharVector.getField()); | |
49 | List<FieldVector> vectors = Arrays.asList(bitVector, varCharVector); | |
50 | VectorSchemaRoot vectorSchemaRoot = new VectorSchemaRoot(fields, vectors); | |
51 | ||
52 | The vectors within a :class:`VectorSchemaRoot` can be loaded and unloaded via :class:`VectorLoader` and :class:`VectorUnloader`. | |
53 | :class:`VectorLoader` and :class:`VectorUnloader` handle converting between :class:`VectorSchemaRoot` and :class:`ArrowRecordBatch` | |
54 | (a representation of a RecordBatch :doc:`IPC <../format/IPC.rst>` message). For example: | |
55 | ||
56 | .. code-block:: Java | |
57 | ||
58 | // create a VectorSchemaRoot root1 and convert its data into recordBatch | |
59 | VectorSchemaRoot root1 = new VectorSchemaRoot(fields, vectors); | |
60 | VectorUnloader unloader = new VectorUnloader(root1); | |
61 | ArrowRecordBatch recordBatch = unloader.getRecordBatch(); | |
62 | ||
63 | // create a VectorSchemaRoot root2 and load the recordBatch | |
64 | VectorSchemaRoot root2 = VectorSchemaRoot.create(root1.getSchema(), allocator); | |
65 | VectorLoader loader = new VectorLoader(root2); | |
66 | loader.load(recordBatch); | |
67 | ||
68 | A new :class:`VectorSchemaRoot` can be sliced from an existing instance with zero-copy: | |
69 | ||
70 | .. code-block:: Java | |
71 | ||
72 | // 0 is the start index (inclusive) and 5 is the length, i.e. the number of rows in the slice. | |
73 | VectorSchemaRoot newRoot = vectorSchemaRoot.slice(0, 5); | |
74 |