ReUseX  0.0.1
3D Point Cloud Processing for Building Reuse
IDataset.hpp
// SPDX-FileCopyrightText: 2025 Povl Filip Sonne-Frederiksen
//
// SPDX-License-Identifier: GPL-3.0-or-later
#pragma once

#include <cstddef>
#include <filesystem>
#include <memory>
#include <opencv2/core/mat.hpp>
#include <span>
#include <utility>
#include <vector>

// Forward declaration
namespace ReUseX::io {
class RTABMapDatabase;
}

namespace ReUseX::vision {
/* Interface for datasets. A dataset is a collection of samples, each
 * consisting of an image and a label, stored as rows of a table in a SQLite
 * database. The table has the following columns:
 *   - id: an integer primary key that uniquely identifies the sample
 *   - image: a blob that contains the image data
 *   - label: an integer that represents the label of the sample
 *
 * Derived classes implement get to retrieve a sample by its index and save to
 * write a batch of samples to the database. The protected getImage and
 * saveImage helpers are available to those implementations for image I/O.
 * Samples are represented by the IData interface, which provides access to
 * the image and label of a sample and can save the sample to the database.
 * The dataset is intended for machine learning applications that train and
 * evaluate models on a collection of labeled images. */
class IDataset {
 public:
  /* A data sample paired with its index. The sample is held by a unique
   * pointer to an IData object; the index is a size_t giving the position of
   * the sample in the dataset. get returns a Pair, and save takes a span of
   * Pairs so that a batch of samples can be written in one call. */
  using Pair = std::pair<std::unique_ptr<IData>, size_t>;

  /* Constructs a new IDataset object with a shared database instance.
   *
   * This constructor allows multiple IDataset instances to share the same
   * database connection. The database is managed by shared_ptr, so it will
   * remain open as long as any IDataset instance references it.
   *
   * @param database Shared pointer to RTABMapDatabase instance
   */
  explicit IDataset(std::shared_ptr<io::RTABMapDatabase> database);

  /* Constructs a new IDataset object by opening a database at the given path.
   *
   * This convenience constructor creates a new RTABMapDatabase instance
   * internally and stores it as a shared_ptr. The database connection is
   * managed by the IDataset and will be closed when the last reference is
   * destroyed.
   *
   * @param dbPath The path to the RTABMap database file.
   */
  explicit IDataset(std::filesystem::path dbPath);

  /* Virtual destructor to ensure proper cleanup of derived classes. */
  virtual ~IDataset() = default;

  /* Returns the number of samples in the dataset, which equals the size of
   * the ids_ vector. Callers use it to determine how many samples are
   * available and to bound iteration over sample indices.
   * @return The number of samples in the dataset.
   */
  size_t size() const;

  /* Retrieves a sample by its index. The index identifies the sample whose ID
   * is stored at that position in the ids_ vector. The image and label for
   * that sample are read from the database and returned as an IData object
   * together with the index. Pure virtual: derived classes must implement it.
   * @param index The index of the sample to retrieve.
   * @return A Pair containing a unique pointer to an IData object that
   * represents the sample, and the index of the sample in the dataset.
   */
  virtual Pair get(const std::size_t index) const = 0;

  /* Saves a batch of samples to the database. For each Pair in the span, the
   * image and label of the IData object are saved to the database under the
   * index stored in the Pair. Pure virtual: derived classes must implement
   * it.
   * @param data A span of Pairs, where each Pair contains a unique pointer to
   * an IData object that represents a sample, and the index of the sample in
   * the dataset.
   * @return true if all samples were saved successfully, and false otherwise.
   */
  virtual bool save(const std::span<Pair> &data) = 0;

 protected:
  /* Retrieves the image data for a sample from the database. The index is
   * used to look up the corresponding sample ID in the ids_ vector, and the
   * image stored for that ID is returned as a cv::Mat object. Intended for
   * use by get implementations when constructing the IData object that
   * represents a sample.
   * @param index The index of the sample whose image data to retrieve.
   * @return A cv::Mat object containing the image data for the sample.
   */
  cv::Mat getImage(const std::size_t index) const;

  /* Saves the image data for a sample to the database. Intended for use by
   * save implementations when writing a batch of samples to the database.
   * @param index The index of the sample whose image data to save.
   * @param image A cv::Mat object containing the image data to save for the
   * sample.
   * @return true if the image was saved successfully, and false otherwise.
   */
  bool saveImage(const std::size_t index, const cv::Mat &image);

  /* Access to the underlying database for subclasses.
   *
   * Subclasses can use this to access database functionality beyond the
   * basic getImage/saveImage interface if needed.
   *
   * @return Shared pointer to the RTABMapDatabase instance
   */
  std::shared_ptr<io::RTABMapDatabase> getDatabase() const;

 private:
  /* Shared pointer to the RTABMap database. Multiple IDataset instances can
   * share the same database connection. The database connection is managed
   * via RAII and will be closed when the last reference is destroyed.
   */
  std::shared_ptr<io::RTABMapDatabase> db_;

  /* Cached list of node IDs in the dataset. This is populated once during
   * construction by querying the database. The IDs are used to map from
   * dataset indices (0, 1, 2, ...) to RTABMap node IDs.
   */
  std::vector<int> ids_;
};
} // namespace ReUseX::vision
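
The comments on db_ and ids_ describe two ideas: several dataset objects sharing one database through a shared_ptr, and a list of node IDs cached at construction that translates dataset indices (0, 1, 2, ...) into database IDs. The sketch below illustrates both in isolation. FakeDatabase and IndexedView are hypothetical stand-ins invented for this example; they do not reflect the real RTABMapDatabase API or the actual IDataset implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for io::RTABMapDatabase: maps node IDs to a payload.
// Nothing here mirrors the real class; it only exists to make the sketch run.
class FakeDatabase {
public:
  void put(int nodeId, std::string payload) {
    rows_[nodeId] = std::move(payload);
  }

  // Returns all node IDs in ascending order (std::map iterates by key).
  std::vector<int> nodeIds() const {
    std::vector<int> ids;
    for (const auto &row : rows_) {
      ids.push_back(row.first);
    }
    return ids;
  }

  // Throws std::out_of_range if the node ID is unknown.
  const std::string &at(int nodeId) const { return rows_.at(nodeId); }

private:
  std::map<int, std::string> rows_;
};

// Hypothetical class that mirrors the db_ / ids_ layout described above.
// Assumes the caller passes a non-null database pointer.
class IndexedView {
public:
  explicit IndexedView(std::shared_ptr<FakeDatabase> database)
      : db_(std::move(database)), ids_(db_->nodeIds()) {}

  std::size_t size() const { return ids_.size(); }

  // Translates a dataset index into a node ID, then fetches that row.
  // vector::at throws std::out_of_range for an index past the end.
  const std::string &get(std::size_t index) const {
    return db_->at(ids_.at(index));
  }

private:
  std::shared_ptr<FakeDatabase> db_; // shared with other views
  std::vector<int> ids_;             // cached once, at construction
};
```

Because the ID list is cached once, rows added to the database after a view is constructed are not visible through that view's indices. Whether the real class behaves the same way depends on its implementation, which this header does not show.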