Ingesting Data into Shaper
You can ingest data into Shaper’s database through the HTTP API or via NATS. Since DuckDB is not optimized for write operations, we first store incoming data in a NATS JetStream stream and then write it to DuckDB in batches using DuckDB’s Appender API. We create tables and add columns automatically based on the ingested data.
- To ingest data, you first need to create an API Key with “Ingest Data” permission in the “Admin” settings of the Shaper UI.
- Then you can write JSON data to the HTTP API or NATS directly:
Endpoint:

    POST http://localhost:5454/api/data/:tablename

Authentication is done through a Bearer token in the Authorization header.
You can pass a single JSON object or an array of objects.
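The same request can also be sent from application code. The following is a minimal sketch in Python using only the standard library. The helper names `build_ingest_request` and `ingest` are hypothetical and not part of Shaper. The base URL and table name are the ones used in this guide, and the API key is a placeholder.

```python
import json
import urllib.request


def build_ingest_request(base_url, table, api_key, rows):
    """Build a POST request for the /api/data/:tablename endpoint.

    `rows` may be a single dict (one JSON object) or a list of dicts
    (a JSON array of objects).
    """
    body = json.dumps(rows).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/data/{table}",
        data=body,
        method="POST",
        headers={
            # The API key is sent as a Bearer token.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def ingest(base_url, table, api_key, rows):
    """Send the rows to Shaper and return the HTTP status code."""
    req = build_ingest_request(base_url, table, api_key, rows)
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status
```

With a local Shaper instance running, calling `ingest("http://localhost:5454", "my_table", "<your-api-key>", [{"col1": "value1", "col2": 123}])` should send the same request as the array variant of the curl command.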
Example with curl:

    curl -X POST http://localhost:5454/api/data/my_table \
      -H "Authorization: Bearer <your-api-key>" \
      -H "Content-Type: application/json" \
      -d '{"col1": "value1", "col2": 124}'

    curl -X POST http://localhost:5454/api/data/my_table \
      -H "Authorization: Bearer <your-api-key>" \
      -H "Content-Type: application/json" \
      -d '[{"col1": "value1", "col2": 123}, {"col1": "value2", "col2": 456}]'

Shaper uses NATS internally, but by default NATS is not directly accessible.
You can make NATS reachable by specifying nats-port. To make sure NATS is secured, you also need to specify an admin nats-token. For example:

    docker run --rm -it -p5454:5454 -p4222:4222 taleshape/shaper --nats-port 4222 --nats-token mytoken

You can skip nats-token for development, but in production you want to make sure it is set.

Now you can ingest data like this:

    nats pub --user '<your-api-key>' shaper.ingest.my_table '{"col1": "value1", "col2": 124}'

You can also use any other NATS client. You can also submit data using the JetStream API to get ACKs.
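As an illustration of publishing through JetStream from another client, here is a rough sketch using the third-party `nats-py` package. It rests on assumptions you should verify against your setup: that NATS is exposed on `nats://localhost:4222` as in the docker command above, and that the API key is accepted as a NATS token (which mirrors how the `nats` CLI treats `--user` without a password). The helper names are hypothetical.

```python
import asyncio
import json


def ingest_subject(table):
    """Subject used for ingestion, as in the nats pub example: shaper.ingest.<table>."""
    return f"shaper.ingest.{table}"


def encode_row(row):
    """Serialize one row (a dict) to a JSON payload."""
    return json.dumps(row).encode("utf-8")


async def publish_with_ack(server, api_key, table, row, msg_id=None):
    """Publish one row via JetStream and return (stream, sequence) from the ACK."""
    # Imported lazily so the helpers above work without the client installed.
    import nats  # third-party: pip install nats-py

    # Assumption: the API key is passed as a NATS token.
    nc = await nats.connect(server, token=api_key)
    try:
        js = nc.jetstream()
        # Optional Nats-Msg-Id header; see the note on the _id column in this guide.
        headers = {"Nats-Msg-Id": msg_id} if msg_id else None
        ack = await js.publish(ingest_subject(table), encode_row(row), headers=headers)
        return ack.stream, ack.seq
    finally:
        await nc.close()


# Example call (requires a running Shaper instance with NATS exposed):
# asyncio.run(publish_with_ack(
#     "nats://localhost:4222", "<your-api-key>", "my_table",
#     {"col1": "value1", "col2": 124},
# ))
```

Unlike a plain publish, the JetStream publish waits for the server's acknowledgement, so a failed or timed-out write surfaces as an exception in the caller.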
- If you click on “New” in the sidebar now, you can run the following query:
    DESC my_table;
    SELECT * FROM my_table;

    column_name  column_type  null  key  default  extra
    _id          VARCHAR      YES
    _ts          TIMESTAMP    YES
    col1         VARCHAR      YES
    col2         DOUBLE       YES

You can see that Shaper auto-creates columns with appropriate data types, and we always add _id and _ts columns. You can override the default values by passing data in the JSON object for them. If you use NATS directly and set the Nats-Msg-Id header, it will be used as the _id column value (if not set in the data itself).

Shaper detects booleans and numbers in JSON. We also detect date and timestamp strings in various formats. If any value is a complex data type such as an array or object, we store it as a JSON column in DuckDB.
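To make the override behavior concrete, the sketch below builds a row that sets _id and _ts explicitly and includes an array value. The helper `make_row` is hypothetical, and the ISO 8601 timestamp string is an assumption about one of the formats Shaper detects. Check the result with DESC after ingesting.

```python
import json
import uuid
from datetime import datetime, timezone


def make_row(data, row_id=None, ts=None):
    """Return a copy of `data` with explicit _id and _ts values.

    Values already present in `data` are kept as-is.
    """
    row = dict(data)
    row.setdefault("_id", row_id or str(uuid.uuid4()))
    ts = ts or datetime.now(timezone.utc)
    # Assumption: an ISO 8601 string is among the timestamp formats Shaper detects.
    row.setdefault("_ts", ts.isoformat())
    return row


row = make_row(
    # "tags" is an array, so it would be stored as a JSON column.
    {"col1": "value1", "col2": 124, "tags": ["a", "b"]},
    row_id="order-42",
    ts=datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc),
)
print(json.dumps(row))
```

The printed JSON object can be used as the request body for the HTTP endpoint or as the message payload for NATS.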
Automate Data Loading
Shaper supports tasks to automatically run SQL scripts in the background, similar to cron jobs.
Tasks are especially helpful to routinely load and clean up data.
Learn more in the Tasks Documentation.