Series Data Artifacts
Series Data Artifacts are containers for time-series data, usually produced by simulations run within Sedaro Workflows. Like File Artifacts, they are version-controlled. However, Series Data Artifacts follow a more specialized format and are produced and handled in different ways.
Structure
A Series Data Artifact is composed of one or more Streams. A Stream is a time-series dataset composed of one or more "frames". Each frame contains a time value and one or more additional key-value pairs holding data associated with that time value. Time values are in Modified Julian Date (MJD) format.
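Because frame times are expressed in MJD, it can help to convert to and from ordinary timestamps. The following is a minimal sketch using only the Python standard library. The helper names are illustrative and not part of any Sedaro package, and the sketch assumes UTC with no leap-second handling; check which time scale your data uses.

```python
from datetime import datetime, timedelta, timezone

# MJD 0 corresponds to 1858-11-17 00:00:00 (UTC is assumed here).
MJD_EPOCH = datetime(1858, 11, 17, tzinfo=timezone.utc)


def datetime_to_mjd(moment: datetime) -> float:
    """Convert a timezone-aware datetime to a Modified Julian Date."""
    return (moment - MJD_EPOCH) / timedelta(days=1)


def mjd_to_datetime(mjd: float) -> datetime:
    """Convert a Modified Julian Date back to a UTC datetime."""
    return MJD_EPOCH + timedelta(days=mjd)
```

For example, `datetime_to_mjd(datetime(2000, 1, 1, tzinfo=timezone.utc))` gives `51544.0`.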
Each Stream within a Series Data Artifact must have a unique name. A Stream's name may contain up to 64 characters. Each character must be an uppercase or lowercase English letter, a digit 0-9, or one of the following: -, _, or . (hyphen, underscore, or period).
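The naming rule above can be checked locally before uploading. This is a minimal sketch, not Sedaro's own validation logic; the function name is hypothetical, and the minimum length of one character is an assumption, since only the upper limit is stated.

```python
import re

# Letters, digits, hyphen, underscore, and period; 1 to 64 characters.
# The lower bound of 1 is an assumption.
STREAM_NAME_PATTERN = re.compile(r"[A-Za-z0-9._-]{1,64}")


def is_valid_stream_name(name: str) -> bool:
    """Return True if `name` satisfies the documented Stream naming rule."""
    return STREAM_NAME_PATTERN.fullmatch(name) is not None
```

Uniqueness within the Series Data Artifact still has to be checked separately against the existing Stream names.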
Each Stream has an associated type signature. This is a string that defines the structure and contents of the Stream's data as SedaroTS types and structures. Data for a Stream must already be serialized into SedaroTS binary format when it is ingested. Fetched series data is transmitted in the same format. See the "Data Format" section below for details on how to work with this format.
Different Versions of the same Series Data Artifact should not be assumed to contain the same streams, the same time-series for a given stream, or the same data keys within a given stream's data.
Data Format
Each frame's data must be uploaded as a SedaroTS binary. From structured Python data, these binaries can be generated as follows:
### Your data in Python format, and your SedaroTS type signature for this data
### The values here are one example of valid values
my_type_signature = "(time: float, altitude: {float | km}, speed: {float | km/s})"
# Frames will be passed into the ingress client in the form of lists,
# whose fields must be in the same order as specified in the type signature.
# You may find it convenient to work with frames in the form of dictionaries with named keys,
# but if so you'll need to convert them to lists before providing them to the ingress client.
my_frame_data = [60322.9, 99.5, 65.0]
# Ordering of values is as specified in the type signature. For instance, the above list corresponds to:
# my_frame_data = {
# "time": 60322.9,
# "altitude": 99.5,
# "speed": 65.0,
# }
### Convert your data to SedaroTS binary format
from simvm.sv import Type
my_type = Type(my_type_signature)
my_frame = my_type.ser(my_frame_data)
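If you keep frames as dictionaries, as the comments above suggest, they need to be flattened into lists in type-signature order before serialization. Here is a minimal sketch of that conversion. The helper and the `FIELD_ORDER` tuple are hypothetical; the field order is written out by hand to match the example type signature rather than parsed from it.

```python
from typing import Any, List, Mapping, Sequence

# Hypothetical: field names listed by hand in the same order as the type signature.
FIELD_ORDER = ("time", "altitude", "speed")


def frame_dict_to_list(
    frame: Mapping[str, Any], field_order: Sequence[str] = FIELD_ORDER
) -> List[Any]:
    """Return the frame's values as a list ordered like `field_order`."""
    missing = [name for name in field_order if name not in frame]
    extra = [name for name in frame if name not in field_order]
    if missing or extra:
        raise ValueError(f"frame keys mismatch: missing={missing}, extra={extra}")
    return [frame[name] for name in field_order]
```

The resulting list can then be passed to `my_type.ser(...)` as in the example above.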
Uploading Series Data
Sedaro provides a Simulation Client, in Python, which can be used to process and upload Series Data for a Stream. Additionally, an API endpoint is exposed which enables ingress of Series Data to Sedaro from external sources. Using environment variables, the Simulation Client can be configured to call this endpoint, allowing Sedaro users to upload Series Data from their own machine.
Set the following environment variables:
- API_INGRESS_ENABLED=1
- PUBLIC_API_HOST: base URL of your Sedaro deployment (for example: https://test.sedaro.us)
- SEDARO_ACCESS_KEY: an access token for your Sedaro account. Ensure your account has permissions on the relevant Series Data Artifact and Version.
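Before creating the client, it can be useful to confirm that these variables are actually set in the current process. The following is a small sketch of such a check. The helper name is hypothetical, and it only tests for presence; it does not validate the host or the access key.

```python
import os
from typing import List, Mapping

REQUIRED_INGRESS_VARS = ("API_INGRESS_ENABLED", "PUBLIC_API_HOST", "SEDARO_ACCESS_KEY")


def missing_ingress_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_INGRESS_VARS if not env.get(name)]
```

Calling `missing_ingress_vars()` with no arguments inspects the real environment; an empty list means all three variables are present.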
Then, use the client as follows. Important notes:
- A time step value is required, alongside the time value and the data, when ingesting Series Data frames. The time step value, unlike the time, is in seconds. It indicates the time elapsed between this frame and the next one. In most contexts, it should equal the interval, expressed in seconds, between this frame's time and the time of the frame that follows.
- Series Data is uploaded with reference to its Version ID, not its Artifact ID.
### Your series data
frame_times = [x1, x2, ..., xN]
frame_timesteps = [y1, y2, ..., yN]
frame_data_binaries = [z1, z2, ..., zN]
stream_id = "my_stream"
stream_type_signature = "(time: float, altitude: {float | km}, speed: {float | km/s})"
series_data_artifact_version_id = <SedaroID>
### Set up client and use it to ingest and upload your series data for your stream
from sedaro_data_client import SimulationClient
# Instantiate SimulationClient
stream_upload_client = SimulationClient(
    series_version_id=series_data_artifact_version_id,
    stream_id=stream_id,
    svv2_type=stream_type_signature,
)
# Once created, use `init()` to provide the SimulationClient with your simulation's start and end times
# Be sure to use the start and end times defined for the Series Data Artifact as a whole
await stream_upload_client.init(
    start=<your start time in MJD>,
    end=<your end time in MJD>,
)
# Now, ingest one frame at a time, in order
# The client will periodically upload data to the remote Sedaro server as it is ingested.
for frame_time, frame_time_step, frame_data_bytes in zip(frame_times, frame_timesteps, frame_data_binaries):
    await stream_upload_client.enqueue(frame_time, frame_time_step, frame_data_bytes)
# Finally, terminate the client. This will upload any series data not yet uploaded.
await stream_upload_client.term()
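Since frame times are in MJD (days) while time steps are in seconds, the `frame_timesteps` list can be derived from `frame_times`. The sketch below shows one way to do this under stated assumptions: the helper name is hypothetical, and because the last frame has no successor, its time step must be supplied by the caller.

```python
from typing import List, Sequence

SECONDS_PER_DAY = 86_400.0


def time_steps_from_mjd(
    frame_times: Sequence[float], final_step_seconds: float
) -> List[float]:
    """Return one time step (seconds) per frame, given frame times in MJD."""
    if not frame_times:
        return []
    # Interval between each frame and the next, converted from days to seconds.
    steps = [
        (later - earlier) * SECONDS_PER_DAY
        for earlier, later in zip(frame_times, frame_times[1:])
    ]
    # The final frame has no following frame, so its step is caller-provided.
    steps.append(final_step_seconds)
    return steps
```

Floating-point rounding can make computed steps differ slightly from nominal values, so fixed-rate data may be better served by a constant time step.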
Downloading and Processing Series Data
Sedaro exposes an API endpoint that downloads the Series Data for a given Stream in a specified Version of a Series Data Artifact. If no Version is specified, the most recent Version of the Artifact is used.
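As an illustration, the request URL can be assembled with an optional Version parameter. This sketch assumes the URL pattern used in the request example in this section (`/api/artifacts/v1/series/<artifact>/streams/<stream>/?version=<version>`); the helper name is hypothetical, and the pattern should be confirmed against your deployment's API reference.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def build_stream_url(
    api_host: str,
    artifact_id: str,
    stream_name: str,
    version_id: Optional[str] = None,
) -> str:
    """Build a download URL for one Stream; omit `version_id` for the latest Version."""
    url = (
        f"{api_host.rstrip('/')}/api/artifacts/v1/series/"
        f"{quote(artifact_id, safe='')}/streams/{quote(stream_name, safe='')}/"
    )
    if version_id is not None:
        url += "?" + urlencode({"version": version_id})
    return url
```

Leaving `version_id` as `None` produces a URL with no query string, which corresponds to requesting the most recent Version.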
If you need a list of the available Streams for a given Artifact, you can use the Series Data Artifact metadata endpoint.
When downloading Series Data for a Stream, the response will be sent using HTTP Streaming. This enables a Sedaro user to inspect data early in the stream even before the data for the entire stream has finished downloading.
The downloaded response will contain both the Stream's SedaroTS type signature and the Stream's Series Data. Sedaro provides a utility, in our sedaro_data_client package, which can be used to extract these individual components:
import requests
from sedaro_data_client import parse_single_stream_response_sync
request_url = "<your-public-api-host>/api/artifacts/v1/series/id1234/streams/my_data/?version=1234old"
response = requests.get(request_url, stream=True)
assert response.status_code == 200
type_signature, series_data_bytes = parse_single_stream_response_sync(response)
Then, you can use SedaroTS to deserialize the data to a readable format and access values within the data.
from simvm.sv import Type
my_type = Type(type_signature)
stream_frames = my_type.de_array(series_data_bytes)
### Get the value of `altitude` in the first frame
print(f"Initial `altitude (km)`: {my_type.td(stream_frames[0])['altitude'].py()}")
### Get that value in meters, using SedaroTS unit conversion
print(f"Initial `altitude (m)`: {my_type.td(stream_frames[0])['altitude'].convert(Type('m')).py()}")
### Series data can be transposed from frame-major to key-major format
# Square brackets around a type signature indicate a list of items of that type, such as a list of frames
frame_list_type = f"[{type_signature}]"
stream_frames_transposed = Type(frame_list_type).td(stream_frames).transpose(deep=True)
### Using the transposed data, get the `speed` values for the first five frames
print(f"Values of `speed` for first 5 frames: {stream_frames_transposed['speed'].py()[:5]}")
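To make the frame-major versus key-major distinction concrete, here is a plain-Python illustration of the same idea using ordinary dictionaries and lists. It does not use SedaroTS and is not how `transpose` is implemented; the function name and sample values are made up for the example.

```python
from typing import Any, Dict, List, Sequence


def transpose_frames(frames: Sequence[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Turn a list of per-frame dicts into a dict of per-key value lists."""
    if not frames:
        return {}
    keys = list(frames[0])
    return {key: [frame[key] for frame in frames] for key in keys}


# Frame-major: one dictionary per frame.
frames = [
    {"time": 60322.9, "altitude": 99.5, "speed": 65.0},
    {"time": 60323.0, "altitude": 99.7, "speed": 64.8},
]

# Key-major: one list of values per key.
by_key = transpose_frames(frames)
print(by_key["speed"])
```

Key-major data is convenient for plotting or computing statistics over a single quantity across all frames.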