pict-rs/releases/0.4.0.md
asonix fa74e3c82d
Flesh out remaing feature descriptions, fixes, changes, add upgrade notes
2023-07-08 21:34:21 -05:00


pict-rs 0.4.0

I am excited to announce a new stable version of pict-rs, version 0.4.0. There are a few big things in 0.4 that might be exciting to developers and admins. Below you'll find upgrade notes for administrators, as well as a list of changes from 0.3 to 0.4.

Overview

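Headline Features:

  • Reworked Configuration
  • Reworked Commandline
  • Object Storage
  • Backgrounded Uploads
  • Full Video

Changes

  • Default Video Codec
  • Published Dependencies

Other Features:

  • HEAD Endpoints
  • Healthcheck Endpoints
  • Image Preprocessing
  • GIF configuration
  • JXL and AVIF
  • H265, VP8, VP9, and AV1
  • Audio Codecs
  • 404 Image
  • DB Export
  • Identifier Endpoint
  • Variant Cleanup Endpoint
  • Client Pool Configuration

Fixes

  • GIF/webm Transparency
  • Image Orientation
  • io_uring File Write
  • Blur Sigma

Removals

  • Purge by disk name
  • Aliases by disk name
  • Filename Endpoint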

Upgrade Notes

If you're running the provided docker container with default configuration values, the 0.4 container can be pulled and launched with no changes. There is a migration from the 0.3 database format to the 0.4 database format that will occur automatically on first launch. After the migration is complete, the server will start.

If you have any custom configuration, or are running outside of docker, see Reworked Configuration below. It is likely that you will need to change configuration values for the migration to run properly.

More information can be found in the pict-rs readme.

Descriptions

Reworked Configuration

Starting off with the most important information for server admins: The configuration format has changed. The pict-rs.toml is now far better organized, and includes more configuration options than before. Every field has been moved, so please take a look at the example pict-rs.toml file for information on how configuration works in 0.4.

Notable changes:

  • RUST_LOG no longer has an effect; see PICTRS__TRACING__LOGGING__TARGETS and PICTRS__TRACING__OPENTELEMETRY__TARGETS for configuring log levels.
  • PICTRS__ADDRESS has become PICTRS__SERVER__ADDRESS
  • PICTRS__PATH has become PICTRS__OLD_DB__PATH, PICTRS__REPO__PATH and PICTRS__STORE__PATH

The pict-rs.toml equivalents:

# pict-rs.toml

[server]
# same format as the previous PICTRS__ADDRESS
address = "0.0.0.0:8080"

[tracing.logging]
# same format as RUST_LOG
targets = "info"

[tracing.opentelemetry]
# same format as RUST_LOG
targets = "debug"

[old_db]
# same as the previous PICTRS__PATH
path = "/mnt"

[repo]
# Added the `sled-repo` subdirectory to the previous PICTRS__PATH
path = "/mnt/sled-repo"

[store]
# Added the `files` subdirectory to the previous PICTRS__PATH
path = "/mnt/files"
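
The way the single 0.3 path fans out into three 0.4 settings can be sketched as a small helper. This is an illustration in Python, not part of pict-rs: the function name `split_legacy_path` is hypothetical, and the `sled-repo` and `files` subdirectory names are taken from the comments in the example above.

```python
from pathlib import PurePosixPath


def split_legacy_path(old_path: str) -> dict[str, str]:
    """Derive the three 0.4 path settings from a 0.3 PICTRS__PATH value."""
    base = PurePosixPath(old_path)
    return {
        # [old_db] keeps pointing at the original directory
        "old_db.path": str(base),
        # [repo] gains the sled-repo subdirectory
        "repo.path": str(base / "sled-repo"),
        # [store] gains the files subdirectory
        "store.path": str(base / "files"),
    }


print(split_legacy_path("/mnt"))
```

A deployment that used a custom PICTRS__PATH can pass that value in to see which three values to carry into its new configuration.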

As mentioned above, every configuration option has moved. Check the linked example configuration file for more information.
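
The environment variables and pict-rs.toml keys shown in these notes follow a visible pattern: a `PICTRS` prefix, then table names and the field name separated by double underscores. The Python sketch below only encodes that observed pattern. The helper name `env_to_toml_key` is hypothetical, and this is not a description of how pict-rs itself parses its environment.

```python
def env_to_toml_key(name: str, prefix: str = "PICTRS") -> str:
    """Translate e.g. PICTRS__SERVER__ADDRESS into the dotted key server.address."""
    parts = name.split("__")
    if len(parts) < 2 or parts[0] != prefix:
        raise ValueError(f"not a {prefix}__ variable: {name!r}")
    # Single underscores (as in OLD_DB) are part of the name and are kept.
    return ".".join(part.lower() for part in parts[1:])


for variable in ("PICTRS__SERVER__ADDRESS", "PICTRS__OLD_DB__PATH"):
    print(variable, "->", env_to_toml_key(variable))
```

Under this reading, `PICTRS__TRACING__LOGGING__TARGETS` corresponds to `targets` in the `[tracing.logging]` table, which matches the example file above.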

Reworked Commandline

Also notable for server admins: the commandline interface has changed. All configuration that can be expressed via the pict-rs.toml file or related environment variables can be set via the commandline as well.

Running the pict-rs server now requires invoking the run subcommand. Here are some examples:

$ pict-rs run
$ pict-rs -c /etc/pict-rs.toml run
$ pict-rs run filesystem -p /path/to/files sled -p /path/to/metadata

There is one flag that I will call out here: --save-to <path>. This can be incredibly helpful when debugging pict-rs. It dumps the configuration that pict-rs is currently using to a file of your choosing. Example:

$ PICTRS__SERVER__ADDRESS=127.0.0.1:1234 pict-rs --save-to debug.toml run
CTRL^C
$ cat debug.toml

It can also be useful to save your existing commandline arguments into a reusable configuration:

$ pict-rs --save-to config.toml \
    --log-targets warn \
    --log-format json \
    --console-address '127.0.0.1:6669' \
    --opentelemetry-url 'http://localhost:4317' \
    run \
        -a '127.0.0.1:1234' \
        --api-key "$API_KEY" \
        --client-pool-size 10 \
        --media-max-width 200 \
        --media-max-height 200 \
        --media-max-file-size 1 \
        --media-enable-full-video true \
        --media-video-codec av1 \
        --media-format webp \
    filesystem \
        -p /opt/pict-rs/files \
    sled \
        -p /opt/pict-rs/metadata \
        -e /opt/pict-rs/exports
CTRL^C
$ cat config.toml
$ pict-rs -c config.toml run

Object Storage

This is the biggest feature for pict-rs 0.4, but it isn't new. pict-rs 0.3 had initial support for uploading to object storage hidden behind the object-storage compile flag. My docker images and provided binaries did not enable this feature, so only folks compiling from source could even have tried using it. In 0.4 I am now much more confident in the object storage support, and use it myself for my personal setup.

Although pict-rs now supports object storage as a backend for storing uploaded media, it doesn't support generating URLs that bypass the pict-rs server for serving images. I've seen this misconception a couple of times during the release candidate process and I would like to put it to rest.

Using object storage when you're in a cloud environment is sensible. The cost for adding object storage is far less than for adding block storage, and although there are egress fees, they are typically very low. Admins currently running pict-rs on a VPS should consider moving to object storage if they find their disk usage is quickly growing.

Using object storage does not fully remove pict-rs' dependence on a local disk. pict-rs stores its metadata in an embedded key/value store called sled, which means it still requires disk access to function.

For new admins, using object storage from the start might be worth considering, but is not required. pict-rs provides a built-in migration path for moving from filesystem storage to object storage, which is documented in the pict-rs readme.

Backgrounded Uploads

This feature is important for keeping pict-rs fast. Since media needs to be processed on upload, upload times may grow long. This is especially true when considering pict-rs' other new features like full video and image preprocessing.

Backgrounded uploads work by deferring the image processing work to a background task in pict-rs, and responding to the upload HTTP request as soon as the provided file has been stored successfully. Instead of returning a file alias like the inline upload endpoint, the backgrounded endpoint returns a UUID that represents the current upload process. That UUID can be given to a new claim endpoint to retrieve the uploaded media's alias, or an error in the event the media was rejected or failed to process.

The claim endpoint uses a technique called "long polling" in order to notify clients on completion of the background processing. Clients can request to claim uploaded media as soon as the backgrounded endpoint responds with an upload ID, and the server will hold the connection open for at most 10 seconds while it waits for processing to complete. If processing did not complete in time, the client can request again to claim the upload.
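
On the client side, the claim flow amounts to a retry loop around a request that may come back empty. The Python sketch below keeps the HTTP details out of the picture: `claim_once` is a hypothetical callable standing in for one long-poll request to the claim endpoint, assumed to return `None` when the server's wait window elapsed without a result. The endpoint path and response format are not shown here; check the pict-rs readme for those.

```python
import time
from typing import Callable, Optional


def claim_upload(
    claim_once: Callable[[str], Optional[dict]],
    upload_id: str,
    max_attempts: int = 6,
    pause_seconds: float = 0.0,
) -> dict:
    """Repeat the claim request until processing finishes or attempts run out."""
    for _ in range(max_attempts):
        # Each call may block for as long as the server holds the connection open.
        result = claim_once(upload_id)
        if result is not None:
            # Either the alias of the processed media or an error description.
            return result
        time.sleep(pause_seconds)
    raise TimeoutError(
        f"upload {upload_id} was not claimable after {max_attempts} attempts"
    )
```

Because the server already waits on each request, no client-side delay between attempts is needed by default. `pause_seconds` exists only for callers that want to back off.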

Full Video

pict-rs now supports uploading video with audio. In 0.3, pict-rs would always strip audio from uploaded videos. This was intended as a gif-like feature, since pict-rs' primary use is for storing pictures and not videos. By default, full video is disabled in pict-rs 0.4, but server admins can opt into full video uploads by setting PICTRS__MEDIA__ENABLE_FULL_VIDEO=true.

Along with supporting full video, an additional configuration option has been added to help keep file sizes and processing time down: PICTRS__MEDIA__MAX_FRAME_COUNT. This option limits how many frames an uploaded video is allowed to have, rejecting the upload if it exceeds the provided value.

It is important to note that while pict-rs is capable of storing videos, its processing functionality is limited to just the video's thumbnail.

Default Video Codec

pict-rs 0.4 has changed the default video codec from h264 to vp9. This was done because h264 in the mp4 container is still patent-encumbered, and I don't want to ship software that defaults to using it. This change may result in slower transcoding times.

Published Dependencies

pict-rs 0.3's experimental object storage support relied on a custom fork of the rust-s3 library, which did useful things like enabling re-use of an HTTP Client between connections and making the Client a trait so that it could be implemented for any arbitrary HTTP Client. Since then, the PR that was meant to upstream those changes was closed, so pict-rs needed to find a replacement that suited its needs.

pict-rs now relies on the rusty-s3 library for enabling object storage. This library takes a sans-io approach, and was simple to integrate with awc, my HTTP Client of choice.

HEAD Endpoints

For the /image/original/{filename} and /image/process.{ext}?src={filename} endpoints, there are now analogous HEAD endpoints for fetching just the headers their GET counterparts would return.

Healthcheck Endpoints

pict-rs now provides a /healthz endpoint that can be used to check if the running process is healthy. It currently attempts to write to the sled database and fetch information from the configured store, returning 200 if all operations succeeded.
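
A monitoring script could probe this endpoint along the following lines. This is a minimal sketch in Python using only the standard library. The function name `is_healthy` and the timeout value are illustrative choices, and anything other than a 200 response is treated as unhealthy.

```python
import urllib.error
import urllib.request


def is_healthy(base_url: str, timeout: float = 5.0) -> bool:
    """Return True when GET {base_url}/healthz answers with status 200."""
    url = base_url.rstrip("/") + "/healthz"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Error statuses, refused connections and timeouts all count as unhealthy.
        return False
```

For example, `is_healthy("http://127.0.0.1:8080")` would check a server listening on the address used elsewhere in these notes.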

Image Preprocessing

pict-rs 0.4 introduces a new configuration option: PICTRS__MEDIA__PREPROCESS_STEPS. This option accepts the same syntax as the process endpoint, and can be used to apply transformations to all uploaded images.

Here are some examples:

# pict-rs.toml

[media]
# Crop all uploaded media to a 16x9 aspect ratio, and then shrink to fit within a 1200 by 1200 pixel box.
preprocess_steps = "crop=16x9&resize=1200"

# docker-compose.yml

services:
  pictrs:
    image: asonix/pictrs:0.4.0
    ports:
      - "127.0.0.1:8080:8080"
    restart: always
    environment:
      # Blur all uploaded images with a sigma value of 1.5
      - PICTRS__MEDIA__PREPROCESS_STEPS=blur=1.5
    volumes:
      - ./volumes/pictrs:/mnt
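
Both examples above use a query-string shape: `name=value` pairs joined by `&`, applied in the order written. The Python sketch below splits such a string into ordered steps so the sequence is easy to inspect. It is only an illustration of the syntax, with a hypothetical function name, and pict-rs' own parser and validation may behave differently.

```python
from urllib.parse import parse_qsl


def parse_steps(steps: str) -> list[tuple[str, str]]:
    """Split a preprocess_steps string into ordered (operation, argument) pairs."""
    # strict_parsing rejects fragments that are not of the form name=value.
    return parse_qsl(steps, keep_blank_values=True, strict_parsing=True)


for operation, argument in parse_steps("crop=16x9&resize=1200"):
    print(operation, argument)
```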

GIF configuration

pict-rs 0.3 would transcode all uploaded GIF files to video formats. This is no longer the case. Some new configuration options have been introduced to determine when GIF files should and should not be converted to silent videos. By default, GIF files will remain GIFs if their dimensions are smaller than 128x128 px, and they don't have more than 100 frames. GIFs that do not fit these restrictions will be transcoded to a video format.

To tweak these values, the following environment variables can be set:

# docker-compose.yml

services:
  pictrs:
    # ...
    environment:
      - PICTRS__MEDIA__GIF__MAX_WIDTH=128
      - PICTRS__MEDIA__GIF__MAX_HEIGHT=128
      - PICTRS__MEDIA__GIF__MAX_AREA=16384
      - PICTRS__MEDIA__GIF__MAX_FRAME_COUNT=100

Alternatively, these values can be set in pict-rs.toml:

# pict-rs.toml

[media.gif]
max_width = 128
max_height = 128
max_area = 16384
max_frame_count = 100
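
The decision these four limits describe can be written as a single predicate. The Python sketch below uses the default values shown above. The names `GifLimits` and `stays_gif` are hypothetical, and treating each limit as inclusive (a 128 px wide GIF still counts as within `max_width = 128`) is an assumption rather than something these notes confirm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GifLimits:
    max_width: int = 128
    max_height: int = 128
    max_area: int = 16384
    max_frame_count: int = 100


def stays_gif(width: int, height: int, frames: int, limits: GifLimits = GifLimits()) -> bool:
    """True if the upload is within every limit; otherwise it would be transcoded."""
    return (
        width <= limits.max_width
        and height <= limits.max_height
        and width * height <= limits.max_area
        and frames <= limits.max_frame_count
    )


print(stays_gif(64, 64, 30))    # small, short animation
print(stays_gif(480, 270, 30))  # too large in both dimensions
```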

JXL and AVIF

pict-rs 0.4 introduces support for two new image formats: JPEG XL (JXL) and AVIF. These two image formats offer better compression to quality ratios than other standard formats, although they may be less well-supported by browsers or other clients. Not only can these formats be uploaded and handled now, but they can also be used with the PICTRS__MEDIA__FORMAT variable for transcoding all uploaded images.

Examples:

# docker-compose.yml

services:
  pictrs:
    # ...
    environment:
      - PICTRS__MEDIA__FORMAT=avif

# pict-rs.toml
[media]
format = "jxl"

H265, VP8, VP9, and AV1

As mentioned earlier, the default video codec has changed from h264 to vp9. However, pict-rs 0.4 introduces support for a broader range of video codecs:

  • in the mp4 container
    • h264
    • h265
  • in the webm container
    • vp8
    • vp9
    • av1

The default video codec can be changed with the following settings:

# docker-compose.yml

services:
  pictrs:
    # ...
    environment:
      - PICTRS__MEDIA__VIDEO_CODEC=av1

# pict-rs.toml

[media]
video_codec = "h265"

Note that h264 and h265 in the mp4 container are still patent-encumbered, and you may wish to avoid them.

Audio Codecs

With the introduction of full video, pict-rs adds the ability to configure the audio codec used for uploaded videos. By default, pict-rs will choose a reasonable codec for the provided video codec. This means that there is no need to configure the audio codec yourself. The option is still available, though.

Here is the existing mapping between video and audio codecs:

  • av1, vp8, and vp9 use opus by default
  • h264 and h265 use aac by default

Available codecs for configuration are:

  • opus
  • vorbis
  • aac

Setting a preferred codec can be done with the following configurations:

# docker-compose.yml

services:
  pictrs:
    # ...
    environment:
      - PICTRS__MEDIA__AUDIO_CODEC=vorbis

# pict-rs.toml

[media]
audio_codec = "opus"
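
The container and default-audio mappings listed in this section and the previous one fit in two small lookup tables. The Python sketch below restates them. The names are hypothetical, and the function only applies the defaults described here. It does not check whether a chosen audio codec is actually valid inside a given container, since these notes do not say how pict-rs handles that.

```python
from typing import Optional, Tuple

CONTAINER_BY_VIDEO_CODEC = {
    "h264": "mp4",
    "h265": "mp4",
    "vp8": "webm",
    "vp9": "webm",
    "av1": "webm",
}
DEFAULT_AUDIO_BY_VIDEO_CODEC = {
    "av1": "opus",
    "vp8": "opus",
    "vp9": "opus",
    "h264": "aac",
    "h265": "aac",
}
AUDIO_CODECS = {"opus", "vorbis", "aac"}


def resolve_codecs(video_codec: str = "vp9", audio_codec: Optional[str] = None) -> Tuple[str, str, str]:
    """Return (container, video codec, audio codec) for a configuration."""
    if video_codec not in CONTAINER_BY_VIDEO_CODEC:
        raise ValueError(f"unknown video codec: {video_codec!r}")
    if audio_codec is None:
        # No explicit choice: fall back to the default for this video codec.
        audio_codec = DEFAULT_AUDIO_BY_VIDEO_CODEC[video_codec]
    elif audio_codec not in AUDIO_CODECS:
        raise ValueError(f"unknown audio codec: {audio_codec!r}")
    return CONTAINER_BY_VIDEO_CODEC[video_codec], video_codec, audio_codec


print(resolve_codecs())        # the 0.4 defaults
print(resolve_codecs("h265"))  # an mp4 codec with its default audio
```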

404 Image

pict-rs 0.4 adds support for setting a "Not Found" image. This image will be returned for any request for an image that does not exist on the server. This is done through an internal endpoint, which means it requires pict-rs' PICTRS__SERVER__API_KEY to be set.

After uploading an image to pict-rs, its name can be submitted to the /internal/set_not_found endpoint with the following JSON:

{
    "alias": "79087a4f-ea31-4878-9e8a-0498dada597b.webp"
}

If the alias exists on the server, it will mark that image for use as the "404 image".
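
A request carrying that JSON body could be assembled as follows. This Python sketch only builds the request object and does not send it. Two details are assumptions not stated in these notes and should be checked against the pict-rs readme: that the endpoint expects a POST, and that the API key travels in an `X-Api-Token` header. The function name is hypothetical.

```python
import json
import urllib.request


def build_set_not_found_request(base_url: str, alias: str, api_key: str) -> urllib.request.Request:
    """Prepare (but do not send) a request marking `alias` as the 404 image."""
    body = json.dumps({"alias": alias}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/internal/set_not_found",
        data=body,
        method="POST",  # assumed HTTP method
        headers={
            "Content-Type": "application/json",
            "X-Api-Token": api_key,  # assumed header name for the API key
        },
    )


request = build_set_not_found_request(
    "http://127.0.0.1:8080",
    "79087a4f-ea31-4878-9e8a-0498dada597b.webp",
    "replace-with-your-api-key",
)
print(request.get_method(), request.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would perform the call.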

DB Export

Part of administering a server means taking regular backups. Many backup systems rely on simply copying files around, which doesn't work well for pict-rs' sled database, which may change while being read by the backup tool.

One solution to this is to use a snapshotting filesystem like btrfs, bcachefs, or zfs. A filesystem snapshot is "atomic", and will represent the database at an exact moment in time.

For admins running on more common filesystems such as ext4 or xfs, there is now a configuration option and internal endpoint for producing immutable copies of the current database state.

# pict-rs.toml

[repo]
export_path = "./data/exports"

# docker-compose.yml

services:
  pictrs:
    # ...
    environment:
      - PICTRS__REPO__EXPORT_PATH=/opt/pict-rs/exports

When this value is set, the /internal/export endpoint becomes available. Hitting that endpoint will result in pict-rs exporting its current state to a new timestamped directory inside the export_path. This directory contains the same structure as the sled-repo directory, and in the event of data loss, can be restored by simply moving it into place.

Here's an example process for restoring from an export:

  1. Stop pict-rs
  2. Move your current sled-repo directory to a safe location (e.g. sled-repo.bak)
    $ mv sled-repo sled-repo.bak
    
  3. Copy an exported database to sled-repo
    $ cp -r exports/2023-07-08T22:26:21.194126713Z sled-repo
    
  4. Start pict-rs
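
When several exports have accumulated, step 3 needs the most recent one. The Python sketch below picks it by name. It assumes every export directory uses the fixed-width UTC timestamp form shown in the example above, so that plain string ordering matches chronological ordering. The function name `latest_export` is hypothetical.

```python
from pathlib import Path
from typing import Optional


def latest_export(export_path: str) -> Optional[Path]:
    """Return the newest timestamped export directory, or None if there are none."""
    exports = [entry for entry in Path(export_path).iterdir() if entry.is_dir()]
    # Names like 2023-07-08T22:26:21.194126713Z sort chronologically as strings.
    return max(exports, key=lambda entry: entry.name, default=None)
```

Calling `latest_export("exports")` from the directory that holds `sled-repo` gives the path to copy into place.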

Identifier Endpoint

pict-rs 0.4 introduces a new endpoint for retrieving an alias' "True Name", the Identifier. An Identifier represents the real path to the file in filesystem storage or in object storage. Currently these paths are generated using the same algorithm, but pict-rs reserves the right to optimize identifiers differently for filesystem and object storage in the future.

Variant Cleanup Endpoint

This is one of the less exciting features, and won't be very commonly used. A new endpoint at /internal/variants accepts the DELETE verb. This will spawn a background task that iterates over all files in the database, removing any generated variants of an image, keeping only the Original. All variants can be generated again on request.

Client Pool Configuration

Another less exciting feature: the number of connections pict-rs' internal HTTP Client maintains in its pool can now be configured. This value is 100 by default, and shouldn't need to be tweaked. If you are experiencing heavy traffic and are using object storage, you could increase the value. If you are constrained on allowed open file descriptors, you can decrease the value.

Note that this value is multiplied by the number of threads pict-rs spawns, which depends on the number of cores allowed to the application. Limiting pict-rs to 2 CPUs with default pool settings will result in an effective limit of 200 connections.
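
The multiplication is simple enough to check by hand, but a short Python sketch makes the relationship explicit when planning around a file descriptor limit. The function name is hypothetical, and the thread count is whatever pict-rs ends up spawning on your machine.

```python
def effective_connection_limit(pool_size: int = 100, threads: int = 1) -> int:
    """Upper bound on pooled connections: per-thread pool size times thread count."""
    if pool_size < 1 or threads < 1:
        raise ValueError("pool_size and threads must both be at least 1")
    return pool_size * threads


print(effective_connection_limit(pool_size=100, threads=2))  # the 2 CPU example above
```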

GIF/webm Transparency

In 0.3, pict-rs removed transparency from uploaded gifs and videos. 0.4 supports preserving the transparency of GIF and webm files on upload.

Image Orientation

In 0.3, pict-rs would strip metadata from all images. This would remove metadata about the intended orientation of an image, and lead to uploaded media being sideways or upside down. pict-rs 0.4 still strips all metadata from uploaded images, but it now applies a rotation to images that require it as well. Images uploaded in version 0.4 should no longer be sideways or upside down.

io_uring File Write

Due to a conflict between the AsyncReadExt trait and StreamReader, actix-web's Payload stream would be polled after completion when pict-rs' io-uring feature was enabled. This has been fixed by using the read_buf API rather than the read_to_end API in the io_uring file implementation.

Note that io_uring is still considered an experimental feature, and probably should not be used in production. YMMV.

Blur Sigma

pict-rs' blur filter would incorrectly apply the provided sigma value as the radius for a gaussian blur. This has been fixed in 0.4 to properly set the sigma value with a default radius of 0. This tells imagemagick to calculate a reasonable radius for the given image and sigma.
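
ImageMagick's blur options take a geometry of the form `{radius}x{sigma}`, which is where the two values got mixed up. The Python sketch below formats that geometry with the radius defaulting to 0, as described above. The function name is hypothetical, and which exact ImageMagick option pict-rs passes the value to is not covered here.

```python
def blur_geometry(sigma: float, radius: float = 0) -> str:
    """Format an ImageMagick blur geometry; radius 0 lets ImageMagick choose one."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return f"{radius:g}x{sigma:g}"


print(blur_geometry(1.5))
```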

Purge by disk name

Due to changes in how pict-rs stores files in 0.4 (introducing a pluggable store backend), the /internal/purge endpoint can no longer be provided with a filename query, and must be given an alias query instead.

Aliases by disk name

Due to changes in how pict-rs stores files in 0.4 (introducing a pluggable store backend), the /internal/aliases endpoint can no longer be provided with a filename query, and must be given an alias query instead.

Filename Endpoint

This was replaced by the Identifier endpoint, since files are no longer required to be stored on the filesystem.